Hi gents,
is it possible to use TermsFilter with the 'MUST' occurrence rule, instead of
the 'SHOULD'?
In the code:
def tf = new TermsFilter()
for( some terms ){
    tf.addTerm( new Term( ) )
}
I want all of the terms to be required (MUST), so that each one narrows the hit list.
Thanks in advance
Hi,
I have a large index and I want to remove the norms from a field. Is
there a way to do this without reindexing everything?
Thank you,
Bogdan
OK I opened this issue and attached a patch:
https://issues.apache.org/jira/browse/LUCENE-1384
If possible could you test this patch to see if it resolves your
exceptions? Thanks.
Mike
Anthony Urso wrote:
I have implemented a MapReduce job to merge a bunch of Lucene 2.3.2
indices to
Unfortunately, I think you've hit a bug in Lucene's
ConcurrentMergeScheduler in 2.3. I'll open an issue & attach a
patch.
The bug only happens when you call addIndexesNoOptimize, and one
simple workaround would be to use SerialMergeScheduler.
I think this is already fixed in trunk (soonish to b
> -Original Message-
> From: Wojciech Strzałka [mailto:[EMAIL PROTECTED]
> Sent: 12 September 2008 13:58
> To: java-user@lucene.apache.org
> Subject: Frequently updated fields
>
> Hi.
>
>I'm new to Lucene and I would like to get a few answers (they can
>be lame)
>
>I want to
Hi Wojciech,
can you please give us a bit more specific information about the meta
data fields that will change? I would recommend looking at
creating filters from your primary persistence layer for query clauses such
as unread/read, mailbox folders, etc.
karl
12 Sep 2008 at 13:57
On 12 Sep 2008 at 12:25, Bogdan Ghidireac wrote:
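The filter suggestion above can be sketched roughly as follows. This is only an illustration under assumptions, not anyone's actual implementation: the class name, the mail-ID-to-document-number map and the set of unread IDs are all hypothetical stand-ins for whatever the primary store provides. It uses plain java.util.BitSet, the same type the bits( IndexReader ) method of a Filter returns elsewhere in this thread, so that changing read/unread state never requires touching the index.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: build a filter bitset from state kept outside the index.
public class ExternalStateFilterSketch {

    // Sets one bit per index document whose mail ID is currently unread.
    // docNumberByMailId and unreadMailIds are assumed to come from the
    // application's own store; mail IDs unknown to the index are skipped.
    public static BitSet buildFilter(Map<String, Integer> docNumberByMailId,
                                     Set<String> unreadMailIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (String mailId : unreadMailIds) {
            Integer doc = docNumberByMailId.get(mailId);
            if (doc != null) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // Keeps only the hits that the filter allows; the input is not modified.
    public static BitSet restrict(BitSet hits, BitSet filter) {
        BitSet out = (BitSet) hits.clone();
        out.and(filter);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> docs = new HashMap<>();
        docs.put("m1", 0);
        docs.put("m2", 1);
        docs.put("m3", 2);
        docs.put("m4", 3);
        Set<String> unread = new HashSet<>(Arrays.asList("m2", "m4"));

        BitSet hits = new BitSet(4);   // documents matched by the text query
        hits.set(0);
        hits.set(1);
        hits.set(3);

        System.out.println(restrict(hits, buildFilter(docs, unread, 4))); // prints {1, 3}
    }
}
```

The trade-off is that the filter has to be rebuilt (or cached and invalidated) whenever the external state changes, which is usually cheaper than reindexing each mail on every status change.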
I have a large index and I want to remove the norms from a field. Is
there a way to do this without reindexing everything?
You could invoke IndexReader#setNorm(int, String, float) and set the
value to 1f.
karl
Thanks for the reply.
Generally a good idea and I like it - almost :) We just need to tweak
it a little more.
What if I have to search on both fields at the same time?
Is there any way to do something similar to an SQL JOIN on the two
documents / indexes? (I don't think so)
I think ca
Hi.
I'm new to Lucene and I would like to get a few answers (they can
be lame)
I want to index a large number of emails using Lucene (maybe SOLR), not only
the contents but also some metadata like state or flags. The
problem is that the metadata will change during the mail lifecycle,
Yes, but the norms will be loaded at search time. I want to
remove them because I don't have enough memory.
Bogdan
On Fri, Sep 12, 2008 at 3:22 PM, Karl Wettin <[EMAIL PROTECTED]> wrote:
>
> On 12 Sep 2008 at 12:25, Bogdan Ghidireac wrote:
>
>> I have a large index and I want to remove the norm
I think the most frequently changing fields will be:
Status (read/unread): in fact this is what I'm most afraid of - any
mail incoming to the system will need to be indexed at
least twice
Flags: 0..n values from enum
Tags: 0..n values from enum
Of course all the other field
On 12 Sep 2008 at 14:51, Wojciech Strzałka wrote:
I think the most frequently changing fields will be:
Status (read/unread): in fact this is what I'm most afraid of - any
mail incoming to the system will need to be
indexed at least twice
This is why I recommended you to use a filte
If you search the archive, this very topic has been
discussed many times. You'll find a wealth of
discussion and more than a few options
outlined there.
Best
Erick
2008/9/12 Wojciech Strzałka <[EMAIL PROTECTED]>
>
> I think the most frequently changing fields will be:
> Status (read/unread): in fact I'm af
TermsFilter has taken the relatively easy option of ORing terms and this is
inexpensive to construct.
Adding more complex features (mixes of MUST/SHOULD/NOT clauses) starts to
require the sorts of optimisations you see in BooleanQuery (MUST clauses
accelerating processing of other clauses throu
Hi Mark,
I ended up implementing a MandatoryTermsFilter, which looks like:
class MandatoryTermsFilter extends Filter {
    List terms
    BitSet bits( IndexReader reader ){
        int size = reader.maxDoc()
        BitSet result = new BitSet( size )
        BitSet andMask = new BitSet( size )
        andMas
>>here I'm AND-ing each bitset. Does it look ok?
In principle it looks like it will work fine but the BooleanQuery approach I
described may prove to be faster on large datasets because ultimately
td.skipTo() will be called to avoid excessive disk reads.
Cheers
Mark
- Original Message ---
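To make the AND-ing idea discussed above concrete, here is a minimal self-contained sketch. It is not the poster's MandatoryTermsFilter (that code is cut off above) and it does not touch Lucene at all: each per-term BitSet is assumed to have been filled already from that term's postings, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch of "all terms MUST match" using plain bitsets.
public class MandatoryTermsSketch {

    // Intersects the per-term bitsets: a document survives only if every
    // term matched it. With no terms at all, nothing matches (a design
    // choice made for this sketch).
    public static BitSet andAll(List<BitSet> perTermDocs, int maxDoc) {
        BitSet result = new BitSet(maxDoc);
        if (perTermDocs.isEmpty()) {
            return result;
        }
        result.set(0, maxDoc);          // start with every document allowed
        for (BitSet docs : perTermDocs) {
            result.and(docs);           // each term narrows the set
        }
        return result;
    }

    // Helper that builds a bitset from a list of document numbers.
    public static BitSet docs(int... ids) {
        BitSet bits = new BitSet();
        for (int id : ids) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet a = docs(0, 2, 3, 5);
        BitSet b = docs(2, 3, 4);
        BitSet c = docs(2, 3, 7);
        System.out.println(andAll(Arrays.asList(a, b, c), 8)); // prints {2, 3}
    }
}
```

Note that this still reads the full postings of every term before intersecting, which is the cost Mark points out; the BooleanQuery route can skip ahead in the postings instead.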
I think the important question is: in general how to cope with
frequently changing fields.
Karl Wettin wrote:
Hi Wojciech,
can you please give us a bit more specific information about the meta
data fields that will change? I would recommend looking at
creating filters from your primary
There is no single easy answer to the question. There are a number of
solutions to the problem; in this thread we've so far listed the
following: reindexing the document in a single index, using parallel
indices, and using filters created from the source data. There are other things one
can do too, but wh
Unfortunately, I think altering an existing index to remove its norms
is not possible without writing some custom Java code (in package
org.apache.lucene.index) that directly manipulates the FieldInfos and
SegmentInfos.
Mike
Bogdan Ghidireac wrote:
Yes, but the norms will be loaded at
You might check out the tagindex issue in jira as well. Haven't looked at
it myself, but I believe it's supposed to be an option for this.
Gerardo Segura wrote:
I think the important question is: in general how to cope with
frequently changing fields.
Karl Wettin wrote:
Hi Wojciech,
can you
Yes, Tag Index will work. I have not had time to complete it; however,
if you are interested in working on it, please feel free to contact me.
On Fri, Sep 12, 2008 at 3:48 PM, Mark Miller <[EMAIL PROTECTED]> wrote:
> You might check out the tagindex issue in jira as well. Haven't looked at it
> myself