+1 to share code for doing 1) and 3), both of which are tricky!
Safely moving / copying bytes around is a notoriously difficult problem ...
but Lucene's "end to end checksums" and per-segment-file-GUID make this
safer.
I think Lucene's replicator module is a good place for this?
Mike McCandless
Hi all,
I have a configuration file that lists multiple queries, of all different types,
and that lists words to be ignored.
Each of these lists is user configured, variable in length and content.
I know that, in general, unless the ignore word is in the query it won’t match,
but I need to be abl
Could you please describe the use case? Maybe there is an easier solution.
From: java-user@lucene.apache.org At: 07/09/19 14:27:10 To:
java-user@lucene.apache.org
Subject: How to ignore certain words based on query specifics
Hi all,
I have a configuration file that lists multiple queries, of all
Sorry for the weird reply path, but I couldn’t find an easy reply method via
the list archive.
Anyway …
The use case is as follows:
Allow the user to specify queries such as ‘free*’
and also include similar words to be ignored, such as freedom.
Another example would be ‘secret*’ and secretary.
I think what you're saying in your example is that "free*" should
match anything with a term matching that pattern, but not *only*
freedom. In other words, if a document has "freedom from stupidity"
then it should not match, but if the document has "free freedom from
stupidity" then it should.
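
The rule described above can be sketched in plain Java, outside Lucene. This is only an illustration of the intended semantics, not a Lucene query: the class name, the method name, and the naive tokenization (lowercase, split on non-word characters) are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: illustrates the matching rule only, not Lucene's API.
class PrefixWithIgnores {
    // True if at least one token starts with the prefix
    // and is not one of the ignored words.
    public static boolean matches(String text, String prefix, Set<String> ignored) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("\\W+"))
                .anyMatch(tok -> tok.startsWith(prefix) && !ignored.contains(tok));
    }

    public static void main(String[] args) {
        Set<String> ignored = Set.of("freedom");
        // Only "freedom" matches the prefix, and it is ignored.
        System.out.println(matches("freedom from stupidity", "free", ignored));
        // "free" matches the prefix and is not ignored.
        System.out.println(matches("free freedom from stupidity", "free", ignored));
    }
}
```

The ignore list is applied per term rather than per document, which is why a document containing both "free" and "freedom" still matches.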
Michael,
Thanks for your reply.
You are correct, the desired effect is to not match 'freedom ...'.
I hadn't considered the case where both free* and freedom match.
My solution 'free* and not freedom' would NOT match either of your examples.
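
To see why 'free* and not freedom' rejects both example documents, here is a small plain-Java sketch of that boolean rule. It models the behaviour of a required prefix clause combined with a prohibited term clause (in Lucene terms, a MUST clause plus a MUST_NOT clause); the class and method names and the naive tokenization are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: models "prefix AND NOT term" at the document level.
class AndNotSemantics {
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\W+"));
    }

    // The document must contain some term with the prefix,
    // and must not contain the excluded term anywhere.
    public static boolean andNot(String text, String prefix, String excluded) {
        List<String> toks = tokens(text);
        boolean hasPrefix = toks.stream().anyMatch(t -> t.startsWith(prefix));
        boolean hasExcluded = toks.contains(excluded);
        return hasPrefix && !hasExcluded;
    }

    public static void main(String[] args) {
        // Both documents contain "freedom", so the exclusion rejects both.
        System.out.println(andNot("freedom from stupidity", "free", "freedom"));
        System.out.println(andNot("free freedom from stupidity", "free", "freedom"));
    }
}
```

Because the exclusion applies to the whole document, the presence of "freedom" anywhere vetoes the match, even when "free" also occurs.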
I think what I really want is
Get every matching term f