Hi
In Lucene 4.4 we've improved the snapshotting process so that you don't
need to specify an ID.
Also, there's a new Replicator module which can be used for just that
purpose: taking hot backups of the index.
It pretty much hides most of the snapshotting from you. You can read about
it
TermFirstPassGroupingCollector loads all terms for a given group-by field,
through FieldCache.
Is it possible to instruct the class to group only pruned terms of a field,
based on a user-supplied query (RangeQuery, TermQuery, etc.)?
This way, only pruned terms are grouped and all others are
I am looking for a feature in Solr that will give me all matched words in the
document when I search with a word.
My field uses stemming as well as synonym filters.
For example, I have documents and part of the text goes like below:
1. We were very careful about my surgery
2. are still needing
Dear All,
Suppose I have 3 documents. The sample text is
/*File 1 : */
Mr X David is a manager of the company. He is the senior most manager. I
also want to become manager of the company.
/*File 2 :*/
Mr X David manager of the company is also very senior. He happens to be
the senior
PhraseQuery?
You can skip the holes created by stopwords ... e.g. QueryParser does
this. I.e., the PhraseQuery becomes X David _ _ manager _ _ company
if is/a/of/the are stop words, which isn't perfect (could return false
matches) but should work well in practice ...
Mike McCandless
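
A minimal plain-Java sketch of the hole-skipping idea described above, assuming whitespace tokenization and the stop-word set is/a/of/the. The names PhraseHoles and analyze are hypothetical, and this is not Lucene's analysis chain; it only shows how the surviving terms keep their original positions. In Lucene itself, the PhraseQuery.add(Term, int) overload is the call that accepts an explicit position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustration only: plain Java, not Lucene's analyzer or PhraseQuery API. */
final class PhraseHoles {

    /** A kept token together with its position in the original phrase. */
    static final class PositionedTerm {
        final String term;
        final int position;

        PositionedTerm(String term, int position) {
            this.term = term;
            this.position = position;
        }

        @Override
        public String toString() {
            return term + "@" + position;
        }
    }

    /** Drops stop words but keeps the original position of every remaining token. */
    static List<PositionedTerm> analyze(String phrase, Set<String> stopWords) {
        List<PositionedTerm> kept = new ArrayList<>();
        String trimmed = phrase.trim();
        if (trimmed.isEmpty()) {
            return kept;
        }
        String[] tokens = trimmed.split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            // A removed stop word leaves a hole: the position counter still advances.
            if (!stopWords.contains(tokens[i].toLowerCase(Locale.ROOT))) {
                kept.add(new PositionedTerm(tokens[i], i));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<>(Arrays.asList("is", "a", "of", "the"));
        // prints [X@0, David@1, manager@4, company@7]
        System.out.println(analyze("X David is a manager of the company", stops));
    }
}
```

Because the caller never counts the holes by hand, the number of stop words between two terms does not need to be known beforehand; it falls out of the token positions.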
I tried using Phrase Query with slops. Now since I am specifying the
slop I also need to specify the 2nd term.
In my case the 2nd term is not present. The whole string to be searched
is still 1 single term.
How do I skip the holes created by stopwords? I do not know beforehand
how many
This is unfortunately very trappy ... this happened with LUCENE-4876,
where we added cloning of IndexDeletionPolicy on IW construction.
It's very confusing that the IDP you set on your IWC is not in fact
the one that IW uses...
Mike McCandless
http://blog.mikemccandless.com
On Wed, Jul 24,
Did you consider using shingles?
It solves the to be or not to be problem quite nicely.
Dawn
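
A minimal plain-Java sketch of what shingles are, assuming whitespace tokenization and a fixed shingle size. The names ShingleSketch and shingles are hypothetical; in Lucene the equivalent tokens would normally come from ShingleFilter at analysis time, which this sketch does not use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only: builds word n-grams ("shingles") from a string. */
final class ShingleSketch {

    /** Returns every run of `size` consecutive words, joined by a single space. */
    static List<String> shingles(String text, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i + size <= words.length; i++) {
            out.add(String.join(" ", Arrays.copyOfRange(words, i, i + size)));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [to be, be or, or not, not to, to be]
        System.out.println(shingles("to be or not to be", 2));
    }
}
```

Since each shingle is indexed as one token, a phrase made up entirely of stop words still has something to match against.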
On 24/07/2013 12:34, Ankit Murarka wrote:
I tried using Phrase Query with slops. Now since I am specifying the
slop I also need to specify the 2nd term.
In my case the 2nd term is not present. The
With PhraseQuery you can specify where each term must occur in the phrase.
So X must occur in position 0, David in position 1, and then manager
in position 4 (skipping 2 holes).
QueryParser does this for you: when it analyzes the user's phrase, if
the resulting tokens have holes, then it sets the
I did some performance tests on a real index using a query having the
following pattern:
termA AND (termB1 OR termB2 OR ... OR termBn)
The results were not good and I was wondering if I may be doing something
wrong (and what I would need to do to improve performance), or is it just
that the OR
Clarification - I used an MMap'd index and warmed it up with similar
queries, as well as running the identical query many times before starting
measurements. I had ample heap space.
Sriram.
On Wed, Jul 24, 2013 at 9:11 AM, Sriram Sankar san...@gmail.com wrote:
I did some performance tests on
Thanks for the detailed numbers. Nothing seems unexpected to me. Increasing
query complexity or term count is simply going to increase query execution
time.
I think I'll add a new rule to my informal performance guidance - Query
complexity of no more than ten to twenty terms is a slam dunk,
Hi,
On Wed, Jul 24, 2013 at 6:11 PM, Sriram Sankar san...@gmail.com wrote:
termA AND (termB1 OR termB2 OR ... OR termBn)
Maybe this comment is not appropriate for your use-case, but if you
don't actually need scoring from the disjunction on the right of the
query, a TermsFilter will be faster
No I do not need scoring. This is a pure retrieval query - which matches
what we used to do with Unicorn in Facebook - something like:
(name:sriram AND (friend:1 OR friend:2 ...))
This automatically gives us second degree.
With Unicorn, we would always get sub-millisecond performance even for
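
A minimal plain-Java sketch of unscored retrieval for a query shaped like termA AND (termB1 OR ... OR termBn), assuming each term's postings are already available as a java.util.BitSet of document ids. The names UnscoredRetrieval, requiredAndAnyOf and docs are hypothetical, and this is neither Lucene's nor Unicorn's implementation; it only illustrates that no scores need to be computed when the result is a plain set of matching documents.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: set-based matching with no scoring. */
final class UnscoredRetrieval {

    /** Documents containing `required` and at least one of the `anyOf` terms. */
    static BitSet requiredAndAnyOf(Map<String, BitSet> postings,
                                   String required,
                                   List<String> anyOf) {
        BitSet result = new BitSet();
        BitSet requiredDocs = postings.get(required);
        if (requiredDocs == null) {
            return result; // the required term matches nothing
        }
        // Union of the disjunction's postings.
        BitSet union = new BitSet();
        for (String term : anyOf) {
            BitSet termDocs = postings.get(term);
            if (termDocs != null) {
                union.or(termDocs);
            }
        }
        // Intersect with the required term; the inputs are left unmodified.
        result.or(requiredDocs);
        result.and(union);
        return result;
    }

    /** Helper that builds a BitSet from document ids. */
    static BitSet docs(int... ids) {
        BitSet bits = new BitSet();
        for (int id : ids) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        Map<String, BitSet> postings = new HashMap<>();
        postings.put("name:sriram", docs(1, 3, 5));
        postings.put("friend:1", docs(1, 2));
        postings.put("friend:2", docs(5, 6));
        // prints {1, 5}
        System.out.println(requiredAndAnyOf(
                postings, "name:sriram", Arrays.asList("friend:1", "friend:2")));
    }
}
```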
Unicorn sounds like it was optimized for graph search. Specialized search
engines can in fact beat out generalized search engines for specific use
cases.
Scoring has been a major focus of Lucene. Non-scored filters are also
available, but the query parsers are focused (exclusively) on
On Wed, Jul 24, 2013 at 10:24 AM, Jack Krupansky j...@basetechnology.com wrote:
Unicorn sounds like it was optimized for graph search. Specialized search
engines can in fact beat out generalized search engines for specific use
cases.
Yes and no (I worked on it). Yes, there are many aspects of
I think I've exhausted my expertise in Lucene filters, but I think you can
wrap a query with a filter and also wrap a filter with a query. So, for
IndexSearcher.search, you could take a filter and wrap it with
ConstantScoreQuery. So, if a BooleanQuery got wrapped as a filter, it could
be
Hi,
I am using Lucene 4 to index very big data.
The indexer crashed after three days (147 GB current index size). I find
the stack trace weird.
Any ideas on this will be helpful.
Exception in thread "main" java.io.FileNotFoundException:
Greetings,
I have written a custom tokenizer class which extends Lucene's Tokenizer class.
Thanks for all replies
Regards
DJ
--
View this message in context:
http://lucene.472066.n3.nabble.com/Tokenize-String-using-Operators-Logical-Operator-operator-etc-tp4079673p4080225.html
Sent from the
Recently I found that my unit test fails sometimes, but not always. I use
Lucene 4.3.0.
After investigation, I found that when I try to open an IndexWriter for a disk
directory,
it sometimes throws this exception:
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
20 matches