Any help on this will be highly appreciated. I have been trying all
possible options, but to no avail.
I also tried LuceneDictionary, but this also does not seem to help...
Please guide.
On 7/30/2013 4:49 PM, Ankit Murarka wrote:
Hello.
Using DirectSpellChecker is not serving my purpose.
Hi,
On Tue, Jul 30, 2013 at 8:19 PM, Nicolas Guyot wrote:
> When sorting numerically, the search seems to take a bit of a while
> compared to the lexically sorted search.
> Also when sorting numerically the result is sorted within each page but not
> globally, as opposed to the lexically sorted search.
Hi Mike,
I did more tests with realistic text from different languages (typical
text for 8 different languages, English one is novel "Animal Farm").
What I found seems to be:
## Indexing:
36 and 43 are comparable (your previous comment was correct).
## Search:
43 seems to be slower (30%), che
Hello,
In Lucene 4.x is there a way to get the number of documents that were deleted
from calling IndexWriter.deleteDocuments(Query)?
Another question: if we call IndexWriter.tryDeleteDocument(Reader, docId)
utilizing a near-real-time reader, what is the appropriate order to close the
reader
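On the first question, deleteDocuments(Query) returns void in 4.x, so there is no direct count. One workaround, sketched below (untested, assumes Lucene 4.x on the classpath and a single writer thread; class and method names are mine), is to compare live-doc counts around the call:

```java
import java.io.IOException;

import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.Query;

public final class DeleteCounter {
    // Approximate count of docs removed by a delete-by-query.
    // Only reliable when no other thread adds or deletes concurrently.
    public static int deleteAndCount(IndexWriter writer, Query query) throws IOException {
        int before = writer.numDocs();   // live docs before the delete
        writer.deleteDocuments(query);
        writer.commit();                 // apply buffered deletes so numDocs() reflects them
        return before - writer.numDocs();
    }
}
```

The commit is needed because numDocs() does not count deletions that are still buffered.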
Thanks Mike,
Got my sysadmins to upgrade our test machine to "1.7.0_09"
Will ask them to upgrade production which is currently 1.6.0_45-b06 on the
indexing machines and 1.6.0_16-b01 on the serving machines.
Tom
On Tue, Jul 30, 2013 at 1:47 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
Hi,
we are using some of the latest Lucene sorting features, which are very
cool, but we are facing some issues with the numerical sort:
We need two kinds of sort: numerical and lexical.
For the lexical we are using SortedDocValuesField and for the numerical we
use NumericDocValuesField.
The
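The setup described above might look like the following sketch (untested; Lucene 4.x assumed, field names are made up, not from the original mail):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.SortedDocValuesField;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.util.BytesRef;

public final class SortFieldsExample {
    // Each document carries both doc-values fields, so the same index
    // can serve a lexical sort and a numerical sort.
    public static Document makeDoc(String title, long year) {
        Document doc = new Document();
        doc.add(new SortedDocValuesField("title_sort", new BytesRef(title)));
        doc.add(new NumericDocValuesField("year_sort", year));
        return doc;
    }

    // Lexical sort over the SortedDocValuesField ...
    public static final Sort LEXICAL =
            new Sort(new SortField("title_sort", SortField.Type.STRING));
    // ... and numerical sort over the NumericDocValuesField.
    public static final Sort NUMERIC =
            new Sort(new SortField("year_sort", SortField.Type.LONG));
}
```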
You should also upgrade your Java!
1.6.0_16 is really ancient and has exciting bugs ...
Mike McCandless
http://blog.mikemccandless.com
On Tue, Jul 30, 2013 at 1:06 PM, Tom Burton-West wrote:
> Thanks Mike, Robert and Adrien,
>
> Unfortunately, I killed the processes, so it's too late to get a
Thanks Mike, Robert and Adrien,
Unfortunately, I killed the processes, so it's too late to get a stack
trace. One thing that was suspicious was that top was reporting memory use
as 20GB RES even though I invoked the JVM with java -Xmx10g -Xms10g.
I'm going to double the memory, turn on GC logging,
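For reference, era-appropriate HotSpot GC-logging flags (Java 6/7) might look like the command below; the log path, jar names, and main class are placeholders. Note also that -Xmx caps only the Java heap: with MMapDirectory, mapped index files count toward RES once touched, which can make RES sit well above the heap size without anything being wrong.

```shell
# Heap sizing plus GC logging; adjust paths for your environment.
java -Xmx10g -Xms10g \
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -Xloggc:/var/log/indexer/gc.log \
     -cp lucene-core.jar:app.jar com.example.Indexer
```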
Hi,
On Tue, Jul 30, 2013 at 5:34 PM, Robert Muir wrote:
> I'm not sure if there is a similar one for vectors.
There is; it was done for stored fields and term vectors at the
same time [1].
[1] https://issues.apache.org/jira/browse/LUCENE-4928
--
Adrien
---
On Tue, Jul 30, 2013 at 8:41 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> I think that's ~ 110 billion, not trillion, tokens :)
>
> Are you certain you don't have any term vectors?
>
> Even if your index has no term vectors, CheckIndex goes through all
> docIDs trying to load them,
Can you get a stack trace so we can see where the thread is stuck?
Mike McCandless
http://blog.mikemccandless.com
On Tue, Jul 30, 2013 at 11:08 AM, Tom Burton-West wrote:
> Thanks Mike,
>
> Billion not Trillion Doh!
>
> Wasn't thinking it through when I titled the e-mail The total number
Thanks Mike,
Billion not Trillion Doh!
Wasn't thinking it through when I titled the e-mail. The total number of
tokens shouldn't be unusual compared to our other indexes, since whether we
index pages or whole docs, the number of tokens shouldn't change
significantly. The main difference betw
Hi Adrien,
Thank you very much. I will have a look at your suggestion ;)
> From: jpou...@gmail.com
> Date: Tue, 30 Jul 2013 16:16:03 +0200
> Subject: Re: Cache Field Lucene 3.6.0
> To: java-user@lucene.apache.org
>
> Hi,
>
> On Tue, Jul 30, 2013 at 4:09 PM, andi rexha wrote:
> > Hi, I have a s
Hi,
On Tue, Jul 30, 2013 at 4:09 PM, andi rexha wrote:
> Hi, I have a stored and tokenized field, and I want to cache all the field
> values.
>
> I have one document in the index, with the "field.value" => "hello world"
> and with tokens => "hello", "world".
> I try to extract the field
Hi, I have a stored and tokenized field, and I want to cache all the field
values.
I have one document in the index, with the "field.value" => "hello world"
and with tokens => "hello", "world".
I try to extract the field's content:
String [] cachedFields = FieldCache.DEFAULT.getStri
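For what it's worth, a sketch of the 3.6 call (untested; Lucene 3.6 assumed on the classpath). One caveat worth knowing: FieldCache.getStrings returns one entry per document taken from the *indexed terms*, and it effectively expects a single term per document for that field, so a tokenized field like this one ("hello", "world") will not round-trip the stored "hello world" value -- only one of the tokens ends up in the cache slot.

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;

public final class FieldCacheExample {
    // One cached String per document; for a multi-token field only a
    // single indexed term lands here, not the stored value.
    public static String[] cache(IndexReader reader, String field) throws IOException {
        return FieldCache.DEFAULT.getStrings(reader, field);
    }
}
```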
Just to close the loop on this, I upgraded to 4.4 and the improvements to the
NGramTokenizer were just what I needed. I switched to using 1-2 grams (the
default), and now that the tokenizer emits the tokens in an order that makes
sense I'm in business. At search time I split on whitespace, ngr
I think that's ~ 110 billion, not trillion, tokens :)
Are you certain you don't have any term vectors?
Even if your index has no term vectors, CheckIndex goes through all
docIDs trying to load them, but that ought to be very fast, and then
you should see "test: doc values..." after that.
Mike McCandless
http://blog.mikemccandless.com
Hello.
Using DirectSpellChecker is not serving my purpose. This seems to return
word suggestions from a dictionary, whereas I wish to return search
suggestions from indexes I created by supplying my own files (these files
are generally log files).
I created indexes for certain files in D:\\Indexe
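One note on the premise above: DirectSpellChecker actually draws its candidates from the terms of the IndexReader you hand it, not from an external dictionary, so it can suggest from your own log-file index. A sketch (untested; Lucene 4.x assumed, and the field name, query term, and index path are illustrative, not from the original mail):

```java
import java.io.File;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spell.DirectSpellChecker;
import org.apache.lucene.search.spell.SuggestWord;
import org.apache.lucene.store.FSDirectory;

public final class SpellFromIndex {
    public static void main(String[] args) throws Exception {
        // Open your own index of log files.
        DirectoryReader reader =
                DirectoryReader.open(FSDirectory.open(new File("path/to/index")));
        DirectSpellChecker spell = new DirectSpellChecker();
        // Candidates come from terms of the "contents" field in this index.
        SuggestWord[] suggestions =
                spell.suggestSimilar(new Term("contents", "misspeld"), 5, reader);
        for (SuggestWord w : suggestions) {
            System.out.println(w.string + " (score=" + w.score + ")");
        }
        reader.close();
    }
}
```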