Hi,
I thought I'd share some more progress I made on this.
This time I created two IndexWriters, each writing different
documents (split on whether an id, not the doc-id, is odd or even) to
different indexes, and the performance seems to scale with the
number of threads (and the
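The odd/even split described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the poster's code: the class name `ParityPartitionSketch` and the method `indexInParallel` are hypothetical, and each `List<Long>` merely stands in for one index with its own writer. No Lucene calls are made.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: each inner list stands in for one index with its own writer.
class ParityPartitionSketch {

    // Routes every id to partition (id mod partitions); one thread per partition.
    static List<List<Long>> indexInParallel(List<Long> ids, int partitions) {
        List<List<Long>> indexes = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            indexes.add(new ArrayList<>());
        }
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        for (int p = 0; p < partitions; p++) {
            final int partition = p;
            // Each task writes only to its own list, so no locking is needed.
            pool.submit(() -> {
                List<Long> own = indexes.get(partition);
                for (long id : ids) {
                    if (Math.floorMod(id, (long) partitions) == partition) {
                        own.add(id); // a real writer would add the document here
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for indexing tasks", e);
        }
        return indexes;
    }
}
```

With `partitions` set to 2, even ids land in the first list and odd ids in the second, which mirrors the two-writer setup in the message.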
: a) once a doc is added to an index, it will not get modified/deleted
: b) all the fields added are keywords (mostly numbers) - no analysis is
: required.
: c) indexing speed is more important than querying speed.
: d) every document is the same - there is no boost or relevancy required.
:
: e
-----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Friday, December 19, 2008 12:12 AM
> To: java-user@lucene.apache.org
> Subject: Re: Combining results of multiple indexes
>
> I would recommend, very strongly, that you don
AUTOMATIC REPLY
LUX is closed until 5th January 2009
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
indexed.
Thanks,
~preetham
-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Friday, December 19, 2008 12:12 AM
To: java-user@lucene.apache.org
Subject: Re: Combining results of multiple indexes
I would recommend, very strongly, that you don't rely on the doc IDs being
the same in two different indexes. Doc IDs are just incremented by one
for each doc added, but
optimization can change the doc ID, and is guaranteed to change at
least some of them if there are deletions from your inde
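One way to avoid depending on doc IDs is to match hits on a key the application assigns itself. The sketch below assumes, hypothetically, that every document carries such an id stored as a field and that each search has already been reduced to a set of those ids. The class name `StableKeyJoinSketch` and the method `inBoth` are illustrative only and do not come from Lucene.

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch only: joins hits from two indexes on an application-assigned id,
// which stays stable even if internal doc IDs are renumbered.
class StableKeyJoinSketch {

    // Returns the ids present in both hit sets; the inputs are not modified.
    static Set<Long> inBoth(Set<Long> hitsFromIndexA, Set<Long> hitsFromIndexB) {
        Set<Long> common = new TreeSet<>(hitsFromIndexA);
        common.retainAll(hitsFromIndexB);
        return common;
    }
}
```

The trade-off is that each hit's stored id has to be read, which may cost more than working with raw doc IDs.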
These results are surprising.
I'd expect single IndexWriter with 2 threads to do better than a
single thread, but in your test two threads are significantly worse
than one.
Is it possible there's a bottleneck outside of Lucene in sourcing the
documents?
How many segments are produced a
Hi,
I noticed that the doc id is the same in both indexes. So, if I have a
HitCollector just collect the doc-ids from both Searchers (one per index)
and take the intersection between them, it would work. Also, getting the
doc ids is fast even when there is a large number of hits.
Of course, I am using somethin
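The intersection step described above could look roughly like the following. It is a sketch under assumptions: the doc ids are taken as already collected (the `collect` helper stands in for whatever a collector callback would do per hit), and the approach only gives meaningful results while the two indexes really do assign the same doc id to the same logical document. The class and method names are hypothetical.

```java
import java.util.BitSet;

// Sketch only: intersects two sets of doc ids held as bit sets.
class DocIdIntersectionSketch {

    // Sets one bit per doc id, as a collector callback might do for each hit.
    static BitSet collect(int... docIds) {
        BitSet bits = new BitSet();
        for (int docId : docIds) {
            bits.set(docId);
        }
        return bits;
    }

    // Returns the doc ids present in both sets without modifying either input.
    static BitSet intersect(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }
}
```

A bit set keeps memory proportional to the highest doc id rather than to the number of hits, and the `and` operation works a machine word at a time.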
Thanks. Yep, the code is very easy. However, it takes about 3 minutes to
complete the merge.
It looks like I will need to merge the indexes out of band once
they are closed (I am planning to store about 50 million entries in each
index partition).
However, as the data is being indexed, is there any oth
You will be stunned at how easy it is. The merging code should be
a dozen lines (and that only if you are merging 6 or so indexes)
See IndexWriter.addIndexes or
IndexWriter.addIndexesNoOptimize
Best
Erick
On Thu, Dec 18, 2008 at 5:03 AM, Preetham Kajekar wrote:
Hi,
I tried out a single IndexWriter used by two threads to index different
fields. It is slower than using two separate IndexWriters. These are my
findings:
All Fields (9) using 1 IndexWriter 1 Thread - 38,000 objects per sec
5 Fields using 1 IndexWriter 1 Thread - 62,000 objects per sec
A
Thanks Erick and Michael.
I will try out these suggestions and post my findings.
~preetham
Erick Erickson wrote:
Well, maybe if I'd read the original post more carefully I'd have figured
that out,
sorry 'bout that.
I *think* I remember reading somewhere on the email lists that your indexing
speed goes up pretty linearly as the number of indexing tasks approaches
the number of CPUs. Are you, perhaps, on a dua
Have you tested your indexing throughput with two threads sharing one
IndexWriter (one index)?
Mike
Preetham Kajekar wrote:
Hi Erick,
Thanks for the response. Replies inline.
Erick Erickson wrote:
The very first question is always "are you opening a new searcher
each time you query"? But you've looked at the Wiki so I assume not.
This question is closely tied to what kind of latency you can tolerate.
A few more details, please. What's slow? Queries? Indexing?
How slow? 100ms? 100s? What ar
Hi Grant,
Thanks for the response. Replies inline.
Grant Ingersoll wrote:
On Dec 17, 2008, at 12:57 AM, Preetham Kajekar wrote:
Hi,
I am new to Lucene. I am not using it as a pure text indexer.
I am trying to index a Java object which has about 10 fields (like
id, time, srcIp, dstIp) - most of them being numerical values.
In order to speed up indexing, I figured t