Actually, my comment below was not quite accurate. It only matters on multi-CPU machines if you are writing everything to an in-memory index first.

If writing to a filesystem, then multiple threads on a single processor would allow more documents to be inverted while the disk writes were occurring, as long as both could be done concurrently.
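
To illustrate the overlap described above, here is a minimal sketch, not Lucene code: a pool of threads inverts documents while a single thread writes the results out. The class and method names (OverlapSketch, invert, indexAll) are hypothetical, "inversion" is reduced to counting terms, and the sketch assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class OverlapSketch {
    // End-of-stream marker for the writer thread, compared by identity.
    static final Map<String, Integer> POISON = new TreeMap<>();

    // Hypothetical stand-in for inversion: map each term to its frequency.
    static Map<String, Integer> invert(String text) {
        Map<String, Integer> postings = new TreeMap<>();
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) postings.merge(term, 1, Integer::sum);
        }
        return postings;
    }

    // Inverts docs on a worker pool while one thread drains results to `out`.
    // Returns the number of documents written. Output order is not guaranteed.
    static int indexAll(List<String> docs, int inverterThreads, Writer out) {
        // Unbounded queue keeps the sketch simple; a bounded one would add backpressure.
        BlockingQueue<Map<String, Integer>> queue = new LinkedBlockingQueue<>();
        int[] written = {0};

        Thread writer = new Thread(() -> {
            try {
                for (Map<String, Integer> p = queue.take(); p != POISON; p = queue.take()) {
                    out.write(p + "\n"); // the (possibly slow) I/O step
                    written[0]++;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        ExecutorService pool = Executors.newFixedThreadPool(inverterThreads);
        for (String doc : docs) {
            pool.execute(() -> queue.add(invert(doc))); // CPU-bound step
        }
        pool.shutdown();
        try {
            while (!pool.awaitTermination(1, TimeUnit.SECONDS)) { /* keep waiting */ }
            queue.add(POISON); // all inverters are done, so this is the last item
            writer.join();     // join() also makes written[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        }
        return written[0];
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        int n = indexAll(Arrays.asList("lucene lucene index", "parallel index"), 2, out);
        System.out.println(n + " documents written:\n" + out);
    }
}
```

Even on one processor the inverter threads can make progress whenever the writer thread is blocked on I/O; with a purely in-memory target there is no such wait, so extra threads only help if there are extra CPUs.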


On Jan 15, 2007, at 12:28 PM, robert engels wrote:

I looked at doing a similar thing with the parallel 'inverting'.

I then decided that it would only make a difference on a multi-CPU machine, so I put it on the back burner.

But if you have code already done...

On Jan 15, 2007, at 12:24 PM, Chuck Williams wrote:

robert engels wrote on 01/15/2007 08:01 AM:
Is your parallel adding code available?

There is an early version in LUCENE-600, but without the enhancements
described. I didn't update that version because it didn't attract any
interest and requires Java 1.5, so it seems it will not be committed.

I could update Jira with the new version, but would have to create a
clean patch that applies against the Lucene head.  My local copy has
diverged due to a number of uncommitted patches, so patches generated
from it contain other stuff.

My use case for parallel subindexes is as an enabler for fast bulk
updates.  Only the subindexes containing changing fields need to be
updated, so long as the update algorithm does not change doc-ids.  Even
though this requires rewriting entire segments using techniques similar
to those used in merging (but not purging deleted docs), I'm still
getting 30x (when many fields change) to many hundreds-x (when only a
few fields change) faster update performance than the batched
delete-add method on very large indexes (millions of documents, some
very large).
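
The idea can be sketched with a toy model. This is not the LUCENE-600
patch and uses no Lucene classes; the names (ParallelSubindexSketch,
Subindex, bulkUpdate) are hypothetical, and a "segment rewrite" is
reduced to copying one in-memory column while keeping every document at
the same doc-id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ParallelSubindexSketch {
    /** One subindex: each field it owns is a column of values addressed by doc-id. */
    static final class Subindex {
        final Map<String, List<String>> columns = new HashMap<>();
        int rewrites = 0; // how many times this subindex was rewritten

        Subindex(String... fields) {
            for (String field : fields) columns.put(field, new ArrayList<>());
        }

        // Rewrite one column in full, keeping every document at the same doc-id.
        void rewrite(String field, Map<Integer, String> changes) {
            List<String> old = columns.get(field);
            List<String> fresh = new ArrayList<>(old.size());
            for (int docId = 0; docId < old.size(); docId++) {
                fresh.add(changes.getOrDefault(docId, old.get(docId)));
            }
            columns.put(field, fresh);
            rewrites++;
        }
    }

    final List<Subindex> subindexes = new ArrayList<>();
    int docCount = 0;

    // Append the document to every subindex so doc-ids stay aligned; returns its doc-id.
    int addDocument(Map<String, String> doc) {
        for (Subindex sub : subindexes)
            for (Map.Entry<String, List<String>> col : sub.columns.entrySet())
                col.getValue().add(doc.get(col.getKey()));
        return docCount++;
    }

    // Bulk update: only the subindex that owns `field` is rewritten.
    void bulkUpdate(String field, Map<Integer, String> changes) {
        for (Subindex sub : subindexes)
            if (sub.columns.containsKey(field)) sub.rewrite(field, changes);
    }

    // Reassemble a document by reading the same doc-id from every subindex.
    Map<String, String> document(int docId) {
        Map<String, String> doc = new HashMap<>();
        for (Subindex sub : subindexes)
            for (Map.Entry<String, List<String>> col : sub.columns.entrySet())
                doc.put(col.getKey(), col.getValue().get(docId));
        return doc;
    }
}
```

In this model, updating a "price" field touches only the subindex that
holds "price"; a subindex holding large text fields is left alone, and
documents still line up because no doc-id moves.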

Chuck


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



