Yeah, I should have mentioned - this was merely a jar replacement; we
haven't gotten around to doing the fun 2.3-related stuff, like making sure
our domain-specific tokenizers use next(Token) and making sure we set
all of our buffer sizes by RAM used.

We tried multithreading the process, as we have a multi-core, multi-disk
architecture, but for some reason we never saw more than 99% CPU usage (of
one core) during indexing, as if some internal synchronization was getting
hit... I should try it again under the profiler and see if I can pinpoint
where it was getting tripped up. On the other hand, I'm not sure we
*need* indexing faster than 26 minutes, so once we're sure we can move up to
2.3 for production, that may just solve our indexing perf issues.
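
To illustrate the kind of symptom I mean (several threads, but only about
one core's worth of CPU), here is a minimal sketch in plain Java. It is not
Lucene code and does not claim to show where Lucene synchronizes; the class
ContentionSketch and the methods simulateWork and peakConcurrency are
hypothetical names made up for this example. It only shows that when every
worker passes through one shared lock, at most one of them is ever doing
work at a time:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ContentionSketch {
    // Hypothetical single lock that every worker contends on.
    private static final Object SHARED_LOCK = new Object();

    // Stand-in for "index one document": records how many threads are
    // inside this method at the same moment.
    static void simulateWork(AtomicInteger active, AtomicInteger peak) {
        int now = active.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        Thread.yield(); // widen the window in which threads could overlap
        active.decrementAndGet();
    }

    // Runs `threads` workers, each doing `tasksPerThread` units of work, and
    // returns the highest number of workers observed inside simulateWork at once.
    static int peakConcurrency(int threads, int tasksPerThread, boolean useSharedLock)
            throws InterruptedException {
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < tasksPerThread; i++) {
                    if (useSharedLock) {
                        // All workers serialize here, so the work is
                        // effectively single-threaded.
                        synchronized (SHARED_LOCK) {
                            simulateWork(active, peak);
                        }
                    } else {
                        simulateWork(active, peak);
                    }
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak with shared lock:    " + peakConcurrency(4, 1000, true));
        System.out.println("peak without shared lock: " + peakConcurrency(4, 1000, false));
    }
}
```

With the shared lock the peak is always 1, no matter how many threads there
are; without it the peak depends on the scheduler and the number of cores.
A profiler run should show whether our indexing threads are piling up on a
monitor like this or whether the bottleneck is somewhere else entirely.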

Now if I can just figure out how to speed up our query performance too, I'll
be in an even *better* mood. :)

  -jake

On Feb 3, 2008 2:11 PM, Michael McCandless <[EMAIL PROTECTED]>
wrote:

>
> Awesome!  We are glad to hear that :)
>
> You might be able to make it even faster with the steps here:
>
>     http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>
> Mike
>
> Jake Mannix wrote:
>
> > Hello all,
> >   I know you lucene devs did a lot of work on indexing performance in
> > 2.3, and I just tested it out last thursday, so I thought I'd let you
> > know how it fared:
> >
> >   On a 2.17 million document index, a recent test gave indexing time to
> > be:
> >
> >     * lucene 2.2: 4.83 hours
> >     * lucene 2.3: 26 minutes
> >
> >   About a factor of 11 speedup.  Holy smokes!  Great work folks.
> >
> >
> >   -jake
