Thanks Mike, I'll try that.

So, not being CPU-bound, you would indeed think indexing here is IO-bound?
(Maybe it generally is; I'm not sure.)
What's a good tool to profile IO on Windows, anyone?
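
Independent of any profiler, one rough check is to time the same two batches
run on one thread and then on two, and compare the wall-clock totals. The
following is only a minimal sketch: `BatchTiming`, `timeBatches` and the
`indexBatch` Runnable are hypothetical names, and the sleep is a stand-in for
the real SolrJ call that sends 50 documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchTiming {
    // Runs `batches` copies of `indexBatch` on a pool of `threads` threads
    // and returns the elapsed wall-clock time in milliseconds.
    static long timeBatches(int threads, int batches, Runnable indexBatch) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < batches; i++) {
                pending.add(pool.submit(indexBatch));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the batch and rethrow any failure
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: placeholder for "add 50 documents via SolrJ".
        Runnable indexBatch = () -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        long sequential = timeBatches(1, 2, indexBatch);
        long parallel = timeBatches(2, 2, indexBatch);
        System.out.println("sequential: " + sequential + " ms, parallel: " + parallel + " ms");
    }
}
```

If the parallel total is about the same as the sequential one (as with the
4 seconds for two 2-second jobs described below), something is serializing
the work, such as the disk or a lock on the server side. If it is close to
half, the threads really do overlap.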


2008/4/7, Mike Klaas <[EMAIL PROTECTED]>:
>
> On 5-Apr-08, at 7:09 AM, Britske wrote:
>
> > Indexing of these documents takes a long time. Because of the size of
> > the documents (because of the indexed fields) I am currently batching
> > 50 documents at once, which takes about 2 seconds. Without adding the
> > 10000 indexed fields to the document, indexing flies at about 15 ms
> > for these 50 documents. Indexing is done using SolrJ.
> >
> > This is on an Intel Core 2 6400 @ 2.13 GHz with 2 GB RAM.
> >
> > To speed this up I let 2 threads do the indexing in parallel. What
> > happens is that Solr just takes double the time (about 4 seconds) to
> > complete these two jobs of 50 docs each in parallel. I figured because
> > of the multi-core setup indexing should improve, which it doesn't.
> >
>
> Multiple processors really only help indexing speeds when there is heavy
> analysis.
>
> > Does this perhaps indicate that the setup is IO-bound? What would be
> > your best guess (given the fact that the schema has a large number of
> > indexed fields) to try next to improve indexing performance?
> >
>
> Use Lucene 2.3 with Solr 1.2, or simply try out Solr trunk.  The indexing
> has been reworked to be considerably faster (it also makes better use of
> multiple processors by spawning a background merging thread).
>
> -Mike
>
