All,
I'm in need of some pointers, hints or tips on indexing large collections
of data. I know I saw some tips on this list before but when I tried searching
the list, I came up blank.
I have a large collection of XML files (336,000 files, around 5 KB apiece) that I'm
indexing, and it's taking quite a bit of time (27 hours). I've played around with the
mergeFactor, RAMDirectories and multiple threads (X number of threads indexing
a subset of the data and then merging the indexes at the end) but I cannot seem
to bring the time down. I may not be doing these things properly, but from
what I've read I believe I am. Maybe this is the best I can do with this data, but I
would be really grateful to hear how others have tackled this same issue.
As always, pointers to places in the mailing list archive or elsewhere would be
appreciated.
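
To make the threaded setup mentioned above concrete, here is a rough sketch of
its partition-then-merge shape only. It is an illustration under assumptions,
not the actual indexing code: the class and method names are hypothetical,
indexSlice is a stand-in that just counts files where a real worker would add
documents to its own private index, and the final "merge" is a sum where a real
run would merge the per-thread indexes into one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartitionedIndexingSketch {

    // Split the range [0, total) into `parts` contiguous slices
    // whose sizes differ by at most one. Each slice is {from, to}.
    public static List<int[]> partition(int total, int parts) {
        List<int[]> slices = new ArrayList<>();
        int base = total / parts;
        int extra = total % parts;
        int start = 0;
        for (int i = 0; i < parts; i++) {
            int size = base + (i < extra ? 1 : 0);
            slices.add(new int[] {start, start + size});
            start += size;
        }
        return slices;
    }

    // Hypothetical placeholder: a real worker would parse files [from, to)
    // and write them to its own index. Here it only reports the file count.
    public static int indexSlice(int from, int to) {
        return to - from;
    }

    // Run one worker per slice, then combine the per-worker results.
    public static int indexAll(int totalFiles, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int[] slice : partition(totalFiles, threads)) {
                final int from = slice[0];
                final int to = slice[1];
                Callable<Integer> task = () -> indexSlice(from, to);
                results.add(pool.submit(task));
            }
            int merged = 0; // stand-in for merging the partial indexes
            for (Future<Integer> result : results) {
                merged += result.get();
            }
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("indexing worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Every file lands in exactly one slice, so this prints 336000.
        System.out.println(indexAll(336000, 4));
    }
}
```

The thread count of 4 is arbitrary. Whether splitting the work this way helps
at all depends on where the time actually goes (XML parsing, disk I/O, or the
final merge), which the sketch does not measure.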


Thanks, Mike.
