did you tried to increase IndexWriter.mergeFactor. I tried to increase it to 1000 and index speed is about 10 time faster than defualt = 10 .
Regards
Che Dong http://www.chedong.com/
Aalap Parikh åé:
My machine is pretty good and fairly new. The disk for sure is not slow and also I am not indexing large Documents; 27 fields with each field value being a string with no more than 15-20 characters long.
I tried setting the maxFieldLength value of the Indexwriter to a low value but that didn't help.
Also, I am not using Hibernate at all.
Thanks, Aalap.
--- Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
That sounds way too long, unless you have veeery slow disks, veeery large Documents (long fields that you analyze, index, and store in Lucene), or some such. If you have very loooong fiiiiieeeelds you could try setting
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexWriter.html#maxFieldLength
to a very small number and see if that changes performance drastically. There are other IndexWriter knobs you can fiddle with.
I've seen Hibernate 2.* get sluggish once its Session gets filled up with a lot of objects.
Otis
--- Aalap Parikh <[EMAIL PROTECTED]> wrote:
Hi,
I have similar issues in indexing time.
I am doing a SELECT from database and getting back 10,000 rows. I then start indexing each row and
hence
would have 10,000 documents in my Lucene index.
Each
doc has 27 fields.
I added some timing code to my indexing process.
The
DB select call takes around 23 seconds and the indexing process takes 567 seconds. Also, I
profiled
the app using JProfiler and found out that 90% of
time
is spent in the IndexWriter.addDocument call. As expected, there were 10,000 invocation of that
method
(one for each doc) and the profiler showed that
the
method took 90% of the processing time.
I am concerned that it is taking around 9.5
minutes
for 10,000 docs and I am expecting to have around 600,000 docs to index. So that would take 570
minutes
(9-10 hours) to index and which is HUGE!!!
My machine: Pentium 4 CPU 2.40 GHz RAM 1 GB
Any help appreciated.
Thanks, Aalap.
--- [EMAIL PROTECTED] wrote:
Ãâ ÃÂÃÂÃÂÃÂÃâÃÂÃÂÃÂàÃÂÃâ ÃÂÃâÃÂÃÂà20 ÃÂÃÂÃâÃÂÃÂÃÅ 2005 04:07 Mufaddal Khumri ÃÂÃÂÃÂÃÂÃÂÃÂÃÂ(a):
The 20000 products I mentioned are 20000 rows.
I
get the products in
bulk by using a limit clause.
I am using hibernate with MySQL server on a
2.8GHz, 1.00GB Ram machine.
Maybe your session-level cache in hibernate
grows
incredibly. Do you do Session.clear() sometimes while doing indexing?
Here's a link about batching & hibernate:
http://blog.hibernate.org/cgi-bin/blosxom.cgi/2004/08/
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail:
[EMAIL PROTECTED]
For additional commands, e-mail:
[EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
