Hi,

I have a similar problem with indexing time.

I run a SELECT against the database and get back
10,000 rows. I then index each row as a separate
document, so my Lucene index ends up with 10,000
documents. Each doc has 27 fields.

I added some timing code to my indexing process. The
DB select call takes around 23 seconds and the
indexing process takes 567 seconds. I also profiled
the app with JProfiler and found that 90% of the time
is spent in the IndexWriter.addDocument call. As
expected, there were 10,000 invocations of that method
(one for each doc).
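
A minimal sketch of per-phase timing of this kind,
assuming Java 8 or later. The PhaseTimer class and the
phase names are hypothetical and only illustrate the
idea; the placeholders in main() stand in for the real
DB query and the real IndexWriter.addDocument call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: accumulates elapsed wall-clock time per named phase.
class PhaseTimer {
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    // Add an already-measured duration (in nanoseconds) to a phase.
    void add(String phase, long nanos) {
        nanosByPhase.merge(phase, nanos, Long::sum);
    }

    // Run a task and charge its elapsed time to the given phase.
    void time(String phase, Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            add(phase, System.nanoTime() - start);
        }
    }

    // Total time recorded for a phase; 0 if the phase was never timed.
    long totalNanos(String phase) {
        return nanosByPhase.getOrDefault(phase, 0L);
    }

    // Share of all measured time spent in one phase, in percent.
    double percent(String phase) {
        long total = 0;
        for (long n : nanosByPhase.values()) {
            total += n;
        }
        return total == 0 ? 0.0 : 100.0 * totalNanos(phase) / total;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        timer.time("select", () -> { /* placeholder: run the DB query */ });
        for (int i = 0; i < 10_000; i++) {
            timer.time("addDocument", () -> { /* placeholder: add one doc */ });
        }
        for (String phase : new String[] {"select", "addDocument"}) {
            System.out.println(phase + ": "
                    + timer.totalNanos(phase) / 1_000_000 + " ms, "
                    + timer.percent(phase) + " %");
        }
    }
}
```

Wrapping each addDocument call separately adds a small
measurement overhead of its own, so a profiler remains
the better tool for a detailed breakdown.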

I am concerned that it takes around 9.5 minutes for
10,000 docs, because I expect to have around 600,000
docs to index. At that rate, indexing would take about
570 minutes (9-10 hours), which is HUGE!
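
The estimate above can be reproduced with a short
calculation. This is only a sketch: the class and
method names are hypothetical, and it assumes the cost
per document stays constant as the index grows, which
may not hold in practice.

```java
// Hypothetical helper for the back-of-the-envelope estimate.
class IndexingEstimate {
    // Linear extrapolation: scales the measured time by the ratio of
    // target documents to measured documents.
    static double estimateSeconds(double measuredSeconds,
                                  long measuredDocs,
                                  long targetDocs) {
        return measuredSeconds * targetDocs / measuredDocs;
    }

    public static void main(String[] args) {
        // Figures from the message: 567 s for 10,000 docs, 600,000 docs expected.
        double seconds = estimateSeconds(567, 10_000, 600_000);
        System.out.println(seconds + " s");
        System.out.println(seconds / 60 + " min");
        System.out.println(seconds / 3600 + " h");
    }
}
```

With the measured figures this gives 34,020 seconds,
which is 567 minutes or about 9.45 hours.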

My machine: Pentium 4 CPU 2.40 GHz
            RAM 1 GB

Any help appreciated.

Thanks,
Aalap.


--- [EMAIL PROTECTED] wrote:
> On Wednesday 20 April 2005 04:07, Mufaddal Khumri
> wrote:
> > The 20000 products I mentioned are 20000 rows. I get
> > the products in bulk by using a limit clause.
> >
> > I am using hibernate with MySQL server on a 2.8GHz,
> > 1.00GB Ram machine.
> 
> Maybe your session-level cache in Hibernate is growing
> very large. Do you call Session.clear() periodically
> while indexing? Here's a link about batching and
> Hibernate:
>
> http://blog.hibernate.org/cgi-bin/blosxom.cgi/2004/08/
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [EMAIL PROTECTED]
> For additional commands, e-mail:
> [EMAIL PROTECTED]
> 
> 
