RE: frequency of commit when building index from scratch
But again, why would anyone get an OOM? I never have. What I discovered is this: committing millions of docs (in Solr 1.4) may take several days (although adding the docs takes a day) if you somehow have _many_segments_ and bad I/O with <= 2 CPUs. I am using a large ramBufferSizeMB instead of a large mergeFactor, and quad cores. Yes, I am using SolrJ with the binary format. It takes 20 minutes to commit millions of docs (including overwrites of existing ones with the same uniqueId), and I usually have 2 segments (>10 GB each).

-Fuad
http://www.casaGURU.com

> If you're using SolrJ, it's due to improvements there too:
> 1) binary format by default - no XML parsing
> 2) not used by default, but try using StreamingUpdateSolrServer
>
> -Yonik
> http://www.lucidimagination.com

>> Bill in most cases you probably cannot do one large commit as you will
>> hit OOM. How many documents can be uncommitted is based on the size of
>> the documents. Committing every document is slow. I have done a commit
>> every 10,000 mostly. Results may vary. Someone might have a better
>> answer than me.
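A rough way to see why a larger ramBufferSizeMB tends to leave fewer segments than a larger mergeFactor is the toy model below. It is only a sketch, not Lucene's actual merge policy: it assumes every flush writes one segment exactly the size of the RAM buffer, and that whenever mergeFactor segments sit at the same level they are merged into one. The function name and the sizes (20 GB of index data, 32 MB vs. 1024 MB buffers) are made up for illustration and do not come from the thread.

```python
def simulate_segments(total_mb, ram_buffer_mb, merge_factor):
    """Toy model: return the segment sizes (MB) left after indexing total_mb.

    Assumption: each flush creates a segment of ram_buffer_mb, and
    merge_factor segments at one level merge into a single segment
    at the next level. Real Lucene merging is more involved.
    """
    levels = [[]]  # levels[i] holds the sizes of segments at merge level i
    for _ in range(total_mb // ram_buffer_mb):
        levels[0].append(ram_buffer_mb)  # one flush = one new small segment
        level = 0
        # Cascade merges upward while a level is full.
        while len(levels[level]) >= merge_factor:
            merged = sum(levels[level])
            levels[level] = []
            if level + 1 == len(levels):
                levels.append([])
            levels[level + 1].append(merged)
            level += 1
    return sorted((s for lvl in levels for s in lvl), reverse=True)


small_buffer = simulate_segments(20480, 32, 10)
large_buffer = simulate_segments(20480, 1024, 10)
print(len(small_buffer))  # 10 segments left with the 32 MB buffer
print(large_buffer)       # [10240, 10240] with the 1024 MB buffer
```

In this model the small buffer needs 640 flushes and leaves 10 segments, while the large buffer needs only 20 flushes and leaves 2, so there is far less merge work left for a commit to wait on.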
Re: frequency of commit when building index from scratch
On Tue, Aug 25, 2009 at 8:37 PM, Lance Norskog wrote:
> The latest Solr 1.4 can index 200k records in several minutes, then commit
> in a few seconds. I don't know but I'm guessing it is due to Lucene
> improvements. It does not use much memory doing this.

If you're using SolrJ, it's due to improvements there too:
1) binary format by default - no XML parsing
2) not used by default, but try using StreamingUpdateSolrServer

-Yonik
http://www.lucidimagination.com
Re: frequency of commit when building index from scratch
The latest Solr 1.4 can index 200k records in several minutes, then commit in a few seconds. I don't know but I'm guessing it is due to Lucene improvements. It does not use much memory doing this.

Lance

On Tue, Aug 25, 2009 at 2:43 PM, Fuad Efendi wrote:
> I do commit once a day, millions of small docs... it takes 20 minutes on
> average... why OOM? I see only reduced I/O...
>
> -----Original Message-----
> From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
> Sent: August-25-09 5:35 PM
> To: solr-user@lucene.apache.org
> Subject: Re: frequency of commit when building index from scratch
>
> On Tue, Aug 25, 2009 at 5:29 PM, Bill Au wrote:
> > Just curious, how often do folks commit when building their Solr/Lucene
> > index from scratch for an index with millions of documents? Should I just
> > wait and do a single commit at the end after adding all the documents to
> > the index?
> >
> > Bill
>
> Bill in most cases you probably cannot do one large commit as you will
> hit OOM. How many documents can be uncommitted is based on the size of
> the documents. Committing every document is slow. I have done a commit
> every 10,000 mostly. Results may vary. Someone might have a better
> answer than me.

--
Lance Norskog
goks...@gmail.com
RE: frequency of commit when building index from scratch
I do commit once a day, millions of small docs... it takes 20 minutes on average... why OOM? I see only reduced I/O...

-----Original Message-----
From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
Sent: August-25-09 5:35 PM
To: solr-user@lucene.apache.org
Subject: Re: frequency of commit when building index from scratch

On Tue, Aug 25, 2009 at 5:29 PM, Bill Au wrote:
> Just curious, how often do folks commit when building their Solr/Lucene
> index from scratch for an index with millions of documents? Should I just
> wait and do a single commit at the end after adding all the documents to
> the index?
>
> Bill

Bill in most cases you probably cannot do one large commit as you will hit OOM. How many documents can be uncommitted is based on the size of the documents. Committing every document is slow. I have done a commit every 10,000 mostly. Results may vary. Someone might have a better answer than me.
Re: frequency of commit when building index from scratch
That's my gut feeling (start big and go lower if OOM occurs) too.

Bill

On Tue, Aug 25, 2009 at 5:34 PM, Edward Capriolo wrote:
> On Tue, Aug 25, 2009 at 5:29 PM, Bill Au wrote:
> > Just curious, how often do folks commit when building their Solr/Lucene
> > index from scratch for an index with millions of documents? Should I just
> > wait and do a single commit at the end after adding all the documents to
> > the index?
> >
> > Bill
>
> Bill in most cases you probably cannot do one large commit as you will
> hit OOM. How many documents can be uncommitted is based on the size of
> the documents. Committing every document is slow. I have done a commit
> every 10,000 mostly. Results may vary. Someone might have a better
> answer than me.
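The "start big and go lower if OOM occurs" approach can be sketched as a simple halving search. This is a minimal illustration under stated assumptions: `IndexerOutOfMemory`, `find_commit_interval`, and `fake_run` are hypothetical names, and the 30,000-document limit in `fake_run` is invented so the example has something to fail on. A real run would rebuild the index and watch for an actual out-of-memory failure.

```python
class IndexerOutOfMemory(Exception):
    """Hypothetical error standing in for an OOM during an indexing run."""


def find_commit_interval(run_indexing, start=1_000_000, minimum=1_000):
    """Try a large commit interval first; halve it each time the run hits OOM."""
    interval = start
    while interval >= minimum:
        try:
            run_indexing(interval)  # full indexing run, committing every `interval` docs
            return interval
        except IndexerOutOfMemory:
            interval //= 2  # go lower and try again
    raise RuntimeError("no workable commit interval at or above the minimum")


def fake_run(interval):
    # Invented stand-in: pretend anything above 30,000 uncommitted docs runs out of memory.
    if interval > 30_000:
        raise IndexerOutOfMemory(f"too many uncommitted docs: {interval}")


print(find_commit_interval(fake_run, start=100_000))  # 25000
```

Halving keeps the number of failed attempts small, at the cost of possibly settling on an interval up to half of what the machine could actually handle.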
Re: frequency of commit when building index from scratch
On Tue, Aug 25, 2009 at 5:29 PM, Bill Au wrote:
> Just curious, how often do folks commit when building their Solr/Lucene
> index from scratch for an index with millions of documents? Should I just
> wait and do a single commit at the end after adding all the documents to
> the index?
>
> Bill

Bill in most cases you probably cannot do one large commit as you will hit OOM. How many documents can be uncommitted is based on the size of the documents. Committing every document is slow. I have done a commit every 10,000 mostly. Results may vary. Someone might have a better answer than me.
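Committing every 10,000 documents, rather than after every document or only once at the end, could look roughly like the sketch below. `FakeIndexClient` is a made-up stand-in that only counts calls. It is not the SolrJ API, and a real client would send the documents to Solr. The `commit_every` and `batch_size` parameter names are also assumptions for the example.

```python
class FakeIndexClient:
    """Made-up stand-in for a Solr client; it only counts what it is given."""

    def __init__(self):
        self.pending = 0     # docs added but not yet committed
        self.committed = 0   # docs made visible by a commit
        self.commits = 0     # number of commit calls

    def add(self, docs):
        self.pending += len(docs)

    def commit(self):
        self.committed += self.pending
        self.pending = 0
        self.commits += 1


def index_with_periodic_commit(client, docs, commit_every=10_000, batch_size=1_000):
    """Send docs in batches and commit once at least commit_every docs are pending."""
    batch = []
    since_commit = 0
    for doc in docs:
        batch.append(doc)
        if len(batch) >= batch_size:
            client.add(batch)
            since_commit += len(batch)
            batch = []
            if since_commit >= commit_every:
                client.commit()
                since_commit = 0
    if batch:  # leftover partial batch
        client.add(batch)
        since_commit += len(batch)
    if since_commit:  # final commit for anything still uncommitted
        client.commit()


client = FakeIndexClient()
index_with_periodic_commit(client, ({"id": i} for i in range(25_500)))
print(client.committed, client.commits)  # 25500 3
```

Raising `commit_every` trades fewer, larger commits against more uncommitted documents held at once, which is the tuning knob the thread is discussing.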
frequency of commit when building index from scratch
Just curious, how often do folks commit when building their Solr/Lucene index from scratch for an index with millions of documents? Should I just wait and do a single commit at the end after adding all the documents to the index?

Bill