Yep, I was aware of the solr.commit.size setting. However, I'm not even getting commits that big. As my previous example shows, half of my batches are in the single digits, and 90% of them never break 100 documents.
I'm trying to figure out why they're so small.

On Tue, Apr 30, 2013 at 10:45 AM, Canan GİRGİN <[email protected]> wrote:

> >Secondly, it's adding my documents in small chunks. I was fetching in 100
> >document cycles and when I run solrindex I get messages such as the
> >following.
>
> In the nutch-default.xml file, there is a parameter for the Solr commit
> size. If you want to add bigger chunks to the index, you should increase
> this parameter:
>
> <property>
>   <name>solr.commit.size</name>
>   <value>250</value>
>   <description>
>   Defines the number of documents to send to Solr in a single update batch.
>   Decrease when handling very large documents to prevent Nutch from running
>   out of memory. NOTE: It does not explicitly trigger a server side commit.
>   </description>
> </property>
>
> On Thu, Apr 25, 2013 at 4:35 PM, Bai Shen <[email protected]> wrote:
>
> > I'm having two problems with the solrindex job in Nutch 2.1.
> >
> > When I run it with -all, it indexes every single parsed document, not
> > just the newly generated ones, as fetch and parse do.
> >
> > Secondly, it's adding my documents in small chunks. I was fetching in
> > 100 document cycles and when I run solrindex I get messages such as
> > the following.
> >
> > Adding 87 documents
> > Adding 5 documents
> > Adding 2 documents
> > Adding 3 documents
> > Adding 14 documents
> > Adding 34 documents
> > Adding 233 documents
> >
> > Any ideas what causes this and how to fix it?
> >
> > Thanks.
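For what it's worth, the usual convention is to leave nutch-default.xml
untouched and put overrides in conf/nutch-site.xml, which is read last.
A minimal sketch of such an override, assuming the stock Nutch 2.1 config
layout (the value 1000 is just an illustrative pick, not a recommendation):

    <?xml version="1.0"?>
    <!-- conf/nutch-site.xml: values here override nutch-default.xml.
         1000 is an illustrative value, not a tested recommendation. -->
    <configuration>
      <property>
        <name>solr.commit.size</name>
        <value>1000</value>
        <description>Send up to 1000 documents to Solr per update batch.
        This is only an upper bound on batch size; it does not force
        batches to be that large, and it does not trigger a server-side
        commit.</description>
      </property>
    </configuration>

Since the setting is only a cap, raising it wouldn't explain or fix the
tiny batches I'm seeing, which is why I'm still puzzled about what decides
the actual batch sizes.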

