CheckIndex prints these stats, including the document count for each
segment. Point it at your index directory:

java -cp lucene-core-WHATEVER.jar org.apache.lucene.index.CheckIndex /path/to/index
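
If you only want a quick estimate, the fdx rule of thumb quoted below is
easy to script. This is a rough sketch, not anything that ships with
Lucene: FdxDocEstimate is a made-up class name, and it assumes the .fdx
files are sitting in the index directory as separate files (not packed
into compound .cfs files) and that each document takes 8 bytes there.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene.
class FdxDocEstimate {

    // Assumption from the thread: one 8-byte entry per document in the .fdx file.
    // Integer division drops any leftover bytes that do not make up a full entry.
    static long estimateDocs(long fdxBytes) {
        return fdxBytes / 8;
    }

    public static void main(String[] args) throws IOException {
        // Index directory from the first argument, defaulting to the current directory.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        try (DirectoryStream<Path> fdxFiles = Files.newDirectoryStream(indexDir, "*.fdx")) {
            for (Path fdx : fdxFiles) {
                long docs = estimateDocs(Files.size(fdx));
                System.out.println(fdx.getFileName() + ": ~" + docs + " docs");
            }
        }
    }
}
```

Run it with the index directory as the only argument. Per the note below,
treat the result as exact before 4.0 and only approximate on 4.0.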

----- Original Message -----
| From: "Shawn Heisey" <s...@elyograg.org>
| To: solr-user@lucene.apache.org
| Sent: Monday, October 15, 2012 9:46:33 PM
| Subject: Re: How many documents in each Lucene segment?
| 
| On 10/15/2012 8:06 PM, Michael Ryan wrote:
| > Easiest way I know of without parsing any of the index files is to
| > take the size of the fdx file in bytes and divide by 8. This will
| > give you the exact number of documents before 4.0, and a close
| > approximation in 4.0.
| >
| > Though, the fdx file might not be on disk if you haven't committed.
| 
| When you are importing 12 million documents from a database, you get
| LOTS of completed segments even if there is no commit until the end.
| The ramBuffer fills up pretty quickly.
| 
| I intend to figure out how many documents are in the segments
| (ramBufferSizeMB=256) and try out an autoCommit setting a little bit
| lower than that.  I had trouble with autoCommit on previous versions,
| but with 4.0 I can turn off openSearcher, which may allow it to work
| right.
| 
| Thanks,
| Shawn
| 
| 
