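CheckIndex prints these stats, including the document count for each segment. Run it with the path to the index directory as the argument:

java -cp lucene-core-WHATEVER.jar org.apache.lucene.index.CheckIndex /path/to/index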
----- Original Message -----
| From: "Shawn Heisey" <s...@elyograg.org>
| To: solr-user@lucene.apache.org
| Sent: Monday, October 15, 2012 9:46:33 PM
| Subject: Re: How many documents in each Lucene segment?
|
| On 10/15/2012 8:06 PM, Michael Ryan wrote:
| > Easiest way I know of without parsing any of the index files is to
| > take the size of the fdx file in bytes and divide by 8. This will
| > give you the exact number of documents before 4.0, and a close
| > approximation in 4.0.
| >
| > Though, the fdx file might not be on disk if you haven't committed.
|
| When you are importing 12 million documents from a database, you get
| LOTS of completed segments even if there is no commit until the end.
| The ramBuffer fills up pretty quick.
|
| I intend to figure out how many documents are in the segments
| (ramBufferSizeMB=256) and try out an autoCommit setting a little bit
| lower than that. I had trouble with autoCommit on previous versions,
| but with 4.0 I can turn off openSearcher, which may allow it to work
| right.
|
| Thanks,
| Shawn
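
If you would rather try the fdx approach Michael describes, here is a minimal sketch of it in plain Java. It only looks at file sizes and does not use any Lucene API. The class name FdxDocCount and its methods are hypothetical helpers, and the divide-by-8 rule is taken from the quoted message as-is, so treat the output as an estimate. It also assumes the .fdx files exist as separate files in the index directory; with compound segments they are packed inside the .cfs file and will not be found.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: estimates per-segment document
// counts from .fdx file sizes, following the rule in the quoted message.
class FdxDocCount {

    // Assumption from the quoted message: the .fdx file holds 8 bytes per document.
    static final long BYTES_PER_DOC = 8L;

    // Estimated number of documents for an .fdx file of the given size.
    static long estimateDocs(long fdxSizeBytes) {
        return fdxSizeBytes / BYTES_PER_DOC;
    }

    // Maps each .fdx file name found in indexDir to its estimated document count.
    static Map<String, Long> estimatePerSegment(Path indexDir) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir, "*.fdx")) {
            for (Path fdx : files) {
                counts.put(fdx.getFileName().toString(), estimateDocs(Files.size(fdx)));
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // The index directory is the first argument; defaults to the current directory.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Map.Entry<String, Long> e : estimatePerSegment(indexDir).entrySet()) {
            System.out.println(e.getKey() + "\t~" + e.getValue() + " docs");
        }
    }
}
```

Run it against the index directory while the import is in progress to see roughly how many documents land in each flushed segment. CheckIndex remains the more reliable source for the real counts.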