Hey Juan, I had the same problem. If you check the nutch logs (in the logs folder in nutch) you will most likely see solrindex throwing errors on some of your documents.
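
Here is a rough sketch of one way to pull the error lines out of the log so they are easier to skim. It assumes the log is at logs/hadoop.log (the usual location for Nutch 1.x, but check your install) and that failures show up as lines containing "ERROR" or "Exception". The name find_error_lines is made up for this example and is not part of Nutch.

```python
from pathlib import Path

# Assumption: Nutch 1.x writes its log to logs/hadoop.log under the install dir.
LOG_PATH = Path("logs/hadoop.log")


def find_error_lines(lines, markers=("ERROR", "Exception")):
    """Return (line_number, text) pairs for lines containing any marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    if LOG_PATH.exists():
        # errors="replace" keeps the scan going if the log has odd bytes.
        with LOG_PATH.open(encoding="utf-8", errors="replace") as handle:
            for number, text in find_error_lines(handle):
                print(f"{number}: {text}")
    else:
        print(f"No log file at {LOG_PATH}; adjust LOG_PATH for your install.")
```

Run it from the Nutch directory right after a solrindex run; the number of failing documents it turns up may or may not match the gap you are seeing.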
In my case, some of the date formats on some of my docs weren't being parsed properly, so I had to create a patch (here is my bug entry: https://issues.apache.org/jira/browse/NUTCH-871).

You could be hitting a different error that points to a bug in a different part of the pipeline, but the logs are the place to start.

-Max

On Mon, Nov 1, 2010 at 7:23 PM, Juan Felix <[email protected]> wrote:
>
> Hi.
>
> I'm trying to index all the documents using solrindex command, but for some
> reason sometimes it doesn't index all the documents.
>
> For example, I saw the crawl db stats and it has 75,031 fetched pages but
> after index them to solr, the number of documents in solr are 74,827
>
> Any Idea? What about the other 204 pages that are not on solr?
>
> Thanks
> Juan Felix
>

