There is a good chance that the parse is failing. Check the stats of the segment that contains the large PDF. Also dump the segment and look at the result.
Then see whether Tika is able to parse the PDF or not ...
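
In case it helps, here is a rough sketch of those checks as shell helpers. It assumes a Nutch 1.x install where `bin/nutch` provides the `readseg` and `parsechecker` commands. The message above does not say which command to use for the Tika check, so `parsechecker` is my assumption, and the function names, segment path, and URL are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: assumes Nutch 1.x. Override NUTCH if bin/nutch lives elsewhere.
NUTCH="${NUTCH:-bin/nutch}"

# Print the stats for one segment (the listing includes fetched and parsed counts).
segment_stats() {
    "$NUTCH" readseg -list "$1"
}

# Dump one segment as plain text into an output directory for inspection.
segment_dump() {
    "$NUTCH" readseg -dump "$1" "$2"
}

# Fetch and parse a single URL outside the crawl to see whether the parser copes.
# Assumption: parsechecker is the intended check; the original message elides it.
check_parse() {
    "$NUTCH" parsechecker "$1"
}

# Example calls (hypothetical segment path and URL):
#   segment_stats crawl/segments/20140101000000
#   segment_dump  crawl/segments/20140101000000 segdump
#   check_parse   http://example.com/large.pdf
```

If the parsed count in the stats is lower than the fetched count, or the dump shows no parse text for the PDF, that would point to the parse step rather than to Solr.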

