There is a good chance that the parse is failing.

Check the stats of the segment that contains the large PDF. Also dump the
segment and look at the result.
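
As a rough sketch of those two checks, assuming a Nutch 1.x install: the
`readseg` tool can list per-segment counts and dump a segment to text. The
launcher path, segment path, and output directory below are placeholders,
not values from this thread, so adjust them to your crawl.

```shell
#!/bin/sh
# Hypothetical locations: point these at your own install and crawl data.
NUTCH="${NUTCH:-bin/nutch}"
SEGMENT="${SEGMENT:-crawl/segments/SEGMENT_NAME}"   # placeholder segment path
DUMP_DIR="${DUMP_DIR:-segdump}"                     # placeholder output dir

# Print each command; run it only if the nutch launcher is actually present.
run() {
    echo "+ $*"
    if [ -x "$NUTCH" ]; then
        "$@"
    fi
}

# 1. Segment stats: compare the fetched and parsed counts.
run "$NUTCH" readseg -list "$SEGMENT"

# 2. Dump the segment as text so the PDF's record can be inspected.
run "$NUTCH" readseg -dump "$SEGMENT" "$DUMP_DIR"
```

If the parsed count is lower than the fetched count, or the PDF's record in
the dump has no parse text, that points to the parse step rather than to Solr.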

do a 

and 


See whether Tika is able to parse the PDF or not.
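
One possible way to test this from the Nutch side, assuming Nutch 1.x with
parse-tika as the plugin handling PDFs in your configuration, is the
`parsechecker` tool, which fetches a single URL and runs it through the
configured parsers. The URL below is a placeholder, not the PDF from this
thread.

```shell
#!/bin/sh
# Hypothetical values: replace with your launcher path and the PDF's URL.
NUTCH="${NUTCH:-bin/nutch}"
PDF_URL="${PDF_URL:-http://example.com/large.pdf}"   # placeholder URL

if [ -x "$NUTCH" ]; then
    # -dumpText also prints the extracted text, so an empty result is visible.
    "$NUTCH" parsechecker -dumpText "$PDF_URL"
else
    echo "nutch launcher not found at $NUTCH" >&2
fi
```

An exception or an empty text dump here suggests the parser is where the
content is lost.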







--
View this message in context: 
http://lucene.472066.n3.nabble.com/Nutch-Solr-Pdf-content-is-not-getting-indexed-tp4125992p4126145.html
Sent from the Nutch - User mailing list archive at Nabble.com.
