Re: Nutch/Solr - Pdf content is not getting indexed

anupamk Fri, 21 Mar 2014 15:14:29 -0700

There is a good chance that the parse is failing.

Check the stats of the segment that contains the large PDF. Also dump the
segment and look at the result.


do a 

and 


See whether Tika is able to parse the PDF or not ...
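
For illustration, the checks described above could look roughly like the sketch below, using Nutch 1.x command-line tools (readseg and parsechecker). This is an assumption about the kind of commands meant, not a copy of the original ones. The segment path, dump directory, and PDF URL are hypothetical placeholders, and the small `step` helper is only there so the script prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical values: replace with your own Nutch launcher, segment and PDF URL.
NUTCH="${NUTCH:-bin/nutch}"
SEGMENT="${SEGMENT:-crawl/segments/20140321000000}"
PDF_URL="${PDF_URL:-http://example.com/large.pdf}"
DUMP_DIR="${DUMP_DIR:-segment_dump}"

# Print each command; actually execute it only when RUN=1 is set.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# 1. Segment stats: compare the fetched and parsed counts.
step "$NUTCH" readseg -list "$SEGMENT"

# 2. Dump the segment to text and look for the PDF's parse data.
step "$NUTCH" readseg -dump "$SEGMENT" "$DUMP_DIR"

# 3. Fetch and parse the single PDF URL to see whether Tika handles it.
step "$NUTCH" parsechecker -dumpText "$PDF_URL"
```

Run it with RUN=1 from the Nutch runtime directory to execute the commands. If the segment stats show the document as fetched but not parsed, or parsechecker reports an error, the parse step is the likely culprit.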







--
View this message in context: 
http://lucene.472066.n3.nabble.com/Nutch-Solr-Pdf-content-is-not-getting-indexed-tp4125992p4126145.html
Sent from the Nutch - User mailing list archive at Nabble.com.
