Nutch/Solr - PDF content is not indexed when the PDF is large. I am not getting any exceptions, but the content of the PDF is not indexed.
If I use a link to a small PDF that has no images or URLs in it, the content is indexed and shows up in Solr. But when I use links to PDFs with more content, the data is not indexed. I have changed file.content.limit in nutch-default.xml to -1 and http.content.size in nutch-site.xml to -1, but it did not help. I have also followed the threads below to get this working, without success. Any further help would be much appreciated:
http://grokbase.com/t/nutch/user/129ef77wa7/nutch-solr-pdf-getting-indexed-but-content-is-not-showing-in-solr
http://grokbase.com/t/nutch/user/131apskpxq/crawling-pdfs
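
For reference, the overrides described above would look roughly like the fragment below. This is only a sketch, under two assumptions that are not in the original setup: both properties are placed in conf/nutch-site.xml (rather than editing nutch-default.xml directly), and the HTTP limit is written as http.content.limit, which is the property name I believe nutch-default.xml defines. If http.content.size is not a name Nutch reads, an override under that name would likely have no effect, so it is worth checking against the shipped nutch-default.xml.

```xml
<!-- conf/nutch-site.xml: values here override nutch-default.xml -->
<configuration>
  <!-- Assumption: -1 disables truncation for file:// fetches -->
  <property>
    <name>file.content.limit</name>
    <value>-1</value>
  </property>
  <!-- Assumption: http.content.limit is the name used in nutch-default.xml;
       the post above names it http.content.size -->
  <property>
    <name>http.content.limit</name>
    <value>-1</value>
  </property>
</configuration>
```

Segments fetched before the change would still hold the truncated content, so the PDFs probably need to be fetched again before reindexing into Solr.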

