Please see below

On Sat, Jan 12, 2013 at 8:48 PM, Bayu Widyasanyata <[email protected]> wrote:
> > That's tomcat port for Solr.
> Should we activate the proxy setting?
> Is it already activated in nutch-site.xml?

No I do not think it should be activated unless you have a proxy running.

>
> > > But the strange is the whole status of documents fetched is 2.
> > This is fine, there is clearly no problem with fetching. It is a parsing problem for sure.
>
> So, why the PDF parser could not parsed completely to whole PDFs docs?
>

http.content.limit?
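
If http.content.limit is the cause, here is a rough sketch of the override I have in mind for nutch-site.xml. It assumes you are still on the default of 65536 bytes from nutch-default.xml, which would truncate larger PDFs before they reach the parser. The value -1 (no truncation) is only one possible choice; a sufficiently large positive byte count should work as well.

```xml
<!-- Goes inside the <configuration> element of nutch-site.xml.
     Overrides the default http.content.limit from nutch-default.xml. -->
<property>
  <name>http.content.limit</name>
  <!-- Assumed value for illustration: a negative number disables truncation. -->
  <value>-1</value>
</property>
```

Content that was already fetched under the old limit is stored truncated, so you would most likely need to fetch the affected PDFs again before parsing improves.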

