On Tue, Jan 15, 2013 at 11:28 PM, Lewis John Mcgibbney <[email protected]> wrote:
> Did you check the http.accept property in nutch-site.xml?

I copied the property from nutch-default.xml and then added application/pdf:

    <property>
      <name>http.accept</name>
      <value>text/html,application/xhtml+xml,application/xml,application/pdf;q=0.9,*/*;q=0.8</value>
      <description>Value of the "Accept" request header field.</description>
    </property>

The value also shows up in hadoop.log:

    2013-01-16 07:39:22,232 INFO http.Http - http.accept = text/html,application/xhtml+xml,application/xml,application/pdf;q=0.9,*/*;q=0.8

-- 
wassalam,
[bayu]
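For reference, here is a minimal sketch of how the Accept value above breaks down into media types and quality values under the usual HTTP rules, where a `q` parameter applies only to the media type it is attached to and defaults to 1.0 when absent. The helper name `parse_accept` is hypothetical and not part of Nutch; it only illustrates what the header asks for, not how Nutch or any particular server processes it.

```python
def parse_accept(header):
    """Split an Accept header value into (media_type, q) pairs."""
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        entries.append((media_type, q))
    return entries


# The value configured in nutch-site.xml, as quoted in this message.
accept = ("text/html,application/xhtml+xml,application/xml,"
          "application/pdf;q=0.9,*/*;q=0.8")

for media_type, q in parse_accept(accept):
    print(f"{media_type:25} q={q}")
```

Read this way, application/pdf is requested with q=0.9, application/xml now carries the default q=1.0 because the `;q=0.9` moved to the PDF entry, and `*/*` stays at q=0.8. Whether a given server pays attention to these weights is a separate question.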

