Markus Jelsma created NUTCH-2418:
------------------------------------
Summary: NPE in org.apache.hadoop.io.Text from FetcherThread
Key: NUTCH-2418
URL: https://issues.apache.org/jira/browse/NUTCH-2418
Project: Nutch
Issue Type: Bug
Components: fetcher
Affects Versions: 1.13
Reporter: Markus Jelsma
{code}
2017-09-05 15:28:54,539 INFO [FetcherThread] org.apache.nutch.fetcher.FetcherThread: FetcherThread 38 fetch of https://www.provinciegroningen.nl/fileadmin/user_upload/Documenten/Downloads/vanturfvntoervfol.pdf failed with: java.lang.NullPointerException
    at org.apache.hadoop.io.Text.encode(Text.java:450)
    at org.apache.hadoop.io.Text.encode(Text.java:431)
    at org.apache.hadoop.io.Text.writeString(Text.java:480)
    at org.apache.nutch.parse.ParseData.write(ParseData.java:168)
    at org.apache.nutch.parse.ParseImpl.write(ParseImpl.java:69)
    at org.apache.hadoop.io.GenericWritable.write(GenericWritable.java:142)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1157)
    at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:610)
    at org.apache.nutch.fetcher.FetcherThread.output(FetcherThread.java:773)
    at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:360)
{code}
I have never seen this before and have no idea what is going on. Opening this issue to track it.
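The trace is consistent with a null string reaching Text.writeString from ParseData.write; which field is null for this PDF is not established here. Below is a minimal sketch of that failure mode and of one possible guard. It uses only java.io, so it does not call Hadoop or Nutch: the class NullSafeWrite and the methods writeString, writeStringOrEmpty and serialize are hypothetical stand-ins, and the length prefix is a plain int rather than Hadoop's variable-length encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class NullSafeWrite {
    // Hypothetical stand-in for a length-prefixed string writer.
    // It dereferences its argument, so a null string throws NullPointerException.
    static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8); // NPE when s == null
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Hypothetical guard: substitute an empty string for a missing value
    // before handing it to the writer.
    static void writeStringOrEmpty(DataOutputStream out, String s) throws IOException {
        writeString(out, s == null ? "" : s);
    }

    // Serialize a single (possibly null) string field into a byte array.
    static byte[] serialize(String field) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            writeStringOrEmpty(out, field);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        try {
            // Unguarded write of a null field fails during serialization.
            writeString(new DataOutputStream(new ByteArrayOutputStream()), null);
        } catch (NullPointerException e) {
            System.out.println("unguarded write: NullPointerException");
        }
        // Guarded write treats null like the empty string.
        System.out.println("guarded write: " + serialize(null).length + " bytes");
    }
}
```

Whether an empty string is the right substitute, or whether the null should be prevented earlier in parsing, depends on what the affected field means and is left open.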