[
https://issues.apache.org/jira/browse/NUTCH-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-2585:
---------------------------------
Description:
Stumbled on this one just now:
{code}
2018-05-25 14:29:31,844 INFO [FetcherThread] org.apache.nutch.fetcher.FetcherThread: FetcherThread 42 fetch of http://www.ndcmediagroep.nl/wp-content/uploads/2017/03/Leaflet-Noflik-Wenje.pdf failed with: java.lang.NullPointerException
at org.apache.nutch.util.TrieStringMatcher$TrieNode.getChild(TrieStringMatcher.java:107)
at org.apache.nutch.util.SuffixStringMatcher.shortestMatch(SuffixStringMatcher.java:74)
at org.apache.nutch.urlfilter.suffix.SuffixURLFilter.filter(SuffixURLFilter.java:164)
at org.apache.nutch.net.URLFilters.filter(URLFilters.java:43)
at org.apache.nutch.fetcher.FetcherThread.handleRedirect(FetcherThread.java:487)
at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:404)
{code}
Edit, added on 1 May 2019: I got a slightly different stack trace this time:
{code}
2019-05-01 08:50:07,282 INFO [FetcherThread] org.apache.nutch.fetcher.FetcherThread: FetcherThread 38 fetch of https://kanaalstreek.nl/fzh/2018/06/04/vijf-maal-goud-voor-pegasus-op-nk failed with: java.lang.NullPointerException
at org.apache.nutch.util.TrieStringMatcher$TrieNode.getChild(TrieStringMatcher.java:107)
at org.apache.nutch.util.PrefixStringMatcher.shortestMatch(PrefixStringMatcher.java:79)
at org.apache.nutch.urlfilter.prefix.PrefixURLFilter.filter(PrefixURLFilter.java:73)
at org.apache.nutch.net.URLFilters.filter(URLFilters.java:43)
at org.apache.nutch.fetcher.FetcherThread.handleRedirect(FetcherThread.java:487)
at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:404)
{code}
was:
Stumbled on this one just now:
{code}
2018-05-25 14:29:31,844 INFO [FetcherThread] org.apache.nutch.fetcher.FetcherThread: FetcherThread 42 fetch of http://www.ndcmediagroep.nl/wp-content/uploads/2017/03/Leaflet-Noflik-Wenje.pdf failed with: java.lang.NullPointerException
at org.apache.nutch.util.TrieStringMatcher$TrieNode.getChild(TrieStringMatcher.java:107)
at org.apache.nutch.util.SuffixStringMatcher.shortestMatch(SuffixStringMatcher.java:74)
at org.apache.nutch.urlfilter.suffix.SuffixURLFilter.filter(SuffixURLFilter.java:164)
at org.apache.nutch.net.URLFilters.filter(URLFilters.java:43)
at org.apache.nutch.fetcher.FetcherThread.handleRedirect(FetcherThread.java:487)
at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:404)
{code}
> NPE in TrieStringMatcher
> ------------------------
>
> Key: NUTCH-2585
> URL: https://issues.apache.org/jira/browse/NUTCH-2585
> Project: Nutch
> Issue Type: Bug
> Affects Versions: 1.14
> Reporter: Markus Jelsma
> Priority: Major
> Fix For: 1.16
>
>
> Stumbled on this one just now:
> {code}
> 2018-05-25 14:29:31,844 INFO [FetcherThread] org.apache.nutch.fetcher.FetcherThread: FetcherThread 42 fetch of http://www.ndcmediagroep.nl/wp-content/uploads/2017/03/Leaflet-Noflik-Wenje.pdf failed with: java.lang.NullPointerException
> at org.apache.nutch.util.TrieStringMatcher$TrieNode.getChild(TrieStringMatcher.java:107)
> at org.apache.nutch.util.SuffixStringMatcher.shortestMatch(SuffixStringMatcher.java:74)
> at org.apache.nutch.urlfilter.suffix.SuffixURLFilter.filter(SuffixURLFilter.java:164)
> at org.apache.nutch.net.URLFilters.filter(URLFilters.java:43)
> at org.apache.nutch.fetcher.FetcherThread.handleRedirect(FetcherThread.java:487)
> at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:404)
> {code}
> Edit, added on 1 May 2019: I got a slightly different stack trace this time:
> {code}
> 2019-05-01 08:50:07,282 INFO [FetcherThread] org.apache.nutch.fetcher.FetcherThread: FetcherThread 38 fetch of https://kanaalstreek.nl/fzh/2018/06/04/vijf-maal-goud-voor-pegasus-op-nk failed with: java.lang.NullPointerException
> at org.apache.nutch.util.TrieStringMatcher$TrieNode.getChild(TrieStringMatcher.java:107)
> at org.apache.nutch.util.PrefixStringMatcher.shortestMatch(PrefixStringMatcher.java:79)
> at org.apache.nutch.urlfilter.prefix.PrefixURLFilter.filter(PrefixURLFilter.java:73)
> at org.apache.nutch.net.URLFilters.filter(URLFilters.java:43)
> at org.apache.nutch.fetcher.FetcherThread.handleRedirect(FetcherThread.java:487)
> at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:404)
> {code}