Author: ferdy
Date: Wed May 30 09:26:12 2012
New Revision: 1344168

URL: http://svn.apache.org/viewvc?rev=1344168&view=rev
Log:
NUTCH-1379 NPE when reprUrl is null in ParseUtil

Modified:
    nutch/branches/nutchgora/CHANGES.txt
    nutch/branches/nutchgora/src/java/org/apache/nutch/parse/ParseUtil.java

Modified: nutch/branches/nutchgora/CHANGES.txt
URL: http://svn.apache.org/viewvc/nutch/branches/nutchgora/CHANGES.txt?rev=1344168&r1=1344167&r2=1344168&view=diff
==============================================================================
--- nutch/branches/nutchgora/CHANGES.txt (original)
+++ nutch/branches/nutchgora/CHANGES.txt Wed May 30 09:26:12 2012
@@ -2,6 +2,8 @@ Nutch Change Log
 
 Release 2.1 (22/02/2012)
 
+* NUTCH-1379 NPE when reprUrl is null in ParseUtil (ferdy)
+
 * NUTCH-1378 HostDb NullPointerException (ferdy)
 
 * NUTCH-XX Commit to add configuration for separation of ant distribution targets (lewismc + jnioche)

Modified: nutch/branches/nutchgora/src/java/org/apache/nutch/parse/ParseUtil.java
URL: http://svn.apache.org/viewvc/nutch/branches/nutchgora/src/java/org/apache/nutch/parse/ParseUtil.java?rev=1344168&r1=1344167&r2=1344168&view=diff
==============================================================================
--- nutch/branches/nutchgora/src/java/org/apache/nutch/parse/ParseUtil.java (original)
+++ nutch/branches/nutchgora/src/java/org/apache/nutch/parse/ParseUtil.java Wed May 30 09:26:12 2012
@@ -209,7 +209,12 @@ public class ParseUtil extends Configure
           String reprUrl = URLUtil.chooseRepr(url, newUrl,
               refreshTime < FetcherJob.PERM_REFRESH_TIME);
           WebPage newWebPage = new WebPage();
-          page.setReprUrl(new Utf8(reprUrl));
+          if (reprUrl == null) {
+            LOG.warn("reprUrl==null for " + url);
+            return redirectedPage;
+          } else {
+            page.setReprUrl(new Utf8(reprUrl));
+          }
           page.putToMetadata(FetcherJob.REDIRECT_DISCOVERED, TableUtil.YES_VAL);
           redirectedPage = new URLWebPage(reprUrl, newWebPage);
         }
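
The null guard in the hunk above can be sketched in isolation as follows. This is a minimal, self-contained illustration and not the Nutch code: `Text`, `Page`, `chooseRepr`, and `applyRedirect` are hypothetical stand-ins (for `Utf8`, `WebPage`, `URLUtil.chooseRepr`, and the surrounding ParseUtil logic), and the assumption that `chooseRepr` returns null when either argument is null is made only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.logging.Logger;

public class ReprUrlGuardSketch {
  private static final Logger LOG =
      Logger.getLogger(ReprUrlGuardSketch.class.getName());

  /** Hypothetical stand-in for a UTF-8 string wrapper; throws NPE on null input. */
  static final class Text {
    final byte[] bytes;
    Text(String value) {
      this.bytes = value.getBytes(StandardCharsets.UTF_8);
    }
    @Override public String toString() {
      return new String(bytes, StandardCharsets.UTF_8);
    }
  }

  /** Hypothetical stand-in for the page record that stores the representative URL. */
  static final class Page {
    Text reprUrl;
    void setReprUrl(Text reprUrl) {
      this.reprUrl = reprUrl;
    }
  }

  /** Hypothetical chooser; assumed here to return null if either URL is null. */
  static String chooseRepr(String src, String dst, boolean temp) {
    if (src == null || dst == null) {
      return null;
    }
    return temp ? src : dst;
  }

  /** Sets the representative URL only when one was chosen; returns whether it was set. */
  static boolean applyRedirect(Page page, String url, String newUrl, boolean temp) {
    String reprUrl = chooseRepr(url, newUrl, temp);
    if (reprUrl == null) {
      // Without this check, new Text(null) would throw a NullPointerException.
      LOG.warning("reprUrl==null for " + url);
      return false;
    }
    page.setReprUrl(new Text(reprUrl));
    return true;
  }

  public static void main(String[] args) {
    Page page = new Page();
    System.out.println(applyRedirect(page, "http://a.example/", "http://b.example/", false));
    System.out.println(applyRedirect(new Page(), "http://a.example/", null, false));
  }
}
```

The sketch returns a boolean instead of the redirected page so that it stays small; the early exit with a warning is the part that mirrors the patch.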

