Author: lewismc
Date: Mon May 21 18:25:09 2012
New Revision: 1341137

URL: http://svn.apache.org/viewvc?rev=1341137&view=rev
Log:
commit to address NUTCH-1364 and update to CHANGES.txt

Modified:
    nutch/branches/nutchgora/CHANGES.txt
    nutch/branches/nutchgora/src/java/org/apache/nutch/crawl/GeneratorReducer.java

Modified: nutch/branches/nutchgora/CHANGES.txt
URL: http://svn.apache.org/viewvc/nutch/branches/nutchgora/CHANGES.txt?rev=1341137&r1=1341136&r2=1341137&view=diff
==============================================================================
--- nutch/branches/nutchgora/CHANGES.txt (original)
+++ nutch/branches/nutchgora/CHANGES.txt Mon May 21 18:25:09 2012
@@ -2,6 +2,8 @@ Nutch Change Log
 
 Release nutchgora - Current Development
 
+* NUTCH-1364 Add a counter for malformed urls (Jason Trost via lewismc)
+
 * NUTCH-1361 Fix mishandling of malformed urls in generator job (Jason Trost via lewismc)
 
 * NUTCH-1360 Support the storing of IP address connected to when web crawling (lewismc)

Modified: nutch/branches/nutchgora/src/java/org/apache/nutch/crawl/GeneratorReducer.java
URL: http://svn.apache.org/viewvc/nutch/branches/nutchgora/src/java/org/apache/nutch/crawl/GeneratorReducer.java?rev=1341137&r1=1341136&r2=1341137&view=diff
==============================================================================
--- nutch/branches/nutchgora/src/java/org/apache/nutch/crawl/GeneratorReducer.java (original)
+++ nutch/branches/nutchgora/src/java/org/apache/nutch/crawl/GeneratorReducer.java Mon May 21 18:25:09 2012
@@ -77,6 +77,7 @@ extends GoraReducer<SelectorEntry, WebPa
       try {
         context.write(TableUtil.reverseUrl(key.url), page);
       } catch (MalformedURLException e) {
+       context.getCounter("Generator", "MALFORMED_URL").increment(1);
         continue;
       }
       context.getCounter("Generator", "GENERATE_MARK").increment(1);
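
The counter pattern in this hunk can be sketched outside Hadoop as follows. This is a minimal illustration, not the Nutch code: the class MalformedUrlCounterSketch, the counters map, and the increment and generate helpers are hypothetical stand-ins for the reducer and for context.getCounter(...).increment(1), and plain java.net.URL parsing stands in for TableUtil.reverseUrl.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MalformedUrlCounterSketch {
    // Hypothetical stand-in for Hadoop job counters, keyed by "group:name".
    static final Map<String, Long> counters = new HashMap<>();

    // Mimics context.getCounter(group, name).increment(1).
    static void increment(String group, String name) {
        counters.merge(group + ":" + name, 1L, Long::sum);
    }

    // Walks the candidate URLs, skipping and counting the malformed ones.
    @SuppressWarnings("deprecation") // new URL(String) is deprecated on recent JDKs
    static List<String> generate(List<String> urls) {
        List<String> written = new ArrayList<>();
        for (String url : urls) {
            try {
                new URL(url); // assumption: parsing stands in for TableUtil.reverseUrl
                written.add(url);
            } catch (MalformedURLException e) {
                // Count the bad URL before skipping it, as the patch does.
                increment("Generator", "MALFORMED_URL");
                continue;
            }
            increment("Generator", "GENERATE_MARK");
        }
        return written;
    }
}
```

Because the increment sits before the continue, each skipped URL is counted exactly once, and GENERATE_MARK counts only the URLs that were actually written.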

