Author: lewismc
Date: Wed Nov 27 10:14:18 2013
New Revision: 1545982
URL: http://svn.apache.org/r1545982
Log:
NUTCH-1673 Title isn't reset in MoreIndexingFilter
Modified:
nutch/branches/2.x/CHANGES.txt
nutch/branches/2.x/src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java
Modified: nutch/branches/2.x/CHANGES.txt
URL:
http://svn.apache.org/viewvc/nutch/branches/2.x/CHANGES.txt?rev=1545982&r1=1545981&r2=1545982&view=diff
==============================================================================
--- nutch/branches/2.x/CHANGES.txt (original)
+++ nutch/branches/2.x/CHANGES.txt Wed Nov 27 10:14:18 2013
@@ -2,6 +2,8 @@ Nutch Change Log
Current Development
+* NUTCH-1673 Title isn't reset in MoreIndexingFilter (Nguyen Manh Tien via lewismc)
+
* NUTCH-1621 Remove deprecated class o.a.n.crawl.Crawler (Rui Gao via jnioche)
* NUTCH-1651 modifiedTime and prevmodifiedTime never set (Talat UYARER via lewismc)
Modified: nutch/branches/2.x/src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java
URL:
http://svn.apache.org/viewvc/nutch/branches/2.x/src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java?rev=1545982&r1=1545981&r2=1545982&view=diff
==============================================================================
--- nutch/branches/2.x/src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java (original)
+++ nutch/branches/2.x/src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java Wed Nov 27 10:14:18 2013
@@ -258,6 +258,7 @@ public class MoreIndexingFilter implemen
       for (int i = 0; i < patterns.length; i++) {
         if (matcher.contains(contentDisposition.toString(), patterns[i])) {
           result = matcher.getMatch();
+          doc.removeField("title");
           doc.add("title", result.group(1));
           break;
         }
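
Note on the change above: a minimal sketch of why the added removeField call matters, assuming that add appends a value to a multi-valued field rather than replacing it. SketchDoc, TitleResetSketch, retitle, and the sample title strings are hypothetical stand-ins written for illustration; they are not the Nutch classes touched by this commit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a multi-valued index document; not the Nutch class. */
class SketchDoc {
  private final Map<String, List<String>> fields = new HashMap<>();

  /** Appends a value; any values already stored under the field are kept. */
  void add(String name, String value) {
    fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
  }

  /** Drops every value stored under the field name. */
  void removeField(String name) {
    fields.remove(name);
  }

  /** Returns the values stored under the field name, or an empty list. */
  List<String> values(String name) {
    return fields.getOrDefault(name, new ArrayList<>());
  }
}

public class TitleResetSketch {
  /** Sets a new title, optionally clearing the earlier one first. */
  static List<String> retitle(boolean reset) {
    SketchDoc doc = new SketchDoc();
    // Assumed: an earlier indexing step has already stored a title.
    doc.add("title", "Title from an earlier filter");
    if (reset) {
      doc.removeField("title"); // mirrors the line added in this commit
    }
    doc.add("title", "report.pdf"); // assumed Content-Disposition file name
    return doc.values("title");
  }

  public static void main(String[] args) {
    System.out.println(retitle(false)); // both titles are kept
    System.out.println(retitle(true));  // only the new title remains
  }
}
```

Under that assumption, skipping the reset leaves two title values on the document, while resetting first leaves only the one taken from the Content-Disposition match.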