Author: jnioche
Date: Wed Dec 22 16:59:17 2010
New Revision: 1051985
URL: http://svn.apache.org/viewvc?rev=1051985&view=rev
Log:
NUTCH-936 LanguageIdentifier should not set empty lang field on NutchDocument
(Markus Jelsma via jnioche)
Modified:
nutch/trunk/CHANGES.txt
nutch/trunk/src/plugin/languageidentifier/src/java/org/apache/nutch/analysis/lang/LanguageIndexingFilter.java
Modified: nutch/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/nutch/trunk/CHANGES.txt?rev=1051985&r1=1051984&r2=1051985&view=diff
==============================================================================
--- nutch/trunk/CHANGES.txt (original)
+++ nutch/trunk/CHANGES.txt Wed Dec 22 16:59:17 2010
@@ -2,6 +2,8 @@ Nutch Change Log
Release 2.0 - Current Development
+* NUTCH-936 LanguageIdentifier should not set empty lang field on NutchDocument (Markus Jelsma via jnioche)
+
* NUTCH-949 Conflicting ANT jars in classpath (jnioche)
* NUTCH-825 Publish nutch artifacts to central maven repository (mattmann)
Modified: nutch/trunk/src/plugin/languageidentifier/src/java/org/apache/nutch/analysis/lang/LanguageIndexingFilter.java
URL: http://svn.apache.org/viewvc/nutch/trunk/src/plugin/languageidentifier/src/java/org/apache/nutch/analysis/lang/LanguageIndexingFilter.java?rev=1051985&r1=1051984&r2=1051985&view=diff
==============================================================================
--- nutch/trunk/src/plugin/languageidentifier/src/java/org/apache/nutch/analysis/lang/LanguageIndexingFilter.java (original)
+++ nutch/trunk/src/plugin/languageidentifier/src/java/org/apache/nutch/analysis/lang/LanguageIndexingFilter.java Wed Dec 22 16:59:17 2010
@@ -70,7 +70,7 @@ public class LanguageIndexingFilter impl
       lang = Bytes.toString(blang.array());
     }
-    if (lang == null) {
+    if (lang == null || lang.length() == 0) {
       lang = "unknown";
     }
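
The guard changed in the hunk above can be read as a small normalization step: a language value that is null or empty is replaced by the literal "unknown" before it is used. The following is a minimal standalone sketch of that check, not the Nutch code itself. The class name LangFallback and the method normalizeLang are hypothetical and exist only for illustration; only the condition and the "unknown" literal are taken from the diff.

```java
// Illustrative sketch only. LangFallback and normalizeLang are hypothetical
// names and are not part of the Nutch source tree.
final class LangFallback {

  // Literal taken from the diff above.
  static final String UNKNOWN = "unknown";

  // Returns "unknown" when the detected language is null or empty,
  // otherwise returns the value unchanged.
  public static String normalizeLang(String lang) {
    if (lang == null || lang.length() == 0) {
      return UNKNOWN;
    }
    return lang;
  }

  public static void main(String[] args) {
    System.out.println(normalizeLang(null)); // prints "unknown"
    System.out.println(normalizeLang(""));   // prints "unknown"
    System.out.println(normalizeLang("en")); // prints "en"
  }
}
```

Before this revision only the null case was covered, so an empty string would pass through the check unchanged. Note that the sketch, like the patched condition, does not trim whitespace: a value such as " " is still returned as-is.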