Hello Nutchers!
I'm trying to build a plugin that lets me add some fields to the database
(MongoDB).
I use Nutch 2.x.
I updated webpage.avsc with my custom fields, for example:
{
  "name": "pageLength",
  "type": [
    "null",
    "string"
  ],
  "doc": "description",
  "default": null
}
I also added
<field name="pageLength" docfield="pageLength" type="string"/>
to gora-mongodb-mapping.xml.
Finally, this is my plugin code:
package org.apache.nutch.indexer;

import java.util.Collection;
import java.util.HashSet;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.nutch.indexer.IndexingFilter;
import org.apache.nutch.indexer.NutchDocument;
import org.apache.nutch.storage.WebPage;
import org.apache.nutch.storage.WebPage.Field;

public class AddField implements IndexingFilter {

  private static final Log LOG = LogFactory.getLog(AddField.class);

  // WebPage fields this filter reads; only the parsed text is needed.
  private static final Collection<Field> FIELDS = new HashSet<Field>();

  static {
    FIELDS.add(Field.TEXT);
  }

  private Configuration conf;

  // Boilerplate
  public Configuration getConf() {
    return conf;
  }

  // Boilerplate
  public void setConf(Configuration conf) {
    this.conf = conf;
  }

  @Override
  public Collection<Field> getFields() {
    return FIELDS;
  }

  @Override
  public NutchDocument filter(NutchDocument doc, String url, WebPage page)
      throws IndexingException {
    // Pages without parsed text are passed through unchanged.
    if (page.getText() == null) {
      return doc;
    }
    String content = page.getText().toString();
    // Adds the new field to the document.
    String len = Integer.toString(content.length());
    doc.add("pageLength", len);
    return doc;
  }
}
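For reference, the length logic in filter() can be tried on its own, without Nutch on the classpath. This is only a minimal sketch: the class name PageLengthSketch and the method pageLength are made up for illustration and are not part of Nutch, and the null guard is my assumption about how a page with no parsed text should be handled.

```java
// Hypothetical helper, not part of Nutch: mirrors the length logic in filter().
final class PageLengthSketch {

  private PageLengthSketch() {
    // Static helper only; no instances.
  }

  /**
   * Returns the character count of the text as a string,
   * or null when there is no text (assumed case: page has no parsed text).
   */
  static String pageLength(CharSequence text) {
    if (text == null) {
      return null;
    }
    return Integer.toString(text.toString().length());
  }
}
```

Calling PageLengthSketch.pageLength("hello") returns the string "5", which is the value that would be passed to doc.add("pageLength", ...).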
The problem is that when I run
ant eclipse
the build fails with this error output:
[ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
[taskdef] Could not load definitions from resource org/sonar/ant/antlib.xml. It could not be found.
copy-libs:
compile-core:
[javac] Compiling 257 source files to /Users/lsroudiSpinergie/Documents/workspace/2.x/build/classes
[javac] warning: [options] bootstrap class path not set in conjunction with -source 1.7
[javac] /Users/lsroudiSpinergie/Documents/workspace/2.x/src/java/org/apache/nutch/parse/ParserChecker.java:112: error: cannot find symbol
[javac] if (!protocolOutput.getStatus().isSuccess()) {
[javac] ^
[javac] symbol: method isSuccess()
[javac] location: class ProtocolStatus
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 1 error
[javac] 1 warning
BUILD FAILED
Any help would be appreciated.