The following page has been changed by RenaudRichardet:
http://wiki.apache.org/nutch/CreateNewFilter

New page:

= CreateNewFilter =

How to add category metadata to your index and search on it. For this, you need to write an indexing filter and a query filter.

== Indexing your custom metadata ==

For the indexing filter, copy the index-more plugin and change the names, directories, and build files appropriately. The main thing to change is the filter method:

{{{
public Document filter(Document doc, Parse parse, FetcherOutput fo)
}}}

In it, you can add your own fields and then return the document. To add a new category field with the value "puppies", the code looks something like this:

{{{
doc.add(new Field("category", "puppies", false, true, false));
}}}

See the Lucene Field constructor API for the meaning of the booleans (in order: store, index, tokenize). That's pretty much it for indexing.

== Searching your metadata ==

To search on this field, you need to create a query filter. Copy the query-site plugin. Again, change file names, directories, and build files as needed. The main Java file is very simple: just change the string in the line with "super". Instead of:

{{{
super("site");
}}}

you would have:

{{{
super("category");
}}}

Make sure that you add your new index-category and query-category plugins to the plugin.includes property in your nutch-default.xml file. Don't forget to check that the updated file is in your WEB-INF/classes directory too.

Credits: HowieWang

Thread: http://www.nabble.com/index-search-filtering-by-category-tf2136864.html
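
The two halves of the recipe (tag each document with a "category" field at index time, then match that field at query time) can be illustrated without Nutch on the classpath. The following is only a rough sketch under stated assumptions: SimpleDoc, filter and search are hypothetical stand-ins written for this example, not the Nutch or Lucene API, and the "category:puppies" query form is assumed by analogy with "site:" queries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: none of these types are Nutch or Lucene classes.
class CategoryFilterSketch {

    /** Hypothetical stand-in for an indexed document: field name -> untokenized value. */
    static final class SimpleDoc {
        final Map<String, String> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.put(name, value);
        }
    }

    /** Plays the role of the indexing filter: add the custom field, return the same document. */
    static SimpleDoc filter(SimpleDoc doc, String category) {
        doc.add("category", category);
        return doc;
    }

    /** Plays the role of the query filter: exact match on a raw "field:value" clause. */
    static List<SimpleDoc> search(List<SimpleDoc> index, String clause) {
        int colon = clause.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected field:value, got " + clause);
        }
        String field = clause.substring(0, colon);
        String value = clause.substring(colon + 1);
        List<SimpleDoc> hits = new ArrayList<>();
        for (SimpleDoc doc : index) {
            // The value is compared as a whole, like an indexed but untokenized field.
            if (value.equals(doc.fields.get(field))) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<SimpleDoc> index = new ArrayList<>();

        SimpleDoc puppyPage = new SimpleDoc();
        puppyPage.add("url", "http://example.org/puppies");
        index.add(filter(puppyPage, "puppies"));

        SimpleDoc kittenPage = new SimpleDoc();
        kittenPage.add("url", "http://example.org/kittens");
        index.add(filter(kittenPage, "kittens"));

        // Only the page tagged "puppies" matches the clause.
        for (SimpleDoc hit : search(index, "category:puppies")) {
            System.out.println(hit.fields.get("url"));
        }
    }
}
```

In a real plugin the category value would presumably be derived from the parse or fetch data passed to the filter method rather than hard-coded; how to compute it is outside the scope of this page.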