Please do share it. I'd appreciate it, and I guess a lot of others would as well. And I bet it could even be enhanced by the community. :-)
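
For anyone following along, here is a rough sketch of how I understand the idea from the quoted message below: rules of the form "regexp=category", with the first matching rule deciding the category. This is not Ernesto's code. The class name UrlCategoryMapper and its methods are made up for illustration, and it is plain Java with no Nutch wiring at all. One assumption worth stating: the sketch splits each line on the last '=' by hand instead of using java.util.Properties, because Properties.load() also treats an unescaped ':' as a key/value separator, so the "http:" in the example rule would have to be written as "http\:" in a real .properties file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch: maps URLs to categories using "regexp=category" rules. */
public class UrlCategoryMapper {

    /** One rule: a compiled URL pattern and the category it maps to. */
    private static final class Rule {
        final Pattern pattern;
        final String category;

        Rule(Pattern pattern, String category) {
            this.pattern = pattern;
            this.category = category;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /**
     * Builds a mapper from rule lines. Blank lines and lines starting
     * with '#' are skipped. Each rule is split on its LAST '=' so that
     * the URL regexp itself may contain ':' (assumed rule format).
     */
    public static UrlCategoryMapper fromLines(List<String> lines) {
        UrlCategoryMapper mapper = new UrlCategoryMapper();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int eq = line.lastIndexOf('=');
            if (eq <= 0 || eq == line.length() - 1) {
                continue; // malformed rule: missing regexp or category
            }
            String regexp = line.substring(0, eq).trim();
            String category = line.substring(eq + 1).trim();
            mapper.rules.add(new Rule(Pattern.compile(regexp), category));
        }
        return mapper;
    }

    /** Returns the category of the first rule matching the whole URL, or null. */
    public String categoryFor(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).matches()) {
                return rule.category;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        UrlCategoryMapper mapper = fromLines(Arrays.asList(
                "http://www.misite.com/videos/.*=videos"));
        // prints "videos"
        System.out.println(mapper.categoryFor("http://www.misite.com/videos/clip1.html"));
        // prints "null" (no rule matches)
        System.out.println(mapper.categoryFor("http://www.misite.com/news/today.html"));
    }
}
```

In an indexing filter, the returned category would presumably be added to the document as the "category" field, the same way the hard-coded "test" value is added in the quoted MoreIndexingFilter change.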

Regards,
 Stefan

Ernesto De Santis wrote:
> I did a url-category-indexer.
>
> It works with a .properties file that maps URLs, written as regexps, to
> categories.
> Example:
>
> http://www.misite.com/videos/.*=videos
>
> If it seems useful, I can share it.
>
> Maybe it would be better to configure it in an .xml file.
>
> Regards,
> Ernesto.
>
> Stefan Neufeind wrote:
>> Alvaro Cabrerizo wrote:
>>
>>> Have you included a node describing your new searcher filter in
>>> plugin.xml?
>>>
>>> 2006/10/11, xu nutch <[EMAIL PROTECTED]>:
>>>
>>>> I have a question about my plugin for indexfilter and queryfilter.
>>>> Can you help me?
>>>> -------------------------------------
>>>> Added in MoreIndexingFilter.java:
>>>> doc.add(new Field("category", "test", false, true, false));
>>>> -------------------------------------
>>>>
>>>> --------------------------------------
>>>>
>>>>
>>>> package org.apache.nutch.searcher.more;
>>>>
>>>> import org.apache.nutch.searcher.RawFieldQueryFilter;
>>>>
>>>> /** Handles "category:" query clauses, causing them to search the
>>>> field indexed by
>>>>  * BasicIndexingFilter. */
>>>> public class CategoryQueryFilter extends RawFieldQueryFilter {
>>>>   public CategoryQueryFilter() {
>>>>     super("category");
>>>>   }
>>>> }
>>>> -----------------------------------------------
>>>> -----------------------------------------------
>>>>
>>>> <property>
>>>>   <name>plugin.includes</name>
>>>>   <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html)|index-(basic|more)|query-(basic|site|url|more)</value>
>>>>
>>>>
>>>>   <description>Regular expression naming plugin directory names to
>>>>   include. Any plugin not matching this expression is excluded.
>>>>   In any case you need at least include the nutch-extensionpoints
>>>>   plugin. By
>>>>   default Nutch includes crawling just HTML and plain text via HTTP,
>>>>   and basic indexing and search plugins.
>>>>   </description>
>>>> </property>
>>>> -----------------------------------------------
>>>>
>>>> Querying "category:test" with Luke works,
>>>> but the same query through the Tomcat web site
>>>> returns no results.
>>>>
>>
>> In case you get the search working:
>> How do you plan to categorize URLs/sites? I'm looking for a solution
>> there, since I haven't yet managed to implement something
>> URL-prefix-filter based to map categories to URLs.
>>
>>
>> Regards,
>>  Stefan

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
