Thanks to both of you for replying! What's a meta tag? Is it something specific to Nutch, rather than a Lucene field?

I suppose that by implementing IndexingFilter.filter:

    filter(Document doc, Parse parse, UTF8 url, CrawlDatum datum, Inlinks inlinks)

I can add my field to the doc instance.

Well, it seems the way forward is to try, crash, and try again... :)

Thanks,
Ernesto.

Chris Stephens wrote:
> You can't do it unless you write a plugin to parse a custom meta tag
> called category.
>
> I'm trying to do something like this now, but the plugin documentation
> is horrible.
>
> Lourival Júnior wrote:
>> Hi Ernesto!
>>
>> I know what you mean. Sometimes I get no answers either. Unfortunately,
>> I'm new to Nutch and Lucene and I can't help you. Keep trying, the
>> community will help you :).
>>
>> On 8/22/06, Ernesto De Santis <[EMAIL PROTECTED]> wrote:
>>>
>>> Hi all,
>>>
>>> Please, can somebody answer my questions?
>>> I'm a Nutch beginner; I hope my questions/doubts are easy... ;)
>>>
>>> Or if my email is wrong, tell me. Or confirm whether I'm on the
>>> right track.
>>>
>>> Thanks a lot!
>>> Ernesto.
>>>
>>> Ernesto De Santis wrote:
>>> > Hi
>>> >
>>> > I'm new to Nutch; I started yesterday.
>>> > But I have experience with Lucene.
>>> >
>>> > I have some questions for you, the Nutch experts... ;)
>>> >
>>> > I want to split my result pages into categories, to filter them or
>>> > to show them separately.
>>> > This is my approach:
>>> >
>>> > *crawl/index*
>>> >
>>> > I want to index an extra field.
>>> > For that, I need to write my own plugin containing my custom logic.
>>> > Then I configure my plugin in conf/nutch-site.xml.
>>> >
>>> > To develop my plugin, I see that I need to implement the Configurable
>>> > <http://lucene.apache.org/hadoop/docs/api/org/apache/hadoop/conf/Configurable.html>,
>>> > IndexingFilter
>>> > <http://lucene.apache.org/nutch/apidocs-0.8/org/apache/nutch/indexer/IndexingFilter.html>,
>>> > and Pluggable
>>> > <http://lucene.apache.org/nutch/apidocs-0.8/org/apache/nutch/plugin/Pluggable.html>
>>> > interfaces.
>>> >
>>> > Then I add the field value, the category value, to the Document
>>> > instance.
>>> >
>>> > *search*
>>> >
>>> > Here I have a doubt. One way is to set a required term on the Nutch
>>> > query:
>>> >
>>> > query.addRequiredTerm(myCategory, "category");
>>> >
>>> > I see that Nutch uses QueryFilters too, but I can't see how to hook
>>> > one into my query.
>>> >
>>> > *miscellaneous*
>>> >
>>> > Lucene has a rich query hierarchy; I don't see it in Nutch. I don't
>>> > see BooleanQuery, TermQuery, etc. Is the Query class the only place
>>> > to build the query in Nutch?
>>> >
>>> > The Lucene searcher has a way to separate the query from the filters.
>>> > Query conditions affect the ranking, and filters don't. How does
>>> > Nutch separate them?
>>> >
>>> > *documentation*
>>> >
>>> > I read the documentation on the Nutch site, the tutorial, the wiki,
>>> > the presentations, and the today.java.net article:
>>> > http://today.java.net/pub/a/today/2006/01/10/introduction-to-nutch-1.html
>>> > and part 2 too.
>>> >
>>> > A lot of details aren't covered there. Does anybody know of more
>>> > detailed documentation?
>>> >
>>> > Thanks a lot.
>>> > Ernesto.

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
