Hi all,
Could somebody please answer my questions?
I'm a Nutch beginner, so I hope my questions are easy... ;)
If something in my email is wrong, please tell me, or confirm that I'm on the right track.
Thanks a lot!
Ernesto.
Ernesto De Santis wrote:
Hi
I'm new to Nutch; I started yesterday.
But I have experience with Lucene.
I have some questions for you, the Nutch experts... ;)
I want to split my result pages into categories, so that I can filter
them or show them separately.
This is my approach:
*crawl/index*
I want to index an extra field.
For that, I need to write my own plugin containing my custom logic,
and then configure the plugin in conf/nutch-site.xml.
To develop my plugin, I see that I need to implement the Configurable
<http://lucene.apache.org/hadoop/docs/api/org/apache/hadoop/conf/Configurable.html>,
IndexingFilter
<http://lucene.apache.org/nutch/apidocs-0.8/org/apache/nutch/indexer/IndexingFilter.html>,
and Pluggable
<http://lucene.apache.org/nutch/apidocs-0.8/org/apache/nutch/plugin/Pluggable.html> interfaces.
In the filter, I add the category value to the Document instance as a new field.
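As a rough sketch of the custom logic I have in mind (plain Java, no Nutch classes; the CategoryResolver class, the URL prefixes, and the category names are all made up for illustration), the filter could derive the category from the page URL and then add the result to the Document as the "category" field:

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: decides the category value for a page from its URL.
// The plugin's IndexingFilter implementation would call categoryFor(url)
// and store the returned string in the "category" field.
class CategoryResolver {
    // Path prefix -> category name. These mappings are invented examples.
    private static final Map<String, String> PREFIXES =
            new LinkedHashMap<String, String>();
    static {
        PREFIXES.put("/news/", "news");
        PREFIXES.put("/docs/", "documentation");
    }

    static final String DEFAULT_CATEGORY = "other";

    static String categoryFor(String url) {
        if (url == null) {
            return DEFAULT_CATEGORY;
        }
        String path;
        try {
            path = URI.create(url).getPath();
        } catch (IllegalArgumentException e) {
            return DEFAULT_CATEGORY; // URL could not be parsed
        }
        if (path == null) {
            return DEFAULT_CATEGORY; // opaque URI such as mailto:
        }
        // First matching prefix wins (LinkedHashMap keeps insertion order).
        for (Map.Entry<String, String> entry : PREFIXES.entrySet()) {
            if (path.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return DEFAULT_CATEGORY;
    }

    public static void main(String[] args) {
        System.out.println(categoryFor("http://example.com/news/today.html"));
        System.out.println(categoryFor("http://example.com/about.html"));
    }
}
```

Keeping this logic outside the plugin class means I can test it without running a crawl; the plugin itself would only do the wiring.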
*search*
Here I have a doubt. One way is to add a required term to the Nutch query:
query.addRequiredTerm(myCategory, "category");
I see that Nutch uses QueryFilters too, but I can't see how to hook
one into my query.
*miscellaneous*
Lucene has a rich query hierarchy, but I don't see it in Nutch. I don't
see BooleanQuery, TermQuery, etc. Is the Query class the only place to
build a query in Nutch?
A Lucene searcher has a way to separate the query from the filters: the
query conditions affect the ranking, and the filters don't. How does
Nutch separate them?
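To make the distinction I mean concrete, here is a toy sketch in plain Java. It is not Lucene or Nutch code: the QueryVsFilterDemo class, the term-count scoring, and the sample documents are all invented for illustration. The query term contributes to the score, while the category filter only decides whether a document is kept:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: shows a scoring condition versus a non-scoring filter.
class QueryVsFilterDemo {
    static class Doc {
        final String text;
        final String category;
        Doc(String text, String category) {
            this.text = text;
            this.category = category;
        }
    }

    static class Hit {
        final Doc doc;
        final int score;
        Hit(Doc doc, int score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Query condition: the score is the number of occurrences of the term.
    static int score(Doc doc, String term) {
        int count = 0;
        for (String word : doc.text.toLowerCase().split("\\s+")) {
            if (word.equals(term.toLowerCase())) {
                count++;
            }
        }
        return count;
    }

    // categoryFilter may be null (no filter). The filter is a yes/no test
    // and never changes the score of the documents it lets through.
    static List<Hit> search(List<Doc> docs, String term, String categoryFilter) {
        List<Hit> hits = new ArrayList<Hit>();
        for (Doc doc : docs) {
            if (categoryFilter != null && !categoryFilter.equals(doc.category)) {
                continue; // filtered out, score not even computed
            }
            int s = score(doc, term);
            if (s > 0) {
                hits.add(new Hit(doc, s));
            }
        }
        hits.sort((a, b) -> Integer.compare(b.score, a.score)); // best first
        return hits;
    }

    // Made-up sample data.
    static List<Doc> sampleDocs() {
        List<Doc> docs = new ArrayList<Doc>();
        docs.add(new Doc("nutch nutch crawler", "news"));
        docs.add(new Doc("nutch search", "docs"));
        docs.add(new Doc("lucene index", "news"));
        return docs;
    }

    public static void main(String[] args) {
        for (Hit hit : search(sampleDocs(), "nutch", "docs")) {
            System.out.println(hit.doc.text + " score=" + hit.score);
        }
    }
}
```

A document that passes the filter gets the same score it would get without the filter; only the set of results shrinks. That is the behaviour I would like for my category field.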
*documentation*
I have read the documentation on the Nutch site, the tutorial, the wiki,
the presentations, and the today.java.net article:
http://today.java.net/pub/a/today/2006/01/10/introduction-to-nutch-1.html
and part 2 too.
A lot of details aren't covered there. Does anybody know of more detailed
documentation?
Thanks a lot.
Ernesto.