Thanks to both of you for responding!

What's a meta tag?
Is it something specific to Nutch, rather than a Lucene field?

I suppose that by implementing IndexingFilter.filter:

filter(Document doc, Parse parse, UTF8 url, CrawlDatum datum, Inlinks inlinks)

I can add my field to the doc instance.
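
To make the idea concrete, here is a minimal, self-contained sketch of what such a filter would do. It does not use the real Nutch or Lucene classes: Doc is a stand-in for the Lucene Document, filter takes only the document and the URL, and categoryFor (which takes the first path segment of the URL as the category) is a purely hypothetical rule, as is the class name. A real plugin would implement IndexingFilter and add a Lucene field instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: stand-in types, not the Nutch/Lucene API.
class CategoryFilterSketch {

    // Stand-in for a Lucene Document: a map of field name -> value.
    static class Doc {
        final Map<String, String> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.put(name, value);
        }

        String get(String name) {
            return fields.get(name);
        }
    }

    // Hypothetical rule: the category is the first path segment of the URL.
    static String categoryFor(String url) {
        int schemeEnd = url.indexOf("://");
        int hostStart = schemeEnd < 0 ? 0 : schemeEnd + 3;
        int pathStart = url.indexOf('/', hostStart);
        if (pathStart < 0 || pathStart == url.length() - 1) {
            return "uncategorized"; // no path at all, or just "/"
        }
        int next = url.indexOf('/', pathStart + 1);
        String segment = next < 0
                ? url.substring(pathStart + 1)
                : url.substring(pathStart + 1, next);
        return segment.isEmpty() ? "uncategorized" : segment;
    }

    // Same shape as the filter idea: take the document, add the extra
    // field, and return the document to the indexing chain.
    static Doc filter(Doc doc, String url) {
        doc.add("category", categoryFor(url));
        return doc;
    }

    public static void main(String[] args) {
        Doc doc = filter(new Doc(), "http://example.com/sports/article.html");
        System.out.println(doc.get("category"));
    }
}
```

The point of the sketch is only that the filter receives the document being built and returns it with one more field; how the category is actually computed is up to the plugin.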

Well, it seems the way forward is to try, crash, and try again... :)

Thanks,
Ernesto.

Chris Stephens wrote:
> You can't do it unless you write a plugin to parse a custom meta tag 
> called category.
>
> I'm trying to do something like this now, but the plugin documentation 
> is horrible.
>
> Lourival Júnior wrote:
>> Hi Ernesto!
>>
>> I know what you mean. Sometimes I get no answers either. Unfortunately,
>> I'm new to Nutch and Lucene and I can't help you. Keep trying, the
>> community will help you :).
>>
>> On 8/22/06, Ernesto De Santis <[EMAIL PROTECTED]> wrote:
>>>
>>> Hi All
>>>
>>> Please, can somebody answer my questions?
>>> I'm a Nutch beginner, and I hope my questions/doubts are easy... ;)
>>>
>>> Or if my email is wrong, tell me. Or confirm whether I'm on the right
>>> track.
>>>
>>> Thanks a lot!
>>> Ernesto.
>>>
>>> Ernesto De Santis wrote:
>>> > Hi
>>> >
>>> > I'm new to Nutch; I started yesterday.
>>> > But I have experience with Lucene.
>>> >
>>> > I have some questions for you, the Nutch experts... ;)
>>> >
>>> > I want to split my result pages into categories, to filter them or to
>>> > show them separately.
>>> > This is my approach:
>>> >
>>> > *crawl/index*
>>> >
>>> > I want to index an extra field.
>>> > So I need to write my own plugin for that, to implement my custom logic.
>>> > Then I configure my plugin in conf/nutch-site.xml.
>>> >
>>> > To develop my plugin, I see that I need to implement the Configurable
>>> > <http://lucene.apache.org/hadoop/docs/api/org/apache/hadoop/conf/Configurable.html>,
>>> > IndexingFilter
>>> > <http://lucene.apache.org/nutch/apidocs-0.8/org/apache/nutch/indexer/IndexingFilter.html>,
>>> > and Pluggable
>>> > <http://lucene.apache.org/nutch/apidocs-0.8/org/apache/nutch/plugin/Pluggable.html>
>>> > interfaces.
>>> >
>>> > Then I add the category value to the Document instance as a field.
>>> >
>>> > *search*
>>> >
>>> > Here I have a doubt. One way is to add a required term to the Nutch query:
>>> >
>>> > query.addRequiredTerm(myCategory, "category");
>>> >
>>> > I see that Nutch uses QueryFilters too, but I can't see how to hook
>>> > them into my query.
>>> >
>>> > *miscellaneous*
>>> >
>>> > Lucene has a rich query hierarchy, but I don't see it in Nutch. I don't
>>> > see BooleanQuery, TermQuery, etc. Is the Query class the only place to
>>> > build the query in Nutch?
>>> >
>>> > The Lucene searcher has a way to separate the query from the filters. The
>>> > query conditions affect the rank, and filters don't. How does Nutch
>>> > separate them?
>>> >
>>> > *documentation*
>>> >
>>> > I read the documentation on the Nutch site, the tutorial, wiki,
>>> > presentations, and the today.java.net article:
>>> > http://today.java.net/pub/a/today/2006/01/10/introduction-to-nutch-1.html
>>> > and part 2 too.
>>> >
>>> > A lot of details aren't covered there. Does anybody know of more detailed
>>> > documentation?
>>> >
>>> > Thanks a lot.
>>> > Ernesto.
>>> >

        
        
                
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
