Hi Chad

The link was a configuration example.

A more detailed example:
http://www.misite.com/videos/.*=videos  (rule A)

If the fetched URL matches rule A, the plugin indexes a field named
'category' with the value 'videos'.

Later you can search on this category field to filter your searches.
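
The rule matching described above can be sketched roughly as follows. This
is only an illustration, not the plugin's actual code: the class name
UrlCategoryRules and its methods are hypothetical, the "first matching rule
wins" policy is an assumption, and the Nutch/Lucene indexing call is left
out so that the example runs on its own.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical helper (not the real plugin): maps URL regexps to categories. */
class UrlCategoryRules {
    // Insertion order is kept so that the first matching rule wins (assumption).
    private final Map<Pattern, String> rules = new LinkedHashMap<>();

    /** Parses lines of the form "<url-regexp>=<category>". */
    static UrlCategoryRules parse(List<String> lines) {
        UrlCategoryRules result = new UrlCategoryRules();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comment lines
            }
            // Split on the LAST '=' so that a '=' inside the regexp is tolerated.
            int sep = trimmed.lastIndexOf('=');
            if (sep <= 0 || sep == trimmed.length() - 1) {
                throw new IllegalArgumentException("Bad rule: " + line);
            }
            result.rules.put(Pattern.compile(trimmed.substring(0, sep)),
                             trimmed.substring(sep + 1));
        }
        return result;
    }

    /** Returns the category of the first rule whose regexp matches the whole URL. */
    Optional<String> categoryFor(String url) {
        for (Map.Entry<Pattern, String> rule : rules.entrySet()) {
            if (rule.getKey().matcher(url).matches()) {
                return Optional.of(rule.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        UrlCategoryRules rules = parse(Arrays.asList(
                "http://www.misite.com/videos/.*=videos")); // rule A
        // prints "Optional[videos]"
        System.out.println(rules.categoryFor("http://www.misite.com/videos/clip1.html"));
        // prints "Optional.empty"
        System.out.println(rules.categoryFor("http://www.misite.com/news/today.html"));
    }
}
```

An indexing filter would then add the returned value as the 'category'
field of the document. The sketch parses the lines itself instead of using
java.util.Properties.load, because Properties treats an unescaped ':' as a
key/value separator, so the "http:" in a rule would have to be written as
"http\:" in a file read that way.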

I will send the plugin in a separate new thread. I will post it here on
the list, since I don't know of another way to share it with you.

Regards
Ernesto.





[EMAIL PROTECTED] wrote:
> I couldn't get the link to work, but yes, if you could share it that would be
> great.
>
> Chad Savage
>
>
>
>
> Ernesto De Santis wrote:
>> I wrote a url-category-indexer.
>>
>> It works with a .properties file that maps URLs, written as regexps, to
>> categories.
>> Example:
>>
>> http://www.misite.com/videos/.*=videos
>>
>> If it seems useful, I can share it.
>>
>> It might be better to configure it in an .xml file.
>>
>> Regards,
>> Ernesto.
>>
>> Stefan Neufeind wrote:
>>> Alvaro Cabrerizo wrote:
>>>  
>>>> Have you included a node describing your new search filter in
>>>> plugin.xml?
>>>>
>>>> 2006/10/11, xu nutch <[EMAIL PROTECTED]>:
>>>>   
>>>>> I have a question about my plugin for the index filter and query filter.
>>>>> Can you help me?
>>>>> -------------------------------------
>>>>> Added in MoreIndexingFilter.java:
>>>>> doc.add(new Field("category", "test", false, true, false));
>>>>> -------------------------------------
>>>>>
>>>>> --------------------------------------
>>>>>
>>>>>
>>>>> package org.apache.nutch.searcher.more;
>>>>>
>>>>> import org.apache.nutch.searcher.RawFieldQueryFilter;
>>>>>
>>>>> /** Handles "category:" query clauses, causing them to search the
>>>>>  * "category" field added in MoreIndexingFilter. */
>>>>> public class CategoryQueryFilter extends RawFieldQueryFilter {
>>>>>  public CategoryQueryFilter() {
>>>>>    super("category");
>>>>>  }
>>>>> }
>>>>> -----------------------------------------------
>>>>> -----------------------------------------------
>>>>>
>>>>> <property>
>>>>>  <name>plugin.includes</name>
>>>>> <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html)|index-(basic|more)|query-(basic|site|url|more)</value>
>>>>>  
>>>>>
>>>>>
>>>>>  <description>Regular expression naming plugin directory names to
>>>>>  include.  Any plugin not matching this expression is excluded.
>>>>>  In any case you need at least include the nutch-extensionpoints
>>>>>  plugin. By default Nutch includes crawling just HTML and plain text
>>>>>  via HTTP, and basic indexing and search plugins.
>>>>>  </description>
>>>>> </property>
>>>>>
>>>>> -----------------------------------------------
>>>>>
>>>>> Querying "category:test" with Luke works fine,
>>>>> but querying "category:test" through the Tomcat web site
>>>>> returns no results.
>>>>>       
>>>
>>> In case you get the search working:
>>> How do you plan to categorize URLs/sites? I'm looking for a solution
>>> there, since I haven't yet managed to implement something
>>> URL-prefix-filter based to map categories to URLs.
>>>
>>>
>>> Regards,
>>>  Stefan
>>>
>>>
>>>   
>>
>>
>>
>

        
        
                


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
