Where are you from, Sergio?



Sergio Morales wrote:
> 
> Hi Payo,
> 
> You need to add the right plugins to your Nutch configuration file. Here is
> an excerpt from my installation:
> 
> NUTCH_HOME\conf\nutch-site.xml:
> 
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <configuration>
>  <property>
>    <name>plugin.includes</name>
>    <value>nutch-extensionpoints|ontology|protocol-ftp|protocol-httpclient|urlfilter-regex|parse-(text|html|pdf|rtf|msword|js|mspowerpoint|msexcel|oo|rss)|index-(basic|more)|query-(basic|site|url|more)|summary-lucene|scoring-opic</value>
>  </property>
> ...
> 
> Using the above configuration, I am able to index text, HTML, PDF, Excel,
> etc.
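
For anyone reading this later: the plugin.includes value above is a regular expression over plugin directory names, so a quick way to see which plugins a given value would enable is to test names against it. Below is a minimal sketch in Python (Nutch itself is Java). It assumes each plugin name must match the whole expression, which is how I understand Nutch applies it, so please treat that as an assumption. The helper names `is_enabled` and `enabled_plugins` are made up for illustration, and `parse-xml` is a hypothetical plugin name used only to show a non-match.

```python
import re

# plugin.includes value copied from the nutch-site.xml excerpt above.
PLUGIN_INCLUDES = (
    "nutch-extensionpoints|ontology|protocol-ftp|protocol-httpclient|"
    "urlfilter-regex|"
    "parse-(text|html|pdf|rtf|msword|js|mspowerpoint|msexcel|oo|rss)|"
    "index-(basic|more)|query-(basic|site|url|more)|"
    "summary-lucene|scoring-opic"
)


def is_enabled(plugin_id, includes=PLUGIN_INCLUDES):
    """Return True if plugin_id matches the whole expression.

    Assumption: whole-string matching, not a substring search.
    """
    return re.fullmatch(includes, plugin_id) is not None


def enabled_plugins(candidates, includes=PLUGIN_INCLUDES):
    """Keep only the candidate plugin names that the expression accepts."""
    return [name for name in candidates if is_enabled(name, includes)]


if __name__ == "__main__":
    # "parse-xml" is a hypothetical name, used only to show a rejected entry.
    for name in ["parse-pdf", "parse-msexcel", "parse-xml", "protocol-http"]:
        print(name, is_enabled(name))
```

With this value, `parse-pdf` and `parse-msexcel` are accepted, while `protocol-http` is not, because only `protocol-httpclient` and `protocol-ftp` are listed. To enable another parser you would add its name to the `parse-(...)` group in nutch-site.xml.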
> 
> I am not sure about XML; I think there is already an enhancement request for
> this in JIRA.
> 
> I hope this helps,
> 
> Sergio
> 
> ----- Original Message ----
> From: payo <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Friday, 19 October, 2007 4:16:20 PM
> Subject: Re: Indexing documents
> 
> 
> 
> 
> Goethe wrote:
>> 
>> 
>> 
>> payo wrote:
>>> 
>>> Hi
>>> 
>>> My questions are:
>>> 
>>> 1.- Can Nutch index PDF, HTML and XML documents?
>>> 
>>> 2.- Can Nutch index remote documents?
>>> 
>>> thanks
>>> 
>> 
>> Yes to both questions, and for the first question Nutch already comes
>> with the plugins necessary to index those file types.
>> 
>> 
> 
> Where can I obtain information on this?
> 
> -- 
> View this message in context:
> http://www.nabble.com/Indexing-documents-tf4653264.html#a13295436
> Sent from the Nutch - User mailing list archive at Nabble.com.
> 
> 
> 

