Hi there,

In nutch-site.xml, I added pdf|msword to the parse-, index-, and
query- groups of plugin.includes (full property below).

Is this the proper way to tell Nutch to fetch, index, and query
these two file formats?

Thanks,

Michael

---------------------------------------------------

<property>
  <name>plugin.includes</name>
  <value>
nutch-extensionpoints|protocol-http|
urlfilter-regex|
parse-(text|html|pdf|msword)|
index-(basic|pdf|msword)|
query-(basic|site|url|pdf|msword)
  </value>
  <description></description>
</property>
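To sanity-check the pattern itself, here is a small sketch of how I understand the value to be used. It rests on two assumptions that I have not confirmed against the Nutch source: that plugin.includes is treated as a regular expression which must match a plugin id in full, and that the line breaks inside <value> are removed before matching. The names RAW_VALUE, normalize and is_included are made up for this illustration and are not part of Nutch. It only tests the regex; it does not tell me whether plugins with these ids actually exist in my install.

```python
import re

# The <value> text from the property above, line breaks included.
RAW_VALUE = """
nutch-extensionpoints|protocol-http|
urlfilter-regex|
parse-(text|html|pdf|msword)|
index-(basic|pdf|msword)|
query-(basic|site|url|pdf|msword)
"""


def normalize(value: str) -> str:
    """Drop all whitespace so the value becomes a single-line regex.

    Assumption: Nutch sees the value without the line breaks.
    """
    return re.sub(r"\s+", "", value)


def is_included(plugin_id: str, pattern: str) -> bool:
    """Return True if the pattern matches the entire plugin id.

    Assumption: the id is matched in full, not as a substring.
    """
    return re.fullmatch(pattern, plugin_id) is not None


if __name__ == "__main__":
    pattern = normalize(RAW_VALUE)
    for plugin_id in ["parse-pdf", "parse-msword", "index-basic", "parse-rtf"]:
        print(plugin_id, is_included(plugin_id, pattern))
```

If the line breaks are not removed, each alternative after a break starts with a newline, so under the same full-match assumption an id such as parse-pdf would no longer match. Putting the whole value on one line would avoid that question.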

