Hi there,

Within nutch-site.xml, I added pdf|msword to the parse-, index-, and query- entries of plugin.includes (see below). Is this the proper way to tell Nutch to fetch, index, and query these two file formats?

Thanks,
Michael

---------------------------------------------------
<property>
  <name>plugin.includes</name>
  <value>
nutch-extensionpoints|protocol-http|
urlfilter-regex|
parse-(text|html|pdf|msword)|
index-(basic|pdf|msword)|
query-(basic|site|url|pdf|msword)
  </value>
  <description>
  </description>
</property>

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
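
P.S. A minimal sketch for sanity-checking which plugin ids a plugin.includes value would accept. It assumes the value is treated as a regular expression that must match the whole plugin id, and that the value is written on one line with no whitespace. The class PluginIncludesCheck and the helper accepts() are hypothetical names used only for this illustration and are not part of Nutch. It only tests the pattern; whether a plugin with a given id actually ships in the plugins directory is a separate question.

```java
import java.util.regex.Pattern;

public class PluginIncludesCheck {

    // The plugin.includes value from the message, joined onto one line.
    // Assumption: no whitespace or line breaks inside the value.
    static final String INCLUDES =
        "nutch-extensionpoints|protocol-http|urlfilter-regex|"
        + "parse-(text|html|pdf|msword)|"
        + "index-(basic|pdf|msword)|"
        + "query-(basic|site|url|pdf|msword)";

    // Hypothetical helper: true if the entire plugin id matches the pattern.
    // Assumption: the id is matched as a whole, not as a substring.
    static boolean accepts(String includes, String pluginId) {
        return Pattern.compile(includes).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        // Sample ids chosen for illustration only.
        String[] ids = {"parse-pdf", "parse-msword", "index-basic", "query-site", "parse-rss"};
        for (String id : ids) {
            System.out.println(id + " -> " + accepts(INCLUDES, id));
        }
    }
}
```

Under this whole-id matching rule, any whitespace left inside the value becomes part of the neighbouring alternative, so it may be safer to keep the value on a single line.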
