Hi Nutch Geeks,
I need urgent answers to the following questions:
1. Is there any documentation or tutorial that explains the
config parameters in the different config files under the conf directory?
2. Suppose I want to parse different types of documents, such as PDF, MS Word,
MS Excel, PPT, etc. What do I need to do? What config parameters are
necessary for parsing documents of different MIME types?
3. If I want to crawl the local filesystem, what do I need to add to urls.txt and
crawl-urlfilter.txt?
Thanks in advance for an early response.
Regards,
Arun Kumar Sharma (Tech Lead -Java/J2EE)
Mob: +91.981.529.5761