Re: Could anyone teach me how to index the title or content of PDF?

2006-09-02 Thread Tomi NA
On 9/1/06, Frank Huang [EMAIL PROTECTED] wrote: But when I execute ./nutch crawl it shows some messages like: fetch okay, but can't parse http://(omit...).pdf reason: failed omit.. content truncated at 70709 bytes. Parse can't handle incomplete pdf file. Haven't had time to go through the
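
The "content truncated" message quoted above suggests the fetcher cut the PDF off at its content-length limit before the parser saw it. A minimal sketch, assuming the PDFs are fetched over HTTP and that the `http.content.limit` property is what truncates them; the value shown is an illustrative choice, not something taken from the thread. The property belongs inside the `<configuration>` element of nutch-site.xml:

```xml
<!-- nutch-site.xml fragment (assumed fix, not confirmed by the thread) -->
<property>
  <name>http.content.limit</name>
  <!-- A negative value disables truncation; a larger positive byte count also works -->
  <value>-1</value>
</property>
```

Removing the limit means very large files are downloaded in full, so a generous positive limit may be the safer choice on a broad crawl.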

nutch with database

2006-09-02 Thread Amit Soni
Hi List, I am a new user of Nutch and, as far as I know, using Nutch I can crawl the HTML pages of given sites and build an index of those pages. But I also want to add some database values to the index using Nutch. So in this part I am fetching some record values from the database and then want to

Re: nutch protocol-file

2006-09-02 Thread Thomas Delnoij
Just add scoring-opic to your plugin.includes in nutch-site.xml. Rgrds, Thomas On 9/1/06, Cam Bazz [EMAIL PROTECTED] wrote: Hello, I wanted to index my files so I followed the instructions at http://www.folge2.de/tp/search/1/crawling-the-local-filesystem-with-nutch I get: Exception in
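
The advice above might look like the following in nutch-site.xml. This is a sketch under assumptions: only `scoring-opic` and the `plugin.includes` property name come from the reply, while the rest of the plugin list is a hypothetical set for a local-filesystem crawl and should be replaced with whatever the existing configuration already lists. The property goes inside the `<configuration>` element:

```xml
<!-- nutch-site.xml fragment; the plugin list other than scoring-opic is an assumption -->
<property>
  <name>plugin.includes</name>
  <!-- Keep your existing plugins and append scoring-opic to the pipe-separated pattern -->
  <value>protocol-file|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)|scoring-opic</value>
</property>
```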

Does Nutch index images?

2006-09-02 Thread Sidney
Does Nutch index images? Whether it does or not, how can I go about creating a separate search category for images, like the major search engines have? If anyone can give any information on this I would be very grateful.