Re: Hosting segments in NDFS

2006-02-04 Thread gekkokid
Wouldn't link analysis be a problem? ----- Original Message ----- From: Chris Schneider [EMAIL PROTECTED] To: nutch-user@lucene.apache.org Sent: Saturday, February 04, 2006 6:44 AM Subject: Hosting segments in NDFS Gang, Would it be possible to modify Nutch so that a set of search servers

Re: Hosting segments in NDFS

2006-02-04 Thread Stefan Groschupf
Yes, I have already done this once, but it no longer conforms to the API. Once the port of NDFS to Hadoop is done, I may be able to bring it back in line with the API and provide a patch. However, there is already a list of other issues on my to-do list, so it will not happen in the next few days. Stefan Am

Exact Match Query?

2006-02-04 Thread Albert Chern
Hello, I want to search for a specific URL with Nutch, but the results returned seem to be all URLs that contain my URL as a subsequence. For example, the query url:"http://www.apple.com/" will also return the URLs "http://www.apple.com.tw" and "http://www.apple.com.au". Is there any way to phrase
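The behaviour described in this message can be pictured with a small sketch. It is not Nutch's implementation: it assumes, purely for illustration, that the url field is split into tokens on non-alphanumeric characters and that a query matches when its tokens appear as a contiguous run. The helpers `tokenize`, `phrase_match`, and `exact_match` are hypothetical names, and the exact comparison is shown only as a possible post-filter over returned hits.

```python
import re

def tokenize(url):
    # Hypothetical tokenizer (an assumption, not Nutch's analyzer):
    # split on anything that is not a letter or digit.
    return [t for t in re.split(r"[^A-Za-z0-9]+", url.lower()) if t]

def phrase_match(query, candidate):
    # True when the query tokens occur as a contiguous run in the candidate.
    q, c = tokenize(query), tokenize(candidate)
    return any(c[i:i + len(q)] == q for i in range(len(c) - len(q) + 1))

def exact_match(query, candidate):
    # Plain string comparison that ignores only a trailing slash.
    return query.rstrip("/") == candidate.rstrip("/")

urls = [
    "http://www.apple.com/",
    "http://www.apple.com.tw/",
    "http://www.apple.com.au/",
]
query = "http://www.apple.com/"

loose = [u for u in urls if phrase_match(query, u)]   # token-based match
strict = [u for u in urls if exact_match(query, u)]   # post-filtered hits
print(loose)
print(strict)
```

Under these assumptions the token-based match keeps all three URLs, because "http www apple com" is a leading run of the tokens of the longer hosts, while the string comparison keeps only the URL that was asked for.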

Re: Problems with MapRed-

2006-02-04 Thread Rafit Izhak_Ratzin
Hi Mike, Thanks for your advice. However, thinking about it, the problem happens at level two and not at level one, which means that you successfully fetched the link you mentioned but couldn't fetch the links it points to. So you actually have to find the link in the second level that make

Merging different crawls into a single index?

2006-02-04 Thread McCallie,David
Hello, First, let me thank all the developers who have created Nutch -- it is wonderful and elegant code. Second, a simple question: I am using bin/nutch crawl to crawl and index two separate sites: one is an http site, and the second is a network file system. These two crawls have completely

Re: Which version of rss does parse-rss plugin support?

2006-02-04 Thread 盖世豪侠
Hi Chris, How do I change the plugin.xml? For example, if I want to crawl RSS files that end with xml, do I just add a new element? <implementation id="org.apache.nutch.parse.rss.RSSParser" class="org.apache.nutch.parse.rss.RSSParser"

Does anybody here do some efforts about RSS/Blog search?

2006-02-04 Thread 盖世豪侠
Using Nutch or Lucene. Let's see if we can exchange some ideas.