Wouldn't link analysis be a problem?
- Original Message -
From: Chris Schneider [EMAIL PROTECTED]
To: nutch-user@lucene.apache.org
Sent: Saturday, February 04, 2006 6:44 AM
Subject: Hosting segments in NDFS
Gang,
Would it be possible to modify Nutch so that a set of search servers
Yes, I have already done this once, but it no longer conforms to the
API. Once the port of NDFS to Hadoop is done, I may be able to bring
things back in line with the API and provide a patch.
However, there is already a list of other issues on my to-do list, so
it will not happen in the next few days.
Stefan
Am
Hello,
I want to search for a specific URL with Nutch, but the results
returned seem to be all URLs that contain my URL as a subsequence.
For example, the query "url:http://www.apple.com/" will also return
the URLs "http://www.apple.com.tw" and "http://www.apple.com.au". Is
there any way to phrase
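
A minimal sketch of why a URL query can behave this way, assuming the url field is split into tokens on punctuation and the query is matched as a phrase over those tokens. That is an assumption about the analyzer, not something confirmed in this thread. The class and method names below are hypothetical and do not use any Nutch or Lucene API. The sketch also shows one possible workaround: post-filtering the returned URLs with an exact string comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class UrlMatchSketch {

    // Hypothetical, simplified analyzer: lowercase the URL and split it
    // into alphanumeric tokens, dropping punctuation such as ':' '/' '.'.
    public static List<String> tokenize(String url) {
        List<String> tokens = new ArrayList<>();
        for (String t : url.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Phrase-style match: true when the query tokens occur contiguously,
    // in order, somewhere inside the document's tokens.
    public static boolean phraseMatch(String queryUrl, String docUrl) {
        return Collections.indexOfSubList(tokenize(docUrl), tokenize(queryUrl)) >= 0;
    }

    // Strip a single trailing slash so "http://host/" equals "http://host".
    // This is an illustrative normalization only, not a full URL normalizer.
    public static String stripTrailingSlash(String url) {
        return url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
    }

    // Workaround sketch: keep only hits whose URL equals the query URL.
    public static List<String> exactMatches(String queryUrl, List<String> hits) {
        List<String> kept = new ArrayList<>();
        String wanted = stripTrailingSlash(queryUrl);
        for (String hit : hits) {
            if (stripTrailingSlash(hit).equals(wanted)) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String query = "http://www.apple.com/";
        List<String> hits = Arrays.asList(
                "http://www.apple.com/",
                "http://www.apple.com.tw",
                "http://www.apple.com.au");
        for (String hit : hits) {
            System.out.println(hit + " phraseMatch=" + phraseMatch(query, hit));
        }
        System.out.println("exact: " + exactMatches(query, hits));
    }
}
```

Under this model the tokens of the query form a prefix of the tokens of the .tw and .au URLs, so a phrase match accepts all three, while the exact comparison keeps only the first.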
Hi Mike,
Thanks for your advice.
However, consider that the problem happens at level two and not at
level one, which means that you successfully fetched the link you mentioned but
couldn't fetch the links it points to.
So actually you have to find the link in the second level that make
Hello,
First, let me thank all the developers who have created Nutch -- it is
wonderful and elegant code.
Second, a simple question:
I am using bin/nutch crawl to crawl and index two separate sites: one
is an http site, and the second is a network file system. These two
crawls have completely
Hi Chris
How do I change the plugin.xml? For example, if I want to crawl RSS files
that end with xml, do I just add a new element?
<implementation id="org.apache.nutch.parse.rss.RSSParser"
class="org.apache.nutch.parse.rss.RSSParser"
Using Nutch or Lucene.
See if we can exchange some ideas.