Re: Page meta-data is not stored in segments?

2005-07-07 Thread Jérôme Charron
I have a good idea of how to handle that situation. If there are multiple and conflicting values for important meta-data such as the content-type, the page is horribly broken, and Nutch shouldn't waste effort trying to figure out what's going on. For example, if [..] I understand
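
One reading of the suggestion above, as a minimal sketch rather than anything Nutch actually does: collect every content-type value a page declares (HTTP header, meta tags, and so on), and give up on the page if they disagree. The function name `resolve_content_type` and the choice to compare only the media type (ignoring parameters such as charset) are assumptions made for illustration.

```python
def resolve_content_type(values):
    """Return the single agreed media type, or None if the values conflict.

    `values` is any iterable of raw Content-Type strings (hypothetical input;
    where they come from is up to the caller).
    """
    normalized = set()
    for value in values:
        # Keep only the media type, e.g. "text/html" from "text/html; charset=utf-8".
        media_type = (value or "").split(";", 1)[0].strip().lower()
        if media_type:
            normalized.add(media_type)
    if len(normalized) == 1:
        return next(iter(normalized))
    # Zero usable values or several conflicting ones: treat the page as broken.
    return None
```

A caller would skip any page for which this returns None instead of trying to guess which declaration is right.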

NDFS troubles

2005-07-07 Thread Jay Pound
OK, I'm running SUSE 9.3 on 3 computers: an AMD Athlon 64 3500+ (the x86_64 edition, of course), a Pentium 4 3.0 GHz (SUSE x86), and an Athlon 1900+ (x86 version). I am trying to set up NDFS across all the nodes. The Athlon 1900+ is the namenode and a datanode, the Athlon 64 and Pentium 4 are

NDFS why

2005-07-07 Thread webmaster
OK, I'm running SUSE 9.3 on 3 computers: an AMD Athlon 64 3500+ (the x86_64 edition, of course), a Pentium 4 3.0 GHz (SUSE x86), and an Athlon 1900+ (x86 version). I am trying to set up NDFS across all the nodes. The Athlon 1900+ is the namenode and a datanode, the Athlon 64 and Pentium 4 are

Re: [nutch 0.5] frames

2005-07-07 Thread Andrzej Bialecki
Philipp Suter wrote: Does anybody know how to crawl frames? Or how to extend Nutch to be able to crawl frames? We are using the API. The development version (available from SVN) should handle frames just fine, i.e. it should follow the src=... attributes in frames in order to retrieve the
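
To illustrate what "following the src attribute in frames" amounts to, here is a small standalone sketch using only the Python standard library. It is not Nutch's parser; the class and function names are hypothetical, and including iframe alongside frame is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameLinkExtractor(HTMLParser):
    """Collects absolute URLs from the src attribute of frame/iframe tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in ("frame", "iframe"):
            for name, value in attrs:
                if name == "src" and value:
                    # Resolve relative frame URLs against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_frame_links(html, base_url):
    """Return the frame targets of `html` as absolute URLs, in document order."""
    parser = FrameLinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

A crawler would add the returned URLs to its fetch list so that the content inside each frame gets retrieved and indexed along with the frameset page.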

Re: ndfs stuff

2005-07-07 Thread Piotr Kosiorowski
Hello Ferenc, Some documentation on running NDFS can be found on the wiki: http://wiki.apache.org/nutch/NutchDistributedFileSystem Regards, Piotr [EMAIL PROTECTED] wrote: Is there any documentation on NDFS usage anywhere? Regards, Ferenc

Re: NDFS troubles

2005-07-07 Thread Jay Pound
How do I do a test against the mapred branch? -J - Original Message - From: Doug Cutting [EMAIL PROTECTED] To: nutch-user@lucene.apache.org Sent: Thursday, July 07, 2005 3:28 PM Subject: Re: NDFS troubles Trunk or mapred branch? If not mapred branch, please reproduce this in the mapred

Page Ranking

2005-07-07 Thread Zaheed Haque
Hello, I have just installed, run a test crawl, and then tried out search. The search result page gives an option called "explain" (Score Explanation). Yes, I'd like to know a bit more about the ranking system's inner workings. I would be very glad if someone could point me to some documentation. -- Best