Re: problem with ndfs

2005-11-24 Thread Stefan Groschupf
Sounds like a problem with the hostnames of your datanodes. Check that you are able to ping all the datanodes using the hostnames they sent to the namenode. Run bin/nutch ndfs -report to see the hostnames. Stefan On 24.11.2005 at 16:04, Anton Potehin wrote: When we start namenode
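
The hostname check suggested above could be scripted roughly as follows. This is only a sketch: it tests name resolution, not ICMP reachability, so it is a weaker check than ping. The function name `unresolvable_hosts` and the example hostnames are hypothetical and not part of Nutch.

```python
import socket


def unresolvable_hosts(hostnames, resolve=socket.gethostbyname):
    """Return the hostnames that cannot be resolved from this machine.

    `resolve` is injectable so the logic can be exercised without a network.
    """
    failed = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror and socket.herror subclass OSError
            failed.append(name)
    return failed


# Example call (hypothetical names; use the ones the report command lists):
# unresolvable_hosts(["datanode1.example.com", "datanode2.example.com"])
```

Run it on the namenode machine with the hostnames shown by the report; any name it returns is one the namenode cannot look up.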

Re: [jira] Created: (NUTCH-128) second configuration nodes overwrites first node

2005-11-24 Thread Stefan Groschupf
Sorry for my terrible English. Sure, I know the concept of nutch-default.xml and nutch-site.xml. I tried to say that if you set plugin.includes at the beginning of nutch-site.xml and then, perhaps by mistake, set it a second time at the end of the same file, the last

Re: [jira] Created: (NUTCH-128) second configuration nodes overwrites first node

2005-11-24 Thread Andrzej Bialecki
Stefan Groschupf wrote: Sorry for my terrible English. Sure, I know the concept of nutch-default.xml and nutch-site.xml. I tried to say that if you set plugin.includes at the beginning of nutch-site.xml and then, perhaps by mistake, set it a second time at the end of
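
The behaviour named in the issue title, where a second definition in the same file overwrites the first, can be illustrated with a small sketch. This is not Nutch's actual configuration loader. The XML layout in `SAMPLE`, the property values, and the function `load_properties` are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Assumed layout: a flat list of <property> elements with <name> and <value>.
SAMPLE = """
<nutch-conf>
  <property><name>plugin.includes</name><value>first-value</value></property>
  <property><name>other.setting</name><value>x</value></property>
  <property><name>plugin.includes</name><value>second-value</value></property>
</nutch-conf>
"""


def load_properties(xml_text):
    """Read properties in document order; a later duplicate replaces an earlier one.

    Returns the resulting mapping plus the list of names defined more than once,
    so that an accidental second definition can be reported instead of silently
    taking effect.
    """
    props = {}
    duplicates = []
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        if name in props:
            duplicates.append(name)
        props[name] = prop.findtext("value")
    return props, duplicates
```

With the sample above, the mapping ends up holding the second value for plugin.includes, and the duplicate list names that property, which is the kind of warning that would make such a mistake visible.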

Re: [proposal] Generic Markup Language Parser

2005-11-24 Thread Stefan Groschupf
Jérôme, A mail archive is an amazing source of information, isn't it?! :-) To answer your question, just ask yourself how many pages per second you plan to fetch and parse, and how many queries per second a Lucene index is able to handle and you can deliver in the UI. I have here

RE: [proposal] Generic Markup Language Parser

2005-11-24 Thread Chris Mattmann
Hi Stefan, -1! XSL is terribly slow! You have to consider what the XSL will be used for. Our proposal suggests XSL as a means of intermediate transformation of markup content on the backend, as Jerome suggested in his reply. This means that whenever markup content is encountered,

RE: [proposal] Generic Markup Language Parser

2005-11-24 Thread Chris Mattmann
Hi Stefan and Jerome, A mail archive is an amazing source of information, isn't it?! :-) To answer your question, just ask yourself how many pages per second you plan to fetch and parse, and how many queries per second a Lucene index is able to handle and you can deliver in the UI. I

Re: [proposal] Generic Markup Language Parser

2005-11-24 Thread Stefan Groschupf
Correct me if I'm wrong, but isn't log4j used a lot within Nutch? :-) No, Nutch uses Java logging; only some plugins use jars that depend on log4j. Stefan