java.net.NoRouteToHostException:

2009-08-03 Thread Saurabh Suman
Hi, I am using Nutch 1.0. I have one master and one slave. I can ping from my master to the slave and from the slave to the master. My master name is gem09 and the slave is germ12. The hosts configuration looks like this: IP name aliases 192.168.0.116 germ16
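A successful ping only shows that ICMP gets through; java.net.NoRouteToHostException is raised on a TCP connect, and per its Javadoc it typically points to an intervening firewall or a router that is down. The sketch below is one way to test the TCP path directly. It is not part of Nutch or Hadoop: the class name PortCheck and the outcome labels are made up for illustration, and the host and port must be whatever your own configuration uses.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.NoRouteToHostException;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

// Hypothetical helper, not a Nutch or Hadoop class.
class PortCheck {

    /** Attempts one TCP connect and returns a short label for the outcome. */
    static String probe(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return "OK";
        } catch (NoRouteToHostException e) {
            // Typically a firewall or a router problem between the two hosts.
            return "NO_ROUTE";
        } catch (ConnectException e) {
            // Typically nothing is listening on that port.
            return "REFUSED";
        } catch (SocketTimeoutException e) {
            return "TIMEOUT";
        } catch (UnknownHostException e) {
            // The name did not resolve; check the hosts file or DNS.
            return "UNKNOWN_HOST";
        } catch (IOException e) {
            return "IO_ERROR";
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java PortCheck <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        System.out.println(host + ":" + port + " -> " + probe(host, port, 3000));
    }
}
```

Running it from the master against the slave's name and port, and again from the slave against the master, separates a name-resolution problem (UNKNOWN_HOST) from a blocked or unroutable port (NO_ROUTE) and from a service that is simply not running (REFUSED).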

Re: Meaning of ProtocolStatus.ACCESS_DENIED

2009-08-03 Thread Andrzej Bialecki
Otis Gospodnetic wrote: "I don't know of an elegant way, but if you want to hack Nutch sources, you could set its refetch time to some point in time very far in the future, for example. Or introduce an additional status." This won't work, because the pages will be checked again after a

Re: Nutch in C++

2009-08-03 Thread alxsss
Hi, I know Nutch uses Lucene. But what is CLucene for, then? Only for indexing files on a hard drive? I have knowledge of C++ and some experience. I wanted to code the Nutch crawler in C++ to get more experience and make it open source, only if it'll be useful for the open source

Re: how to exclude some external links

2009-08-03 Thread alxsss
Hi, The plugin is enabled in the nutch-default.xml file, but changes to it did not affect search. Instead, changes in crawl-urlfilter.txt change which links are fetched. Thanks. Alex. -Original Message- From: Paul Tomblin ptomb...@xcski.com To: nutch-user@lucene.apache.org Sent:
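For readers unfamiliar with crawl-urlfilter.txt: it holds regular expressions prefixed with + (include) or - (exclude). A minimal sketch of how such rules can be evaluated follows, assuming first-match-wins semantics and that a URL matching no rule is rejected. The class SignedRegexFilter and the example.org rules are hypothetical; this is not Nutch's own filter code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of sign-prefixed regex rules; not Nutch source.
class SignedRegexFilter {

    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, Pattern pattern) {
            this.accept = accept;
            this.pattern = pattern;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Parses lines of the form "+regex" or "-regex"; blank and # lines are skipped. */
    SignedRegexFilter(List<String> lines) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            char sign = trimmed.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("rule must start with + or -: " + trimmed);
            }
            rules.add(new Rule(sign == '+', Pattern.compile(trimmed.substring(1))));
        }
    }

    /** The first rule whose pattern is found in the URL decides; no match means reject. */
    boolean accepts(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SignedRegexFilter filter = new SignedRegexFilter(Arrays.asList(
                "# keep one site (hypothetical domain), drop everything else",
                "+^http://([a-z0-9-]+\\.)*example\\.org/",
                "-."));
        System.out.println(filter.accepts("http://www.example.org/page"));
        System.out.println(filter.accepts("http://other.test/"));
    }
}
```

With rules ordered this way, links leaving the chosen domain fall through to the final exclude rule, which is one way to keep external links out of a crawl.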

Re: Nutch in C++

2009-08-03 Thread Otis Gospodnetic
CLucene is just like Lucene (except a few versions behind), but written in C++. Yes, you could rewrite Nutch in C++ and have that use CLucene. Otis