NodeManager crashing repeatedly
I am running Nutch-2.3.1 over Hadoop-2.5.2 and HBase-1.2.3, integrated with Solr-6.5.1, and have crawled over 10 million pages. While doing this I keep running into two problems: 1. My NodeManager crashes repeatedly during different phases of the crawl. The crash takes down my Linux session and forces a logout, with the NodeManager killed. When I log in again and restart the NodeManager, the same crawl phase that failed runs to success. [The NodeManager log has nothing to report.] 2. I am running all my crawl phases one by one, without the crawl script, because with the crawl script my jobs usually exited with a "WaitForjobCompletion" error at different stages of the crawl. Running the phases one at a time has prevented the "WaitForjobCompletion" error from occurring. Any help will be highly appreciated. New to the mailing list, new to Nutch. -Gajanan
redirect bin/crawl log output to some other file
Hi All, We are using the bin/crawl command to crawl and index data into Solr. Currently the output is written to the default logs/hadoop.log file, and my requirement is to write the log to a different file. For example, bin/crawl -i -D solr.server.url=http://localhost:8983/solr/jeepkr -s urls/ crawl/ 1 writes its log details to the default path logs/hadoop.log. How can I pass the log path as part of the bin/crawl command? For example: bin/crawl -i -D solr.server.url=http://localhost:8983/solr/jeepkr -s urls/ crawl/ 1 >/tmp/myurls.log Thanks and Regards, Amarnath Polu
IndexWriter interface in 1.15
Hi, I missed it at the time, but I just realized (the hard way) that the IndexWriter interface was changed in 1.15 in ways that are not backward compatible. That means that any custom IndexWriter implementation will no longer compile, and probably will not run either. I think this was a mistake (maybe a new interface should have been created, with the old one deprecated and supported for now; or the old methods should have been deprecated without change and the new methods provided with a default implementation), but it's too late now. I still think this should be highlighted in the release notes for 1.15 (meaning at the top, as "breaking changes"). The main changes I encountered: 1. setConf and getConf were removed from the interface (without deprecation). 2. open was deprecated (that's fine), but its signature was also changed (from JobConf to Configuration), which technically makes it a completely different function, so there is no point in the deprecation. Yossi.
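To illustrate the second alternative mentioned above (old methods deprecated without change, new methods given a default implementation), here is a minimal Java sketch. It is not Nutch's actual IndexWriter: the DocWriter interface, the LegacyWriter class, the "server.url" key, and the use of Properties and Map as stand-ins for the old and new configuration types are all hypothetical, chosen only to keep the example self-contained. It assumes Java 9 or later for Map.of.

```java
import java.util.Map;
import java.util.Properties;

class DefaultMethodSketch {

    /** Hypothetical writer interface; NOT Nutch's real IndexWriter. */
    interface DocWriter {
        /** Old entry point: signature left untouched, only marked deprecated. */
        @Deprecated
        void open(Properties conf);

        /**
         * New entry point. The default body adapts the new parameter type
         * to the old method, so implementations written against the old
         * interface still compile and still run.
         */
        default void open(Map<String, String> params) {
            Properties conf = new Properties();
            conf.putAll(params);
            open(conf);
        }
    }

    /** A pre-existing custom implementation that only knows the old method. */
    static class LegacyWriter implements DocWriter {
        String url;

        @Override
        @SuppressWarnings("deprecation")
        public void open(Properties conf) {
            // "server.url" is an illustrative key, not a real Nutch property.
            url = conf.getProperty("server.url");
        }
    }

    /** Calls the NEW method on the legacy implementation and returns what it saw. */
    static String openWithNewApi(String url) {
        LegacyWriter writer = new LegacyWriter();
        writer.open(Map.of("server.url", url));
        return writer.url;
    }

    public static void main(String[] args) {
        System.out.println(openWithNewApi("http://localhost:8983/solr"));
    }
}
```

In this sketch, a caller that has moved to the new method can still drive an implementation that was never updated, which is the compatibility property the message argues for. Whether the same approach would have fit the real interface depends on details not shown here.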