RE: Run crawl from java code

2010-10-04 Thread Marseld Dedgjonaj
Hello, thanks for your answer. I tried it but I got this error. Maybe there is a problem reading the conf folder, though it looks OK to me. If I run the crawl from the Linux script it works. Thanks. This is the error message: 10/10/04 10:46:40 INFO crawl.Crawl: crawl started in: crawl 10/10/04 10:46:40 INFO crawl.Crawl:

RE: About SOLR and Nutch

2010-10-04 Thread Thumuluri, Sai
Not very clear on the question, but we do use Nutch to crawl and Solr to index. -Original Message- From: Israel [mailto:wego...@gmail.com] Sent: Monday, October 04, 2010 12:02 AM To: user@nutch.apache.org Subject: About SOLR and Nutch * Hi, anyone know if integrating SOLR to Nutch, the

RE: Run crawl from java code

2010-10-04 Thread Marseld Dedgjonaj
Hello, I see that conf was not in the classpath. I added this: <classpathentry kind="src" path="conf"/> and now it seems to be OK. Thanks for your help. -Original Message- From: Hannes Carl Meyer [mailto:hannesc...@googlemail.com] Sent: Monday, October 04, 2010 11:34 AM To: user@nutch.apache.org
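The classpath fix described above can be verified before starting a crawl. The following is a minimal sketch, not part of the original thread: it asks the class loader whether the Nutch configuration files are visible. The class name `ConfClasspathCheck` and the helper `locate` are hypothetical names chosen for illustration; the file names nutch-default.xml and nutch-site.xml are assumed to be the ones found in a standard Nutch conf directory.

```java
import java.net.URL;

public class ConfClasspathCheck {

    // Returns the location of a classpath resource, or null if it is not visible.
    // (Hypothetical helper, not part of Nutch.)
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ConfClasspathCheck.class.getClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        // Assumed file names from a standard Nutch conf directory.
        String[] names = {"nutch-default.xml", "nutch-site.xml"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "NOT on classpath" : url.toString()));
        }
    }
}
```

If either file is reported as missing when run from the same Eclipse launch configuration as the crawl, the conf folder is probably not on the classpath yet.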

Re: About SOLR and Nutch

2010-10-04 Thread Steve Cohen
Here is the wiki entry: http://wiki.apache.org/nutch/RunningNutchAndSolr Using Solr was working fine when mapred.job.tracker was set to local while we crawled and ran solrindex, but when we set mapred.job.tracker to localhost:9001 and use the Hadoop task daemons, solrindex gives us many errors

Nutch on file system and web

2010-10-04 Thread Davide Cavalaglio
Hi, I have a question: is it possible to configure Nutch to crawl the file system and the web at the same time? I want to start the crawler on two seeds: 1) http://www.myWebSite.com/ 2) file:///C:/MyFile/ Is this possible with a single crawler? Is it possible to use only one configuration (nutch-site.xml,

Re: Run crawl from java code

2010-10-04 Thread Hannes Carl Meyer
Hi, check whether your working directory (Run - Run Configurations - Arguments tab - Working Directory) points to the Nutch base directory (where your conf/nutch-site.xml is located). Regards Hannes On Mon, Oct 4, 2010 at 11:02 AM, Marseld Dedgjonaj marseld.dedgjo...@ikubinfo.com wrote: Hello,
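The working-directory check suggested above can also be done from code. This is a rough sketch under assumptions, not something posted in the thread: it reads the JVM's `user.dir` property and tests whether conf/nutch-site.xml exists beneath it. The class name `WorkingDirCheck` and the helper `hasConf` are hypothetical.

```java
import java.io.File;

public class WorkingDirCheck {

    // True if baseDir contains conf/nutch-site.xml as a regular file.
    // (Hypothetical helper, not part of Nutch.)
    static boolean hasConf(File baseDir) {
        return new File(new File(baseDir, "conf"), "nutch-site.xml").isFile();
    }

    public static void main(String[] args) {
        // user.dir is the directory the JVM was started in, which is
        // what the IDE's "Working Directory" setting controls.
        File cwd = new File(System.getProperty("user.dir"));
        System.out.println("Working directory: " + cwd.getAbsolutePath());
        System.out.println("conf/nutch-site.xml found: " + hasConf(cwd));
    }
}
```

Running it with the same launch configuration as the crawl shows which directory the crawl would actually start in.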

Hadoop compression

2010-10-04 Thread Christopher Laux
Hi all, I have a fundamental question about compressing the segments Nutch produces: is compression switched on by default, and if not, how can I switch it on? I can only find general documentation about turning compression on for Hadoop, but there is no hadoop-site.xml in nutch/conf. Thanks, Chris