Any advice?
Matteo
2012/9/19 Matteo Simoncini sicc...@gmail.com
Hi,
I'm running Nutch 1.5.1 on a virtual machine to crawl a large number of URLs.
I gave enough space to the crawl folder, the one where the linkdb and
crawldb go, and to the Solr folder.
It worked fine up to about 200,000 URLs, but now
Hi Matteo,
have a look at the property hadoop.tmp.dir which allows you to direct
the temp folder to another volume with more space on it.
For local crawls:
- do not share this folder for two simultaneously running Nutch jobs
- you have to clean up the temp folder, especially after failed jobs
(if
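The hadoop.tmp.dir property above can be overridden in Nutch's local configuration, e.g. conf/nutch-site.xml. A minimal sketch, assuming /data/hadoop-tmp is a placeholder path on a volume with enough free space (any directory you choose works, as long as the user running Nutch can write to it):

```xml
<!-- conf/nutch-site.xml: point Hadoop's temporary directory at a
     bigger volume; /data/hadoop-tmp is a hypothetical example path -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/data/hadoop-tmp</value>
  <description>Base for temporary files used by local Nutch jobs.</description>
</property>
```

Per the advice above, each concurrently running local Nutch job should get its own value here, and leftover files under this directory should be deleted manually after a failed job.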
Thanks, you really helped a lot.
Matteo
2012/9/20 Sebastian Nagel wastl.na...@googlemail.com
Hi Matteo,
have a look at the property hadoop.tmp.dir which allows you to direct
the temp folder to another volume with more space on it.
For local crawls:
- do not share this folder for two