Robert Young wrote:
In org.apache.nutch.crawl.LinkDb on line 261 it creates a working
directory (newLinkDb) based on the current working directory. This
should be configurable rather than being based on where Tomcat was
started. I am planning on writing a patch to pull the hadoop.tmp.dir
setting if it is available, falling back to the current directory.

Can anyone see any obvious problems with doing this?

I'm not sure what Tomcat has to do with this. LinkDb does it this way to avoid a rename() operation across physical volumes: if you invoke rename() on a local FS and the source and target are on different volumes, it may trigger a costly copy operation.
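
To illustrate the concern, here is a rough sketch using plain java.nio rather than the Hadoop FileSystem API that LinkDb actually goes through. The class and method names (SameVolumeRenameSketch, createSiblingTemp, install) are made up for this example and are not Nutch code. It assumes the final directory does not exist yet. The temporary directory is created next to the final location, so the rename stays on one volume; if the temporary directory came from a setting that points at another volume, the atomic rename would be refused and the data would have to be copied instead.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustration only: hypothetical names, not taken from Nutch or Hadoop.
final class SameVolumeRenameSketch {

    // Create the working directory as a sibling of the final directory,
    // so that both share a parent and therefore a volume.
    static Path createSiblingTemp(Path finalDir, String prefix) throws IOException {
        Path parent = finalDir.toAbsolutePath().getParent();
        return Files.createTempDirectory(parent, prefix);
    }

    // Move the finished working directory into place with a plain rename.
    // Returns false when the file system refuses an atomic move, which is
    // what happens when source and target are on different volumes; the
    // caller would then need a (costly) recursive copy.
    static boolean install(Path tmp, Path finalDir) throws IOException {
        try {
            Files.move(tmp, finalDir, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (AtomicMoveNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("linkdb-demo");
        Path finalDir = base.resolve("linkdb");

        Path tmp = createSiblingTemp(finalDir, "linkdb-");
        Files.write(tmp.resolve("part-00000"), new byte[] {1, 2, 3});

        System.out.println("renamed in place: " + install(tmp, finalDir));
    }
}
```

If the working directory is made configurable, a check along these lines could at least tell you when the configured location forces the copy path.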


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
