[ http://issues.apache.org/jira/browse/NUTCH-148?page=comments#action_12361206 ]
Piotr Kosiorowski commented on NUTCH-148:
-----------------------------------------

The 'df' command is required for NDFS operation, so if you were using neither NDFS nor the nutch shell scripts in 0.7.1, you were able to run it on Windows without cygwin. Now the majority of tools use NDFS, so cygwin is required on Windows.

I would assume the other bug is also cygwin related - please test it with cygwin and report whether that fixes the issue.

In future, in case of doubt, it is better to ask on the nutch-user mailing list rather than create a JIRA issue first. I will close both your issues now, assuming they are cygwin related. If you find that it still does not work with cygwin, please reopen.

> org.apache.nutch.tools.CrawlTool throws error while doing deleteduplicates
> --------------------------------------------------------------------------
>
>          Key: NUTCH-148
>          URL: http://issues.apache.org/jira/browse/NUTCH-148
>      Project: Nutch
>         Type: Bug
>   Components: indexer
>     Versions: 0.8-dev
>  Environment: Windows XP Home
>     Reporter: raghavendra prabhu
>
> I get the following error while running org.apache.nutch.tools.CrawlTool
> The error actually is in deleteduplicates
> 51223 001121 Reading url hashes...
> 051223 001121 Sorting url hashes...
> 051223 001121 Deleting url duplicates...
> 051223 001121 Error moving bad file
> G:\apache-tomcat-5.5.12\webapps\crux\WEB-INF
> \classes\ddup-workingdir\ddup-20051223001121: java.io.IOException:
> CreateProcess
> : df -k
> G:\apache-tomcat-5.5.12\webapps\crux\WEB-INF\classes\ddup-workingdir\ddup-20051223001121
> error=2
> It throws the error here in NFSDataInputStream.java
> The exception is org.apache.nutch.fs.ChecksumException: Checksum
> error: G:\apach
> e-tomcat-5.5.12\webapps\crux\WEB-INF\classes\ddup-workingdir\ddup-20051223001121
> at 0

-- 
This message is automatically generated by JIRA.
