Does anyone use Nutch on EMR? I am using Nutch 1.3 and I get the following error:

FATAL org.apache.nutch.crawl.Generator (main): Generator: java.lang.IllegalArgumentException: This file system object (hdfs://ip-44-169-41-187.ec2.internal:9000) does not support access to the request path 's3://Datasets/crawlResults/crawldb/.locked' You possibly called FileSystem.get(conf) when you should have called FileSystem.get(uri, conf) to obtain a file system supporting your path.

I have seen other posts describing this same problem, but none with a resolution. Does anyone use Nutch 1.3 on EMR?

Thanks for the help,
Peter

