Semyon Semyonov created NUTCH-2524:
--------------------------------------
Summary: Crawl script, if file exists in HDFS doesn't work.
Key: NUTCH-2524
URL: https://issues.apache.org/jira/browse/NUTCH-2524
Project: Nutch
Issue Type: Bug
Components: bin
Reporter: Semyon Semyonov
In the crawl script you can find something like:
  if [[ -d "$CRAWL_PATH"/hostdb ]]; then
    echo "Processing sitemaps based on hosts in HostDB"
    __bin_nutch sitemap "$CRAWL_PATH"/crawldb -hostdb "$CRAWL_PATH"/hostdb -threads $NUM_THREADS
  fi
The test [[ -d "$CRAWL_PATH"/hostdb ]] doesn't work for HDFS, only for local mode: -d checks the local file system, so it does not see a hostdb directory stored in HDFS.
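
One possible direction, sketched here only as an illustration and not as the project's actual fix: wrap the directory check in a helper that asks Hadoop when running against HDFS and falls back to the plain bash test otherwise. The helper name directory_exists and the MODE variable are hypothetical and do not come from the crawl script; "hadoop fs -test -d" is the standard Hadoop shell command that exits with 0 when the path is a directory.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: neither MODE nor directory_exists is taken from the
# crawl script quoted above.
# MODE is assumed to be "local" or "distributed"; it defaults to local.
MODE="${MODE:-local}"

# Return 0 if the given path is a directory on the file system in use.
directory_exists() {
  local path="$1"
  if [[ "$MODE" == "distributed" ]]; then
    # 'hadoop fs -test -d' exits with 0 when the path is a directory in HDFS.
    hadoop fs -test -d "$path"
  else
    # Plain bash test: only looks at the local file system.
    [[ -d "$path" ]]
  fi
}
```

With such a helper, the condition in the snippet above could read: if directory_exists "$CRAWL_PATH"/hostdb; then ... fi. How the script decides between local and distributed mode is left open here.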
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)