Semyon Semyonov created NUTCH-2524:
--------------------------------------

             Summary: Crawl script: "if file exists" check doesn't work in HDFS
                 Key: NUTCH-2524
                 URL: https://issues.apache.org/jira/browse/NUTCH-2524
             Project: Nutch
          Issue Type: Bug
          Components: bin
            Reporter: Semyon Semyonov


In the crawl script you can find something like:
if [[ -d "$CRAWL_PATH"/hostdb ]]; then
  echo "Processing sitemaps based on hosts in HostDB"
  __bin_nutch sitemap "$CRAWL_PATH"/crawldb -hostdb "$CRAWL_PATH"/hostdb -threads $NUM_THREADS
fi

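The test [[ -d "$CRAWL_PATH"/hostdb ]] only inspects the local filesystem, so it works in local mode but not when the crawl directory is in HDFS. In that case the test is false even if the hostdb exists, and the sitemap step is skipped.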
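
One possible way to make the check work in both modes is to route it through a small helper that asks Hadoop when the crawl runs distributed. The sketch below is only an illustration: the function name dir_exists and the mode values "local" and "distributed" are assumptions, not names taken from the crawl script. It relies on `hadoop fs -test -d`, which exits with status 0 when the given path is a directory.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the Nutch crawl script).
# $1: assumed mode string, "local" or anything else for HDFS
# $2: path to check
# Returns 0 if the path is an existing directory, non-zero otherwise.
dir_exists() {
  local mode="$1" path="$2"
  if [[ "$mode" == "local" ]]; then
    # Local mode: the ordinary bash test is sufficient.
    [[ -d "$path" ]]
  else
    # Distributed mode: ask Hadoop, since bash cannot see HDFS paths.
    hadoop fs -test -d "$path"
  fi
}
```

With such a helper the condition would read `if dir_exists "$MODE" "$CRAWL_PATH"/hostdb; then`, where MODE is again an assumed variable holding the current run mode.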



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
