[ https://issues.apache.org/jira/browse/NUTCH-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16395071#comment-16395071 ]
Sebastian Nagel commented on NUTCH-2524:
----------------------------------------

Thanks [~semyon.semyo...@mail.com], good catch! PR looks good!

> Crawl Script , if file exists in HDFS doesnt work.
> --------------------------------------------------
>
>                 Key: NUTCH-2524
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2524
>             Project: Nutch
>          Issue Type: Bug
>          Components: bin
>            Reporter: Semyon Semyonov
>            Priority: Major
>
> In the crawl script you can find something like:
> if [[ -d "$CRAWL_PATH"/hostdb ]]; then
>   echo "Processing sitemaps based on hosts in HostDB"
>   __bin_nutch sitemap "$CRAWL_PATH"/crawldb -hostdb "$CRAWL_PATH"/hostdb -threads $NUM_THREADS
> fi
> The check if [[ -d "$CRAWL_PATH"/hostdb ]]; doesn't work for HDFS, only for local mode.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
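For readers following the issue quoted above: bash's `-d` only inspects the local filesystem, so a directory that lives in HDFS is never found. Below is a minimal sketch of a mode-aware check. It is not necessarily the change made in the PR. The function name `directory_exists` and its `mode` argument are hypothetical, introduced here for illustration; `hadoop fs -test -d` is the standard Hadoop shell test, which exits with status 0 when the path is a directory.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name and arguments are assumptions, not taken from the
# Nutch crawl script): succeed if the directory exists in the given mode.
#   $1  mode: "local" or anything else (treated as distributed/HDFS)
#   $2  path to check
directory_exists() {
  local mode="$1" path="$2"
  if [[ "$mode" == "local" ]]; then
    # Local mode: the ordinary bash test is sufficient.
    [[ -d "$path" ]]
  else
    # Distributed mode: ask Hadoop, because -d cannot see into HDFS.
    hadoop fs -test -d "$path"
  fi
}
```

With such a helper, the condition from the issue could be written as `if directory_exists "$mode" "$CRAWL_PATH"/hostdb; then ... fi`, assuming the script keeps its run mode in a variable like `$mode`.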