[ https://issues.apache.org/jira/browse/NUTCH-780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803193#action_12803193 ]
Vu Hoang edited comment on NUTCH-780 at 1/27/10 2:55 AM:
---------------------------------------------------------

Add this method:
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
// Requires: import java.util.Map.Entry;
// Builds the default crawl configuration, then copies every caller-supplied property over it.
public static Configuration overwrite(Configuration nutchConfig) {
  Configuration crawlConfig = NutchConfiguration.createCrawlConfiguration();
  // Configuration is Iterable<Map.Entry<String, String>>, so no cast is needed.
  for (Entry<String, String> entry : nutchConfig) {
    crawlConfig.set(entry.getKey(), entry.getValue());
  }
  return crawlConfig;
}
{code}
Add the lines below to class org.apache.nutch.crawl.Crawl:
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
// Optional configuration injected by the caller before main() runs.
public static Configuration nutchConfig = null;

public static void setNutchConfig(Configuration config) {
  nutchConfig = config;
}
{code}
Then build the Nutch configuration inside the main method as below:
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
// Overlay the injected configuration on the crawl defaults, or use the defaults alone.
Configuration conf = (nutchConfig != null)
    ? overwrite(nutchConfig)
    : NutchConfiguration.createCrawlConfiguration();
{code}
I recommend this solution :)

was (Author: vushogerts):
Add this method:
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
// Requires: import java.util.Map.Entry;
// Builds the default crawl configuration, then copies every caller-supplied property over it.
public static Configuration overwrite(Configuration nutchConfig) {
  Configuration crawlConfig = NutchConfiguration.createCrawlConfiguration();
  // Configuration is Iterable<Map.Entry<String, String>>, so no cast is needed.
  for (Entry<String, String> entry : nutchConfig) {
    crawlConfig.set(entry.getKey(), entry.getValue());
  }
  return crawlConfig;
}
{code}
Add the lines below to class org.apache.nutch.crawl.Crawl:
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
// Optional configuration injected by the caller before main() runs.
public static Configuration nutchConfig = null;

public static void setNutchConfig(Configuration config) {
  nutchConfig = config;
}
{code}
Then build the Nutch configuration inside the main method as below:
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
// Use the injected configuration directly, or fall back to the crawl defaults.
Configuration conf = (nutchConfig != null)
    ? nutchConfig
    : NutchConfiguration.createCrawlConfiguration();
{code}
I recommend this solution :)

> Nutch crawler did not read configuration files
> ----------------------------------------------
>
>                 Key: NUTCH-780
>                 URL: https://issues.apache.org/jira/browse/NUTCH-780
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.0.0
>            Reporter: Vu Hoang
>
> The Nutch searcher can read properties passed to its constructor ...
> {code:java|title=NutchSearcher.java|borderStyle=solid}
> NutchBean bean = new NutchBean(getFilesystem().getConf(), fs);
> ... // put search engine code here
> {code}
> ... but the Nutch crawler cannot; it only reads data from its arguments.
> {code:java|title=NutchCrawler.java|borderStyle=solid}
> StringBuilder builder = new StringBuilder();
> builder.append(domainlist + SPACE);
> builder.append(ARGUMENT_CRAWL_DIR);
> builder.append(domainlist + SUBFIX_CRAWLED + SPACE);
> builder.append(ARGUMENT_CRAWL_THREADS);
> builder.append(threads + SPACE);
> builder.append(ARGUMENT_CRAWL_DEPTH);
> builder.append(depth + SPACE);
> builder.append(ARGUMENT_CRAWL_TOPN);
> builder.append(topN + SPACE);
> Crawl.main(builder.toString().split(SPACE));
> {code}
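
For readers without a Nutch checkout, the overlay idea in the proposed overwrite method can be sketched with plain maps standing in for Hadoop Configuration objects. This is only an illustration under stated assumptions: the class name ConfigOverlay, the helper createCrawlDefaults (a stand-in for NutchConfiguration.createCrawlConfiguration()), and the default values shown are hypothetical and are not part of Nutch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: plain maps stand in for Hadoop Configuration objects. */
public class ConfigOverlay {

    /** Hypothetical stand-in for NutchConfiguration.createCrawlConfiguration(). */
    public static Map<String, String> createCrawlDefaults() {
        Map<String, String> defaults = new LinkedHashMap<>();
        // Illustrative values, not the defaults shipped with Nutch.
        defaults.put("http.agent.name", "default-agent");
        defaults.put("fetcher.threads.fetch", "10");
        return defaults;
    }

    /** Copies every caller entry over the crawl defaults; caller values win. */
    public static Map<String, String> overwrite(Map<String, String> callerConfig) {
        Map<String, String> crawlConfig = createCrawlDefaults();
        for (Map.Entry<String, String> entry : callerConfig.entrySet()) {
            crawlConfig.put(entry.getKey(), entry.getValue());
        }
        return crawlConfig;
    }

    public static void main(String[] args) {
        Map<String, String> caller = new LinkedHashMap<>();
        caller.put("http.agent.name", "my-crawler");

        // The caller's agent name replaces the default; the thread count is untouched.
        Map<String, String> conf = overwrite(caller);
        System.out.println(conf);
    }
}
```

Properties the caller does not set keep their crawl defaults, which is the practical difference between overlaying the caller's configuration and substituting it outright.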