Page search2.net deleted from Nutch Wiki

2010-01-26 Thread Apache Wiki
Dear wiki user,

You have subscribed to a wiki page Nutch Wiki for change notification.

The page search2.net has been deleted by search2.net.
The comment on this change is: empty page.
http://wiki.apache.org/nutch/search2.net


[jira] Issue Comment Edited: (NUTCH-780) Nutch crawler did not read configuration files

2010-01-26 Thread Vu Hoang (JIRA)

[ https://issues.apache.org/jira/browse/NUTCH-780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803193#action_12803193 ]

Vu Hoang edited comment on NUTCH-780 at 1/27/10 2:54 AM:
-

add method
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
public static Configuration overwrite(Configuration nutchConfig)
{
  Configuration crawlConfig = NutchConfiguration.createCrawlConfiguration();
  Iterator<Entry<String, String>> entries = nutchConfig.iterator();
  while (entries.hasNext())
  {
    Entry<String, String> entry = (Entry<String, String>) entries.next();
    crawlConfig.set(entry.getKey(), entry.getValue());
  }

  return crawlConfig;
}
{code}
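
The method above copies every entry of the caller's configuration onto a freshly created crawl configuration, so the caller's values win and anything the caller left unset keeps its crawl default. A minimal, self-contained sketch of that overlay follows. It uses plain java.util.Map objects as stand-ins for Hadoop's Configuration so it runs without Nutch on the classpath; the class name ConfigOverlaySketch, the method name overlay, and the property values are hypothetical and chosen only for illustration.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Map.Entry;

// Hypothetical stand-in: plain maps play the role of Configuration objects.
class ConfigOverlaySketch {

    // Returns a new map holding the defaults with every override entry copied on top.
    // Neither input map is modified.
    static Map<String, String> overlay(Map<String, String> defaults,
                                       Map<String, String> overrides) {
        Map<String, String> result = new LinkedHashMap<>(defaults);
        Iterator<Entry<String, String>> entries = overrides.entrySet().iterator();
        while (entries.hasNext()) {
            Entry<String, String> entry = entries.next();
            result.put(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative values only; they are not taken from the issue.
        Map<String, String> crawlDefaults = new LinkedHashMap<>();
        crawlDefaults.put("http.agent.name", "default-agent");
        crawlDefaults.put("fetcher.threads.fetch", "10");

        Map<String, String> callerConfig = new LinkedHashMap<>();
        callerConfig.put("http.agent.name", "my-crawler");

        // The caller's agent name replaces the default; the thread count is kept.
        System.out.println(overlay(crawlDefaults, callerConfig));
    }
}
```

Building the result from the defaults first and then applying the caller's entries mirrors the order used in the method above: createCrawlConfiguration() first, then set() for each entry.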

add lines below into class org.apache.nutch.crawl.Crawl
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
public static Configuration nutchConfig = null;
public static void setNutchConfig(Configuration config) { nutchConfig = config; }
{code}

and re-configure nutch configuration inside of method main as below
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
Configuration conf = null;
if (nutchConfig != null) conf = nutchConfig;
else conf = NutchConfiguration.createCrawlConfiguration();
{code}

I recommend that solution :)

  was (Author: vushogerts):
add lines below into class org.apache.nutch.crawl.Crawl
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
public static Configuration nutchConfig = null;
public static void setNutchConfig(Configuration config) { nutchConfig = config; }
{code}

and re-configure nutch configuration inside of method main as below
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
Configuration conf = null;
if (nutchConfig != null) conf = nutchConfig;
else conf = NutchConfiguration.createCrawlConfiguration();
{code}

I recommend that solution :)
  
 Nutch crawler did not read configuration files
 --

 Key: NUTCH-780
 URL: https://issues.apache.org/jira/browse/NUTCH-780
 Project: Nutch
  Issue Type: Bug
  Components: fetcher
Affects Versions: 1.0.0
Reporter: Vu Hoang

 The Nutch searcher accepts a configuration through its constructor ...
 {code:java|title=NutchSearcher.java|borderStyle=solid}
 NutchBean bean = new NutchBean(getFilesystem().getConf(), fs);
 ... // put search engine code here
 {code}
 ... but the Nutch crawler does not; it only reads data from its arguments.
 {code:java|title=NutchCrawler.java|borderStyle=solid}
 StringBuilder builder = new StringBuilder();
 builder.append(domainlist + SPACE);
 builder.append(ARGUMENT_CRAWL_DIR);
 builder.append(domainlist + SUBFIX_CRAWLED + SPACE);
 builder.append(ARGUMENT_CRAWL_THREADS);
 builder.append(threads + SPACE);
 builder.append(ARGUMENT_CRAWL_DEPTH);
 builder.append(depth + SPACE);
 builder.append(ARGUMENT_CRAWL_TOPN);
 builder.append(topN + SPACE);
 Crawl.main(builder.toString().split(SPACE));
 {code}
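
 The snippet above joins everything into one space-separated string and splits it again, which would break if a path contained a space. A hedged sketch of the same call shape, built as a list instead, is shown below. The flag strings follow the Nutch 1.0 Crawl usage message (<urlDir> [-dir d] [-threads n] [-depth i] [-topN N]); the class name CrawlArgsSketch, the method name buildArgs, and the sample directory names are assumptions, because the values of the reporter's constants are not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the argument array without a join-and-split round trip.
class CrawlArgsSketch {

    static String[] buildArgs(String urlDir, String crawlDir,
                              int threads, int depth, long topN) {
        List<String> args = new ArrayList<>();
        args.add(urlDir);                       // seed URL directory comes first
        args.add("-dir");
        args.add(crawlDir);                     // output directory for the crawl
        args.add("-threads");
        args.add(Integer.toString(threads));
        args.add("-depth");
        args.add(Integer.toString(depth));
        args.add("-topN");
        args.add(Long.toString(topN));
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Sample values are made up for illustration.
        String[] crawlArgs = buildArgs("urls", "urls-crawled", 10, 3, 50L);
        System.out.println(String.join(" ", crawlArgs));
        // With Nutch on the classpath the array would be handed on:
        // Crawl.main(crawlArgs);
    }
}
```

 Each value stays a separate array element, so a directory name with a space in it is still passed as a single argument.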

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (NUTCH-780) Nutch crawler did not read configuration files

2010-01-26 Thread Vu Hoang (JIRA)

[ https://issues.apache.org/jira/browse/NUTCH-780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803193#action_12803193 ]

Vu Hoang edited comment on NUTCH-780 at 1/27/10 2:55 AM:
-

add method
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
public static Configuration overwrite(Configuration nutchConfig)
{
  Configuration crawlConfig = NutchConfiguration.createCrawlConfiguration();
  Iterator<Entry<String, String>> entries = nutchConfig.iterator();
  while (entries.hasNext())
  {
    Entry<String, String> entry = (Entry<String, String>) entries.next();
    crawlConfig.set(entry.getKey(), entry.getValue());
  }

  return crawlConfig;
}
{code}

add lines below into class org.apache.nutch.crawl.Crawl
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
public static Configuration nutchConfig = null;
public static void setNutchConfig(Configuration config) { nutchConfig = config; }
{code}

and re-configure nutch configuration inside of method main as below
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
Configuration conf = null;
if (nutchConfig != null) conf = overwrite(nutchConfig);
else conf = NutchConfiguration.createCrawlConfiguration();
{code}
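
The difference from the earlier revision of this comment is that main no longer uses the injected object directly; it uses a copy layered on the crawl defaults. A rough sketch of that fallback, again with java.util.Map standing in for Configuration, is given below. All names (CrawlConfigHolderSketch, injectedConfig, createDefaults, resolveConfig) and the crawl.mode key are hypothetical; createDefaults only stands in for NutchConfiguration.createCrawlConfiguration().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the static hook plus fallback described above.
class CrawlConfigHolderSketch {

    // Optional configuration supplied by the caller before the crawl starts.
    static Map<String, String> injectedConfig = null;

    static void setInjectedConfig(Map<String, String> config) {
        injectedConfig = config;
    }

    // Stand-in for NutchConfiguration.createCrawlConfiguration(); the entry is made up.
    static Map<String, String> createDefaults() {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("crawl.mode", "default");
        return defaults;
    }

    // Always starts from fresh defaults; injected entries are copied on top if present.
    static Map<String, String> resolveConfig() {
        Map<String, String> conf = createDefaults();
        if (injectedConfig != null) {
            conf.putAll(injectedConfig);
        }
        return conf;
    }

    public static void main(String[] args) {
        // Nothing injected: the defaults are used.
        System.out.println(resolveConfig());

        Map<String, String> mine = new HashMap<>();
        mine.put("crawl.mode", "custom");
        setInjectedConfig(mine);

        // Injected value wins, and later changes stay local to the copy.
        Map<String, String> conf = resolveConfig();
        conf.put("crawl.mode", "changed-by-crawl");
        System.out.println(mine);
    }
}
```

Because the resolved configuration is a copy, anything the crawl changes afterwards does not leak back into the object the caller passed in.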

I recommend that solution :)

  was (Author: vushogerts):
add method
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
public static Configuration overwrite(Configuration nutchConfig)
{
  Configuration crawlConfig = NutchConfiguration.createCrawlConfiguration();
  Iterator<Entry<String, String>> entries = nutchConfig.iterator();
  while (entries.hasNext())
  {
    Entry<String, String> entry = (Entry<String, String>) entries.next();
    crawlConfig.set(entry.getKey(), entry.getValue());
  }

  return crawlConfig;
}
{code}

add lines below into class org.apache.nutch.crawl.Crawl
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
public static Configuration nutchConfig = null;
public static void setNutchConfig(Configuration config) { nutchConfig = config; }
{code}

and re-configure nutch configuration inside of method main as below
{code:java|title=org/apache/nutch/crawl/Crawl.java|borderStyle=solid}
Configuration conf = null;
if (nutchConfig != null) conf = nutchConfig;
else conf = NutchConfiguration.createCrawlConfiguration();
{code}

I recommend that solution :)
  
 Nutch crawler did not read configuration files
 --

 Key: NUTCH-780
 URL: https://issues.apache.org/jira/browse/NUTCH-780
 Project: Nutch
  Issue Type: Bug
  Components: fetcher
Affects Versions: 1.0.0
Reporter: Vu Hoang

 The Nutch searcher accepts a configuration through its constructor ...
 {code:java|title=NutchSearcher.java|borderStyle=solid}
 NutchBean bean = new NutchBean(getFilesystem().getConf(), fs);
 ... // put search engine code here
 {code}
 ... but the Nutch crawler does not; it only reads data from its arguments.
 {code:java|title=NutchCrawler.java|borderStyle=solid}
 StringBuilder builder = new StringBuilder();
 builder.append(domainlist + SPACE);
 builder.append(ARGUMENT_CRAWL_DIR);
 builder.append(domainlist + SUBFIX_CRAWLED + SPACE);
 builder.append(ARGUMENT_CRAWL_THREADS);
 builder.append(threads + SPACE);
 builder.append(ARGUMENT_CRAWL_DEPTH);
 builder.append(depth + SPACE);
 builder.append(ARGUMENT_CRAWL_TOPN);
 builder.append(topN + SPACE);
 Crawl.main(builder.toString().split(SPACE));
 {code}




[jira] Updated: (NUTCH-780) Nutch crawler did not read configuration files

2010-01-26 Thread Vu Hoang (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vu Hoang updated NUTCH-780:
---

Attachment: NUTCH-780.patch

 Nutch crawler did not read configuration files
 --

 Key: NUTCH-780
 URL: https://issues.apache.org/jira/browse/NUTCH-780
 Project: Nutch
  Issue Type: Bug
  Components: fetcher
Affects Versions: 1.0.0
Reporter: Vu Hoang
 Attachments: NUTCH-780.patch


 The Nutch searcher accepts a configuration through its constructor ...
 {code:java|title=NutchSearcher.java|borderStyle=solid}
 NutchBean bean = new NutchBean(getFilesystem().getConf(), fs);
 ... // put search engine code here
 {code}
 ... but the Nutch crawler does not; it only reads data from its arguments.
 {code:java|title=NutchCrawler.java|borderStyle=solid}
 StringBuilder builder = new StringBuilder();
 builder.append(domainlist + SPACE);
 builder.append(ARGUMENT_CRAWL_DIR);
 builder.append(domainlist + SUBFIX_CRAWLED + SPACE);
 builder.append(ARGUMENT_CRAWL_THREADS);
 builder.append(threads + SPACE);
 builder.append(ARGUMENT_CRAWL_DEPTH);
 builder.append(depth + SPACE);
 builder.append(ARGUMENT_CRAWL_TOPN);
 builder.append(topN + SPACE);
 Crawl.main(builder.toString().split(SPACE));
 {code}




[jira] Updated: (NUTCH-780) Nutch crawler did not read configuration files

2010-01-26 Thread Vu Hoang (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vu Hoang updated NUTCH-780:
---

Patch Info: [Patch Available]

 Nutch crawler did not read configuration files
 --

 Key: NUTCH-780
 URL: https://issues.apache.org/jira/browse/NUTCH-780
 Project: Nutch
  Issue Type: Bug
  Components: fetcher
Affects Versions: 1.0.0
Reporter: Vu Hoang
 Attachments: NUTCH-780.patch


 The Nutch searcher accepts a configuration through its constructor ...
 {code:java|title=NutchSearcher.java|borderStyle=solid}
 NutchBean bean = new NutchBean(getFilesystem().getConf(), fs);
 ... // put search engine code here
 {code}
 ... but the Nutch crawler does not; it only reads data from its arguments.
 {code:java|title=NutchCrawler.java|borderStyle=solid}
 StringBuilder builder = new StringBuilder();
 builder.append(domainlist + SPACE);
 builder.append(ARGUMENT_CRAWL_DIR);
 builder.append(domainlist + SUBFIX_CRAWLED + SPACE);
 builder.append(ARGUMENT_CRAWL_THREADS);
 builder.append(threads + SPACE);
 builder.append(ARGUMENT_CRAWL_DEPTH);
 builder.append(depth + SPACE);
 builder.append(ARGUMENT_CRAWL_TOPN);
 builder.append(topN + SPACE);
 Crawl.main(builder.toString().split(SPACE));
 {code}
