Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "GoogleSummerOfCode/SitemapCrawler/weeklyreport" page has been changed by 
CihadGuzel:
https://wiki.apache.org/nutch/GoogleSummerOfCode/SitemapCrawler/weeklyreport?action=diff&rev1=6&rev2=7

Comment:
Added week 3&4 report 

  The robots.txt file is checked while the fetcher job runs. If the robots.txt 
file contains any sitemap URLs, they are written to the database. A column 
called sitemap (stm) is added to the db schema. The URLs in the stm column 
will be parsed on the next run.
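
As a rough illustration of the robots.txt step described above, the sketch below collects the values of `Sitemap:` lines from the text of a robots.txt file. It is not the Nutch code. The function name `extract_sitemap_urls` and the sample URLs are hypothetical, and writing the result to the stm column is left out.

```python
from typing import List


def extract_sitemap_urls(robots_txt: str) -> List[str]:
    """Return the URLs listed on 'Sitemap:' lines of a robots.txt body.

    Hypothetical helper for illustration only; not part of Nutch.
    """
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Split on the first colon only, so the URL's own colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            value = value.strip()
            if value:
                urls.append(value)
    return urls


sample = (
    "User-agent: *\n"
    "Disallow: /private\n"
    "Sitemap: https://example.com/sitemap.xml\n"
)
print(extract_sitemap_urls(sample))
```

The field name is matched case-insensitively, so `sitemap:` and `Sitemap:` are treated alike.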
  
  
- || '''Week :''' 3 (8 June 2015 - 21 June 2015) ||
+ || '''Week :''' 3 & 4 (8 June 2015 - 21 June 2015) ||
  
  '''Title :''' Sitemap parser plugin is developed.
  
  A plugin to parse sitemap files is developed. The plugin makes use of the 
crawler-commons library. The sitemap file is parsed by the parse plugin. Inlinks 
from the sitemap file are written to the db. The inlinks will be parsed on the 
next run.
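
To make the parsing step concrete, here is a minimal sketch that pulls the `<loc>` entries out of a sitemap document. It uses Python's standard `xml.etree.ElementTree` as a stand-in. The plugin itself relies on crawler-commons, whose API is not shown here. The function name `parse_sitemap` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap(xml_text: str) -> List[str]:
    """Return every <loc> URL found in a sitemap or sitemap index.

    Hypothetical helper for illustration only; not the plugin's code.
    """
    root = ET.fromstring(xml_text)
    links = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        if loc.text and loc.text.strip():
            links.append(loc.text.strip())
    return links


sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/about</loc></url>"
    "</urlset>"
)
print(parse_sitemap(sample_sitemap))
```

Each returned URL corresponds to a link that would be stored and picked up in a later crawl round.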
  
  
- || '''Week :''' 4 (22 June 2015 - 28 June 2015) ||
+ || '''Week :''' 5 (22 June 2015 - 28 June 2015) ||
+ ...
  
- '''Title :''' 
- 
- ----
- Example:
- 
