[ 
https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1741:
----------------------------------------
    Attachment: NUTCH-1741v7.patch

Managed to update this at the weekend and forgot to upload. 
Some things we need to consider:
 * The mappings in the gora-*-mapping.xml files need to be tested more 
thoroughly, as the backend mappings may not be the most efficient for storing 
the new sitemap and sitemap priority data structures. 
 * There are 4 tests being skipped in TestGeneratorJob. I'm going to log a new 
ticket for this and we can fix it over there. This is not a blocker for 
committing and further testing this rather substantial Sitemaps patch for 2.X.

Generally speaking, a sterling effort from [~alparslan.avci] and especially 
[~cguzel] within GSoC 2015 :)

I'm going to commit to 2.X now as I've tested locally. 

> Support of Sitemaps in Nutch 2.x
> --------------------------------
>
>                 Key: NUTCH-1741
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1741
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher, generator
>            Reporter: Alparslan Avcı
>            Assignee: cihad güzel
>              Labels: gsoc2015
>             Fix For: 2.4
>
>         Attachments: NUTCH-1741-v2.patch, NUTCH-1741-v3.patch, 
> NUTCH-1741-v4.patch, NUTCH-1741.patch, NUTCH-1741v5.patch, 
> NUTCH-1741v6.patch, NUTCH-1741v7.patch, SitemapCrawlerLifeCycle.pdf, 
> SitemapDevelopmentFor2x.pdf
>
>
> Sitemap support has to be implemented for 2.x branch. It is being discussed 
> in NUTCH-1465 for trunk. 



