GitHub user driki opened a pull request:

    https://github.com/apache/nutch/pull/1

    redirects treated as external links

    Hi,
    
    I ran into an issue where the crawler does not honor the 
db.ignore.external.links property when a link on the same domain redirects 
to an external domain. I tested the change locally against a few sites that 
I crawl, and it appears to be working.
    
    Thanks,
    Matt
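
To illustrate the behavior described above, here is a minimal sketch of a host comparison applied to a redirect target. It is not the code from this pull request: the class name `RedirectHostCheck`, the helper `hostOf`, and the method `shouldFollowRedirect` are hypothetical, and only the property name db.ignore.external.links comes from the message. The sketch assumes "external" means "different host", which may differ from how Nutch decides it.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical illustration only; not part of the Nutch code base.
public class RedirectHostCheck {

    /** Returns the lowercased host of a URL, or null if it cannot be parsed. */
    static String hostOf(String url) {
        try {
            String host = new URI(url).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    /**
     * Decides whether a redirect should be followed.
     * ignoreExternalLinks stands in for the value of db.ignore.external.links.
     */
    static boolean shouldFollowRedirect(String sourceUrl, String redirectTarget,
                                        boolean ignoreExternalLinks) {
        if (!ignoreExternalLinks) {
            return true; // no restriction configured, follow everything
        }
        String from = hostOf(sourceUrl);
        String to = hostOf(redirectTarget);
        // Follow only when both hosts are known and identical.
        return from != null && from.equals(to);
    }

    public static void main(String[] args) {
        // Same host: the redirect is followed.
        System.out.println(shouldFollowRedirect(
            "http://example.com/a", "http://example.com/b", true)); // true
        // Different host: the redirect is dropped, like an external outlink.
        System.out.println(shouldFollowRedirect(
            "http://example.com/a", "http://other.org/", true));    // false
    }
}
```

The point of the sketch is that a redirect target gets the same check as an ordinary outlink, so a same-domain link cannot lead the crawl off-site through a redirect.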

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NearbyFYI/nutch 2.x

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/1.patch

----
commit a71133d257aa5c7835b9cc98134b1e7b3df5b5fe
Author: Matt MacDonald <[email protected]>
Date:   2012-09-08T12:25:28-07:00

    redirects treated as external links

----
