[ 
https://issues.apache.org/jira/browse/NUTCH-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882446#comment-17882446
 ] 

Hudson commented on NUTCH-1806:
-------------------------------

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #168 (See
[https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/168/])
NUTCH-1806 Delegate processing of URL domains to crawler commons (snagel: 
[https://github.com/apache/nutch/commit/f6bcec9200f48fcf758cc349cf9480c947259c7e])
* (edit) src/test/org/apache/nutch/util/TestURLUtil.java
NUTCH-1806 Delegate processing of URL domains to crawler commons (snagel: 
[https://github.com/apache/nutch/commit/bc2ae7e0c8d59c03ea8fe31a7043ed3bb9e6abb4])
* (edit) src/test/org/apache/nutch/util/TestURLUtil.java
NUTCH-1806 Delegate processing of URL domains to crawler commons (snagel: 
[https://github.com/apache/nutch/commit/e0fa35729b6d046f3c056872bc84433bd5a94d77])
* (edit) src/test/org/apache/nutch/util/TestURLUtil.java
* (edit) src/java/org/apache/nutch/util/URLUtil.java
NUTCH-1806 Delegate processing of URL domains to crawler commons (snagel: 
[https://github.com/apache/nutch/commit/d43f5793fd0e9681f8bfc12b046de93e3bcf6fa6])
* (delete) src/java/org/apache/nutch/util/domain/DomainSuffix.java
* (edit) src/plugin/tld/src/java/org/apache/nutch/indexer/tld/TLDIndexingFilter.java
* (edit) default.properties
* (delete) conf/domain-suffixes.xsd
* (edit) src/bin/nutch
* (delete) src/java/org/apache/nutch/util/domain/DomainStatistics.java
* (delete) src/java/org/apache/nutch/util/domain/TopLevelDomain.java
* (delete) src/java/org/apache/nutch/util/domain/DomainSuffixes.java
* (edit) src/plugin/urlfilter-domain/src/java/org/apache/nutch/urlfilter/domain/DomainURLFilter.java
* (delete) conf/domain-suffixes.xml.template
* (delete) src/plugin/tld/src/java/org/apache/nutch/scoring/tld/TLDScoringFilter.java
* (add) src/java/org/apache/nutch/util/DomainStatistics.java
* (delete) src/java/org/apache/nutch/util/domain/package-info.java
* (delete) src/java/org/apache/nutch/util/domain/DomainSuffixesReader.java
* (delete) src/plugin/tld/src/java/org/apache/nutch/scoring/tld/package-info.java
* (edit) src/plugin/urlfilter-domaindenylist/src/java/org/apache/nutch/urlfilter/domaindenylist/DomainDenylistURLFilter.java
NUTCH-1806 Delegate processing of URL domains to crawler commons (snagel: 
[https://github.com/apache/nutch/commit/40881e8b755e24d78a60689bd818058daba1a6fc])
* (edit) src/test/org/apache/nutch/util/TestURLUtil.java
* (edit) src/java/org/apache/nutch/util/URLUtil.java


> Delegate processing of URL domains to crawler commons
> -----------------------------------------------------
>
>                 Key: NUTCH-1806
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1806
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.8
>            Reporter: Julien Nioche
>            Priority: Major
>              Labels: crawler-commons
>             Fix For: 1.21
>
>
> We have code in src/java/org/apache/nutch/util/domain and a resource file 
> conf/domain-suffixes.xml to handle URL domains. This is used mostly from 
> URLUtil.getDomainName.
> The resource file is not necessarily up to date, and since crawler commons
> has similar functionality we should use it instead of maintaining our own
> resources.
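
For readers unfamiliar with what suffix-based domain extraction involves, here is a minimal, self-contained sketch. It is not the Nutch or crawler commons code: the class name DomainNameSketch, its getDomainName method, and the four-entry suffix set are all hypothetical and chosen only for illustration. A real suffix list is far larger and changes over time, which is the maintenance burden the issue wants to delegate.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not the Nutch or crawler commons implementation.
final class DomainNameSketch {

    // Tiny hard-coded suffix set (an assumption for this sketch).
    // A maintained suffix list contains thousands of entries.
    private static final Set<String> SUFFIXES = Set.of("com", "org", "uk", "co.uk");

    /**
     * Returns the longest matching suffix plus the one label to its left,
     * or the lower-cased host unchanged if no suffix matches.
     */
    static String getDomainName(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        String[] labels = h.split("\\.");
        // Candidate suffixes are tried from longest to shortest;
        // i is the index of the first label belonging to the suffix.
        for (int i = 1; i < labels.length; i++) {
            String suffix = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (SUFFIXES.contains(suffix)) {
                return labels[i - 1] + "." + suffix;
            }
        }
        return h;
    }
}
```

With this toy suffix set, DomainNameSketch.getDomainName("www.example.co.uk") yields "example.co.uk" because "co.uk" is matched before "uk". The result is only as good as the suffix data behind it, so an outdated resource file gives wrong domains for newer suffixes.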



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
