This is an automated email from the ASF dual-hosted git repository.

snagel pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git


    from 582cdd417 NUTCH-3058 Fetcher: counter for hung threads (#820)
     add f6bcec920 NUTCH-1806 Delegate processing of URL domains to crawler commons
           - add unit test for URLs without host/domain (cf. NUTCH-2450)
     add bc2ae7e0c NUTCH-1806 Delegate processing of URL domains to crawler commons
           - add unit tests for host names with trailing dot ("www.apache.org.")
     add e0fa35729 NUTCH-1806 Delegate processing of URL domains to crawler commons
           - use methods from crawler-commons' EffectiveTldFinder in URLUtil,
             replacing classes and methods from the org.apache.nutch.util.domain package
           - adapt and extend unit tests
             - add tests for URLUtil.getTopLevelDomainName(url)
             - changes to the public suffix list since 2014
               ("xyz" is now a public suffix / ICANN suffix)
             - minor API changes
               - URLUtil.getDomainName(url) returns the host name [...]
     add d43f5793f NUTCH-1806 Delegate processing of URL domains to crawler commons
           NUTCH-1942 Remove TopLevelDomain
           - update DomainStatistics, TLDIndexingFilter and domain URL filters
             to use the updated methods in URLUtil
           - remove TLDScoringFilter
           - remove package org.apache.nutch.util.domain
           - move DomainStatistics to org.apache.nutch.util
           - remove configuration files of domain utils
     add 40881e8b7 NUTCH-1806 Delegate processing of URL domains to crawler commons
     new 8b11962a4 Merge pull request #816 from sebastian-nagel/NUTCH-1942-domain-utils-to-use-crawler-commons

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 conf/domain-suffixes.xml.template                  | 4428 --------------------
 conf/domain-suffixes.xsd                           |  130 -
 default.properties                                 |    1 -
 src/bin/nutch                                      |    2 +-
 .../nutch/util/{domain => }/DomainStatistics.java  |    7 +-
 src/java/org/apache/nutch/util/URLUtil.java        |  214 +-
 .../org/apache/nutch/util/domain/DomainSuffix.java |   78 -
 .../apache/nutch/util/domain/DomainSuffixes.java   |   91 -
 .../nutch/util/domain/DomainSuffixesReader.java    |  164 -
 .../apache/nutch/util/domain/TopLevelDomain.java   |   66 -
 .../org/apache/nutch/util/domain/package-info.java |   28 -
 .../nutch/indexer/tld/TLDIndexingFilter.java       |   13 +-
 .../apache/nutch/scoring/tld/TLDScoringFilter.java |   60 -
 .../org/apache/nutch/scoring/tld/package-info.java |   19 -
 .../nutch/urlfilter/domain/DomainURLFilter.java    |    9 +-
 .../domaindenylist/DomainDenylistURLFilter.java    |    9 +-
 src/test/org/apache/nutch/util/TestURLUtil.java    |   81 +-
 17 files changed, 208 insertions(+), 5192 deletions(-)
 delete mode 100644 conf/domain-suffixes.xml.template
 delete mode 100644 conf/domain-suffixes.xsd
 rename src/java/org/apache/nutch/util/{domain => }/DomainStatistics.java (97%)
 delete mode 100644 src/java/org/apache/nutch/util/domain/DomainSuffix.java
 delete mode 100644 src/java/org/apache/nutch/util/domain/DomainSuffixes.java
 delete mode 100644 src/java/org/apache/nutch/util/domain/DomainSuffixesReader.java
 delete mode 100644 src/java/org/apache/nutch/util/domain/TopLevelDomain.java
 delete mode 100644 src/java/org/apache/nutch/util/domain/package-info.java
 delete mode 100644 src/plugin/tld/src/java/org/apache/nutch/scoring/tld/TLDScoringFilter.java
 delete mode 100644 src/plugin/tld/src/java/org/apache/nutch/scoring/tld/package-info.java
