This is an automated email from the ASF dual-hosted git repository.

snagel pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git.


    from 0cec7b5  Merge pull request #335 from r0ann3l/NUTCH-2580
     add 4f73c63  NUTCH-2583 Upgrading Nutch's dependencies - apply patch contributed by Ralf
     add 20ecad2  NUTCH-2584 Upgrade parse-tika to use Tika 1.18
     add f5e3a30  NUTCH-2584 Upgrade parse-tika to use Tika 1.18
                  - fix failing unit tests
                  - use Tika parser to get DOM tree of test documents
                  - fix HTMLMetaProcessor to extract no-cache and base-href attributes on DOM tree modified by Tika
                  - ignore links from FORM and SOURCE elements which are not extracted by Tika parser
     add 217e646  Add target "report" to view dependency tree of plugins
     add 107b364  NUTCH-2589 HTML redirections are not followed when using parse-tika
                  - extract meta-refresh redirects from DOM tree normalized by Tika
                  - add unit test to check whether meta-refresh redirects are extracted and parse status holds the redirect target
     new 2544fad  Merge pull request #336 from sebastian-nagel/NUTCH-2583-upgrade-dependencies

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 ivy/ivy.xml                                        |  67 +++++------
 src/plugin/build-plugin.xml                        |   4 +
 src/plugin/parse-tika/build.xml                    |  15 +--
 src/plugin/parse-tika/howto_upgrade_tika.txt       |  16 ++-
 src/plugin/parse-tika/ivy.xml                      |   2 +-
 src/plugin/parse-tika/plugin.xml                   |  65 +++++++----
 .../apache/nutch/parse/tika/HTMLMetaProcessor.java | 125 +++++++++++++--------
 .../org/apache/nutch/parse/tika/TikaParser.java    |  20 ++--
 .../{ => parse}/tika/TestDOMContentUtils.java      |  78 +++++++------
 .../nutch/{ => parse}/tika/TestFeedParser.java     |   2 +-
 .../nutch/{ => parse}/tika/TestHtmlParser.java     |   2 +-
 .../nutch/{ => parse}/tika/TestImageMetadata.java  |   2 +-
 .../nutch/{ => parse}/tika/TestMSWordParser.java   |   2 +-
 .../nutch/{ => parse}/tika/TestOOParser.java       |   2 +-
 .../nutch/{ => parse}/tika/TestPdfParser.java      |   2 +-
 .../nutch/{ => parse}/tika/TestRTFParser.java      |   2 +-
 .../{ => parse}/tika/TestRobotsMetaProcessor.java  |  70 ++++++++----
 17 files changed, 276 insertions(+), 200 deletions(-)
 rename src/plugin/parse-tika/src/test/org/apache/nutch/{ => parse}/tika/TestDOMContentUtils.java (89%)
 rename src/plugin/parse-tika/src/test/org/apache/nutch/{ => parse}/tika/TestFeedParser.java (99%)
 rename src/plugin/parse-tika/src/test/org/apache/nutch/{ => parse}/tika/TestHtmlParser.java (99%)
 rename src/plugin/parse-tika/src/test/org/apache/nutch/{ => parse}/tika/TestImageMetadata.java (98%)
 rename src/plugin/parse-tika/src/test/org/apache/nutch/{ => parse}/tika/TestMSWordParser.java (98%)
 rename src/plugin/parse-tika/src/test/org/apache/nutch/{ => parse}/tika/TestOOParser.java (98%)
 rename src/plugin/parse-tika/src/test/org/apache/nutch/{ => parse}/tika/TestPdfParser.java (98%)
 rename src/plugin/parse-tika/src/test/org/apache/nutch/{ => parse}/tika/TestRTFParser.java (98%)
 rename src/plugin/parse-tika/src/test/org/apache/nutch/{ => parse}/tika/TestRobotsMetaProcessor.java (68%)

-- 
To stop receiving notification emails like this one, please contact
sna...@apache.org.
