[
https://issues.apache.org/jira/browse/NUTCH-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766102#comment-17766102
]
ASF GitHub Bot commented on NUTCH-2978:
---------------------------------------
tballison commented on PR #772:
URL: https://github.com/apache/nutch/pull/772#issuecomment-1722508915
Fantastic! Thank you so much Sebastian!
On Sun, Sep 17, 2023 at 9:02 AM Sebastian Nagel ***@***.***>
wrote:
> +1
>
> A test with the pseudo-distributed Hadoop setup
> <https://github.com/sebastian-nagel/nutch-test-single-node-cluster/> was
> successful:
>
> - Nutch tools work properly, no issues
> - as expected, Hadoop puts slf4j-api-1.7.36.jar and
> slf4j-reload4j-1.7.36.jar in the classpath in front of the Nutch job
> jars
> - consequently, task logs are formatted using the format defined in
> $HADOOP_HOME/etc/hadoop/log4j.properties
> - (the good thing) log messages from Nutch classes appear in the task
> logs, e.g.
>
> 2023-09-17 07:29:21,726 INFO [FetcherThread] org.apache.nutch.fetcher.FetcherThread: FetcherThread 33 fetching https://nutch.apache.org/ (queue crawl delay=5000ms)
>
> - the log format defined in $NUTCH_HOME/conf/log4j2.xml is only
> applied to the logs of the Yarn job client, e.g.
>
> 2023-09-17 07:29:32,432 INFO fetcher.Fetcher: Fetcher: finished at 2023-09-17 07:29:32, elapsed: 00:00:25
>
> - in addition, I've included two PDFs, an XLSX and an ePub document to
> test the Tika parser: the docs were successfully parsed using Tika
> 2.3.0. If necessary, I can repeat the test for NUTCH-2959
> <https://issues.apache.org/jira/browse/NUTCH-2959>
>
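The classpath ordering described in the list above (Hadoop's slf4j jars ahead of the Nutch job jars) can be checked from inside a JVM. The following is only a minimal sketch and was not part of the test reported here: the class name `ClasspathOrigin` and the helper `locationOf` are hypothetical, and the default class name `org.slf4j.LoggerFactory` is just an example of what one might look up. It uses only standard JDK calls and prints the jar or directory a class was actually loaded from.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Nutch or Hadoop): reports where a class
// was loaded from, to see which jar wins when several are on the classpath.
final class ClasspathOrigin {

    // Returns the code source location of the given class as a string, or a
    // placeholder for classes without one (e.g. JDK bootstrap classes).
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Class names to look up; the default is only an example.
        String[] names = args.length > 0
                ? args
                : new String[] {"org.slf4j.LoggerFactory"};
        for (String name : names) {
            try {
                // Load without initializing, to avoid side effects.
                Class<?> cls = Class.forName(
                        name, false, ClasspathOrigin.class.getClassLoader());
                System.out.println(name + " -> " + locationOf(cls));
            } catch (ClassNotFoundException | LinkageError e) {
                System.out.println(name + " -> not loadable: " + e);
            }
        }
    }
}
```

Run with the same classpath as the task in question, this would be expected to show the jar under the Hadoop installation if Hadoop's copy takes precedence, and a LinkageError or missing class shows up as "not loadable" instead of aborting the run.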
> Move to slf4j2 and remove log4j1 and reload4j
> ---------------------------------------------
>
> Key: NUTCH-2978
> URL: https://issues.apache.org/jira/browse/NUTCH-2978
> Project: Nutch
> Issue Type: Task
> Reporter: Markus Jelsma
> Priority: Major
> Attachments: NUTCH-2978-1.patch, NUTCH-2978-2.patch,
> NUTCH-2978-3.patch, NUTCH-2978-any23.patch, NUTCH-2978.patch
>
>
> I ran into trouble upgrading some dependencies and got a lot of LinkageErrors
> today or, with a Tika upgrade, disappearing logs. This patch fixes that by
> moving to slf4j2, using the correct log4j-slf4j2-impl and getting rid of the old
> log4j -> reload4j.
>
> This patch fixes it.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)