[ https://issues.apache.org/jira/browse/NUTCH-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766102#comment-17766102 ]
ASF GitHub Bot commented on NUTCH-2978:
---------------------------------------

tballison commented on PR #772:
URL: https://github.com/apache/nutch/pull/772#issuecomment-1722508915

Fantastic! Thank you so much Sebastian!

On Sun, Sep 17, 2023 at 9:02 AM Sebastian Nagel ***@***.***> wrote:

> +1
>
> A test with the pseudo-distributed Hadoop setup
> <https://github.com/sebastian-nagel/nutch-test-single-node-cluster/> was
> successful:
>
> - Nutch tools work properly, no issues
> - as expected, Hadoop puts slf4j-api-1.7.36.jar and
>   slf4j-reload4j-1.7.36.jar in the classpath in front of the Nutch job jars
> - consequently, task logs are formatted using the format defined in
>   $HADOOP_HOME/etc/hadoop/log4j.properties
> - (the good thing) log messages from Nutch classes appear in the task
>   logs, e.g.
>
>   2023-09-17 07:29:21,726 INFO [FetcherThread] org.apache.nutch.fetcher.FetcherThread: FetcherThread 33 fetching https://nutch.apache.org/ (queue crawl delay=5000ms)
>
> - the log format defined in $NUTCH_HOME/conf/log4j2.xml is only
>   applied to the logs of the Yarn job client, e.g.
>
>   2023-09-17 07:29:32,432 INFO fetcher.Fetcher: Fetcher: finished at 2023-09-17 07:29:32, elapsed: 00:00:25
>
> - in addition, I've included two PDFs, an XLSX and an EPUB document, to
>   test the Tika parser: the docs were successfully parsed using Tika 2.3.0 -
>   if necessary I can repeat the test for NUTCH-2959
>   <https://issues.apache.org/jira/browse/NUTCH-2959>

> Move to slf4j2 and remove log4j1 and reload4j
> ---------------------------------------------
>
>                 Key: NUTCH-2978
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2978
>             Project: Nutch
>          Issue Type: Task
>            Reporter: Markus Jelsma
>            Priority: Major
>         Attachments: NUTCH-2978-1.patch, NUTCH-2978-2.patch,
>                      NUTCH-2978-3.patch, NUTCH-2978-any23.patch, NUTCH-2978.patch
>
>
> I ran into trouble upgrading some dependencies today: I got a lot of
> LinkageErrors and, with a Tika upgrade, disappearing logs. This patch fixes
> that by moving to slf4j2, using the correct log4j-slf4j2-impl, and getting
> rid of the old log4j -> reload4j.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)