tballison commented on PR #776:
URL: https://github.com/apache/nutch/pull/776#issuecomment-1726191372
:sob: Y, let's hold off until Hadoop 3.4.0 is released.
Thank you, again!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to
[
https://issues.apache.org/jira/browse/NUTCH-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766832#comment-17766832
]
Tim Allison commented on NUTCH-2937:
As [~snagel] pointed out on the PR for NUTCH-2959 -- looks like
tballison commented on PR #776:
URL: https://github.com/apache/nutch/pull/776#issuecomment-1725801990
> Btw., I've just rediscovered that using Tika in (pseudo)distributed mode is broken since the upgrade to Tika 2.3.0, see [NUTCH-2937](https://issues.apache.org/jira/browse/NUTCH-2937).
sebastian-nagel commented on PR #776:
URL: https://github.com/apache/nutch/pull/776#issuecomment-1725795918
> Can we exclude commons-io from hadoop and then add it as a dependency in the main ivy.xml?
When running in distributed or pseudo-distributed mode, commons-io 2.8.0 is
first
tballison commented on PR #776:
URL: https://github.com/apache/nutch/pull/776#issuecomment-1725746397
I'm getting a ConnectException when I try to run
nutch-test-single-node-cluster.
On hadoop startup, I see:
```
2023-09-19 10:25:15,186 INFO util.GSet: VM type = 64-bit
```
tballison commented on PR #776:
URL: https://github.com/apache/nutch/pull/776#issuecomment-1725714860
I'm guessing that commit won't work if distributed hadoop is bringing its
own jars (as you said!). Does hadoop do any custom classloading so that the
job jars are isolated from the
tballison commented on PR #776:
URL: https://github.com/apache/nutch/pull/776#issuecomment-1725611372
I haven't worked with ant in a while. According to `ant dependencytree`, it
looks like we don't have to exclude commons-io everywhere -- placing it in the
main ivy.xml has the same effect
tballison commented on PR #776:
URL: https://github.com/apache/nutch/pull/776#issuecomment-1725604218
Weird, I just pushed a commit bumping commons-io on my NUTCH-2959 branch,
and it isn't showing up in the PR... I'll wait a bit... Maybe github is out
for coffee?
Tim Allison created NUTCH-3003:
--
Summary: Consider integration testing in a Dockerized mini-hadoop
cluster via testcontainers?
Key: NUTCH-3003
URL: https://issues.apache.org/jira/browse/NUTCH-3003
sebastian-nagel opened a new pull request, #777:
URL: https://github.com/apache/nutch/pull/777
- implement class CaseInsensitiveMetadata providing case-insensitive
metadata look-ups (but no spell-checking)
- use CaseInsensitiveMetadata to hold HTTP header metadata in the class
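
To illustrate what "case-insensitive look-ups (but no spell-checking)" could mean in practice, here is a minimal Java sketch. It is not the class from the pull request: the class name `HeaderMapSketch` and its methods are hypothetical, and backing the map with a `TreeMap` ordered by `String.CASE_INSENSITIVE_ORDER` is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a case-insensitive header store; not Nutch code. */
class HeaderMapSketch {
  // With CASE_INSENSITIVE_ORDER, "Content-Type" and "content-type" are the same key.
  private final Map<String, List<String>> values =
      new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

  /** Appends a value under the given name, ignoring the case of the name. */
  void add(String name, String value) {
    values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
  }

  /** Returns the first value stored under the name, or null if there is none. */
  String get(String name) {
    List<String> v = values.get(name);
    return v == null ? null : v.get(0);
  }

  /** Returns all values stored under the name, or an empty list. */
  List<String> getValues(String name) {
    return values.getOrDefault(name, Collections.emptyList());
  }
}
```

In this sketch a look-up for `CONTENT-TYPE` finds values added as `Content-Type`, while a near miss such as `Content-Typ` is not corrected and simply returns nothing.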