Sebastian Nagel created NUTCH-3117:
--------------------------------------
Summary: Index-more plugin fails to load configuration file date-styles.txt in distributed mode
Key: NUTCH-3117
URL: https://issues.apache.org/jira/browse/NUTCH-3117
Project: Nutch
Issue Type: Bug
Components: indexer, plugin
Affects Versions: 1.18
Reporter: Sebastian Nagel
Fix For: 1.22
The index-more plugin fails to load the configuration file date-styles.txt in
(pseudo)distributed mode:
{noformat}
2025-07-15 16:14:53,056 ERROR [main] org.apache.nutch.indexer.more.MoreIndexingFilter: Failed to load resource: date-styles.txt
java.nio.file.NoSuchFileException: file:/tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1752584706091_0082/filecache/11/job.jar/job.jar!/date-styles.txt
	at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
	at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
	at java.base/java.nio.file.Files.newByteChannel(Files.java:371)
	at java.base/java.nio.file.Files.newByteChannel(Files.java:422)
	at java.base/java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:420)
	at java.base/java.nio.file.Files.newInputStream(Files.java:156)
	at java.base/java.nio.file.Files.newBufferedReader(Files.java:2839)
	at java.base/java.nio.file.Files.readAllLines(Files.java:3330)
	at org.apache.commons.io.FileUtils.readLines(FileUtils.java:2735)
	at org.apache.nutch.indexer.more.MoreIndexingFilter.setConf(MoreIndexingFilter.java:338)
{noformat}
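When date-styles.txt is loaded from the job.jar, it is not a regular file on
the local file system. It is an entry inside the jar archive and must be read
from there, for example through a resource stream, instead of through a file
path.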
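
To illustrate the difference, here is a minimal, self-contained sketch using only the JDK. It is not the actual patch for MoreIndexingFilter. The class JarResourceDemo and the helpers readResourceLines and createDemoJar are hypothetical names, and the temporary jar only stands in for job.jar. The sketch shows that a resource packaged in a jar resolves to a "jar:" URL and can be read through a stream obtained from the class loader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class JarResourceDemo {

    /** Reads all lines of a class-path resource through a stream, so jar entries work too. */
    static List<String> readResourceLines(ClassLoader loader, String name) throws IOException {
        InputStream in = loader.getResourceAsStream(name);
        if (in == null) {
            throw new IOException("Resource not found: " + name);
        }
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    /** Builds a throwaway jar with a single text entry (hypothetical stand-in for job.jar). */
    static Path createDemoJar(String entryName, String content) throws IOException {
        Path jar = Files.createTempFile("demo-job", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry(entryName));
            out.write(content.getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        // The file content is made up for the demo; it is not the real date-styles.txt.
        Path jar = createDemoJar("date-styles.txt", "yyyy-MM-dd\nEEE MMM dd HH:mm:ss yyyy\n");
        try (URLClassLoader loader =
                new URLClassLoader(new URL[] { jar.toUri().toURL() }, null)) {
            // The URL uses the "jar" scheme, so it cannot be treated as a plain file path.
            URL url = loader.getResource("date-styles.txt");
            System.out.println(url.getProtocol());
            // Reading through the stream works regardless of where the resource lives.
            System.out.println(readResourceLines(loader, "date-styles.txt"));
        } finally {
            Files.deleteIfExists(jar);
        }
    }
}
```

The stream-based read makes no assumption about whether the resource is a file on disk or a jar entry, which is why it would be expected to behave the same in local and (pseudo)distributed mode.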
--
This message was sent by Atlassian Jira
(v8.20.10#820010)