[ https://issues.apache.org/jira/browse/NUTCH-3117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian Nagel updated NUTCH-3117:
-----------------------------------
Environment: (pseudo)distributed mode, (single node) Hadoop cluster
> Index-more plugin fails to load configuration file date-styles.txt in
> distributed mode
> --------------------------------------------------------------------------------------
>
> Key: NUTCH-3117
> URL: https://issues.apache.org/jira/browse/NUTCH-3117
> Project: Nutch
> Issue Type: Bug
> Components: indexer, plugin
> Affects Versions: 1.18
> Environment: (pseudo)distributed mode, (single node) Hadoop cluster
> Reporter: Sebastian Nagel
> Priority: Minor
> Labels: help-wanted
> Fix For: 1.22
>
>
> The index-more plugin fails to load the configuration file date-styles.txt in
> (pseudo)distributed mode:
> {noformat}
> 2025-07-15 16:14:53,056 ERROR [main] org.apache.nutch.indexer.more.MoreIndexingFilter: Failed to load resource: date-styles.txt
> java.nio.file.NoSuchFileException: file:/tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1752584706091_0082/filecache/11/job.jar/job.jar!/date-styles.txt
>     at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
>     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
>     at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
>     at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219)
>     at java.base/java.nio.file.Files.newByteChannel(Files.java:371)
>     at java.base/java.nio.file.Files.newByteChannel(Files.java:422)
>     at java.base/java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:420)
>     at java.base/java.nio.file.Files.newInputStream(Files.java:156)
>     at java.base/java.nio.file.Files.newBufferedReader(Files.java:2839)
>     at java.base/java.nio.file.Files.readAllLines(Files.java:3330)
>     at org.apache.commons.io.FileUtils.readLines(FileUtils.java:2735)
>     at org.apache.nutch.indexer.more.MoreIndexingFilter.setConf(MoreIndexingFilter.java:338)
> {noformat}
> When date-styles.txt is packaged in the job.jar, it is not a regular file on
> the file system, so it cannot be opened by file path. It needs to be read as
> an entry inside the jar file.
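
One possible direction, sketched here with plain JDK calls only: read the resource through its URL stream rather than through a file path, so the same code handles both `file:` and `jar:` locations. The class `ResourceLines` and its `readLines` methods are hypothetical names for illustration, not the code in `MoreIndexingFilter` and not necessarily the fix that will be committed. Within Nutch, Hadoop's `Configuration.getConfResourceAsReader(String)` may be the more natural way to obtain the reader.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reads a text resource via its URL stream, not a file path. */
public class ResourceLines {

  /** Reads all lines from a resource URL; works for file: and jar: URLs alike. */
  public static List<String> readLines(URL resource) throws IOException {
    List<String> lines = new ArrayList<>();
    try (InputStream in = resource.openStream();
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        lines.add(line);
      }
    }
    return lines;
  }

  /** Resolves the resource name on the given class loader, then reads its lines. */
  public static List<String> readLines(ClassLoader loader, String name)
      throws IOException {
    URL url = loader.getResource(name);
    if (url == null) {
      // The resource is not visible to this class loader at all.
      throw new FileNotFoundException("Resource not found: " + name);
    }
    return readLines(url);
  }
}
```

A call such as `ResourceLines.readLines(Thread.currentThread().getContextClassLoader(), "date-styles.txt")` would then avoid converting the resource URL into a `File` or `Path`, which is the step that fails in the stack trace above. Whether the context class loader is the right one in a Hadoop task is an assumption that would need checking.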