[
https://issues.apache.org/jira/browse/FLINK-10954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124917#comment-17124917
]
Yun Tang commented on FLINK-10954:
----------------------------------
I think multiple directories could benefit when using FsStateBackend if we have
multi disks and I believe this could make the disk usage more balance. On the
other hand, if we decide to choose one directory randomly as the base
checkpoint directory of FsStateBackend, and there exists many
{{LocalRecoveryDirectoryProvider}}'s on that machine. I think this could
achieve similar effects compared with current implementation. I believe this
could make the problem much easier as you thought though this needs to refactor
current interfaces of {{LocalRecoveryDirectoryProvider}}.
Besides, for RocksDB state-backend, we cannot just choose randomly as the base
directory due to recovery from previous local state also depends on the hard
link mechanism.
> Hardlink from files of previous local stored state might cross devices
> ----------------------------------------------------------------------
>
> Key: FLINK-10954
> URL: https://issues.apache.org/jira/browse/FLINK-10954
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends
> Affects Versions: 1.6.2
> Reporter: Yun Tang
> Assignee: Yun Tang
> Priority: Critical
> Fix For: 1.12.0
>
>
> Currently, local recovery's base directories is initialized from
> '{{io.tmp.dirs}}' if parameter '{{taskmanager.state.local.root-dirs}}' is not
> set. For Yarn environment, the tmp dirs is replaced by its '{{LOCAL_DIRS}}',
> which might consist of directories from different devices, such as
> /dump/1/nm-local-dir, /dump/2/nm-local-dir. The local directory for RocksDB
> is initialized from IOManager's spillingDirectories, which might located in
> different device from local recovery's folder. However, hard-link between
> different devices is not allowed, it will throw exception below:
> {code:java}
> java.nio.file.FileSystemException: target -> souce: Invalid cross-device link
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)