[
https://issues.apache.org/jira/browse/FLINK-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047668#comment-16047668
]
ASF GitHub Bot commented on FLINK-6776:
---------------------------------------
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/4019#discussion_r121632647
--- Diff:
flink-runtime/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopDataInputStream.java
---
@@ -89,4 +99,14 @@ public long skip(long n) throws IOException {
public org.apache.hadoop.fs.FSDataInputStream getHadoopInputStream() {
return fsDataInputStream;
}
+
+ public void forceSeek(long seekPos) throws IOException {
--- End diff --
I agree that documentation wouldn't hurt. This class as a whole was rather
undocumented, but it is also internal, and users only interact with it
through `FSDataInputStream`, which does not expose those methods. I can write
something anyway :)
> Use skip instead of seek for small forward repositioning in DFS streams
> -----------------------------------------------------------------------
>
> Key: FLINK-6776
> URL: https://issues.apache.org/jira/browse/FLINK-6776
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Reporter: Stefan Richter
> Assignee: Stefan Richter
> Priority: Minor
>
> Reading checkpoint metadata and finding key-groups in restores sometimes
> requires seeking in input streams. Currently, we always use a seek, even for
> small position changes. As small true seeks are far more expensive than small
> reads/skips, we should just skip over small gaps instead of performing the
> seek.
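
The idea in the issue description can be sketched roughly as follows. This is
only an illustration, not the code from the pull request: the `Seekable`
interface, the `reposition` helper, the `InMemoryStream` stand-in, and the
1 MiB threshold are all assumptions made for the example, and a real threshold
would have to be chosen by measurement.

```java
import java.io.IOException;

/** Illustrative sketch only; names and threshold are hypothetical. */
class SkipOrSeek {

    /** Assumed cut-off below which a forward gap is skipped instead of seeked. */
    static final long MIN_SKIP_BYTES = 1024 * 1024;

    /** Hypothetical minimal view of a seekable input stream. */
    interface Seekable {
        long getPos() throws IOException;
        void seek(long pos) throws IOException;
        long skip(long n) throws IOException;
    }

    /**
     * Moves the stream to {@code target}. Small forward gaps are skipped;
     * backward moves and large gaps use a true seek.
     */
    static void reposition(Seekable in, long target) throws IOException {
        long delta = target - in.getPos();
        if (delta == 0) {
            return; // already at the requested position
        }
        if (delta > 0 && delta <= MIN_SKIP_BYTES) {
            // skip() may skip fewer bytes than requested, so loop.
            while (delta > 0) {
                long skipped = in.skip(delta);
                if (skipped <= 0) {
                    break; // no progress; fall back to a true seek below
                }
                delta -= skipped;
            }
            if (delta == 0) {
                return;
            }
        }
        in.seek(target);
    }

    /** In-memory stand-in that counts true seeks, for demonstration only. */
    static final class InMemoryStream implements Seekable {
        private final long length;
        private long pos;
        int seekCalls;

        InMemoryStream(long length) {
            this.length = length;
        }

        @Override
        public long getPos() {
            return pos;
        }

        @Override
        public void seek(long target) {
            seekCalls++;
            pos = target;
        }

        @Override
        public long skip(long n) {
            long skipped = Math.max(0, Math.min(n, length - pos));
            pos += skipped;
            return skipped;
        }
    }
}
```

With this sketch, repositioning a fresh `InMemoryStream` from 0 to 100 goes
through `skip` and leaves `seekCalls` at 0, while a jump of several megabytes
or any backward move goes through `seek`.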