[jira] [Commented] (FLINK-6776) Use skip instead of seek for small forward repositioning in DFS streams

ASF GitHub Bot (JIRA) Tue, 13 Jun 2017 02:33:08 -0700

    [ https://issues.apache.org/jira/browse/FLINK-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047643#comment-16047643 ]


ASF GitHub Bot commented on FLINK-6776:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4019#discussion_r121627538
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopDataInputStream.java ---
    @@ -31,11 +31,15 @@
      */
     public final class HadoopDataInputStream extends FSDataInputStream {
     
    +   /** Minimum amount of bytes to skip forward before we issue a seek instead of discarding read */
    +   private static final int MIN_SKIP_BYTES = 1024 * 1024;
    --- End diff --
    
    How did you come up with this number?


> Use skip instead of seek for small forward repositioning in DFS streams
> -----------------------------------------------------------------------
>
>                 Key: FLINK-6776
>                 URL: https://issues.apache.org/jira/browse/FLINK-6776
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>            Priority: Minor
>
> Reading checkpoint metadata and finding key-groups during restores sometimes 
> requires seeking in input streams. Currently, we always issue a seek, even for 
> small position changes. Because small true seeks are far more expensive than 
> small reads/skips, we should skip over small forward gaps instead of 
> performing a seek.
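
For readers of this thread, the idea in the description can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: InMemorySeekableStream is a hypothetical in-memory stand-in for a DFS stream (it is not the Hadoop API), reposition is a hypothetical helper name, and treating a forward gap of at most MIN_SKIP_BYTES as "small" is an assumption based on the constant shown in the diff.

```java
import java.io.InputStream;

final class SkipOrSeekSketch {

    /** Assumed threshold, mirroring the MIN_SKIP_BYTES constant in the diff. */
    static final int MIN_SKIP_BYTES = 1024 * 1024;

    /** Hypothetical in-memory stand-in for a seekable DFS stream. */
    static final class InMemorySeekableStream extends InputStream {
        private final byte[] data;
        private int pos;
        /** Counts true seeks so the effect of the policy is observable. */
        int seekCalls;

        InMemorySeekableStream(byte[] data) {
            this.data = data;
        }

        @Override
        public int read() {
            return pos < data.length ? (data[pos++] & 0xff) : -1;
        }

        @Override
        public long skip(long n) {
            long skipped = Math.max(0L, Math.min(n, (long) data.length - pos));
            pos += (int) skipped;
            return skipped;
        }

        long getPos() {
            return pos;
        }

        void seek(long target) {
            if (target < 0 || target > data.length) {
                throw new IllegalArgumentException("seek out of range: " + target);
            }
            pos = (int) target;
            seekCalls++;
        }
    }

    /** Moves the stream to target, skipping over small forward gaps and seeking otherwise. */
    static void reposition(InMemorySeekableStream in, long target) {
        long delta = target - in.getPos();
        if (delta > 0 && delta <= MIN_SKIP_BYTES) {
            // skip() may skip fewer bytes than requested, so loop until done.
            while (delta > 0) {
                long skipped = in.skip(delta);
                if (skipped <= 0) {
                    // No progress possible: fall back to a true seek.
                    in.seek(target);
                    return;
                }
                delta -= skipped;
            }
        } else if (delta != 0) {
            // Backward moves and large forward jumps use a true seek.
            in.seek(target);
        }
    }

    public static void main(String[] args) {
        InMemorySeekableStream in = new InMemorySeekableStream(new byte[4 * 1024 * 1024]);
        reposition(in, 100);                     // small forward gap
        reposition(in, 100 + 2L * 1024 * 1024);  // large forward gap
        reposition(in, 50);                      // backward move
        System.out.println("pos=" + in.getPos() + ", seeks=" + in.seekCalls);
    }
}
```

Whether 1 MiB is the right cut-off is exactly what the review comment above asks; the sketch only shows where such a threshold would take effect.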



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
