[ 
https://issues.apache.org/jira/browse/DRILL-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321490#comment-16321490
 ] 

ASF GitHub Bot commented on DRILL-6002:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1058#discussion_r160841030
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/SpillSet.java ---
    @@ -107,7 +107,7 @@
          * nodes provide insufficient local disk space)
          */
     
    -    private static final int TRANSFER_SIZE = 32 * 1024;
    +    private static final int TRANSFER_SIZE = 1024 * 1024;
    --- End diff --
    
    Is a 1 MB buffer excessive? The point of the buffer is to ensure we write in 
units of a disk block. For the local file system, experience showed no gain 
beyond 32K. In the MapR FS, each write is in units of 1 MB. Does Hadoop have a 
preferred size?
    
    Given this variation, if we need large buffers, should we choose a buffer 
size based on the underlying file system? For example, is there a preferred 
size for S3?
    
    32K didn't seem large enough to worry about, even with 1000 fragments 
busily spilling. But 1 MB? 1000 * 1 MB = 1 GB, which starts to become 
significant, especially in light of our efforts to reduce heap usage. Should 
we worry?
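
To make the arithmetic above concrete, here is a minimal sketch of picking the transfer size per file system and estimating the aggregate buffer memory. The class, the `PREFERRED` table, the `transferSizeFor` helper, and the scheme names used as keys are all hypothetical and not part of SpillSet. The two sizes come from the comment above (32K for the local file system, 1 MB for MapR FS); no value is assumed for S3 or HDFS, so they fall back to the default.

```java
import java.util.Map;

// Hypothetical sketch: not Drill code. Illustrates choosing a transfer
// size by file system scheme and the resulting total buffer memory.
class TransferSizeSketch {
    static final int KB = 1024;
    static final int MB = 1024 * KB;

    // Fallback when nothing is known about the target file system.
    static final int DEFAULT_SIZE = 32 * KB;

    // Assumed scheme-to-size table; only these two sizes appear in the
    // review comment. The scheme strings themselves are an assumption.
    static final Map<String, Integer> PREFERRED = Map.of(
        "file", 32 * KB,
        "maprfs", 1 * MB);

    // Returns the preferred write size for a scheme, or the default.
    static int transferSizeFor(String scheme) {
        return PREFERRED.getOrDefault(scheme, DEFAULT_SIZE);
    }

    // Total bytes held if every spilling fragment owns one buffer.
    static long totalBufferBytes(int fragments, int transferSize) {
        return (long) fragments * transferSize;
    }

    public static void main(String[] args) {
        int fragments = 1000;
        for (String scheme : new String[] {"file", "maprfs", "s3a"}) {
            int size = transferSizeFor(scheme);
            long total = totalBufferBytes(fragments, size);
            System.out.printf("%s: %d bytes per buffer, %d bytes total%n",
                scheme, size, total);
        }
    }
}
```

With 1000 fragments, the 32K setting holds 32,768,000 bytes in total, while the 1 MB setting holds 1,048,576,000 bytes, which is the roughly 1 GB figure discussed above.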


> Avoid memory copy from direct buffer to heap while spilling to local disk
> -------------------------------------------------------------------------
>
>                 Key: DRILL-6002
>                 URL: https://issues.apache.org/jira/browse/DRILL-6002
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Vlad Rozov
>            Assignee: Vlad Rozov
>
> When spilling to a local disk or to any file system that supports 
> WritableByteChannel it is preferable to avoid copy from off-heap to java heap 
> as WritableByteChannel can work directly with the off-heap memory.  
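
The difference described in the issue can be sketched with plain `java.nio`, as below. This is an illustration under stated assumptions, not the change made in the pull request: Drill's own buffers are stood in for by `ByteBuffer.allocateDirect`, and the class and method names (`DirectSpillSketch`, `writeViaHeap`, `writeDirect`, `spillBothWays`) are hypothetical. `writeViaHeap` copies each chunk into a heap `byte[]` of `TRANSFER_SIZE` bytes before writing, while `writeDirect` hands the off-heap buffer straight to a `WritableByteChannel`.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: not Drill code.
class DirectSpillSketch {
    static final int TRANSFER_SIZE = 32 * 1024;

    // Copy path: off-heap buffer -> heap byte[] -> output stream.
    static void writeViaHeap(ByteBuffer direct, OutputStream out) throws IOException {
        byte[] heap = new byte[TRANSFER_SIZE];
        ByteBuffer src = direct.duplicate(); // leave caller's position untouched
        while (src.hasRemaining()) {
            int n = Math.min(src.remaining(), heap.length);
            src.get(heap, 0, n);
            out.write(heap, 0, n);
        }
    }

    // Channel path: the channel reads from the off-heap buffer directly.
    static void writeDirect(ByteBuffer direct, WritableByteChannel channel) throws IOException {
        ByteBuffer src = direct.duplicate();
        while (src.hasRemaining()) {
            channel.write(src); // may write fewer bytes than remaining, so loop
        }
    }

    // Writes the same data both ways to temp files; returns both file sizes.
    static long[] spillBothWays(int size) {
        ByteBuffer data = ByteBuffer.allocateDirect(size);
        while (data.hasRemaining()) {
            data.put((byte) 7);
        }
        data.flip();
        try {
            Path viaHeap = Files.createTempFile("spill-heap", ".bin");
            Path viaChannel = Files.createTempFile("spill-direct", ".bin");
            try (OutputStream out = Files.newOutputStream(viaHeap)) {
                writeViaHeap(data, out);
            }
            try (FileChannel ch = FileChannel.open(viaChannel, StandardOpenOption.WRITE)) {
                writeDirect(data, ch);
            }
            long[] sizes = {Files.size(viaHeap), Files.size(viaChannel)};
            Files.delete(viaHeap);
            Files.delete(viaChannel);
            return sizes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = spillBothWays(100_000);
        System.out.println("heap path: " + sizes[0] + ", channel path: " + sizes[1]);
    }
}
```

Both paths produce the same bytes on disk. The channel path simply avoids the intermediate heap array, so the size of `TRANSFER_SIZE` matters only on the copy path.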



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
