[jira] [Commented] (MAPREDUCE-5791) Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks efficiently

Nikola Vujic (JIRA) Mon, 24 Mar 2014 10:45:12 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945407#comment-13945407
 ]


Nikola Vujic commented on MAPREDUCE-5791:
-----------------------------------------

Hi [~cnauroth],

I have applied all fixes except for the if-else in {{FadvisedFileRegion}}. The 
edge case is reading the last chunk of data from a file. 
{{customShuffleTransfer}} must read {{actualCount}} bytes from the file, 
starting at {{this.position}}. This is done in the while loop, and the 
{{trans}} variable tracks the number of bytes remaining. {{fileChannel.read}} 
returns the number of bytes read. For the last chunk of data this number can 
be higher than the number of bytes remaining. In that case we cannot use 
{{Buffer#flip}}, because it would set the limit to the number of bytes read 
rather than the number of bytes needed. 

For example, suppose we have a 128-byte buffer and we want to read 200 bytes 
starting at position 1000 in a file (file size bigger than 1256 bytes). At 
least two iterations of the while loop will be done: 
1. Iteration 1: {{fileChannel.read(byteBuffer, 1000+0)}} => 128 bytes are read 
=> all 128 bytes are needed => target.write
2. Iteration 2: {{fileChannel.read(byteBuffer, 1000+128)}} => 128 bytes are 
read because the file is big enough, but only the first 72 bytes are needed 
=> {{byteBuffer.limit(72)}} => target.write

In the else block we don't set the limit to the current position but to a 
number lower than the current position. Updating the local {{position}} 
variable is needed so that the next iterations of the loop read from the 
correct offset. Does that make sense?
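
To make the two branches concrete, here is a minimal standalone sketch of the 
loop described above. It is not the code from the patch: the class name 
{{ChunkedTransferSketch}}, the method {{transfer}} and the variables 
{{remaining}} and {{offset}} are made up for illustration ({{remaining}} plays 
the role of {{trans}}, {{offset}} the role of the local {{position}}).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChunkedTransferSketch {

    /**
     * Copies up to count bytes from file, starting at start, into target,
     * using a buffer of bufferSize bytes. Returns the number of bytes copied.
     */
    static long transfer(FileChannel file, WritableByteChannel target,
                         long start, long count, int bufferSize)
            throws IOException {
        long remaining = count;   // bytes still to copy
        long offset = 0;          // bytes copied so far, relative to start
        ByteBuffer buf = ByteBuffer.allocate(bufferSize);
        int readSize;

        while (remaining > 0
                && (readSize = file.read(buf, start + offset)) > 0) {
            if (readSize < remaining) {
                // Every byte just read is needed, so flip() is correct:
                // it sets limit = bytes read and position = 0.
                remaining -= readSize;
                offset += readSize;
                buf.flip();
            } else {
                // Last chunk: the read may have returned more than we need.
                // flip() would expose all bytes read, so set the limit by hand.
                buf.limit((int) remaining);
                buf.position(0);
                offset += remaining;
                remaining = 0;
            }
            // A single write may be partial, so drain the buffer fully.
            while (buf.hasRemaining()) {
                target.write(buf);
            }
            buf.clear();
        }
        return count - remaining;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sketch", ".bin");
        try {
            Files.write(tmp, new byte[2000]);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (FileChannel ch =
                     FileChannel.open(tmp, StandardOpenOption.READ)) {
                // Same numbers as the example: 200 bytes from position 1000
                // with a 128-byte buffer.
                long n = transfer(ch, Channels.newChannel(out), 1000, 200, 128);
                System.out.println(n + " bytes copied, "
                        + out.size() + " bytes written");
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

With the numbers from the example, the first iteration takes the flip branch 
and the second takes the else branch with a limit of 72, so exactly 200 bytes 
reach the target.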

Regarding the resource leak in the test, I applied the change you suggested 
and did the same with the {{fileRegion}} in order to eliminate one try block.

I changed {{customShuffleTransferCornerCases}} to private. It was public.

> Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not 
> read disks efficiently
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5791
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5791
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Nikola Vujic
>            Assignee: Nikola Vujic
>         Attachments: MAPREDUCE-5791.patch, MAPREDUCE-5791.patch
>
>
> The transferTo method in org.apache.hadoop.mapred.FadvisedFileRegion uses the 
> transferTo method of a FileChannel to transfer data from disk to a socket. 
> This performs slowly on Windows, slower than on Linux. The reason is that 
> the java.nio transferTo method issues 32K IO requests all the time. 
> On Windows, these 32K transfers are not optimal and we don't get the best 
> performance from the underlying IO subsystem. To achieve better 
> performance when reading from the drives, we need to read data in bigger 
> chunks, 512K for example.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
