[ https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-755:
-----------------------------

    Attachment: benchmark-8-256.png
                benchmark.png

Here are graphs showing benchmark results from an overnight run. The benchmark 
setup was:
- Set the CHUNKS_PER_READ variable to a number of different values (shown along 
the X axis: 1, 2, 4, etc.)
- Set the internal buffer of DFSClient (around line DFSClient.java:1618) to a 
number of different small values, plus 65536. 65536 was chosen because it was 
the io.file.buffer.size for this test, so it matches what the current patch does.
- For each combination of the above two variables, I ran 100 trials of catting 
1GB off of a local datanode (pseudo-distributed HDFS). The machine had free RAM, 
so the data stayed in the buffer cache; I think that's appropriate for this 
benchmark since we're looking for CPU improvement. (A rough sketch of the timing 
loop is below, after this list.)
- Each trial is graphed as a box plot. The top edge of the box is the 75th 
percentile. The bottom edge is the 25th percentile. The middle line is the 
median.
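
For reference, here's a minimal sketch of the kind of timing loop involved. The 
namenode URI, file path, and read-buffer size below are placeholders for 
illustration, not the exact harness I ran:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CatBenchmark {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // io.file.buffer.size was 65536 for the runs graphed above
        conf.setInt("io.file.buffer.size", 65536);
        FileSystem fs =
            FileSystem.get(URI.create("hdfs://localhost:8020/"), conf);
        Path path = new Path("/benchmark/1gb-file");  // placeholder path
        byte[] buf = new byte[65536];

        // 100 trials of reading the whole file, timing each one
        for (int trial = 0; trial < 100; trial++) {
          long start = System.currentTimeMillis();
          FSDataInputStream in = fs.open(path);
          long total = 0;
          int n;
          while ((n = in.read(buf)) != -1) {
            total += n;
          }
          in.close();
          System.out.println("trial " + trial + ": " + total + " bytes in "
              + (System.currentTimeMillis() - start) + " ms");
        }
      }
    }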

Here's how I interpret the results:
- As we saw in HADOOP-3205, there is no significant performance difference 
between values of CHECKSUMS_PER_READ (nee MAX_CHUNKS) above 8 or 16. In that 
JIRA we decided to set it to 32, so I think that decision should stand. The 
second png (benchmark-8-256.png) zooms in on only the "good" settings of 
CHECKSUMS_PER_READ.
- For any "good" setting, we see an advantage of setting the internal buffer 
small. This is due to Raghu's observation - it allows the reader to skip a 
buffer copy for the reads.

The results above match the intuition:
When BlockReader is constructed, it reads the following (a rough sketch of 
these reads follows the two lists below):
* DataTransferProtocol.Status (2 bytes)
* DataChecksum: byte + int (5 bytes)
* firstChunkOffset (8 bytes)
(total 15 bytes)

Then for each packet:
* packetLen (4 bytes)
* offsetInBlock (8 bytes)
* seq num (8 bytes)
* lastPacket (1 byte)
* datalen (4 bytes)
(total 25 bytes)
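
To make those byte counts concrete, here is roughly the shape of the reads. 
This is just an illustration of the fields listed above, not the literal 
BlockReader code:

    import java.io.DataInputStream;
    import java.io.IOException;

    class HeaderReads {
      // Read once when the BlockReader is constructed (15 bytes total).
      static void readConnectionHeader(DataInputStream in) throws IOException {
        short status = in.readShort();          // DataTransferProtocol.Status, 2 bytes
        byte checksumType = in.readByte();      // DataChecksum type, 1 byte
        int bytesPerChecksum = in.readInt();    // DataChecksum param, 4 bytes
        long firstChunkOffset = in.readLong();  // 8 bytes
      }

      // Read once per packet (25 bytes total).
      static void readPacketHeader(DataInputStream in) throws IOException {
        int packetLen = in.readInt();           // 4 bytes
        long offsetInBlock = in.readLong();     // 8 bytes
        long seqNum = in.readLong();            // 8 bytes
        boolean lastPacket = in.readBoolean();  // 1 byte
        int dataLen = in.readInt();             // 4 bytes
      }
    }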

So a 16-byte internal buffer is big enough for the connection header but not 
big enough for the packet headers. A 512-byte internal buffer is big enough 
that there will be significant "overread" into the packet data itself, and 
significant copying will happen.
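
My assumption here is that the internal buffer is a plain 
java.io.BufferedInputStream wrapped in a DataInputStream. If so, the mechanism 
is that BufferedInputStream only copies through its internal buffer when the 
requested read is smaller than that buffer; a request at least as large as the 
buffer goes straight to the underlying stream. A toy sketch of the two cases 
(not DFSClient code, sizes are made up):

    import java.io.BufferedInputStream;
    import java.io.ByteArrayInputStream;
    import java.io.DataInputStream;
    import java.io.IOException;

    class BufferSizeDemo {
      public static void main(String[] args) throws IOException {
        byte[] packet = new byte[25 + 64 * 1024]; // header + data (made up)
        byte[] dest = new byte[16 * 1024];        // the caller's read buffer

        // Small internal buffer (16 bytes): the few-byte header reads are
        // absorbed by the buffer, but the 16K data read is larger than the
        // buffer, so it is handed straight to the underlying stream -- no
        // extra copy through the internal buffer.
        DataInputStream small = new DataInputStream(
            new BufferedInputStream(new ByteArrayInputStream(packet), 16));
        small.readInt();        // e.g. packetLen
        small.readLong();       // e.g. offsetInBlock
        small.readFully(dest);  // bulk data: mostly direct reads, no copy

        // Large internal buffer (64K): the same 16K read is smaller than the
        // buffer, so the data is first pulled into the internal buffer
        // (overreading into the next packet) and then copied into 'dest'.
        DataInputStream large = new DataInputStream(
            new BufferedInputStream(new ByteArrayInputStream(packet), 64 * 1024));
        large.readInt();
        large.readLong();
        large.readFully(dest);  // bulk data: filled into the buffer, then copied
      }
    }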

I'm unsure whether setting the internal buffer low is "safe". That is to say, 
is it *always* wrapped by an external buffer, or are people depending on it for 
performance? If it's not always wrapped, this change will be slightly more 
complicated.

What do people think about committing now as-is (with the large internal 
buffer) and then working in a separate JIRA to eliminate the copy? As the 
graphs show, there is still a performance improvement (about 6.0 sec down to 
5.6 sec here).

> Read multiple checksum chunks at once in DFSInputStream
> -------------------------------------------------------
>
>                 Key: HDFS-755
>                 URL: https://issues.apache.org/jira/browse/HDFS-755
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: benchmark-8-256.png, benchmark.png, hdfs-755.txt, 
> hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt
>
>
> HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple 
> checksum chunks in a single call to readChunk. This is the HDFS-side use of 
> that new feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
