FYI, this looks like it would impact small WAL writes.

On Tue, Jul 7, 2015 at 10:44 AM, Kihwal Lee (JIRA) <[email protected]> wrote:
> Kihwal Lee created HDFS-8722:
> --------------------------------
>
>     Summary: Optimize datanode writes for small writes and flushes
>     Key: HDFS-8722
>     URL: https://issues.apache.org/jira/browse/HDFS-8722
>     Project: Hadoop HDFS
>     Issue Type: Improvement
>     Reporter: Kihwal Lee
>     Priority: Critical
>
>
> After the data corruption fix by HDFS-4660, the CRC recalculation for
> partial chunk is executed more frequently, if the client repeats writing
> few bytes and calling hflush/hsync. This is because the generic logic
> forces CRC recalculation if on-disk data is not CRC chunk aligned. Prior to
> HDFS-4660, datanode blindly accepted whatever CRC client provided, if the
> incoming data is chunk-aligned. This was the source of the corruption.
>
> We can still optimize for the most common case where a client is
> repeatedly writing small number of bytes followed by hflush/hsync with no
> pipeline recovery or append, by allowing the previous behavior for this
> specific case. If the incoming data has a duplicate portion and that is at
> the last chunk-boundary before the partial chunk on disk, datanode can use
> the checksum supplied by the client without redoing the checksum on its
> own. This reduces disk reads as well as CPU load for the checksum
> calculation.
>
> If the incoming packet data goes back further than the last on-disk chunk
> boundary, datanode will still do a recalculation, but this occurs rarely
> during pipeline recoveries. Thus the optimization for this specific case
> should be sufficient to speed up the vast majority of cases.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
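
For anyone skimming, here is a rough sketch of the check the issue describes, as I read it. This is not the actual datanode code: the class, enum, and parameter names are made up for illustration, and treating "packet starts after the last boundary" as a recalculation case is my assumption based on the generic rule quoted above.

```java
// Illustrative sketch only; all names are hypothetical, not from the HDFS source.
final class PartialChunkCheck {

    enum Decision {
        NO_PARTIAL_CHUNK,     // on-disk data is chunk-aligned, nothing to reconcile
        USE_CLIENT_CHECKSUM,  // fast path proposed in HDFS-8722
        RECALCULATE           // generic path: read the partial chunk and recompute the CRC
    }

    // Offset of the last CRC chunk boundary at or before the end of on-disk data.
    static long lastChunkBoundary(long onDiskLen, int bytesPerChecksum) {
        return onDiskLen - (onDiskLen % bytesPerChecksum);
    }

    // Decide how to handle the checksum of an incoming packet.
    // onDiskLen:        bytes of the block already written on the datanode
    // packetOffset:     block offset at which the incoming packet data starts
    // bytesPerChecksum: CRC chunk size (512 in the example below)
    static Decision decide(long onDiskLen, long packetOffset, int bytesPerChecksum) {
        if (bytesPerChecksum <= 0 || onDiskLen < 0 || packetOffset < 0) {
            throw new IllegalArgumentException("lengths and offsets must be non-negative, chunk size positive");
        }
        long boundary = lastChunkBoundary(onDiskLen, bytesPerChecksum);
        if (boundary == onDiskLen) {
            return Decision.NO_PARTIAL_CHUNK;
        }
        if (packetOffset == boundary) {
            // The client resends the partial chunk from its boundary, so the
            // checksum it supplies covers the whole chunk and can be reused.
            return Decision.USE_CLIENT_CHECKSUM;
        }
        // The packet reaches back past the boundary (e.g. pipeline recovery),
        // or (assumption) starts mid-chunk: fall back to recalculation.
        return Decision.RECALCULATE;
    }
}
```

With a 512-byte chunk and 520 bytes on disk, a packet starting at offset 512 takes the fast path, while one starting at offset 0 still triggers recalculation. The small-write-plus-hflush pattern of a WAL would mostly hit the first case.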
