Re: [jira] [Created] (HDFS-8722) Optimize datanode writes for small writes and flushes

Nick Dimiduk Mon, 03 Aug 2015 16:05:31 -0700

FYI, this looks like it would impact small WAL writes.

On Tue, Jul 7, 2015 at 10:44 AM, Kihwal Lee (JIRA) <[email protected]> wrote:


> Kihwal Lee created HDFS-8722:
> --------------------------------
>
>              Summary: Optimize datanode writes for small writes and flushes
>                  Key: HDFS-8722
>                  URL: https://issues.apache.org/jira/browse/HDFS-8722
>              Project: Hadoop HDFS
>           Issue Type: Improvement
>             Reporter: Kihwal Lee
>             Priority: Critical
>
>
> After the data corruption fix in HDFS-4660, CRC recalculation for a
> partial chunk runs more frequently when the client repeatedly writes a
> few bytes and calls hflush/hsync.  This is because the generic logic
> forces CRC recalculation whenever on-disk data is not CRC chunk-aligned.
> Prior to HDFS-4660, the datanode blindly accepted whatever CRC the client
> provided if the incoming data was chunk-aligned. This was the source of
> the corruption.
>
> We can still optimize for the most common case, where a client is
> repeatedly writing a small number of bytes followed by hflush/hsync with
> no pipeline recovery or append, by allowing the previous behavior for this
> specific case.  If the incoming data has a duplicate portion that starts
> at the last chunk boundary before the partial chunk on disk, the datanode
> can use the checksum supplied by the client without redoing the checksum
> on its own.  This reduces disk reads as well as CPU load for the checksum
> calculation.
>
> If the incoming packet data goes back further than the last on-disk chunk
> boundary, the datanode will still do a recalculation, but this occurs
> rarely, during pipeline recoveries. Thus the optimization for this
> specific case should be sufficient to speed up the vast majority of cases.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
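
For anyone skimming the description above, here is a rough sketch of the decision it describes, as I read it. This is not the actual datanode code: the class, the enum, the method and the parameter names (onDiskLen, packetOffset, bytesPerChecksum) are all hypothetical, and I am assuming that a packet starting anywhere other than the last on-disk chunk boundary falls back to recalculation.

```java
// Hypothetical sketch of the decision described in HDFS-8722.
// None of these names come from the HDFS source; they are illustrative only.
final class ChecksumReuseSketch {

    enum Decision { USE_CLIENT_CHECKSUM, RECALCULATE }

    /**
     * @param onDiskLen        bytes of block data already on disk
     * @param packetOffset     offset in the block where the incoming packet data starts
     * @param bytesPerChecksum CRC chunk size in bytes
     */
    static Decision decide(long onDiskLen, long packetOffset, int bytesPerChecksum) {
        if (bytesPerChecksum <= 0 || onDiskLen < 0 || packetOffset < 0) {
            throw new IllegalArgumentException("lengths must be non-negative, chunk size positive");
        }
        if (packetOffset > onDiskLen) {
            throw new IllegalArgumentException("packet would leave a gap after on-disk data");
        }

        // Start of the (possibly partial) last chunk on disk.
        long lastChunkBoundary = onDiskLen - (onDiskLen % bytesPerChecksum);

        // The packet re-sends the partial chunk from its boundary (or the on-disk
        // data is already aligned and the packet continues from there), so the
        // client's CRC covers whole chunks and can be taken as-is.
        if (packetOffset == lastChunkBoundary) {
            return Decision.USE_CLIENT_CHECKSUM;
        }

        // Assumption: every other case (packet starts mid-chunk, or goes back
        // further than the last boundary, as in pipeline recovery) recalculates.
        return Decision.RECALCULATE;
    }

    public static void main(String[] args) {
        // 1000 bytes on disk, 512-byte chunks: the partial chunk starts at 512.
        System.out.println(decide(1000, 512, 512));
        System.out.println(decide(1000, 0, 512));
    }
}
```

With 512-byte chunks and 1000 bytes on disk, a packet that starts at offset 512 duplicates the 488 bytes of the partial chunk, which is the small write plus hflush/hsync pattern the issue wants to make cheap again.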
