FSDataOutputStream should flush last partial CRC chunk
------------------------------------------------------

                 Key: HADOOP-2913
                 URL: https://issues.apache.org/jira/browse/HADOOP-2913
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
            Reporter: dhruba borthakur


The FSDataOutputStream.flush() API is supposed to flush all data to the
underlying stream. However, for LocalFileSystem, flush() does not flush
the last partial CRC chunk.

One solution is described in HADOOP-2657: We should change FSOutputStream to
implement Seekable, and have the default implementation of seek throw an
IOException, then use this in ChecksumFileSystem to rewind and overwrite the
checksum. Then callers will only fail if they attempt to write more data after
they've flushed on a ChecksumFileSystem that doesn't support seek. I don't
think we will have any filesystems that both extend ChecksumFileSystem and
can't support seek. Only LocalFileSystem currently extends ChecksumFileSystem,
and it does support seek. So flush() shouldn't ever fail for existing
FileSystems, but seek() will fail for most output streams (probably all except
local).
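
To make the rewind-and-overwrite idea concrete, here is a minimal sketch in
plain Java. It is not Hadoop code: the class SeekableChecksumWriter, its
methods, and the on-disk layout (one 8-byte CRC32 value per chunk in a
separate file) are all assumptions made for illustration, and
java.io.RandomAccessFile stands in for a stream that supports seek. On a
stream whose seek throws an IOException, the failure would surface on the
first write or flush, since writeChunk() seeks before every write.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.zip.CRC32;

// Hypothetical illustration only; names and layout are not from Hadoop.
class SeekableChecksumWriter implements AutoCloseable {
    private final RandomAccessFile data;  // data file
    private final RandomAccessFile sums;  // checksum file, 8 bytes per chunk
    private final byte[] chunk;           // buffer for the current chunk
    private int filled = 0;               // bytes buffered in the current chunk
    private long chunkIndex = 0;          // index of the current chunk

    SeekableChecksumWriter(File dataFile, File sumFile, int bytesPerChecksum)
            throws IOException {
        data = new RandomAccessFile(dataFile, "rw");
        sums = new RandomAccessFile(sumFile, "rw");
        chunk = new byte[bytesPerChecksum];
    }

    void write(byte[] b) throws IOException {
        for (byte x : b) {
            chunk[filled++] = x;
            if (filled == chunk.length) {
                // Full chunk: write it (overwriting any earlier partial
                // flush of the same chunk), then move to the next chunk.
                writeChunk();
                chunkIndex++;
                filled = 0;
            }
        }
    }

    // Flush the last partial chunk and its checksum. The chunk stays open,
    // so later writes rewind and overwrite both the data and the checksum.
    void flush() throws IOException {
        if (filled > 0) {
            writeChunk();
        }
    }

    private void writeChunk() throws IOException {
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, filled);
        // Rewind to the start of the current chunk in both files.
        data.seek(chunkIndex * chunk.length);
        data.write(chunk, 0, filled);
        sums.seek(chunkIndex * Long.BYTES);
        sums.writeLong(crc.getValue());
    }

    @Override
    public void close() throws IOException {
        flush();
        data.close();
        sums.close();
    }
}
```

With a 4-byte chunk size, writing 6 bytes and calling flush() leaves two
checksum entries on disk, the second covering only bytes 5 and 6. Writing
more bytes afterwards overwrites that second entry in place once the chunk
fills, rather than appending a new one.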

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.