Xiaoyu Yao created HDFS-7911:
--------------------------------

             Summary: Buffer Overflow when running HBase on HDFS Encryption Zone
                 Key: HDFS-7911
                 URL: https://issues.apache.org/jira/browse/HDFS-7911
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: encryption
    Affects Versions: 2.6.0
            Reporter: Xiaoyu Yao
            Priority: Blocker


Created an HDFS encryption zone (EZ) for HBase under /apps/hbase; basic testing 
passed, including creating tables, listing them, adding a few rows, and scanning 
them. However, when bulk loading hundreds of thousands of rows, after 10 minutes 
or so we get the following error on the Region Server that owns the table.

{code}
2015-03-02 10:25:47,784 FATAL [regionserver60020-WAL.AsyncSyncer0] wal.FSHLog: Error while AsyncSyncer sync, request close of hlog
java.io.IOException: java.nio.BufferOverflowException
at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:156)
at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.encrypt(JceAesCtrCryptoCodec.java:127)
at org.apache.hadoop.crypto.CryptoOutputStream.encrypt(CryptoOutputStream.java:162)
at org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:232)
at org.apache.hadoop.crypto.CryptoOutputStream.hflush(CryptoOutputStream.java:267)
at org.apache.hadoop.crypto.CryptoOutputStream.sync(CryptoOutputStream.java:262)
at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:123)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.nio.BufferOverflowException
at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:357)
at javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:823)
at javax.crypto.CipherSpi.engineUpdate(CipherSpi.java:546)
at javax.crypto.Cipher.update(Cipher.java:1760)
at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:145)
... 9 more
{code}

It looks like the HBase WAL (Write Ahead Log) use case is broken on 
CryptoOutputStream. In this use case, one flusher thread keeps calling hflush() 
on the WAL file while other roller threads write concurrently to the same file 
handle.

As the class comment says: *"CryptoOutputStream encrypts data. It is not 
thread-safe."* I checked the code, and the buffer overflow appears to come from 
a race between CryptoOutputStream#write() and CryptoOutputStream#flush(), as 
both can call CryptoOutputStream#encrypt(). The inBuffer/outBuffer of the 
CryptoOutputStream are not thread-safe: encrypt() triggered by flush() can 
change them while write() is coming in from other threads.
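To make the failure mode concrete, here is a sequential toy sketch (not the actual Hadoop code; `inBuffer` is a hypothetical stand-in for CryptoOutputStream's shared input buffer, and the two "threads" are played out as interleaved steps) of how a flip by the flusher path can make a later put() by the writer path overflow:

```java
import java.nio.ByteBuffer;

public class OverflowSketch {
    public static void main(String[] args) {
        // Toy stand-in for CryptoOutputStream's shared inBuffer
        // (simplified; the real race involves two concurrent threads).
        ByteBuffer inBuffer = ByteBuffer.allocate(16);
        inBuffer.put(new byte[10]);   // the write() path has appended 10 bytes

        // The flush() path's encrypt() flips the buffer to drain it:
        // position=0, limit=10.
        inBuffer.flip();

        // A concurrent write(), unaware of the flip, appends 12 more bytes.
        // Only 10 bytes remain before the new limit, so put() throws.
        try {
            inBuffer.put(new byte[12]);
        } catch (java.nio.BufferOverflowException e) {
            System.out.println("BufferOverflowException");
        }
    }
}
```

The same shape of mismatch between the writer's view of the buffer and the state left by a concurrent flip is what surfaces as the BufferOverflowException in the stack trace above.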

I have validated this with multi-threaded unit tests that mimic the HBase WAL 
use case. For a file not under an encryption zone (*DFSOutputStream*), the 
multi-threaded flusher/writer works fine. For a file under an encryption zone 
(*CryptoOutputStream*), the multi-threaded flusher/writer randomly fails with 
Buffer Overflow/Underflow.
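For reference, the flusher/writer shape of those tests can be sketched as below, using plain JDK streams since the real test needs a MiniDFSCluster. The `SynchronizedOutputStream` wrapper here is hypothetical (not a Hadoop class) and only illustrates the external-locking pattern that would make a non-thread-safe stream survive this workload; `ByteArrayOutputStream` stands in for the underlying file stream:

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical wrapper: serializes write() and flush() so a non-thread-safe
// stream (like CryptoOutputStream) is never entered concurrently by a writer
// thread and a flusher thread.
class SynchronizedOutputStream extends FilterOutputStream {
    SynchronizedOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public synchronized void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);   // only one thread inside the stream at a time
    }

    @Override
    public synchronized void flush() throws IOException {
        out.flush();              // flush cannot overlap a write in progress
    }
}

public class WalFlusherWriterDemo {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream wal = new SynchronizedOutputStream(sink);

        byte[] record = new byte[128];
        Thread writer = new Thread(() -> {
            try {
                for (int i = 0; i < 1000; i++) {
                    wal.write(record, 0, record.length);   // roller thread
                }
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        Thread flusher = new Thread(() -> {
            try {
                for (int i = 0; i < 1000; i++) {
                    wal.flush();   // mimics FSHLog's AsyncSyncer calling hflush()
                }
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();
        flusher.start();
        writer.join();
        flusher.join();
        System.out.println(sink.size());   // prints 128000
    }
}
```

With the wrapper in place all 1000 x 128 bytes arrive intact; swapping the wrapper out for a direct CryptoOutputStream is what randomly fails as described above.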





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
