Xiaoyu Yao created HDFS-7911:
--------------------------------
Summary: Buffer Overflow when running HBase on HDFS Encryption Zone
Key: HDFS-7911
URL: https://issues.apache.org/jira/browse/HDFS-7911
Project: Hadoop HDFS
Issue Type: Bug
Components: encryption
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Priority: Blocker
I created an HDFS encryption zone (EZ) for HBase under /apps/hbase. Basic
testing passed, including creating tables, listing them, adding a few rows, and
scanning them. However, when bulk loading several hundred thousand rows, after
10 minutes or so we get the following error on the Region Server that owns the
table.
{code}
2015-03-02 10:25:47,784 FATAL [regionserver60020-WAL.AsyncSyncer0] wal.FSHLog: Error while AsyncSyncer sync, request close of hlog
java.io.IOException: java.nio.BufferOverflowException
    at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:156)
    at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.encrypt(JceAesCtrCryptoCodec.java:127)
    at org.apache.hadoop.crypto.CryptoOutputStream.encrypt(CryptoOutputStream.java:162)
    at org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:232)
    at org.apache.hadoop.crypto.CryptoOutputStream.hflush(CryptoOutputStream.java:267)
    at org.apache.hadoop.crypto.CryptoOutputStream.sync(CryptoOutputStream.java:262)
    at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:123)
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.nio.BufferOverflowException
    at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:357)
    at javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:823)
    at javax.crypto.CipherSpi.engineUpdate(CipherSpi.java:546)
    at javax.crypto.Cipher.update(Cipher.java:1760)
    at org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:145)
    ... 9 more
{code}
It looks like the HBase WAL (Write Ahead Log) use case is broken on
CryptoOutputStream. In this use case, one flusher thread keeps calling hflush()
on the WAL file while other roller threads concurrently write to the same file
handle.
As the class comment says, *"CryptoOutputStream encrypts data. It is not
thread-safe."* I checked the code, and the buffer overflow appears to come from
a race between CryptoOutputStream#write() and CryptoOutputStream#flush(), since
both can call CryptoOutputStream#encrypt(). The inBuffer/outBuffer of
CryptoOutputStream are not thread-safe: they can be modified during the encrypt
for flush() while a write() comes in from another thread.
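To illustrate how such a race can surface as a BufferOverflowException, here is
a minimal single-threaded sketch that replays one possible interleaving by
hand. It is a simplified model and not the Hadoop code: the class
BufferRaceSketch, its methods, the 8-byte buffer size, and the plain
ByteBuffer.put() standing in for the cipher update are all assumptions made for
illustration. Only the names inBuffer and outBuffer come from the description
above.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Hypothetical model of a stream that shares two buffers between write() and flush().
class BufferRaceSketch {
    private final ByteBuffer inBuffer = ByteBuffer.allocate(8);
    private final ByteBuffer outBuffer = ByteBuffer.allocate(8);

    void write(byte[] data) {
        inBuffer.put(data);
    }

    // Modeled encrypt(): move pending input to outBuffer, then drain outBuffer.
    int encrypt() {
        inBuffer.flip();
        outBuffer.clear();
        outBuffer.put(inBuffer);          // stands in for the cipher update call
        inBuffer.clear();
        outBuffer.flip();
        int len = outBuffer.remaining();
        outBuffer.get(new byte[len]);     // stands in for writing to the underlying stream
        return len;
    }

    // Two encrypt() calls that never overlap: returns the total bytes drained.
    static int replaySerial() {
        BufferRaceSketch s = new BufferRaceSketch();
        s.write(new byte[2]);
        int drained = s.encrypt();
        s.write(new byte[4]);
        return drained + s.encrypt();
    }

    // The same steps, interleaved as two unsynchronized threads might run them.
    // Returns true if the interleaving ends in a BufferOverflowException.
    static boolean replayInterleaved() {
        BufferRaceSketch s = new BufferRaceSketch();
        s.write(new byte[2]);
        // Flusher: first part of encrypt().
        s.inBuffer.flip();
        s.outBuffer.clear();
        s.outBuffer.put(s.inBuffer);
        s.inBuffer.clear();
        // Writer: adds data and starts its own encrypt().
        s.write(new byte[4]);
        s.inBuffer.flip();
        s.outBuffer.clear();
        // Flusher: resumes and flips outBuffer; its position is now 0, so the limit becomes 0.
        s.outBuffer.flip();
        // Writer: resumes and tries to put 4 bytes into a buffer with 0 bytes remaining.
        try {
            s.outBuffer.put(s.inBuffer);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("serial bytes drained: " + replaySerial());
        System.out.println("interleaved overflow: " + replayInterleaved());
    }
}
```

The serial replay drains both writes normally, while the interleaved replay
fails because one caller shrinks the limit of outBuffer between the other
caller's clear() and put(). Other interleavings of the same steps could
plausibly produce an underflow instead.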
I have validated this with multi-threaded unit tests that mimic the HBase WAL
use case. For a file outside an encryption zone (*DFSOutputStream*), the
multi-threaded flusher/writer works fine. For a file inside an encryption zone
(*CryptoOutputStream*), the multi-threaded flusher/writer randomly fails with
buffer overflow/underflow errors.
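A flusher/writer test of this kind could be shaped roughly like the harness
below. This is a sketch under stated assumptions, not the actual unit test: the
class FlusherWriterHarness and its run() method are hypothetical, it works on
any java.io.OutputStream, and it calls flush() where the WAL calls hflush().
Wiring it to a stream opened inside an encryption zone is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical harness: one writer thread and one flusher thread share a stream.
class FlusherWriterHarness {

    // Returns the first failure seen by either thread, or null if both finished cleanly.
    static Throwable run(OutputStream out, int writes, int chunkSize) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        AtomicBoolean done = new AtomicBoolean(false);
        byte[] chunk = new byte[chunkSize];

        Thread writer = new Thread(() -> {
            try {
                // Stop early if the flusher has already failed.
                for (int i = 0; i < writes && failure.get() == null; i++) {
                    out.write(chunk, 0, chunk.length);
                }
            } catch (Throwable t) {
                failure.compareAndSet(null, t);
            } finally {
                done.set(true);
            }
        });

        Thread flusher = new Thread(() -> {
            try {
                // Keep flushing for as long as the writer is active.
                while (!done.get()) {
                    out.flush();
                }
            } catch (Throwable t) {
                failure.compareAndSet(null, t);
            }
        });

        writer.start();
        flusher.start();
        try {
            writer.join();
            flusher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return e;
        }
        return failure.get();
    }

    public static void main(String[] args) {
        // ByteArrayOutputStream synchronizes its writes, so no failure is expected here.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        Throwable t = run(sink, 1000, 64);
        System.out.println("failure: " + t + ", bytes written: " + sink.size());
    }
}
```

Run against a stream whose write() and flush() are safe to call concurrently,
the harness should return null. Against a stream that shares unsynchronized
buffers between the two calls, it would be expected to return an exception
intermittently, which matches the random failures described above.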