[ 
https://issues.apache.org/jira/browse/HBASE-14501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934045#comment-14934045
 ] 

Enis Soztutar commented on HBASE-14501:
---------------------------------------

bq. What is TDE?
Transparent data encryption. 
bq. This seems like the proper behavior.
Agreed. We should not allow the Decoder to decide whether an EOF is ok or not. 
bq. Can you redo your bit about is.available? There are some bits missing which 
makes it hard to read. I think you are trying to say relying on is.available is 
dodgy... which yeah, generally is (though I put it here I believe).
Yes. IS.available() returns the number of bytes available to read without 
blocking. Since HDFS implements it to return the total number of remaining 
bytes instead, this part has been working. But we should not rely on all 
underlying filesystems to implement {{IS.available()}} this way. 

> NPE in replication with TDE
> ---------------------------
>
>                 Key: HBASE-14501
>                 URL: https://issues.apache.org/jira/browse/HBASE-14501
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>
> We are seeing a NPE when replication (or in this case async wal replay for 
> region replicas) is run on top of an HDFS cluster with TDE configured.
> This is the stack trace:
> {code}
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.CellUtil.matchingRow(CellUtil.java:370)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.countDistinctRowKeys(ReplicationSource.java:649)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:450)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:346)
> {code}
> This stack trace can only happen if WALEdit.getCells() returns an array 
> containing null entries. I believe this happens due to 
> {{KeyValueCodec.parseCell()}} uses {{KeyValueUtil.iscreate()}} which returns 
> null in case of EOF at the beginning. However, the contract for the 
> Decoder.parseCell() is not clear whether returning null is acceptable or not. 
> The other Decoders (CompressedKvDecoder, CellCodec, etc) do not return null 
> while KeyValueCodec does. 
> BaseDecoder has this code: 
> {code}
>   public boolean advance() throws IOException {
>     if (!this.hasNext) return this.hasNext;
>     if (this.in.available() == 0) {
>       this.hasNext = false;
>       return this.hasNext;
>     }
>     try {
>       this.current = parseCell();
>     } catch (IOException ioEx) {
>       rethrowEofException(ioEx);
>     }
>     return this.hasNext;
>   }
> {code}
> which is not correct since it uses {{IS.available()}} not according to the 
> javadoc: 
> (https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#available()).
>  DFSInputStream implements {{available()}} as the remaining bytes to read 
> from the stream, so we do not see the issue there. 
> {{CryptoInputStream.available()}} does a similar thing but see the issue. 
> So two questions: 
>  - What should be the interface for Decoder.parseCell()? Can it return null? 
>  - How to properly fix  BaseDecoder.advance() to not rely on {{available()}} 
> call. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to