[
https://issues.apache.org/jira/browse/HBASE-14501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934500#comment-14934500
]
Enis Soztutar commented on HBASE-14501:
---------------------------------------
Learnings so far:
- The interface between {{Decoder}} and the underlying {{InputStream}} should be
explicit. Ideally, the Decoder should not return null from parseCell().
- The {{CellDecoder}} currently does not know whether the InputStream has
reached its end (EOF), so {{advance()}} has to guess how many bytes it can still
read. Sometimes the upper layer has context for how many cells it should read
(IPCUtil, ProtobufLogReader).
- Not throwing an EOFException for every createCellScanner() in IPCUtil may be
important for performance; otherwise we would be throwing an exception in the
regular code path.
- Tried passing in a LimitInputStream that knows the file / buffer length and
implements {{remaining()}}, so that the Decoder does not call parseCell() and
raise an exception in the happy path. It turns out that DFSInputStream, when
tailing an open file, gets the last data block's length from the datanodes
directly, which can exceed the file length returned by the NN at file-open time
(important for replication WAL tailing).
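The EOF-detection problem described in the bullets above can be sketched with a one-byte read-and-push-back (a hypothetical class, not HBase code): the decoder learns about true end-of-stream from a single probing read, so it needs neither {{available()}} nor an exception on the regular end-of-stream path.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Sketch only: detect EOF by attempting a one-byte read and pushing the byte
// back, so advance() never has to trust available() and never throws in the
// regular end-of-stream path.
class PeekingDecoder {
  private final PushbackInputStream in;
  private int current = -1;

  PeekingDecoder(InputStream raw) {
    this.in = new PushbackInputStream(raw, 1);
  }

  /** Returns false exactly once, at the true end of the stream. */
  public boolean advance() throws IOException {
    int first = in.read();
    if (first == -1) {
      return false;            // real EOF, not an available() == 0 guess
    }
    in.unread(first);          // put the peeked byte back before parsing
    current = parseCell();
    return true;
  }

  // Toy stand-in for parseCell(): each "cell" is a single byte.
  private int parseCell() throws IOException {
    return in.read();
  }

  public int current() {
    return current;
  }
}
```

The one-byte pushback buffer is enough because the probe only ever reads a single byte ahead of the parser.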
I'll pursue another alternative: changing BaseDecoder.parseCell() to return a
union of <Cell, EOF, Exception> without causing extra object allocations per
cell.
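One allocation-free shape for that union (a hypothetical sketch, not the eventual patch for this issue) is a small status enum: the enum constants are singletons, the parsed cell lands in a field, and an error is carried in a field until rethrown, so nothing is wrapped per cell.

```java
import java.io.IOException;

// Hypothetical sketch of the <Cell, EOF, Exception> union without per-cell
// allocation: status values are singletons, the cell and any error live in
// fields of the decoder itself.
abstract class UnionDecoder<C> {
  enum Status { CELL, EOF, ERROR }

  protected C current;           // valid when parseCell() returns CELL
  protected IOException error;   // valid when parseCell() returns ERROR

  /** Subclasses report the outcome via the enum instead of null or throw. */
  protected abstract Status parseCell();

  public boolean advance() throws IOException {
    switch (parseCell()) {
      case CELL:
        return true;
      case EOF:
        return false;
      default:
        throw error;
    }
  }

  public C current() {
    return current;
  }
}

// Toy subclass for illustration: "cells" are elements of an int[] array.
class ArrayDecoder extends UnionDecoder<Integer> {
  private final int[] data;
  private int pos = 0;

  ArrayDecoder(int[] data) {
    this.data = data;
  }

  @Override
  protected Status parseCell() {
    if (pos >= data.length) {
      return Status.EOF;
    }
    current = data[pos++];
    return Status.CELL;
  }
}
```

The caller-facing advance()/current() shape stays the same; only the internal parseCell() contract changes.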
> NPE in replication with TDE
> ---------------------------
>
> Key: HBASE-14501
> URL: https://issues.apache.org/jira/browse/HBASE-14501
> Project: HBase
> Issue Type: Bug
> Reporter: Enis Soztutar
>
> We are seeing an NPE when replication (or, in this case, async WAL replay for
> region replicas) is run on top of an HDFS cluster with TDE configured.
> This is the stack trace:
> {code}
> java.lang.NullPointerException
> at org.apache.hadoop.hbase.CellUtil.matchingRow(CellUtil.java:370)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.countDistinctRowKeys(ReplicationSource.java:649)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:450)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:346)
> {code}
> This stack trace can only happen if WALEdit.getCells() returns an array
> containing null entries. I believe this happens because
> {{KeyValueCodec.parseCell()}} uses {{KeyValueUtil.iscreate()}}, which returns
> null on EOF at the beginning of a cell. However, the contract for
> Decoder.parseCell() is not clear on whether returning null is acceptable.
> The other Decoders (CompressedKvDecoder, CellCodec, etc.) do not return null,
> while KeyValueCodec does.
> BaseDecoder has this code:
> {code}
> public boolean advance() throws IOException {
>   if (!this.hasNext) return this.hasNext;
>   if (this.in.available() == 0) {
>     this.hasNext = false;
>     return this.hasNext;
>   }
>   try {
>     this.current = parseCell();
>   } catch (IOException ioEx) {
>     rethrowEofException(ioEx);
>   }
>   return this.hasNext;
> }
> {code}
> which is not correct, since it uses {{InputStream.available()}} in a way the
> javadoc does not support
> (https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#available()).
> DFSInputStream implements {{available()}} as the number of bytes remaining to
> read from the stream, so we do not see the issue there.
> {{CryptoInputStream.available()}} does a similar thing, but we still see the
> issue with it.
> So, two questions:
> - What should the interface for Decoder.parseCell() be? Can it return null?
> - How do we properly fix BaseDecoder.advance() so that it does not rely on the
> {{available()}} call?
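To illustrate why the quoted advance() is fragile: the InputStream javadoc only promises that available() estimates how many bytes can be read without blocking, so a stream may legally return 0 while data remains. The wrapper below is a hypothetical stand-in for such a stream (not CryptoInputStream itself); any EOF test of the form available() == 0 would stop prematurely on it.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical wrapper: returning 0 from available() while bytes remain is
// permitted by the InputStream javadoc, so an EOF check based on
// available() == 0 (as in BaseDecoder.advance()) stops too early here.
class ZeroAvailableStream extends FilterInputStream {
  ZeroAvailableStream(InputStream in) {
    super(in);
  }

  @Override
  public int available() {
    // Legal: available() is a non-blocking-read estimate, not the remaining
    // length of the stream.
    return 0;
  }
}
```

Wrapping a ByteArrayInputStream in it gives available() == 0 even though read() still returns data, reproducing the premature-EOF failure mode in miniature.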
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)