[ 
https://issues.apache.org/jira/browse/HBASE-16931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15601046#comment-15601046
 ] 

Anoop Sam John commented on HBASE-16931:
----------------------------------------

Oh, nice debugging. This must have been hard to track down.
On the patch: what you are doing now is setting the seqId of the cell to zero, 
passing it to the writer, and then, outside the for loop, changing the seqId of 
the last cell back to its original value. This might also fail in one case. 
Within the loop there is now a shipped() call. We use shipped() to do 
intermediate cleanup in the readers, but the writers also implement 
ShipperListener, and there the call clones the last cells the writer refers 
to. This is needed because when the blocks for compaction come from an L2 
shared memory cache, the shipped() call might return the block to the block 
cache (BC), so that memory can be released. If the lastCells refer to that 
block area, our cell data can get corrupted. So before the blocks are returned 
we clone the cells. (Note that we clone only the needed part. FYI, I can 
foresee a bug here as well: we don't clone the seqId, so all cloned cells will 
report a seqId of 0! That is another issue to solve.) That said, when you come 
out of the for loop, it may happen that the prevCell that other parts of the 
code refer to is no longer the same object as the one in the cells list.
cc [~ram_krish]
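
To make the concern concrete, here is a minimal sketch in plain Java with no 
HBase dependency. SketchCell, cloneKeyOnly() and ShippedCloneSketch are 
hypothetical names invented for illustration, and the key-only clone is an 
assumption based on the description above, not the actual HBase code. It shows 
why restoring the seqId on the object in the cells list does not help once 
another holder has swapped its reference for a clone.

```java
/** Hypothetical stand-in for a cell: an immutable key plus a mutable seqId. */
final class SketchCell {
  final String key;
  long seqId;

  SketchCell(String key, long seqId) {
    this.key = key;
    this.seqId = seqId;
  }

  /** Assumed clone behavior: only the key is copied, so seqId falls back to 0. */
  SketchCell cloneKeyOnly() {
    return new SketchCell(key, 0L);
  }
}

final class ShippedCloneSketch {
  /**
   * Returns the seqId visible through the "other" reference after the patch
   * has zeroed and then restored the seqId on the object in the cells list.
   */
  static long seqIdSeenByOtherHolder(boolean shippedInsideLoop) {
    SketchCell inList = new SketchCell("row1/cf:q", 10L);
    SketchCell prevCell = inList;          // another component holds the same object
    long original = inList.seqId;

    inList.seqId = 0L;                     // compactor cleans the seqId before writing
    if (shippedInsideLoop) {
      prevCell = prevCell.cloneKeyOnly();  // shipped(): the holder swaps in a clone
    }
    inList.seqId = original;               // the patch restores the list object only

    return prevCell.seqId;
  }

  public static void main(String[] args) {
    System.out.println("without shipped(): " + seqIdSeenByOtherHolder(false));
    System.out.println("with shipped():    " + seqIdSeenByOtherHolder(true));
  }
}
```

When shipped() runs inside the loop in this model, the restore touches an 
object that the other holder no longer references, so the holder keeps seeing 
the clone's seqId.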

> Cleaned seqid in compaction should set back when write finish.
> --------------------------------------------------------------
>
>                 Key: HBASE-16931
>                 URL: https://issues.apache.org/jira/browse/HBASE-16931
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 2.0.0
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: HBASE-16931-master.patch
>
>
> Compactor#performCompaction
>       do {
>         hasMore = scanner.next(cells, scannerContext);
>         // output to writer:
>         for (Cell c : cells) {
>           if (cleanSeqId && c.getSequenceId() <= smallestReadPoint) {
>             CellUtil.setSequenceId(c, 0);
>           }
>           writer.append(c);
>         }
>         cells.clear();
>       } while (hasMore);
> scanner.next returns at most "hbase.hstore.compaction.kv.max" KVs per call, 
> and the last cell is still referenced by StoreScanner.prevCell. So if the 
> seqId of that cell has been cleaned, StoreScanner.checkScanOrder may throw an 
> exception on the next scanner.next call and bring the regionserver down.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
