[jira] [Commented] (HBASE-14491) ReplicationSource#countDistinctRowKeys code logic is not correct

Enis Soztutar (JIRA) Tue, 13 Oct 2015 11:41:02 -0700

    [ https://issues.apache.org/jira/browse/HBASE-14491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955440#comment-14955440 ]


Enis Soztutar commented on HBASE-14491:
---------------------------------------

In HRegion.doMiniBatchMutation(), we prepare the WALEdits by appending 
cells to them one by one, calling {{WALEdit.add(Cell)}}. 

{code}
    Map<byte[], List<Cell>>[] familyMaps = new Map[batchOp.operations.length];
...
        Map<byte[], List<Cell>> familyMap = mutation.getFamilyCellMap();
...
  private void addFamilyMapToWALEdit(Map<byte[], List<Cell>> familyMap,
      WALEdit walEdit) {
    for (List<Cell> edits : familyMap.values()) {
      assert edits instanceof RandomAccess;
      int listSize = edits.size();
      for (int i=0; i < listSize; i++) {
        Cell cell = edits.get(i);
        walEdit.add(cell);
      }
    }
  }
{code}

My understanding is that the cells within a WALEdit will be grouped by 
Mutation, each of which covers a single row. There can be multiple mutations 
in the same WALEdit sharing the same row key, but they are still distinct 
mutations that have to be counted separately. I would say we should resolve 
this jira now. 
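
To illustrate the grouping argument, here is a minimal, self-contained sketch 
that counts runs of consecutive equal row keys instead of building a set of 
distinct keys. It is not the ReplicationSource code: the class 
RowRunCounter, the methods countRowRuns() and row(), and the use of plain 
byte[] row keys in place of Cells are all assumptions made for illustration. 
It relies on the cells of one mutation being adjacent, as described above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: rows are modelled as plain byte[]
// keys, one per cell, in the order the cells were appended to the edit.
final class RowRunCounter {

    // Counts groups of adjacent cells that share a row key. The previous
    // key is updated on every iteration, so each cell is compared with its
    // immediate predecessor rather than with the first cell.
    static int countRowRuns(List<byte[]> rowKeys) {
        if (rowKeys.isEmpty()) {
            return 0;
        }
        int runs = 1;
        byte[] lastRow = rowKeys.get(0);
        for (int i = 1; i < rowKeys.size(); i++) {
            byte[] row = rowKeys.get(i);
            if (!Arrays.equals(row, lastRow)) {
                runs++;
            }
            lastRow = row;
        }
        return runs;
    }

    // Convenience for building example row keys.
    static byte[] row(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Three groups: r1 (2 cells), r2 (2 cells), then r1 again (1 cell).
        List<byte[]> rowKeys = Arrays.asList(
            row("r1"), row("r1"), row("r2"), row("r2"), row("r1"));
        System.out.println(countRowRuns(rowKeys)); // prints 3
    }
}
```

A set-based count would report 2 for the same input, because the second r1 
group would be folded into the first. Note the limitation of this sketch: 
two adjacent mutations on the same row still collapse into one run, since 
row keys alone cannot tell them apart.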

> ReplicationSource#countDistinctRowKeys code logic is not correct
> ----------------------------------------------------------------
>
>                 Key: HBASE-14491
>                 URL: https://issues.apache.org/jira/browse/HBASE-14491
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ashish Singhi
>            Assignee: Ashish Singhi
>            Priority: Minor
>
> {code}
>       Cell lastCell = cells.get(0);
>       for (int i = 0; i < edit.size(); i++) {
>         if (!CellUtil.matchingRow(cells.get(i), lastCell)) {
>           distinctRowKeys++;
>         }
>       }
> {code}
> The above logic for finding the distinct row keys in the list needs to be 
> corrected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
