[jira] [Updated] (HBASE-17633) Update unflushed sequence id in SequenceIdAccounting after flush with the minimum sequence id in memstore

Duo Zhang (JIRA) Sun, 12 Feb 2017 17:27:02 -0800

     [ 
https://issues.apache.org/jira/browse/HBASE-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Duo Zhang updated HBASE-17633:
------------------------------
    Description: 
Now the tracking work is done by SequenceIdAccounting, and it is a little 
tricky when dealing with flush. We have to remove the mappings for the given 
stores of a region from lowestUnflushedSequenceIds, so that we have space to 
store the new lowest unflushed sequence id after flush. But we still need to 
keep the old sequence ids in another map, as we need these values when 
reporting to master to prevent data loss (think of the scenario where we 
report the new lowest unflushed sequence id to master and then crash before 
actually flushing the data to disk).
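
To make the bookkeeping above concrete, here is a minimal sketch of the two-map scheme. It is not the real SequenceIdAccounting code: the class name, method names, and the single-region, String-keyed maps are all assumptions made for illustration, and locking is ignored.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of the two-map bookkeeping; NOT the HBase class. */
class TwoMapAccountingSketch {
    // store name -> lowest sequence id that has not been flushed yet
    private final Map<String, Long> lowestUnflushed = new HashMap<>();
    // store name -> old lowest sequence id, kept while a flush is in progress
    private final Map<String, Long> flushing = new HashMap<>();

    /** Records an edit; only the first (lowest) id per store is kept. */
    void append(String store, long seqId) {
        lowestUnflushed.putIfAbsent(store, seqId);
    }

    /** Moves the old mapping aside so new appends can record a new lowest id. */
    void startFlush(String store) {
        Long old = lowestUnflushed.remove(store);
        if (old != null) {
            flushing.put(store, old);
        }
    }

    /** The data is on disk, so the old id no longer needs to be reported. */
    void completeFlush(String store) {
        flushing.remove(store);
    }

    /** The flush failed: restore the old id, which is lower than anything appended since. */
    void abortFlush(String store) {
        Long old = flushing.remove(store);
        if (old != null) {
            lowestUnflushed.put(store, old);
        }
    }

    /** Id to report to master; the old id wins while a flush is in progress. Returns -1 if none. */
    long lowestToReport(String store) {
        Long old = flushing.get(store);
        if (old != null) {
            return old;
        }
        return lowestUnflushed.getOrDefault(store, -1L);
    }
}
```

In this sketch, a crash between startFlush and completeFlush is safe because lowestToReport still returns the old id until the flush has actually completed.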

And when reviewing HBASE-17407, I found that for CompactingMemStore we have 
to record the minimum sequence id in memstore. We could just update the 
mappings in SequenceIdAccounting using these values after flush. This means we 
do not need to update the lowest unflushed sequence id in SequenceIdAccounting, 
do not need to make space for the new lowest unflushed sequence id in 
startCacheFlush, and do not need the extra map to store the old mappings.
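
A rough sketch of what the proposed single-map scheme might look like follows. It is only one possible reading of the proposal: the class and method names are hypothetical, the memstore is assumed to hand over its minimum remaining sequence id once the flush has finished, the first append per store is still recorded, and concurrency between appends and flush completion is ignored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical, simplified model of the proposed single-map scheme; NOT the HBase class. */
class SingleMapAccountingSketch {
    // store name -> lowest sequence id that has not been flushed yet
    private final Map<String, Long> lowestUnflushed = new HashMap<>();

    /** Records an edit; only the first (lowest) id per store is kept. */
    void append(String store, long seqId) {
        lowestUnflushed.putIfAbsent(store, seqId);
    }

    /**
     * Called only after the flushed data is on disk. minInMemstore is the minimum
     * sequence id still held in the memstore, or empty if the memstore is empty.
     */
    void onFlushCompleted(String store, OptionalLong minInMemstore) {
        if (minInMemstore.isPresent()) {
            lowestUnflushed.put(store, minInMemstore.getAsLong());
        } else {
            lowestUnflushed.remove(store);
        }
    }

    /** Id to report to master; returns -1 if nothing is unflushed. */
    long lowestToReport(String store) {
        return lowestUnflushed.getOrDefault(store, -1L);
    }
}
```

Because the mapping only changes once the flush has completed, the old id stays visible for the whole flush without a second map and without any work at flush start.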

This could simplify our logic a lot. But this is a fundamental change, so I 
need some time to implement it, especially for modifying the tests. I also 
need some time to check whether I have missed something.

  was:
Now the tracking work is done by SequenceIdAccounting. And it is a little 
tricky when dealing with flush. We should remove the mapping for the given 
stores of a region from lowestUnflushedSequenceIds, so that we have space to 
store the new lowest unflushed sequence id after flush. But we still need to 
keep the old sequence ids in another map as we still need to use these values 
when reporting to master to prevent data loss(think of the scenario that we 
report the new lowest unflushed sequence id to master and we crashed before 
actually flushed the data to disk).

And when reviewing HBASE-17407, I found  that for CompactingMemStore, we have 
to record the minimum sequence id.in memstore. We could just update the 
mappings in SequenceIdAccounting using these values after flush. This means we 
do not need to update the lowest unflushed sequence id in SequenceIdAccounting, 
and also do not need to make space for the new lowest unflushed when 
startCacheFlush, and also do not need the extra map to store the old mappings.

This could simplify our logic a lot. But this is an fundamental change so I 
need sometime to implement, especially for modifying tests... And I also need 
sometime to check if I miss something.


> Update unflushed sequence id in SequenceIdAccounting after flush with the 
> minimum sequence id in memstore
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-17633
>                 URL: https://issues.apache.org/jira/browse/HBASE-17633
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>    Affects Versions: 2.0.0
>            Reporter: Duo Zhang
>             Fix For: 2.0.0
>
>
> Now the tracking work is done by SequenceIdAccounting, and it is a little 
> tricky when dealing with flush. We have to remove the mappings for the given 
> stores of a region from lowestUnflushedSequenceIds, so that we have space to 
> store the new lowest unflushed sequence id after flush. But we still need to 
> keep the old sequence ids in another map, as we need these values when 
> reporting to master to prevent data loss (think of the scenario where we 
> report the new lowest unflushed sequence id to master and then crash before 
> actually flushing the data to disk).
> And when reviewing HBASE-17407, I found that for CompactingMemStore we have 
> to record the minimum sequence id in memstore. We could just update the 
> mappings in SequenceIdAccounting using these values after flush. This means 
> we do not need to update the lowest unflushed sequence id in 
> SequenceIdAccounting, do not need to make space for the new lowest unflushed 
> sequence id in startCacheFlush, and do not need the extra map to store the 
> old mappings.
> This could simplify our logic a lot. But this is a fundamental change, so I 
> need some time to implement it, especially for modifying the tests. I also 
> need some time to check whether I have missed something.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
