[ https://issues.apache.org/jira/browse/HBASE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877550#comment-15877550 ]

Yu Li edited comment on HBASE-17471 at 2/22/17 6:15 AM:
--------------------------------------------------------

Here is the performance data before/after the change in our customized 1.1.2 
with PCIe-SSD (with HBASE-17676 as well), which shows no regression:

||Case||Throughput(ops/sec)||AverageLatency(us)||
|before|127708|4983|
|after|127608|4987|

Test environment:
{noformat}
Hardware:
4 physical client nodes, 1 single RS, 3 DataNodes
1 PCIe-SSD, 10 SATA disks

YCSB configurations:
8 YCSB processes on each client node
operationcount=20000000
threadcount=20 (overall 4*8*20=640 threads against the single RS)
insertproportion=1

HBase configurations:
hbase.hregion.memstore.flush.size => 268435456
hbase.regionserver.handler.count => 192
hbase.wal.storage.policy => ALL_SSD

table schema:
{NAME => 'cf', DATA_BLOCK_ENCODING => 'DIFF', VERSIONS=> '1', 
COMPRESSION => 'SNAPPY', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
{SPLITS => (1..9).map {|i| "user#{1000+i*(9999-1000)/9}"}, 
DURABILITY=>'SYNC_WAL',
METADATA => {'hbase.hstore.block.storage.policy' => 'ALL_SSD'}}
{noformat}
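For reference, a small sketch of the integer arithmetic behind the SPLITS 
expression above (a hypothetical helper class, not part of the original setup); 
it prints the nine pre-split keys, so the table starts out with ten regions:

{code}
// Hypothetical helper, not from the original setup: mirrors the Ruby SPLITS
// expression above with the same integer arithmetic and prints the nine
// pre-split keys (user1999, user2999, ..., user8999, user9999).
public class PrintSplitKeys {
  public static void main(String[] args) {
    for (int i = 1; i <= 9; i++) {
      System.out.println("user" + (1000 + i * (9999 - 1000) / 9));
    }
  }
}
{code}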



> Region Seqid will be out of order in WAL if using mvccPreAssign
> ---------------------------------------------------------------
>
>                 Key: HBASE-17471
>                 URL: https://issues.apache.org/jira/browse/HBASE-17471
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Critical
>         Attachments: HBASE-17471-duo.patch, HBASE-17471-duo-v1.patch, 
> HBASE-17471-duo-v2.patch, HBASE-17471.patch, HBASE-17471.tmp, 
> HBASE-17471.v2.patch, HBASE-17471.v3.patch, HBASE-17471.v4.patch, 
> HBASE-17471.v5.patch, HBASE-17471.v6.patch
>
>
>  mvccPreAssign was introduced by HBASE-16698, which truly improved write 
> performance, especially in the ASYNC_WAL scenario. But mvccPreAssign is only 
> used in {{doMiniBatchMutate}}, not in the Increment/Append path. If 
> Increment/Append and batch put run against the same region in parallel, the 
> seqids of that region may not be monotonically increasing in the WAL, since 
> one write path acquires the mvcc/seqid before appending while the other 
> acquires it in the append/sync consumer thread (see the standalone sketch 
> after this description).
> The out-of-order situation can easily be reproduced by a simple UT, which is 
> attached to this issue. I modified the code to assert on the disorder: 
> {code}
>     if(this.highestSequenceIds.containsKey(encodedRegionName)) {
>       assert highestSequenceIds.get(encodedRegionName) < sequenceid;
>     }
> {code}
> I'd like to say that if we allow disorder in WALs, then this is not an issue. 
> But as far as I know, if {{highestSequenceIds}} is not properly set, some 
> WALs may not be archived to oldWALs correctly.
> What I haven't figured out yet is whether disorder in the WAL can cause data 
> loss when recovering from a disaster. If so, then it is a big problem that 
> needs to be fixed.
> I have fixed this problem in our custom 1.1.x branch; my solution is to use 
> mvccPreAssign everywhere, making it non-configurable, since mvccPreAssign is 
> indeed a better way than assigning the seqid in the ringbuffer thread while 
> keeping handlers waiting for it.
> If anyone thinks it is doable, I will port it to branch-1 and the master 
> branch and upload it. 
>  
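Below is a minimal, self-contained sketch (plain Java, independent of HBase; 
all class and variable names are hypothetical) of the interleaving described in 
the report above: one producer pre-assigns its sequence id before handing the 
entry to the consumer (the batch-put path with mvccPreAssign), while the other 
lets the consumer assign the id (the Increment/Append path). Under concurrency 
the consumer can see a smaller id after a larger one, which is exactly the 
non-monotonic ordering the {{highestSequenceIds}} assertion detects.

{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simulation of the two write paths; not HBase code.
public class SeqIdDisorderDemo {
  private static final long UNASSIGNED = -1L;               // id to be assigned by the consumer
  private static final AtomicLong mvcc = new AtomicLong();  // stands in for the region's mvcc/seqid source
  private static final BlockingQueue<Long> ringBuffer = new ArrayBlockingQueue<>(1024);

  public static void main(String[] args) throws Exception {
    // Path 1: batch put with mvccPreAssign -- id acquired BEFORE the append is queued.
    Thread preAssignPath = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) {
        long id = mvcc.incrementAndGet();
        try {
          ringBuffer.put(id);
        } catch (InterruptedException e) {
          return;
        }
      }
    });
    // Path 2: Increment/Append without pre-assign -- id acquired later, in the consumer.
    Thread lazyAssignPath = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) {
        try {
          ringBuffer.put(UNASSIGNED);
        } catch (InterruptedException e) {
          return;
        }
      }
    });
    // Consumer: plays the role of the append/sync thread writing entries to the WAL.
    Thread consumer = new Thread(() -> {
      long highestSeen = 0;
      long disorders = 0;
      for (int i = 0; i < 200_000; i++) {
        try {
          long id = ringBuffer.take();
          if (id == UNASSIGNED) {
            id = mvcc.incrementAndGet();   // assigned only now, possibly jumping ahead of queued entries
          }
          if (id < highestSeen) {
            disorders++;                   // a smaller seqid appended after a larger one
          }
          highestSeen = Math.max(highestSeen, id);
        } catch (InterruptedException e) {
          return;
        }
      }
      System.out.println("out-of-order appends observed: " + disorders);
    });
    consumer.start();
    preAssignPath.start();
    lazyAssignPath.start();
    preAssignPath.join();
    lazyAssignPath.join();
    consumer.join();
  }
}
{code}

The numbers (100,000 appends per path, a 1024-entry queue) are arbitrary; the 
point is only that assigning some seqids before queuing and others at consume 
time breaks per-region monotonicity in the order entries reach the WAL.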


