[ 
https://issues.apache.org/jira/browse/HBASE-29357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tak-Lon (Stephen) Wu resolved HBASE-29357.
------------------------------------------
    Resolution: Fixed

Thanks again [~junegunn]

> PerformanceEvaluation: Read tests should not drop existing table
> ----------------------------------------------------------------
>
>                 Key: HBASE-29357
>                 URL: https://issues.apache.org/jira/browse/HBASE-29357
>             Project: HBase
>          Issue Type: Bug
>          Components: PE
>    Affects Versions: 4.0.0-alpha-1
>            Reporter: Junegunn Choi
>            Assignee: Junegunn Choi
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.3
>
>
> h2. Problem
> A read test such as {{randomRead}} might drop an existing table when the 
> specified {{--presplit}} value is not consistent with the current number of 
> regions of the table.
> {code:java}
> # Generate data
> bin/hbase pe --nomapred --size=2 --presplit=30 sequentialWrite 1
> # Perform a read test on the table. Forgot to remove --presplit option, but it's okay.
> bin/hbase pe --nomapred --size=2 --presplit=30 --sampleRate=0.1 sequentialRead 1
> # But if the number of the regions has changed
> bin/hbase shell -n <<< "split 'TestTable'"
> # The --presplit option will cause recreation of the table.
> bin/hbase pe --nomapred --size=2 --presplit=30 --sampleRate=0.1 sequentialRead 1
>   # Operation: DISABLE, Table Name: default:TestTable completed
>   # Operation: DELETE, Table Name: default:TestTable completed
>   # Operation: CREATE, Table Name: default:TestTable completed
> {code}
> One might argue that passing a {{--presplit}} value to a read test is a 
> mistake, but even so, it should not cause the table to be recreated, which 
> makes the subsequent read test meaningless.
> h2. Analysis
> There are currently 4 conditions for recreating the table.
> {code:java}
> if (
>   (exists && opts.presplitRegions != DEFAULT_OPTS.presplitRegions
>     && opts.presplitRegions != admin.getRegions(tableName).size())
>     || (!isReadCmd && desc != null
>       && !StringUtils.equals(desc.getRegionSplitPolicyClassName(), opts.splitPolicy))
>     || (!(isReadCmd || isDeleteCmd) && desc != null
>       && desc.getRegionReplication() != opts.replicas)
>     || (desc != null && desc.getColumnFamilyCount() != opts.families)
> ) {
>   needsDelete = true;
> {code}
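> But they are inconsistent in how they treat {{isReadCmd}}: only the split 
> policy and replication conditions check it, so the region count and CF count 
> conditions can still trigger recreation during a read command.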
> h2. Suggestion
> *Premise: never drop an existing table unless executing a write command.*
> ||Condition||Current behavior||Suggested behavior||
> |Region count changed|{color:#de350b}Table recreated{color}|Proceed with the test, with a warning|
> |Split policy changed|No warning|Proceed with the test, with a warning|
> |Replication factor changed|No warning|Proceed with the test, with a warning|
> |CF count changed|{color:#de350b}Table recreated{color}|Abort the test with a warning|
>  * A change in region count or split policy shouldn't affect read tests, so 
> it's better to proceed with the test, but with a warning.
>  * I can also imagine wanting to perform a read test with {{--replicas=1}} 
> even when the table has a different setting.
>  * Technically, we can still run a read test if the current number of CFs is 
> greater than the requested number of CFs, but I decided not to allow it to 
> avoid confusion.
> h2. Result
> {code:java}
> bin/hbase pe --nomapred --size=2 --presplit=30 --sampleRate=0.1 sequentialRead 1
>   # Inconsistent table state detected. Consider running a write command first: [--presplit=30, but found 60 regions]
> bin/hbase pe --nomapred --size=2 --presplit=30 --replicas=2 --sampleRate=0.1 sequentialRead 1
>   # Inconsistent table state detected. Consider running a write command first: [--presplit=30, but found 60 regions], [--replicas=2, but found 1 replicas]
> bin/hbase pe --nomapred --size=2 --presplit=30 --replicas=2 --splitPolicy=org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy --sampleRate=0.1 sequentialRead 1
>   # Inconsistent table state detected. Consider running a write command first: [--presplit=30, but found 60 regions], [--splitPolicy=org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy, but current policy is null], [--replicas=2, but found 1 replicas]
> bin/hbase pe --nomapred --size=2 --presplit=30 --replicas=2 --splitPolicy=org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy --families=2 --sampleRate=0.1 sequentialRead 1
>   # java.lang.IllegalStateException: Cannot proceed the test. Run a write command first: --families=2, but found 1 column families
> {code}
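
The read-command behavior proposed in the Suggestion table can be sketched
roughly as follows. This is a minimal standalone illustration, not the code
from the actual patch: the class ReadTestTableCheck, the TableState and
Options holders, and the checkForRead method are all hypothetical names, and
treating 0 as the "not specified" value of --presplit is an assumption. The
message formats are copied from the Result output above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class ReadTestTableCheck {
  // Assumption: 0 stands in for DEFAULT_OPTS.presplitRegions ("not specified").
  static final int DEFAULT_PRESPLIT = 0;

  /** Hypothetical snapshot of the existing table; not an HBase type. */
  static final class TableState {
    final int regions;
    final String splitPolicy;
    final int replicas;
    final int families;

    TableState(int regions, String splitPolicy, int replicas, int families) {
      this.regions = regions;
      this.splitPolicy = splitPolicy;
      this.replicas = replicas;
      this.families = families;
    }
  }

  /** Hypothetical subset of the PE command-line options. */
  static final class Options {
    final int presplit;
    final String splitPolicy;
    final int replicas;
    final int families;

    Options(int presplit, String splitPolicy, int replicas, int families) {
      this.presplit = presplit;
      this.splitPolicy = splitPolicy;
      this.replicas = replicas;
      this.families = families;
    }
  }

  /**
   * Checks an existing table before a read command. Never asks for the table
   * to be dropped: a CF count mismatch aborts, anything else becomes a warning.
   */
  static List<String> checkForRead(TableState table, Options opts) {
    if (table.families != opts.families) {
      throw new IllegalStateException("Cannot proceed the test. Run a write command first: "
        + "--families=" + opts.families + ", but found " + table.families + " column families");
    }
    List<String> warnings = new ArrayList<>();
    if (opts.presplit != DEFAULT_PRESPLIT && opts.presplit != table.regions) {
      warnings.add("[--presplit=" + opts.presplit + ", but found " + table.regions + " regions]");
    }
    if (!Objects.equals(table.splitPolicy, opts.splitPolicy)) {
      warnings.add("[--splitPolicy=" + opts.splitPolicy + ", but current policy is "
        + table.splitPolicy + "]");
    }
    if (table.replicas != opts.replicas) {
      warnings.add("[--replicas=" + opts.replicas + ", but found " + table.replicas + " replicas]");
    }
    return warnings;
  }

  public static void main(String[] args) {
    // Table after the manual split: 60 regions, no split policy, 1 replica, 1 CF.
    TableState table = new TableState(60, null, 1, 1);

    List<String> warnings = checkForRead(table, new Options(30, null, 2, 1));
    if (!warnings.isEmpty()) {
      System.out.println("Inconsistent table state detected. "
        + "Consider running a write command first: " + String.join(", ", warnings));
    }

    try {
      checkForRead(table, new Options(30, null, 2, 2)); // CF count differs: abort
    } catch (IllegalStateException e) {
      System.out.println(e);
    }
  }
}
```

The sketch handles the CF count first because it is the only mismatch that
makes a read test impossible to run as requested. The remaining checks only
accumulate warnings, so a read command never reaches a code path that could
drop the table.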


