[ https://issues.apache.org/jira/browse/HBASE-29357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tak-Lon (Stephen) Wu resolved HBASE-29357.
------------------------------------------
    Resolution: Fixed

Thanks again [~junegunn]

> PerformanceEvaluation: Read tests should not drop existing table
> ----------------------------------------------------------------
>
>                 Key: HBASE-29357
>                 URL: https://issues.apache.org/jira/browse/HBASE-29357
>             Project: HBase
>          Issue Type: Bug
>          Components: PE
>    Affects Versions: 4.0.0-alpha-1
>            Reporter: Junegunn Choi
>            Assignee: Junegunn Choi
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.3
>
>
> h2. Problem
> A read test such as {{randomRead}} might drop an existing table when the
> specified {{--presplit}} value is not consistent with the current number of
> regions of the table.
> {code:java}
> # Generate data
> bin/hbase pe --nomapred --size=2 --presplit=30 sequentialWrite 1
> # Perform a read test on the table. Forgot to remove --presplit option, but it's okay.
> bin/hbase pe --nomapred --size=2 --presplit=30 --sampleRate=0.1 sequentialRead 1
> # But if the number of the regions has changed
> bin/hbase shell -n <<< "split 'TestTable'"
> # The --presplit option will cause recreation of the table.
> bin/hbase pe --nomapred --size=2 --presplit=30 --sampleRate=0.1 sequentialRead 1
> # Operation: DISABLE, Table Name: default:TestTable completed
> # Operation: DELETE, Table Name: default:TestTable completed
> # Operation: CREATE, Table Name: default:TestTable completed
> {code}
> One might say it's wrong to put a {{--presplit}} value in a read test, yes,
> but even so, it should not cause recreation of the table, which makes the
> following read test meaningless.
> h2. Analysis
> There are currently 4 conditions for recreating the table.
> {code:java}
> if (
>   (exists && opts.presplitRegions != DEFAULT_OPTS.presplitRegions
>     && opts.presplitRegions != admin.getRegions(tableName).size())
>   || (!isReadCmd && desc != null
>     && !StringUtils.equals(desc.getRegionSplitPolicyClassName(), opts.splitPolicy))
>   || (!(isReadCmd || isDeleteCmd) && desc != null
>     && desc.getRegionReplication() != opts.replicas)
>   || (desc != null && desc.getColumnFamilyCount() != opts.families)
> ) {
>   needsDelete = true;
> {code}
> But they are inconsistent in how they treat {{isReadCmd}}.
> h2. Suggestion
> *Premise: never drop an existing table unless executing a write command.*
> ||Condition||Current behavior||Suggested behavior||
> |Region count changed|{color:#de350b}Table recreated{color}|Proceed with the test, with a warning|
> |Split policy changed|No warning|Proceed with the test, with a warning|
> |Replication factor changed|No warning|Proceed with the test, with a warning|
> |CF count changed|{color:#de350b}Table recreated{color}|Abort the test with a warning|
> * A change of region count or split policy shouldn't affect read tests, so
> it's better to proceed with the test, but with a warning.
> * I can also imagine wanting to perform a read test with {{--replicas=1}}
> even when the table has a different setting.
> * Technically, we can still run a read test if the current number of CFs is
> greater than the requested number of CFs, but I decided not to allow it to
> avoid confusion.
> h2. Result
> {code:java}
> bin/hbase pe --nomapred --size=2 --presplit=30 --sampleRate=0.1 sequentialRead 1
> # Inconsistent table state detected. Consider running a write command first: [--presplit=30, but found 60 regions]
> bin/hbase pe --nomapred --size=2 --presplit=30 --replicas=2 --sampleRate=0.1 sequentialRead 1
> # Inconsistent table state detected. Consider running a write command first: [--presplit=30, but found 60 regions], [--replicas=2, but found 1 replicas]
> bin/hbase pe --nomapred --size=2 --presplit=30 --replicas=2 --splitPolicy=org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy --sampleRate=0.1 sequentialRead 1
> # Inconsistent table state detected. Consider running a write command first: [--presplit=30, but found 60 regions], [--splitPolicy=org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy, but current policy is null], [--replicas=2, but found 1 replicas]
> bin/hbase pe --nomapred --size=2 --presplit=30 --replicas=2 --splitPolicy=org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy --families=2 --sampleRate=0.1 sequentialRead 1
> # java.lang.IllegalStateException: Cannot proceed the test. Run a write command first: --families=2, but found 1 column families
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
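
For readers who want the suggested read-test behavior from the table in the issue description in one place, here is a minimal standalone sketch. It is not the code merged for HBASE-29357: the class `ReadTestTableCheck`, the method `check`, its parameter list, and the `DEFAULT_PRESPLIT` sentinel are all hypothetical names chosen for illustration, and the message wording only loosely follows the output quoted in the issue. It assumes a read command only: tolerable mismatches are collected as warnings, and a column family count mismatch aborts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical helper for illustration; not part of PerformanceEvaluation.
final class ReadTestTableCheck {
    // Assumed sentinel meaning "--presplit was not given"; the real default may differ.
    static final int DEFAULT_PRESPLIT = 0;

    /**
     * Compares requested options against the existing table for a read test.
     * Returns warnings for mismatches the read test can tolerate and throws
     * IllegalStateException when the column family count differs.
     */
    static List<String> check(int presplit, int regions,
                              String splitPolicy, String currentPolicy,
                              int replicas, int currentReplicas,
                              int families, int currentFamilies) {
        // CF count mismatch: abort instead of recreating the table.
        if (families != currentFamilies) {
            throw new IllegalStateException("--families=" + families
                + ", but found " + currentFamilies + " column families");
        }
        List<String> warnings = new ArrayList<>();
        // Region count mismatch: only relevant when --presplit was given.
        if (presplit != DEFAULT_PRESPLIT && presplit != regions) {
            warnings.add("--presplit=" + presplit + ", but found " + regions + " regions");
        }
        // Split policy mismatch: null-safe comparison, since either side may be unset.
        if (!Objects.equals(splitPolicy, currentPolicy)) {
            warnings.add("--splitPolicy=" + splitPolicy
                + ", but current policy is " + currentPolicy);
        }
        // Replication mismatch: the read test can still run.
        if (replicas != currentReplicas) {
            warnings.add("--replicas=" + replicas + ", but found " + currentReplicas + " replicas");
        }
        return warnings;
    }

    public static void main(String[] args) {
        // Mirrors the second command in the Result section: 60 regions, 1 replica.
        System.out.println(check(30, 60, null, null, 2, 1, 1, 1));
    }
}
```

In this sketch the table is never dropped on the read path; recreating it is left entirely to write commands, which matches the premise stated in the issue.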