Ke Han created HBASE-28159:
------------------------------

             Summary: Unable to get table state error when table is being 
initialized
                 Key: HBASE-28159
                 URL: https://issues.apache.org/jira/browse/HBASE-28159
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 2.4.17
            Reporter: Ke Han
         Attachments: hbase--master-37bbb9b6f05a.log, persistent.tar.gz

When executing commands to create a table, I noticed the following ERROR in 
the HMaster log:
{code:java}
2023-10-17 06:41:47,118 ERROR [master/hmaster:16000.Chore.1] master.TableStateManager: Unable to get table uuidf68fb89ec7f4435597d69fb7b099d8e7 state
org.apache.hadoop.hbase.TableNotFoundException: No state found for uuidf68fb89ec7f4435597d69fb7b099d8e7
        at org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:155)
        at org.apache.hadoop.hbase.master.TableStateManager.isTableState(TableStateManager.java:92)
        at org.apache.hadoop.hbase.master.assignment.AssignmentManager.isTableDisabled(AssignmentManager.java:419)
        at org.apache.hadoop.hbase.master.assignment.AssignmentManager.getRegionStatesCount(AssignmentManager.java:2341)
        at org.apache.hadoop.hbase.master.HMaster.getClusterMetricsWithoutCoprocessor(HMaster.java:2616)
        at org.apache.hadoop.hbase.master.HMaster.getClusterMetricsWithoutCoprocessor(HMaster.java:2537)
        at org.apache.hadoop.hbase.master.balancer.ClusterStatusChore.chore(ClusterStatusChore.java:47)
        at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:158)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:107)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750){code}
h1. Reproduce

Because the error depends on thread interleaving, the following command 
sequence might need to be run multiple times to reproduce it.

1 HM, 2 RS, HDFS-2.10.2
{code:java}
create 'uuid49bb410e0a0c40ffb070d17787b4cad7', {NAME => 
'uuid66e57e5195e04956a78f789b2a25ec01', VERSIONS => 1, COMPRESSION => 'GZ', 
BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'true'}, {NAME => 
'uuid119181eed72a43ccb66fabe37f84d2c0', VERSIONS => 1, COMPRESSION => 'GZ', 
BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 
'uuidc2d4931eaf4c429db0e55514fb12e767', VERSIONS => 3, COMPRESSION => 'NONE', 
BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 
'uuidc9802bbfbe434411ae68bb8388d499b6', VERSIONS => 3, COMPRESSION => 'NONE', 
BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 
'uuidc85e117d0ca144719fc53d30b189a343', VERSIONS => 3, COMPRESSION => 'NONE', 
BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}
create 'uuid094dd5bf47eb47d69148b63e73ce0e7c', {NAME => 
'uuid76ccbd96fbdc418b95ed9971ff423b2d', VERSIONS => 1, COMPRESSION => 'GZ', 
BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME => 
'uuid36835d3faff04838bd02d6226557d7c8', VERSIONS => 1, COMPRESSION => 'GZ', 
BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}, {NAME => 
'uuid37752598d1bb405eb39a3e17c04d7e60', VERSIONS => 1, COMPRESSION => 'NONE', 
BLOOMFILTER => 'NONE', IN_MEMORY => 'false'}
create 'uuidf68fb89ec7f4435597d69fb7b099d8e7', {NAME => 
'uuidb235288b1d304fe1a62adb63968d9eee', VERSIONS => 1, COMPRESSION => 'NONE', 
BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}, {NAME => 
'uuidf348f8849e724b3fa231fc2bb459be2d', VERSIONS => 1, COMPRESSION => 'NONE', 
BLOOMFILTER => 'NONE', IN_MEMORY => 'true'}, {NAME => 
'uuid81341a87083e49d7a0d8aff7b1ccf16a', VERSIONS => 3, COMPRESSION => 'GZ', 
BLOOMFILTER => 'ROW', IN_MEMORY => 'false'}, {NAME => 
'uuid24db0d3c67c347d3a4c18af90facec2d', VERSIONS => 1, COMPRESSION => 'NONE', 
BLOOMFILTER => 'ROW', IN_MEMORY => 'true'}, {NAME => 
'uuid7ecf10315f444cfd9c5698695f9054d9', VERSIONS => 1, COMPRESSION => 'NONE', 
BLOOMFILTER => 'ROWCOL', IN_MEMORY => 'false'}
enable 'uuid094dd5bf47eb47d69148b63e73ce0e7c'
create_namespace 'uuidc1066f82d7834f698d335dd04fa7ad3e'
alter 'uuid094dd5bf47eb47d69148b63e73ce0e7c', {NAME => 'enaJvIGYBk', 
BLOOMFILTER => 'ROWCOL', IN_MEMORY => false}
disable 'uuidf68fb89ec7f4435597d69fb7b099d8e7' {code}
I have attached the full logs.
h1. Root Cause

The ERROR is logged because of a thread interleaving between (1) T1, the 
thread creating the table, and (2) T2, the Chore thread calculating 
TABLE_TO_REGIONS_COUNT.

Here is how it happens in detail:
 # The user issues a create table request, and T1 puts the table name into 
tableDescriptors.
 # The Chore thread (T2) calculates TABLE_TO_REGIONS_COUNT by iterating over all 
tables from {*}getTableDescriptors().getAll(){*}. This includes the table 
that is being created, whose table state has not been created yet.
 # T2 tries to fetch the state of that table, gets a TableNotFoundException, 
and logs the ERROR.
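
To make the ordering concrete, here is a minimal sketch that replays the interleaving deterministically. It is not HBase code: the class TableRaceSketch, its two maps, and its method names are hypothetical stand-ins for the descriptor store and the table state store, and the state value is a placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the two pieces of bookkeeping involved; not HBase code.
class TableRaceSketch {
    // An early step of table creation registers the descriptor here...
    final Map<String, String> descriptors = new ConcurrentHashMap<>();
    // ...and a later step records the table state here.
    final Map<String, String> states = new ConcurrentHashMap<>();

    // T1, first half: the table becomes visible to anyone listing descriptors.
    void registerDescriptor(String table) {
        descriptors.put(table, "descriptor");
    }

    // T1, second half: only now does a state lookup succeed.
    void recordState(String table) {
        states.put(table, "placeholder-state");
    }

    // T2 (the chore): walk every known descriptor and look up its state.
    // Returns the tables for which no state exists yet.
    List<String> choreScan() {
        List<String> missing = new ArrayList<>();
        for (String table : descriptors.keySet()) {
            if (!states.containsKey(table)) {
                missing.add(table);
            }
        }
        return missing;
    }
}
```

If choreScan() runs between registerDescriptor() and recordState(), the table is reported as having no state, which corresponds to the window in which the ERROR above is logged. Once recordState() has run, the same scan finds nothing missing.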

IMO, this is a normal and correct sequence of events that shouldn't produce an 
ERROR-level message. It could be avoided by properly handling the 
interleaving between table creation and the chore thread.
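
One possible shape for such handling, sketched under assumptions: the counting loop treats a missing table state as "table not fully created yet" and skips that table instead of raising an error. The class TolerantRegionCounter, its fields, and findState() are hypothetical and only model the idea. This is not the actual HBase fix.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of a chore that tolerates tables without a state.
class TolerantRegionCounter {
    enum State { ENABLED, DISABLED }

    // Tables known from descriptors, with an illustrative region count.
    final Map<String, Integer> regionsByTable = new ConcurrentHashMap<>();
    // Table states; an entry may lag behind the descriptor during creation.
    final Map<String, State> stateByTable = new ConcurrentHashMap<>();

    // Returns an empty Optional instead of throwing when no state exists.
    Optional<State> findState(String table) {
        return Optional.ofNullable(stateByTable.get(table));
    }

    // Counts regions only for tables whose state is already recorded.
    Map<String, Integer> countRegions() {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Integer> e : regionsByTable.entrySet()) {
            if (!findState(e.getKey()).isPresent()) {
                // Table is still being created: skip it quietly.
                // The next chore run will pick it up.
                continue;
            }
            counts.put(e.getKey(), e.getValue());
        }
        return counts;
    }
}
```

With this shape, a table that has a descriptor but no state is simply absent from one round of metrics, and no ERROR is logged for it. Whether skipping, logging at a lower level, or reordering the creation steps is the right choice for HBase is left open here.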

I am trying to fix it. Any help would be appreciated! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
