[ 
https://issues.apache.org/jira/browse/PHOENIX-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255293#comment-17255293
 ] 

Viraj Jasani commented on PHOENIX-6104:
---------------------------------------

[~stoty] I spent some time with this one but didn't realize that this Jira was 
already created.

I believe we are not using the right strategy to split SYSTEM.CATALOG 
synchronously. What we are doing is:
{code:java}
admin.split(fullTableName, splitPoint);
// make sure the split finishes (there's no synchronous splitting before
// HBase 2.x)
admin.disableTable(fullTableName);
admin.enableTable(fullTableName);

{code}
With HBase 2.3, the split is requested asynchronously, and while the 
SplitTableProcedure is still executing we ask Admin to disable the table. 
This seems problematic: it causes an NPE while retrieving the RegionNode's 
location when unassigning the region:
{code:java}
2020-12-26 14:17:18,043 ERROR [PEWorker-13] org.apache.hadoop.hbase.procedure2.ProcedureExecutor(1688): CODE-BUG: Uncaught runtime exception: pid=125, ppid=119, state=RUNNABLE:REGION_STATE_TRANSITION_CLOSE, locked=true; TransitRegionStateProcedure table=SYSTEM.CATALOG, region=62da70c1cc98a8e5e0dd93cd7abce3a8, UNASSIGN
java.lang.NullPointerException
        at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
        at org.apache.hadoop.hbase.master.assignment.RegionStates.getOrCreateServer(RegionStates.java:742)
        at org.apache.hadoop.hbase.master.assignment.RegionStates.addRegionToServer(RegionStates.java:777)
        at org.apache.hadoop.hbase.master.assignment.AssignmentManager.regionClosing(AssignmentManager.java:1807)
        at org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.closeRegion(TransitRegionStateProcedure.java:267)

{code}
I was thinking that, instead of disabling and enabling SYSTEM.CATALOG, we 
should rather wait for the table to be split, for instance the way we do in 
TableSnapshotReadsMapReduceIT.splitTableSync(). Maybe we can move that method 
to BaseTest. Thoughts?
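
To make the "wait for the split" idea concrete, here is a minimal sketch of a polling helper. It is not the code of splitTableSync(), which is not shown here; the class name SplitWait, the method waitForRegionCount, and its parameters are hypothetical. The region count is passed in as a supplier so the sketch runs without a cluster; in a real test it would presumably be backed by a region count read through the HBase Admin API, with checked exceptions handled by the caller.

```java
import java.util.function.IntSupplier;

// Hypothetical helper, for illustration only: polls until a split is visible
// instead of forcing it with disableTable()/enableTable().
final class SplitWait {

    /**
     * Polls regionCount until it reports at least expectedRegions, or until
     * timeoutMs elapses. Returns true if the expected count was observed.
     */
    static boolean waitForRegionCount(IntSupplier regionCount, int expectedRegions,
                                      long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (regionCount.getAsInt() >= expectedRegions) {
                return true;   // split has completed from the client's point of view
            }
            if (System.nanoTime() >= deadline) {
                return false;  // give up; the caller decides whether to fail the test
            }
            Thread.sleep(pollMs);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated table: reports 1 region for the first two polls, then 2.
        int[] polls = {0};
        boolean done = waitForRegionCount(() -> ++polls[0] >= 3 ? 2 : 1, 2, 5_000, 10);
        System.out.println("split observed: " + done + " after " + polls[0] + " polls");
    }
}
```

The design point is that the test only observes the table state and never issues a disable while the split procedure may still be running, which is the sequence that appears to trigger the NPE above.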

> SplitSystemCatalogIT tests very unstable with Hbase 2.3
> -------------------------------------------------------
>
>                 Key: PHOENIX-6104
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6104
>             Project: Phoenix
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5.1.0
>            Reporter: Istvan Toth
>            Assignee: Istvan Toth
>            Priority: Major
>         Attachments: 6104-testouput.log
>
>
> The failure is in the test preparation code, where we split the system 
> catalog table, and it seems to be a HBase issue, rather than a Phoenix one, 
> but we need to track the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
