[ https://issues.apache.org/jira/browse/HBASE-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597984#action_12597984 ]
Michaela Buergle commented on HBASE-627:
----------------------------------------
After some more debugging, I now have a clearer idea of what happens after
ServerManager, during its load balancing, receives MSG_REPORT_CLOSE for
region xy from node1:
- RegionManager assigns region xy to node2 and sets it to "unassigned" (towards
the end of assignRegionsToMultipleServers())
- At this point, HBaseAdmin.disableTable is called. In .META., region xy still
has info:server node1, so region xy is marked as beingServed by node1 in
ProcessTableOperation.call() and thus added to the local kill list of node1
- Now we wait for region xy to go offline. In the meantime, however, region xy
opens on node2. .META. changes, and region xy now has info:server node2
So there is a short period during which the information in .META. is not
consistent with the actual state of regions. But disableTable() relies on the
information found in .META. How could this best be solved? It looks like quite
a fundamental problem to me.
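To make the sequence above easier to follow, here is a minimal, self-contained
Java sketch of the race. It is not HBase code: the class StaleMetaRace and all
of its methods and maps (meta, killLists, online) are hypothetical stand-ins
for .META., the per-server kill lists and the regions each node actually
serves. It only assumes what is described above, namely that a reported close
does not update info:server and that disable picks its target server from
.META.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the race described above. All names are hypothetical.
public class StaleMetaRace {
    // Stand-in for .META.: region name -> info:server value.
    final Map<String, String> meta = new HashMap<>();
    // Stand-in for the kill lists built by the disable operation.
    final Map<String, Set<String>> killLists = new HashMap<>();
    // Regions each server is actually serving right now.
    final Map<String, Set<String>> online = new HashMap<>();

    // A server opens a region; .META. is updated to point at that server.
    void open(String server, String region) {
        online.computeIfAbsent(server, s -> new HashSet<>()).add(region);
        meta.put(region, server);
    }

    // A server reports a close; assumption: .META. keeps the old info:server.
    void reportClose(String server, String region) {
        online.getOrDefault(server, new HashSet<>()).remove(region);
    }

    // Disable trusts .META. and queues the region on that server's kill list.
    void disable(String region) {
        String server = meta.get(region);
        if (server != null) {
            killLists.computeIfAbsent(server, s -> new HashSet<>()).add(region);
        }
    }

    // A server closes whatever is on its kill list (if it serves it at all).
    void processKillList(String server) {
        Set<String> toClose = killLists.remove(server);
        Set<String> served = online.get(server);
        if (toClose != null && served != null) {
            served.removeAll(toClose);
        }
    }

    boolean isOnlineAnywhere(String region) {
        for (Set<String> served : online.values()) {
            if (served.contains(region)) {
                return true;
            }
        }
        return false;
    }

    // Replays the steps from the comment; returns whether xy is still online.
    static boolean runScenario() {
        StaleMetaRace cluster = new StaleMetaRace();
        cluster.open("node1", "xy");        // initial state
        cluster.reportClose("node1", "xy"); // load balancer moved xy off node1
        cluster.disable("xy");              // .META. still says node1
        cluster.open("node2", "xy");        // xy opens on node2 in the meantime
        cluster.processKillList("node1");   // node1 has nothing to close
        return cluster.isOnlineAnywhere("xy");
    }

    public static void main(String[] args) {
        System.out.println("region xy still online after disable: " + runScenario());
    }
}
```

In this model the kill list for node1 is built from stale information, so
nothing ever asks node2 to close region xy and the region stays online, which
matches the behaviour I see when disableTable() gives up waiting.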
> Disable table doesn't work reliably
> -----------------------------------
>
> Key: HBASE-627
> URL: https://issues.apache.org/jira/browse/HBASE-627
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.2.0
> Environment: Hadoop/HBase on two nodes
> Reporter: Michaela Buergle
> Priority: Critical
> Fix For: 0.2.0
>
> Attachments: disableTable31.log, disableTable5.log
>
>
> When creating a couple of tables like this:
> 1) create an empty table
> 2) disable table, add new column family, enable table
> 3) put 100 small documents into newly created column
> around once in 10 tries the disable doesn't happen.
> I have no clue as to why the table isn't disabled in the first place, but if
> this occurs, two things in HBaseAdmin.disableTable() strike me as odd:
> - after numRetries tries to wait for disabling we exit the loop; there is no
> exception or error message:
> ...
> 2008-05-14 16:19:47,903 INFO org.apache.hadoop.hbase.client.HBaseAdmin:
> Disabled table table31
> 2008-05-14 16:19:47,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 3 on 60000, call addColumn(table31, {name: document, max versions: 3,
> compression: NONE, in memory: false, block cache enabled: false, max length:
> 2147483647, time to live: FOREVER, bloom filter: none}) from
> XXX.XX.40.36:47116: error: org.apache.hadoop.hbase.TableNotDisabledException:
> table31
> ...
> - the scanner iterates over HRegionInfos of several tables. If any one of
> those is disabled, we also leave the loop as if the requested table had been
> disabled.
> I've had this disabling problem occur quite reliably over the last few days.
> Today I couldn't reproduce it, though the HBase version hasn't changed.