[
https://issues.apache.org/jira/browse/HBASE-21863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765570#comment-16765570
]
stack commented on HBASE-21863:
-------------------------------
bq. stack can you elaborate on extra states from deadline?
If you are asking why we do not time out AMv2 commands, it's because the
spec is, as [~Apache9] has noted a few times, that a command is either ack'd,
error'd, or we get an SCP. There is merit in our spec being this basic, at
least in a first version of AMv2. If all calls can now also time out, then
every command needs timeout handling and cancellation messaging: more
possible states, more moving parts.
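A minimal sketch of the three-outcome spec described above, in Java. This is not AMv2 code: the `Outcome` and `Action` enums and the `onOutcome` method are hypothetical names made up for illustration, and the follow-up actions are assumptions. It only shows that with three terminal outcomes the master side needs three branches; a timeout would add a fourth outcome plus a cancel path for every command.

```java
public class CommandOutcomeSketch {
    /** Hypothetical terminal outcomes of a master-to-regionserver command. */
    enum Outcome { ACKED, ERRORED, SERVER_CRASHED }

    /** Hypothetical follow-up actions on the master side (assumed, not from HBase). */
    enum Action { MARK_DONE, HANDLE_ERROR, HAND_OFF_TO_SCP }

    /** Maps each terminal outcome to exactly one follow-up action. */
    static Action onOutcome(Outcome outcome) {
        switch (outcome) {
            case ACKED:
                return Action.MARK_DONE;
            case ERRORED:
                return Action.HANDLE_ERROR;
            case SERVER_CRASHED:
                // The crash procedure takes over; the command itself never times out.
                return Action.HAND_OFF_TO_SCP;
            default:
                throw new IllegalStateException("unknown outcome: " + outcome);
        }
    }

    public static void main(String[] args) {
        for (Outcome o : Outcome.values()) {
            System.out.println(o + " -> " + onOutcome(o));
        }
    }
}
```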
bq. If the message did expire (master is no longer waiting), we avoid doing
something master doesn't expect. If it doesn't expire and we respond with
error, it happens before any work, so the master will just handle it like a
regular error. It's not ideal but should be rare and doesn't add new states.
... what's the thing the master doesn't expect? Pardon me, I'm having trouble
understanding the above paragraph (bad context-switch). Say more, please.
On killing an RS if it reports it has a Region it shouldn't have, yeah, makes
sense.
I like your idea of versioning commands/rpcs so we know when we can safely
ignore reports.
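One way the versioning idea could look, as a rough Java sketch. Everything here is an assumption for illustration: the class, the per-region counter, and the `Verdict` values are hypothetical and not taken from the attached patches or from HBase. The master stamps each command with an increasing per-region version, and a report carrying an older version than the latest issued command is treated as stale and ignored.

```java
import java.util.HashMap;
import java.util.Map;

public class VersionedReportSketch {
    /** Hypothetical verdicts for an incoming region report. */
    enum Verdict { ACCEPT, IGNORE_STALE }

    // Latest command version issued per region (hypothetical bookkeeping).
    private final Map<String, Long> latestVersion = new HashMap<>();

    /** Issues a new command for the region and returns its version (1, 2, 3, ...). */
    long issueCommand(String region) {
        return latestVersion.merge(region, 1L, Long::sum);
    }

    /** A report tagged with an older version than the latest command is stale. */
    Verdict onReport(String region, long version) {
        long current = latestVersion.getOrDefault(region, 0L);
        return version < current ? Verdict.IGNORE_STALE : Verdict.ACCEPT;
    }

    public static void main(String[] args) {
        VersionedReportSketch master = new VersionedReportSketch();
        long first = master.issueCommand("region-a");
        long second = master.issueCommand("region-a");
        System.out.println(master.onReport("region-a", first));
        System.out.println(master.onReport("region-a", second));
    }
}
```

In this sketch, a report answering the first command arrives after a second command was issued, so it is ignored; only a report carrying the current version is acted on.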
> narrow down the double-assignment race window
> ---------------------------------------------
>
> Key: HBASE-21863
> URL: https://issues.apache.org/jira/browse/HBASE-21863
> Project: HBase
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: HBASE-21863.01.patch, HBASE-21863.patch
>
>
> See HBASE-21862.