[ 
https://issues.apache.org/jira/browse/PHOENIX-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15521542#comment-15521542
 ] 

James Taylor commented on PHOENIX-3326:
---------------------------------------

Would you be able to point us to an example of implementing an atomic lock that 
is ephemeral, [~apurtell]? I agree that it would be a better implementation. I 
think that if ZooKeeper isn't available, we likely wouldn't be able to proceed 
with the upgrade anyway. Is it OK for an HBase client to communicate directly 
with ZooKeeper? I wouldn't want to put new requirements on the client, such as 
opening new ports. But I suppose the client is already communicating with 
ZooKeeper?
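
To make the "ephemeral atomic lock" idea concrete, here is a minimal in-memory 
sketch. It is not the ZooKeeper client API: the class EphemeralLockSketch, its 
methods, and the lock path are hypothetical names used only for illustration. 
With a real ZooKeeper ensemble the closest equivalent would be creating a znode 
with CreateMode.EPHEMERAL, which the server removes when the owning session 
ends.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for a coordination service; NOT the ZooKeeper client API. */
class EphemeralLockSketch {
    // Lock path -> id of the session that currently owns the lock.
    private final Map<String, Long> owners = new ConcurrentHashMap<>();

    /** Atomically acquires the lock; returns false if any session already holds it. */
    boolean tryAcquire(String path, long sessionId) {
        return owners.putIfAbsent(path, sessionId) == null;
    }

    /** Explicit release; succeeds only when called by the owning session. */
    boolean release(String path, long sessionId) {
        return owners.remove(path, sessionId);
    }

    /** Models session expiry (client crash or disconnect): drops every lock it held. */
    void expireSession(long sessionId) {
        owners.values().removeIf(owner -> owner == sessionId);
    }

    public static void main(String[] args) {
        EphemeralLockSketch service = new EphemeralLockSketch();
        String path = "/phoenix/upgrade-mutex"; // hypothetical path
        System.out.println("client 1 acquires: " + service.tryAcquire(path, 1L));
        System.out.println("client 2 acquires: " + service.tryAcquire(path, 2L));
        service.expireSession(1L); // client 1 dies without releasing
        System.out.println("client 2 retries:  " + service.tryAcquire(path, 2L));
    }
}
```

The point of the ephemeral variant is the expireSession step: a client that 
dies mid-upgrade does not leave the lock behind, so no manual cleanup of a 
stale mutex row is needed.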

> Restoring SYSTEM.CATALOG from snapshot causes clients to run into 
> UpgradeInProgressException
> --------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3326
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3326
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>         Attachments: PHOENIX-3326_4.8-HBase-0.98.patch, 
> PHOENIX-3326_4.8-HBase-0.98_v2.patch, PHOENIX-3326_wip.patch
>
>
> We create a snapshot of the SYSTEM.CATALOG table only after the client is 
> able to successfully acquire a distributed mutex of sorts. This means the 
> snapshot also ends up containing the row that serves as the mutex. Now when 
> restoring the table from the snapshot, this row is still present, which causes 
> clients to throw UpgradeInProgressException. 
> I can think of a couple of ways to fix this:
> 1) Do the checkAndPut for the UPGRADE_MUTEX after creating the snapshot. I am 
> not too sure, though, how HBase handles concurrent snapshot requests. Do 
> clients get an exception? Also, we could potentially end up creating more 
> snapshots than we really need to. 
> 2) Do the checkAndPut for the UPGRADE_MUTEX in a different table (possibly 
> SYSTEM.SEQUENCE). This way the restored snapshot won't have the row. We would 
> need to delete the row from SYSTEM.SEQUENCE after the upgrade is done 
> (successfully or unsuccessfully).
> [~jamestaylor] - WDYT? 
> FYI, [~lhofhansl] - this is probably a blocker for 4.8.1
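
The problem described above, and why option 2 avoids it, can be sketched with 
plain maps standing in for tables. This is an illustrative model only, not 
HBase or Phoenix code: SnapshotMutexSketch and its methods are hypothetical, 
putIfAbsent stands in for checkAndPut, and copying a map stands in for taking 
and restoring a snapshot.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model: Map = table, map copy = snapshot, putIfAbsent = checkAndPut. */
class SnapshotMutexSketch {
    static final String UPGRADE_MUTEX = "UPGRADE_MUTEX";

    /** Writes the mutex row only if it is absent; true means the lock was acquired. */
    static boolean acquireMutex(Map<String, String> table) {
        return table.putIfAbsent(UPGRADE_MUTEX, "locked") == null;
    }

    /** Reported behavior: mutex row lives in the catalog and is captured by the snapshot. */
    static Map<String, String> restoreWithMutexInCatalog(Map<String, String> catalog) {
        acquireMutex(catalog);
        Map<String, String> snapshot = new HashMap<>(catalog); // taken while locked
        catalog.remove(UPGRADE_MUTEX);                         // upgrade ends, lock released
        return new HashMap<>(snapshot);                        // restore brings the row back
    }

    /** Option 2: mutex row lives in a separate table, so the snapshot never sees it. */
    static Map<String, String> restoreWithMutexElsewhere(Map<String, String> catalog,
                                                         Map<String, String> lockTable) {
        acquireMutex(lockTable);
        Map<String, String> snapshot = new HashMap<>(catalog);
        lockTable.remove(UPGRADE_MUTEX); // must happen whether the upgrade succeeds or fails
        return new HashMap<>(snapshot);
    }

    public static void main(String[] args) {
        Map<String, String> catalog = new HashMap<>();
        catalog.put("T1", "table metadata");
        System.out.println("stale mutex after restore (same table): "
            + restoreWithMutexInCatalog(new HashMap<>(catalog)).containsKey(UPGRADE_MUTEX));
        System.out.println("stale mutex after restore (separate table): "
            + restoreWithMutexElsewhere(new HashMap<>(catalog), new HashMap<>())
                .containsKey(UPGRADE_MUTEX));
    }
}
```

In the first method the restored table still contains the mutex row, which is 
the state that would make a client believe an upgrade is in progress. In the 
second, the restored catalog is clean, at the cost of having to remove the row 
from the other table explicitly.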



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
