[ 
https://issues.apache.org/jira/browse/PHOENIX-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15521594#comment-15521594
 ] 

James Taylor commented on PHOENIX-3326:
---------------------------------------

We can't put the cell we're using for the lock on the SYSTEM.CATALOG for the 
reasons already mentioned here: 
https://issues.apache.org/jira/browse/PHOENIX-3326?focusedCommentId=15519226&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15519226

I definitely wouldn't want to introduce a new dependency. I think we can leave 
the RC as it is and fix this in 4.9. It's not causing any harm.

How about if we did the mutex using a new coprocessor method on our 
MetaDataEndpoint coprocessor which is installed on SYSTEM.CATALOG? We could 
likely even do that without involving zk. Maybe a row lock on the row in the 
SYSTEM.CATALOG representing the SYSTEM.CATALOG? We could make this change in 
the 4.x branches.

> Restoring SYSTEM.CATALOG from snapshot causes clients to run into 
> UpgradeInProgressException
> --------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3326
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3326
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>         Attachments: PHOENIX-3326_4.8-HBase-0.98.patch, 
> PHOENIX-3326_4.8-HBase-0.98_v2.patch, PHOENIX-3326_wip.patch
>
>
> We create a snapshot of the SYSTEM.CATALOG table only after the client is 
> able to successfully acquire a distributed mutex of sorts. This means the 
> snapshot also ends up containing the row that serves as the mutex. Now when 
> restoring the table from the snapshot, this row is still present, which causes 
> clients to throw UpgradeInProgressException.
> I can think of a couple of ways to fix this:
> 1) Do the checkAndPut for the UPGRADE_MUTEX after creating the snapshot. I am 
> not too sure, though, how HBase handles concurrent snapshot requests. Do 
> clients get an exception? Also, we could potentially end up creating more 
> snapshots than we really need to.
> 2) Do the checkAndPut for the UPGRADE_MUTEX in a different table (possibly 
> SYSTEM.SEQUENCE). This way the restored snapshot won't have the row. We would 
> need to delete the row from SYSTEM.SEQUENCE after the upgrade is done 
> (successfully or unsuccessfully).
> [~jamestaylor] - WDYT? 
> FYI, [~lhofhansl] - this is probably a blocker for 4.8.1
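
To make the failure mode in the description concrete, here is a minimal in-memory sketch. It does not use the HBase or Phoenix APIs: `FakeTable`, its `putIfAbsent` method (standing in for a checkAndPut that expects the row to be absent), and the `UPGRADE_MUTEX` row key are all hypothetical names chosen for illustration. It only shows why a snapshot taken while the mutex row lives in the same table brings the mutex back on restore, and why keeping the mutex in a different table (option 2) avoids that.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class UpgradeMutexSketch {
    // Hypothetical row key; the real key layout is not shown in this thread.
    static final String MUTEX_ROW = "UPGRADE_MUTEX";

    /** In-memory stand-in for a table. This is NOT the HBase client API. */
    static final class FakeTable {
        private final Map<String, String> rows = new ConcurrentHashMap<>();

        /** Mimics a checkAndPut that succeeds only if the row does not exist yet. */
        boolean putIfAbsent(String rowKey, String value) {
            return rows.putIfAbsent(rowKey, value) == null;
        }

        void delete(String rowKey) {
            rows.remove(rowKey);
        }

        boolean contains(String rowKey) {
            return rows.containsKey(rowKey);
        }

        /** Copies the current rows, like a point-in-time snapshot. */
        Map<String, String> snapshot() {
            return new HashMap<>(rows);
        }

        /** Replaces the table contents with the snapshot contents. */
        void restore(Map<String, String> snapshot) {
            rows.clear();
            rows.putAll(snapshot);
        }
    }

    /** Mutex row stored in the catalog itself: the snapshot captures it. */
    static boolean staleMutexAfterRestoreSameTable() {
        FakeTable catalog = new FakeTable();
        catalog.putIfAbsent(MUTEX_ROW, "locked");        // acquire the mutex
        Map<String, String> snap = catalog.snapshot();   // snapshot includes the mutex row
        catalog.delete(MUTEX_ROW);                       // release after the upgrade attempt
        catalog.restore(snap);                           // restore brings the mutex row back
        return catalog.contains(MUTEX_ROW);
    }

    /** Mutex row stored in a separate table: the catalog snapshot never sees it. */
    static boolean staleMutexAfterRestoreSeparateTable() {
        FakeTable catalog = new FakeTable();
        FakeTable lockTable = new FakeTable();
        lockTable.putIfAbsent(MUTEX_ROW, "locked");      // acquire the mutex elsewhere
        Map<String, String> snap = catalog.snapshot();   // snapshot has no mutex row
        lockTable.delete(MUTEX_ROW);                     // release after the upgrade attempt
        catalog.restore(snap);
        return catalog.contains(MUTEX_ROW) || lockTable.contains(MUTEX_ROW);
    }

    public static void main(String[] args) {
        System.out.println("same table, stale mutex: " + staleMutexAfterRestoreSameTable());
        System.out.println("separate table, stale mutex: " + staleMutexAfterRestoreSeparateTable());
    }
}
```

In the first case a client that checks for the mutex row after the restore would conclude an upgrade is still in progress; in the second case no stale row remains. Whether the mutex ends up in another table, behind a coprocessor method, or as a row lock is the open question in this thread, so the sketch takes no position on that.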



