[
https://issues.apache.org/jira/browse/PHOENIX-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236488#comment-17236488
]
Chinmay Kulkarni edited comment on PHOENIX-6086 at 11/20/20, 10:42 PM:
-----------------------------------------------------------------------
I created this Jira to extend safety during upgrades to all SYSTEM tables, so
that we now take a snapshot of each one. I just realized that we also
extended the restore-snapshot logic to all SYSTEM tables in case of an
exception during EXECUTE UPGRADE (see
[this|https://github.com/apache/phoenix/blob/0d0e86e7ba63d20e92d4e8c03259344a958dbcd1/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L3940-L3944]).
Thinking about this a bit more, there are several downsides to
automatically restoring from the SYSTEM table snapshots, such as:
# Any DDLs issued since the upgrade began would be lost when we restore the
snapshot of SYSTEM.CATALOG
# If we encounter an issue when upgrading any SYSTEM table, we would end up
restoring the snapshots of all of them (I don't think there is necessarily a
better way to handle this, since we can't just restore the snapshot for the
table whose upgrade failed, or we'd be in a weird mixed metadata state)
# The point above also means that SYSTEM.SEQUENCE would be restored and that
would break(?) sequences issued during this time.
I wanted to get your thoughts on how to handle this [~gjacoby] [~kadir]
[~jisaac] [~vjasani] [~yanxinyi] [~sukumaddineni]. Since we currently don't log
DDLs issued during the upgrade path, and because of the problem with sequences,
I think that for now it may be safer to just keep the snapshots around and let
the operator decide how to handle the upgrade failure, rather than blindly
forcing a restore from snapshots. What do you guys think?
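To make the two options concrete, here is a minimal Java sketch of the choice being discussed: snapshot every SYSTEM table that exists, run the upgrade, and on failure either restore everything or leave the snapshots for the operator. All names in it (UpgradeSnapshotSketch, SnapshotOps, FailurePolicy, InMemoryOps, the SNAPSHOT_ prefix) are hypothetical and are not Phoenix or HBase APIs. The in-memory fake only stands in for real snapshot operations, and the sketch swallows the failure instead of rethrowing it, which real code would likely not do.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: none of these types are Phoenix or HBase APIs. */
final class UpgradeSnapshotSketch {

    /** What to do with the pre-upgrade snapshots if the upgrade throws. */
    enum FailurePolicy { AUTO_RESTORE, KEEP_FOR_OPERATOR }

    /** Hypothetical stand-in for the table and snapshot operations of an admin client. */
    interface SnapshotOps {
        boolean tableExists(String table);
        void snapshot(String snapshotName, String table);
        void restore(String snapshotName, String table);
        void dropTable(String table);
    }

    /** Outcome of one upgrade attempt. */
    static final class Result {
        final boolean upgraded;
        final boolean restored;
        final Map<String, String> snapshots; // table name -> snapshot name, left in place

        Result(boolean upgraded, boolean restored, Map<String, String> snapshots) {
            this.upgraded = upgraded;
            this.restored = restored;
            this.snapshots = snapshots;
        }
    }

    static Result runUpgrade(SnapshotOps ops, List<String> systemTables,
                             Runnable upgrade, FailurePolicy policy) {
        Map<String, String> snapshots = new LinkedHashMap<>();
        List<String> missingBefore = new ArrayList<>();
        for (String table : systemTables) {
            if (ops.tableExists(table)) {
                String name = "SNAPSHOT_" + table.replace('.', '_'); // hypothetical naming
                ops.snapshot(name, table);
                snapshots.put(table, name);
            } else {
                missingBefore.add(table); // would be created by the upgrade itself
            }
        }
        try {
            upgrade.run();
            return new Result(true, false, snapshots);
        } catch (RuntimeException failure) {
            if (policy == FailurePolicy.KEEP_FOR_OPERATOR) {
                // Touch nothing: tables stay as the failed upgrade left them and the
                // snapshots stay available for a manual decision.
                return new Result(false, false, snapshots);
            }
            // AUTO_RESTORE: roll back all tables together to avoid a mixed metadata state.
            for (Map.Entry<String, String> entry : snapshots.entrySet()) {
                ops.restore(entry.getValue(), entry.getKey());
            }
            // Drop tables that did not exist before, so a retry can start afresh.
            for (String table : missingBefore) {
                if (ops.tableExists(table)) {
                    ops.dropTable(table);
                }
            }
            return new Result(false, true, snapshots);
        }
    }

    /** In-memory fake so the sketch can be exercised without a cluster. */
    static final class InMemoryOps implements SnapshotOps {
        final Map<String, String> tables = new LinkedHashMap<>();    // table -> contents
        final Map<String, String> snapshots = new LinkedHashMap<>(); // snapshot -> contents

        public boolean tableExists(String table) { return tables.containsKey(table); }
        public void snapshot(String name, String table) { snapshots.put(name, tables.get(table)); }
        public void restore(String name, String table) { tables.put(table, snapshots.get(name)); }
        public void dropTable(String table) { tables.remove(table); }
    }
}
```

With KEEP_FOR_OPERATOR, a failed upgrade leaves the partially upgraded tables and all snapshots in place, so DDLs and sequence changes made during the upgrade are not silently rolled back. With AUTO_RESTORE, every table goes back to its snapshot, including SYSTEM.CATALOG and SYSTEM.SEQUENCE, which is where the downsides listed above come from.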
was (Author: ckulkarni):
[~vjasani] I created this Jira to extend safety during upgrades for all SYSTEM
tables, so that we now would take a snapshot of each one. I just realized that
we also then extended the restore-snapshot logic to all SYSTEM tables in case
of an exception during EXECUTE UPGRADE (see
[this|https://github.com/apache/phoenix/blob/0d0e86e7ba63d20e92d4e8c03259344a958dbcd1/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L3940-L3944]).
Thinking about this a little bit, there can be various downsides to
automatically restoring from the SYSTEM table snapshot, such as:
# Any DDLs issued since the upgrade began would be lost when we restore the
snapshot of SYSTEM.CATALOG
# If we encounter an issue when upgrading any SYSTEM table, we would end up
restoring the snapshots of all of them (I don't think there is necessarily a
better way to handle this since we can't just restore the snapshot for the
table whose upgrade failed or we'd be in a weird mixed metadata state)
# The point above also means that SYSTEM.SEQUENCE would be restored and that
would break(?) sequences issued during this time.
I wanted to get your thoughts on how to handle this [~gjacoby] [~kadir]
[~jisaac]. Since we currently don't log DDLs issued during the upgrade path and
because of the problem with sequences I think for now, maybe it is safer to
just keep the snapshots around and allow the operator to decide how to handle
the upgrade failure rather than blindly forcing a restore from snapshots. What
do you guys think?
> Take a snapshot of all SYSTEM tables before attempting to upgrade them
> ----------------------------------------------------------------------
>
> Key: PHOENIX-6086
> URL: https://issues.apache.org/jira/browse/PHOENIX-6086
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 5.0.0, 4.15.0
> Reporter: Chinmay Kulkarni
> Assignee: Viraj Jasani
> Priority: Critical
> Fix For: 5.1.0, 4.16.0
>
> Attachments: PHOENIX-6086.4.x.000.patch,
> PHOENIX-6086.master.000.patch, PHOENIX-6086.master.002.patch,
> PHOENIX-6086.master.003.patch
>
>
> Currently we only take a snapshot of SYSTEM.CATALOG before attempting to
> upgrade it (see
> [this|https://github.com/apache/phoenix/blob/1922895dfe5960dc025709b04acfaf974d3959dc/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L3718]).
> From 4.15 onwards we also store critical metadata in other SYSTEM tables
> like SYSTEM.CHILD_LINK, so from now on it would be beneficial to snapshot
> those tables as well before upgrading them.
> We also currently don't take a snapshot of SYSTEM.CATALOG on receiving an
> [UpgradeRequiredException|https://github.com/apache/phoenix/blob/1922895dfe5960dc025709b04acfaf974d3959dc/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L3685-L3707]
> which we should do.
> In case of any errors during the upgrade, we restore SYSTEM.CATALOG from this
> snapshot, and we should extend this to all SYSTEM tables. In cases where a
> table didn't exist before the upgrade, we need to ensure it is dropped so
> that a subsequent upgrade attempt can start afresh.