[
https://issues.apache.org/jira/browse/HBASE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440521#comment-17440521
]
Szabolcs Bukros commented on HBASE-26286:
-----------------------------------------
[~zhangduo]
{quote}
IIRC, our decision on HBASE-26280 is that, a snapshot will be constructed by
plain HFiles, you always need to list the directory to get all the HFiles, so
I'm a bit confusing that why here we say 'snapshot with file based SFT'. Did I
miss something?
{quote}
If I understand correctly, that discussion was about tracker files and concluded
that they should not be added to snapshots, because the list of hfiles in the
snapshot is always the full and correct list of storefiles, so the tracker files
can be rebuilt if necessary. The TableDescriptor, however, can contain SFT
config at both the table and cf level if the global SFT config was overridden.
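
To make the layering concrete, here is a minimal sketch of how the effective SFT for one cf could be resolved, assuming the usual precedence of cf level over table level over global config. The class, the plain maps standing in for the configuration objects, and the DEFAULT/FILE values are illustrative assumptions; the key name is what I believe HBase uses, but treat it as an assumption too.

```java
import java.util.Map;

final class EffectiveTracker {
  // Assumed config key for the tracker implementation.
  static final String TRACKER_IMPL = "hbase.store.file-tracker.impl";

  // Resolve the tracker for one column family: the cf-level setting wins,
  // then the table-level setting, then the global config, then "DEFAULT".
  static String resolve(Map<String, String> global,
                        Map<String, String> table,
                        Map<String, String> cf) {
    if (cf.containsKey(TRACKER_IMPL)) {
      return cf.get(TRACKER_IMPL);
    }
    if (table.containsKey(TRACKER_IMPL)) {
      return table.get(TRACKER_IMPL);
    }
    return global.getOrDefault(TRACKER_IMPL, "DEFAULT");
  }

  public static void main(String[] args) {
    // Global config is untouched, but the cf overrides the tracker.
    String effective = resolve(Map.of(), Map.of(), Map.of(TRACKER_IMPL, "FILE"));
    System.out.println("effective tracker: " + effective);
  }
}
```

This is why a snapshot's TableDescriptor can carry an SFT choice even when the cluster-wide config still uses the default.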
{quote}
Could someone explain them for me? It seems that one of them is for creating a
new table and another one is for performing on an existing table?
{quote}
Clone creates a new table with the provided name and the TableDescriptor from
the snapshot metadata. So we can freely change the SFT implementation we would
like to use, because we can just override the TableDescriptor and the new table
will be created with it.
Restore tries to restore the state of an existing table to match the snapshot.
To achieve this it deletes regions and/or hfiles that are present in the current
table but not in the snapshot, copies regions and/or hfiles that are missing
from the current table out of the snapshot and, most importantly for us, simply
overwrites the current TableDescriptor with the one from the snapshot.
This last step causes the problems.
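
The file-level part of restore boils down to two set differences. A minimal sketch, with a hypothetical RestorePlan class and hfile names as plain strings (this is not the actual restore helper code):

```java
import java.util.Set;
import java.util.TreeSet;

final class RestorePlan {
  final Set<String> toDelete; // in the current table, not in the snapshot
  final Set<String> toCopy;   // in the snapshot, missing from the current table

  RestorePlan(Set<String> currentFiles, Set<String> snapshotFiles) {
    toDelete = new TreeSet<>(currentFiles);
    toDelete.removeAll(snapshotFiles);
    toCopy = new TreeSet<>(snapshotFiles);
    toCopy.removeAll(currentFiles);
  }

  public static void main(String[] args) {
    RestorePlan plan = new RestorePlan(
        Set.of("hfile-a", "hfile-b"),
        Set.of("hfile-b", "hfile-c"));
    System.out.println("delete " + plan.toDelete + ", copy " + plan.toCopy);
  }
}
```

Note that files present on both sides (hfile-b here) are never touched, and nothing in this plan knows about tracker files.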
* Consider a use case where a cf uses file based SFT at the time of the
snapshot, while the global config is still the default SFT. Later on we migrate
the cf back to the default SFT. Then we have to restore the snapshot. The
process overwrites the TableDescriptor with the one from the snapshot and
suddenly the cf tries to use file based SFT (since that is what it used when the
snapshot was taken), but because there is no actual SFT migration as part of the
restore process, the cf folder does not have tracker files and SFT fails. This
is a bug in the current implementation.
* Specifying the SFT for restore has its own issues. Consider a use case where
the global SFT config uses the default. We restore a table and specify that we
would like to use file based SFT instead. There will be regions that exist in
the current table and also existed at the time of the snapshot. A few hfiles
might get added/deleted, but otherwise these regions remain untouched.
Forcefully setting the SFT to file based as specified is possible, but there is
no logic that would do the migration and build the tracker files, so the SFT
would fail. Similarly, switching back to default (from a file based SFT) is
possible, but restore lacks the logic to clean up the tracker files.
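
Both cases can be modelled in a few lines. This is a toy model under stated assumptions, not HBase code: a cf is reduced to its configured tracker plus a flag for whether tracker files exist on disk, and all names are hypothetical.

```java
final class RestoreMismatch {
  enum Tracker { DEFAULT, FILE }

  // State of one column family: the configured tracker and whether
  // tracker files are actually present in the cf folder.
  static final class CfState {
    final Tracker configured;
    final boolean hasTrackerFiles;

    CfState(Tracker configured, boolean hasTrackerFiles) {
      this.configured = configured;
      this.hasTrackerFiles = hasTrackerFiles;
    }
  }

  // Models restore as described above: the tracker config is replaced,
  // but tracker files are neither built nor cleaned up.
  static CfState restore(CfState current, Tracker newTracker) {
    return new CfState(newTracker, current.hasTrackerFiles);
  }

  // A file based tracker needs its tracker files; the default one just
  // lists the directory.
  static boolean canLoad(CfState state) {
    return state.configured == Tracker.DEFAULT || state.hasTrackerFiles;
  }

  // Tracker files left behind after moving back to the default tracker.
  static boolean hasStaleTrackerFiles(CfState state) {
    return state.configured == Tracker.DEFAULT && state.hasTrackerFiles;
  }

  public static void main(String[] args) {
    CfState migratedBack = new CfState(Tracker.DEFAULT, false);
    CfState restored = restore(migratedBack, Tracker.FILE);
    System.out.println("can load after restore: " + canLoad(restored));
  }
}
```

The same restore method covers both bullets: whether the new tracker comes from the snapshot's TableDescriptor or from a user supplied param, the files on disk do not follow the config.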
We have multiple options here:
# As [~wchevreuil] suggested, we could add a check that stops the restore
process if there would be an SFT incompatibility and prompts the user to
manually migrate the problematic sections first. This has the advantage of
keeping the restore logic clean and making an SFT change a more conscious
decision, but the downside of being a potentially labor intensive manual
process.
# We could use the SFT implementation param we are currently introducing to
signal which implementation we would *prefer* to use. When there is a conflict
between the current and snapshot SFT config, and the currently used
implementation matches the SFT param, we can override the snapshot config. This
is basically a slightly more flexible variation of option 1. It would help the
user move towards a selected SFT while keeping the restore logic clean.
# We could add the SFT migration logic to restore and simply add the tracker
files when needed or clean them up when we move away from file based SFT. It
has the upside of being the most user friendly solution, but it has the
downside of mixing restore logic with SFT logic.
# We could extend the SFT implementations to "auto migrate", meaning they clean
up after themselves and prepare the files they need. This would allow restore to
just override the TableDescriptor any way it wants and let SFT deal with the
required steps.
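
To illustrate the difference between options 1 and 2, here is a rough sketch of the decision logic. The class and method names are hypothetical, tracker implementations are plain strings, and the behaviour when the preferred implementation does not match the current one (fall back to failing as in option 1) is my assumption.

```java
import java.util.Objects;

final class SftRestoreCheck {
  private static IllegalStateException mismatch(String current, String snapshot) {
    return new IllegalStateException("SFT mismatch: table uses " + current
        + ", snapshot uses " + snapshot + "; migrate before restoring");
  }

  // Option 1: stop the restore when the trackers differ.
  static void requireCompatible(String currentImpl, String snapshotImpl) {
    if (!Objects.equals(currentImpl, snapshotImpl)) {
      throw mismatch(currentImpl, snapshotImpl);
    }
  }

  // Option 2: on a conflict, keep the current tracker if it is the one the
  // caller prefers; otherwise fail as in option 1 (assumed fallback).
  // preferredImpl may be null when no preference was given.
  static String resolve(String currentImpl, String snapshotImpl, String preferredImpl) {
    if (Objects.equals(currentImpl, snapshotImpl)) {
      return snapshotImpl; // no conflict, nothing to decide
    }
    if (preferredImpl != null && preferredImpl.equals(currentImpl)) {
      return currentImpl; // override the snapshot config
    }
    throw mismatch(currentImpl, snapshotImpl);
  }

  public static void main(String[] args) {
    // Table was migrated back to DEFAULT, snapshot still says FILE.
    System.out.println(resolve("DEFAULT", "FILE", "DEFAULT"));
  }
}
```

In both variants restore never has to create or delete tracker files, which is what keeps the restore logic clean; options 3 and 4 differ only in where that extra work would live.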
> Add support for specifying store file tracker when restoring or cloning
> snapshot
> --------------------------------------------------------------------------------
>
> Key: HBASE-26286
> URL: https://issues.apache.org/jira/browse/HBASE-26286
> Project: HBase
> Issue Type: Sub-task
> Components: HFile, snapshots
> Reporter: Duo Zhang
> Assignee: Szabolcs Bukros
> Priority: Major
>
> As discussed in HBASE-26280.
> https://issues.apache.org/jira/browse/HBASE-26280?focusedCommentId=17414894&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17414894