[ 
https://issues.apache.org/jira/browse/HBASE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440521#comment-17440521
 ] 

Szabolcs Bukros commented on HBASE-26286:
-----------------------------------------

[~zhangduo] 

{quote}
IIRC, our decision on HBASE-26280 is that, a snapshot will be constructed by 
plain HFiles, you always need to list the directory to get all the HFiles, so 
I'm a bit confusing that why here we say 'snapshot with file based SFT'. Did I 
miss something?
{quote}

If I understand correctly, that discussion was about tracker files and 
concluded that they should not be added to snapshots, because the list of 
hfiles in the snapshot is always the full and correct list of storefiles, so 
the tracker files can be rebuilt if necessary. The TableDescriptor, however, 
can contain SFT config, at both the table and the cf level, if the global SFT 
config was overridden.
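
To make the layering concrete, here is a minimal sketch of how the effective 
SFT for a cf could be resolved. It is a toy model on plain maps, not the HBase 
code: the class name, the method and the "DEFAULT" fallback are hypothetical, 
and the config key is assumed for illustration.

```java
import java.util.Map;

class EffectiveSft {
    // Assumed key name for this sketch; check the HBase source for the real one.
    static final String SFT_KEY = "hbase.store.file-tracker.impl";

    // The cf level wins over the table level, which wins over the global config.
    static String resolve(Map<String, String> global,
                          Map<String, String> table,
                          Map<String, String> family) {
        if (family.containsKey(SFT_KEY)) {
            return family.get(SFT_KEY);
        }
        if (table.containsKey(SFT_KEY)) {
            return table.get(SFT_KEY);
        }
        // Hypothetical fallback value when nothing is configured anywhere.
        return global.getOrDefault(SFT_KEY, "DEFAULT");
    }

    public static void main(String[] args) {
        Map<String, String> none = Map.of();
        // A cf level override is carried by the TableDescriptor, and so by the snapshot.
        System.out.println(resolve(none, none, Map.of(SFT_KEY, "FILE")));
    }
}
```

Because the table and cf level values live in the TableDescriptor, they travel 
with the snapshot metadata even though the tracker files do not.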

 

{quote}
Could someone explain them for me? It seems that one of them is for creating a 
new table and another one is for performing on an existing table?
{quote}

Clone creates a new table with the provided name and the TableDescriptor from 
the snapshot metadata. So we can freely change the SFT implementation we would 
like to use, because we can just override the TableDescriptor and the new table 
will be created with it.

Restore tries to restore the state of an existing table to match the snapshot. 
To achieve this, it deletes regions and/or hfiles that are present in the 
current table but not in the snapshot, copies regions and/or hfiles that are 
missing from the current table over from the snapshot and, most importantly 
for us, simply overwrites the current TableDescriptor with the one from the 
snapshot.
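
The file level part of restore described above is essentially two set 
differences. A minimal sketch, assuming hfiles can be identified by name; the 
RestorePlan class is hypothetical and is not the actual restore code:

```java
import java.util.Set;
import java.util.TreeSet;

class RestorePlan {
    final Set<String> toDelete; // in the current table, not in the snapshot
    final Set<String> toCopy;   // in the snapshot, missing from the current table

    RestorePlan(Set<String> toDelete, Set<String> toCopy) {
        this.toDelete = toDelete;
        this.toCopy = toCopy;
    }

    static RestorePlan plan(Set<String> current, Set<String> snapshot) {
        Set<String> toDelete = new TreeSet<>(current);
        toDelete.removeAll(snapshot);
        Set<String> toCopy = new TreeSet<>(snapshot);
        toCopy.removeAll(current);
        return new RestorePlan(toDelete, toCopy);
    }

    public static void main(String[] args) {
        RestorePlan p = plan(Set.of("hfile-a", "hfile-b"), Set.of("hfile-b", "hfile-c"));
        System.out.println("delete=" + p.toDelete + " copy=" + p.toCopy);
    }
}
```

Files present on both sides ("hfile-b" in the example) are left untouched, 
which is why no tracker files get written or cleaned up for them.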

This last step causes the problems.
 * Consider a use case where a cf uses file based SFT at the time of the 
snapshot, while the global config is still the default SFT. Later on we migrate 
the cf back to the default SFT. Then we have to restore the snapshot. The 
process overwrites the TableDescriptor with the one from the snapshot and 
suddenly the cf will try to use file based SFT (since it used that at the time 
of the snapshot), but because there is no actual SFT migration as part of the 
restore process, the cf folder does not have tracking files and SFT fails. This 
is a bug in the current implementation.
 * Specifying the SFT for restore has its own issues. Consider a use case where 
the global SFT config uses the default. We restore a table and specify that we 
would like to use file based SFT instead. There will be regions that exist in 
the current table and existed at the time of the snapshot too. A few hfiles 
might get added/deleted, but otherwise these regions remain untouched. 
Forcefully setting the SFT to file based as specified is possible, but there is 
no logic that would do the migration and build the tracker files, so the SFT 
would fail. Similarly, switching back to the default (from a file based SFT) is 
possible, but restore lacks the logic to clean up the tracker files.

We have multiple options here:
 # As [~wchevreuil] suggested, we could add a check that stops the restore 
process if there would be an SFT incompatibility and prompts the user to 
manually migrate the problematic sections first. This has the advantage of 
keeping the restore logic clean and making an SFT change a more conscious 
decision, but it has the downside of being a potentially labor intensive manual 
process.
 # We could use the SFT implementation param we are currently introducing to 
signal which implementation we would *prefer* to use. When there is a conflict 
between the current and the snapshot SFT config, and the currently used 
implementation matches the SFT param, we can override the snapshot config. 
This is basically a slightly more flexible variation of option 1. It would 
help the user move towards a selected SFT while keeping the restore logic 
clean.
 # We could add the SFT migration logic to restore and simply add the tracking 
files when needed or clean them up when we move away from file based SFT. It 
has the upside of being the most user friendly solution, but it has the 
downside of mixing restore logic with SFT logic.
 # We could extend the SFT implementations to "auto migrate", meaning they 
clean up after themselves and prepare the files they need. This would allow 
restore to just override the TableDescriptor any way it wants and let SFT deal 
with the required steps.
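
Options 1 and 2 above could be sketched roughly as follows. This is only an 
illustration on plain maps: the class, the method names and the "DEFAULT" / 
"FILE" labels are hypothetical, and families that exist only in the snapshot 
are assumed not to conflict.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class SftRestoreCheck {
    // Option 1: list the families whose current SFT differs from the snapshot SFT.
    // A non-empty result would stop the restore and ask for a manual migration.
    static List<String> conflicts(Map<String, String> currentSft,
                                  Map<String, String> snapshotSft) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : snapshotSft.entrySet()) {
            String current = currentSft.get(e.getKey());
            if (current != null && !current.equals(e.getValue())) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }

    // Option 2: on a conflict, keep the current SFT only if it matches the
    // preferred implementation passed to restore; otherwise refuse.
    static String resolve(String currentSft, String snapshotSft, String preferred) {
        if (currentSft.equals(snapshotSft)) {
            return snapshotSft; // no conflict, nothing to decide
        }
        if (currentSft.equals(preferred)) {
            return currentSft; // override the snapshot config
        }
        throw new IllegalStateException(
            "SFT conflict: current=" + currentSft + ", snapshot=" + snapshotSft);
    }

    public static void main(String[] args) {
        System.out.println(conflicts(
            Map.of("cf1", "DEFAULT", "cf2", "FILE"),
            Map.of("cf1", "FILE", "cf2", "FILE")));
        System.out.println(resolve("DEFAULT", "FILE", "DEFAULT"));
    }
}
```

In both cases restore itself never creates or removes tracker files, which is 
the property that sets these options apart from options 3 and 4.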

> Add support for specifying store file tracker when restoring or cloning 
> snapshot
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-26286
>                 URL: https://issues.apache.org/jira/browse/HBASE-26286
>             Project: HBase
>          Issue Type: Sub-task
>          Components: HFile, snapshots
>            Reporter: Duo Zhang
>            Assignee: Szabolcs Bukros
>            Priority: Major
>
> As discussed in HBASE-26280.
> https://issues.apache.org/jira/browse/HBASE-26280?focusedCommentId=17414894&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17414894



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
