[
https://issues.apache.org/jira/browse/HBASE-7987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593774#comment-13593774
]
Matteo Bertozzi commented on HBASE-7987:
----------------------------------------
[[email protected]] The "file-tracking table" is described briefly in the PDF
attached to HBASE-7806 ("future" section). It is a completely different idea
from the manifest, and it is not just snapshot-related.
Anyway, for the near term (.94/.96) I still haven't decided. I don't consider
this one super high priority, since we have tested the multi-file approach for
months and even on a large cluster it was good enough. I'll probably make a
patch for this by next week, but I prefer working on making everything work
(e.g. merge, rename table, hbck) instead of saying "you can't merge a region if
you use snapshots, you can't rename a table & co...". It would also be nice to
have more metrics to know the state: how long a snapshot takes and how many
times it fails (the current FlushSnapshot fails every time there's a split or a
region move).
> Snapshot Manifest file instead of multiple empty files
> ------------------------------------------------------
>
> Key: HBASE-7987
> URL: https://issues.apache.org/jira/browse/HBASE-7987
> Project: HBase
> Issue Type: Improvement
> Components: snapshots
> Reporter: Matteo Bertozzi
>
> Currently, taking a snapshot means creating one empty file for each file in
> the source table directory, plus copying the .regioninfo file for each
> region, the table descriptor file, and a snapshotInfo file.
> During restore or snapshot verification we traverse the filesystem
> (fs.listStatus()) to find the snapshot files, and we open the .regioninfo
> files to get the region information.
> To avoid hammering the NameNode and having lots of empty files, we can use a
> manifest file that contains the list of files and the information that we need.
> To keep the RS parallelism that we have, each RS can write its own manifest.
> {code}
> // Field numbers below are placeholders for illustration.
> // Type, RegionInfo and Reference are assumed to be defined elsewhere.
> message SnapshotDescriptor {
>   required string name = 1;
>   optional string table = 2;
>   optional int64 creationTime = 3;
>   optional Type type = 4;
>   optional int32 version = 5;
> }
>
> message SnapshotRegionManifest {
>   optional int32 version = 1;
>   required RegionInfo regionInfo = 2;
>   repeated FamilyFiles familyFiles = 3;
>
>   message StoreFile {
>     required string name = 1;
>     optional Reference reference = 2;
>   }
>
>   message FamilyFiles {
>     required bytes familyName = 1;
>     repeated StoreFile storeFiles = 2;
>   }
> }
> {code}
> {code}
> /hbase/.snapshot/<snapshotName>
> /hbase/.snapshot/<snapshotName>/snapshotInfo
> /hbase/.snapshot/<snapshotName>/<tableName>
> /hbase/.snapshot/<snapshotName>/<tableName>/tableInfo
> /hbase/.snapshot/<snapshotName>/<tableName>/regionManifest(.n)
> {code}
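
To make the trade-off in the description concrete, here is a rough Python sketch that compares how many files each scheme would leave on the filesystem. It is only a model, not HBase code: the dataclasses loosely mirror the sketched messages, `region_name` stands in for the RegionInfo message, the function names are hypothetical, and reading "regionManifest(.n)" as one numbered file per writer (for example one per region server) is an assumption. Directories are not counted.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical Python mirrors of the sketched messages (illustration only).
@dataclass
class StoreFile:
    name: str


@dataclass
class FamilyFiles:
    family_name: bytes
    store_files: List[StoreFile] = field(default_factory=list)


@dataclass
class SnapshotRegionManifest:
    region_name: str  # stand-in for the RegionInfo message
    family_files: List[FamilyFiles] = field(default_factory=list)


def files_with_empty_markers(regions: List[SnapshotRegionManifest]) -> int:
    """Current scheme: one empty file per store file, one .regioninfo per
    region, plus the table descriptor and the snapshotInfo file."""
    store_files = sum(
        len(family.store_files)
        for region in regions
        for family in region.family_files
    )
    return store_files + len(regions) + 2


def files_with_manifests(num_writers: int) -> int:
    """Proposed scheme: snapshotInfo, tableInfo, and one manifest per writer."""
    return 2 + num_writers


def manifest_paths(snapshot: str, table: str, num_writers: int) -> List[str]:
    """Paths following the proposed layout; the '.n' suffix is assumed to be
    a zero-based writer index."""
    base = f"/hbase/.snapshot/{snapshot}"
    paths = [f"{base}/snapshotInfo", f"{base}/{table}/tableInfo"]
    paths += [
        f"{base}/{table}/regionManifest.{n}" for n in range(num_writers)
    ]
    return paths


def example_regions(num_regions: int, files_per_region: int) -> List[SnapshotRegionManifest]:
    """Build a synthetic table with a single column family per region."""
    return [
        SnapshotRegionManifest(
            region_name=f"region-{i}",
            family_files=[
                FamilyFiles(
                    b"cf",
                    [StoreFile(f"hfile-{i}-{j}") for j in range(files_per_region)],
                )
            ],
        )
        for i in range(num_regions)
    ]
```

For a synthetic table with 100 regions and 5 store files each, `files_with_empty_markers(example_regions(100, 5))` gives 602 files, while `files_with_manifests(10)` gives 12 when ten writers each produce one manifest. The real saving depends on how manifests are grouped, which the proposal leaves open.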