paul-rogers commented on a change in pull request #1953: Add docs for Drill 
Metastore
URL: https://github.com/apache/drill/pull/1953#discussion_r374455079
 
 

 ##########
 File path: 
_docs/performance-tuning/drill-metastore/030-drill-iceberg-metastore.md
 ##########
 @@ -0,0 +1,69 @@
+---
+title: "Drill Iceberg Metastore"
+parent: "Drill Metastore"
+date: 2020-01-31
+---
+
+Drill uses an Iceberg Metastore implementation based on [Iceberg 
tables](http://iceberg.incubator.apache.org). For Drill 1.17,
+ this is the default Drill Metastore implementation. For details on how to 
configure the Iceberg Metastore implementation and
+ its options, please refer to the [Iceberg Metastore 
docs](https://github.com/apache/drill/blob/master/metastore/iceberg-metastore/README.md).
+
+{% include startnote.html %}
+Iceberg tables support concurrent writes and transactions, but these are only 
effective on file systems that support
+ atomic rename.
+If the file system does not support atomic rename, concurrent writes could 
lead to inconsistencies.
+{% include endnote.html %}
+
+### Iceberg Tables Location
+
+Iceberg tables reside on the file system in a location derived from the
+Iceberg Metastore base location `drill.metastore.iceberg.location.base_path` 
and the component-specific location.
+If the Iceberg Metastore base location is `/drill/metastore/iceberg`
+and the tables component location is `tables`, the Iceberg table for the tables component
+will be located in the `/drill/metastore/iceberg/tables` folder.
+
+Metastore metadata will be stored inside Iceberg table location provided
+in the configuration file. Drill table metadata location will be constructed
+based on specific component storage keys. For example, for the `tables` component,
+the storage keys are storage plugin, workspace and table name, which together 
form the unique table identifier in Drill.
+
+Assuming the Iceberg table location is `/drill/metastore/iceberg/tables`, metadata 
for the table
+`dfs.tmp.nation` will be stored in the 
`/drill/metastore/iceberg/tables/dfs/tmp/nation` folder.
+
+Example of a base Metastore configuration file, `drill-metastore-override.conf`, 
where Iceberg tables will be stored in
+ HDFS:
+
+```
+drill.metastore.iceberg: {
+  config.properties: {
+    fs.defaultFS: "hdfs:///"
+  }
+
+  location: {
+    base_path: "/drill/metastore",
+    relative_path: "iceberg"
+  }
+}
+```
+
+### Metadata Storage Format
+
+Iceberg tables support data storage in three formats: Parquet, Avro, ORC. 
Drill metadata will be stored in Parquet files.
+This format was chosen over others since it is column oriented and efficient 
in terms of disk I/O when specific
+columns need to be queried.
+
+Each Parquet file will hold information for one partition. Partition keys will 
depend on Metastore
+component characteristics. For example, for the tables component, partition keys 
are storage plugin, workspace,
+table name and metadata key.
+
+Parquet file names will be based on UUIDs to ensure uniqueness. If a collision 
nevertheless occurs, the modify operation
+in the Metastore will fail.
+
+### Iceberg metadata expiration
+
+An Iceberg table generates metadata for each modification operation:
+a snapshot, a manifest file and a table metadata file. Also, when a delete 
operation is performed,
+previously stored data files are not deleted. Over time, these files can 
occupy a lot of space.
+Two table properties, `write.metadata.delete-after-commit.enabled` and 
`write.metadata.previous-versions-max`,
+control the expiration process. Metadata files will be expired automatically if
+`write.metadata.delete-after-commit.enabled` is enabled.
 
 Review comment:
   Maybe we should take a step back. Sounds like Iceberg maintains multiple 
versions of each file stored in Iceberg. This allows Drill to read a prior 
version of the file while ANALYZE TABLE creates a new version (true? Do we have 
such synchronization? If not, it is a bug.)
   
   Similarly, if a file is deleted, prior versions remain.
   
   Therefore Iceberg requires that Drill periodically remove old file versions. 
Drill does this on a time interval defined by ....
   
   Note that none of the above ensure atomic operation: we can always come up 
with a scenario in which updates happen faster than readers. This is a bug that 
needs fixing.
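   
   To make that last scenario concrete, here is a toy model in Python. It is 
only a sketch under stated assumptions: `VersionedTable` and its methods are 
hypothetical names invented for illustration, not Iceberg or Drill APIs, and 
the "keep the newest N versions" rule is a simplification of whatever 
expiration policy is actually in effect.

```python
from collections import OrderedDict


class VersionedTable:
    """Toy model (hypothetical): keeps only the newest `max_versions` versions."""

    def __init__(self, max_versions):
        self.max_versions = max_versions
        self.versions = OrderedDict()  # version id -> payload, oldest first
        self.next_id = 0

    def commit(self, payload):
        """Write a new version, then expire the oldest ones beyond the limit."""
        version_id = self.next_id
        self.versions[version_id] = payload
        self.next_id += 1
        while len(self.versions) > self.max_versions:
            self.versions.popitem(last=False)  # drop the oldest version
        return version_id

    def current_version(self):
        """Return the id of the newest retained version."""
        return next(reversed(self.versions))

    def read(self, version_id):
        """Read a pinned version; fails if it has been expired in the meantime."""
        if version_id not in self.versions:
            raise LookupError(f"version {version_id} has been expired")
        return self.versions[version_id]


table = VersionedTable(max_versions=2)
pinned = table.commit("stats v0")        # a reader plans against this version

table.commit("stats v1")                 # one update while the reader runs
assert table.read(pinned) == "stats v0"  # still readable: two versions retained

table.commit("stats v2")                 # a second update before the reader finishes
try:
    table.read(pinned)
    reader_failed = False
except LookupError:
    reader_failed = True                 # the pinned version was expired under the reader
```

   In this model, retaining more versions only widens the window; it does not 
close it, which is the point about updates outpacing readers.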

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
