[GitHub] [incubator-iceberg] xabriel commented on issue #404: Provide a way to show metadata for a Hadoop table

GitBox Fri, 30 Aug 2019 08:57:14 -0700

xabriel commented on issue #404: Provide a way to show metadata for a Hadoop 
table
URL: 
https://github.com/apache/incubator-iceberg/issues/404#issuecomment-526654991
 
 
   This issue makes sense to me, @waterlx. We should be able to study the 
metadata from an Iceberg table without having a dependency on Hive. I think it's 
important for feature parity.
   
   We should agree on the interface. For Hive tables, @rdblue chose to extend 
the namespace:
   ```
   spark.read
     .format("iceberg")
     .load("db.table.history")
   ```
   
   where `db.table` is the Hive object, and `.history` is the namespace 
extension to access the "history" metadata.
   
   The problem is that with a URI, such as we have in HDFS or object storage, 
that would look weird:
   ```
   spark.read
     .format("iceberg")
     .load("adl://bucket/path/to/table.history")
   ```
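   
   To make the ambiguity concrete, here is a rough sketch of what suffix 
parsing would have to do. `MetadataSuffix` and `knownMetadataTables` are 
hypothetical names I made up for illustration, not Iceberg code, and 
"history" is the only metadata name taken from this thread:
   ```scala
   // Hypothetical helper, not part of Iceberg: it only illustrates the suffix convention.
   object MetadataSuffix {
     // Assumed set of metadata table names; only "history" is mentioned in this thread.
     val knownMetadataTables: Set[String] = Set("history")

     // Splits "x.y.history" into ("x.y", Some("history")).
     // Any string without a known metadata suffix is returned unchanged with None.
     def split(identifier: String): (String, Option[String]) = {
       val idx = identifier.lastIndexOf('.')
       if (idx > 0 && knownMetadataTables.contains(identifier.substring(idx + 1)))
         (identifier.substring(0, idx), Some(identifier.substring(idx + 1)))
       else
         (identifier, None)
     }
   }
   ```
   
   `MetadataSuffix.split("db.table.history")` gives `db.table` plus 
`history`, which is fine for a Hive name. But 
`MetadataSuffix.split("adl://bucket/path/to/table.history")` strips the suffix 
too, so a directory that really is named `table.history` could no longer be 
loaded as a table.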
   
   So maybe we should make this an option?
   ```
   spark.read
     .format("iceberg")
     .option("read-metadata", "history")
     .load("adl://bucket/path/to/table")
   ```
   
   CC @rdblue


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
