Re: [Proposal] REST Spec: Server-side Metadata Tables

Vishal Jadhav (BLOOMBERG/ NEW JERSE) Fri, 05 Jul 2024 07:13:37 -0700

Thinking aloud: there are some use cases in the authorization/authentication
realm that could use this approach.


From: sn...@snazy.de At: 07/04/24 06:10:53 UTC-4:00 To: dev@iceberg.apache.org
Subject: Re: [Proposal] REST Spec: Server-side Metadata Tables

                   
Hi Yufei,

I think the proposal is very interesting! The direction this and other
proposals are going is IMO the right one.

Since many proposals need access to at least manifest lists and manifest
files, and potentially also data/delete files, does it make sense to bundle
all proposals that need this ability?

Robert
         
On 03.07.24 22:44, Yufei Gu wrote:
         
             
Hi folks,

I'd like to discuss a new proposal to support server-side metadata tables.

One of Iceberg's most advantageous features is the ability to inspect a table
using metadata tables. For instance, we can query snapshots just like we query
data rows, using the following command:
SELECT * FROM prod.db.table.snapshots;
            

With the REST catalog, we can simplify this process further by providing
metadata directly from REST endpoints. Here are several benefits of this
approach:

* Engine Independence: The metadata tables do not rely on a specific
  implementation of an engine. The REST server returns the results directly.
  For example, the Rust Iceberg does not need to implement its own logic to
  query the snapshot table if it connects to a server with this capability.
  This reduces the complexity and development effort required for different
  clients and engines.
* Enables New Use Cases: A catalog UI or Lakehouse UI can present a table's
  metadata (e.g., snapshot/partition list) without relying on an engine like
  Trino. This opens up possibilities for lightweight UIs and tools that can
  directly interact with the REST endpoints to retrieve and display metadata.
* Enhanced Performance: With server-side caching, the server-side metadata
  tables will perform better. Caching reduces the need to repeatedly compute
  or retrieve metadata, leading to faster response times and reduced load on
  the underlying storage systems.
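To make the points above concrete, here is a rough sketch (not taken from the
proposal doc) of how a server could answer a snapshots query without any
engine. It assumes the server already holds the table metadata JSON with the
field names from the Iceberg table spec ("snapshot-id", "parent-snapshot-id",
"timestamp-ms", "summary", "manifest-list"). The function names, the row
layout, and the cache are hypothetical and only illustrate the idea; the
actual endpoint and response shape are whatever the proposal ends up defining.

```python
from datetime import datetime, timezone
from typing import Any


def snapshots_table(metadata: dict[str, Any]) -> list[dict[str, Any]]:
    """Project table metadata into rows shaped like the `snapshots` metadata table.

    Hypothetical helper: the column names mirror what engines commonly expose
    for `SELECT * FROM prod.db.table.snapshots`, which is an assumption here.
    """
    rows = []
    for snap in metadata.get("snapshots", []):
        # Copy the summary so the caller's metadata dict is not mutated.
        summary = dict(snap.get("summary", {}))
        rows.append({
            "committed_at": datetime.fromtimestamp(
                snap["timestamp-ms"] / 1000, tz=timezone.utc
            ),
            "snapshot_id": snap["snapshot-id"],
            # The first snapshot of a table has no parent.
            "parent_id": snap.get("parent-snapshot-id"),
            # "operation" is stored inside the summary map in the metadata JSON.
            "operation": summary.pop("operation", None),
            "manifest_list": snap.get("manifest-list"),
            "summary": summary,
        })
    return rows


# Hypothetical server-side cache. Metadata files are immutable once written,
# so the metadata file location is a safe cache key in this sketch.
_snapshots_cache: dict[str, list[dict[str, Any]]] = {}


def cached_snapshots_table(
    metadata_location: str, metadata: dict[str, Any]
) -> list[dict[str, Any]]:
    """Return cached rows for a metadata location, computing them on first use."""
    if metadata_location not in _snapshots_cache:
        _snapshots_cache[metadata_location] = snapshots_table(metadata)
    return _snapshots_cache[metadata_location]
```

A UI or a non-JVM client would then only need to render the returned rows.
Tables that require reading manifest lists or manifest files (e.g. a
partition list) need more work on the server than this sketch shows.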
Here is the proposal in Google Docs:
https://docs.google.com/document/d/1MVLwyMQtZ-7jewsQ0PuTvtJbpfl4HCoVdbowMqFTmfc/edit?usp=sharing

Estimated read time: 5 mins

Would really appreciate any feedback on this topic and proposal!

Yufei
     -- 
Robert Stupp
@snazy
