Thanks for the proposal. I would recommend creating a GitHub issue with the "proposal" label, so that all ongoing proposals are easy to track, as mentioned here: <https://iceberg.apache.org/contribute/?h=proposal#apache-iceberg-improvement-proposals>.
The first concern that comes to mind is whether moving the compute logic to the server compromises the engine's distributed processing capabilities. As for server-side caching, engines are also capable of caching results. While "Engine Independence" is a positive direction, we should benchmark performance on large tables to evaluate any potential impact.

- Ajantha

On Thu, Jul 4, 2024 at 5:59 AM Szehon Ho <szehon.apa...@gmail.com> wrote:

> Yes, I was chatting with Yufei about this, and at first glance I agree
> this would be nice to have. I always thought that metadata tables are
> important enough to spec somewhere, and I think this is a nice place to do
> it. There seems to be some overlap with existing calls (i.e., you can get
> snapshots from the table, and files from the proposed Plan API), but it
> does seem valuable to get it in one place.
>
> If we can solve the 'big metadata' issue for the PrePlan/PlanTable APIs, it
> sounds like we can reuse the solution for the files metadata tables. I'd
> perhaps leave out the position_deletes one though, as it's mostly used
> internally and seems a bit too 'big' even for this.
>
> I wonder if we can even add an optional endpoint for listing 'removed'
> snapshots. I know it sounds weird, but when looking at metadata tables,
> the one question that I got a lot but could not answer is how to find when
> a data file was added (or a partition was added). If the snapshot is expired
> then it is no longer possible to trace that history. Users often expire
> snapshots to claw back disk space, but may not necessarily want to delete
> the snapshot history. But I believe the REST catalog has an
> opportunity in removeSnapshot to preserve the metadata of the old snapshot
> (up to some configured time). So we can query the snapshot metadata even
> after it expires, which I feel will be valuable.
>
> Thanks
> Szehon
>
>
> On Wed, Jul 3, 2024 at 3:04 PM Jack Ye <yezhao...@gmail.com> wrote:
>
>> Hi Yufei,
>>
>> Interesting that we are thinking about similar things. I had this item as
>> a part of the roadmap discussion items in the catalog sync meeting, and
>> then I removed it before the meeting because I felt it was too early to
>> discuss.
>>
>> My main concern for having server-side metadata tables is how we solve
>> the "big metadata" issue. The partitions, manifests, and files tables can
>> each easily become a big table, and the REST server becomes inefficient in
>> retrieving results. It's the same old "HMS is too slow in iterating through
>> the partitions" problem. Iceberg kind of solves it by having this
>> information in Avro and in storage that can be scanned in a distributed
>> fashion, but with server-side metadata tables, we are technically
>> re-introducing the problem.
>>
>> Maybe one potential approach is to run those potentially large metadata
>> table scans through the PreplanTable and PlanTable APIs. Just a quick
>> thought for now, I need to think a bit more about this.
>>
>> Best,
>> Jack Ye
>>
>>
>> On Wed, Jul 3, 2024 at 1:45 PM Yufei Gu <flyrain...@gmail.com> wrote:
>>
>>> Hi folks,
>>>
>>> I'd like to discuss a new proposal to support server-side metadata
>>> tables.
>>>
>>> One of Iceberg's most advantageous features is the ability to inspect a
>>> table using metadata tables. For instance, we can query snapshots just like
>>> we query data rows using the following command:
>>>
>>>     SELECT * FROM prod.db.table.snapshots;
>>>
>>> With the REST catalog, we can simplify this process further by providing
>>> metadata directly from REST endpoints. Here are several benefits of this
>>> approach:
>>>
>>> - Engine Independence: The metadata tables do not rely on a specific
>>>   implementation of an engine. The REST server returns the results
>>>   directly.
>>>   For example, the Rust Iceberg implementation does not need to implement
>>>   its own logic to query the snapshot table if it connects to a server
>>>   with this capability. This reduces the complexity and development
>>>   effort required for different clients and engines.
>>> - Enabled New Use Cases: A catalog UI or Lakehouse UI can present a
>>>   table's metadata (e.g., snapshot/partition list) without relying on an
>>>   engine like Trino. This opens up possibilities for lightweight UIs and
>>>   tools that can directly interact with the REST endpoints to retrieve and
>>>   display metadata.
>>> - Enhanced Performance: With server-side caching, the server-side
>>>   metadata tables will perform better. Caching reduces the need to
>>>   repeatedly compute or retrieve metadata, leading to faster response
>>>   times and reduced load on the underlying storage systems.
>>>
>>> Here is the proposal as a Google Doc:
>>> https://docs.google.com/document/d/1MVLwyMQtZ-7jewsQ0PuTvtJbpfl4HCoVdbowMqFTmfc/edit?usp=sharing
>>>
>>> Estimated read time: 5 mins
>>>
>>> Would really appreciate any feedback on this topic and proposal!
>>>
>>>
>>> Yufei
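
To make the caching and "big metadata" points from this thread a bit more concrete, here is a minimal Python sketch of what a server-side handler for the snapshots metadata table might look like. It is only an illustration under stated assumptions: `SnapshotRow`, `SnapshotsEndpoint`, `list_snapshots`, the `load_rows` callback, and the offset-based page token are all hypothetical names invented here, not part of the Iceberg REST spec or of the linked proposal. The sketch caches rows per metadata file location (Iceberg metadata files are immutable once written, so the location is a reasonable cache key) and returns results one page at a time so a large metadata table is never sent in a single response.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SnapshotRow:
    # Hypothetical row shape, loosely modeled on the `snapshots` metadata table.
    snapshot_id: int
    parent_id: Optional[int]
    committed_at_ms: int
    operation: str


class SnapshotsEndpoint:
    """Hypothetical server-side handler; NOT an existing Iceberg REST API."""

    def __init__(self, load_rows: Callable[[str], List[SnapshotRow]]):
        # load_rows stands in for reading table metadata from storage.
        self._load_rows = load_rows
        self._cache: Dict[str, List[SnapshotRow]] = {}
        self.loads = 0  # counts storage reads, to make cache hits observable

    def list_snapshots(
        self,
        metadata_location: str,
        page_token: Optional[str] = None,
        page_size: int = 2,
    ) -> Tuple[List[SnapshotRow], Optional[str]]:
        # Cache by metadata location: a new commit produces a new location,
        # so stale entries are never served for the current table state.
        rows = self._cache.get(metadata_location)
        if rows is None:
            rows = self._load_rows(metadata_location)
            self._cache[metadata_location] = rows
            self.loads += 1

        # Assumed token scheme: the token is simply the next row offset.
        start = int(page_token) if page_token else 0
        end = start + page_size
        next_token = str(end) if end < len(rows) else None
        return rows[start:end], next_token
```

A lightweight UI could call `list_snapshots` repeatedly, passing back the returned token until it is `None`. Whether simple pagination is enough for the files or partitions tables, or whether those should go through the planning APIs as Jack suggests, is exactly the kind of question the benchmarks mentioned above would need to answer.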