Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

Zoltán Borók-Nagy Mon, 18 Nov 2024 10:22:56 -0800

Hey Everyone,

Thanks Gábor, I think the proposed interface would be very useful to any
engine that employs caching, e.g. Impala.
And it is pretty neat that it is catalog-agnostic, i.e. we just give
all the information we have about the table and let the catalog
implementation efficiently reload it.


I might have a nitpick suggestion about the name to clearly express the
intent: loadTable -> reloadTable (or, refreshTable)

Cheers,
    Zoltan


On Mon, Nov 18, 2024 at 5:17 PM Gabor Kaszab
<gaborkas...@cloudera.com.invalid> wrote:

> Hi Iceberg Community,
>
> This is a great conversation so far, and thanks everyone for the valuable
> inputs!
> I'd like to articulate 2 things that we have to keep in mind with the
> design:
>
> *1: There are 2 interfaces here that we should consider:*
> What I mean by this is that so far we have been talking about the REST
> spec, more narrowly the HTTP communication between Iceberg's REST client
> and the REST server. I think the proposed solution with the ETag absolutely
> makes sense within this context.
> However, the usual way of a client interacting with an Iceberg catalog
> (including REST) is the Catalog API in the library. This API offers a
> loadTable(TableIdentifier) function that returns a Table object. With the
> above HTTP-based solution in mind I don't think we could give any
> meaningful results if the HTTP layer finds that the table hasn't changed. I
> argued already against pushing the caching responsibilities from the
> clients into the HTTP layer (mostly because of losing the control over the
> cache, and also observability won't be straightforward) so let's assume for
> now that we won't do caching in the HTTP layer, only execute the loadTable
> calls to the REST catalog by setting the ETag. In case we get a 304 we
> won't be able to construct a Table object to answer the
> Catalog.loadTable(TableIdentifier) call. We could return null or throw an
> exception but I don't find either of them appropriate.
>
> *2: There are catalog types other than REST*
> I started this conversation focusing on the REST spec, but the more I
> think of this the more I feel that the same functionality should be offered
> for all the other catalog types too. Let's assume that we have an engine
> that caches table metadata and initially uses REST catalog. For such an
> engine the proposed solution would solve the problem of checking table
> freshness and also reloading the table metadata. Simple code for that
> could be enough if we configured our HTTP client properly (just sketched a
> simple example):
>
> tableCache_.put(catalog_.loadTable(tableIdentifier));
>
> Also let's assume we solve the issue in 1) and we can answer such a call
> even if we get 304 from the server as the table is unchanged. So with this
> solution with the REST catalog we can be sure that the table is only loaded
> from the catalog if changed (or the age expired). But what if we configure
> another catalog, let's say HiveCatalog. The very same code for that catalog
> would trigger a table reload for every execution causing unexpected
> performance issues.
> I have to double check but I assume that this HTTP approach wouldn't be
> feasible for other catalog types unfortunately.
>
> I hope these arguments make sense :)
>
>
> *As a partial solution this is what I have in mind:*
> We can add another function into the catalog API for this purpose. Let's
> say something like this:
> Table loadTable(Table existingTable);
>
> What advantages I see with this:
> - This could solve issue 1) above. In case the table hasn't changed we can
> simply return 'existingTable' without using HTTP Cache.
> - The clients wouldn't need to explicitly call for isLatest() and such
> functions to check for freshness, and they wouldn't need to trigger table
> reloading for themselves. This API would be expected to cover this under the
> hood.
> - The current Catalog.loadTable(TableIdentifier) API wouldn't be enough
> for all the catalog types on its own, but with this one each catalog
> implementation (e.g. HiveCatalog, REST catalog, etc.) can then implement
> its own way of doing freshness checks and table reloads. For REST we
> could follow the HTTP ETag approach, while for other catalogs we could
> follow other approaches.
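
The overload proposed above could be sketched roughly as follows. This is a toy model under stated assumptions, not Iceberg code: CachedTable stands in for Iceberg's Table, ReloadingCatalog and InMemoryCatalog are hypothetical names, and tables are identified by plain strings. A REST implementation might decide "unchanged" from a 304 response, while this in-memory version simply compares metadata locations.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for Iceberg's Table; only the fields needed here.
final class CachedTable {
    final String name;
    final String metadataLocation;

    CachedTable(String name, String metadataLocation) {
        this.name = name;
        this.metadataLocation = metadataLocation;
    }
}

// Hypothetical catalog interface carrying the proposed overload.
interface ReloadingCatalog {
    CachedTable loadTable(String name);

    // Returns 'existing' itself when the catalog still points at the same
    // metadata location, otherwise a freshly loaded table.
    CachedTable loadTable(CachedTable existing);
}

// Toy in-memory catalog that counts how many full loads it performed.
final class InMemoryCatalog implements ReloadingCatalog {
    private final Map<String, String> pointers = new HashMap<>();
    int fullLoads = 0;

    void commit(String name, String newMetadataLocation) {
        pointers.put(name, newMetadataLocation);
    }

    @Override
    public CachedTable loadTable(String name) {
        fullLoads++;
        return new CachedTable(name, pointers.get(name));
    }

    @Override
    public CachedTable loadTable(CachedTable existing) {
        String current = pointers.get(existing.name);
        if (existing.metadataLocation.equals(current)) {
            return existing; // unchanged: hand back the caller's object
        }
        return loadTable(existing.name); // changed: do a full load
    }
}
```

With such an overload, an engine-side cache could refresh an entry with a single call, for example cache.put(name, catalog.loadTable(cache.get(name))), and the cost of that call would depend on how each catalog implements the freshness check.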
>
> Regards,
> Gabor
>
> On Mon, Nov 18, 2024 at 8:48 AM Shani Elharrar <sh...@upsolver.com.invalid>
> wrote:
>
>> You're totally right. Perhaps using a "Content-Location" header might be
>> a better fit for that.
>>
>> Shani.
>>
>> On 18 Nov 2024, at 9:27, Taeyun Kim <taeyun....@innowireless.com> wrote:
>>
>> 
>> Hi,
>>
>> Here are my thoughts:
>>
>> - The value of ETag is (as far as I know) defined as an opaque string by
>> the specification, meaning the client shouldn’t interpret or assign any
>> significance to it, regardless of what the server specifies. It’s best to
>> avoid the client giving any particular meaning to the ETag value.
>> - One major advantage of the header approach compared to other methods is
>> that if an update has occurred, the updated content can be immediately
>> included in the response without requiring an additional request. This
>> saves one request-response round-trip (although it’s also possible to
>> define a separate endpoint with the same functionality).
>> - Since the Iceberg REST catalog server is effectively a type of HTTP
>> server, at least in theory, it may be expected to handle HTTP cache and
>> validation-related processes. The header approach can be seen as leveraging
>> this mechanism appropriately.
>> - The header approach doesn’t have to be limited to the
>> /v1/{prefix}/namespaces/{namespace}/tables/{table} endpoint. It could also
>> be applied to all GET-based endpoints, though this might broaden the scope
>> significantly.
>>
>> Thank you.
>>
>>
>>
>> -----Original Message-----
>> From: "Shani Elharrar" <sh...@upsolver.com.invalid>
>> To: <dev@iceberg.apache.org>;
>> Cc: <dev@iceberg.apache.org>;
>> Sent: 2024-11-18 (Mon) 16:21:16 (UTC+09:00)
>> Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the
>> latest
>>
>> Using the metadata file name as the ETag is a nice way to go. In that case,
>> adding HEAD method support to the loadTable endpoint will return the latest
>> metadata pointer, which can be used to support "isLatest" without returning
>> the body. It can also be leveraged in order to return the latest metadata
>> location of the table.
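
As a rough illustration of this idea, the sketch below derives an ETag from the metadata file name and answers "isLatest" by comparing two ETag strings. MetadataEtag and both of its methods are hypothetical; nothing like them is defined in the REST spec or the Iceberg library. The client side compares the values as opaque strings and never parses them.

```java
// Hypothetical helpers: neither method exists in Iceberg or the REST spec.
final class MetadataEtag {
    private MetadataEtag() {}

    // Server side: derive a quoted ETag from the metadata file name,
    // i.e. the last path segment of the metadata location.
    static String fromMetadataLocation(String metadataLocation) {
        int slash = metadataLocation.lastIndexOf('/');
        String fileName = metadataLocation.substring(slash + 1);
        return "\"" + fileName + "\"";
    }

    // Client side: compare the ETag stored alongside the cached table with
    // the ETag returned by a HEAD request, treating both as opaque strings.
    static boolean isLatest(String cachedEtag, String headEtag) {
        return cachedEtag != null && cachedEtag.equals(headEtag);
    }
}
```

A client would keep the ETag it received with its last full load and issue a HEAD request later; only when the two values differ would it need to fetch the table again.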
>>
>> Shani.
>>
>> On 18 Nov 2024, at 8:52, Yufei Gu <flyrain...@gmail.com> wrote:
>>
>> 
>>
>> Hi Taeyun,
>>
>> Thank you for the clear explanation.
>>
>> I agree that the ETag solution is more suitable. If we were going that
>> way, I'd propose a customized version number as an ETag—for instance,
>> leveraging the metadata.json file name as the identifier.
>>
>> To summarize, HTTP caching relies on headers (e.g., ETag or
>> Last-Modified) to validate whether a version is up-to-date, whereas the
>> alternative approach proposed above uses an additional parameter for
>> verification. From my perspective, there isn’t a fundamental difference
>> between the two, so I’m OK with either.
>>
>> A couple of points to note:
>>
>>    1. Both approaches would require changes to the "loadTable" endpoint.
>>    2. A minor advantage of HTTP caching is that it integrates seamlessly
>>    with browsers, but since most clients of the Iceberg REST catalog aren’t
>>    browsers, this may not be a significant factor.
>>    3. I’d also recommend considering the requirement to retrieve
>>    multiple tables (e.g., all tables under a namespace, or a list of table
>>    names) from the catalog. This requires a new endpoint and may not work
>>    with HTTP caching.
>>
>> Let me know your thoughts or if there’s anything else to consider.
>> Yufei
>>
>>
>> On Sun, Nov 17, 2024 at 6:43 PM Taeyun Kim <taeyun....@innowireless.com>
>> wrote:
>>
>> Hi,
>>
>> To Gabor:
>> It doesn’t seem necessary to interpret HTTP caching literally in this
>> context.
>> Simply using the HTTP headers defined by HTTP caching to check the
>> freshness of metadata should be sufficient.
>> There’s no requirement for the client to duplicate or store cached HTTP
>> responses.
>>
>> To Yufei:
>> As I understand it, the client doesn’t send its own timestamp but instead
>> uses the timestamp originally provided by the server in the Last-Modified
>> header.
>> Therefore, clock synchronization issues should not be a concern.
>>
>> Here’s the general flow of HTTP cache validation based on
>> If-Modified-Since:
>>
>> - Client: initial request:
>>
>> GET (url) HTTP/1.1
>>
>> - Server response:
>>
>> HTTP/1.1 200 OK
>> Last-Modified: (date1)
>> Cache-Control: no-store, no-cache, max-age=0, must-revalidate,
>> proxy-revalidate
>> (with response body)
>>
>> - Client: validation request:
>>
>> GET (url) HTTP/1.1
>> If-Modified-Since: (date1)
>>
>> - Server response (if unchanged):
>>
>> HTTP/1.1 304 Not Modified
>> Last-Modified: (date1)
>> Cache-Control: no-store, no-cache, max-age=0, must-revalidate,
>> proxy-revalidate
>> (without response body)
>>
>> - Server response (if updated):
>>
>> HTTP/1.1 200 OK
>> Last-Modified: (date2)
>> Cache-Control: no-store, no-cache, max-age=0, must-revalidate,
>> proxy-revalidate
>> (with response body)
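
The server-side decision in the If-Modified-Since flow above can be sketched as a small function. LastModifiedValidation and its methods are hypothetical names used only for illustration; the date handling relies on the JDK's DateTimeFormatter.RFC_1123_DATE_TIME. The client never supplies its own clock value here: it only echoes back the string the server sent in Last-Modified.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical model of the server-side If-Modified-Since check.
final class LastModifiedValidation {
    private LastModifiedValidation() {}

    // Formats a timestamp the way it would appear in a Last-Modified header.
    static String httpDate(ZonedDateTime t) {
        return DateTimeFormatter.RFC_1123_DATE_TIME.format(t.withNano(0));
    }

    // Returns the status the server would send: 304 if the resource was not
    // modified after the date the client echoed back, otherwise 200.
    static int statusFor(String ifModifiedSince, ZonedDateTime lastModified) {
        if (ifModifiedSince == null) {
            return 200; // unconditional request: always send the body
        }
        ZonedDateTime clientDate =
            ZonedDateTime.parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME);
        // HTTP dates carry one-second resolution, so drop sub-second precision.
        ZonedDateTime serverDate = lastModified.withNano(0);
        return serverDate.isAfter(clientDate) ? 200 : 304;
    }
}
```

Because the header only has one-second resolution, two changes within the same second cannot be told apart by this check, which is one of the reasons a time-based validator is less precise than an ETag.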
>>
>> However, using time-based freshness checks can present challenges, such
>> as parsing time formats or synchronizing file update times across servers.
>> To address these issues, HTTP cache validation based on ETag is also
>> defined in the specification.
>>
>> Here’s the flow for ETag-based validation:
>>
>> - Client: initial request:
>>
>> GET (url) HTTP/1.1
>>
>> - Server response:
>>
>> HTTP/1.1 200 OK
>> ETag: "(arbitrary string 1 generated by the server)"
>> Cache-Control: no-store, no-cache, max-age=0, must-revalidate,
>> proxy-revalidate
>> (with response body)
>>
>> - Client: validation request:
>>
>> GET (url) HTTP/1.1
>> If-None-Match: "(arbitrary string 1 generated by the server)"
>>
>> - Server response (if unchanged):
>>
>> HTTP/1.1 304 Not Modified
>> ETag: "(arbitrary string 1 generated by the server)"
>> Cache-Control: no-store, no-cache, max-age=0, must-revalidate,
>> proxy-revalidate
>> (without response body)
>>
>> - Server response (if updated):
>>
>> HTTP/1.1 200 OK
>> ETag: "(arbitrary string 2 generated by the server)"
>> Cache-Control: no-store, no-cache, max-age=0, must-revalidate,
>> proxy-revalidate
>> (with response body)
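
The ETag flow above boils down to one comparison on the server. The sketch below models it with hypothetical classes (EtagResponse, EtagValidation) rather than a real HTTP stack, and it assumes the If-None-Match header carries a single strong ETag; a real implementation would also have to handle lists of tags, weak validators and "*".

```java
// Minimal response model; body is null when nothing is sent (304).
final class EtagResponse {
    final int status;
    final String etag;
    final String body;

    EtagResponse(int status, String etag, String body) {
        this.status = status;
        this.etag = etag;
        this.body = body;
    }
}

// Hypothetical model of the server-side If-None-Match check.
final class EtagValidation {
    private EtagValidation() {}

    static EtagResponse respond(String ifNoneMatch, String currentEtag, String currentBody) {
        if (currentEtag.equals(ifNoneMatch)) {
            // Client already holds the current version: no body needed.
            return new EtagResponse(304, currentEtag, null);
        }
        // First request (no header) or stale version: send everything.
        return new EtagResponse(200, currentEtag, currentBody);
    }
}
```

Note that the updated content arrives in the same response when the tag does not match, which is the single round-trip property discussed in this thread.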
>>
>> The server can choose to use either If-Modified-Since or ETag for
>> freshness validation.
>> Alternatively, to simplify the implementation related to the Iceberg REST
>> catalog, it might make sense to define only the more accurate ETag-based
>> validation in the spec.
>> For reference, RFC 9110 recommends specifying both ETag and
>> Last-Modified. When both are provided, ETag takes precedence.
>>
>> Note on Cache-Control Headers:
>> The Cache-Control values in the examples above are intended to ensure
>> that the client validates freshness with the server on every request.
>> Writing the header in this extended format is primarily to accommodate
>> outdated HTTP/1.1 implementations. However, under the HTTP/1.1
>> specification, the following is sufficient:
>>
>> Cache-Control: no-cache
>>
>> That’s all for now.
>> Thank you.
>>
>>
>> -----Original Message-----
>> From: "Yufei Gu" <flyrain...@gmail.com>
>> To: <dev@iceberg.apache.org>;
>> Cc:
>> Sent: 2024-11-16 (Sat) 02:51:05 (UTC+09:00)
>> Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the
>> latest
>>
>>
>>
>> How does HTTP caching handle desynchronized clocks between clients and
>> the server?
>>
>> At t0, the client gets the latest table version.
>> At t1, the server makes a new commit.
>> At t2, the client sends a request with a timestamp t2, but due to
>> desynchronization, it refers to t0.
>>
>> The server may reply with 304 Not Modified, causing the client to think
>> its cache is up-to-date and miss the commit at t1.
>>
>>
>>
>> Yufei
>>
>>
>>
>>
>> On Fri, Nov 15, 2024 at 6:37 AM Gabor Kaszab <gaborkas...@apache.org>
>> wrote:
>> Hi All,
>>
>>
>> First of all it's great to see that there are others who could benefit
>> from giving a solution to this problem. I appreciate all the comments and
>> feedback so far.
>> There were a number of different opinions, so let me start with
>> summarizing the different topics that came up:
>>
>>
>> New endpoint vs using an existing endpoint:
>> Based on the answers (Fokko, Yufei) I had the impression that we should
>> be careful when adding new REST endpoints, and we should examine the re-use
>> of existing endpoints first. Let's do that then, and in case we don't find
>> it feasible then we can still fall back to any of my initial proposals
>> (isLatest() or metadataLocation()).
>>
>>
>> Granularity of freshness checks:
>> It was brought up (Dmitri) that we might not want to do the metadata
>> freshness checks solely based on metadata location, but we should consider
>> doing more granular freshness checks. I personally don't see much benefit
>> of designing this solution like that, TBH, but seeing some use-cases could
>> help us understand the motivation here.
>> Let me share my opinion on some of the arguments:
>>
>>
>> "A change in metadata location does not necessarily mean a change in
>> metadata content"
>>
>>
>> AFAIK whenever Iceberg creates a new metadata file there is some change
>> in the metadata itself. There might not be a new snapshot, though in the
>> cases of e.g. a schema/partition evolution. But even in these cases
>> triggering a table reload could make sense to me (e.g. answering SHOW
>> CREATE TABLE and similar queries). Additionally, I'd assume the number of
>> metadata location changes that don't create a new snapshot is too
>> negligible to optimize for.
>> Dmitri, let me know if I misunderstood something.
>>
>>
>> "it may still be beneficial to permit the client to ask for changes to
>> specific areas of metadata"
>>
>> This seems like a use-case that the partial metadata loading proposal
>> could solve. To identify the need to load a specific part of the metadata
>> with partial metadata loading seems an overkill to design with my proposal,
>> if this is what you have in mind. Also I found that the partial metadata
>> loading proposal faces serious headwinds, so I wouldn't rely on it at the
>> moment.
>>
>>
>> Re-using tableExists
>> I think there is a consensus here that tableExists returning a metadata
>> location could work but seems more like a workaround and could be
>> misleading for the users.
>>
>> Partial metadata loading could solve this:
>> (Yufei) I agree, it would be perfect for my use-case and I'm following
>> the discussion on the proposal. However, for me it seems, as I wrote above,
>> that the proposal faces serious headwinds now and I honestly wouldn't
>> expect a solution in the short term. But solving the freshness problem is
>> more urgent, not just for myself and Impala but apparently
>> for many other stakeholders in the community, judging by the interest in
>> this thread.
>> Hence, I propose to come up with a separate solution for freshness
>> checks, and we can still move to using partial metadata loading once that's
>> out.
>>
>>
>> Use HTTPCache and If-Modified-Since with loadTable
>> This solution seems to do the trick for us. Let me do some research
>> myself to see if there are any difficulties implementing this. Currently, I
>> have more questions than answers wrt this approach :)
>> - The initial problem is to answer freshness questions for the cached
>> tables on the client side. If we introduce HttpCaching wouldn't we
>> introduce the same problem but on a different level of representation? We'd
>> then need to decide the freshness/staleness of the cached data in the HTTP
>> layer.
>> - If we cache the HTTP responses for a loadTable then we essentially
>> cache the content of the metadata.jsons including the snapshot and metadata
>> log and everything, plus the snapshot list (and I think the manifests for
>> the latest snapshot). I believe that the size of this can easily reach the
>> low megabytes range in memory, so in total keeping them in the HTTP Cache
>> for all the tables we have queried can easily mean that we keep a couple of
>> GBs in memory just for this purpose.
>> For engines that already cache table metadata wouldn't this mean that we
>> will cache some parts of the metadata redundantly?
>> - How would we decide what is the max-age of a cached table metadata in
>> the HTTP Cache? Would it be configurable so that each engine could use
>> whatever it prefers?
>>
>>
>> Sorry if any of the questions doesn't make sense, I just want to make
>> sure I understand all the aspects of this approach.
>>
>>
>> An additional topic I have in mind:
>> REST catalog vs other catalogs:
>> Now we are focusing our discussion on the REST spec, but I think it would
>> be beneficial to extend our focus and cover other catalog implementations
>> too. I don't think that this problem of data freshness is specific to REST
>> catalog, it could affect any table in any other catalog too.
>>
>>
>> I'll continue my investigation wrt the proposals, I just wanted to flush
>> out and sum up what we have now before the weekend.
>>
>>
>> Regards,
>> Gabor
>>
>>
>>
>>
>> On Fri, Nov 15, 2024 at 10:16 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>> Hi,
>>
>> I like the idea and it makes sense. As soon as it's clearly stated in
>> the spec (using If-Modified-Since header and 304 status code), it
>> looks good to me.
>>
>> Thanks !
>> Regards
>> JB
>>
>> On Fri, Nov 15, 2024 at 1:58 AM Taeyun Kim <taeyun....@innowireless.com>
>> wrote:
>> >
>> > Hi,
>> >
>> > (Apologies if this email is a duplicate. This is my third attempt.)
>> >
>> > I also need a way to ensure that my table data is up-to-date. For now,
>> I’m handling this by setting an expiration period after which I fetch the
>> data again, regardless of its freshness.
>> >
>> > Here are my thoughts on the current suggestions. Please correct me if
>> I've misunderstood any of the points.
>> >
>> > - isLatest(): This function could be inefficient since it would require
>> an additional round-trip to fetch the metadata if it’s not up-to-date. This
>> would result in two round-trips overall, which seems suboptimal.
>> > - metadataLocation(): This has a similar issue as isLatest(). BTW,
>> according to the REST catalog API documentation for LoadTableResult schema,
>> it states, "Clients can check whether metadata has changed by comparing
>> metadata locations after the table has been created." (
>> https://github.com/apache/iceberg/blob/3659ded18d50206576985339bd55cd82f5e200cc/open-api/rest-catalog-open-api.yaml#L3175)
>> This suggests that if the metadata location has changed, the metadata can
>> be considered updated.
>> > - tableExists(): Based on the name, this function seems to serve a
>> different purpose.
>> >
>> > Here is my suggestion:
>> >
>> > Since HTTP has built-in caching features (
>> https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching), and REST
>> catalogs operate over HTTP, it seems natural to leverage HTTP caching
>> mechanisms. For example, HTTP includes the If-Modified-Since header and the
>> 304 Not Modified status code. Using this approach, we could achieve data
>> freshness with a single round-trip, fetching updated data only if there are
>> modifications.
>> >
>> > What do you think about defining the spec in this direction?
>> >
>> > Thank you.
>> >
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: "Yufei Gu" <flyrain...@gmail.com>
>> > To: <dev@iceberg.apache.org>;
>> > Cc:
>> > Sent: 2024-11-13 (Wed) 03:43:24 (UTC+09:00)
>> > Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the
>> latest
>> >
>> >
>> >
>> > Hi Gabor,
>> >
>> > Thanks for the proposal! Impala isn’t unique in needing this—I've seen
>> similar requirements from other engines.
>> >
>> > As others pointed out, using the “tableExists” endpoint seems like a
>> workaround. I don't consider it a permanent way forward. We could address
>> this by either modifying the current load table endpoint or introducing a
>> new one, but ideally, we should avoid adding endpoints for every specific
>> need. With that, partial metadata loading seems like a strong approach
>> here, we will need certain agreement though. I'd suggest the community
>> consider the use cases seriously. We need a way forward.
>> >
>> > I’m also not too concerned about using metadata file paths to verify
>> the latest table version; clients can simply extract metadata filenames,
>> which include the UUID.
>> >
>> > Yufei
>> >
>> >
>> >
>> >
>> > On Tue, Nov 12, 2024 at 7:46 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>> >
>> > Hi Fokko
>> >
>> > I like the idea, but I think it's more a workaround and could be
>> > confusing for users :)
>> >
>> > Regards
>> > JB
>> >
>> > On Tue, Nov 12, 2024 at 2:53 PM Fokko Driesprong <fo...@apache.org>
>> wrote:
>> > >
>> > > Hey Gabor,
>> > >
>> > > Thanks for raising this. While reading this, my first thought is to
>> leverage the `tableExists` operation:
>> > >
>> https://github.com/apache/iceberg/blob/e3f39972863f891481ad9f5a559ffef093976bd7/open-api/rest-catalog-open-api.yaml#L1129-L1160
>> > >
>> > > This doesn't return anything today, but we could return a payload to
>> the latest metadata.json.
>> > >
>> > > Looking forward to what others think.
>> > >
>> > > Kind regards,
>> > > Fokko
>> > >
>> > >
>> > >
>> > >
>> > > On Tue, 12 Nov 2024 at 14:33, Shani Elharrar
<sh...@upsolver.com.invalid> wrote:
>> > >>
>> > >> I recommend option (b), provided there is no partial metadata
>> loading. We implemented option (b) internally to facilitate partial
>> metadata loading, as we have tables with hundreds of thousands of
>> snapshots. This results in metadata that occupies approximately 500 MB in
>> memory (excluding the JsonNodes), which is a significant load for some of
>> our services.
>> > >>
>> > >> Shani.
>> > >>
>> > >> On 12 Nov 2024, at 14:12, Gabor Kaszab <gaborkas...@apache.org>
>> wrote:
>> > >>
>> > >> Hey Iceberg Community,
>> > >>
>> > >> Background:
>> > >> Impala is designed in a way to cache the Iceberg table metadata
>> (BaseTable objects in practice) for faster access. Currently, Impala is
>> tightly coupled with HMS and in turn with the HiveCatalog, and in order to
>> keep the cached table objects up-to-date there is a notification mechanism
>> driven by HMS to notify Impala about any changes in the table metadata.
>> > >> The Impala community is actively looking for ways to decouple HMS
>> from Impala and provide a way to use Impala without the need for HMS, and
>> get the Iceberg table metadata from other catalog Implementations mainly
>> focusing now on REST catalogs.
>> > >>
>> > >> Problem to solve:
>> > >> We identified a particular missing functionality in the current REST
>> spec: For engines that cache table metadata currently there is no way to
>> check if that table metadata is up-to-date or not, and whether the engine
>> should reload the metadata for that table or not without getting a whole
>> table object from the catalog. For this I think the REST catalog (but in
>> fact I think this could apply to any other catalogs) should be able to
>> answer a question like:
>> > >> "Hi Catalog, I have this version of this table, is it up-to-date?"
>> > >>
>> > >> Proposal:
>> > >> I've been following the discussion about partial metadata loading
>> that could be also used to answer the above question, but I have the
>> impression now that the conversation stopped making any progress.
>> > >> So instead of waiting for partial metadata loading I propose to have
>> an addition to the REST spec now to answer the question I raised above:
>> > >>
>> > >> a) boolean isLatest(TableIdentifier ident, String metadataLocation);
>> > >> b) String metadataLocation(TableIdentifier ident);
>> > >>
>> > >> Any of the above 2 approaches could help engines to decide if they
>> have to invalidate/reload particular table metadata in the cache. I
>> personally would go for option a) but would be open to hear other opinions.
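
To make the two options concrete, here is a minimal sketch of how they might behave against a toy catalog that only tracks the current metadata pointer per table. PointerCatalog is a hypothetical class, and tables are identified by plain strings instead of TableIdentifier to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Toy catalog that stores only the current metadata location per table.
final class PointerCatalog {
    private final Map<String, String> pointers = new HashMap<>();

    void commit(String table, String metadataLocation) {
        pointers.put(table, metadataLocation);
    }

    // Option a): the catalog does the comparison and answers yes or no.
    boolean isLatest(String table, String metadataLocation) {
        return metadataLocation != null
            && metadataLocation.equals(pointers.get(table));
    }

    // Option b): the catalog returns the pointer and the client compares.
    String metadataLocation(String table) {
        return pointers.get(table);
    }
}
```

In both cases a stale answer still has to be followed by a separate full table load; the options differ only in which side performs the comparison.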
>> > >>
>> > >> I'd like to know if the community could support me extending the
>> REST spec with any of the 2 options.
>> > >>
>> > >> Regards,
>> > >> Gabor
>> > >>
>> > >>
>>
>>
