I've been doing some experimenting with storing table metadata in the
persistence, and I think I'm a fan, though I haven't yet collected real
numbers. Personally, though I think storing as separate entities is the
right approach rather than putting it in table properties. I appreciate
that we don't necessarily need the best implementation as the first
implementation, but given that metadata files are easily several megabytes
(sometimes many times larger than that), I think it's important to give
ourselves an easy way to separate out such an important scaling factor.
E.g., what if someone enabled storing metadata in table properties, found
that it crashed their service, then needed to roll back? They'd be stuck
with however many table entities that have embedded metadata and quite
possibly _couldn't_ get the service back to a healthy state. To me, that's
less about "perfect vs good enough" and more about "safe vs unsafe".

> Introduce a new interface for accessing metadata

I assumed this was a reference to a new interface strictly for accessing
table metadata, not an addition to the persistence interfaces. I think an
interface with read/write methods make sense - something like

interface TableMetadataStorage {
   void storeTableMetadata(TableMetadata md, String fileLocation);

   TableMetadata readTableMetadata(TableIdentifier id, String fileLocation);
}

The default impl would be to use the FileIO, but an alternative could
create one or more new entities to store in the persistence layer. In this
case, someone could turn the feature off (or roll back) and get exactly the
same behavior they had before.

Mike


On Mon, May 26, 2025 at 9:41 PM Eric Maynard <eric.w.mayn...@gmail.com>
wrote:

> > Loading internal properties requires parsing the whole text, which will
> include Iceberg metadata, even when the metadata is not used by the code
> accessing internal properties.
>
> This concern should definitely be considered in the design -- but it should
> also be borne out in benchmarks if there is such a piece of code. I audited
> the usage of getInternalPropertiesAsMap and testing a lot of endpoints, but
> that was on a previous version of the PR.
>
> If you identify a code path like that, please do flag it so we can
> implement some benchmarks around it.
>
> > Introduce a new interface for accessing metadata
>
> I was a big fan of the DAO proposal some months ago, so I agree with the
> spirit of this recommendation. Making the cache more of a persistence-only
> concern was also a part of that discussion IIRC.
>
> However, given that we don't have type-specific persistence methods (i.e.
> there is no loadTable), it seems odd to introduce a persistence method like
> loadTableMetadata. It seems odder still to make the cache optional in only
> this case, given that this is the one property that would be bounded in
> size while other properties have historically been unbounded in size.
>
> Further, if we did introduce such a method without explicitly requiring
> that metastores must contain metadata (and my PR says they only sometimes
> do!), I think this would mix Iceberg-y concerns around parsing
> metadata.json files with persistence concerns.
>
> While these concerns don't mean I'm totally opposed to the proposed
> alternative, I still believe that this might be a case where "good enough"
> is better than "perfect". Persistence has been a highly contentious area of
> discussion in the past.
>
> On Mon, May 26, 2025 at 9:08 AM Dmitri Bourlatchkov <di...@apache.org>
> wrote:
>
> > >
> > > An older version of the implementation took this approach, but some
> > months
> > > ago Dennis left comments suggesting that we use an internal property
> and
> > > the PR was updated
> >
> >
> > Apologies for missing out on earlier reviews on this change.
> >
> > Why do we need the best approach to the first implementation?
> >
> >
> > By "best" I do not mean "ideal", but something that is better than
> > alternatives known at present time.
> >
> > My main concern is that bundling metadata into internal properties
> > automatically affects many existing call paths and existing storage.
> >
> > Loading internal properties requires parsing the whole text, which will
> > include Iceberg metadata, even when the metadata is not used by the code
> > accessing internal properties.
> >
> > These concerns affect all deployments and I do not see a way to mitigate
> > them in customizations.
> >
> > By detaching metadata from entities we still keep the performance
> > improvement, yet we avoid tight coupling.
> >
> > WDYT about this?
> > * Introduce a new interface for accessing metadata
> > * Implement it on top of Persistence
> >   - We could use a new column on the entities table in JDBC to avoid
> > cleanup issues.
> >   - The new column will not be loaded automatically.
> >   - If metadata is not found in Persistence, load from storage
> > * Optionally add an in-memory cache (implementing the same new interface)
> >   - this would be different from the Entity Cache
> >     - Entity Cache is optional in the Resolver
> >     - Entity Cache is tightly coupled with the Resolver (if present)...
> as
> > discussed wrt this cache's consistency concerns.
> >
> > Thanks,
> > Dmitri.
> >
> > On Fri, May 23, 2025 at 9:35 PM Eric Maynard <eric.w.mayn...@gmail.com>
> > wrote:
> >
> > > An older version of the implementation took this approach, but some
> > months
> > > ago Dennis left comments suggesting that we use an internal property
> and
> > > the PR was updated. Replies to your specific concerns inline:
> > >
> > > > Regarding the specific proposal of using a internal property of
> > entities
> > > for holding the whole metadata - I'm not sure it's the best approach.
> > >
> > > Why do we need the best approach to the first implementation? If
> version
> > 1
> > > of this change goes in and improves loadtable perf by 40% and then you
> > want
> > > to do a followup that improves it another 5%, isn't that okay?
> > >
> > > I don't believe we're designing ourselves into the corner by giving
> > admins
> > > the option to put metadata content in the properties, even if we later
> > > switch to a different implementation.
> > >
> > > > Not all access to tables requires metadata
> > >
> > > Much of it does, maybe all of it? Is there a particular method you're
> > > concerned about which we can try running through the benchmark?
> > >
> > > > Metadata can be huge
> > >
> > > Yes, that is why the size is limited.
> > >
> > > > Only sub-sections of the metadata may be required in some cases
> > >   (e.g. when resolving REPLACE / APPEND conflicts [1])
> > >
> > > As noted in my email this is not in scope. The idea here is that we
> > > can create some new TableMetadata entity and give it top-level columns
> > for
> > > the fields we care about. Again, I'd like to do this eventually and
> this
> > > was done in earlier versions of the PR. However I would point out that
> we
> > > don't really give other entities such top level fields right now, even
> > > frequently used ones like a table's location, and always rely on
> > > deserialization of potentially heavy properties.
> > >
> > > But the proof will be in the pudding. With the changes, the APIs I
> tested
> > > were much faster. If there's some API that we believe is frequently
> > called,
> > > which is not seeing benefits, and which we believe must see benefits
> for
> > > there to be any value in storing the metadata in the metastore, let's
> > test
> > > it and then talk about things in concrete terms.
> > >
> > > --EM
> > >
> > > On Fri, May 23, 2025 at 4:52 PM Dmitri Bourlatchkov <di...@apache.org>
> > > wrote:
> > >
> > > > Thanks for starting this discussion, Eric!
> > > >
> > > > Overall, I think the idea of storing (some) table metadata in the
> > Polaris
> > > > database is very relevant and a sound approach to improving query
> > > > performance on the engine side.
> > > >
> > > > Regarding the specific proposal of using a internal property of
> > entities
> > > > for holding the whole metadata - I'm not sure it's the best approach.
> > > >
> > > > I believe it might be preferable to store metadata as separate
> entries
> > > > in Polaris Persistence and only reference them from table-like
> > entities.
> > > >
> > > > Some points for consideration:
> > > >
> > > > * Not all access to tables requires metadata
> > > > * Metadata can be huge
> > > > * Only sub-sections of the metadata may be required in some cases
> > > >   (e.g. when resolving REPLACE / APPEND conflicts [1])
> > > >
> > > > The striping of large metadata into smaller database entries may be
> > > > best dealt with at the Persistence layer. The proposed NoSQL impl.
> > > > can do that efficiently. As for the current JDBC impl. it might be
> > > > reasonable to store a BLOB (at least initially).
> > > >
> > > > WDYT?
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > > [1] https://github.com/apache/polaris/pull/1285
> > > >
> > > > On 2025/05/23 14:57:34 Eric Maynard wrote:
> > > > > Hi all,
> > > > >
> > > > > Some time ago I opened this PR <
> > > > https://github.com/apache/polaris/pull/433>
> > > > > which proposes to store/cache TableMetadata in the Polaris
> metastore,
> > > > > avoiding a trip to object storage in many cases. Based on this
> recent
> > > > > comment <
> > > > https://github.com/apache/polaris/pull/433#issuecomment-2904298967>
> I
> > > > > wanted to start up a mailing list thread for discussion about this
> > > > feature
> > > > > as it might be a little hard to follow comment threads on what is
> > now a
> > > > > very old PR.
> > > > >
> > > > > The proposal is, in a nutshell, to add a new internal property
> > > > > metadata-cache-content to IcebergTableLikeEntity's internal
> > properties
> > > > and
> > > > > to use that to store the exact contents of a table's metadata.json.
> > The
> > > > > content can be updated whenever the metadata.json is read and can
> be
> > > > > configured to happen only for metadata.json files below some
> > > approximate
> > > > > size.
> > > > >
> > > > > I recently used the benchmark suite proposed in this PR
> > > > > <https://github.com/apache/polaris-tools/pull/21> to measure the
> > > impact
> > > > of
> > > > > the change and found it to dramatically improve loadTable
> > performance.
> > > > >
> > > > > Some things that have been brought up which are *not* in scope for
> > this
> > > > PR:
> > > > > 1. Directly loading the metadata.json content into a
> > LoadTableResponse
> > > > > without building an in-memory TableMetadata object was previously
> in
> > > the
> > > > PR
> > > > > but removed after this comment
> > > > > <
> https://github.com/apache/polaris/pull/433#issuecomment-2885074219>
> > > > from
> > > > > Russell; it's planned as a followup.
> > > > > 2. Storing individual parts of table metadata.json in persistence,
> > i.e.
> > > > > just the schema. We can do this if a use case arises, but being
> able
> > to
> > > > > store whole table metadata is beneficial immediately.
> > > > > 3. A separate entity for table metadata. Because we add the table
> > > > metadata
> > > > > to IcebergTableLikeEntity we immediately benefit from the entity
> > cache
> > > > and
> > > > > don't have to worry too much about consistency.
> > > > > 4. A separate cache for table metadata. Similar to the above, this
> > > would
> > > > > make handling consistency more complicated. Having a separate
> cache,
> > > > maybe
> > > > > with its own size or TTL configurations, just for table metadata
> > could
> > > > be a
> > > > > good followup but it's not necessary to make things work.
> > > > >
> > > > > This is a feature that has the potential to deliver tremendous
> > latency
> > > > > benefits and one that opens up several interesting possibilities
> for
> > > > > followup improvements.
> > > > >
> > > > > If you're interested in the feature, please check out the PR or
> join
> > > the
> > > > > discussion here. Thanks!
> > > > >
> > > > > --EM
> > > > >
> > > >
> > >
> >
>

Reply via email to