A RESTCatalog server that offers server-side scan planning does not prohibit a client that both reads and writes from using that API. Why shouldn't server-side scan planning return the same result as client-side scan planning, given the same input parameters? If server-side scan planning performs better than client-side scan planning, clients could take advantage of it. Your thoughts?
Thanks,
Limin

From: Ryan Blue <rdb...@gmail.com>
Reply-To: "dev@iceberg.apache.org" <dev@iceberg.apache.org>
Date: Tuesday, August 12, 2025 at 6:45 PM
To: "dev@iceberg.apache.org" <dev@iceberg.apache.org>
Subject: Re: [QUESTION] Rest catalog TableScan API response's data_file has no 'metadataLocation'

The spec doesn't keep track of the manifest or position in the manifest where a data file is stored because that information is determined by writing a data file into a manifest. It's useful to keep track of that information for writes, but it isn't necessarily correct by the time the write commits so it is a hint only.

For server-side scan planning, I'm not sure that it makes sense to add this to the REST protocol. It seems odd to me that a client that can read and write metadata would use server-side planning. I typically think of the use case for server-side scan planning as primarily supporting cases where the client is either incapable of planning itself (like a simple client in a new language) or not allowed to read metadata files for security reasons. It is possible that server-side planning could be used in other cases, but I wouldn't expect it to be used by writers. And if it were, I think it's fine that the hints are not present.

Ryan

On Tue, Aug 12, 2025 at 11:46 AM Ma, Limin <l...@akamai.com.invalid> wrote:

Hi All,

/v1/{prefix}/namespaces/{namespace}/tables/{table}/plan

Response:
{
  "file-scan-tasks": [
    {
      "data-file": {
        "file-path": "string",
        ...
      }
    }
  ]
}

The spec does not indicate “data_file” has 'metadataLocation' property.
But with vanilla Iceberg's local TableScan, DataFile (extends ContentFile) objects have ‘metadataLocation’ populated, which MergingSnapshotProducer’s filterManager uses to efficiently identify the relevant manifest files for filtering. Is there a reason the REST spec does not support this, or will it be considered for support in future versions?

Thanks,
Limin
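
To make the gap discussed in this thread concrete, here is a minimal Python sketch of a client reading a plan response shaped like the one quoted above. Only "file-scan-tasks", "data-file" and "file-path" come from the thread. The file paths, the helper names, and the "metadataLocation" key looked up in the JSON are assumptions for illustration; the REST spec defines no such property. The fallback of considering every manifest when a hint is missing is likewise an assumption about how a writer might cope, not a description of Iceberg's actual behavior.

```python
import json

# Hypothetical plan response shaped like the example in this thread.
# The file paths are made up for illustration.
PLAN_RESPONSE = """
{
  "file-scan-tasks": [
    {"data-file": {"file-path": "s3://bucket/tbl/data/a.parquet"}},
    {"data-file": {"file-path": "s3://bucket/tbl/data/b.parquet"}}
  ]
}
"""


def manifest_hints(plan_json, hint_key="metadataLocation"):
    """Map each planned file path to its manifest hint, or None if absent.

    hint_key is an assumption: the REST spec defines no such property.
    """
    plan = json.loads(plan_json)
    hints = {}
    for task in plan.get("file-scan-tasks", []):
        data_file = task["data-file"]
        hints[data_file["file-path"]] = data_file.get(hint_key)
    return hints


def needs_full_manifest_scan(hints):
    """True if any file lacks a hint (assumed fallback: check all manifests)."""
    return any(hint is None for hint in hints.values())


hints = manifest_hints(PLAN_RESPONSE)
print(needs_full_manifest_scan(hints))
```

Because the hint is treated as optional, a client written this way keeps working whether or not a server ever supplies it, which matches the point above that the value is a hint only.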