Hey Sam,

I'm not sure I fully understand the scenario you're describing, but
the basic concept of relative paths is that you have a table location
(provided by a catalog) and files are resolved relative to that table
location.

Some examples are provided in the spec
<https://iceberg.apache.org/spec/#path-resolution>.
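
As a rough illustration of that idea (a sketch only, not the reference
implementation and not a restatement of the spec rules): the snippet below
assumes a stored path that starts with a URI scheme is already absolute and
that anything else is joined to the table location. The helper name
`resolve_path` and the bucket names are hypothetical.

```python
import re

# Assumption: a stored path beginning with a URI scheme (e.g. "s3://") is absolute.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def resolve_path(table_location: str, stored_path: str) -> str:
    """Resolve a path stored in metadata against the catalog-provided table location."""
    if _SCHEME.match(stored_path):
        # Already absolute under the assumption above: return unchanged.
        return stored_path
    # Relative: join to the table location with exactly one separator.
    return table_location.rstrip("/") + "/" + stored_path.lstrip("/")


# The same stored path resolves differently depending on which table
# location the catalog hands out (hypothetical primary and replica buckets).
primary = resolve_path("s3://bucket-us-east/db/tbl", "data/00000-0.parquet")
replica = resolve_path("s3://bucket-eu-west/db/tbl", "data/00000-0.parquet")
```

The only input that changes between regions here is the table location, which
is why routing is left to the catalog.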


   1. What is the intended use case for relative paths in Iceberg? Is it
   designed primarily for DR/replication scenarios?  What about real-time
   replication?

The design accommodates DR/replication with proper catalog implementations
to route or provide the table location.  The act of replicating the files
is left out of the spec, but can be realtime depending on the
implementation.

   2. At what point can a manifest or data file's relative path be resolved
   to an absolute path? Does the current design assume all referenced data is
   already available locally?

Paths are resolved when they're read out of manifests.  If you have a
reference in metadata to a file, it should exist or readers will fail when
fetching the file.  By the time you perform a commit operation, it must be
referenceable.

   3. In FileIO, newInputFile(String path) takes a raw path string. Is
   there a planned mechanism to provide additional metadata (like sequence
   context) to help resolve paths in more complex topologies?

A writer can construct paths in any way they want. Reference implementation
behaviors are described in the appendix section, but there's no requirement
for how they're constructed.  Relative path support is still being added to
the reference implementation, but path construction is largely the
responsibility of LocationProvider.  The path logic focuses on resolving or
relativizing paths, not constructing them.

-Dan


On Mon, Jun 1, 2026 at 12:28 PM samuel pacheco cantu via dev <
[email protected]> wrote:

> Hello everyone,
>
> I have a question about relative-path resolution in the context of
> multi-region replication.
>
> *Context:* We have a use case where data files may reside in different
> storage locations depending on the replication state. To resolve a relative
> path, we'd need additional context (e.g., the commit's sequence-id) to
> determine which region/scheme a given file should resolve to.
>
> We are actually thinking about swapping the absolute path scheme while we
> wait for relative-path support.  We plan to do this at the FileIO layer
> when requesting new input files.
>
> The problem with doing the swap in FileIO is that there are raw string
> path calls without any context to make a routing decision.  I would
> expect the same problem to occur here for relative paths, where there
> isn't enough context to determine the scheme.  The same argument can be
> made that we require even more metadata to support more complicated
> use cases, such as sequence-id (and/or data-sequence-id).
>
>
> *Questions:*
>
>    1. What is the intended use case for relative paths in Iceberg? Is it
>    designed primarily for DR/replication scenarios?  What about real-time
>    replication?
>    2. At what point can a manifest or data file's relative path be
>    resolved to an absolute path? Does the current design assume all referenced
>    data is already available locally?
>    3. In FileIO, newInputFile(String path) takes a raw path string. Is
>    there a planned mechanism to provide additional metadata (like sequence
>    context) to help resolve paths in more complex topologies?
>
> We'd like to understand Iceberg's direction on relative-path resolution so
> we can align our approach with the community rather than diverging.
>
>
> Thanks,
> Sam
>
>
