Hi all,

I’m sharing an RFC to prepare PyIceberg’s public API for 1.0.0. We’ve
wanted to refine the list of public APIs for some time, but it’s been
difficult because many changes touch user-facing contracts. I believe
the right way to get started is to first agree on the approach for
identifying what we want to expose, then split the work into smaller,
incremental changes that we can complete over time.

At a high level, I'm proposing the following approach:
- Use "__all__" as the single source of truth for each module's curated symbols.
- Classify modules as Intended Public (Full), Intended Public
(Subset), or Internal.
- Roll out with deprecations first and remove deprecation warnings in
the 1.0.0 release (no user-visible breaks during the transition: all
symbols remain importable).
- Add CI guardrails to detect breaking changes.
- Optionally re-export a minimal, discoverable subset from the
top-level pyiceberg module.
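
To make the list above concrete, here is a rough sketch of how the pieces
could fit together. It is illustrative only, not the RFC's implementation:
the module "example_pkg.table" and every symbol in it are hypothetical
stand-ins (none are PyIceberg APIs), and the deprecation shim assumes a
PEP 562 module-level __getattr__, which is one possible mechanism among
several. The last function is a toy version of a CI guardrail that compares
a recorded baseline against the current "__all__".

```python
import types
import warnings

# Build a stand-in module in memory so the sketch runs on its own.
# "example_pkg.table" and all names below are hypothetical, not PyIceberg APIs.
mod = types.ModuleType("example_pkg.table")


def load_table(name):
    """Curated symbol: listed in __all__."""
    return f"table:{name}"


def _parse_identifier(name):
    """Internal helper: underscore-prefixed and not listed in __all__."""
    return tuple(name.split("."))


mod.load_table = load_table
mod._parse_identifier = _parse_identifier
mod.__all__ = ["load_table"]  # single source of truth for the public surface

# Old public name being phased out -> the internal name it now maps to.
_DEPRECATED = {"parse_identifier": "_parse_identifier"}


def _module_getattr(name):
    # PEP 562: invoked only when normal attribute lookup on the module fails,
    # so the old name stays importable but emits a warning.
    if name in _DEPRECATED:
        warnings.warn(
            f"{mod.__name__}.{name} is not part of the public API and is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(mod, _DEPRECATED[name])
    raise AttributeError(f"module {mod.__name__!r} has no attribute {name!r}")


mod.__getattr__ = _module_getattr


def removed_public_names(baseline, module):
    """Toy guardrail: names in a recorded baseline that left module.__all__."""
    return sorted(set(baseline) - set(module.__all__))
```

In a real package the "__all__" list and the __getattr__ function would sit
at the top of the module file itself, and the baseline would be a snapshot
checked into the repository and compared in CI.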

I’m asking for input on the following in this thread:
- Agreement (or objections) on the "__all__"-based explicit public API
declaration approach.
- Agreement on a per-module curation model to kick off the work and
split it into smaller increments.
- Agreement on using the Intended Public (Full) / Intended Public (Subset)
/ Internal classifications at the module level to reach high-level
consensus on the API structure and organize the work.

If the approach looks good, I’ll draft an initial module-level API
classification (a mapping of each top-level module to one of the
public classifications proposed above) and share it in a follow-up
DISCUSS thread to build lazy consensus at the module level. Per-symbol
decisions within each module and its submodules can then be made
through sub-issues/PRs.

RFC Link: 
https://docs.google.com/document/d/1-0-2Wx8saf3EQQW6AyMtPxlBLs7P5SJGsgimLQ4E_1Y/edit?usp=sharing

Best,
Sung Yun
