Hi everyone, I'm starting this discussion thread about the optional third-party dependencies we currently maintain in PyIceberg to support to_*() conversion methods (e.g., to_daft(), to_polars(), to_pandas(), to_ray(), to_duckdb(), etc.).
While this integration provides a great user experience by offering seamless conversions, it creates some maintenance challenges for the PyIceberg project:

- Maintenance burden: We're responsible for ensuring compatibility with all these external libraries and any future additions.
- Version conflicts: Some tools have specific PyArrow version requirements that can conflict with PyIceberg's needs. For example, Bodo currently pins PyArrow to version 19.0, which could potentially block us from adopting newer PyArrow features (e.g., UUID support in 21.0).
- Dependency management complexity: Managing compatibility across multiple external libraries adds complexity to our release cycle.

IMHO, rather than PyIceberg maintaining integrations with external libraries, perhaps these libraries should implement their own PyIceberg support.

I'd love to hear the community's thoughts on this approach. Has anyone else encountered similar challenges, or are there benefits to the current model that I might be overlooking?

André Anastácio