Hi, everyone, I'm Tao. I'm currently working on a commercial streaming system 
that is written in Rust.

Actually, I'm planning to implement an Iceberg Rust SDK so that we can have 
better integration with the existing Iceberg ecosystem. Initially I found 
https://github.com/oliverdaff/iceberg-rs, but it appears the author hasn't been 
active lately. So I'm looking to see if the Iceberg community has any consensus 
on a Rust/C++ SDK (Rust is preferable), and if there is, we'd love to 
contribute. I believe that as Iceberg grows in popularity, more systems will 
eventually want such libraries. There may even be ongoing efforts that have 
not yet been discussed with the community.

Additionally, I think the initial Rust/C++ SDK could support only the reader 
and writer sides of Iceberg, because plenty of JVM-based query engines already 
take care of data maintenance. We don't have to rewrite every corner of 
Iceberg in Rust, which means less engineering work.

On 2022/06/08 10:16:05 OpenInx wrote:
> As a cloud-native table format standard for the big-data ecosystem,  I
> believe supporting multiple languages is the correct direction so that
> different languages can connect to the apache iceberg table format.
> 
> But I can also see Kyle's point about lacking enough resources (developers
> and reviewers) to accomplish this goal. In my mind, Python, Golang, C++,
> and Rust can all be regarded as native language support. We may
> just need to support the Rust SDK, and then all of the other languages can
> just wrap the Rust SDK to access the table format.
> 
> Anyway, we will need to wait for the REST catalog to be finished before we
> introduce support for other languages, because we cannot access the Iceberg
> table by invoking the JVM catalog interfaces.
> 
> On Tue, Jun 7, 2022 at 4:41 AM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
> 
> > There’s also the question of how useful this would be in practice given
> >> the complexity of using C++ (or Rust etc) within some of the major
> >> frameworks.
> >>
> >
> > One place this would be useful is for the Arrow's DataSet API [1].  An
> > option the Arrow community might be open to is hosting parts of the code
> > there (this is what is done for Apache Parquet C++).  This helps shape some
> > of the answers to other questions posed (ORC and Parquet are already in the
> > Repo, it provides a Filesystem interface, etc).  The project doesn't
> > currently consume Avro, and I think the preferred approach is to make a
> > clean room Avro parser.  But I agree this is a non-trivial effort to get
> > underway.
> >
> > Another area to consider is compatibility testing.  I think before a third
> > officially supported community library is introduced it would be good to
> > have a compatibility framework in place to make sure implementations are
> > all interpreting the specification correctly.  If there isn't already an
> > effort here, I'd like to start contributing something (I will probably have
> > bandwidth sometime in Q3).
> >
> > Thanks,
> > -Micah
> >
> >
> > [1] https://arrow.apache.org/docs/cpp/dataset.html
> >
> > On Sun, Jun 5, 2022 at 11:07 PM Kyle Bendickson <k...@tabular.io> wrote:
> >
> >> Hi caneGuy,
> >>
> >> I personally don’t dislike this idea. I understand the performance
> >> benefits.
> >>
> >> But this would be a huge undertaking for the community. We’d need to
> >> ensure we had sufficient developer support for reviews (likely one of the
> >> biggest issues), as well as a number of other things. Particularly
> >> dependencies, package management, etc. We’d also need to scope support down
> >> to specific OS / compilers etc.
> >>
> >> We’d also need to be sure we had adequate developer support from a wide
> >> enough range of the community to support the project long term. One issue
> >> in open source is that developers will work on something tangential to
> >> their project in another repository, but nobody is available to maintain 
> >> it.
> >>
> >> There’s also the question of how useful this would be in practice given
> >> the complexity of using C++ (or Rust etc) within some of the major
> >> frameworks.
> >>
> >> Again, I’m not opposed to the idea but just trying to be realistic about
> >> the realities of such an undertaking. It would need full community support
> >> (or at least support from enough community members to be sustainable).
> >>
> >> If you wanted to make a design doc, the milestones tab in the Iceberg
> >> project has some that you might use as reference.
> >>
> >> *I highly suggest you come to the next community sync and bring this up
> >> to the community then.*
> >>
> >> If you’re not already on the invite list for the monthly community sync,
> >> you can get on it by joining the Google group. You’ll receive invites when
> >> they go out:
> >> https://groups.google.com/g/iceberg-sync
> >>
> >> Looking forward to seeing you at the next community sync.
> >>
> >> A design document and/or any prior art would be very helpful as the
> >> community sync does discuss many topics (possibly there is existing C++
> >> support in StarRocks for Iceberg V1?).
> >>
> >> Thank you,
> >> Kyle Bendickson
> >> GitHub: kbendick
> >>
> >> On Sun, Jun 5, 2022 at 10:44 PM Sam Redai <s...@tabular.io> wrote:
> >>
> >>> Currently there is no existing effort to develop a C++ package. That
> >>> being said I think it would be awesome to have one! If anyone is willing 
> >>> to
> >>> start that development effort, I can help with some of the ground work to
> >>> kickstart it.
> >>>
> >>> I would say the first step would be for someone to prepare a high-level
> >>> proposal.
> >>>
> >>> -Sam
> >>>
> >>> On Sun, Jun 5, 2022 at 11:02 PM 周康 <zhoukang199...@gmail.com> wrote:
> >>>
> >>>> Hi team
> >>>> I am a dev from StarRocks community, and we have supported iceberg v1
> >>>> format.
> >>>> We are also planning to support v2 format. If there is a C++ package,
> >>>> it will be very convenient for our implementation.
> >>>> At the same time, other C++ computing engines could also support the v2
> >>>> format faster.
> >>>>
> >>>> Do we have plans to support c++ version sdk?
> >>>> --
> >>>> caneGuy
> >>>>
> >>> --
> >>>
> >>> Sam Redai <s...@tabular.io>
> >>>
> >>> Developer Advocate  |  Tabular <https://tabular.io/>
> >>>
> >>> c (267) 226-8606
> >>>
> >>
> 
