Iceberg has a Java library, which is currently the most complete implementation of the spec (compared to implementations in other languages such as Python and Rust). You can certainly use the Java library directly to write and commit data to Iceberg, but you would likely need to implement quite a bit of code for things that engines like Spark, Flink, Trino, etc. already take care of.
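If you do want to try the library directly, a rough sketch might look like the snippet below. This is an illustration, not a recommendation: it assumes iceberg-core, iceberg-parquet, and Hadoop are on the classpath, and the warehouse path and `demo.events` table name are made up. For S3 you would swap the local path for an S3 location and configure an appropriate FileIO/catalog.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.Schema;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.data.GenericRecord;
import org.apache.iceberg.data.parquet.GenericParquetWriter;
import org.apache.iceberg.hadoop.HadoopCatalog;
import org.apache.iceberg.io.DataWriter;
import org.apache.iceberg.io.OutputFile;
import org.apache.iceberg.parquet.Parquet;
import org.apache.iceberg.types.Types;

public class IcebergAppendSketch {
  public static void main(String[] args) throws Exception {
    Schema schema = new Schema(
        Types.NestedField.required(1, "id", Types.LongType.get()),
        Types.NestedField.optional(2, "payload", Types.StringType.get()));

    // HadoopCatalog keeps table metadata under a filesystem path; fine for a demo.
    // (Path is hypothetical.)
    HadoopCatalog catalog = new HadoopCatalog(new Configuration(), "/tmp/warehouse");
    Table table = catalog.createTable(
        TableIdentifier.of("demo", "events"), schema, PartitionSpec.unpartitioned());

    // Write one Parquet data file by hand -- this is the kind of work an
    // engine like Spark or Flink normally does for you.
    OutputFile out = table.io().newOutputFile(
        table.locationProvider().newDataLocation("data-0.parquet"));
    DataWriter<GenericRecord> writer = Parquet.writeData(out)
        .schema(schema)
        .createWriterFunc(GenericParquetWriter::buildWriter)
        .withSpec(PartitionSpec.unpartitioned())
        .build();
    GenericRecord rec = GenericRecord.create(schema);
    writer.write(rec.copy("id", 1L, "payload", "hello"));
    writer.close();

    // Commit the file to the table as a single atomic snapshot.
    table.newAppend().appendFile(writer.toDataFile()).commit();
    catalog.close();
  }
}
```

Even this small example glosses over partitioning, file sizing, schema evolution, and retries on commit conflicts, which is exactly the code the engines already handle.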
You also mentioned lots of data. You would need a distributed engine (like Spark) to handle larger scales; re-implementing your own distributed engine is likely not a good investment of your time.

On Thu, May 16, 2024 at 10:54 AM John D. Ament <johndam...@apache.org> wrote:

> Completely naive question since I'm not familiar at all with the
> technologies. I wanted to demonstrate using Iceberg files as a way to
> ingest lots of data and persist it to S3. It seems like it can do this,
> but I have a feeling I need tools like Spark to do it. Is that true? Or
> can I hook it up to just a generic REST API that loads the data in?
>
> I have a feeling Spark will be part of the final solution, but I don't
> want to get into that complexity quite yet, and it's not clear from the
> docs.
>
> Also apologies for posting this to your dev@ list. No user(s)@ list.
>
> John