Hi Christine,

It would be great for you to submit your code examples! I think that would
be really helpful for other people as well.

For some things, it might also be a good idea to update the documentation
on the ASF site, iceberg.apache.org. The source for the site is in the
`site` folder on GitHub, in case you think there are missing examples that
would be beneficial to have on the site.

On Tue, Nov 19, 2019 at 5:19 AM Christine Mathiesen
<[email protected]> wrote:

> Hello!
>
> Recently, I’ve been researching Iceberg with the goal of developing some
> simple code exemplifying how to use the Iceberg Java API. The goal was to
> share this internally with developers along with information we’ve gained
> about Iceberg to start discussions on whether we could use Iceberg in our
> systems. On reviewing the documentation and code, we thought this could be
> useful for anyone interested in learning more about Iceberg, so we would
> like to open source it. We noticed that Iceberg has a folder for examples (
> https://github.com/apache/incubator-iceberg/tree/master/examples) - there
> isn’t much there right now but it could be a good location for our examples
> and documentation.
>
> Our project is currently structured as many small JUnit tests that target
> the different functionality of Iceberg (such as the reading/writing of
> partitioned/unpartitioned tables, schema evolution, time travel, etc.). We
> went for this approach so we could use it as a sort of quickstart guide to
> using Iceberg with different use cases in mind.
>
> The code we have currently focuses mainly on using HadoopTables with Spark
> (in Java) and contains tests that follow this sort of pattern:
>
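> @Test
> public void writeToTableFromFile() {
>   // Load the sample records from a JSON file into a DataFrame.
>   Dataset<Row> df = spark.read().json(dataLocation + "/employees.json");
>
>   // Append the selected columns to the Iceberg table at tableLocation.
>   df.select("name", "salary").write()
>       .format("iceberg")
>       .mode("append")
>       .save(tableLocation.toString());
>
>   // Pick up the snapshot created by the write.
>   table.refresh();
>
>   // Note: the view is registered on the source DataFrame, so the query
>   // below counts the input rows rather than reading the Iceberg table.
>   df.createOrReplaceTempView("table");
>
>   Dataset<Row> sqlDF = spark.sql("select * from table");
>   // JUnit's assertEquals takes the expected value first.
>   assertEquals(10, sqlDF.count());
> }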
>
> Could the developers on the project let us know if they think the above
> would be a useful contribution and if so, what the next steps would be?
> We’re happy to answer any questions and provide more info etc.
>
> Thank you and all the best,
>
> *Christine Mathiesen*
> Software Development Intern
> BDP – Hotels.com
> Expedia Group


-- 
Ryan Blue
Software Engineer
Netflix
