Hi,

Thanks for reaching out! I have seen similar questions several times,
and it is a shame that we don't have a cookbook with concrete examples
yet. This is something that we need to add to the official Parquet
website.

Back to your question: I think the parquet-cli command-line tool is
exactly what you want:
https://github.com/apache/parquet-java/tree/master/parquet-cli
It has a sub-command `convert-csv` to convert a CSV file to Parquet. The
tool is also available via Homebrew:
https://formulae.brew.sh/formula/parquet-cli
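
As a rough sketch of what that could look like (not an official recipe):
the snippet below writes a tiny CSV file and converts it. The file names
sample.csv and sample.parquet are made up for illustration, and I am
assuming the launcher is installed as `parquet` and that the output file
is given with `-o`; please check `parquet help convert-csv` for the exact
options in your version.

```shell
# Create a tiny CSV file to convert (sample.csv is a made-up name).
cat > sample.csv <<'EOF'
id,name,score
1,alice,9.5
2,bob,7.25
EOF

# Run the conversion only if the parquet-cli launcher is on the PATH,
# e.g. after `brew install parquet-cli`.
if command -v parquet >/dev/null 2>&1; then
  # -o names the output file (assumed flag; verify with `parquet help convert-csv`).
  parquet convert-csv sample.csv -o sample.parquet
  # Print the schema of the resulting file.
  parquet schema sample.parquet
else
  echo "parquet-cli not found; install it first" >&2
fi
```

If no schema is supplied, the column types are taken from the CSV data, so
it is worth inspecting the printed schema before relying on the output.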

Best,
Gang

On Wed, Dec 4, 2024 at 7:38 AM Javafanboy <javafan...@gmail.com> wrote:

> I would like to produce Parquet files from a "plain Java command line
> program" *without any framework like Spark or Hadoop* *or serialization
> tools like Protocol Buffers* - i.e. produce Parquet mainly from arrays of
> primitive Java types, with as good a compression ratio and performance as
> I can manage.
>
> I have played around with a very nice library called Parquet-Carpet
> <https://github.com/jerolba/parquet-carpet> that makes it very simple to
> produce Parquet without frameworks in Java (it builds on the Apache
> Parquet libraries), but it has some limitations that make me curious to
> investigate whether I could use the Apache Parquet Java libraries
> directly. However, I am having a hard time finding resources explaining
> how to get started.
>
> Are there any "HelloParquet" examples that one could start from etc?
>
