Thanks Tim and Gamaken! The links and code are helpful.

A follow-up (maybe stupid) question: it seems other Java-based engines
like Presto have their own implementations of Parquet read/write. Is
that because parquet-mr can only deserialize Parquet into specific
formats like Avro/Thrift/Protobuf, while other engines need tight
coupling between Parquet and their in-memory format? They also need
different IO/buffering techniques than parquet-mr uses. If my
understanding is correct, does that mean a unified Parquet
implementation does not exist, and that providing one is not the
purpose of parquet-mr?

Thanks

On Tue, Apr 26, 2022 at 9:37 PM Miller, Tim <[email protected]> wrote:
>
> Also, using the API is a pain, because you have to use Hadoop. Various people 
> have found work-arounds for this, such as:
> Comments on: https://issues.apache.org/jira/browse/PARQUET-1822
>
> I also assembled a minimal reader myself (from code I found elsewhere on 
> github, which I should add attributions for later) which I put here:
> https://github.com/theosib-amazon/parquet-mr-minreader
>
> On 4/25/22, 2:51 PM, "gamaken k" <[email protected]> wrote:
>
>     > wiki on how to use the api
>     +1 to this. I too think this would be very useful for getting started.
>     Xinyu, you could potentially look at parquet-cli's source code to
>     understand how it invokes the various APIs from parquet-mr, I think.
>
>     On Sun, Apr 24, 2022 at 8:29 AM Xinyu Zeng <[email protected]> wrote:
>
>     > Hi,
>     >
>     > I am a previous user of parquet-cpp (now integrated with Arrow) and now
>     > I am going to use the Java version, parquet-mr. However, I did not find
>     > any doc or wiki on how to use the api. I am also interested in
>     > contributing but there is also no contribution guide like other open
>     > source projects. I would appreciate it if someone could give me a
>     > short guide.
>     >
>     > Thanks,
>     > Xinyu
>     >
>
