Thanks, Tim and Gamaken! The links and code are helpful. A follow-up (maybe stupid) question: it seems other Java-based engines like Presto have their own implementations of Parquet read/write. Is that because parquet-mr can only deserialize Parquet into specific formats like Avro/Thrift/Protobuf, while other engines need tight coupling between Parquet and their in-memory format, and also need different I/O and buffering techniques than parquet-mr offers? If my understanding is correct, does that mean a unified Parquet implementation does not exist, and that providing one is not the purpose of parquet-mr?

Thanks

On Tue, Apr 26, 2022 at 9:37 PM Miller, Tim <[email protected]> wrote:
>
> Also, using the API is a pain, because you have to use Hadoop. Various people
> have found work-arounds for this, such as:
> Comments on: https://issues.apache.org/jira/browse/PARQUET-1822
>
> I also assembled a minimal reader myself (from code I found elsewhere on
> github, which I should add attributions for later) which I put here:
> https://github.com/theosib-amazon/parquet-mr-minreader
>
> On 4/25/22, 2:51 PM, "gamaken k" <[email protected]> wrote:
>
> > wiki on how to use the api
> +1 to this. I too think this would be very useful for getting started.
> Xinyu, you could potentially look at parquet-cli's source code to
> understand how it invokes the various APIs from parquet-mr, I think.
>
> On Sun, Apr 24, 2022 at 8:29 AM Xinyu Zeng <[email protected]> wrote:
>
> > Hi,
> >
> > I am a previous user of parquet-cpp(now integrated with arrow) and now
> > I am going to use the java version parquet-mr. However, I did not find
> > any doc or wiki on how to use the api. I am also interested in
> > contributing but there is also no contribution guide like other open
> > source projects. I would appreciate it if someone could give me a
> > short guide.
> >
> > Thanks,
> > Xinyu
