Also, using the API is a pain because you have to use Hadoop. Various people 
have found workarounds for this; see, for example, the comments on:
https://issues.apache.org/jira/browse/PARQUET-1822

I also assembled a minimal reader myself (from code I found elsewhere on 
GitHub, which I should add attributions for later), which I put here:
https://github.com/theosib-amazon/parquet-mr-minreader

On 4/25/22, 2:51 PM, "gamaken k" <[email protected]> wrote:

    > wiki on how to use the api
    +1 to this. I too think this would be very useful for getting started.
    Xinyu, you could potentially look at parquet-cli's source code to
    understand how it invokes the various APIs from parquet-mr, I think.

    On Sun, Apr 24, 2022 at 8:29 AM Xinyu Zeng <[email protected]> wrote:

    > Hi,
    >
    > I am a previous user of parquet-cpp (now integrated with Arrow) and now
    > I am going to use the Java version, parquet-mr. However, I did not find
    > any doc or wiki on how to use the API. I am also interested in
    > contributing, but there is also no contribution guide like other open
    > source projects have. I would appreciate it if someone could give me a
    > short guide.
    >
    > Thanks,
    > Xinyu
    >
