Thank you, Felix. Could you share a minimal example of how you ran the benchmark? I saw the code on the ticket, but it would be better to create a new repository in which the benchmark runs end to end.
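To make the request concrete, an end-to-end harness could look roughly like the sketch below. It is only an illustration under stated assumptions: `BenchmarkSketch`, `RowSerializer`, and the toy text serializer are hypothetical names that are not taken from the ticket, and no Avro, JSON, or Parquet API is called. The real writers would be plugged in behind the interface. The sketch runs a warm-up pass before timing and repeats the measurement for several row counts.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class BenchmarkSketch {

    /** Hypothetical abstraction: serializes a whole batch of rows and returns the bytes. */
    interface RowSerializer {
        byte[] write(List<String[]> rows) throws IOException;
    }

    /** Toy stand-in serializer (assumption: NOT Avro, JSON, or Parquet). */
    static final RowSerializer TOY_TEXT = rows -> {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String[] row : rows) {
            out.write(String.join(",", row).getBytes(StandardCharsets.UTF_8));
            out.write('\n');
        }
        return out.toByteArray();
    };

    /** Builds synthetic rows so that every serializer sees identical input. */
    static List<String[]> makeRows(int count) {
        List<String[]> rows = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            rows.add(new String[] {"id-" + i, "value-" + i});
        }
        return rows;
    }

    /** Returns the average nanoseconds per run, measured after unmeasured warm-up runs. */
    static long timeNanos(RowSerializer serializer, List<String[]> rows,
                          int warmupRuns, int measuredRuns) throws IOException {
        for (int i = 0; i < warmupRuns; i++) {
            serializer.write(rows); // lets the JIT compile the hot path first
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            serializer.write(rows);
        }
        return (System.nanoTime() - start) / measuredRuns;
    }

    public static void main(String[] args) throws IOException {
        // Vary the batch size: single rows versus batches of thousands of rows.
        for (int n : new int[] {1, 1_000, 100_000}) {
            long nanos = timeNanos(TOY_TEXT, makeRows(n), 5, 20);
            System.out.printf("rows=%d avg=%d ns per run, %d ns per row%n", n, nanos, nanos / n);
        }
    }
}
```

Reporting the time per row alongside the time per run makes it easier to see whether a fixed setup cost or the per-row cost dominates for each format.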
I'm also curious how you measured the Avro performance. When writing a specific record, Avro should be faster than JSON. The result also depends heavily on how many rows you write: Avro and JSON do much better when writing single rows, whereas Parquet should excel when writing batches of thousands of rows.

Cheers, Fokko

On Fri, 18 Oct 2019 at 19:33, Kizhakkel Jose, Felix <[email protected]> wrote:

> Hello,
>
> I am from Philips Architecture team, where I am working on a POC to
> compare different data models [ Parquet/Avro/Json]. But I see Parquet is
> very slow while writing [pojo to Parquet file].
>
> I have created two issues in Parquet project. One is regarding the
> slowness of ParquetWriter compared to JSON and AvroWriter :
>
> Avro Serialization Stats: StopWatch 'AvroSerializer': running time
> (millis) = 387
> JSON serialization Stats: StopWatch 'JsonSerializer': running time
> (millis) = 103
> Parquet Serialization Stats: StopWatch 'ParquetSerializer': running time
> (millis) = 8346
>
> https://issues.apache.org/jira/browse/PARQUET-1680
>
> Second issue is I was not able to serialize a Java object to Parquet when
> the pojo has a UUID field. Parquet is throwing exception.
> https://issues.apache.org/jira/browse/PARQUET-1679
>
> Could you please help me on what I am doing wrong or give me some insights
> on resolving the issue.
>
> Regards,
> Felix K Jose
