ArnavBalyan opened a new pull request, #3287:
URL: https://github.com/apache/parquet-java/pull/3287

   ### Summary
    - Currently, parquet-cli fails when operating on Parquet files written through parquet-protobuf.
    - The CLI currently reads with AvroReadSupport and AvroRecordConverter, which fail on these files because the underlying schema and data differ from what the Avro path expects.
    - The CLI reader now supports protobuf-written Parquet files by routing the request to the simple group factory for them.
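
The routing idea in the summary can be sketched as below. This is a hypothetical, self-contained illustration, not the code in this PR: the class `ReaderRouting`, the enum `ReaderKind`, and the method `chooseReader` are invented names, and detecting protobuf-written files through the `parquet.proto.descriptor` / `parquet.proto.class` footer keys is an assumption about how auto-detection might work.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical sketch: pick a read path from a Parquet footer's key/value metadata. */
final class ReaderRouting {

    /** The two read paths mentioned in the summary. */
    enum ReaderKind { AVRO, GROUP }

    // Footer keys written by parquet-protobuf. Treating their presence as the
    // detection signal is an assumption of this sketch, not something the PR states.
    static final String PROTO_DESCRIPTOR_KEY = "parquet.proto.descriptor";
    static final String PROTO_CLASS_KEY = "parquet.proto.class";

    /** Returns GROUP for files that look protobuf-written, AVRO otherwise. */
    static ReaderKind chooseReader(Map<String, String> footerMetadata) {
        boolean writtenByProtobuf =
            footerMetadata.containsKey(PROTO_DESCRIPTOR_KEY)
                || footerMetadata.containsKey(PROTO_CLASS_KEY);
        return writtenByProtobuf ? ReaderKind.GROUP : ReaderKind.AVRO;
    }

    public static void main(String[] args) {
        // A file carrying protobuf metadata is routed to the group-based reader.
        System.out.println(chooseReader(
            Collections.singletonMap(PROTO_CLASS_KEY, "com.example.SomeEvent")));
        // Any other file keeps the existing Avro path.
        System.out.println(chooseReader(
            Collections.singletonMap("parquet.avro.schema", "{}")));
    }
}
```

Keeping the decision in one small function leaves the Avro path as the default, so files without protobuf metadata would be read as before.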
   
   #### - Before:
   ```
   Time elapsed: 1.351 s <<< ERROR!
   java.lang.RuntimeException: Failed on record 0 in file /tmp/junit149783857212573183/proto_someevent.parquet
           at org.apache.parquet.cli.commands.CatCommand.run(CatCommand.java:89)
           at org.apache.parquet.cli.commands.CatCommandTest.testCatCommandProtoParquetAutoDetected(CatCommandTest.java:82)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
           at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
   Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file file:/tmp/junit149783857212573183/proto_someevent.parquet
           at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:280)
           at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136)
           at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:140)
           at org.apache.parquet.cli.BaseCommand$2$1.advance(BaseCommand.java:407)
           at org.apache.parquet.cli.BaseCommand$2$1.<init>(BaseCommand.java:388)
           at org.apache.parquet.cli.BaseCommand$2.iterator(BaseCommand.java:386)
           at org.apache.parquet.cli.commands.CatCommand.run(CatCommand.java:76)
           at org.apache.parquet.cli.commands.CatCommandTest.testCatCommandProtoParquetAutoDetected(CatCommandTest.java:82)
   [INFO] 
   [INFO] Results:
   [INFO] 
   [ERROR] Errors: 
   [ERROR]   CatCommandTest.testCatCommandProtoParquetAutoDetected:82 » Runtime Failed on record 0 in file /tmp/junit149783857212573183/proto_someevent.parquet
   ```
   
   #### - After:
   ```
   [INFO] Running org.apache.parquet.cli.commands...
   repeatedInt: 1
   repeatedInt: 2
   repeatedInt: 3
   ```
   (Successful read)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

