ExpandingMan opened a new issue, #35797:
URL: https://github.com/apache/arrow/issues/35797

   ### Describe the usage question you have. Please include as many useful 
details as possible.
   
   
   Hi, I maintain [Parquet2.jl](https://gitlab.com/ExpandingMan/Parquet2.jl) 
and [Thrift2.jl](https://gitlab.com/ExpandingMan/Thrift2.jl).
   
   I have recently re-implemented the Thrift protocol in Julia (Thrift2.jl) 
because the older implementation, Thrift.jl, was extremely slow.  Currently, my 
output from Thrift2.jl is read properly by Thrift.jl, Thrift2.jl and 
fastparquet (all of which have completely separate read implementations), but 
`pyarrow` fails with the following error:
   ```
   ERROR: Python: OSError: Could not open Parquet input source '<Buffer>': 
Couldn't deserialize thrift: TProtocolException: Invalid data
   
   Python stacktrace:
    [1] pyarrow.lib.check_status
      @ pyarrow/error.pxi:115
    [2] pyarrow.lib.pyarrow_internal_check_status
      @ pyarrow/error.pxi:144
    [3] pyarrow._dataset.Fragment.physical_schema.__get__
      @ pyarrow/_dataset.pyx:1345
   ```
   
   Unfortunately this error is *extremely* opaque, so it's very hard for me to 
figure out what's going on.  I was wondering if anyone could offer any 
suggestions on how to debug it.  Thanks.
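
One possible way to narrow this down is to isolate the exact bytes that fail to deserialize. The sketch below is not taken from pyarrow or Parquet2.jl. It assumes an unencrypted file with the standard Parquet layout, where the file ends with the Thrift-encoded `FileMetaData`, a 4-byte little-endian footer length, and the `PAR1` magic. The helper name `extract_footer` is hypothetical.

```python
import struct

MAGIC = b"PAR1"  # magic bytes at both ends of an unencrypted Parquet file


def extract_footer(data: bytes) -> bytes:
    """Return the raw Thrift-encoded FileMetaData bytes of a Parquet file image.

    Assumed layout: PAR1 | ... | footer | footer_len (uint32 LE) | PAR1
    """
    if len(data) < 12 or data[:4] != MAGIC or data[-4:] != MAGIC:
        raise ValueError("missing PAR1 magic at start or end of buffer")
    # The 4 bytes just before the trailing magic hold the footer length.
    (footer_len,) = struct.unpack("<I", data[-8:-4])
    start = len(data) - 8 - footer_len
    if start < 4:
        raise ValueError("footer length exceeds buffer size")
    return data[start:len(data) - 8]


# Usage (the file name is a placeholder):
#     with open("out.parquet", "rb") as f:
#         print(extract_footer(f.read()).hex(" "))
```

Dumping the footer of the failing file next to the footer of a file with the same schema written by another writer may show where the two encodings diverge.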
   
   ### Component(s)
   
   Parquet


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
