Antoine Gelloz created ARROW-15544:
--------------------------------------
Summary: [Go][Parquet] pqarrow.getOriginSchema error while
decoding ARROW:schema
Key: ARROW-15544
URL: https://issues.apache.org/jira/browse/ARROW-15544
Project: Apache Arrow
Issue Type: Bug
Components: Go, Parquet
Affects Versions: 7.0.0
Environment: go1.17, python3.8
Reporter: Antoine Gelloz
Hello!
This is my first time participating in the open source community as a junior
developer and I would like to thank you all for your hard work :)
While using the new pqarrow package in our project
[Metronlab/bow|https://github.com/Metronlab/bow] to read Parquet files
previously written by Pandas, we found that the function getOriginSchema
returns an error if the base64-encoded "ARROW:schema" value ends with
padding characters.
This is caused by the use of the
[RawStdEncoding|https://pkg.go.dev/encoding/base64#pkg-variables] encoding,
which omits padding characters when encoding and rejects them when decoding.
Is there any reason for using raw encoding instead of standard?
Here is a repo with a test script to demonstrate the problem:
[antoinegelloz/arrowparquet|https://github.com/antoinegelloz/arrowparquet]
Thank you in advance for your help,
Antoine Gelloz