Antoine Gelloz created ARROW-15544:
--------------------------------------

             Summary: [Go][Parquet] pqarrow.getOriginSchema error while decoding ARROW:schema
                 Key: ARROW-15544
                 URL: https://issues.apache.org/jira/browse/ARROW-15544
             Project: Apache Arrow
          Issue Type: Bug
          Components: Go, Parquet
    Affects Versions: 7.0.0
         Environment: go1.17, python3.8
            Reporter: Antoine Gelloz


Hello !

This is my first time participating in the open source community as a junior 
developer and I would like to thank you all for your hard work :)

While using the new pqarrow package for our project 
[Metronlab/bow|https://github.com/Metronlab/bow] to read parquet files 
previously written by Pandas, an error is returned by the function 
getOriginSchema if the "ARROW:schema" base64-encoded value ends with padding 
characters.
This is caused by the use of the 
[RawStdEncoding|https://pkg.go.dev/encoding/base64#pkg-variables] type that 
omits padding characters.
Is there any reason for using raw encoding instead of standard?

Here is a repo with a test script to demonstrate the problem: 
[antoinegelloz/arrowparquet|https://github.com/antoinegelloz/arrowparquet]

Thank you in advance for your help,

Antoine Gelloz



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
