ahmb84 opened a new issue, #14940:
URL: https://github.com/apache/arrow/issues/14940

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Hello, I am trying to use the Go Arrow library to encrypt and decrypt 
Parquet files.
   Unfortunately, it seems that the values of encrypted columns are not written 
to the Parquet file when calling the WriteBatch function.
   I have written a test function that writes and then reads a Parquet file. 
The function works correctly without encryption, but when I enable 
encryption, the values for the encrypted columns are missing. Two 
observations confirm this:
   - The encrypted file is smaller than the unencrypted one.
   - When I read the encrypted file, reader.HasNext always returns false, and 
the number of rows in the row group is 0.
   
   After noticing this behavior and testing different configurations 
(serial/buffer, encoding, compression), I looked at the test suite and 
found what I believe is a bug.
   
   I ran the test suite and all the tests pass. Decryption works 
correctly for the files coming from the repo 
https://github.com/apache/parquet-testing, but for the file created by the 
preceding test (the encryption one), the test is silently failing.
   The number of rows is zero, even though it should be 50. The test checks 
that the number of rows read equals the number of rows in the stats, but it 
does not check that there are any rows at all, so 0 == 0 passes.
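
   To illustrate the gap, here is a minimal sketch of the stricter check. 
`verifyRowCounts` and its parameters are hypothetical names used only for 
illustration; they are not part of the Arrow test suite, and the sketch does 
not call the Parquet reader itself.

```go
package main

import "fmt"

// verifyRowCounts is a hypothetical helper, not part of the Arrow test suite.
// rowsRead is the number of rows the reader returned, rowsInMetadata is the
// row count reported by the row group metadata, and expected is the number of
// rows the writer was asked to write.
func verifyRowCounts(rowsRead, rowsInMetadata, expected int64) error {
	// On its own, this comparison passes vacuously when both sides are 0.
	if rowsRead != rowsInMetadata {
		return fmt.Errorf("read %d rows, metadata reports %d", rowsRead, rowsInMetadata)
	}
	// The additional check: the file must contain the rows that were written.
	if rowsRead != expected {
		return fmt.Errorf("read %d rows, expected %d", rowsRead, expected)
	}
	return nil
}
```

   With this check, `verifyRowCounts(0, 0, 50)` returns an error, whereas 
comparing only the first two values would accept the empty file.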
   
   I suspect that something goes wrong with the flush when encryption is enabled.
   
   I have forked the repo and updated the test to include a check for the 
number of rows. You can find the forked repo here => 
https://github.com/ahmb84/arrow/tree/parquet-go-add-test-for-number-of-rows-in-decryption
   
   Let me know if my assumptions are correct or if I am missing anything. Thank 
you!
   
   ### Component(s)
   
   Go, Parquet


