wgtmac commented on PR #35825:
URL: https://github.com/apache/arrow/pull/35825#issuecomment-1594105093

   Thanks for bearing with me. IMO, the test needs to be improved to cover more 
cases:
   - Data type: at least `string` and `list<string>` need to be covered.
   - Encoding: both dictionary-encoded and plain-encoded data. Please make sure 
both values of `ArrowReaderProperties.read_dictionary` are tested. For example, 
we need to make sure dictionary-encoded values can be read as either 
dictionary-encoded or decoded Arrow arrays. The same applies to the 
plain-encoded case.
   - Overflow: read both overflow and non-overflow cases with 
`use_large_binary_variant` = true. It would be good to also add a test to make 
sure it throws in the overflow case when `use_large_binary_variant` = false.
   
   If building a round-trip test is not that easy, we can add a test file to 
parquet-testing.
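
   For reference, the requested coverage can be enumerated as a small matrix. 
The following is only a rough Python sketch, not the actual C++ test: the names 
`build_test_matrix`, `overflows_int32_offsets` and the `"reads"`/`"throws"` 
labels are hypothetical, and it only assumes that overflow means the total value 
bytes no longer fit in the 32-bit offsets of a regular Arrow `string`/`binary` 
array.

```python
from itertools import product

# Largest value an int32 offset can hold. Assumption: implementations may
# enforce a slightly lower practical limit than this format-level bound.
INT32_MAX = 2**31 - 1

DATA_TYPES = ["string", "list<string>"]
ENCODINGS = ["dictionary", "plain"]
READ_DICTIONARY = [True, False]

# (use_large_binary_variant, overflow, expected outcome), taken from the
# three read scenarios listed in the comment above.
READ_SCENARIOS = [
    (True, True, "reads"),
    (True, False, "reads"),
    (False, True, "throws"),
]


def overflows_int32_offsets(value_lengths):
    """Return True if the total byte length exceeds what int32 offsets can address."""
    return sum(value_lengths) > INT32_MAX


def build_test_matrix():
    """Enumerate every combination of data type, encoding, and read setting."""
    cases = []
    for dtype, encoding, read_dict, scenario in product(
        DATA_TYPES, ENCODINGS, READ_DICTIONARY, READ_SCENARIOS
    ):
        use_large, overflow, expected = scenario
        cases.append(
            {
                "data_type": dtype,
                "encoding": encoding,
                "read_dictionary": read_dict,
                "use_large_binary_variant": use_large,
                "overflow": overflow,
                "expected": expected,
            }
        )
    return cases


if __name__ == "__main__":
    for case in build_test_matrix():
        print(case)
```

   Each entry would map to one parameterized test case; whether the input comes 
from a round-trip write or from a file in parquet-testing is independent of 
this enumeration.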
   

