frazar commented on PR #38360:
URL: https://github.com/apache/arrow/pull/38360#issuecomment-1786107127

   Added even more logs [in this branch](https://github.com/frazar/arrow/tree/parquet/python-support-crc-NEW-logs), and got a surprising result:
   
   ```
   dataset: read_options_args: {'coerce_int96_timestamp_unit': None} scan_args {'pre_buffer': True, 'thrift_string_size_limit': None, 'thrift_container_size_limit': None, 'page_checksum_verification': True}
   read_options: None
   default_fragment_scan_options: None
   build scanOptions with  {'pre_buffer': True, 'thrift_string_size_limit': None, 'thrift_container_size_limit': None, 'page_checksum_verification': True}
   set_page_checksum_verification() called with: check_crc=true     <--- Here the C++ setter is called with argument true
   page_checksum_verification_ is now: true
   page_checksum_verification() called, returning: false            <--- Here the C++ getter is called, but returns false!
   Open with crc: false
   current page type is: 2, isset crc is: true
   page_checksum_verification() called, returning: false
   properties_.page_checksum_verification(): false
   current_page_header_.__isset.crc: true
   PageCanUseChecksum(page_type): true
   page_checksum_verification() called, returning: false
   current page type is: 0, isset crc is: true
   page_checksum_verification() called, returning: false
   properties_.page_checksum_verification(): false
   current_page_header_.__isset.crc: true
   PageCanUseChecksum(page_type): true
   page_checksum_verification() called, returning: false
   ```
   
   I see two possible explanations:
   - Either something is zeroing `page_checksum_verification_` without going through the setter,
   - Or the setter and the getter are being called on two different instances.
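
To illustrate the second explanation, here is a minimal, self-contained sketch of how a setter can report `true` while a later getter still returns `false`. It is not the actual Arrow/Parquet code: `Props`, `Options`, `props_by_value()` and `props_by_ref()` are hypothetical names invented for this example, and only the setter/getter names mirror the ones in the logs. The sketch assumes the properties object is handed out by value somewhere along the way, so the setter runs on a temporary copy. Logging `this` in both methods would tell the two explanations apart: identical addresses point to something overwriting the member, different addresses point to two instances.

```cpp
#include <iostream>

// Hypothetical stand-in for a reader-properties class (NOT the real Parquet class).
class Props {
 public:
  void set_page_checksum_verification(bool check_crc) {
    page_checksum_verification_ = check_crc;
    // Logging `this` shows which instance the setter acted on.
    std::cerr << "setter on " << static_cast<const void*>(this)
              << ", now: " << std::boolalpha << page_checksum_verification_ << "\n";
  }

  bool page_checksum_verification() const {
    std::cerr << "getter on " << static_cast<const void*>(this)
              << ", returning: " << std::boolalpha << page_checksum_verification_ << "\n";
    return page_checksum_verification_;
  }

 private:
  bool page_checksum_verification_ = false;
};

// Hypothetical holder exposing its properties both by value and by reference.
class Options {
 public:
  Props props_by_value() const { return props_; }  // returns a COPY
  Props& props_by_ref() { return props_; }         // returns the stored instance

 private:
  Props props_;
};

// Setting the flag through the by-value accessor modifies a temporary only,
// so the stored instance keeps its default value.
bool SetThroughCopyIsLost() {
  Options options;
  options.props_by_value().set_page_checksum_verification(true);
  return !options.props_by_ref().page_checksum_verification();
}

// Setting the flag through the reference accessor updates the stored instance.
bool SetThroughReferenceSticks() {
  Options options;
  options.props_by_ref().set_page_checksum_verification(true);
  return options.props_by_ref().page_checksum_verification();
}
```

If the addresses logged by the setter and the getter differ in the real code, the next step would be to look for a by-value return or an intermediate copy between the Python bindings and the C++ reader.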
   
   

