[GitHub] [arrow] yurikoomiga opened a new issue, #13186: An Error Occured While Reading Parquet File Using C++ - GetRecordBatchReader -Corrupt snappy compressed data.

GitBox Wed, 18 May 2022 02:46:37 -0700


yurikoomiga opened a new issue, #13186:
URL: https://github.com/apache/arrow/issues/13186


   Hi all,
   
   I am reading a Parquet file with the Arrow C++ API like this:
   
   `auto st = parquet::arrow::FileReader::Make(
        arrow::default_memory_pool(),
        parquet::ParquetFileReader::Open(_parquet, _properties),
        &_reader);
    arrow::Status status = _reader->GetRecordBatchReader(
        {_current_group}, _parquet_column_ids, &_rb_batch);
    _reader->set_batch_size(65536);
    _reader->set_use_threads(true);
    status = _rb_batch->ReadNext(&_batch);`
   
   The returned status is not OK, and the error is:
   `IOError: Corrupt snappy compressed data.`
   
   When I comment out this statement,
   ` _reader->set_use_threads(true);`
   the program runs normally and I can read the Parquet file without problems.
   The error occurs only when I read multiple columns with use_threads=true;
   reading a single column does not trigger it.
   
   The test Parquet file was created with pyarrow. It has only 1 row group, which
   contains 3000000 records, and 20 columns of int and string types.
   
   I am reading the file from C++ with Arrow 7.0.0 and Snappy 1.1.8.
   
   
   
   Thank you!
   


