jonkeane commented on pull request #11614: URL: https://github.com/apache/arrow/pull/11614#issuecomment-963325783
The centos 7 error is real, though I don't understand why it's happening. It seems to be hitting [one of these two blocks](https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/compression_snappy.cc#L47-L59).

I've tried saving a snappy-encoded parquet file from inside a centos machine with this installed and reading it on my local machine (macOS), and that works _just fine_. I also did the reverse (saved a parquet file on my local machine and tried to read it in the centos7 container), and that errors with the same corruption message. So it doesn't look like the file is truly corrupt, but something is odd with our code on centos 7 specifically(?)

Error:

```
> tf <- tempfile()
> on.exit(unlink(tf))
> write_parquet(mtcars, tf)
> df <- read_parquet(tf, col_select = starts_with("d"))
Error: IOError: Corrupt snappy compressed data.
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
