amoeba commented on issue #40050:
URL: https://github.com/apache/arrow/issues/40050#issuecomment-1945240990

   Hi @Yannaubineau, thanks for the report. Both the R and Python packages have tests covering this behavior, so it's a known issue. However, as you found out, they will happily write a Parquet file that can't then be read back with the default settings.
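   
   In case it helps anyone reproduce this, here's a minimal sketch of one way to end up with such a file. This is purely illustrative: the oversized key-value metadata entry is my assumption about the trigger (the reader's default Thrift string size limit is 100 MB in Arrow C++), and the actual cause in this issue may differ.
   
   ```r
   library(arrow)
   
   # A tiny table with ~200 MB of key-value metadata attached
   tbl <- arrow_table(x = 1:3)
   tbl$metadata$blob <- strrep("a", 2e8)
   
   # write_parquet() succeeds without complaint...
   write_parquet(tbl, "./example_error.parquet")
   
   # ...but if the metadata exceeds the reader's default Thrift string
   # size limit, reading the file back errors
   open_dataset("./example_error.parquet")
   ```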
   
   A workaround for now is to pass an extra option to `open_dataset` that raises the limit to a high enough value:
   
   ```r
   dt_error <- open_dataset("./example_error.parquet",
                            thrift_string_size_limit = 1000000000)
   ```
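   
   For what it's worth, once the limit is raised the dataset behaves normally. A quick usage sketch, reusing the file name and limit value from above (assumes dplyr is attached):
   
   ```r
   library(arrow)
   library(dplyr)
   
   dt <- open_dataset("./example_error.parquet",
                      thrift_string_size_limit = 1000000000)
   
   # Regular dplyr verbs work once the metadata can be deserialized
   dt |>
     head(10) |>
     collect()
   ```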
   
   I'm not sure we want to increase or remove the default limit, since it exists in part to guard against malformed or malicious files, and raising it globally might cause other problems. @Yannaubineau, do you think a more informative error message would be enough of a fix here?

