Yannaubineau commented on issue #40050:
URL: https://github.com/apache/arrow/issues/40050#issuecomment-1945653461

Hi @amoeba, thank you for your answer. Sorry if it is a non-issue. The main problem I faced was that nothing indicated the source of the error, and there was no warning before the error was triggered. Thank you for the sample code, it works like a charm!

I think there are two aspects to this:

- The error output is confusing: `Is this a 'parquet' file?` doesn't feel right if the error is known to be related to a string size limit parameter. **So informing the user of this parameter inside the error message would definitely be an improvement.**
- But I also think it would be very informative to trigger a warning when a parquet file is created (through `write_parquet` or `write_dataset`) **from a data.frame containing attributes**, simply because of how massive the parquet file can get compared to the same data.frame without any attributes. **Users should be made aware in some way that attributes in their data are hurting the efficiency of the binary data storage.**

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
