zeroshade commented on issue #34334: URL: https://github.com/apache/arrow/issues/34334#issuecomment-1443894915
It absolutely will be haha. Though the same advice for extension types applies here: if it's an extension type, just process it as if it were the underlying storage type.

One thing you can try for binary types is to also update the auto-detection to try base64-decoding the values before defaulting to string types; if the data base64-decodes successfully, assume it is binary. (This is possible but not strictly necessary; I'm fine with binary not being supported in auto-detection.)

Another future enhancement I eventually want to work on for CSVs is sampling more than just the first line when auto-detecting types, to allow for better handling. But obviously that doesn't need to be part of this at all.
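As a rough illustration of the two ideas in this comment (trying a base64 decode before falling back to string, and sampling several rows instead of only the first), here is a minimal standalone sketch using only the Go standard library. It is not the Arrow CSV reader's actual inference code: the `kind` type, `inferValue`, and `inferColumn` are hypothetical names invented for this example, and the choice of `base64.StdEncoding` and the widening rules are assumptions.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strconv"
)

// kind is a hypothetical label for an inferred column type.
// It is not an Arrow data type; it only illustrates the decision order.
type kind string

const (
	kindInt64   kind = "int64"
	kindFloat64 kind = "float64"
	kindBool    kind = "bool"
	kindBinary  kind = "binary"
	kindString  kind = "string"
)

// inferValue guesses the type of a single CSV field. The base64 check runs
// last, immediately before the string fallback, as the comment suggests.
func inferValue(s string) kind {
	if _, err := strconv.ParseInt(s, 10, 64); err == nil {
		return kindInt64
	}
	if _, err := strconv.ParseFloat(s, 64); err == nil {
		return kindFloat64
	}
	if _, err := strconv.ParseBool(s); err == nil {
		return kindBool
	}
	// Assumption: binary values are written with standard, padded base64.
	// Empty fields are skipped so they do not count as binary.
	if s != "" {
		if _, err := base64.StdEncoding.DecodeString(s); err == nil {
			return kindBinary
		}
	}
	return kindString
}

// inferColumn combines the guesses for several sampled rows of one column.
// Mixed int64/float64 widens to float64; any other disagreement falls back
// to string.
func inferColumn(samples []string) kind {
	var result kind
	for _, s := range samples {
		k := inferValue(s)
		switch {
		case result == "" || result == k:
			result = k
		case (result == kindInt64 && k == kindFloat64) ||
			(result == kindFloat64 && k == kindInt64):
			result = kindFloat64
		default:
			return kindString
		}
	}
	if result == "" {
		return kindString // no samples at all
	}
	return result
}

func main() {
	fmt.Println(inferColumn([]string{"aGVsbG8=", "d29ybGQ="}))
	fmt.Println(inferColumn([]string{"aGVsbG8=", "hello"}))
	fmt.Println(inferColumn([]string{"1", "2.5"}))
	fmt.Println(inferValue("test"))
}
```

One caveat with this approach: any plain word whose length is a multiple of four and that uses only base64 alphabet characters, such as `test`, decodes successfully and would be classified as binary from a single row. Sampling more rows reduces such false positives but does not eliminate them, which is one reason to treat binary auto-detection as optional.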
