mightyshazam commented on PR #4967: URL: https://github.com/apache/arrow-rs/pull/4967#issuecomment-1773446489
> This might be a stupid question, but why does this need to be implemented in the JSON reader, as opposed to processing the StringArray after the fact?

This could all be the product of me not understanding my options. I started looking into this as part of work on the [delta-rs](https://github.com/delta-io/delta-rs) project. I was working on a Kubernetes operator to do some maintenance when I encountered the ArrowJson error related to binary columns. It turns out we can't run checkpoints on tables with binary columns. This is due to the call to [build_decoder](https://github.com/delta-io/delta-rs/blob/a9cdd605081a6e5ea5da7edff01f8d4bd3bfce77/rust/src/protocol/checkpoints.rs#L371C1-L373C27) in the `ReaderBuilder`.

After a little research, I encountered the issue and your comments about handling it manually. Instead of creating a new reader for delta-rs, I thought it might be more reasonable to implement this as optional functionality in the arrow library. I know it is probably a pretty niche case, but it turned out to be a relatively simple way forward. If you can think of a better alternative, I am definitely open to that. Hopefully, that context makes sense.
