scovich commented on issue #7453:
URL: https://github.com/apache/arrow-rs/issues/7453#issuecomment-2846001651

I'm not sure what the correct solution is, but we're hitting this problem in https://github.com/delta-io/delta-kernel-rs/issues/501. Since the bad entry is one field of one row of a potentially very large JSON parse, it's pretty painful. And implementing a full-blown custom JSON parser for _everything_, just to deal with one bad field, is a super unpleasant workaround.

So far I'm aware of three alternatives for handling bad values without walking away from arrow-json parsing entirely:

* This issue suggests converting bad values to strings (or rather, the user can request a string type for a column they're worried about, and it's their job to JSON-parse the resulting string if they want)
* https://github.com/apache/arrow-rs/issues/7230 suggests converting bad values to NULL
* https://github.com/apache/arrow-rs/pull/7442 proposes a custom decoder concept
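
To make the difference between the three alternatives concrete, here is a rough, language-neutral sketch in Python using only the standard `json` module. It does not use or mirror the arrow-json API; the sample rows, the `count` column, and the helper names (`read_count`, `read_count_as_strings`, `words_to_int`) are all hypothetical and exist only for illustration.

```python
import json

# Hypothetical newline-delimited JSON input: "count" should be an integer,
# but one field of one row holds a bad value.
ROWS = [
    '{"id": 1, "count": 10}',
    '{"id": 2, "count": "ten"}',
]


def is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def read_count_as_strings(rows):
    """String alternative: the caller asks for the column as JSON text
    and is responsible for parsing each string later."""
    return [json.dumps(json.loads(line)["count"]) for line in rows]


def read_count(rows, on_bad):
    """Typed read of "count"; `on_bad` decides what a bad value becomes."""
    out = []
    for line in rows:
        value = json.loads(line)["count"]
        out.append(value if is_int(value) else on_bad(value))
    return out


def to_null(_value):
    # NULL alternative: a bad value simply becomes None.
    return None


def words_to_int(value):
    # Custom-decoder alternative: caller-supplied logic for this one column
    # (a made-up mapping, purely for demonstration).
    return {"ten": 10}.get(value)


as_strings = read_count_as_strings(ROWS)
reparsed = [json.loads(s) for s in as_strings]  # the caller's own second pass
with_nulls = read_count(ROWS, to_null)
with_custom = read_count(ROWS, words_to_int)
```

In this sketch the string variant keeps every value but pushes typing onto the caller, the NULL variant keeps the column typed but loses the bad value, and the custom variant keeps both at the cost of per-column code.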