kumarUjjawal commented on PR #19813: URL: https://github.com/apache/datafusion/pull/19813#issuecomment-3750140053
I looked at the arrow-rs handling. As I understand it, Arrow's `RowConverter` uses a Stateless codec for Null-typed elements. When Null is nested inside a List, however, the list codec wraps each element in variable-length encoding bytes. During `convert_rows()`, the nested Stateless decoder does not consume these wrapper bytes, which causes the "did not consume all bytes" error. This is a gap in arrow-rs: the list codec's byte protocol does not account for the zero-width nature of Null child elements.

While this could be fixed upstream, `List(Null)` is a rare edge case that only arises from SQL literals like `[[null]]`, so adding a workaround in DataFusion seemed more practical than modifying arrow-rs's row format. Let me know your thoughts.
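To make the suspected mismatch concrete, here is a toy model of it in plain Rust. It is only a sketch of my understanding, not arrow-rs code: `ChildCodec`, `WRAPPER`, `encode_element`, `child_consume`, `decode_element`, and the `wrapper_aware` flag are all hypothetical names, and the single framing byte stands in for whatever the real list codec writes per element.

```rust
// Toy model only: names and byte values are hypothetical and do not
// mirror arrow-rs's actual row format.
#[derive(Clone, Copy, PartialEq, Debug)]
enum ChildCodec {
    Stateless, // zero-width child (models a Null element): no payload bytes
    Byte,      // one payload byte per element
}

const WRAPPER: u8 = 0x01; // hypothetical per-element framing byte

/// List side: frame every element, whatever the child codec is.
fn encode_element(codec: ChildCodec, value: u8) -> Vec<u8> {
    let mut out = vec![WRAPPER];
    if codec == ChildCodec::Byte {
        out.push(value);
    }
    out
}

/// Child side: how many bytes the child decoder consumes from `bytes`.
/// `wrapper_aware` models a Stateless decoder that skips the framing byte.
fn child_consume(codec: ChildCodec, bytes: &[u8], wrapper_aware: bool) -> usize {
    match codec {
        ChildCodec::Byte => bytes.len().min(2),
        ChildCodec::Stateless if wrapper_aware => bytes.len().min(1),
        ChildCodec::Stateless => 0,
    }
}

/// Decode one framed element and insist that every byte was consumed.
fn decode_element(codec: ChildCodec, bytes: &[u8], wrapper_aware: bool) -> Result<(), String> {
    let used = child_consume(codec, bytes, wrapper_aware);
    if used != bytes.len() {
        return Err(format!(
            "did not consume all bytes ({} left)",
            bytes.len() - used
        ));
    }
    Ok(())
}

fn main() {
    let framed = encode_element(ChildCodec::Stateless, 0);
    // The zero-width decoder leaves the framing byte behind.
    println!("{:?}", decode_element(ChildCodec::Stateless, &framed, false));
    // A decoder that accounts for the framing byte succeeds.
    println!("{:?}", decode_element(ChildCodec::Stateless, &framed, true));
}
```

In this model the error comes purely from the list side writing bytes that the zero-width child never reads, which is why either side could in principle absorb the fix.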
