timsaucer commented on issue #715: URL: https://github.com/apache/datafusion-python/issues/715#issuecomment-2132210685
I think I know what's going on. Even if `outer` is null, we still have data within `inner_1` and `inner_2`. When pyarrow creates the record batch, it sets these to the default value rather than null, even though the outer struct is null. Then on the datafusion side we index into these and get those default values. I *think* the right place to resolve this is in pyarrow, setting null when all outer values are null. But maybe there are additional validity checks we should have. I'm going to think a little more about this issue before moving it to the most appropriate repo.