mustafasrepo commented on issue #9928: URL: https://github.com/apache/arrow-datafusion/issues/9928#issuecomment-2055812960
Thanks @korowa and @echai58 for the detailed analysis, and sorry for the late reply. I will try to build a DataFusion-only reproducer for this use case based on your findings, so that we can then try to fix the problem there. The root cause seems to be that we allow data larger than the target partition to be inserted from the source, which appears to cause some internal inconsistency. Adding a RepartitionExec immediately after the source, to bring the source to the desired partitioning, might solve the issue. I am not sure yet, though; I will post my findings as I progress.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
