JoshRosen commented on pull request #35066: URL: https://github.com/apache/spark/pull/35066#issuecomment-1002856449
Please let me know if you have suggestions for good ways to write a regression test for this bug. So far I've been unable to adapt my existing reproduction into something that fails in CI.

Given enough time, I might be able to contrive a failing regression test by manually instantiating a SortMergeJoinExec operator and controlling its input iterators so that the non-copied values are mutated when the iterator advances (I'd use the SparkPlanTest helpers for this). On the other hand, this particular helper function changes very infrequently, so I think the risk of a future regression is small enough that it may be okay to forgo writing the more complicated test. If anyone has strong opinions here, please let me know.

----

I'm now curious whether there could be other similar UDT-related bugs in our code generation. I plan to search through the code for all other places where we generate copy() / clone() logic to check whether they properly handle UDTs.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
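
To illustrate the failure mode such a test would need to provoke, here is a minimal, language-agnostic sketch in Python. It does not use any Spark API: `MutableRow`, `reusing_iterator`, and `buffer_rows` are hypothetical names invented for this example. It only assumes what the comment describes, namely an input iterator that overwrites its row in place on each advance, and a consumer that buffers rows with or without copying them.

```python
from typing import Iterator, List


class MutableRow:
    """Toy stand-in for a reusable row buffer (hypothetical, not a Spark class)."""

    def __init__(self, values: List[int]) -> None:
        self.values = values

    def copy(self) -> "MutableRow":
        # Defensive copy: the new row owns its own list.
        return MutableRow(list(self.values))


def reusing_iterator(data: List[List[int]]) -> Iterator[MutableRow]:
    """Yield the same MutableRow every time, overwriting it in place on advance."""
    buffer = MutableRow([])
    for values in data:
        buffer.values[:] = values  # in-place mutation of the shared buffer
        yield buffer


def buffer_rows(rows: Iterator[MutableRow], copy_rows: bool) -> List[List[int]]:
    """Hold on to every row from the iterator, optionally copying each one first."""
    buffered = [row.copy() if copy_rows else row for row in rows]
    return [row.values for row in buffered]


data = [[1, 10], [2, 20], [3, 30]]

# Without copying, every buffered entry aliases the same mutated buffer.
without_copy = buffer_rows(reusing_iterator(data), copy_rows=False)

# With copying, each buffered entry keeps the values it had when it was read.
with_copy = buffer_rows(reusing_iterator(data), copy_rows=True)

print(without_copy)
print(with_copy)
```

A regression test along these lines would assert that the buffered rows still match the original input after the iterator has been fully consumed; in the real operator the inputs and the buffering logic would of course come from Spark itself rather than from these stand-ins.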
