gaogaotiantian commented on PR #55987: URL: https://github.com/apache/spark/pull/55987#issuecomment-4713449573
The code change has nothing to do with the description. The source change basically raises an exception in certain cases, and the test just checks that this behavior is implemented correctly, with no comparison to pandas. The extra tests for pandas vs. pandas on PySpark show no behavior change, because `eq` and `ne` are not changed at all. The description does not contain the LLM question. I suspect that this is just LLM-generated code for a TODO in our code base.

As for the code itself, it's debatable. Basically it checks the type and raises an error if the type does not match. That does not match pandas' exact behavior either. For example, you can compare `bool` vs `int` in pandas. You can't do that in pyspark now, but with this change it will probably just fail with a different error.

I want to hear from @devin-petersohn, but my gut feeling is that this might introduce a regression for some currently allowed implicit casting.
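
To make the `bool` vs `int` point concrete, here is a minimal sketch of the plain pandas behavior. The series names and values are made up for illustration and are not taken from the PR; it only shows pandas itself, not pandas on PySpark.

```python
import pandas as pd

# Illustrative data only; not from the PR's tests.
flags = pd.Series([True, False, True])
numbers = pd.Series([1, 0, 0])

# Plain pandas compares bool and int element-wise without raising.
equal = flags == numbers
not_equal = flags != numbers

print(equal.tolist())      # [True, True, False]
print(not_equal.tolist())  # [False, False, True]
```

A strict type check that raises on mismatched types would diverge from this result rather than reproduce it.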
