Re: [PR] [SPARK-55791][PYTHON] Fix pandas-on-Spark equality comparisons under ANSI mode [spark]

via GitHub Mon, 15 Jun 2026 16:45:46 -0700


gaogaotiantian commented on PR #55987:
URL: https://github.com/apache/spark/pull/55987#issuecomment-4713449573


   The code change has nothing to do with the description. The source code change 
basically raises an exception in certain cases, and the test just checks that this 
behavior is correctly implemented, with no comparison to pandas. The extra tests for 
pandas vs. pandas-on-Spark show no behavior change because `eq` and `ne` are not 
changed at all. The description does not contain the LLM question. I suspect 
that this is just LLM-generated code for a TODO in our code base.
   
   As for the code itself, it's debatable. Basically, it checks the type and 
raises an error if the types do not match. That does not match pandas' exact 
behavior either. For example, you can compare `bool` vs `int` in pandas. You 
can't do that in pyspark now, but with this change it's probably going to be 
a different error.
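   
   To make the `bool` vs `int` point concrete, here is a minimal sketch of the 
plain pandas side only. The series names and values are made up for illustration, 
and it does not exercise pandas-on-Spark or the code in this PR:

```python
import pandas as pd

# Hypothetical sample data: one bool series and one int series.
bools = pd.Series([True, False, True])
ints = pd.Series([1, 0, 2])

# pandas compares these element-wise without raising, treating True as 1
# and False as 0.
eq = bools == ints
ne = bools != ints

print(eq.tolist())
print(ne.tolist())
```

   Running the same comparison through pandas-on-Spark with ANSI mode enabled is 
where I'd expect the behavior to diverge, but that needs to be checked against a 
real Spark session rather than assumed.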
   
   I want to hear from @devin-petersohn, but my gut feeling is that this might 
introduce a regression for some implicit casts that are currently allowed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


