fuxi611 commented on PR #55987: URL: https://github.com/apache/spark/pull/55987#issuecomment-4796957948
Hi @gaogaotiantian,

Thank you for your candid feedback and for pointing out the issue with implicit casting.

To be completely transparent: as students working on this, we used an LLM assistant to help us analyze the codebase and interpret the existing TODO comment. We now realize that the generated logic was too naive. It focused strictly on the TODO text and completely overlooked PySpark's implicit casting behaviors (such as bool vs. int), which, as you correctly noted, introduces regressions. We sincerely apologize for pushing code that didn't meet the project's quality standards.

This is a critical assignment for our university lab, and our grade depends heavily on getting this PR properly reviewed and moving toward an accepted state. We are fully committed to fixing this and doing it the right way.

Could you please give us a hint on how we should properly handle this comparison? If you could point us in the right direction, we will completely rewrite the implementation and update the PR immediately to match the expected pandas/PySpark behavior.

Thank you so much for your patience and mentorship!

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
