HyukjinKwon commented on pull request #34314: URL: https://github.com/apache/spark/pull/34314#issuecomment-975016963
@Yikun, I am very sorry, but I realised that this patch breaks the test cases with older pandas versions, because it depends on a fix listed at https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.3.0.html#missing. `Decimal("NaN")` is not considered null in old pandas versions, which causes a number of related failures.

I was preparing a followup with, for example, an approach like the one below:

```diff
-            nullable=bool(col.isnull().any()),
+            nullable=bool(col.isnull().any())
+            # To work around https://github.com/pandas-dev/pandas/pull/39409
+            | bool(
+                col.map(lambda x: isinstance(x, decimal.Decimal) and math.isnan(x)).any()
+            ),
```

However, the test then fails because of other behaviour differences in old pandas versions.

While technically we could make a followup, please let me just revert it to make it easier to move forward. I am hearing complaints from several places that the tests are failing.
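For reference, the `Decimal("NaN")` check in the workaround can be sketched without pandas as below. This is only an illustration, not the code from the patch: `is_decimal_nan` and `column_is_nullable` are hypothetical helper names, and the sketch uses `Decimal.is_nan()` instead of `math.isnan(x)` so that signaling NaNs are also handled without raising.

```python
import decimal
import math


def is_decimal_nan(value):
    """Return True only for decimal.Decimal values that are NaN."""
    # Hypothetical helper; Decimal.is_nan() covers both quiet and signaling NaN.
    return isinstance(value, decimal.Decimal) and value.is_nan()


def column_is_nullable(values):
    """Hypothetical stand-in for the nullable check on a column.

    Treats None, float NaN and Decimal NaN as null, independent of
    how a given pandas version classifies Decimal("NaN").
    """
    return any(
        v is None
        or (isinstance(v, float) and math.isnan(v))
        or is_decimal_nan(v)
        for v in values
    )


if __name__ == "__main__":
    print(column_is_nullable([decimal.Decimal("1.0"), decimal.Decimal("NaN")]))
    print(column_is_nullable([decimal.Decimal("1.0"), decimal.Decimal("2.5")]))
```

With a pandas Series, the same predicate could presumably be applied via `col.map(is_decimal_nan).any()`, mirroring the shape of the diff above.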
