bjornjorgensen commented on code in PR #41211:
URL: https://github.com/apache/spark/pull/41211#discussion_r1198937746
##########
python/pyspark/pandas/tests/data_type_ops/test_date_ops.py:
##########
@@ -61,6 +63,10 @@ def test_add(self):
         for psser in self.pssers:
             self.assertRaises(TypeError, lambda: self.psser + psser)
+    @unittest.skipIf(
+        LooseVersion(pd.__version__) >= LooseVersion("2.0.0"),
+        "TODO(SPARK-43571): Enable DateOpsTests.test_sub for pandas 2.0.0.",
+    )
Review Comment:
Eh, I'm a bit puzzled and wonder whether there is a misunderstanding of the
language here.
Where does "There should be no behavior changes unless it is a major release"
come from?
What @gatorsmile said was "we should not remove these API before the major
release Spark 4.0".
Did he mean that functions which still work well, but have been removed
in pandas 2.0, should not be removed here? He also wrote this under
`def append`, which was meant to be removed in your previous PR.
The whole point of the `pandas API on spark` is that it should be as similar
as possible to `pandas`, so there must also be some behavior changes,
because they have happened in pandas.
Is there a way to do this other than one person having to fix everything in
a single PR? For example, you could mark all failing tests with
`@pytest.mark.skip(reason="see JIRA XXXX for updating to pandas 2.0")` and
also create a JIRA for that, so that more people can help upgrade the
pandas API on Spark to pandas 2.0.
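
A rough sketch of what that could look like, using the `unittest.skipIf`
style from the diff above rather than `pytest.mark.skip` so it runs with the
standard library only. Everything named here is an assumption for
illustration: `PANDAS_VERSION` stands in for `pd.__version__`,
`version_tuple` is a hypothetical helper, the class is not the real
`DateOpsTests`, and `SPARK-XXXX` is a placeholder ticket id.

```python
import unittest

# Hypothetical stand-in for pd.__version__, so the sketch runs without pandas.
PANDAS_VERSION = "2.0.0"


def version_tuple(version):
    """Turn '2.0.0' into (2, 0, 0); assumes purely numeric version parts."""
    return tuple(int(part) for part in version.split(".")[:3])


PANDAS_2_OR_LATER = version_tuple(PANDAS_VERSION) >= (2, 0, 0)


class DateOpsSketchTests(unittest.TestCase):
    # Skipped (not deleted) so the JIRA in the reason can be picked up by anyone.
    @unittest.skipIf(
        PANDAS_2_OR_LATER,
        "TODO(SPARK-XXXX): enable test_sub for pandas 2.0.0 (placeholder ticket).",
    )
    def test_sub(self):
        self.fail("stands in for a test that breaks under pandas 2.0")

    def test_add(self):
        # Unaffected tests keep running as before.
        self.assertEqual(1 + 1, 2)


def run_sketch():
    """Run the sketch test case and return the collected TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DateOpsSketchTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Each skip reason would point at its own JIRA, so the skipped tests double as
a to-do list that can be split across several follow-up PRs.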
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]