This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.0 by this push:
new eeece9d964d6 [MINOR][DOCS] Add missing backticks in `Upgrading from PySpark 3.5 to 4.0`
eeece9d964d6 is described below
commit eeece9d964d68685629b2fd1a68647d37cb8ba4b
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Wed Feb 19 08:56:57 2025 +0900
[MINOR][DOCS] Add missing backticks in `Upgrading from PySpark 3.5 to 4.0`
nit
### What changes were proposed in this pull request?
### Why are the changes needed?
Add missing backticks in `Upgrading from PySpark 3.5 to 4.0`
see https://apache.github.io/spark/api/python/migration_guide/pyspark_upgrade.html
### Does this PR introduce _any_ user-facing change?
to make the doc correctly rendered
### How was this patch tested?
ci
### Was this patch authored or co-authored using generative AI tooling?
no
Closes #49989 from zhengruifeng/py_fix_doc_upgrade.
Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit aa12070c1321dfddb5876ef1494a65743fd93280)
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/docs/source/migration_guide/pyspark_upgrade.rst | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/python/docs/source/migration_guide/pyspark_upgrade.rst b/python/docs/source/migration_guide/pyspark_upgrade.rst
index 6ba86d7a7041..906e40140cef 100644
--- a/python/docs/source/migration_guide/pyspark_upgrade.rst
+++ b/python/docs/source/migration_guide/pyspark_upgrade.rst
@@ -40,7 +40,7 @@ Upgrading from PySpark 3.5 to 4.0
 * In Spark 4.0, ``include_start`` and ``include_end`` parameters from ``DataFrame.between_time`` have been removed from pandas API on Spark, use ``inclusive`` instead.
 * In Spark 4.0, ``include_start`` and ``include_end`` parameters from ``Series.between_time`` have been removed from pandas API on Spark, use ``inclusive`` instead.
 * In Spark 4.0, the various datetime attributes of ``DatetimeIndex`` (``day``, ``month``, ``year`` etc.) are now ``int32`` instead of ``int64`` from pandas API on Spark.
-* In Spark 4.0, ``sort_columns`` parameter from ``DataFrame.plot`` and `Series.plot`` has been removed from pandas API on Spark.
+* In Spark 4.0, ``sort_columns`` parameter from ``DataFrame.plot`` and ``Series.plot`` has been removed from pandas API on Spark.
 * In Spark 4.0, the default value of ``regex`` parameter for ``Series.str.replace`` has been changed from ``True`` to ``False`` from pandas API on Spark. Additionally, a single character ``pat`` with ``regex=True`` is now treated as a regular expression instead of a string literal.
 * In Spark 4.0, the resulting name from ``value_counts`` for all objects sets to ``'count'`` (or ``'proportion'`` if ``normalize=True`` was passed) from pandas API on Spark, and the index will be named after the original object.
 * In Spark 4.0, ``squeeze`` parameter from ``ps.read_csv`` and ``ps.read_excel`` has been removed from pandas API on Spark.
@@ -72,8 +72,8 @@ Upgrading from PySpark 3.5 to 4.0
 * In Spark 4.0, ``pyspark.testing.assertPandasOnSparkEqual`` has been removed from Pandas API on Spark, use ``pyspark.pandas.testing.assert_frame_equal`` instead.
 * In Spark 4.0, the aliases ``Y``, ``M``, ``H``, ``T``, ``S`` have been deprecated from Pandas API on Spark, use ``YE``, ``ME``, ``h``, ``min``, ``s`` instead respectively.
 * In Spark 4.0, the schema of a map column is inferred by merging the schemas of all pairs in the map. To restore the previous behavior where the schema is only inferred from the first non-null pair, you can set ``spark.sql.pyspark.legacy.inferMapTypeFromFirstPair.enabled`` to ``true``.
-* In Spark 4.0, `compute.ops_on_diff_frames` is on by default. To restore the previous behavior, set `compute.ops_on_diff_frames` to `false`.
-* In Spark 4.0, the data type `YearMonthIntervalType` in ``DataFrame.collect`` no longer returns the underlying integers. To restore the previous behavior, set ``PYSPARK_YM_INTERVAL_LEGACY`` environment variable to ``1``.
+* In Spark 4.0, ``compute.ops_on_diff_frames`` is on by default. To restore the previous behavior, set ``compute.ops_on_diff_frames`` to ``false``.
+* In Spark 4.0, the data type ``YearMonthIntervalType`` in ``DataFrame.collect`` no longer returns the underlying integers. To restore the previous behavior, set ``PYSPARK_YM_INTERVAL_LEGACY`` environment variable to ``1``.
 * In Spark 4.0, items other than functions (e.g. ``DataFrame``, ``Column``, ``StructType``) have been removed from the wildcard import ``from pyspark.sql.functions import *``, you should import these items from proper modules (e.g. ``from pyspark.sql import DataFrame, Column``, ``from pyspark.sql.types import StructType``).
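The ``DatetimeIndex`` note in the patched guide says the datetime attributes are now ``int32``; this tracks the same change made in pandas 2.x itself. A minimal sketch using plain pandas (assumed ≥ 2.0) rather than pandas API on Spark:

```python
import pandas as pd

# In pandas 2.x, datetime component attributes are int32, not int64;
# pandas API on Spark 4.0 matches this.
idx = pd.DatetimeIndex(["2025-01-01", "2025-06-15"])

print(idx.day.dtype)    # int32
print(idx.month.dtype)  # int32
print(idx.year.dtype)   # int32
```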
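The ``Series.str.replace`` note in the guide describes the ``regex`` default flipping from ``True`` to ``False``, matching plain pandas. A minimal sketch of the difference between the two modes, using plain pandas as a stand-in for pandas API on Spark:

```python
import pandas as pd

s = pd.Series(["a.b"])

# regex=False (the new default): "." is a literal dot.
literal = s.str.replace(".", "X", regex=False)

# regex=True: a single-character "." is treated as a regular
# expression, so it matches every character.
pattern = s.str.replace(".", "X", regex=True)

print(literal[0])  # aXb
print(pattern[0])  # XXX
```

Passing ``regex`` explicitly, as above, gives the same result on both sides of the default change.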
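The ``value_counts`` note in the guide describes the result being named ``'count'`` (or ``'proportion'`` with ``normalize=True``) with the index named after the original object, again following pandas. A minimal sketch with plain pandas (assumed ≥ 2.0):

```python
import pandas as pd

s = pd.Series([1, 1, 2], name="x")

counts = s.value_counts()
props = s.value_counts(normalize=True)

print(counts.name)        # count
print(counts.index.name)  # x  (named after the original object)
print(props.name)         # proportion
```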
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]