This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 31f7bfcb3b0 [MINOR][DOCS] Update broken links for pyspark.pandas
31f7bfcb3b0 is described below

commit 31f7bfcb3b024f823b630bd5069773bde5439eea
Author: wuxiaolong26 <wuxiaolon...@jd.com>
AuthorDate: Tue Mar 28 15:49:42 2023 +0900

    [MINOR][DOCS] Update broken links for pyspark.pandas
    
    ### What changes were proposed in this pull request?
    Update broken links.
    
    ### Why are the changes needed?
    broken link: https://spark.apache.org/docs/latest/api/python/user_guide/pandas_on_spark/best_practices.html?highlight=best%20practices#avoid-computation-on-single-partition
    <img width="801" alt="image" src="https://user-images.githubusercontent.com/26707386/227914290-842d8be5-544b-401c-962e-3aef0ca443aa.png">
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    Just docs fix
    
    Closes #40562 from chong0929/fix-docs.
    
    Authored-by: wuxiaolong26 <wuxiaolon...@jd.com>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/docs/source/user_guide/pandas_on_spark/best_practices.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/python/docs/source/user_guide/pandas_on_spark/best_practices.rst b/python/docs/source/user_guide/pandas_on_spark/best_practices.rst
index 08b66dd0082..14c04aa622e 100644
--- a/python/docs/source/user_guide/pandas_on_spark/best_practices.rst
+++ b/python/docs/source/user_guide/pandas_on_spark/best_practices.rst
@@ -148,7 +148,7 @@ Avoid computation on single partition
 -------------------------------------
 
 Another common case is the computation on a single partition. Currently, some APIs such as
-`DataFrame.rank <https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.pandas.DataFrame.rank.html>`_
+`DataFrame.rank <https://spark.apache.org/docs/latest/api/python/reference/pyspark.pandas/api/pyspark.pandas.DataFrame.rank.html>`_
 use PySpark’s Window without specifying partition specification. This moves all data into a single
 partition in a single machine and could cause serious performance degradation.
 Such APIs should be avoided for very large datasets.
@@ -168,7 +168,7 @@ Such APIs should be avoided for very large datasets.
                   +- *(1) Scan ExistingRDD[__index_level_0__#16L,id#17L]
 
 Instead, use 
-`GroupBy.rank <https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.pandas.groupby.GroupBy.rank.html>`_
+`GroupBy.rank <https://spark.apache.org/docs/latest/api/python/reference/pyspark.pandas/api/pyspark.pandas.groupby.GroupBy.rank.html>`_
 as it is less expensive because data can be distributed and computed for each group.
 
 
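For context, the best-practices passage patched above contrasts the two call shapes. A minimal sketch of that contrast, written with plain pandas (whose API pyspark.pandas mirrors) so it runs without a Spark cluster; in pandas-on-Spark, the `DataFrame.rank` form is backed by a Window with no partition spec (all rows on one partition), while `GroupBy.rank` can be computed per group in a distributed way:

```python
import pandas as pd  # stand-in for pyspark.pandas, which mirrors this API

df = pd.DataFrame({"group": ["a", "a", "b", "b"],
                   "value": [3.0, 1.0, 4.0, 2.0]})

# DataFrame.rank: in pandas-on-Spark this ranks over ALL rows, pulling the
# whole dataset into a single partition -- avoid on very large data.
overall = df["value"].rank()

# GroupBy.rank: ranks within each group, so pandas-on-Spark can distribute
# the computation across partitions keyed by the grouping column.
per_group = df.groupby("group")["value"].rank()
```

With the sample data, `overall` ranks the four values globally, while `per_group` restarts the ranking inside each of the two groups.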

