HyukjinKwon commented on a change in pull request #32835:
URL: https://github.com/apache/spark/pull/32835#discussion_r648800130
##########
File path: python/docs/source/user_guide/pandas_on_spark/typehints.rst
##########
@@ -1,36 +1,36 @@
-====================
-Type Hints In Koalas
-====================
+==================================
+Type Hints In pandas APIs on Spark
+==================================
.. currentmodule:: pyspark.pandas
-Koalas, by default, infers the schema by taking some top records from the
output,
-in particular, when you use APIs that allow users to apply a function against
Koalas DataFrame
+Pandas APIs on Spark, by default, infers the schema by taking some top records
from the output,
+in particular, when you use APIs that allow users to apply a function against
pandas APIs on Spark DataFrame
such as :func:`DataFrame.transform`, :func:`DataFrame.apply`,
:func:`DataFrame.koalas.apply_batch`,
:func:`DataFrame.koalas.apply_batch`, :func:`Series.koalas.apply_batch`, etc.
However, this is potentially expensive. If there are several expensive
operations such as a shuffle
-in the upstream of the execution plan, Koalas will end up with executing the
Spark job twice, once
+in the upstream of the execution plan, pandas APIs on Spark will end up with
executing the Spark job twice, once
for schema inference, and once for processing actual data with the schema.
-To avoid the consequences, Koalas has its own type hinting style to specify
the schema to avoid
-schema inference. Koalas understands the type hints specified in the return
type and converts it
+To avoid the consequences, pandas APIs on Spark has its own type hinting style
to specify the schema to avoid
Review comment:
```suggestion
To avoid the consequences, pandas APIs on Spark have their own type hinting
style to specify the schema to avoid
```
##########
File path: python/docs/source/user_guide/pandas_on_spark/typehints.rst
##########
@@ -1,36 +1,36 @@
-====================
-Type Hints In Koalas
-====================
+==================================
+Type Hints In pandas APIs on Spark
+==================================
.. currentmodule:: pyspark.pandas
-Koalas, by default, infers the schema by taking some top records from the
output,
-in particular, when you use APIs that allow users to apply a function against
Koalas DataFrame
+Pandas APIs on Spark, by default, infers the schema by taking some top records
from the output,
+in particular, when you use APIs that allow users to apply a function against
pandas APIs on Spark DataFrame
such as :func:`DataFrame.transform`, :func:`DataFrame.apply`,
:func:`DataFrame.koalas.apply_batch`,
:func:`DataFrame.koalas.apply_batch`, :func:`Series.koalas.apply_batch`, etc.
However, this is potentially expensive. If there are several expensive
operations such as a shuffle
-in the upstream of the execution plan, Koalas will end up with executing the
Spark job twice, once
+in the upstream of the execution plan, pandas APIs on Spark will end up with
executing the Spark job twice, once
for schema inference, and once for processing actual data with the schema.
-To avoid the consequences, Koalas has its own type hinting style to specify
the schema to avoid
-schema inference. Koalas understands the type hints specified in the return
type and converts it
+To avoid the consequences, pandas APIs on Spark has its own type hinting style
to specify the schema to avoid
+schema inference. Pandas APIs on Spark understands the type hints specified in
the return type and converts it
Review comment:
```suggestion
schema inference. Pandas APIs on Spark understand the type hints specified
in the return type and convert them
```
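For context on the passage under review: it contrasts inferring a schema by running the function on some top records with reading the schema from the return type hint. The sketch below is a toy illustration in plain Python, not the pandas-on-Spark implementation. The names `plus_one`, `schema_by_sampling`, `schema_from_hint` and `call_count` are hypothetical and exist only for this example.

```python
from typing import Any, Callable, List, get_type_hints

call_count = 0  # counts how many times the user function actually runs


def plus_one(values: List[int]) -> List[int]:
    """Hypothetical user function with a declared return type."""
    global call_count
    call_count += 1
    return [v + 1 for v in values]


def schema_by_sampling(func: Callable, sample: list) -> type:
    """Infer the element type by executing func on a few top records."""
    result = func(sample)
    return type(result[0])


def schema_from_hint(func: Callable) -> Any:
    """Read the declared return annotation without executing func."""
    return get_type_hints(func).get("return")


sampled = schema_by_sampling(plus_one, [1, 2, 3])  # executes plus_one once
hinted = schema_from_hint(plus_one)                # does not execute plus_one
print(sampled, hinted, call_count)
```

Only the sampling path increments `call_count`. This mirrors, in miniature, why a return type hint lets the extra execution for schema inference be skipped.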