HyukjinKwon commented on a change in pull request #33420:
URL: https://github.com/apache/spark/pull/33420#discussion_r672316136
##########
File path: docs/sql-programming-guide.md
##########
@@ -56,7 +56,7 @@ equivalent to a table in a relational database or a data frame in R/Python, but
optimizations under the hood. DataFrames can be constructed from a wide array of [sources](sql-data-sources.html) such
as: structured data files, tables in Hive, external databases, or existing RDDs.
The DataFrame API is available in Scala,
-Java, [Python](api/python/pyspark.sql.html#pyspark.sql.DataFrame), and [R](api/R/index.html).
+Java, [Python](api/python/reference/api/pyspark.sql.DataFrame.html), and [R](api/R/index.html).
Review comment:
Oh yeah, we should fix them. Nice, thanks. Could we fix these too:
```
ml-migration-guide.md:Refer to the [`MLUtils` Python docs](api/python/pyspark.mllib.html#pyspark.mllib.util.MLUtils) for further detail.
ml-pipeline.md:and [Python](api/python/pyspark.ml.html)).
rdd-programming-guide.md:The first thing a Spark program must do is to create a [SparkContext](api/python/pyspark.html#pyspark.SparkContext) object, which tells Spark
rdd-programming-guide.md:how to access a cluster. To create a `SparkContext` you first need to build a [SparkConf](api/python/pyspark.html#pyspark.SparkConf) object
rdd-programming-guide.md: [Python](api/python/pyspark.html#pyspark.RDD),
rdd-programming-guide.md: [Python](api/python/pyspark.html#pyspark.RDD),
rdd-programming-guide.md:[Python](api/python/pyspark.html#pyspark.StorageLevel))
rdd-programming-guide.md:create their own types by subclassing [AccumulatorParam](api/python/pyspark.html#pyspark.AccumulatorParam).
sql-migration-guide.md: <a href="api/python/pyspark.sql.html#pyspark.sql.SQLContext.read">Python</a>
sql-migration-guide.md: <a href="api/python/pyspark.sql.html#pyspark.sql.DataFrame.write">Python</a>
sql-programming-guide.md:Java, [Python](api/python/pyspark.sql.html#pyspark.sql.DataFrame), and [R](api/R/index.html).
streaming-kinesis-integration.md: See the [API docs](api/python/pyspark.streaming.html#pyspark.streaming.kinesis.KinesisUtils)
streaming-programming-guide.md:First, we import [StreamingContext](api/python/pyspark.streaming.html#pyspark.streaming.StreamingContext), which is the main entry point for all streaming functionality. We create a local StreamingContext with two execution threads, and batch interval of 1 second.
streaming-programming-guide.md:A [StreamingContext](api/python/pyspark.streaming.html#pyspark.streaming.StreamingContext) object can be created from a [SparkContext](api/python/pyspark.html#pyspark.SparkContext) object.
streaming-programming-guide.md:for Java, and [StreamingContext](api/python/pyspark.streaming.html#pyspark.streaming.StreamingContext) for Python.
streaming-programming-guide.md:For the Python API, see [DStream](api/python/pyspark.streaming.html#pyspark.streaming.DStream).
streaming-programming-guide.md: * [StreamingContext](api/python/pyspark.streaming.html#pyspark.streaming.StreamingContext) and [DStream](api/python/pyspark.streaming.html#pyspark.streaming.DStream)
streaming-programming-guide.md: * [KafkaUtils](api/python/pyspark.streaming.html#pyspark.streaming.kafka.KafkaUtils)
structured-streaming-programming-guide.md:([Scala](api/scala/org/apache/spark/sql/SparkSession.html)/[Java](api/java/org/apache/spark/sql/SparkSession.html)/[Python](api/python/pyspark.sql.html#pyspark.sql.SparkSession)/[R](api/R/sparkR.session.html) docs)
structured-streaming-programming-guide.md:([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamReader.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamReader.html)/[Python](api/python/pyspark.sql.html#pyspark.sql.streaming.DataStreamReader) docs)
structured-streaming-programming-guide.md: (<a href="api/scala/org/apache/spark/sql/streaming/DataStreamReader.html">Scala</a>/<a href="api/java/org/apache/spark/sql/streaming/DataStreamReader.html">Java</a>/<a href="api/python/pyspark.sql.html#pyspark.sql.streaming.DataStreamReader">Python</a>/<a
structured-streaming-programming-guide.md:([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamWriter.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamWriter.html)/[Python](api/python/pyspark.sql.html#pyspark.sql.streaming.DataStreamWriter) docs)
structured-streaming-programming-guide.md: (<a href="api/scala/org/apache/spark/sql/DataFrameWriter.html">Scala</a>/<a href="api/java/org/apache/spark/sql/DataFrameWriter.html">Java</a>/<a href="api/python/pyspark.sql.html#pyspark.sql.DataFrameWriter">Python</a>/<a
structured-streaming-programming-guide.md:For more details, please check the docs for DataStreamReader ([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamReader.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamReader.html)/[Python](api/python/pyspark.sql.html#pyspark.sql.streaming.DataStreamReader) docs) and DataStreamWriter ([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamWriter.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamWriter.html)/[Python](api/python/pyspark.sql.html#pyspark.sql.streaming.DataStreamWriter) docs).
structured-streaming-programming-guide.md:([Scala](api/scala/org/apache/spark/sql/streaming/StreamingQueryManager.html)/[Java](api/java/org/apache/spark/sql/streaming/StreamingQueryManager.html)/[Python](api/python/pyspark.sql.html#pyspark.sql.streaming.StreamingQueryManager) docs)
```
while we're here? I got this list by running `cd docs` and then `git grep -r "api/python/pyspark."`.