Ondrej Kokes created SPARK-34606:
------------------------------------
Summary: New PySpark documentation has different URLs
Key: SPARK-34606
URL: https://issues.apache.org/jira/browse/SPARK-34606
Project: Spark
Issue Type: Bug
Components: Documentation, PySpark
Affects Versions: 3.1.1
Reporter: Ondrej Kokes
The new documentation site moved some subsites to different URLs, notably the
PySpark API reference ([see
here|https://spark.apache.org/docs/latest/api/python/pyspark.sql.html]). (Note
the new `/reference/` bit in the new URL.)
It's the first hit when you google "pyspark sql functions", and you'll also get
there if you search for individual functions or modules (e.g. "pyspark
streaming").
I looked through various JIRA tickets and pull requests, but couldn't find a
mention of this. Even the pull request introducing the new documentation site
says that the only change visible to users is the design, not the location.
Possible resolutions:
* let the links be refreshed by search engines and live with dead links in
various places (Stack Overflow, emails, bookmarks, ...)
* identify the missing pages and provide 301 redirects for them (they could be
found in logs or Google Analytics, or we could list all assets generated
before and now and diff them)
* change the Sphinx configuration to produce the same links as before
Related tickets and pull requests:
* https://issues.apache.org/jira/browse/SPARK-31851
* https://github.com/apache/spark/pull/29188
* https://issues.apache.org/jira/browse/SPARK-32188
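
The second option above (list the assets generated before and now, diff them, and redirect the missing ones) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the example paths are made up rather than taken from a real build, and matching pages by file name is just one possible heuristic. How the resulting pairs would be turned into actual 301 rules depends on how the site is served, which I don't know.

```python
from pathlib import Path, PurePosixPath


def list_pages(root):
    """Return the .html paths under root, relative to root, in POSIX form."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*.html")}


def missing_pages(old_pages, new_pages):
    """Pages present in the old build but absent from the new one."""
    return sorted(set(old_pages) - set(new_pages))


def guess_redirects(old_pages, new_pages):
    """Pair each missing old page with a new page of the same file name.

    Only unambiguous matches (exactly one candidate) are returned as
    redirects; everything else is reported for manual mapping.
    """
    by_name = {}
    for page in new_pages:
        by_name.setdefault(PurePosixPath(page).name, []).append(page)

    redirects, unresolved = {}, []
    for page in missing_pages(old_pages, new_pages):
        candidates = by_name.get(PurePosixPath(page).name, [])
        if len(candidates) == 1:
            redirects[page] = candidates[0]
        else:
            unresolved.append(page)
    return redirects, unresolved


if __name__ == "__main__":
    # Made-up page sets for illustration; real ones would come from
    # list_pages() run on the old and the new generated docs directories.
    old = {"index.html", "pyspark.sql.html", "pyspark.streaming.html"}
    new = {"index.html", "reference/pyspark.sql.html"}
    redirects, unresolved = guess_redirects(old, new)
    for src, dst in redirects.items():
        print(f"301: /{src} -> /{dst}")
    print("needs manual mapping:", unresolved)
```

Pages that cannot be matched unambiguously are left in the unresolved list on purpose, since a wrong redirect is arguably worse than a 404.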