[jira] [Commented] (SPARK-34606) New PySpark documentation has different URLs

2021-03-07 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296795#comment-17296795
 ] 

Apache Spark commented on SPARK-34606:
--

User 'kokes' has created a pull request for this issue:
https://github.com/apache/spark/pull/31770

> New PySpark documentation has different URLs
> 
>
> Key: SPARK-34606
> URL: https://issues.apache.org/jira/browse/SPARK-34606
> Project: Spark
>  Issue Type: Bug
>  Components: Documentation, PySpark
>Affects Versions: 3.1.1
>Reporter: Ondrej Kokes
>Priority: Minor
>
> The new documentation site moved some subsites to different URLs, notably the 
> PySpark API reference ([see 
> here|https://spark.apache.org/docs/latest/api/python/pyspark.sql.html]). 
> (Note the new `/reference/` bit in the new URL.)
> It's the first hit when you google "pyspark sql functions"; you'll also get 
> there if you search for individual functions or modules (e.g. "pyspark 
> streaming").
> I looked through various JIRA tickets and pull requests, but couldn't find a 
> mention of this. Even the pull request introducing the new documentation site 
> says the only change visible to users is the design, not the location.
> Possible resolutions:
> * let the links be refreshed by search engines and live with dead links in 
> various places (Stack Overflow, emails, bookmarks, ...)
> * identify the missing pages and provide 301 redirects for them (they could be 
> found in logs, Google Analytics, or maybe we can list all assets generated 
> before/now and diff them)
> * change the Sphinx configuration to produce the same links as before
> Links to potentially relevant tickets and PRs:
> * https://issues.apache.org/jira/browse/SPARK-31851
> * https://github.com/apache/spark/pull/29188
> * https://issues.apache.org/jira/browse/SPARK-32188



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-34606) New PySpark documentation has different URLs

2021-03-07 Thread Ondrej Kokes (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296794#comment-17296794
 ] 

Ondrej Kokes commented on SPARK-34606:
--

[~hyukjin.kwon] I gave it a go and [submitted a 
PR|https://github.com/apache/spark/pull/31770].




[jira] [Commented] (SPARK-34606) New PySpark documentation has different URLs

2021-03-07 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296749#comment-17296749
 ] 

Hyukjin Kwon commented on SPARK-34606:
--

Yeah, I noticed this problem too. One simple solution is to at least redirect 
the old URLs to the root page of the new documentation. I don't think it's 
feasible to map each legacy link to its new location.
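
As a rough illustration of the "redirect to the root page" idea, here is a minimal Python sketch that writes one small HTML stub per legacy page, each pointing back at the new root page. It uses a client-side meta refresh rather than a server-side 301, and the names `write_redirect_stubs`, `STUB` and the default `index.html` target are assumptions made for this example, not what the Spark build actually does.

```python
from html import escape
from pathlib import Path

# Hypothetical stub template: a client-side redirect, not a server-side 301.
STUB = """<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta http-equiv="refresh" content="0; url={target}">
    <link rel="canonical" href="{target}">
  </head>
  <body>
    <p>This page has moved to <a href="{target}">{target}</a>.</p>
  </body>
</html>
"""


def write_redirect_stubs(out_dir, legacy_pages, root_page="index.html"):
    """Write a redirect stub for every legacy page path (relative to out_dir).

    Every stub points at the same root page, expressed relative to the
    stub's own location so the output works under any URL prefix.
    """
    out = Path(out_dir)
    written = []
    for page in legacy_pages:
        # One "../" per directory level between the stub and the docs root.
        depth = page.count("/")
        target = escape("../" * depth + root_page, quote=True)
        stub_path = out / page
        stub_path.parent.mkdir(parents=True, exist_ok=True)
        stub_path.write_text(STUB.format(target=target), encoding="utf-8")
        written.append(stub_path)
    return written
```

Calling `write_redirect_stubs("build/html", ["pyspark.sql.html"])` would create `build/html/pyspark.sql.html` as a stub that forwards to `index.html`. Real 301 responses would need web server configuration instead.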




[jira] [Commented] (SPARK-34606) New PySpark documentation has different URLs

2021-03-06 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17296750#comment-17296750
 ] 

Hyukjin Kwon commented on SPARK-34606:
--

[~ondrej], are you working on this? A PR would be very welcome.




[jira] [Commented] (SPARK-34606) New PySpark documentation has different URLs

2021-03-03 Thread Ondrej Kokes (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17294423#comment-17294423
 ] 

Ondrej Kokes commented on SPARK-34606:
--

I built the HTML docs for PySpark 2.4.7 and for the current master; here's 
the one-way diff (set(2.4.7) - set(master)), i.e. pages present in 2.4.7 but 
absent from master.

Missing docs:
* build/html/pyspark.html
* build/html/pyspark.ml.html
* build/html/pyspark.mllib.html
* build/html/pyspark.sql.html
* build/html/pyspark.streaming.html

other pages not present (module code):
* build/html/_modules/pyspark/profiler.html
* build/html/_modules/pyspark/serializers.html
* build/html/_modules/pyspark/sql/catalog.html
* build/html/_modules/pyspark/sql/context.html
* build/html/_modules/pyspark/sql/udf.html
* build/html/_modules/pyspark/status.html
* build/html/_modules/pyspark/streaming/flume.html
* build/html/_modules/pyspark/streaming/kafka.html
* build/html/_modules/pyspark/streaming/listener.html
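
A one-way diff like the one above can be sketched in a few lines of Python. This is only an illustration of the set-difference approach, assuming both versions have already been built into separate directories; the function names `html_pages` and `missing_pages` and the example directory names are hypothetical.

```python
from pathlib import Path


def html_pages(build_root):
    """Return the set of .html paths under build_root, relative to it."""
    root = Path(build_root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*.html")}


def missing_pages(old_root, new_root):
    """Pages present in the old build but absent from the new one (sorted)."""
    return sorted(html_pages(old_root) - html_pages(new_root))
```

For example, `missing_pages("docs-2.4.7/build/html", "docs-master/build/html")` would list the legacy pages that have no counterpart at the same path in the new build (both directory names are placeholders).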
