panbingkun commented on PR #482:
URL: https://github.com/apache/spark-website/pull/482#issuecomment-1760614768

   > @panbingkun thanks for doing this. However, I discovered that some of the 
canonical links generated are not a valid URL, for example: 
https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.DataFrame.groupBy.html
 Is there a way to update this canonical link to the actual latest 
documentation for groupBy?
   
   Yes, I noticed this as well. The cause is not the update logic itself but 
the fact that the location of the same document can change between versions. 
For example, in version 3.1.1 the `pyspark.sql.DataFrame.groupBy` page is 
located at: 
`https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.sql.DataFrame.groupBy.html?highlight=groupby#pyspark.sql.DataFrame.groupBy`
   
   According to normal logic, the `canonical link` should be: `<link 
rel="canonical" 
href="https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.DataFrame.groupBy.html"
 />`
   But in the new version, this document has been moved to a different 
location: 
https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.groupBy.html?highlight=groupby#pyspark.sql.DataFrame.groupBy
   So, from the perspective of the old document, the `canonical link` appears 
to be incorrect.
   This issue will always exist: whenever a document moves in a new version, 
the `canonical link` in every older version of that document has to be updated 
in sync.
   Of course, we can fix this manually, but what happens when the document 
location changes again later? This is a difficult problem.
   There is also a second issue: a document can disappear entirely in a newer 
version. How should we handle that case?
   Should we simply not add a `canonical link` there?
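
   One possible direction, only as a rough sketch and not what this PR 
currently does: resolve each old page against the set of pages that actually 
exist under `docs/latest`, fall back to a unique file-name match when the page 
has moved, and emit no `canonical link` when nothing matches. The function 
name `resolve_canonical` and the `latest_paths` input are hypothetical; the 
script would have to collect that set of relative paths itself.

```python
from posixpath import basename

# Assumed base URL for canonical links (taken from the examples above).
LATEST_BASE = "https://spark.apache.org/docs/latest/"


def resolve_canonical(old_path, latest_paths, base=LATEST_BASE):
    """Return a canonical URL for a page from an old release, or None.

    old_path     -- path of the page relative to the old docs root
    latest_paths -- set of relative paths that exist under docs/latest
                    (hypothetical input, gathered by the caller)
    """
    # Case 1: the page still lives at the same relative path.
    if old_path in latest_paths:
        return base + old_path

    # Case 2: the page moved; accept the new location only if exactly one
    # page in the latest docs has the same file name.
    name = basename(old_path)
    matches = [p for p in latest_paths if basename(p) == name]
    if len(matches) == 1:
        return base + matches[0]

    # Case 3: the page disappeared or the match is ambiguous;
    # the caller should then skip the <link rel="canonical"> tag.
    return None
```

   With the `groupBy` page above, the old path 
`api/python/reference/api/pyspark.sql.DataFrame.groupBy.html` would resolve to 
the new `api/python/reference/pyspark.sql/api/...` location as long as the 
file name is unique in the latest docs. Rerunning this on every release would 
also cover later moves, at the cost of rewriting the old pages each time.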


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

