viirya commented on code in PR #56080:
URL: https://github.com/apache/spark/pull/56080#discussion_r3295563346


##########
docs/_config.yml:
##########
@@ -25,12 +25,10 @@ SCALA_BINARY_VERSION: "2.13"
 SCALA_VERSION: "2.13.18"
 SPARK_ISSUE_TRACKER_URL: https://issues.apache.org/jira/browse/SPARK
 SPARK_GITHUB_URL: https://github.com/apache/spark
-# Before a new release, we should:
-#   1. update the `version` array for the new Spark documentation
-#      on https://github.com/algolia/docsearch-configs/blob/master/configs/apache_spark.json.
-#   2. update the value of `facetFilters.version` in `algoliaOptions` on the new release branch.
-# Otherwise, after release, the search results are always based on the latest documentation
-# (https://spark.apache.org/docs/latest/) even when visiting the documentation of previous releases.
+# The DocSearch index is maintained by the Algolia crawler at https://crawler.algolia.com/.
+# The crawler indexes only https://spark.apache.org/docs/latest/ and tags every page with
+# `version:latest`. All release branches share this single index, so `facetFilters` stays
+# pinned to `version:latest` everywhere and no per-release update is required.

Review Comment:
   This comment doesn't match the live index: querying `apache_spark` returns 
`version` facet values `{latest, 4.1.0, 4.1.1, 4.1.2, 4.0.0}`, and 
`facetFilters: ["version:4.1.2"]` still returns 4.1.2-specific URLs. The 
crawler isn't `latest`-only — release pages are still indexed and tagged with 
their version. As written this will mislead the next release manager into 
thinking version filters are unused; please either correct the description of 
the crawler's behavior, or — if the intent is to deliberately switch to 
`latest`-only — say so explicitly and link to the crawler-side change that 
makes it true.
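
   A minimal sketch of how the facet check described above could be
   reproduced against Algolia's search REST endpoint. `ALGOLIA_APP_ID` and
   `ALGOLIA_API_KEY` are placeholders (assumptions, not values from this PR),
   and the helper names are hypothetical. The functions only build and send
   requests; nothing runs until they are called.

```shell
#!/usr/bin/env bash
# Sketch only: ALGOLIA_APP_ID / ALGOLIA_API_KEY are assumed to be exported by
# the caller; they are placeholders, not credentials taken from the PR.

# JSON body that asks for zero hits and only the counts of the `version` facet.
build_facet_query() {
  printf '{"query":"","hitsPerPage":0,"facets":["version"]}\n'
}

# JSON body that restricts a search to a single version facet value.
# Arguments are interpolated without JSON escaping, so keep them simple.
build_filtered_query() {
  local version="$1" query="$2"
  printf '{"query":"%s","hitsPerPage":5,"facetFilters":["version:%s"]}\n' \
    "$query" "$version"
}

# POST a body to the search endpoint of the given index.
query_index() {
  local index="$1" body="$2"
  curl -sS -X POST \
    -H "X-Algolia-Application-Id: ${ALGOLIA_APP_ID}" \
    -H "X-Algolia-API-Key: ${ALGOLIA_API_KEY}" \
    --data "$body" \
    "https://${ALGOLIA_APP_ID}-dsn.algolia.net/1/indexes/${index}/query"
}
```

   For example, `query_index apache_spark "$(build_facet_query)"` should list
   the `version` facet values in the response's `facets` field, and
   `query_index apache_spark "$(build_filtered_query 4.1.2 dataframe)"` shows
   which URLs a version-pinned filter returns.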



##########
dev/create-release/release-tag.sh:
##########
@@ -84,7 +84,6 @@ fi
 # Set the release version in docs
 sed -i".tmp1" 's/SPARK_VERSION:.*$/SPARK_VERSION: '"$RELEASE_VERSION"'/g' docs/_config.yml
 sed -i".tmp2" 's/SPARK_VERSION_SHORT:.*$/SPARK_VERSION_SHORT: '"$RELEASE_VERSION"'/g' docs/_config.yml
-sed -i".tmp3" "s/'facetFilters':.*$/'facetFilters': [\"version:$RELEASE_VERSION\"]/g" docs/_config.yml

Review Comment:
   Removing this rewrite means every future release branch (and the HTML 
shipped under `/docs/<X.Y.Z>/`) will ship `facetFilters: ["version:latest"]`. 
Combined with the live index still containing populated per-version facets, 
that turns release-page search into "jump to `/docs/latest/`" rather than 
staying on the user's release — a user-facing regression vs. the intent of 
SPARK-33479. If we do want this change, the PR description should reflect it; 
otherwise this `sed` (and the symmetric one below for `R_NEXT_VERSION`) should 
stay.
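
   To make the effect of the removed line concrete, here is a small sketch
   that applies the same substitution to a scratch file. The sample
   `facetFilters` line and the value of `RELEASE_VERSION` are illustrative
   assumptions, not copied from the real `docs/_config.yml`.

```shell
#!/usr/bin/env bash
# Sketch only: the version and the sample config line are assumptions.
RELEASE_VERSION="4.1.2"

# Work on a scratch copy so no real file is touched.
tmpdir="$(mktemp -d)"
cfg="$tmpdir/_config.yml"
printf "  'facetFilters': [\"version:latest\"]\n" > "$cfg"

# Same substitution as the removed release-tag.sh line, pointed at the scratch file.
sed -i".tmp3" "s/'facetFilters':.*$/'facetFilters': [\"version:$RELEASE_VERSION\"]/g" "$cfg"

# Capture and show the rewritten line, then clean up.
result="$(cat "$cfg")"
echo "$result"
rm -rf "$tmpdir"
```

   With the rewrite in place, the scratch line ends up pinned to
   `version:4.1.2`; without it, the release branch would keep
   `version:latest`.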



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

