neilramaswamy opened a new pull request, #47864: URL: https://github.com/apache/spark/pull/47864
### What changes were proposed in this pull request?

These changes break the Structured Streaming Programming Guide into smaller sub-pages **without changing any content**. I broke up the pages by `h1` tag; within pages, the sub-sections on the left menu are broken up by `h2`. The SS programming guide will now resemble the SQL programming guide and the MLlib programming guide.

Additionally, to avoid cluttering the top-level namespace (there are dozens of `sql-*` files for the SQL reference), we nest all streaming docs one directory deeper, under `/streaming/`. This has the side effect of breaking links from our `_layouts`, since they assume a flat top-level namespace. To fix this issue, URLs in global layout files now all use absolute paths.

This move to `/streaming/` has the consequence that bookmarks of `https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html` will no longer point to the actual programming guide content. In anticipation of this, I have kept pages at all existing URLs, with links to the pages in their new locations. This includes the new state data source and the Kafka integration guide.

In the future, we'll be able to quite easily (and in parallel) break the programming guide apart further. This PR does all of the plumbing to make it work. It is future work to fix the oddly-sized left-navigation bar for our menus.

### Why are the changes needed?

One of the major hurdles that users have with Structured Streaming is that our guide is exceptionally long; it feels insurmountable, especially compared to other engines like Flink, which has many sub-pages. Google also has a very tricky time indexing the single large page; if you Google "[structured streaming output mode](https://www.google.com/search?q=structured+streaming+output+mode)" and you click on the link to our programming guide... nothing happens. You aren't taken to the actual content, since Google has trouble with indexing to specific heading tags.
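
As a rough illustration of why nesting pages under `/streaming/` breaks relative links in shared layouts, the sketch below resolves the same `href` from a top-level page and from a nested page using Python's standard `urllib.parse.urljoin`. The nested page name `example-page.html` and the exact absolute path are assumptions made for the example; the form of absolute URL actually used in the layout files is not shown in this description.

```python
from urllib.parse import urljoin

BASE = "https://spark.apache.org/docs/latest/"


def resolve(page_url: str, href: str) -> str:
    """Resolve a link the way a browser would, relative to the page that contains it."""
    return urljoin(page_url, href)


# A page in the flat top-level namespace, and a page nested one directory deeper.
top_level_page = BASE + "structured-streaming-programming-guide.html"
nested_page = BASE + "streaming/example-page.html"  # hypothetical page name

# The same layout link written two ways (illustrative values).
relative_href = "sql-programming-guide.html"
absolute_href = "/docs/latest/sql-programming-guide.html"

# A relative link resolves against the directory of the containing page,
# so it points somewhere different once the page moves into /streaming/.
relative_from_top = resolve(top_level_page, relative_href)
relative_from_nested = resolve(nested_page, relative_href)

# An absolute path resolves against the site root, regardless of nesting.
absolute_from_top = resolve(top_level_page, absolute_href)
absolute_from_nested = resolve(nested_page, absolute_href)

print(relative_from_top)
print(relative_from_nested)
print(absolute_from_top)
print(absolute_from_nested)
```

With a relative `href`, the nested page resolves the link inside `/streaming/`, where the target does not exist; with an absolute path, both pages resolve to the same URL.
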
### Does this PR introduce _any_ user-facing change?

The structure of the website, with respect to Structured Streaming-related pages, is now changed. See the earlier parts of the PR description for the specific changes. However, **no** content is changed. This should make reviewing the changes much easier.

### How was this patch tested?

I have used automated tools (e.g. [Lychee](https://github.com/lycheeverse/lychee)) and manual verification (i.e. clicking on every link) to make sure that I didn't break any links. It isn't fool-proof, though.

### Was this patch authored or co-authored using generative AI tooling?

No.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
