Thanks, Dan.

> Even just a minor rename to have it be SitemapLinkExternalizer or something
> similar might make more sense as even if a more general solution comes
> available

I renamed the Externalizer interface to SitemapLinkExternalizer as you
suggested. At least for the canonical link on a page the same
externalization should (if not must) be used [0], but there can indeed be
others.

> How would cleanup work? Based on a cursory review of the code I'm

You are right. It is as simple as iterating through all the sitemap files
and checking whether they are still "relevant", meaning their corresponding
content resource exists and is still a sitemap root.

> It'd be *nice* to have a Web Console or some means for an administrator /
> developer to understand what sitemaps currently exist and trigger
> regeneration (or if there's some better way let me know)

I added an InventoryPrinter [1] to cover the first part and introduced some
API methods in the SitemapService [2] to cover the latter. I did not create
a WebConsolePlugin because, from my point of view, (re)generation should be
accessible to business users, so the product/project should provide a UI.
I am not sure what that could look like for Sling CMS.

Best,
Dirk

[0] https://developers.google.com/search/docs/advanced/sitemaps/build-sitemap#general-guidelines
[1] https://github.com/apache/sling-whiteboard/blob/master/sitemap/src/main/java/org/apache/sling/sitemap/impl/console/SitemapInventoryPlugin.java
[2] https://github.com/apache/sling-whiteboard/blob/master/sitemap/src/main/java/org/apache/sling/sitemap/SitemapService.java#L39

On Fri, 4 Jun 2021 at 22:11, Daniel Klco <[email protected]> wrote:

> Dirk,
>
> Looks great to me! A couple of thoughts:
>
> - I like this 10x better than my hacked together script in Sling CMS :-)
> - I don't think a different externalization method has been implemented. I
> like this one, but I'm not sure it's appropriate for the scope of this
> bundle.
> Even just a minor rename to have it be SitemapLinkExternalizer or
> something similar might make more sense as even if a more general solution
> comes available, there could be legitimate reasons for an externalizer to
> work differently when generating a sitemap than other use cases
> - How would cleanup work? Based on a cursory review of the code I'm
> assuming it'd have to check the repository for each sitemap to find ones
> that are no longer referenced? Sound about right?
> - It'd be *nice* to have a Web Console or some means for an
> administrator / developer to understand what sitemaps currently exist and
> trigger regeneration (or if there's some better way let me know)
>
> Awesome work!
> -Dan
>
> On Fri, Jun 4, 2021 at 12:31 PM Dirk Rudolph <[email protected]> wrote:
>
> > Hi all,
> >
> > I added a new bundle for xml sitemap generation to the whiteboard [0] and
> > kindly want to ask for your feedback.
> >
> > The key highlights are:
> > - A simple, builder-like API to create Sitemaps, that hides all the XML
> > specifics
> > - Supports on-demand and background generation w/ continuation after job
> > interruption
> > - Support for nested sitemaps, that are automatically collected into a
> > sitemap indexes
> >
> > As this implementation depends on an actual project / product's content
> > structure, I created a sample implementation for the Sling CMS [1].
> >
> > I still have some open points on my list:
> > - Link externalization. IIRC there was a discussion to implement a general
> > approach in Sling, has that been implemented?
> > - Housekeeping of old/obsolete sitemap files
> >
> > However, I wanted to start the discussion and ask - when there are no major
> > objections - if this contribution could make its own module?
> >
> > Best,
> > Dirk
> >
> > [0] https://github.com/apache/sling-whiteboard/tree/master/sitemap
> > [1]
> > https://github.com/apache/sling-org-apache-sling-app-cms/compare/master...Buuhuu:feature/sitemap
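
PS: To illustrate the cleanup check described at the top of this mail (iterate
over the sitemap files and keep only those whose content resource still exists
and is still a sitemap root), here is a minimal sketch. It is not the code in
the bundle: the class name, the method name, the plain-collection inputs and
the example paths are all assumptions made for illustration, and the real
implementation would look things up in the repository instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only; none of these names exist in the sitemap bundle.
final class SitemapCleanupSketch {

    /**
     * Returns the sitemap files that are no longer "relevant".
     *
     * @param sitemapFiles      stored sitemap file path -> path of the content
     *                          resource it was generated for (assumed layout)
     * @param existingResources paths of content resources that still exist
     * @param sitemapRoots      paths that are currently marked as sitemap roots
     */
    static List<String> findObsolete(Map<String, String> sitemapFiles,
                                     Set<String> existingResources,
                                     Set<String> sitemapRoots) {
        List<String> obsolete = new ArrayList<>();
        for (Map.Entry<String, String> entry : sitemapFiles.entrySet()) {
            String contentPath = entry.getValue();
            // Relevant means: the resource exists AND is still a sitemap root.
            boolean relevant = existingResources.contains(contentPath)
                    && sitemapRoots.contains(contentPath);
            if (!relevant) {
                obsolete.add(entry.getKey());
            }
        }
        // Sort so the result does not depend on the map's iteration order.
        Collections.sort(obsolete);
        return obsolete;
    }
}
```

A caller would then delete (or just report) every path in the returned list.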
