What is the solution for rewriting URLs contained within the output of a
rich text editor? Something similar to https://wcm.io/handler/richtext/? I
hope that downstream implementers of Sling will provide a pathway for
replacing the rewriter if it is removed. I suspect its removal would mean a
tremendous amount of work to fix and test on upgrade for many
implementations.

I agree the AEM Link Checker is a bad idea, though I would argue that has
more to do with the concept of removing links at runtime than with the way
it is implemented.

The rewriter is not the most efficient method, but it does give you
the ability to affect the entire page structure in one go vs. having to
handle it on a per-component, resource type or field basis. The problem I
see with most implementations of Sling using the rewriter is that they do
not give enough control over how the rewriter operates. I've been using it
with the Sling CMS reference, for example, to allow for whitelisting
attributes and elements on a per-site basis using Sling CA Configs:
https://github.com/apache/sling-org-apache-sling-app-cms/blob/master/docs/configure-site.md#rewrite-configuration
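
To make the per-site idea concrete, here is a minimal sketch in plain Java.
It is not the Sling CMS or Sling Rewriter API: the class SiteRewriteSketch,
the SiteRewriteConfig holder, the /content/site-a root and the
element/attribute lists are all hypothetical, and a real setup would read
these values from a context-aware configuration rather than hard-coding
them.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative only: not the Sling CMS or Sling Rewriter API. */
public class SiteRewriteSketch {

    /** Hypothetical per-site settings, standing in for a context-aware configuration. */
    static final class SiteRewriteConfig {
        final String contentRoot;
        // element name -> attribute names whose values may be rewritten
        final Map<String, Set<String>> linkAttributes;

        SiteRewriteConfig(String contentRoot, Map<String, Set<String>> linkAttributes) {
            this.contentRoot = contentRoot;
            this.linkAttributes = linkAttributes;
        }
    }

    /** True if this site allows rewriting the given attribute on the given element. */
    static boolean isAllowed(SiteRewriteConfig config, String element, String attribute) {
        Set<String> attrs = config.linkAttributes
                .getOrDefault(element.toLowerCase(Locale.ROOT), Set.of());
        return attrs.contains(attribute.toLowerCase(Locale.ROOT));
    }

    /** Strips the site's content root from allowed attributes; everything else passes through. */
    static String rewrite(SiteRewriteConfig config, String element, String attribute, String value) {
        String prefix = config.contentRoot + "/";
        if (!isAllowed(config, element, attribute) || !value.startsWith(prefix)) {
            return value;
        }
        return value.substring(config.contentRoot.length());
    }

    public static void main(String[] args) {
        SiteRewriteConfig siteA = new SiteRewriteConfig(
                "/content/site-a",
                Map.of("a", Set.of("href"), "img", Set.of("src")));

        // Allowed element/attribute pair: the content root is removed.
        System.out.println(rewrite(siteA, "a", "href", "/content/site-a/news.html"));
        // Pair not listed for this site: the value is left untouched.
        System.out.println(rewrite(siteA, "div", "data-link", "/content/site-a/news.html"));
    }
}
```

The point of the sketch is only that the list of rewritable elements and
attributes is data owned by the site, so two sites on the same instance can
behave differently without a code change.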


On Mon, Sep 9, 2019, 5:52 PM Stefan Seifert <sseif...@pro-vision.de> wrote:

> yes, that's what we roughly outlined at [1]. of course that's still a lot
> of work to do to get rid of the rewriter, and when the time comes it may
> still be kept in place, deprecated, for more exotic use cases (like PDF
> generation).
>
> why / for what use cases do we want to get rid of it - my view:
>
> - the rewriter is currently (esp. in AEM) mainly used for checking
> validity of links in a post-processing step
> - that means AFTER the markup was generated by all the scripts, it is
> parsed again, links are identified by heuristics, and if a link seems to
> point to an internal page/resource, and this page/resource does not exist,
> the link is removed by the rewriter
> - this leads to several problems:
>   - performance issue: why generate the markup and directly afterwards
> parse it again to serialize it again?
>   - responsibility/control issue: what should happen if a link is invalid
> should lie in the control of the component/script that renders the
> link. sometimes only the anchor is unwrapped, but often more is happening,
> e.g. the whole teaser is hidden, the whole navigation entry is hidden etc.
> the rewriter cannot "know" these use cases.
>   - link detection issue: because it relies on heuristics, it is never
> certain that the rewriter can detect all links. e.g. by default it cannot
> detect links in data attributes, links in javascript, links in JSON files
> etc.
> - that means the aspect of externalizing and validating links belongs in
> the business/framework logic, not in a post-processing link rewriter.
>
> that's why we built the URL/link handler in wcm.io [2] - and i think in
> the long run sling should go in this direction as well (at least for the
> basic URL handling features). i know that lots of sling-based applications
> put all their complex link handling logic into customized rewriter
> pipelines, but imho this is the wrong place for it. see also [3] for some
> background information.
>
> stefan
>
> [1]
> https://lists.apache.org/thread.html/c80093524461d7203fa9799b79ebbf6bfd1bb3f9795865f4aaf3cd4a@%3Cdev.sling.apache.org%3E
> [2] https://wcm.io/handler/
> [3]
> https://adapt.to/2019/en/schedule/assets-and-links-in-aem-projects.html
>
>
> >-----Original Message-----
> >From: Jason E Bailey [mailto:j...@apache.org]
> >Sent: Monday, September 9, 2019 3:07 PM
> >To: dev@sling.apache.org
> >Subject: Why get rid of the rewriter?
> >
> >I obviously missed the chat at the adaptTo() meetup.
> >
> >Can anyone summarize why we are getting rid of it? And the thought process
> >on replacing it?
> >
> >- Jason
>
>
>
