let me try to write a first summary:

- we have good use cases for a rewriter in general, but are unhappy with the 
current implementation and its way of configuration. the whole thing is a bit 
"alien" to the other parts of sling.

- a new rewriter implementation should be performant and have up-to-date 
support for modern HTML. ideally something like SAX for HTML (not XML/XHTML). 
this would not cover other use cases producing something other than HTML, 
though. perhaps there are libraries out there that can be built upon. it should 
also provide modular support, e.g. rewriting only the output of a single 
component or text fragment. maybe someone can come up with a proposal.

- we have to keep the old rewriter for backward compatibility because it was 
used a lot in the past, but do not plan to maintain it further and probably 
will remove it from the sling starter as well. mark it as deprecated or contrib.

- we no longer want to use the rewriter for link validation and 
externalization, but want to support this as an aspect in another, more 
appropriate way (with some basic support or hooks/SPIs from Sling itself, the 
rest is more up to upstream layers). we currently have only very rough ideas 
here, a proposal needs to be drafted as well.
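
to illustrate why "SAX for HTML (not XML/XHTML)" matters, here is a minimal 
sketch using only the JDK's XML SAX parser. the class and method names 
(XmlVsHtml, parsesAsXml) are made up for this example and have nothing to do 
with the actual rewriter code; it only shows that valid HTML5 constructs such 
as valueless attributes or omitted end tags are rejected by a strict XML parser.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class XmlVsHtml {

    // Hypothetical helper: returns true if the markup is accepted by the
    // JDK's XML SAX parser, false if it is rejected as not well-formed.
    public static boolean parsesAsXml(String markup) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(markup)), new DefaultHandler());
            return true;
        } catch (SAXParseException e) {
            // Well-formedness error reported by the XML parser.
            return false;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Anything else is an environment problem, not a markup problem.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // XHTML-style markup: accepted.
        System.out.println(parsesAsXml("<p><input disabled=\"disabled\"/></p>"));
        // HTML5 attribute without a value: rejected.
        System.out.println(parsesAsXml("<p><input disabled></p>"));
        // HTML5 list items without end tags: rejected.
        System.out.println(parsesAsXml("<ul><li>one<li>two</ul>"));
    }
}
```

a parser for the new implementation would have to emit events for the last two 
inputs as well, instead of failing or requiring the markup to be repaired first.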

stefan

>-----Original Message-----
>From: Jason E Bailey [mailto:jason.bai...@24601.org]
>Sent: Tuesday, September 10, 2019 6:09 PM
>To: dev@sling.apache.org
>Subject: Re: Why get rid of the rewriter?
>
>I'm positive towards the concept of the rewriter, a utility that provides
>centralized features that addresses cross-cutting concerns with user
>generated content.
>
>I'm not a fan of the current implementation of the rewriter.
>
>1. As Ruben pointed out, as of HTML5, HTML doesn't have anything to do
>with XML. There is no concept of namespace in HTML now. There is no self
>closing tag. There may not be an end tag. There is a concept of implied
>parent tags.
>
>2. TagSoup, which is currently used to parse the HTML, requires fully
>cached content and will attempt to validate the content and add to it
>where appropriate, which isn't necessary.
>
>3. The Rewriter requires the pipeline configuration to be in a specific
>place with a specific name which is inflexible and contrary to other tools
>we use.
>
>
>The point being that I would prefer to have the rewriter implementation
>deprecated in favor of a more up to date solution than having the concept
>of the rewriter deprecated.
>
>
>
>--
>Jason
>
>On Tue, Sep 10, 2019, at 8:56 AM, Ruben Reusser wrote:
>> As was pointed out before the rewriter is used in a lot of projects for
>> other things than rewriting links (in our case we use it a lot to inject
>> legal disclaimers or content fragment models)
>>
>> The bigger problem however is that it assumes html == xml and hence cannot
>> deal with attributes with no value
>>
>> Ruben
>>
>> On Tue, Sep 10, 2019 at 5:12 AM Bertrand Delacretaz <bdelacre...@apache.org> wrote:
>>
>> > Hi,
>> >
>> > On Mon, Sep 9, 2019 at 3:07 PM Jason E Bailey <j...@apache.org> wrote:
>> > > ...Can anyone summarize why we are getting rid of it?...
>> >
>> > I'm not sure if we need to "get rid" of that module, even if some
>> > portion of Sling users stops using it.
>> >
>> > The proposal at [1] says the rewriter should be "deprecated and no
>> > longer used", which is apparently what was discussed at the adaptTo
>> > round table or hackathon.
>> >
>> > If people still find the module useful I think it's fine to move it
>> > to "contrib" status instead of deprecating. As long as there's a
>> > reasonable expectation that the module will be maintained I think
>> > that's a realistic status, but our guarantees are weak for contrib
>> > modules so there's no pressure.
>> >
>> > And if other modules provide better ways of doing similar things, link
>> > to them from the rewriter's docs.
>> >
>> > -Bertrand
>> >
>> > [1] https://lists.apache.org/thread.html/c80093524461d7203fa9799b79ebbf6bfd1bb3f9795865f4aaf3cd4a@%3Cdev.sling.apache.org%3E
>> >
>>
>>
>> --
>> thank you
>>
>> Ruben Reusser
>> CTO, headwire.com, Inc
>>

