Sounds good to me. Now, for the rewriter I would like to have something more specialized and geared towards HTML than a generic thing like SAX. So basically, instead of getting the information that the element name is "a" and then having to figure out that this is a link, you already get the information that this is a "LINK" or a "SRC" or whatever.
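
To illustrate the idea, here is a minimal sketch in Java of what such pre-classified events could look like. Everything in it is an assumption made up for illustration: the names `Kind`, `HtmlRewriteHandler`, `classify` and `dispatch` are hypothetical and not part of any existing Sling API, and the classification table is deliberately tiny.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class HtmlSemanticEvents {

    /** Hypothetical semantic categories; not an existing Sling API. */
    public enum Kind { LINK, SRC, OTHER }

    /** Hypothetical callback that receives attributes already classified. */
    public interface HtmlRewriteHandler {
        void onAttribute(Kind kind, String element, String attribute, String value);
    }

    /** Maps an element/attribute pair to a semantic kind (illustrative subset only). */
    static Kind classify(String element, String attribute) {
        // HTML names are case-insensitive, so normalize before comparing.
        String e = element.toLowerCase(Locale.ROOT);
        String a = attribute.toLowerCase(Locale.ROOT);
        if (a.equals("href") && (e.equals("a") || e.equals("area") || e.equals("link"))) {
            return Kind.LINK;
        }
        if (a.equals("src") && (e.equals("img") || e.equals("script") || e.equals("iframe"))) {
            return Kind.SRC;
        }
        return Kind.OTHER;
    }

    /** Sends every attribute of one element to the handler together with its kind. */
    static void dispatch(String element, Map<String, String> attributes, HtmlRewriteHandler handler) {
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            handler.onAttribute(classify(element, entry.getKey()), element, entry.getKey(), entry.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("href", "/content/page.html");
        attributes.put("class", "nav");
        // The handler sees LINK or OTHER and never has to inspect the element name itself.
        dispatch("a", attributes, (kind, element, attribute, value) ->
                System.out.println(kind + " " + element + "@" + attribute + " = " + value));
    }
}
```

A handler written against such an interface would only react to the kinds it cares about, for example rewriting every `LINK` value, without knowing which HTML elements carry links.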

Regards
Carsten

On 10.09.2019 at 18:37, Stefan Seifert wrote:
i'll try to write a first summary:

- we have good use cases for a rewriter in general, but are unhappy with the current 
implementation and its way of configuration. the whole thing is a bit "alien" 
to the other parts of sling.

- a new rewriter implementation should be performant and have up-to-date 
support for modern HTML. ideally something like SAX for HTML (not XML/XHTML). 
this would not cover other use cases producing something other than HTML, 
though. perhaps there are libraries out there that can be built upon. should 
also provide modular support, e.g. rewriting only the output of a single 
component or text fragment. maybe someone can come up with a proposal.

- we have to keep the old rewriter for backward compatibility because it was 
used a lot in the past, but do not plan to maintain it further and probably 
will remove it from the sling starter as well. mark it as deprecated or contrib.

- we no longer want to use the rewriter for link validation and 
externalization, but want to support this as an aspect in another, more appropriate way 
(with some basic support or hooks/SPIs from Sling itself, the rest is more up 
to upstream layers). we currently have only very rough ideas here, a proposal 
needs to be drafted as well.

stefan

-----Original Message-----
From: Jason E Bailey [mailto:jason.bai...@24601.org]
Sent: Tuesday, September 10, 2019 6:09 PM
To: dev@sling.apache.org
Subject: Re: Why get rid of the rewriter?

I'm positive towards the concept of the rewriter, a utility that provides
centralized features that address cross-cutting concerns with user
generated content.

I'm not a fan of the current implementation of the rewriter.

1. As Ruben pointed out, as of HTML5, HTML doesn't have anything to do
with XML. There is no concept of namespace in HTML now. There is no self
closing tag. There may not be an end tag. There is a concept of implied
parent tags.

2. TagSoup, which is currently used to parse the HTML, requires fully
cached content and will attempt to validate the content and add to it
where appropriate, which isn't necessary.

3. The Rewriter requires the pipeline configuration to be in a specific
place with a specific name which is inflexible and contrary to other tools
we use.


The point being that I would prefer to have the rewriter implementation
deprecated in favor of a more up to date solution than having the concept
of the rewriter deprecated.



--
Jason

On Tue, Sep 10, 2019, at 8:56 AM, Ruben Reusser wrote:
As was pointed out before the rewriter is used in a lot of projects for
other things than rewriting links (in our case we use it a lot to inject
legal disclaimers or content fragment models)

The bigger problem, however, is that it assumes html == xml and hence
cannot deal with attributes with no value
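
A small sketch of the underlying XML rule, assuming nothing beyond the JDK: a strict XML parser rejects an attribute without a value, while HTML5 allows it. This uses the JDK's built-in XML parser purely for illustration; it is not the rewriter's actual code path and does not involve TagSoup. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ValuelessAttributeDemo {

    /** Returns true if the JDK's XML parser accepts the markup as well-formed XML. */
    static boolean isWellFormedXml(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Well-formedness violations surface here as SAXParseException.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Valid in both XML and HTML: the attribute carries an explicit value.
        System.out.println(isWellFormedXml("<input disabled=\"disabled\"/>"));
        // Valid HTML5 boolean attribute, but not well-formed XML.
        System.out.println(isWellFormedXml("<input disabled/>"));
    }
}
```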

Ruben

On Tue, Sep 10, 2019 at 5:12 AM Bertrand Delacretaz
<bdelacre...@apache.org>
wrote:

Hi,

On Mon, Sep 9, 2019 at 3:07 PM Jason E Bailey <j...@apache.org> wrote:
...Can anyone summarize why we are getting rid of it?...

I'm not sure if we need to "get rid" of that module, even if some
portion of Sling users stops using it.

The proposal at [1] says the rewriter should be "deprecated and no
longer used", which is apparently what was discussed at the adaptTo
round table or hackathon.

If people still find the module useful I think it's fine to move it
to "contrib" status instead of deprecating. As long as there's a
reasonable expectation that the module will be maintained I think
that's a realistic status, but our guarantees are weak for contrib
modules so there's no pressure.

And if other modules provide better ways of doing similar things, link
to them from the rewriter's docs.

-Bertrand

[1]

https://lists.apache.org/thread.html/c80093524461d7203fa9799b79ebbf6bfd1bb3f9795865f4aaf3cd4a@%3Cdev.sling.apache.org%3E



--
thank you

Ruben Reusser
CTO, headwire.com, Inc




--
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org
