Re: [Standards] Security issues with XHTML-IM (again)

Goffi Thu, 12 Oct 2017 08:32:19 -0700

On Thursday, 12 October 2017, 16:58:02 CEST, Dave Cridland wrote:
> On 12 October 2017 at 15:19, Sam Whited <[email protected]> wrote:
> > On Thu, Oct 12, 2017, at 03:09, Dave Cridland wrote:
> >> I would note that in principle, a content security policy ought to
> >> prevent such attacks outright.
> >> 
> >> But there would, probably, remain several other innovative attacks,
> >> such as passing client-specific markup intended to duplicate existing
> >> UI elements.
> > 
> > Indeed. Using a restricted subset of a complicated system always
> > introduces the risk that some part of that complexity will not be
> > understood and will leak out, possibly causing security issues. We see
> > that on the web fairly regularly.
> > 
> > It's my belief that it's always better to use a simple, complete system
> > instead of a restricted, complex system. We see the same thing with
> > XMPP's use of XML: we may use a sane subset of it, but since the
> > underlying libraries still handle things like proc insts and whatever
> > the ampersand escape thing is called you still get attacks based on
> > those every so often (even though they're forbidden in XMPP).
> > 
> > I didn't bring this up in the original mail because it tends to get a
> > bit abstract, but it's worth discussing if we move to make a
> > replacement.
> 
> I think the problem isn't simply a subset of a complex system, it's
> that sanitizing HTML is a difficult and largely error prone problem
> which has repeatedly been the cause of a number of security problems.
> 
> I appreciate it's entirely possible, but even a simplified ruleset is
> something like:
> 
> 1) For each child element:
> a) Discard if this is an unsupported element.
> b) Remove any unsupported attributes.
> c) For the style attribute, parse the CSS and:
>     ii)  remove any unsupported attributes.
>     i) For attributes which (might) contain a URL, ensure the URL is
> of a scheme we think might be OK, although we won't tell you which
> those are.
> d) For each remaining HTML attribute which (might) contain a URL,
> ensure that any URL is of a scheme we think be be OK, although we
> won't tell you which those are.
> e) Recurse for each child element.
> 
> >> So overall, I think we should move rich IM formatting to Markdown and
> >> call it done.
> > 
> > Let's discuss this in a separate thread. I'd really like to try and keep
> > this about deprecating XHTML-IM, which I think is an orthogonal track of
> > work (unless you disagree, in which case, please voice that here!).
> 
> It's clearly not orthogonal, since simply getting rid of XHTML-IM is
> not deprecating it in favour of anything else.
> 
> But several clients have supported a basic Markdown-like syntax for
> emphasis for years - Gajim, for example, supports both *bold* and
> /italic/ at a quick test, and I think it has for years.
> 
> Slack does fine on just a handful more items (`preformat`, for example).
> 
> I appreciate Goffi's argument that Markdown-like syntaxes do not
> handle tables, but guess what? Nor does XHTML-IM.
> 
> So my argument for keeping it in this thread is really in order to
> understand what features of XHTML-IM are desirable rather than to
> fully specify a replacement - once we know that we want XHTML-IM's
> feature set to support bold, or tables, or inline images, or whatever
> then we can move on to design a replacement.
> 
> Dave.
> _______________________________________________
> Standards mailing list
> Info: https://mail.jabber.org/mailman/listinfo/standards
> Unsubscribe: [email protected]
> _______________________________________________




Just to clarify my position: I'm not absolutely opposed to using a new syntax 
as long as:
1) it is a publishing format, meaning well specified, with a schema to validate 
it (invalid syntax being rejected), and with reproducible rendering. This 
would exclude Markdown for all the reasons I've mentioned in my previous 
messages
2) it's easy to parse. Knowing that every XMPP software has at least an XML 
parser, I would prefer an XML-based syntax, but it's not an absolute necessity
3) we have a base set of features similar to XHTML-IM
4) we can extend it when needed (tables? forms?)

On the other hand, the proposal to remove the "style" attribute from XHTML-IM 
and reduce it to avoid XSS attacks seems reasonable, so I'm OK with both options 
(a new syntax, or making XHTML-IM less error-prone). The latter would avoid 
a lot of trouble (yet another syntax would bring incompatibilities for a long 
time, and would need work to standardize, while we are already lacking 
resources for other more urgent things).
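
To make the second option a bit more concrete, here is a minimal sketch in 
Python of an allow-list filter in which "style" is simply never accepted, so 
no CSS parsing is needed. The element, attribute and scheme lists below are 
assumptions chosen for illustration, not taken from XEP-0071, and the function 
names are hypothetical; a real client would need its own reviewed lists.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical allow-lists, for illustration only (not from XEP-0071).
ALLOWED_ELEMENTS = {"p", "span", "br", "strong", "em", "a", "img",
                    "blockquote", "ul", "ol", "li"}
ALLOWED_ATTRIBUTES = {"a": {"href"}, "img": {"src", "alt"}}  # no "style"
URL_ATTRIBUTES = {"href", "src"}
ALLOWED_SCHEMES = {"http", "https", "xmpp", "mailto"}


def split_tag(tag):
    """Split an ElementTree tag '{namespace}name' into (namespace, name)."""
    if tag.startswith("{"):
        namespace, _, name = tag[1:].partition("}")
        return namespace, name
    return "", tag


def url_is_allowed(url):
    """Accept only absolute URLs whose scheme is explicitly allowed."""
    return urlsplit(url.strip()).scheme.lower() in ALLOWED_SCHEMES


def sanitize(element):
    """Sanitize the children of `element` in place and return it.

    The element itself (e.g. the XHTML-IM <body/>) is not checked.
    """
    previous = None
    for child in list(element):
        namespace, name = split_tag(child.tag)
        if namespace != XHTML_NS or name not in ALLOWED_ELEMENTS:
            # Discard the unsupported element with its whole subtree,
            # but keep the text that followed it.
            if child.tail:
                if previous is None:
                    element.text = (element.text or "") + child.tail
                else:
                    previous.tail = (previous.tail or "") + child.tail
            element.remove(child)
            continue
        allowed = ALLOWED_ATTRIBUTES.get(name, set())
        for attr in list(child.attrib):
            value = child.attrib[attr]
            if attr not in allowed or (
                attr in URL_ATTRIBUTES and not url_is_allowed(value)
            ):
                del child.attrib[attr]
        sanitize(child)  # recurse into the surviving element
        previous = child
    return element


if __name__ == "__main__":
    raw = (
        '<body xmlns="http://www.w3.org/1999/xhtml">'
        '<p style="color: red">Hello <script>alert(1)</script>'
        '<a href="javascript:alert(1)">bad</a> and '
        '<a href="https://xmpp.org/">good</a></p></body>'
    )
    ET.register_namespace("", XHTML_NS)
    print(ET.tostring(sanitize(ET.fromstring(raw)), encoding="unicode"))
```

In this sketch unsupported elements are dropped together with their content, 
and relative URLs are rejected because they carry no scheme. Both are 
deliberate simplifications; keeping the text content of dropped elements would 
be an equally defensible choice.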

For the blog use case, I think the only active clients which implement it are 
SàT and Movim, and both actually use XHTML with some cleaning, so it would not 
be a disaster to remove XHTML-IM.

Cheers
Goffi

