Re: Concerns regarding cross-origin copy/paste security

2012-02-08 Thread Hallvord R. M. Steen

Adam Barth w...@adambarth.com wrote Wed, 08 Feb 2012 00:05:54 +0100

FWIW, my main concern was the hidden data aspect because it can be abused
for cross-site request forgery if a malicious site by getting the user to
copy and paste gets access to form anti-CSRF tokens and such.


That's certainly possible, but I don't think it's possible for us to
protect against the long tail of risks here.  In these sorts of cases,
it can be better for security to not implement a half-correct solution
and instead decide not to try to mitigate a particular risk.


You are right here.

Also, on considering the abuse potential of getData('text/html'), I've  
realised that we are not introducing much new threat surface here, since a  
simple paste into a rich text editing-enabled element already inserts  
markup so that the target page can see much of what I proposed removing.


I've changed the spec from saying the implementation *must* apply the  
sanitization algorithm to saying the user agent *may* apply it, made it  
clear that it is merely a suggestion, removed some of the most draconian  
parts and marked it as informative. I think it still has some value as an  
informative section.


http://dev.w3.org/cvsweb/~checkout~/2006/webapi/clipops/clipops-source.html?rev=1.15;content-type=text%2Fhtml

Perhaps we should publish a new working draft now?

--
Hallvord R. M. Steen
Core tester, Opera Software



Re: Concerns regarding cross-origin copy/paste security

2012-02-07 Thread Adam Barth
On Mon, May 16, 2011 at 8:41 PM, Hallvord R. M. Steen
hallv...@opera.com wrote:
 On Thu, 05 May 2011 06:46:55 +0900, Daniel Cheng dch...@chromium.org
 wrote:

 There was a recent discussion involving directly exposing the HTML fragment
 in a paste to a page, since we're doing the parsing anyway for security
 reasons. I have some concerns regarding
 http://www.w3.org/TR/clipboard-apis/#cross-origin-copy-paste-of-source-code
 though.

 From my understanding, we are trying to protect against [1] hidden data
 being copied without a user's knowledge and [2] XSS via pasting hostile
 HTML. In my opinion, the algorithm as written is either going to remove too
 much information or not enough. If it removes too much, the HTML paste is
 effectively useless to a client app. If it doesn't remove enough, then the
 client app is going to have to sanitize the HTML itself anyway.

 FWIW, my main concern was the hidden data aspect because it can be abused
 for cross-site request forgery if a malicious site by getting the user to
 copy and paste gets access to form anti-CSRF tokens and such.

That's certainly possible, but I don't think it's possible for us to
protect against the long tail of risks here.  In these sorts of cases,
it can be better for security to not implement a half-correct solution
and instead decide not to try to mitigate a particular risk.

 I *intend* to
 leave some processing of the HTML to the client application, for example the
 removal of third-party application-specific or browser-specific CSS
 properties.

 I see that Chrome applies different security policies depending on whether
 the content is read by JavaScript (getData('text/html')-style) or
 inserted directly. You do some extra work to avoid XSS, such as removing on*
 event listener attributes and href=javascript: when content is inserted
 directly (you also remove some browser-specific elements and class names).
 This sort of clean up and processing on direct data insertion by the
 user-agent is not really in scope for the events spec IMO.

That makes sense.  The risk here is somewhat different from what
you've articulated above.  Rather than trying to prevent information
leaks from the source of the copy to the target of the paste,
these checks aim to prevent the source from injecting script into the
target.

 However, for getData('text/html') it seems you do no clean-up at all, not
 for cross-origin paste either.

Correct.  The idea here is to have a secure default but still let a
sophisticated web application handle the complicated cases if they
want to.  I just spoke with Ryosuke and Daniel, and we're considering
tightening up the default behavior somewhat to prevent injections of
style and other dangerous elements (probably by switching to a
whitelist).
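To illustrate what a whitelist default could look like, here is a minimal sketch in Python using only the standard library. It is not what Chrome or any other browser implements: the allowed tags, allowed attributes and URL schemes below are hypothetical choices made for the example, and `sanitize_allowlist` is an invented helper name. Anything not on the list is dropped, script and style are dropped together with their content, and other unknown elements are unwrapped so their text survives.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist, chosen only for illustration.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "a", "ul", "ol", "li"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http:", "https:", "mailto:")
DROP_WITH_CONTENT = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.dropping = None  # name of the script/style element being skipped

    def handle_starttag(self, tag, attrs):
        if self.dropping:
            return
        if tag in DROP_WITH_CONTENT:
            self.dropping = tag
            return
        if tag not in ALLOWED_TAGS:
            return  # unwrap: drop the tag itself, keep its text
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue  # drops style, on* handlers, class, ...
            value = value or ""
            # Only absolute URLs with a known scheme pass; relative URLs
            # and javascript: are dropped in this sketch.
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue
            kept.append(f' {name}="{escape(value)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if self.dropping:
            if tag == self.dropping:
                self.dropping = None
            return
        if tag in ALLOWED_TAGS and tag != "br":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.dropping:
            self.out.append(escape(data, quote=False))


def sanitize_allowlist(html):
    parser = AllowlistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize_allowlist(
        '<p style="color:red">Hi <a href="javascript:alert(1)">link</a></p>'))
```

The point of the design is that new or unknown elements and attributes are rejected by default, which is the property a blacklist cannot give.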

 Implementing the current spec would thus
 require that you tighten your existing security policy. Will you consider
 doing so, or would you rather argue for removal of any spec-mandated
 clean-up of cross-origin source code?

IMHO, we shouldn't try to protect the source of the data, but we
should aim to protect the target.  My understanding of your message
is that would cause us to remove the text in this spec.  If we find a
good whitelist for protecting the target, that's probably worth
writing in a spec so that browsers can interoperate, but it doesn't
have to be this spec if you feel that this behavior is out of scope.

 [2] is no different than using data from any
 other untrusted source, like dragging HTML or data from an XHR. It doesn't
 make sense to special-case HTML pastes.

 Using data is not the only threat model - limiting the damage potential
 when the page you paste into is malicious is harder. However, there is some
 overlap in the strategies we might use - for example event attributes are
 certainly hidden data, might contain secrets and might cause XSS attacks so
 you might argue for their removal based on both abuse scenarios though I
 think [2] is a more relevant threat.

The problem is that the tail of where sensitive information might
reside is long and thick, making these security measures only
partially effective, at best.

Adam



Re: Concerns regarding cross-origin copy/paste security

2012-02-04 Thread Charles Pritchard

On 2/2/2012 10:48 PM, Ryosuke Niwa wrote:
On Thu, Feb 2, 2012 at 10:43 PM, Charles Pritchard ch...@jumis.com wrote:


On 2/2/12 10:27 PM, Ryosuke Niwa wrote:

On Thu, Feb 2, 2012 at 10:20 PM, Charles Pritchard ch...@jumis.com wrote:

Seems like a very minor risk for high security sites, e.g.
banking, in identifying form elements.
In the spirit of giving it some thought:


But even for those websites, what could input / textarea elements
reveal beyond what the user sees?

Many sites use input hidden elements with what are essentially
image maps for entering a PIN.


But any element with display:none will be removed so input hidden 
should be removed.


It's becoming more common that top level domains are being
restricted or redirected to country codes. It seems plausible that
domains may further be restricted to HTTPS (SSL) signatures. Going
further, sites may be restricted to those which serve appropriate
security headers against XSS attacks. Disabling the copy
mechanism for any portion of a site does risk censorship. But, we
are only examining high security portions of high security sites,
such as input hidden and input type=password.


input[type=password] is a good one. We should probably get rid of the 
value in that case?


Yes, I think so. I'm working on an application in which I do a lot of 
copy and paste work. I'll let you know if I come across anything I think 
should change.


-Charles


Re: Concerns regarding cross-origin copy/paste security

2012-02-02 Thread Ryosuke Niwa
Sorry for the extremely slow reply. It slipped through hundreds of emails :(

On Mon, May 16, 2011 at 8:41 PM, Hallvord R. M. Steen hallv...@opera.com wrote:

  To me, it doesn't make sense to remove the other elements:
 - OBJECT: Could be used for SVG as I understand.


 OBJECT is considered a form element, so it might have hidden data
 associated with it. It can also contain plugin content that could inject
 scripts and be used for XSS attacks. It may be too far-fetched or draconian
 to remove it though. (SVG is rich enough to be its own can of worms by the
 way..)


Given the improved support for inline SVG and MathML, it's probably okay to
strip it. However, we should add EMBED to the list since it's a plugin
element.

 - INPUT (non-hidden, non-password): Content is already available via
 text/plain.


 An input's @name attribute is basically hidden data the user will not be
 aware of pasting. I'm not sure how much of a threat this is, but we should
 give it some thought.


You mean input name=~? I don't think that'll expose much information.
I'd prefer not removing these attributes as I've seen bugs filed against
WebKit for form control editors; apparently some people would like to
create form control editors using contenteditable.

- Ryosuke


Re: Concerns regarding cross-origin copy/paste security

2012-02-02 Thread Charles Pritchard

On 2/2/12 10:14 PM, Ryosuke Niwa wrote:
Sorry for the extremely slow reply. It slipped through hundreds of 
emails :(


On Mon, May 16, 2011 at 8:41 PM, Hallvord R. M. Steen hallv...@opera.com wrote:


To me, it doesn't make sense to remove the other elements:
- OBJECT: Could be used for SVG as I understand.


OBJECT is considered a form element, so it might have hidden data
associated with it. It can also contain plugin content that could
inject scripts and be used for XSS attacks. It may be too
far-fetched or draconian to remove it though. (SVG is rich enough
to be its own can of worms by the way..)


Given the improved support for inline SVG and MathML, it's probably 
okay to strip it. However, we should add EMBED to the list since it's 
a plugin element.


- INPUT (non-hidden, non-password): Content is already available via
text/plain.


An input's @name attribute is basically hidden data the user will
not be aware of pasting. I'm not sure how much of a threat this
is, but we should give it some thought.


You mean input name=~? I don't think that'll expose much 
information. I'd prefer not removing these attributes as I've seen 
bugs filed against WebKit for form control editors; apparently some 
people would like to create form control editors using contenteditable.




Seems like a very minor risk for high security sites, e.g. banking, in 
identifying form elements.

In the spirit of giving it some thought:

There are various XSS headers that signal enhanced security for 
websites, even to browser extensions.
Perhaps some of them ought to be used in the copy mechanism. That way 
the data never reaches the clipboard for paste.


-Charles


Re: Concerns regarding cross-origin copy/paste security

2012-02-02 Thread Ryosuke Niwa
On Thu, Feb 2, 2012 at 10:20 PM, Charles Pritchard ch...@jumis.com wrote:

  Seems like a very minor risk for high security sites, e.g. banking, in
 identifying form elements.
 In the spirit of giving it some thought:


But even for those websites, what could input / textarea elements
reveal beyond what the user sees?

 There are various XSS headers that signal enhanced security for websites,
 even to browser extensions.
 Perhaps some of them ought to be used in the copy mechanism. That way
 the data never reaches the clipboard for paste.


That's also an option and may need to be spec'ed to some extent.

- Ryosuke


Re: Concerns regarding cross-origin copy/paste security

2012-02-02 Thread Charles Pritchard

On 2/2/12 10:27 PM, Ryosuke Niwa wrote:
On Thu, Feb 2, 2012 at 10:20 PM, Charles Pritchard ch...@jumis.com wrote:


Seems like a very minor risk for high security sites, e.g.
banking, in identifying form elements.
In the spirit of giving it some thought:


But even for those websites, what could input / textarea elements
reveal beyond what the user sees?


Many sites use input hidden elements with what are essentially image 
maps for entering a PIN.


In that case, a user does not see the PIN, though they do see an image 
map which has been obscured through various means.


I doubt there are security risks in this area.



There are various XSS headers that signal enhanced security for
websites, even to browser extensions.
Perhaps some of them ought to be used in the copy mechanism.
That way the data never reaches the clipboard for paste.


That's also an option and may need to be spec'ed to some extent.


It's the best I have to offer, in hypothesizing how we may address the 
concern.
High security sites use high security headers. If they've opted into 
those headers, we can do a lot to limit data exposure.


There are many sorts of XSS attacks for sites that do not implement 
security headers. We can't help those, there are just too many leaks. 
So, I'd focus on specifying additional clipboard constraints for high 
security sites.


I would put out one word of caution: such restrictions could be used for 
censorship. I don't think we have an option there.


It's becoming more common that top level domains are being restricted or 
redirected to country codes. It seems plausible that domains may further 
be restricted to HTTPS (SSL) signatures. Going further, sites may be 
restricted to those which serve appropriate security headers against XSS 
attacks. Disabling the copy mechanism for any portion of a site does 
risk censorship. But, we are only examining high security portions of 
high security sites, such as input hidden and input type=password.


We're examining those elements for the sake of consumer protection for 
users doing online banking and otherwise cooperating in a secure 
environment for private data. That's a good thing.


-Charles


Re: Concerns regarding cross-origin copy/paste security

2012-02-02 Thread Ryosuke Niwa
On Thu, Feb 2, 2012 at 10:43 PM, Charles Pritchard ch...@jumis.com wrote:

 On 2/2/12 10:27 PM, Ryosuke Niwa wrote:

 On Thu, Feb 2, 2012 at 10:20 PM, Charles Pritchard ch...@jumis.com wrote:

  Seems like a very minor risk for high security sites, e.g. banking, in
 identifying form elements.
 In the spirit of giving it some thought:


  But even for those websites, what could input / textarea elements
 reveal beyond what the user sees?

 Many sites use input hidden elements with what are essentially image
 maps for entering a PIN.


But any element with display:none will be removed so input hidden should
be removed.

 It's becoming more common that top level domains are being restricted or
 redirected to country codes. It seems plausible that domains may further be
 restricted to HTTPS (SSL) signatures. Going further, sites may be
 restricted to those which serve appropriate security headers against XSS
 attacks. Disabling the copy mechanism for any portion of a site does risk
 censorship. But, we are only examining high security portions of high
 security sites, such as input hidden and input type=password.


input[type=password] is a good one. We should probably get rid of the value
in that case?

- Ryosuke


Re: Concerns regarding cross-origin copy/paste security

2011-05-16 Thread Hallvord R. M. Steen
On Thu, 05 May 2011 06:46:55 +0900, Daniel Cheng dch...@chromium.org  
wrote:


There was a recent discussion involving directly exposing the HTML fragment
in a paste to a page, since we're doing the parsing anyway for security
reasons. I have some concerns regarding
http://www.w3.org/TR/clipboard-apis/#cross-origin-copy-paste-of-source-code
though.


From my understanding, we are trying to protect against [1] hidden data
being copied without a user's knowledge and [2] XSS via pasting hostile
HTML. In my opinion, the algorithm as written is either going to remove too
much information or not enough. If it removes too much, the HTML paste is
effectively useless to a client app. If it doesn't remove enough, then the
client app is going to have to sanitize the HTML itself anyway.


FWIW, my main concern was the hidden data aspect because it can be abused  
for cross-site request forgery if a malicious site by getting the user to  
copy and paste gets access to form anti-CSRF tokens and such. I *intend*  
to leave some processing of the HTML to the client application, for  
example the removal of third-party application-specific or  
browser-specific CSS properties.


I see that Chrome applies different security policies depending on whether
the content is read by JavaScript (getData('text/html')-style) or
inserted directly. You do some extra work to avoid XSS, such as removing
on* event listener attributes and href=javascript: when content is  
inserted directly (you also remove some browser-specific elements and  
class names). This sort of clean up and processing on direct data  
insertion by the user-agent is not really in scope for the events spec IMO.


However, for getData('text/html') it seems you do no clean-up at all, not  
for cross-origin paste either. Implementing the current spec would thus  
require that you tighten your existing security policy. Will you consider  
doing so, or would you rather argue for removal of any spec-mandated  
clean-up of cross-origin source code?


I would argue that we should primarily be trying to prevent [1] and leave it
up to web pages to prevent [2].


Chrome currently does neither for the getData() case - as far as I can tell.



[2] is no different than using data from any
other untrusted source, like dragging HTML or data from an XHR. It doesn't
make sense to special-case HTML pastes.


Using data is not the only threat model - limiting the damage potential  
when the page you paste into is malicious is harder. However, there is  
some overlap in the strategies we might use - for example event attributes  
are certainly hidden data, might contain secrets and might cause XSS  
attacks so you might argue for their removal based on both abuse scenarios  
though I think [2] is a more relevant threat.



In order to achieve [1], the algorithm merely needs to be:
- Remove HTML comments, script, input type=hidden, and all other elements
that have no effect on layout (display: none). Possibly remove applet as
well.
- Remove event handlers, data- and form action attributes.
- Blanking input type=password elements.


So you still suggest removing event handlers even though this is primarily  
about your case [2]?



To me, it doesn't make sense to remove the other elements:
- OBJECT: Could be used for SVG as I understand.


OBJECT is considered a form element, so it might have hidden data  
associated with it. It can also contain plugin content that could inject  
scripts and be used for XSS attacks. It may be too far-fetched or  
draconian to remove it though. (SVG is rich enough to be its own can of  
worms by the way..)



- FORM: Essentially harmless once the action attribute is cleared.


Agree. I've changed the spec to allow FORM but remove @action.


- INPUT (non-hidden, non-password): Content is already available via
text/plain.


An input's @name attribute is basically hidden data the user will not be  
aware of pasting. I'm not sure how much of a threat this is, but we should  
give it some thought.



- TEXTAREA: See above.


Ditto :)


- BUTTON, INPUT buttons: Most of the content is already available via
text/plain. We can scrub the value attribute if there is concern about that.


More about @name regarding the principle of hidden data. However, I can  
easily be convinced that violating user expectations as little as possible  
is more important than taking this principle to its extreme consequences  
;-) Perhaps other people would like to chime in here?



- SELECT/OPTION/OPTGROUP: See above.

The draft also does not mention how EMBED elements should be handled.


Any thoughts on this?


Finally:
If a script calls getData('text/html'), the implementation supports pasting
HTML, and the data available on the clipboard is from a different origin,
the implementation must sanitize the content by following these steps:

Should this sanitization be done during a copy as well to prevent a
paste in a non-conforming browser from pasting unexpected things?

Re: Concerns regarding cross-origin copy/paste security

2011-05-11 Thread Ryosuke Niwa
On Wed, May 4, 2011 at 2:46 PM, Daniel Cheng dch...@chromium.org wrote:

 From my understanding, we are trying to protect against [1] hidden data
 being copied without a user's knowledge and [2] XSS via pasting hostile
 HTML. In my opinion, the algorithm as written is either going to remove too
 much information or not enough. If it removes too much, the HTML paste is
 effectively useless to a client app. If it doesn't remove enough, then the
 client app is going to have to sanitize the HTML itself anyway.

 I would argue that we should primarily be trying to prevent [1] and leave
 it up to web pages to prevent [2]. [2] is no different than using data from
 any other untrusted source, like dragging HTML or data from an XHR. It
 doesn't make sense to special-case HTML pastes.


However, the fragment parsing algorithm as spec'ed in HTML5 already prevents
[2].  It removes event handlers, script elements, etc.

To me, it doesn't make sense to remove the other elements:
 - OBJECT: Could be used for SVG as I understand.
 - FORM: Essentially harmless once the action attribute is cleared.
 - INPUT (non-hidden, non-password): Content is already available via
 text/plain.
 - TEXTAREA: See above.
 - BUTTON, INPUT buttons: Most of the content is already available via
 text/plain. We can scrub the value attribute if there is concern about that.
 - SELECT/OPTION/OPTGROUP: See above.


I'm also curious as to why these elements are being removed.  Hallvord?

 Should this sanitization be done during a copy as well to prevent data a
 paste in a non-conforming browser from pasting unexpected things?


We already do some of this stuff in WebKit.  For example, we avoid
serializing non-rendered contents.

- Ryosuke


Concerns regarding cross-origin copy/paste security

2011-05-04 Thread Daniel Cheng
There was a recent discussion involving directly exposing the HTML fragment
in a paste to a page, since we're doing the parsing anyway for security
reasons. I have some concerns regarding
http://www.w3.org/TR/clipboard-apis/#cross-origin-copy-paste-of-source-code
though.

From my understanding, we are trying to protect against [1] hidden data
being copied without a user's knowledge and [2] XSS via pasting hostile
HTML. In my opinion, the algorithm as written is either going to remove too
much information or not enough. If it removes too much, the HTML paste is
effectively useless to a client app. If it doesn't remove enough, then the
client app is going to have to sanitize the HTML itself anyway.

I would argue that we should primarily be trying to prevent [1] and leave it
up to web pages to prevent [2]. [2] is no different than using data from any
other untrusted source, like dragging HTML or data from an XHR. It doesn't
make sense to special-case HTML pastes.

In order to achieve [1], the algorithm merely needs to be:
- Remove HTML comments, script, input type=hidden, and all other elements
that have no effect on layout (display: none). Possibly remove applet as
well.
- Remove event handlers, data- and form action attributes.
- Blanking input type=password elements.
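To make the three steps above concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not the spec's algorithm and not any browser's code: `strip_hidden_data` is an invented name, "no effect on layout" is approximated by an inline `display: none` style (a real implementation would consult computed style), any attribute starting with `on` is treated as an event handler, and the input is assumed to be reasonably well-formed, balanced markup.

```python
from html import escape
from html.parser import HTMLParser

VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}
DROP_ELEMENTS = {"script", "applet"}


def is_hidden(tag, attrs):
    """True for elements that step 1 removes outright."""
    if tag in DROP_ELEMENTS:
        return True
    if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
        return True
    # Assumption: only inline styles are inspected in this sketch.
    style = (attrs.get("style") or "").replace(" ", "").lower()
    return "display:none" in style


class HiddenDataStripper(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0    # > 0 while inside a removed element
        self.in_style = False  # style text must not be entity-escaped

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.skip_depth:
            if tag not in VOID:
                self.skip_depth += 1
            return
        if is_hidden(tag, attrs):
            if tag not in VOID:
                self.skip_depth = 1
            return
        is_password = (tag == "input"
                       and (attrs.get("type") or "").lower() == "password")
        kept = []
        for name, value in attrs.items():
            # Step 2: event handlers, data-* and form action attributes.
            if name.startswith("on") or name.startswith("data-"):
                continue
            if tag == "form" and name == "action":
                continue
            # Step 3: blank password fields by dropping their value.
            if is_password and name == "value":
                continue
            kept.append(f" {name}" if value is None
                        else f' {name}="{escape(value)}"')
        self.in_style = tag == "style"
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_startendtag(self, tag, attrs):
        self.handle_starttag(tag, attrs)
        if tag not in VOID:
            self.handle_endtag(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.skip_depth:
            self.skip_depth -= 1
            return
        self.in_style = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth:
            return
        self.out.append(data if self.in_style else escape(data, quote=False))

    def handle_comment(self, data):
        pass  # step 1: HTML comments are dropped


def strip_hidden_data(html):
    parser = HiddenDataStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling `strip_hidden_data()` on a copied fragment returns markup in which visible structure such as FORM and non-hidden INPUT is kept, which is the distinction argued for in the rest of this message.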

To me, it doesn't make sense to remove the other elements:
- OBJECT: Could be used for SVG as I understand.
- FORM: Essentially harmless once the action attribute is cleared.
- INPUT (non-hidden, non-password): Content is already available via
text/plain.
- TEXTAREA: See above.
- BUTTON, INPUT buttons: Most of the content is already available via
text/plain. We can scrub the value attribute if there is concern about that.
- SELECT/OPTION/OPTGROUP: See above.

The draft also does not mention how EMBED elements should be handled.

Finally:
If a script calls getData('text/html'), the implementation supports pasting
HTML, and the data available on the clipboard is from a different origin,
the implementation must sanitize the content by following these steps:
Should this sanitization be done during a copy as well to prevent a
paste in a non-conforming browser from pasting unexpected things?
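
For reference, the "different origin" condition in the quoted draft text can be sketched as follows. This is a rough Python illustration under assumptions, not text from the draft: it assumes the clipboard records the URL of the page the data was copied from, compares origins as (scheme, host, port) tuples, and `get_data_html` and its `sanitize` callback are hypothetical names.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple used to compare origins."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def get_data_html(clipboard_html, source_url, target_url, sanitize):
    """Hypothetical model of getData('text/html') on paste."""
    if origin(source_url) == origin(target_url):
        return clipboard_html        # same origin: hand over as-is
    return sanitize(clipboard_html)  # cross-origin: sanitize first


if __name__ == "__main__":
    print(origin("https://example.com/page"))
```

Sanitizing at copy time instead would mean running the same `sanitize` step before the data is written to the clipboard, at which point the target origin is not yet known.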

Daniel

(resending from the right address, sorry for the spam Hallvord)