Re: copy events and content from server

2011-05-16 Thread Hallvord R. M. Steen
On Wed, 04 May 2011 02:26:22 +0900, Paul Libbrecht p...@hoplahup.net  
wrote:


In many of the scenarios I have been working on, the content to be put on
the clipboard would come from a luxury knowledge structure on the
server, one that has access to some semantic source and can infer
useful representations out of it; these get put to the clipboard.

An offline HTML would also be an example of it.


But I am realizing that this is probably not possible to do, because the
only way to obtain something from the server is to wait until a
callback is called (and that is a good thing), at which time the copy event
might be long gone already.


Indeed.

Would it be thinkable to *lock* the copy event until either a timeout  
occurs or an unlock is called?


It sounds like a quite advanced use case. I briefly considered something  
like event.clipboardData.pushContentsOfURL('/foo/bar') but that would be  
way too limited in options - POST/GET, post data etc. I would like to defer
this to later and see if we get more demand for it. Overall, the push for  
web applications is a lot about removing logic from the server and adding  
more on the client's side, so I'm unsure how common this state (when the  
server knows significantly more than the client-side logic about what  
should be placed on the clipboard) is and will be going forward.
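
The timing problem can be illustrated with a small sketch. It assumes, as the thread describes, that clipboard data can only be written while the copy event is being dispatched. The fake event, `dispatchFakeCopy`, and both handler factories are hypothetical names used only for illustration; prefetching the server's representation before the user copies is one possible workaround, not something proposed in this thread.

```javascript
// Hypothetical stand-in for a browser copy event: the clipboard payload is
// only writable while the event is being dispatched.
function dispatchFakeCopy(handler) {
  const written = {};
  let open = true;
  const event = {
    defaultPrevented: false,
    preventDefault() { this.defaultPrevented = true; },
    clipboardData: {
      setData(type, value) {
        if (open) written[type] = value; // late writes are silently ignored
      },
    },
  };
  handler(event);
  open = false; // the event is "gone" once the handler returns
  return { event, written };
}

// Works: data prefetched from the server is written synchronously.
function createCachedCopyHandler(cache) {
  return function onCopy(event) {
    for (const [type, value] of cache) {
      event.clipboardData.setData(type, value);
    }
    event.preventDefault(); // keep the UA from overwriting our data
  };
}

// Does not work: the write happens in a later callback.
function createLateCopyHandler(fetchFromServer) {
  return function onCopy(event) {
    event.preventDefault();
    fetchFromServer().then((value) => {
      event.clipboardData.setData('text/html', value);
    });
  };
}
```

With the cached handler the fake clipboard receives every prefetched representation; with the late handler it stays empty, because the callback runs after dispatch has finished.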


--
Hallvord R. M. Steen, Core Tester, Opera Software
http://www.opera.com http://my.opera.com/hallvors/



[IndexedDB] Bug# 11401 - We should disallow .transaction() from within setVersion transactions

2011-05-16 Thread Israel Hilerio
Pablo explained to me that the main issue with allowing transactions
to be created inside a SetVersion handler is identifying which
objectstores the new transaction is binding to. That is bug# 11401 [1].

Using Jeremy's example:
db.setVersion('1').onsuccess = function () {
   db.createObjectStore('a');   // objectstore a
   var trans = db.transaction('a');
   db.removeObjectStore('a');
   db.createObjectStore('a');   // objectstore a'
   trans.objectStore('a').put('foo', 'bar');
};
 
It is unclear which of the two objectstores a or a' is associated with
the newly created READ_ONLY transaction inside the setVersion handler.
To echo Jeremy's proposal, would it be okay if we were not to support
this scenario and just throw an exception?

We would like to modify the spec to say something like:

IDBDatabase.transaction:
Throws an IDBDatabaseException of NOT_ALLOWED_ERR when the
transaction() method is called within the onsuccess handler of a  
setVersion request.
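
The proposed behaviour can be sketched with an in-memory mock. This is not the real IndexedDB API: `MockDatabase`, its callback-style `setVersion`, and the `inSetVersion` flag are hypothetical and exist only to show where the NOT_ALLOWED_ERR would be raised.

```javascript
// Hypothetical in-memory mock; not the real IndexedDB implementation.
const NOT_ALLOWED_ERR = 'NOT_ALLOWED_ERR';

class MockDatabase {
  constructor() {
    this.stores = new Set();
    this.inSetVersion = false;
    this.version = '';
  }

  // Runs the callback as if it were the setVersion onsuccess handler.
  setVersion(version, onsuccess) {
    this.inSetVersion = true;
    try {
      onsuccess();
      this.version = version;
    } finally {
      this.inSetVersion = false;
    }
  }

  createObjectStore(name) { this.stores.add(name); }
  removeObjectStore(name) { this.stores.delete(name); }

  transaction(storeName) {
    if (this.inSetVersion) {
      // The proposed rule: no new transactions inside the handler.
      const error = new Error('transaction() called inside setVersion');
      error.name = NOT_ALLOWED_ERR;
      throw error;
    }
    if (!this.stores.has(storeName)) {
      throw new Error('no such store: ' + storeName);
    }
    return { storeName };
  }
}
```

Under this rule the ambiguous a versus a' binding cannot arise, since the transaction can only be created after the setVersion handler has finished and the set of object stores is settled.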
 
Israel
[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=11401



Re: Concerns regarding cross-origin copy/paste security

2011-05-16 Thread Hallvord R. M. Steen
On Thu, 05 May 2011 06:46:55 +0900, Daniel Cheng dch...@chromium.org  
wrote:


There was a recent discussion involving directly exposing the HTML fragment
in a paste to a page, since we're doing the parsing anyway for security
reasons. I have some concerns regarding
http://www.w3.org/TR/clipboard-apis/#cross-origin-copy-paste-of-source-code
though.


From my understanding, we are trying to protect against [1] hidden data
being copied without a user's knowledge and [2] XSS via pasting hostile
HTML. In my opinion, the algorithm as written is either going to remove too
much information or not enough. If it removes too much, the HTML paste is
effectively useless to a client app. If it doesn't remove enough, then the
client app is going to have to sanitize the HTML itself anyway.


FWIW, my main concern was the hidden data aspect, because it can be abused
for cross-site request forgery if a malicious site, by getting the user to
copy and paste, gets access to anti-CSRF form tokens and such. I *intend*
to leave some processing of the HTML to the client application, for
example the removal of third-party application-specific or
browser-specific CSS properties.


I see that Chrome applies different security policies depending on whether
the content is read by a script (getData('text/html') - style) or
inserted directly. You do some extra work to avoid XSS, such as removing
on* event listener attributes and href=javascript: when content is
inserted directly (you also remove some browser-specific elements and
class names). This sort of clean-up and processing on direct data
insertion by the user-agent is not really in scope for the events spec IMO.


However, for getData('text/html') it seems you do no clean-up at all, not  
for cross-origin paste either. Implementing the current spec would thus  
require that you tighten your existing security policy. Will you consider  
doing so, or would you rather argue for removal of any spec-mandated  
clean-up of cross-origin source code?


I would argue that we should primarily be trying to prevent [1] and leave it
up to web pages to prevent [2].


Chrome currently does neither for the getData() case - as far as I can  
tell.



[2] is no different than using data from any
other untrusted source, like dragging HTML or data from an XHR. It doesn't
make sense to special-case HTML pastes.


Using data is not the only threat model - limiting the damage potential  
when the page you paste into is malicious is harder. However, there is  
some overlap in the strategies we might use - for example event attributes  
are certainly hidden data, might contain secrets and might cause XSS  
attacks so you might argue for their removal based on both abuse scenarios  
though I think [2] is a more relevant threat.



In order to achieve [1], the algorithm merely needs to be:
- Remove HTML comments, script, input type=hidden, and all other elements
that have no effect on layout (display: none). Possibly remove applet as
well.
- Remove event handlers, data- and form action attributes.
- Blank input type=password elements.


So you still suggest removing event handlers even though this is primarily  
about your case [2]?
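
The steps in the quoted list can be sketched as a small tree filter. Since a real implementation would operate on the parsed DOM and on computed style, this sketch makes two loud assumptions: the input is a simplified tree of plain objects (a hypothetical shape with lowercase tag and attribute names), and "display: none" is approximated by inspecting the inline style attribute only.

```javascript
// Assumed minimal tree shape: { type: 'element' | 'text' | 'comment', ... }.
const REMOVED_TAGS = new Set(['script']);

function isHidden(node) {
  const attrs = node.attrs || {};
  const inputType = String(attrs.type || '').toLowerCase();
  if (node.tag === 'input' && inputType === 'hidden') return true;
  // Approximation: only inline styles are inspected, not computed style.
  return /display\s*:\s*none/i.test(attrs.style || '');
}

function sanitize(node) {
  if (node.type === 'comment') return null; // drop HTML comments
  if (node.type === 'text') return { type: 'text', text: node.text };
  if (REMOVED_TAGS.has(node.tag) || isHidden(node)) return null;

  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const lower = name.toLowerCase();
    // Drop event handlers (on*) and data- attributes.
    if (lower.startsWith('on') || lower.startsWith('data-')) continue;
    // Drop the action attribute of forms.
    if (node.tag === 'form' && lower === 'action') continue;
    attrs[name] = value;
  }
  const inputType = String(attrs.type || '').toLowerCase();
  if (node.tag === 'input' && inputType === 'password') {
    attrs.value = ''; // blank password fields
  }
  const children = (node.children || [])
    .map(sanitize)
    .filter((child) => child !== null);
  return { type: 'element', tag: node.tag, attrs, children };
}
```

The function returns a cleaned copy and leaves the input untouched, so the same parsed fragment could still be used for a same-origin paste.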



To me, it doesn't make sense to remove the other elements:
- OBJECT: Could be used for SVG as I understand.


OBJECT is considered a form element, so it might have hidden data  
associated with it. It can also contain plugin content that could inject  
scripts and be used for XSS attacks. It may be too far-fetched or  
draconian to remove it though. (SVG is rich enough to be its own can of  
worms by the way..)



- FORM: Essentially harmless once the action attribute is cleared.


Agree. I've changed the spec to allow FORM but remove @action.


- INPUT (non-hidden, non-password): Content is already available via
text/plain.


An input's @name attribute is basically hidden data the user will not be  
aware of pasting. I'm not sure how much of a threat this is, but we should  
give it some thought.



- TEXTAREA: See above.


Ditto :)


- BUTTON, INPUT buttons: Most of the content is already available via
text/plain. We can scrub the value attribute if there is concern about that.


More about @name regarding the principle of hidden data. However, I can  
easily be convinced that violating user expectations as little as possible  
is more important than taking this principle to its extreme consequences  
;-) Perhaps other people would like to chime in here?



- SELECT/OPTION/OPTGROUP: See above.

The draft also does not mention how EMBED elements should be handled.


Any thoughts on this?


Finally:
If a script calls getData('text/html'), the implementation supports pasting
HTML, and the data available on the clipboard is from a different origin,
the implementation must sanitize the content by following these steps:

Should this sanitization be done during a copy as well to prevent data a
paste in a non-conforming browser 

Re: paste events and HTML support - interest in exposing a DOM tree?

2011-05-16 Thread Hallvord R. M. Steen
On Mon, 09 May 2011 21:02:21 +0900, Johan Sörlin spo...@moxiecode.com  
wrote:



Hi Hallvord,

This is wonderful news since getting the html from the clipboard right  
now is a really ugly hack and very browser dependent.


Sure it is, we hope to fix that in a nice way :-)

Getting the clipboard data as either a string or a fragment would sure
make it easy for developers to handle clipboard contents.


I take that as support for my event.clipboardData.getDocumentFragment()  
suggestion. I'll need to talk to Ian Hickson about it since the actual  
DataTransfer definition is in HTML5, but I guess it might be interesting  
also for DnD?



Regarding HTML sanitation:
The Mozilla folks recently decided to clean up pasted HTML, but it's a
bit too aggressive, removing all non-standard attributes. To detect
MS Word HTML, for example, a lot of this odd content needs to be
retained so that list-like structures can be checked for.


Yes, the getData() algorithm is not going to clean up non-standard  
attributes/class names/elements.


I think the sanitation outlined in the document might also be a bit too
aggressive, such as removing the HTML comments and data attributes. For
example, both TinyMCE and CKEditor use the data- attributes for internal
usage. So if a user is pasting from one editor to another across domains,
the attribute would be lost and therefore break that item.


I didn't know that you use data- attributes. This gets somewhat tricky:
data- is certainly a type of hidden data that should be removed under the
"to the greatest extent possible, a user should know what s/he is really
pasting" principle. I see your use case, but I also assume that many sites
would use data- attributes for information that the site doesn't expect
anybody but its own JS to get access to.


Also, removing all style properties that are computed to 'none' would
remove browser-specific CSS rules and mso- styles that we use to detect
Word-specific items.


What the algorithm actually intends is to remove all elements with
display:none (and visibility:hidden) - again based on the principle that
we should try to make sure the user knows what s/he is pasting.


I've reworked the stuff on pasting HTML (potentially with multiple parts)  
today - please review this section at your leisure, particularly the  
screenful known as sections 8.3 and 8.3.1:

http://dev.w3.org/2006/webapi/clipops/clipops.html#pasting-html




safeguarding a live getData() against looping scripts? (was: Re: clipboard events)

2011-05-16 Thread Hallvord R. M. Steen



IMO getData() should be 'live' - i.e. return what's on the clipboard.



I think having it return live data could result in potential security
issues. Couldn't a script loop inside the paste event to keep sniffing  
out live data?


What should we do about this? Should the spec mandate a timeout or a limit  
on how many times a script may call getData() for the same event?
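
One of the two options raised here, a cap on the number of getData() calls per event, could look roughly like the sketch below. `createLimitedGetData`, `readClipboard`, and `maxCalls` are hypothetical names; the sketch assumes that an exhausted budget yields an empty string, which is also what getData() returns when no data is available.

```javascript
// Hypothetical wrapper: `readClipboard` stands in for the UA's live read.
function createLimitedGetData(readClipboard, maxCalls) {
  let calls = 0;
  return function getData(type) {
    calls += 1;
    if (calls > maxCalls) return ''; // budget exhausted for this event
    return readClipboard(type);
  };
}
```

A fresh wrapper would be created for each paste event, so a looping script could observe at most `maxCalls` snapshots of the live clipboard per event.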





Filtering clipboard MIME types (was: Re: clipboard events)

2011-05-16 Thread Hallvord R. M. Steen
On Mon, 31 Jan 2011 19:39:13 +0900, Daniel Cheng dch...@chromium.org  
wrote:


I'd go one step further and say that there should be some agreement on what
MIME types ought to be supported to try to ensure somewhat consistent
behavior across different platforms.


To get a table started in the spec, could you give me a small list of  
(MIME) types one should mandate the UA to be aware of and be able to  
roundtrip to/from native clipboard types? Just off the top of your head?  
The typical Web MIME types would of course be something along the lines of


text/plain
text/html
image/jpeg
image/gif
image/png
application/xhtml+xml
image/svg+xml

What about e.g. RTF?


The way I'm working on implementing it
(for drag and drop, though it applies to copy and paste as well), arbitrary
strings would not be accessible from a non-DOM application, e.g. a native
app like Word or Photoshop. Only a set of known MIME types would be
automatically converted to the corresponding native type.


That's dragging from UA to another app, right? So the way to spec it would  
be during copy/cut processing, the UA should support placing content of  
these MIME types on the clipboard and translate the type to the OS native  
equivalent where applicable or something like that?
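
A minimal sketch of such a translation step follows. The table is deliberately partial and only illustrative: it lists two MIME types with their usual Windows clipboard format and Mac pasteboard type names, and the real list is exactly what this thread is asking for. `NATIVE_TYPES`, `toNativeType`, and `translateForNativeClipboard` are hypothetical names.

```javascript
// Illustrative, deliberately partial table; the real list would come from the spec.
const NATIVE_TYPES = new Map([
  ['text/plain', { windows: 'CF_UNICODETEXT', mac: 'public.utf8-plain-text' }],
  ['text/html', { windows: 'HTML Format', mac: 'public.html' }],
]);

// Returns the native type name, or null when the MIME type is not in the table.
function toNativeType(mimeType, platform) {
  const entry = NATIVE_TYPES.get(mimeType.toLowerCase());
  return (entry && entry[platform]) || null;
}

// Keeps only items whose MIME type has a known native equivalent.
function translateForNativeClipboard(items, platform) {
  const result = [];
  for (const item of items) {
    const nativeType = toNativeType(item.type, platform);
    if (nativeType !== null) result.push({ nativeType, data: item.data });
  }
  return result;
}
```

Types without an entry are simply not exposed to native applications here, which matches the idea that arbitrary strings stay invisible outside the UA.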



When pulling data from the clipboard [...] I'm choosing
to restrict the number of native types to a smaller, defined set that are
visible to webpages. Any paths in this set can be filtered as necessary
when a file drag is detected.


Again the specific list of types for this would be great :-)




request for custom clipboard types (Re: clipboard events)

2011-05-16 Thread Hallvord R. M. Steen
On Mon, 31 Jan 2011 20:25:20 +0900, Paul Libbrecht p...@activemath.org  
wrote:



there should be some agreement on what MIME types ought to be supported



Some types will be predefined but the door should stay open for others.


I think what you are asking implies that the UA should get out of the  
way and just pass the arbitrary string the script gives it to the OS.


Then you risk that script authors need to
a) start writing platform-detection and OS-specific code
b) be forced to handle cases like a Windows OS whose list of possible  
clipboard types is full


I think in particular a) is a very bad consequence. Browser sniffing is an  
awful failure, holding the web back, preventing compatibility and  
competition. We should certainly avoid specifying something that will be  
even worse if we can. (I see scripts detecting Windows and Macs only and
not falling back to anything but broken clipboard support for other platforms
if we go down this route).


A website maker for, say, a furniture shop that knows they can go
into my home plan maker through the clipboard will want to be able to
produce and export a clipboard flavor that is unknown to both browser
implementors and spec makers now.
Provided the user may say that the format is safe (as safe as a picture,
for example), he would be able to drag-and-drop the furniture and get a
3D view inside my home plan maker.


I can see there are some really nice and tempting use cases. The problem  
is the serious downsides. I would however assume that if we support
placing a main XML (or JSON) payload plus alternate- or sub-parts on the  
clipboard, many custom formats and applications would be able to do their  
custom business in XML or JSON plus binary blobs. What do you think?
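
As a rough illustration of the "main payload plus sub-parts" idea, a custom format could be packed like this. The payload shape, `buildClipboardPayload`, and the furniture fields are all hypothetical; no spec defines such a structure. The sketch runs under Node.js, whose `Buffer` is used to base64-encode the binary parts.

```javascript
// Hypothetical payload shape; the spec defines no such structure.
function buildClipboardPayload(main, parts) {
  return {
    main: { type: 'application/json', data: JSON.stringify(main) },
    parts: parts.map((part) => ({
      type: part.type,
      // Binary blobs are base64-encoded purely to keep the sketch serializable.
      data: Buffer.from(part.bytes).toString('base64'),
    })),
  };
}

// Example: a furniture item with a (fake) binary 3D model attached.
const payload = buildClipboardPayload(
  { kind: 'furniture', name: 'chair', widthCm: 45 },
  [{ type: 'application/octet-stream', bytes: [1, 2, 3] }]
);
```

The receiving application would parse the JSON main part and pick whichever sub-parts it understands, so the UA only ever has to handle JSON and opaque blobs.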

