On , Hallvord R. M. Steen <hallv...@opera.com> wrote:

Hi,
a question related to the evolving draft on
http://www.w3.org/TR/clipboard-apis/ (which actually is slightly better
styled on http://dev.w3.org/2006/webapi/clipops/clipops.html - I should
figure out why ;-))

We want to enable some sort of access to HTML code if the user pastes
formatted text from an application that places HTML on the clipboard.
However, the browser will have to implement some security restrictions
(see relevant section of the spec - though it's just a draft), and in some
cases process the HTML to deal with embedded data when there are sub-parts
on the clipboard.

To handle both security algorithms and any embedded data, the browser will
probably need to parse the HTML. So actually, when you call
event.clipboardData.getData('text/html') the browser will get HTML from
the clipboard, parse it, do any work required by security and data embeds
rules on the DOM, and then serialize the code (possibly after modifying
the DOM) to pass it on to the script. Of course the script will want to do
its own processing, which will probably at some point require parsing the
code again.

So, to make things more efficient - would it be interesting to expose the
DOM tree from the browser's internal parsing? For example, we could define

event.clipboardData.getDocumentFragment()

which would return a parsed and, where applicable, sanitized view of any
markup the implementation supports from the clipboard. Rich text editors
could then use normal DOM methods to dig through the contents and
remove/add cruft as they wish on the returned document fragment, before
doing an appendChild() and preventing the event's default action.
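
A rough sketch of how an editor's paste handler might use such a method.
getDocumentFragment() here is only the method proposed in this message, not
an existing API; the names handlePaste and target, and the choice to strip
style attributes, are assumptions made for illustration.

```javascript
// Sketch only: getDocumentFragment() is the method proposed above and is
// an assumption, not a shipped API. `target` is a hypothetical editable
// element that receives the pasted content.
function handlePaste(event, target) {
  const data = event.clipboardData;
  if (!data || typeof data.getDocumentFragment !== 'function') {
    return false; // proposal not available: leave the default paste alone
  }
  const fragment = data.getDocumentFragment();
  // Example cleanup with ordinary DOM methods: drop inline styles.
  for (const el of Array.from(fragment.querySelectorAll('[style]'))) {
    el.removeAttribute('style');
  }
  target.appendChild(fragment);
  event.preventDefault(); // suppress the browser's own insertion
  return true;
}
```

The feature check means the handler falls back to the browser's normal paste
when the method is missing, so a script could adopt it incrementally.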

Thoughts?


This is already covered: create an element (x = document.createElement('div')),
set x.innerHTML = foo, and traverse x.
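
A minimal sketch of that approach, assuming the markup comes from
event.clipboardData.getData('text/html'). The function names
parsePastedHtml and traverse are made up for this example, and doc is
passed in explicitly (normally the page's document).

```javascript
// Parse pasted markup into a detached container element.
// `doc` is normally `document`; `html` is the string from getData('text/html').
function parsePastedHtml(doc, html) {
  const container = doc.createElement('div');
  container.innerHTML = html; // the second parse, done by the script
  return container;
}

// Depth-first traversal calling `visit` on every descendant of `node`.
function traverse(node, visit) {
  for (const child of Array.from(node.childNodes)) {
    visit(child);
    traverse(child, visit);
  }
}
```

Note that script elements inserted through innerHTML do not run, but
event-handler attributes such as onerror on an img can still fire even while
the container is detached, so this alone is not a sanitizing step.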

Regarding simplifying the pasted HTML to remove anything that could be malicious,
consider a rogue app that places a script on the clipboard and expects the
user to paste it on their bank's site. There is little the user agent can do
beyond providing quick and easy methods to sanitize the markup. IE already
implements the toStaticHTML API for this.
I would suggest supporting and implementing it, or even adding a sister property
of innerHTML, innerStaticHTML, which would not return scripts or event handlers
on reading and would strip them on setting.
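
To illustrate the kind of filtering meant here, a small sketch that removes
script elements and on* handler attributes from an already parsed subtree.
This is not IE's toStaticHTML and not the suggested innerStaticHTML (which
exists only as a suggestion in this message); the function name is made up,
and it is far from a complete sanitizer (it ignores javascript: URLs, for
example).

```javascript
// Illustrative only: not toStaticHTML and not a complete sanitizer.
// Removes <script> elements and on* event-handler attributes under `root`.
function stripScriptsAndHandlers(root) {
  for (const script of Array.from(root.querySelectorAll('script'))) {
    script.parentNode.removeChild(script);
  }
  for (const el of Array.from(root.querySelectorAll('*'))) {
    // Copy the attribute list first, since it changes as attributes go away.
    for (const attr of Array.from(el.attributes)) {
      if (/^on/i.test(attr.name)) {
        el.removeAttribute(attr.name);
      }
    }
  }
  return root;
}
```

A script would typically run something like this on a detached container
before inserting its children into the document.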
