(Finally found some time to resume this old discussion - if you've all forgotten the details by now the thread started here: https://lists.w3.org/Archives/Public/public-webapps/2015AprJun/0819.html )
On Sat, Aug 29, 2015 at 3:16 PM, Paul Libbrecht <p...@hoplahup.net> wrote:

> But copying a fragment of HTML in the wild without reformulating it will
> lead to privacy breach: it would copy references to external content. I
> believe all browsers have an "inlining" method to solve that problem when
> pasting from a web-page (I believe "save as web page complete" also does a
> part of that).

I think the proposed solution may be worse for privacy than the original problem: images you would "inline" might have sensitive data in them.. AFAIK it's not common to do any image inlining or URL mangling when pasting HTML copied from another site, is it?

>> Why should JSON be unsafe? Parsing JSON should be pretty easy, so hopefully
>> most parsers would be safe.
>
> I think the danger lies beyond parsers.
> In XML, you would have XInclude which can be used in many tools to include
> content from outside.
> I believe I have seen JSON syntaxes that had external references as part of
> their specs but I can't remember which now.
> As long these formats are copied as is and parsed blindly the risk of
> external inclusion remains.

XML: good point. JSON: nope, there's no such thing as "external inclusion" in JSON, and there never will be.

>> For the unsafe formats, the warning could say that the UA-implementors
>> should only support the flavour if they have a method to make this content
>> safe so that local applications (which do not expect untrusted content)
>> receive content they can trust when pasting. Methods to make the content
>> safe include the following: transcoding a picture, inlining all external
>> entities for html, xml, mathml, or rtf).
>
>
> On Windows I believe images are transcoded to and from DIB - device
> independent bitmap format anyway. Is there any equivalent graphics
> interchange format on other platforms? Does mandating such transcoding help
> guarantee against payloads that might trigger vulnerabilities in target
> software?
>
> All platforms I know of have some sort of transcoding of pictures (in Macs
> it is PDF as the central format).
> I think this is a very safe mechanism to rely on.

I've just done a small test in Safari on Mac. It allows writing a random string to the clipboard and labelling it image/tiff (and helpfully adds a "public.tiff" clipboard entry with the same random data). There's no transcoding or similar.

> I expect it adds a significant hurdle against exploits, but I'd like input
> from Daniel Cheng and perhaps from people who have worked on image
> decoders..

I'd still like Daniel Cheng to chime in again if he has time :)

So, the question (for recap) is: would it be OK to let JS write binary data labelled as an image type if the browser was required to transcode it to TIFF or DIB and only update the clipboard if such transcoding succeeded? (Obviously we also need the reverse mechanism - on paste, if there's TIFF or DIB on the clipboard offer JPEG and/or PNG to JS).

-Hallvord
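
P.S. To make the recap question a bit more concrete, here is a rough sketch of the "transcode, and only update the clipboard if that succeeds" gate. It is an illustration only, not how any UA does it. Assumptions: binary PPM stands in for PNG/JPEG so the example stays dependency-free, the clipboard is modelled as a plain dict, and the names ppm_to_dib and write_image are made up for this mail.

```python
import re
import struct

# Binary PPM header: magic, width, height, maxval 255, one whitespace byte.
# (Header comments are not supported in this sketch.)
_HEADER = re.compile(rb"P6\s+(\d{1,5})\s+(\d{1,5})\s+255\s")


def ppm_to_dib(data: bytes) -> bytes:
    """Decode a binary PPM and re-encode it as a 24-bit DIB.

    Raises ValueError if the input is not a well-formed image.
    """
    m = _HEADER.match(data)
    if m is None:
        raise ValueError("not a binary PPM image")
    width, height = int(m.group(1)), int(m.group(2))
    pixels = data[m.end():]
    if width == 0 or height == 0 or len(pixels) != width * height * 3:
        raise ValueError("pixel data does not match declared size")

    pad = b"\x00" * (-(width * 3) % 4)  # DIB rows are padded to 4 bytes
    rows = []
    for y in range(height - 1, -1, -1):  # DIB stores rows bottom-up
        row = pixels[y * width * 3:(y + 1) * width * 3]
        bgr = bytearray(row)
        bgr[0::3], bgr[2::3] = row[2::3], row[0::3]  # RGB -> BGR
        rows.append(bytes(bgr) + pad)
    body = b"".join(rows)

    # 40-byte BITMAPINFOHEADER, uncompressed, 24 bits per pixel.
    header = struct.pack("<IiiHHIIiiII",
                         40, width, height, 1, 24, 0, len(body), 0, 0, 0, 0)
    return header + body


def write_image(clipboard: dict, data: bytes) -> bool:
    """Put an image on the (modelled) clipboard only if it transcodes."""
    try:
        dib = ppm_to_dib(data)
    except ValueError:
        return False  # leave the clipboard untouched
    clipboard.clear()  # a successful write replaces the old contents
    clipboard["DIB"] = dib
    return True
```

With a gate like this, the random string from my Safari test would be rejected and whatever was on the clipboard before would stay there. Only bytes that actually decode as an image get through, and what lands on the clipboard is the re-encoded pixels, never the script-supplied bytes themselves.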