On Mon, Mar 14, 2011 at 1:27 PM, Luis Marsnao <[email protected]> wrote:
> Can data: URIs be used insecurely?

Yes, but everything can be used insecurely, even a butter knife.

> I'm attempting to write a client-side script that processes a user selected
> file through an input element. Since the input element interface conceals the
> file: URI, the best solution I can think of is to access the file through the
> input element's interface, get its data: URI through readAsDataURL in
> FileAPI's FileReader interface, and process the data: URI. However, I get
> not-same-origin errors when I try to use this URI. Specifically, this happens
> when I try to use XMLHttpRequest to retrieve an XML resource with the data
> URI.
>
> Is this correct?
> http://www.w3.org/TR/html5/origin-0.html#origin-0 appears to suggest it: "If
> url does not use a server-based naming authority, or if parsing url failed,
> or if url is not an absolute URL, then return a new globally unique
> identifier.", data URIs do not use server-based authorities, and opaque
> identifiers only have same origin with themselves.

Are you using WebKit? There are long-standing bugs in WebKit where
WebKit is more conservative about the security context for data URLs
than what's in the spec. I'd like to fix them, but I've got a bunch
of other things to do first.

> Is there a better way to process files in a client-side script? I considered
> using blob: URIs, but the support is not yet there.

Blob is a much better way to interact with files. With Blob, you can
interact with much larger files and you don't need to access the disk
synchronously (which can be arbitrarily slow).

> Can data: URIs be abused with the other same-origin policies in effect? I'm
> trying to imagine a situation where the data: URI origin policy is necessary
> for security. But I'm under the impression data: URIs literally are the
> resources they denote, and current policies allow input only from same-origin
> resources or the user, so scripts get input only from those sources. If that
> input literally is a resource, then that resource /should/ be treated as
> same-origin or from the user. Am I wrong?

The security context of data URLs is a subtle issue. Life is more
complex than you state above.

Adam
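
For the XMLHttpRequest problem described in the quoted question, one
possible workaround is to avoid fetching the data: URL at all and to
decode its payload in script. The sketch below is illustrative only
and does not come from this thread: decodeDataUrl is a hypothetical
helper, it handles only the simple data:[<mediatype>][;base64],<data>
form, and it assumes the payload is UTF-8 text.

```typescript
// Hypothetical helper (not from the thread): decode a data: URL's payload
// in script instead of fetching it with XMLHttpRequest.
// Assumptions: the URL has the simple form data:[<mediatype>][;base64],<data>
// and the payload is UTF-8 text.
interface DecodedDataUrl {
  mediaType: string;
  text: string;
}

function decodeDataUrl(url: string): DecodedDataUrl {
  const comma = url.indexOf(",");
  if (url.slice(0, 5).toLowerCase() !== "data:" || comma === -1) {
    throw new Error("not a data: URL");
  }
  const header = url.slice(5, comma);
  const payload = url.slice(comma + 1);
  const isBase64 = /;base64$/i.test(header);
  // RFC 2397 default when no media type is given.
  const mediaType =
    header.replace(/;base64$/i, "") || "text/plain;charset=US-ASCII";

  if (!isBase64) {
    // Plain payloads are percent-encoded.
    return { mediaType, text: decodeURIComponent(payload) };
  }
  // atob yields one character per byte; copy the bytes out, then decode UTF-8.
  const binary = atob(payload);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { mediaType, text: new TextDecoder().decode(bytes) };
}
```

The string produced by FileReader's readAsDataURL could be passed to
such a helper. Where the file is known to be text, reading it with
FileReader's readAsText may be simpler still, since no data: URL is
involved and the origin question never arises.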
