[Bug 27196] New: Blob URL origin

2014-10-30 Thread bugzilla
https://www.w3.org/Bugs/Public/show_bug.cgi?id=27196

Bug ID: 27196
   Summary: Blob URL origin
   Product: WebAppsWG
   Version: unspecified
  Hardware: PC
OS: All
Status: NEW
  Severity: normal
  Priority: P2
 Component: File API
  Assignee: a...@mozilla.com
  Reporter: ann...@annevk.nl
QA Contact: public-webapps-bugzi...@w3.org
CC: public-webapps@w3.org

1) You should not use the effective script origin.

2) You should not use the incumbent settings object.

3) You should use the entry settings object's origin.




Re: File API: Blob URL origin

2014-07-17 Thread Arun Ranganathan
On Jun 30, 2014, at 7:13 PM, Glenn Maynard gl...@zewt.org wrote:

 Why would the identifier not just be the blob URL itself?  


Done.

 
 Also, both Chrome and Firefox treat the entire URL as case-sensitive, eg. 
 Blob:... won't revoke the URL, or uppercasing the hostname portion in 
 Chrome.  Using the whole URL as the identifier makes this easy to do.
 
 "Subsequent attempts to dereference url must return a network error" should 
 be removed.  That should already be the consequence of unregistering the URL, 
 so this is a redundant requirement.


Done: http://dev.w3.org/2006/webapi/FileAPI/

— A*

Re: File API: Blob URL origin

2014-07-02 Thread Anne van Kesteren
On Tue, Jul 1, 2014 at 5:18 PM, Arun Ranganathan a...@mozilla.com wrote:
 While I think mediastream URLs may have died on the vine, using the same
 store for filesystem URLs would be good.

Yeah maybe. I thought those worked differently in that they're not
really tied to objects, but rather actual files. So they would either
resolve or not.

We do need to have a discussion about their design at some point and
make any changes to the URL parsing algorithm deemed necessary.


-- 
http://annevankesteren.nl/



Re: File API: Blob URL origin

2014-07-01 Thread Anne van Kesteren
On Tue, Jul 1, 2014 at 1:13 AM, Glenn Maynard gl...@zewt.org wrote:
 Why would the identifier not just be the blob URL itself?  The spec
 currently makes the identifier just the scheme data, which seems much more
 complex than it needs to be.  revokeObjectURL should simply be "Remove the
 entry from the Blob URL Store for URL".  If we want to allow revoking URLs
 that have a fragment attached it'd still need to strip it off; Firefox does
 this, but Chrome doesn't.

That works for me. That way we can make this a more generic store if
god forbid we get more of these schemes.


 Also, both Chrome and Firefox treat the entire URL as case-sensitive, eg.
 Blob:... won't revoke the URL, or uppercasing the hostname portion in
 Chrome.  Using the whole URL as the identifier makes this easy to do.

Ew, but okay I guess.


 "Subsequent attempts to dereference url must return a network error" should
 be removed.  That should already be the consequence of unregistering the
 URL, so this is a redundant requirement.

Agreed.


-- 
http://annevankesteren.nl/



Re: File API: Blob URL origin

2014-07-01 Thread Arun Ranganathan
On Jun 30, 2014, at 7:13 PM, Glenn Maynard gl...@zewt.org wrote:

 Why would the identifier not just be the blob URL itself?  The spec currently 
 makes the identifier just the scheme data, which seems much more complex than 
 it needs to be.  revokeObjectURL should simply be "Remove the entry from the 
 Blob URL Store for URL".  If we want to allow revoking URLs that have a 
 fragment attached it'd still need to strip it off; Firefox does this, but 
 Chrome doesn’t.


There’s no good reason not to do this. The scheme data identifier is a 
hold-over from the “only UUID” model that Fx has right now (the original intent 
was to have a table full of UUIDs and blob references). I’m happy to change to 
this model.


 Also, both Chrome and Firefox treat the entire URL as case-sensitive, eg. 
 Blob:... won't revoke the URL, or uppercasing the hostname portion in 
 Chrome.  Using the whole URL as the identifier makes this easy to do.
 
 "Subsequent attempts to dereference url must return a network error" should 
 be removed.  That should already be the consequence of unregistering the URL, 
 so this is a redundant requirement.
 



OK, this is a good nit too.

— A*

Re: File API: Blob URL origin

2014-07-01 Thread Arun Ranganathan

On Jul 1, 2014, at 2:32 AM, Anne van Kesteren ann...@annevk.nl wrote:

 That works for me. That way we can make this a more generic store if
 god forbid we get more of these schemes.


While I think mediastream URLs may have died on the vine, using the same store 
for filesystem URLs would be good.

— A*

Re: File API: Blob URL origin

2014-06-30 Thread Arun Ranganathan
On Jun 28, 2014, at 4:42 AM, Anne van Kesteren ann...@annevk.nl wrote:

 I now defined the origin for blob URLs:
 http://url.spec.whatwg.org/#concept-url-origin Sorry for the delay.
 
 Still need to work out the correct Fetch integration.
 


Thanks :)

Removed origin extraction from File API, but added identifier extraction (based 
on the same model — that is, running the basic URL parser). This makes adding 
entries to the Blob URL Store clearer.

I think the Blob pieces for Fetch are in place now.

http://dev.w3.org/2006/webapi/FileAPI

— A* 




Re: File API: Blob URL origin

2014-06-30 Thread Anne van Kesteren
On Mon, Jun 30, 2014 at 8:45 PM, Arun Ranganathan a...@mozilla.com wrote:
 Removed origin extraction from File API, but added identifier extraction 
 (based on the same model — that is, running the basic URL parser). This makes 
 adding entries to the Blob URL Store clearer.

I don't really understand this. Entries should be added when a blob
URL is created. And should be checked/removed when a blob URL is used
(through parsing it in the URL parser, as defined by the URL parser).

There should be no need for such a definition in the File API specification.


-- 
http://annevankesteren.nl/



Re: File API: Blob URL origin

2014-06-30 Thread Arun Ranganathan
On Jun 30, 2014, at 4:20 PM, Anne van Kesteren ann...@annevk.nl wrote:

 I don't really understand this. Entries should be added when a blob
 URL is created.


They are! That is, at the time the method URL.createObjectURL(blob) is called 
on blob, that method adds an entry to the Blob URL Store: 
http://dev.w3.org/2006/webapi/FileAPI/#add-an-entry

I’ve only defined identifier extraction for use with adding an entry. Is that 
wrong?


 And should be checked/removed when a blob URL is used
 (through parsing it in the URL parser, as defined by the URL parser).


Yes, this is when the URL parser *checks* the Blob URL Store, *after* entries have 
either been added or are not present (which generates a network error).


 There should be no need for such a definition in the File API specification.


Since adding scheme data as an identifier to the Blob URL Store is the 
requirement, we only use the basic URL parser to get the scheme data from a 
generated URL to add to the Blob URL Store.

— A*



Re: File API: Blob URL origin

2014-06-30 Thread Anne van Kesteren
On Mon, Jun 30, 2014 at 10:48 PM, Arun Ranganathan a...@mozilla.com wrote:
 They are! That is, at the time the method URL.createObjectURL(blob) is
 called on blob, that method adds an entry to the Blob URL Store:
 http://dev.w3.org/2006/webapi/FileAPI/#add-an-entry

 I’ve only defined identifier extraction for use with adding an entry. Is
 that wrong?

It seems like you could define identifier creation, use that, use the
return value to add an entry, and then return "blob:" + the return
value. Creating a URL first and then parsing it again to extract
something seems needlessly complicated.
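
The order of steps suggested above (create the identifier, add the entry, then return "blob:" plus the identifier) can be sketched roughly as follows. This is only an illustration under assumptions: `createIdentifier`, `blobURLStore` and the counter that stands in for a UUID are hypothetical names, not text from the File API draft.

```javascript
// Hypothetical stand-in for UUID generation; a real user agent would use a UUID.
let identifierCounter = 0;
function createIdentifier(origin) {
  identifierCounter += 1;
  return origin + "/" + String(identifierCounter).padStart(4, "0");
}

// Hypothetical Blob URL Store: identifier -> blob.
const blobURLStore = new Map();

// Create the identifier first, add the entry, then return "blob:" + identifier.
// No URL is created and re-parsed along the way.
function createObjectURL(blob, origin) {
  const identifier = createIdentifier(origin);
  blobURLStore.set(identifier, blob);
  return "blob:" + identifier;
}

const exampleBlob = { data: "hello" };
const exampleURL = createObjectURL(exampleBlob, "http://example.com");
```

The point of the ordering is that the creation path never depends on the URL parser; only consumers of the URL need to parse it.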


-- 
http://annevankesteren.nl/



Re: File API: Blob URL origin

2014-06-30 Thread Arun Ranganathan
On Jun 30, 2014, at 4:57 PM, Anne van Kesteren ann...@annevk.nl wrote:

 On Mon, Jun 30, 2014 at 10:48 PM, Arun Ranganathan a...@mozilla.com wrote:
 They are! That is, at the time the method URL.createObjectURL(blob) is
 called on blob, that method adds an entry to the Blob URL Store:
 http://dev.w3.org/2006/webapi/FileAPI/#add-an-entry
 
 I’ve only defined identifier extraction for use with adding an entry. Is
 that wrong?
 
 It seems like you could define identifier creation, use that, use the
 return value to add an entry, and then return "blob:" + the return
 value. Creating a URL first and then parsing it again to extract
 something seems needlessly complicated.



Well, the best way to define URL.revokeObjectURL(blobURL) seemed to be in terms 
of parsing to extract identifier (scheme data) and then delete the entry 
corresponding to identifier.

But you’re absolutely right that URL.create* methods shouldn’t have a 
dependency on the basic URL parser, and so I’ve redefined those methods along 
the lines you say above (namely, defining identifier creation, and then Blob 
URL creation).

That’s http://dev.w3.org/2006/webapi/FileAPI/#creating-revoking in today’s 
editor’s draft.

(So *now* maybe the path is clear for Fetch with Blobs.)

— A*




Re: File API: Blob URL origin

2014-06-30 Thread Glenn Maynard
On Mon, Jun 30, 2014 at 3:57 PM, Anne van Kesteren ann...@annevk.nl wrote:

 On Mon, Jun 30, 2014 at 10:48 PM, Arun Ranganathan a...@mozilla.com
 wrote:
  They are! That is, at the time the method URL.createObjectURL(blob) is
  called on blob, that method adds an entry to the Blob URL Store:
  http://dev.w3.org/2006/webapi/FileAPI/#add-an-entry
 
  I’ve only defined identifier extraction for use with adding an entry. Is
  that wrong?

 It seems like you could define identifier creation, use that, use the
 return value to add an entry, and then return "blob:" + the return
 value. Creating a URL first and then parsing it again to extract
 something seems needlessly complicated.


Why would the identifier not just be the blob URL itself?  The spec
currently makes the identifier just the scheme data, which seems much more
complex than it needs to be.  revokeObjectURL should simply be "Remove the
entry from the Blob URL Store for URL".  If we want to allow revoking URLs
that have a fragment attached it'd still need to strip it off; Firefox does
this, but Chrome doesn't.

Also, both Chrome and Firefox treat the entire URL as case-sensitive, eg.
Blob:... won't revoke the URL, or uppercasing the hostname portion in
Chrome.  Using the whole URL as the identifier makes this easy to do.

"Subsequent attempts to dereference url must return a network error" should
be removed.  That should already be the consequence of unregistering the
URL, so this is a redundant requirement.
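
A minimal sketch of a store keyed by the whole blob URL may make the proposal concrete. It assumes the fragment-stripping behavior described above for Firefox, and `WholeURLStore` and its method names are hypothetical, not spec terms.

```javascript
// Hypothetical model of a Blob URL Store keyed by the full URL string.
class WholeURLStore {
  constructor() {
    this.entries = new Map(); // full blob URL string -> blob
  }

  // Drop a fragment, if present, before using the URL as a key.
  static stripFragment(url) {
    const hash = url.indexOf("#");
    return hash === -1 ? url : url.slice(0, hash);
  }

  add(url, blob) {
    this.entries.set(url, blob);
  }

  // "Remove the entry from the Blob URL Store for URL".
  // The match is an exact, case-sensitive string comparison.
  revoke(url) {
    return this.entries.delete(WholeURLStore.stripFragment(url));
  }

  // A missing entry models the network error on dereference,
  // so no separate "must return a network error" rule is needed.
  resolve(url) {
    const key = WholeURLStore.stripFragment(url);
    return this.entries.has(key) ? this.entries.get(key) : null;
  }
}

const wholeURLStore = new WholeURLStore();
const storedURL = "blob:http://example.com/0001";
wholeURLStore.add(storedURL, { data: "hello" });

const revokedWrongCase = wholeURLStore.revoke("Blob:http://example.com/0001"); // false
const revokedWithFragment = wholeURLStore.revoke(storedURL + "#section");      // true
const afterRevoke = wholeURLStore.resolve(storedURL);                          // null
```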

-- 
Glenn Maynard


File API: Blob URL origin

2014-06-28 Thread Anne van Kesteren
I now defined the origin for blob URLs:
http://url.spec.whatwg.org/#concept-url-origin Sorry for the delay.

Still need to work out the correct Fetch integration.


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-06-12 Thread Arun Ranganathan
On Jun 10, 2014, at 2:57 AM, Anne van Kesteren ann...@annevk.nl wrote:

 On Tue, Jun 10, 2014 at 12:16 AM, Arun Ranganathan a...@mozilla.com wrote:
 Right now, the Blob URL Store is defined in terms of units of similar-origin 
 browsing contexts; each unit is required to have a Blob URL Store. As you 
 point out, that allows all origins within document.domain access to a given 
 Blob URL Store.
 
 Yeah, so unlike what the discussion claimed thus far, we did not in
 fact allow that much cross-origin blob URL usage. Only origins within
 the document.domain reach.
 
 
 1. Require that entries in the Blob URL Store also store origin
 
 I thought this was the idea. The identifier would be
 "http://someorigin:70/uuid".


Yes; there was some discussion about tuples vs. strings on IRC, but I think one 
leads to the other, and we can define how to extract the origin from a parsed 
Blob URL in terms of another use of the URL Parser instead of string parsing.


 
 2. Define it strictly as a same-origin store. I’m a bit fuzzy on how exactly 
 to define this; for instance, strictly the origin and not the effective 
 script origin of a Document?
 
 We could say that the store is bound to a global object. And then both
 URL.createObjectURL() and places that parse URLs hook into the entry
 settings object's global object's blob URL store.
 
 At that point the only benefit of putting the origin into the URL is
 so that new URL(blob).origin works.


This seems right; I think it would be rare that a developer would need to check 
origin, but it’s been pointed out that there are some use cases for that. It seems 
better to introduce a method that doesn’t require creating a new object, but I 
don’t feel strongly about it.


 Something that is still unclear to me is what happens when you
 navigate to a blob URL. I guess that still technically works as the
 URL parsing would happen within the correct global.


If URL parsing doesn’t occur within the correct global, a network error will be 
the result, since there won’t be a corresponding entry in the Blob URL store 
that matches the identifier. So I think this sounds workable.
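
As a rough illustration of the per-global store idea, the sketch below gives each global its own store and treats a lookup miss as a network error. `GlobalModel`, its fields, and the numeric identifier that stands in for a UUID are assumptions made for the example.

```javascript
// Hypothetical model: each global object owns its own blob URL store.
class GlobalModel {
  constructor(origin) {
    this.origin = origin;
    this.blobURLStore = new Map(); // blob URL string -> blob
    this.nextId = 0;
  }

  createObjectURL(blob) {
    this.nextId += 1;
    const url = "blob:" + this.origin + "/" + this.nextId; // stand-in for a UUID
    this.blobURLStore.set(url, blob);
    return url;
  }

  // Parsing consults only this global's store; a miss models a network error.
  parse(url) {
    if (!this.blobURLStore.has(url)) {
      return { networkError: true, blob: null };
    }
    return { networkError: false, blob: this.blobURLStore.get(url) };
  }
}

const pageA = new GlobalModel("http://a.example");
const pageB = new GlobalModel("http://b.example");
const urlFromA = pageA.createObjectURL({ data: "from A" });

const parsedInA = pageA.parse(urlFromA); // entry found in A's store
const parsedInB = pageB.parse(urlFromA); // no entry in B's store
```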

— A*




Re: Blob URL Origin

2014-06-10 Thread Anne van Kesteren
On Tue, Jun 10, 2014 at 12:16 AM, Arun Ranganathan a...@mozilla.com wrote:
 Right now, the Blob URL Store is defined in terms of units of similar-origin 
 browsing contexts; each unit is required to have a Blob URL Store. As you 
 point out, that allows all origins within document.domain access to a given 
 Blob URL Store.

Yeah, so unlike what the discussion claimed thus far, we did not in
fact allow that much cross-origin blob URL usage. Only origins within
the document.domain reach.


 1. Require that entries in the Blob URL Store also store origin

I thought this was the idea. The identifier would be
"http://someorigin:70/uuid".


 2. Define it strictly as a same-origin store. I’m a bit fuzzy on how exactly 
 to define this; for instance, strictly the origin and not the effective 
 script origin of a Document?

We could say that the store is bound to a global object. And then both
URL.createObjectURL() and places that parse URLs hook into the entry
settings object's global object's blob URL store.

At that point the only benefit of putting the origin into the URL is
so that new URL(blob).origin works.


Something that is still unclear to me is what happens when you
navigate to a blob URL. I guess that still technically works as the
URL parsing would happen within the correct global.


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-06-09 Thread Anne van Kesteren
On Thu, May 29, 2014 at 11:42 AM, Anne van Kesteren ann...@annevk.nl wrote:
 However, I wonder if this at a standards level should come into play
 in the URL parser. After all that creates a structured clone of the
 blob in question. The lookup for the blob ID should probably fail at
 that point meaning it does not really matter when you then try to
 fetch that URL as it will simply not have an associated blob.

I filed a bug https://www.w3.org/Bugs/Public/show_bug.cgi?id=25987 for
this, but it seems worth discussing here.

A blob URL store is already limited to all the origins that can reach
each other through document.domain. So cross-origin blob usage was
already limited per the specification. It seems like what we should do
is instead make this a same-origin store. And then when URLs are
parsed you'd only have access to the same-origin (and *not* effective
origin) blob URL store. In turn that means it does not matter much
whether you put origins in the blob URLs, but I suppose we could do it
for clarity. It would also make new URL(blobURL).origin work.

What am I missing?


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-06-09 Thread Arun Ranganathan
On Jun 9, 2014, at 3:23 AM, Anne van Kesteren ann...@annevk.nl wrote:

 On Thu, May 29, 2014 at 11:42 AM, Anne van Kesteren ann...@annevk.nl wrote:
 However, I wonder if this at a standards level should come into play
 in the URL parser. After all that creates a structured clone of the
 blob in question. The lookup for the blob ID should probably fail at
 that point meaning it does not really matter when you then try to
 fetch that URL as it will simply not have an associated blob.
 
 I filed a bug https://www.w3.org/Bugs/Public/show_bug.cgi?id=25987 for
 this, but it seems worth discussing here.
 
 A blob URL store is already limited to all the origins that can reach
 each other through document.domain. So cross-origin blob usage was
 already limited per the specification. It seems like what we should do
 is instead make this a same-origin store. And then when URLs are
 parsed you'd only have access to the same-origin (and *not* effective
 origin) blob URL store. In turn that means it does not matter much
 whether you put origins in the blob URLs, but I suppose we could do it
 for clarity. It would also make new URL(blobURL).origin work.


Right now, the Blob URL Store is defined in terms of units of similar-origin 
browsing contexts; each unit is required to have a Blob URL Store. As you point 
out, that allows all origins within document.domain access to a given Blob URL 
Store.

We could:

1. Require that entries in the Blob URL Store also store origin (namely the 
effective script origin of the settings object when the URL was created), in 
addition to identifier and reference to the Blob object. Requests without same 
origin as that of the entry in the Blob URL store for that identifier must 
fail. This keeps the Blob URL Store requirement as is, but adds origin as 
something in the store.

2. Define it strictly as a same-origin store. I’m a bit fuzzy on how exactly to 
define this; for instance, strictly the origin and not the effective script 
origin of a Document?
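
Option 1 could look something like the sketch below, where each entry records the origin it was created under and a lookup from any other origin fails. The names `originCheckedStore`, `addEntry` and `lookupEntry` are hypothetical, and origins are modelled as plain strings.

```javascript
// Hypothetical model of option 1: each entry records the creating origin
// alongside the identifier and the blob reference.
const originCheckedStore = new Map(); // identifier -> { origin, blob }

function addEntry(identifier, origin, blob) {
  originCheckedStore.set(identifier, { origin, blob });
}

// A request succeeds only if its origin equals the origin stored in the entry.
function lookupEntry(identifier, requestOrigin) {
  const entry = originCheckedStore.get(identifier);
  if (entry === undefined || entry.origin !== requestOrigin) {
    return null; // models a failed request
  }
  return entry.blob;
}

addEntry("0001", "http://a.example", { data: "secret" });
const sameOriginResult = lookupEntry("0001", "http://a.example");  // the blob
const crossOriginResult = lookupEntry("0001", "http://b.example"); // null
```

Option 2 would drop the stored origin and instead partition the store itself, so that a cross-origin lookup never finds an entry in the first place.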

— A*


File API For Review | was Re: Blob URL Origin

2014-06-04 Thread Arun Ranganathan
The 2 June 2014 Editor’s Draft of the File API solves some bugs and technical 
issues with Blob URLs. Review is encouraged, with a view towards an LCWD 
publication:

http://dev.w3.org/2006/webapi/FileAPI/

In particular:

1. It nails down syntax differences between user agents on Blob URLs. 
2. It specifies origin and origin extraction from Blob URL strings.
3. It provides other pieces of plumbing that specifications like URL and Fetch 
can use.

A few issues remain, but all of these have dependencies:

https://www.w3.org/Bugs/Public/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&email1=arun%40mozilla.com&emailassigned_to1=1&emailreporter1=1&emailtype1=exact&list_id=38706

Notably, dependencies on Fetch (for Fetch of Blob URLs), and dependencies on 
WebIDL (e.g. removing DOMError altogether requires something done for error 
handling; supplanting FileList with array requires this part nailed down). I’m 
happy to have provisional specification text in place before WebIDL adopts 
fixes, but with guidance from the editor(s) of WebIDL.

I expect a CfC to follow about the specification.

— A*


On May 29, 2014, at 5:42 AM, Anne van Kesteren ann...@annevk.nl wrote:

 On Thu, May 29, 2014 at 8:38 AM, Jonas Sicking jo...@sicking.cc wrote:
 On Thu, May 22, 2014 at 1:29 AM, Anne van Kesteren ann...@annevk.nl wrote:
 For fetching blob URLs (and prolly filesystem and indexeddb) we
 effectively act as if the request's mode was same-origin. Allowing
 tainted cross-origin requests would complicate UUID (for the UA) and
 memory (for the page) management in a multiprocess environment.
 
 Hmm.. I think that is effectively it yes. I.e. even though img says
 that it wants to permit cross-origin loads, we'd override that if the
 fetch is for a blob: URL and only permit same-origin loads. Is that
 what you mean?
 
 Yes.
 
 However, I wonder if this at a standards level should come into play
 in the URL parser. After all that creates a structured clone of the
 blob in question. The lookup for the blob ID should probably fail at
 that point meaning it does not really matter when you then try to
 fetch that URL as it will simply not have an associated blob.
 
 
 -- 
 http://annevankesteren.nl/
 




Re: Data URL Origin (Was: Blob URL Origin)

2014-06-02 Thread Anne van Kesteren
On Fri, May 30, 2014 at 2:07 AM, Jonas Sicking jo...@sicking.cc wrote:
 On Thu, May 29, 2014 at 9:21 AM, Anne van Kesteren ann...@annevk.nl wrote:
 Given that workers execute script in a fairly contained way, it might be 
 okay?

 Worker scripts aren't going to be very contained as we add more APIs
 to workers. They can already read any data from the server (through
 XHR) and much local data (through IDB).

 I'd definitely want them not to inherit the origin, the question is if
 that's web compatible at this point. Maybe we can allow them to
 execute but as a sandboxed origin?

Good point. We'll have to investigate how much we can do there. I
followed up on the WHATWG list with regards to aligning Fetch and HTML
with the new policy. I also filed a bug on Gecko.

*  http://lists.w3.org/Archives/Public/public-whatwg-archive/2014Jun/0002.html
* https://bugzilla.mozilla.org/show_bug.cgi?id=1018872


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-29 Thread Jonas Sicking
On Thu, May 22, 2014 at 1:29 AM, Anne van Kesteren ann...@annevk.nl wrote:
 For blob URLs (and prolly filesystem and indexeddb) we put the origin
 in the URL and define a way to extract it again so new
 URL(blob).origin does the right thing.

Yup.

 For fetching blob URLs (and prolly filesystem and indexeddb) we
 effectively act as if the request's mode was same-origin. Allowing
 tainted cross-origin requests would complicate UUID (for the UA) and
 memory (for the page) management in a multiprocess environment.

Hmm.. I think that is effectively it yes. I.e. even though img says
that it wants to permit cross-origin loads, we'd override that if the
fetch is for a blob: URL and only permit same-origin loads. Is that
what you mean?

/ Jonas



Data URL Origin (Was: Blob URL Origin)

2014-05-29 Thread Jonas Sicking
On Thu, May 22, 2014 at 1:29 AM, Anne van Kesteren ann...@annevk.nl wrote:
 How do we deal with data URLs? Obviously you can always get a resource
 out of them. But when should the response of fetching one be tainted
 and when should it not be? And there's a somewhat similar question for
 about URLs. Although only about:blank results in something per the
 specification at the moment.

My proposal is something like this:

* Add a new flag to the fetch algorithm: "allow inheriting origin"
* The default for this new flag is false
* If the flag is set to false, the origin of the URL is a unique identifier.
* When the origin is a unique identifier, it would not match any other
origin and so responses would always be tainted.
* If the flag is true, then the origin of the URL is equal to that of
the page that initiated the load.
* When the origin of the URL is inherited, it would always match the
origin of the caller, and so responses would never be tainted.
* I don't know what URL(data).origin should return.
* Make APIs explicitly opt in to setting the allow inheriting origin
flag to true based on whatever policies that we decide.

So for example we could make img always set the allow inheriting
origin flag to true.

And for iframes the flag would only be true if some iframe
allowinheritingoriginfordataurlsplease attribute was set. And then it
would still only be set for the initial load. If the iframe navigated
(through a link or through setting window.location) the flag would be
set to false.

For `new Worker(...)` I'm not sure what would be web compatible. I'd
prefer if the flag was set to false by default, but that the page
could use some explicit syntax (similar to the iframe) to opt in to
allowing inheriting.
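
The flag semantics in the list above can be modelled in a few lines. This is a sketch under assumptions: origins are plain strings, a fresh `Symbol` stands in for a unique identifier origin, and the function names are made up for the example.

```javascript
// Hypothetical model of the proposed flag. When the flag is false the data
// URL gets a unique origin; when true it inherits the caller's origin.
function dataURLOrigin(allowInheritingOrigin, callerOrigin) {
  return allowInheritingOrigin ? callerOrigin : Symbol("unique origin");
}

// A response is tainted when the resource origin does not match the caller.
function isTainted(resourceOrigin, callerOrigin) {
  return resourceOrigin !== callerOrigin;
}

const callerOrigin = "http://example.com";
const taintedByDefault = isTainted(dataURLOrigin(false, callerOrigin), callerOrigin);    // true
const taintedWhenInherited = isTainted(dataURLOrigin(true, callerOrigin), callerOrigin); // false
```

Because every `Symbol` is distinct, two loads with the flag unset never share an origin with each other either, which matches the "would not match any other origin" bullet.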

/ Jonas



Re: Data URL Origin (Was: Blob URL Origin)

2014-05-29 Thread Anne van Kesteren
On Thu, May 29, 2014 at 9:06 AM, Jonas Sicking jo...@sicking.cc wrote:
 My proposal is something like this:

Thanks!


 * Add a new flag to the fetch algorithm: "allow inheriting origin"

same-origin data URL flag? Ideally we don't do another data URL mistake.


 * The default for this new flag is false
 * If the flag is set to false, the origin of the URL is a unique identifier.
 * When the origin is a unique identifier, it would not match any other
 origin and so responses would always be tainted.
 * If the flag is true, then the origin of the URL is equal to that of
 the page that initiated the load.
 * When the origin of the URL is inherited, it would always match the
 origin of the caller, and so responses would never be tainted.

This does not clarify what happens if you end up at a data URL as a
result of a redirect. If the redirect is cross-origin you'll end up
tainted. If it's CORS you get a network error. But if it's same-origin
that's fair game?


 * I don't know what URL(data).origin should return.

Probably just null. I think we should make it about the origin of
the request, not the URL.


 * Make APIs explicitly opt in to setting the allow inheriting origin
 flag to true based on whatever policies that we decide.

 So for example we could make img always set the allow inheriting
 origin flag to true.

And for XMLHttpRequest? We decided a while back we wanted data URLs to
work there.


 And for iframes the flag would only be true if some iframe
 allowinheritingoriginfordataurlsplease attribute was set. And then it
 would still only be set for the initial load. If the iframe navigated
 (through a link or through setting window.location) the flag would be
 set to false.

Seems fair.


 For `new Worker(...)` I'm not sure what would be web compatible. I'd
 prefer if the flag was set to false by default, but that the page
 could use some explicit syntax (similar to the iframe) to opt in to
 allowing inheriting.

Given that workers execute script in a fairly contained way, it might be okay?


-- 
http://annevankesteren.nl/



Re: Data URL Origin (Was: Blob URL Origin)

2014-05-29 Thread Jonas Sicking
On Thu, May 29, 2014 at 9:21 AM, Anne van Kesteren ann...@annevk.nl wrote:
 On Thu, May 29, 2014 at 9:06 AM, Jonas Sicking jo...@sicking.cc wrote:
 * The default for this new flag is false
 * If the flag is set to false, the origin of the URL is a unique identifier.
 * When the origin is a unique identifier, it would not match any other
 origin and so responses would always be tainted.
 * If the flag is true, then the origin of the URL is equal to that of
 the page that initiated the load.
 * When the origin of the URL is inherited, it would always match the
 origin of the caller, and so responses would never be tainted.

 This does not clarify what happens if you end up at a data URL as a
 result of a redirect. If the redirect is cross-origin you'll end up
 tainted. If it's CORS you get a network error. But if it's same-origin
 that's fair game?

For something like an iframe load I think the safe thing is to
always clear the flag when a redirect happens. I.e. if someone does

<iframe src="http://example.com/a" allowinheritingoriginfordataurlsplease>

and example.com redirects to a data URL, we would have all sorts of
messy questions if we allowed the flag to stay set and the origin to be
inherited:

* Should it be inherited from the owner of the iframe, who set the
allowinheritingoriginfordataurlsplease attribute, or from example.com
who is the one that generated the data URL. We don't want example.com
to get XSSed either.
* What if the owner of the iframe hadn't thought about redirects to
data URLs and just checked the src URL for data: and verified that it
didn't contain any bad stuff?

Redirecting to a data URL feels like a very edge-casy thing. So let's
keep it simple and safe rather than worry about cramming more features
in.
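
The "always clear the flag on redirect" rule is simple enough to sketch. The function below is a hypothetical model that represents a load as the list of URLs visited, with the first entry being the original request.

```javascript
// Hypothetical model: urlChain[0] is the original request URL and any
// further entries are redirect targets. Any redirect clears the flag.
function flagAfterRedirects(initialFlag, urlChain) {
  return urlChain.length > 1 ? false : initialFlag;
}

// Direct load of a data URL with the attribute set: flag stays set.
const directLoad = flagAfterRedirects(true, ["data:text/html,hi"]); // true

// example.com redirects to a data URL: flag is cleared, nothing is inherited.
const redirectedLoad = flagAfterRedirects(true, [
  "http://example.com/a",
  "data:text/html,hi",
]); // false
```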

 * I don't know what URL(data).origin should return.

 Probably just null.

Given that the effective origin depends on which API you pass the
data-url to, I agree that trying to return a real origin here is
never going to be sensible. I don't know if returning null is the
way to go, or if returning `undefined` is. I guess I don't have a
strong opinion.

 * Make APIs explicitly opt in to setting the allow inheriting origin
 flag to true based on whatever policies that we decide.

 So for example we could make img always set the allow inheriting
 origin flag to true.

 And for XMLHttpRequest? We decided a while back we wanted data URLs to
 work there.

I don't feel strongly.

 For `new Worker(...)` I'm not sure what would be web compatible. I'd
 prefer if the flag was set to false by default, but that the page
 could use some explicit syntax (similar to the iframe) to opt in to
 allowing inheriting.

 Given that workers execute script in a fairly contained way, it might be okay?

Worker scripts aren't going to be very contained as we add more APIs
to workers. They can already read any data from the server (through
XHR) and much local data (through IDB).

I'd definitely want them not to inherit the origin, the question is if
that's web compatible at this point. Maybe we can allow them to
execute but as a sandboxed origin?

/ Jonas



Re: Blob URL Origin

2014-05-28 Thread Arun Ranganathan
On May 22, 2014, at 4:29 AM, Anne van Kesteren ann...@annevk.nl wrote:

 Thanks, I'm convinced.
 
 So now I'd like to know what policy we want so we can carefully define it.


The latest editor’s draft of the File API specifies what we discussed in this 
email thread as syntax for Blob URLs:

http://dev.w3.org/2006/webapi/FileAPI/#DefinitionOfScheme

and origin, including how to serialize the Blob URL.



 For blob URLs (and prolly filesystem and indexeddb) we put the origin
 in the URL and define a way to extract it again so new
 URL(blob).origin does the right thing.


I wonder if .origin should be static?



 For fetching blob URLs (and prolly filesystem and indexeddb) we
 effectively act as if the request's mode was same-origin. Allowing
 tainted cross-origin requests would complicate UUID (for the UA) and
 memory (for the page) management in a multiprocess environment.


We’re not allowing them.

— A*



Re: Blob URL Origin

2014-05-22 Thread Anne van Kesteren
On Thu, May 22, 2014 at 1:45 AM, Jonas Sicking jo...@sicking.cc wrote:
 The fact that we allow passing blobs around is no different from the
 fact that we allow passing an ArrayBuffer or a string around. Once a
 page knows that it has the blob/arraybuffer/string the only way to
 have it XSS you is to eval() it. Hopefully pages know not to do that
 when receiving a blob/arraybuffer/string from an untrusted party as it
 pretty obviously will enable that party to XSS you. eval() always
 *explicitly* runs code in your origin.

 [...]

Thanks, I'm convinced.

So now I'd like to know what policy we want so we can carefully define it.

For blob URLs (and prolly filesystem and indexeddb) we put the origin
in the URL and define a way to extract it again so new
URL(blob).origin does the right thing.

For fetching blob URLs (and prolly filesystem and indexeddb) we
effectively act as if the request's mode was same-origin. Allowing
tainted cross-origin requests would complicate UUID (for the UA) and
memory (for the page) management in a multiprocess environment.
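
Both halves of this policy can be sketched briefly. The code assumes a blob URL string of the form "blob:" + origin + "/" + uuid and recovers the origin by parsing the part after the scheme. The helper names are hypothetical, and "no-cors" is used only as an example of a mode a caller such as img might request.

```javascript
// Hypothetical helper: read the origin embedded in a blob URL string by
// parsing the part after "blob:".
function blobURLOrigin(blobURL) {
  if (!blobURL.startsWith("blob:")) {
    return null;
  }
  try {
    return new URL(blobURL.slice("blob:".length)).origin;
  } catch (e) {
    return null; // nothing parseable after the scheme
  }
}

// For blob: URLs the request is treated as if its mode were "same-origin",
// whatever mode the caller asked for.
function effectiveMode(url, requestedMode) {
  return url.startsWith("blob:") ? "same-origin" : requestedMode;
}

// Under "same-origin", only the embedding origin may fetch the blob URL.
function mayFetchBlobURL(blobURL, callerOrigin) {
  return blobURLOrigin(blobURL) === callerOrigin;
}

const sampleBlobURL = "blob:http://example.com/550e8400";
const sampleOrigin = blobURLOrigin(sampleBlobURL);           // "http://example.com"
const sampleMode = effectiveMode(sampleBlobURL, "no-cors");  // "same-origin"
```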


How do we deal with data URLs? Obviously you can always get a resource
out of them. But when should the response of fetching one be tainted
and when should it not be? And there's a somewhat similar question for
about URLs. Although only about:blank results in something per the
specification at the moment.


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-21 Thread Anne van Kesteren
On Tue, May 20, 2014 at 9:24 PM, Jonas Sicking jo...@sicking.cc wrote:
 I think you are confusing issues. Or at least talking about two
 separate issues at once in a way that I'm not sure what you are
 talking about. The issue of "is there an XSS issue with treating blob:
 like we treat data:" is a separate issue from "should we treat
 cross-origin blob: like cross-origin http:", i.e. should we allow
 pointing an img to a cross-origin blob:.

Sure, I'm still at the "is there an XSS issue here, given that we can
pass Blob objects around without restrictions" question.


 I had hoped that we had settled the former and decided that blob:
 should not be treated as data:. And I think we've also decided that we
 should use the explicit origin syntax, i.e. something like
 blob:http://example.com/uuid;

I'm not quite there yet. In part it seems this design stems from the
fact that we cannot create unique enough IDs. My question was if
things change if we did create unique enough IDs as it seems we are
designing something around a rather artificial limitation.


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-21 Thread Glenn Maynard
Hmm.  One factor that might change my mind on this: If I pass a blob URL,
revoking the URL appropriately becomes hard.  Even if it gets implemented,
auto-revoke can't help with this.  That brings back all of the problems
with non-auto-revoking blob URLs, and adds a new layer of complexity, since
I have to coordinate between the site creating the blob URL and everyone
receiving it to figure out when to revoke it.

On the other hand, I can just post the blob itself.  That avoids all of
that mess, and the other side can just create a blob URL from it itself if
that's what it needs.

That suggests that it's not worth trying to make blob URLs more accessible
cross-origin.  I can't think of any case where I'd rather pass a blob URL
instead of just posting the Blob itself.
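
The receiving side of that approach can be sketched as follows, assuming the Blob has already arrived (for example via postMessage). The helper name withObjectURL is hypothetical; the point is only that the code that mints the URL is also the code that revokes it, so no cross-site coordination is needed.

```javascript
// Hypothetical helper: the receiver owns the blob URL's whole lifetime.
// `use` must be finished with the URL by the time it returns.
function withObjectURL(blob, use) {
  const url = URL.createObjectURL(blob); // minted by the receiver
  try {
    return use(url);
  } finally {
    URL.revokeObjectURL(url); // revoked by the same code that minted it
  }
}
```

This shape only suits synchronous consumers; a consumer that loads the URL asynchronously would have to revoke it after the load completes instead.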

-- 
Glenn Maynard


Re: Blob URL Origin

2014-05-21 Thread Jonas Sicking
On Wed, May 21, 2014 at 3:59 AM, Anne van Kesteren ann...@annevk.nl wrote:
 On Tue, May 20, 2014 at 9:24 PM, Jonas Sicking jo...@sicking.cc wrote:
 I think you are confusing issues. Or at least talking about two
 separate issues at once in a way that I'm not sure what you are
 talking about. The issue of is there an XSS issue with treated blob:
 like we treat data: is a separate issue from should we treat
 cross-origin blob: like cross-origin http:, i.e. should we allow
 pointing an img to a cross-origin blob:.

 Sure, I'm still at the is there an XSS issue here given that we can
 pass Blob objects around without restrictions.

The fact that we allow passing blobs around is no different from the
fact that we allow passing an ArrayBuffer or a string around. Once a
page knows that it has the blob/arraybuffer/string the only way to
have it XSS you is to eval() it. Hopefully pages know not to do that
when receiving a blob/arraybuffer/string from an untrusted party as it
pretty obviously will enable that party to XSS you. eval() always
*explicitly* runs code in your origin.

Likewise, if you receive a blob from an untrusted party, and then do
myscriptelement.src = URL.createObjectURL(blob), then that very
explicitly will run code in your origin. It should be no surprise that
that will cause an XSS risk and so hopefully pages know not to do
that.

URLs are different though. If you receive a URL from an untrusted
party you can generally do window.location = url. Or you can do
myiframeelement.src = url and the worst thing you need to worry about
is that the contained page can navigate the top frame. And that you
can protect against by using a sandbox attribute.

It is not at all obvious that this can suddenly run code in your own origin.

Designing a secure platform doesn't mean forbidding any action that
can cause bad things to happen. Doing that might create a secure
platform, but one that probably can't do anything interesting.

Designing a secure platform is done by making things that can be
insecure be very explicit about what they do. So if someone wants to
do something that might be insecure, they have to very explicitly
ask for that. Hopefully that will cause them to think twice about it
and take any necessary precautions first.

/ Jonas



Re: Blob URL Origin

2014-05-20 Thread Anne van Kesteren
On Mon, May 19, 2014 at 9:57 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Mon, May 19, 2014 at 2:00 AM, Anne van Kesteren ann...@annevk.nl wrote:
 Again fair, but do we consider that something we want to fix or do we
 want to enshrine this?

 Given that there's no way to set CORS headers on these (yet), I think
 there's very limited value in allowing them to be read cross-origin.

I meant fixing not generating unique enough IDs. The way I see it such
a URL is effectively a capability URL (given a unique enough ID) and
at that point it should not be that different from handing out a Blob
object across origins.

The perceived danger is apparently people sticking these URLs in
things sans sandboxing and shooting themselves in the foot. So it
seems reasonable to treat such URLs as cross-origin for iframe and
workers (CSP's child-src), but for canvas that does not seem that
clear.


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-20 Thread Jonas Sicking
On Tue, May 20, 2014 at 1:28 AM, Anne van Kesteren ann...@annevk.nl wrote:
 On Mon, May 19, 2014 at 9:57 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Mon, May 19, 2014 at 2:00 AM, Anne van Kesteren ann...@annevk.nl wrote:
 Again fair, but do we consider that something we want to fix or do we
 want to enshrine this?

 Given that there's no way to set CORS headers on these (yet), I think
 there's very limited value in allowing them to be read cross-origin.

 I meant fixing not generating unique enough IDs. The way I see it such
 a URL is effectively a capability URL (given a unique enough ID) and
 at that point it should not be that different from handing out a Blob
 object across origins.

 The perceived danger is apparently people sticking these URLs in
 things sans sandboxing and shooting themselves in the foot. So it
 seems reasonable to treat such URLs as cross-origin for iframe and
 workers (CSP's child-src), but for canvas that does not seem that
 clear.

I think you are confusing issues. Or at least talking about two
separate issues at once in a way that I'm not sure what you are
talking about. The issue of "is there an XSS issue with treating blob:
like we treat data:" is a separate issue from "should we treat
cross-origin blob: like cross-origin http:", i.e. should we allow
pointing an img to a cross-origin blob:.

I had hoped that we had settled the former and decided that blob:
should not be treated as data:. And I think we've also decided that we
should use the explicit origin syntax, i.e. something like
blob:http://example.com/uuid;

Now that leaves the question of if blob: URLs should be loadable
cross-origin. I.e. if a page from http://a.com should be able to use
<img src="blob:http://b.com/uuid">.

Yes, we could demand that implementations generate unguessable
UUIDs. And then define that a page from http://a.com can use <img
src="blob:http://b.com/uuid">, but if it then used that element to
drawImage into a canvas, that the canvas would get tainted.

But there appears to be very little utility of doing this. Rather than
spending time implementing an unguessable UUID generator, and then
worrying that someone would still accidentally pass a blob: URL where
they shouldn't, I'd rather implement a way to generate a blob: URL
which is explicitly usable cross-origin. Both in img and in XHR. I.e.
a Blob URL which responds with CORS headers.

/ Jonas



Re: Blob URL Origin

2014-05-20 Thread Glenn Maynard
On Tue, May 20, 2014 at 2:24 PM, Jonas Sicking jo...@sicking.cc wrote:

 Yes, we could demand that implementations generate unguessable
 UUIDs. And then define that a page from http://a.com can use <img
 src="blob:http://b.com/uuid">, but if it then used that element to
 drawImage into a canvas, that the canvas would get tainted.

 But there appears to be very little utility of doing this. Rather than
 spending time implementing an unguessable UUID generator, and then
 worrying that someone would still accidentally pass a blob: URL where
 they shouldn't, I'd rather implement a way to generate a blob: URL
 which is explicitly usable cross-origin. Both in img and in XHR. I.e.
 a Blob URL which responds with CORS headers.


It'd be a lot better for blob URLs to act like other resources: either full
access (same origin or CORS cross-origin) or limited access cross-origin
(usable but taints canvas, can't be read with XHR, etc.) than to block them
entirely cross-origin.

Generating unguessable tokens (including version 4 UUIDs) is so easy to do
that it doesn't make sense to limit the API based on this.
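
As an illustration of how little is involved, here is a minimal sketch of a version 4 UUID built from a cryptographically secure random source. It assumes a Web Crypto getRandomValues implementation is available; the function name randomUuidV4 is hypothetical and this is not how any particular browser generates its blob URL identifiers.

```javascript
// Assumption: a Web Crypto implementation is available, either as the
// global `crypto` or, as a fallback, via Node's built-in crypto module.
const cryptoImpl = globalThis.crypto ?? require("node:crypto").webcrypto;

// Hypothetical helper: returns a version 4 UUID with 122 random bits.
function randomUuidV4() {
  const bytes = new Uint8Array(16);
  cryptoImpl.getRandomValues(bytes); // CSPRNG-backed random bytes
  bytes[6] = (bytes[6] & 0x0f) | 0x40; // set the version field to 4
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // set the RFC 4122 variant bits
  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0"));
  return [
    hex.slice(0, 4).join(""),
    hex.slice(4, 6).join(""),
    hex.slice(6, 8).join(""),
    hex.slice(8, 10).join(""),
    hex.slice(10, 16).join(""),
  ].join("-");
}
```

Whether the token is unguessable depends entirely on the random source, not on the UUID formatting.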

-- 
Glenn Maynard


Re: Blob URL Origin

2014-05-19 Thread Jonas Sicking
On Sun, May 18, 2014 at 6:38 AM, Anne van Kesteren ann...@annevk.nl wrote:
 On Sat, May 17, 2014 at 12:22 AM, Jonas Sicking jo...@sicking.cc wrote:
 And I agree with them. The fact that iframes end up same-origin
 makes it easier to XSS a website by tricking it to load a URL of the
 attackers choice in an iframe. Or open a worker using a URL of the
 attackers choice.

 I guess that is fair. Should a cross-origin blob URL taint the canvas?

In at least Chrome and Firefox, blob: acts like filesystem: and can't
be loaded cross-origin. Even in cases when we normally permit loading
of cross-origin resources like in img and script.

This has been to prevent websites from being able to steal data by
guessing UUIDs (at least the Gecko UUID generator isn't guaranteed to
produce unguessable UUIDs).

So the question of canvas tainting doesn't really come into play,
since you can't even load the cross-origin blob: into an image and
draw it into the canvas.

/ Jonas



Re: Blob URL Origin

2014-05-19 Thread Kyle Huey
On Mon, May 19, 2014 at 2:33 AM, Frederik Braun fbr...@mozilla.com wrote:
 On 15.05.2014 22:46, Glenn Maynard wrote:
 On Thu, May 15, 2014 at 12:07 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, May 15, 2014 at 6:52 AM, Anne van Kesteren ann...@annevk.nl wrote:
  I was thinking about the latter and that would not work if the URL was
  revoked. Unless we store origin at parse time.

 Good point. Without using the explicit syntax we couldn't return a
 consistent result for the origin.


 I'm not against the explicit model, but I think it would be very
 desirable if there was a way to get the origin out of the blob URL with
 some sort of API.
 It's not unlikely that developers will implement their own (flawed) URL
 parsing to determine the origin of a blob URL, if it's explicitly in the
 blob URL.

There's no way to extract the origin from the blob URL.  The URL is
just something of the form
blob:0ffc771c-486d-4cb0-8b7c-07b8dd9ab101.

- Kyle



Re: Blob URL Origin

2014-05-19 Thread Kyle Huey
It was pointed out to me that I should have read the rest of the thread ...

*looks sheepish*

- Kyle

On Mon, May 19, 2014 at 9:51 AM, Kyle Huey m...@kylehuey.com wrote:
 On Mon, May 19, 2014 at 2:33 AM, Frederik Braun fbr...@mozilla.com wrote:
 On 15.05.2014 22:46, Glenn Maynard wrote:
 On Thu, May 15, 2014 at 12:07 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, May 15, 2014 at 6:52 AM, Anne van Kesteren ann...@annevk.nl wrote:
  I was thinking about the latter and that would not work if the URL was
  revoked. Unless we store origin at parse time.

 Good point. Without using the explicit syntax we couldn't return a
 consistent result for the origin.


 I'm not against the explicit model, but I think it would be very
 desirable if there was a way to get the origin out of the blob URL with
 some sort of API.
 It's not unlikely that developers will implement their own (flawed) URL
 parsing to determine the origin of a blob URL, if it's explicitly in the
 blob URL.

 There's no way to extract the origin from the blob URL.  The URL is
 just something of the form
 blob:0ffc771c-486d-4cb0-8b7c-07b8dd9ab101.

 - Kyle



Re: Blob URL Origin

2014-05-19 Thread Arun Ranganathan

On May 19, 2014, at 5:33 AM, Frederik Braun fbr...@mozilla.com wrote:

 On 15.05.2014 22:46, Glenn Maynard wrote:
 On Thu, May 15, 2014 at 12:07 PM, Jonas Sicking jo...@sicking.cc wrote:
 
On Thu, May 15, 2014 at 6:52 AM, Anne van Kesteren ann...@annevk.nl wrote:
 I was thinking about the latter and that would not work if the URL was
 revoked. Unless we store origin at parse time.
 
Good point. Without using the explicit syntax we couldn't return a
consistent result for the origin.
 
 
 I'm not against the explicit model, but I think it would be very
 desirable if there was a way to get the origin out of the blob URL with
 some sort of API.
 It's not unlikely that developers will implement their own (flawed) URL
 parsing to determine the origin of a blob URL, if it's explicitly in the
 blob URL.


I think this could be done with the existing URL API’s origin attribute so that 
it does the right thing with a Blob URL: 
http://url.spec.whatwg.org/#dom-url-origin 

Of course, right now that won’t work in Fx. I wonder if we could fix that? Or 
add another static function to URL.

—A*




Re: Blob URL Origin

2014-05-19 Thread Jonas Sicking
On Mon, May 19, 2014 at 2:00 AM, Anne van Kesteren ann...@annevk.nl wrote:
 On Mon, May 19, 2014 at 10:30 AM, Jonas Sicking jo...@sicking.cc wrote:
 In at least Chrome and Firefox, blob: acts like filesystem: and can't
 be loaded cross-origin. Even in cases when we normally permit loading
 of cross-origin resources like in img and script.

 This has been to prevent websites from being able to steal data by
 guessing UUIDs (at least the Gecko UUID generator isn't guaranteed to
 produce unguessable UUIDs).

 So the question of canvas tainting doesn't really come into play,
 since you can't even load the cross-origin blob: into an image and
 draw it into the canvas.

 Again fair, but do we consider that something we want to fix or do we
 want to enshrine this?

Given that there's no way to set CORS headers on these (yet), I think
there's very limited value in allowing them to be read cross-origin.

We could look at enabling developers to opt in to generating a URI
which can be read cross-origin, at which point it could generate a URI
which can be read by a developer-chosen set of origins. But I'd prefer
to keep the default behavior closed.

/ Jonas



Re: Blob URL Origin

2014-05-19 Thread Glenn Maynard
On Mon, May 19, 2014 at 3:30 AM, Jonas Sicking jo...@sicking.cc wrote:

 In at least Chrome and Firefox, blob: acts like filesystem: and can't
 be loaded cross-origin. Even in cases when we normally permit loading
 of cross-origin resources like in img and script.

 This has been to prevent websites from being able to steal data by
 guessing UUIDs (at least the Gecko UUID generator isn't guaranteed to
 produce unguessable UUIDs).


Again, generating securely unguessable tokens (whether in UUID format or
not) is straightforward, so this doesn't seem like a reason to block
cross-origin blob URLs.

-- 
Glenn Maynard


Re: Blob URL Origin

2014-05-18 Thread Anne van Kesteren
On Sat, May 17, 2014 at 12:22 AM, Jonas Sicking jo...@sicking.cc wrote:
 And I agree with them. The fact that iframes end up same-origin
 makes it easier to XSS a website by tricking it to load a URL of the
 attackers choice in an iframe. Or open a worker using a URL of the
 attackers choice.

I guess that is fair. Should a cross-origin blob URL taint the canvas?

Do we have an exhaustive list of where data URLs are problematic and
where they are not? Ideally we rewrite the model in the specifications
to something that is coherent and more secure.


 But really, I'd recommend reaching out to the browsers that currently
 treat data: URLs as having a unique origin. They can probably much
 better speak to why they feel that that's needed.

I believe they are subscribed. Adam? Joel?


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-16 Thread Anne van Kesteren
On Thu, May 15, 2014 at 8:17 PM, Jonas Sicking jo...@sicking.cc wrote:
 I did. It's not very attractive to use the model of something that so
 far we haven't been able to make work consistently across UAs, and
 which isn't looking like we will be able to get consistently working
 across UAs for a long time to come. Not only does that mean that we'll
 keep blob: in the same limbo, it also means that we don't know if
 after that limbo we'll get something that is particularly great.

You did say this, but this does not explain the actual problem. What
exactly is wrong with the data URL model that we have today and how do
we plan on fixing it?


 Another argument that I'm not sure if I've raised yet is that so far I
 don't see any good solutions for data:. The rfc seems to currently
 call for all data: URLs to be given a unique origin. That means that
 you can't do new Worker(data:...), which seems bad. And for blob:
 it'd be particularly sad if we couldn't do new Image(blob:) and then
 using that image in WebGL or in canvas2d without the canvas getting
 tainted.

The origin you get out of a URL is not the origin assigned to a
resource, necessarily. At the moment Fetch (and HTML before it)
defines that fetching a data URL is the same as fetching a same-origin
resource and gives you something back that is not tainted. This is the
same for blob URLs, they won't get tainted either. Or about:blank.


 The best solution that I've been able to think of so far was what I
 proposed in another thread of requiring explicit opt-in. However that
 requires messy and unusual syntax everywhere where URLs are used and
 where we might want to treat data: as same-origin. So also not
 something I'd be sad not to have to do for blob:.

I think the sad thing is that if you couple origins with blob URLs you
can no longer hand a blob URL to an iframe-based widget and let them
play with it. E.g. draw, modify, and hand a URL back for the modified
image. But I guess this is a scenario you explicitly want to outlaw,
even though you could do the equivalent by passing a Blob object
directly and that would always work.


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-16 Thread Boris Zbarsky

On 5/16/14, 10:11 AM, Anne van Kesteren wrote:

What exactly is wrong with the data URL model that we have today


The fact that some UAs don't want to implement it?


and how do we plan on fixing it?


We don't have a plan yet.


But I guess this is a scenario you explicitly want to outlaw,
even though you could do the equivalent by passing a Blob object
directly and that would always work.


Where by directly you mean postMessage, right?

-Boris




Re: Blob URL Origin

2014-05-16 Thread Anne van Kesteren
On Fri, May 16, 2014 at 4:31 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/16/14, 10:11 AM, Anne van Kesteren wrote:
 What exactly is wrong with the data URL model that we have today

 The fact that some UAs don't want to implement it?

Do we know why?


 But I guess this is a scenario you explicitly want to outlaw,
 even though you could do the equivalent by passing a Blob object
 directly and that would always work.

 Where by directly you mean postMessage, right?

Yes.


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-16 Thread Boris Zbarsky

On 5/16/14, 10:39 AM, Anne van Kesteren wrote:

The fact that some UAs don't want to implement it?


Do we know why?


They think it's a security problem.

-Boris



Re: Blob URL Origin

2014-05-16 Thread Anne van Kesteren
On Fri, May 16, 2014 at 5:04 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/16/14, 10:39 AM, Anne van Kesteren wrote:
 The fact that some UAs don't want to implement it?

 Do we know why?

 They think it's a security problem.

Not tainting canvas? Same-origin iframe? Doesn't matter?


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-16 Thread Glenn Maynard
On Fri, May 16, 2014 at 9:11 AM, Anne van Kesteren ann...@annevk.nl wrote:

 I think the sad thing is that if you couple origins with blob URLs you
 can no longer hand a blob URL to an iframe-based widget and let them
 play with it. E.g. draw, modify, and hand a URL back for the modified
 image. But I guess this is a scenario you explicitly want to outlaw,
 even though you could do the equivalent by passing a Blob object
 directly and that would always work.


As I recall, when I asked why blob URLs were same-origin only, the answer
was that it was uncertain whether all platforms had a good enough PRNG to
allow generating securely-unguessable tokens for blob URLs in order to make
sure sites can't guess blob URLs for other sites.  I don't think that's an
issue (if you don't have an entropy source to implement a secure PRNG, you
don't even have basic crypto).  I think that the same-origin restriction
for blob URLs should be removed.

-- 
Glenn Maynard


Re: Blob URL Origin

2014-05-16 Thread Boris Zbarsky

On 5/16/14, 11:08 AM, Anne van Kesteren wrote:

Not tainting canvas? Same-origin iframe? Doesn't matter?


The same-origin iframe bit.  I think everyone is on board with not 
tainting canvas for data: things.


-Boris




Re: Blob URL Origin

2014-05-15 Thread Jonas Sicking
On Thu, May 15, 2014 at 6:52 AM, Anne van Kesteren ann...@annevk.nl wrote:
 I was thinking about the latter and that would not work if the URL was
 revoked. Unless we store origin at parse time.

Good point. Without using the explicit syntax we couldn't return a
consistent result for the origin.

So that's another argument for using the explicit syntax. We should
just do that.

/ Jonas



Re: Blob URL Origin

2014-05-15 Thread Glenn Maynard
On Thu, May 15, 2014 at 12:07 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, May 15, 2014 at 6:52 AM, Anne van Kesteren ann...@annevk.nl
 wrote:
  I was thinking about the latter and that would not work if the URL was
  revoked. Unless we store origin at parse time.

 Good point. Without using the explicit syntax we couldn't return a
 consistent result for the origin.


I pointed this out three days ago.  If my mails aren't making it through,
please let me know.
http://lists.w3.org/Archives/Public/public-webapps/2014AprJun/0397.html

-- 
Glenn Maynard


Re: Blob URL Origin

2014-05-13 Thread Frederik Braun
On 12.05.2014 18:41, Jonas Sicking wrote:
 (new URL(url)).origin should work, no?

It does not work for blob URIs in Firefox.




Re: Blob URL Origin

2014-05-13 Thread Boris Zbarsky

On 5/13/14, 1:20 AM, Frederik Braun wrote:

On 12.05.2014 18:41, Jonas Sicking wrote:

(new URL(url)).origin should work, no?


It does not work for blob URIs in Firefox.


It can't work, given how URL.origin is currently defined...  It's 
possible that definition should change, though.


-Boris




Re: Blob URL Origin

2014-05-13 Thread Anne van Kesteren
On Mon, May 12, 2014 at 6:51 PM, Jonas Sicking jo...@sicking.cc wrote:
 I agree with this. But Adam's assessment of how long that will take to get
 specced and implemented was on the order of years, not months. I share that
 assessment.

 I am also not at all convinced that I'd want blob: to behave like data: even
 if I could. The solution for data: is likely to get messy.

This keeps coming up. Is there a page where we have a description?


 The origin model for http has so far worked better for us I think.

I'm not sure how you can straightforwardly compare that to locally
created resources.


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-13 Thread Anne van Kesteren
On Tue, May 13, 2014 at 10:33 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 It can't work, given how URL.origin is currently defined...  It's possible
 that definition should change, though.

We don't want new URL() to take ownership of the Blob object, so
making new URL() reflect the origin of whoever created the uuid for
the Blob object seems weird.


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-13 Thread Jonas Sicking
On Tue, May 13, 2014 at 6:00 AM, Anne van Kesteren ann...@annevk.nl wrote:
 On Tue, May 13, 2014 at 10:33 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 It can't work, given how URL.origin is currently defined...  It's possible
 that definition should change, though.

 We don't want new URL() to take ownership of the Blob object, so
 making new URL() reflect the origin of whoever created the uuid for
 the Blob object seems weird.

Why would it need to take ownership of the Blob object? First of
all, the origin of the blob: url is determined by who called
createObjectURL, not who owns the Blob instance. Hence you don't
actually need to touch the Blob instance to figure out the origin, but
rather just inspect the url itself (if we use explicit origin syntax),
or look up the origin in the internal url-Blob table (if we use
implicit origin syntax).

And even if you did somehow need to touch the Blob, the implementation
could just immediately release it before returning from the
constructor.
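
The two lookup strategies can be contrasted with a toy model. Everything here is hypothetical (the BlobUrlStore class, its method names, and originFromSyntax); it is a sketch of the idea under discussion, not of any browser's internals, and it assumes a URL parser that derives a blob: URL's origin from the URL embedded in its path.

```javascript
// Toy model of a user agent's url-to-Blob table. All names are hypothetical.
class BlobUrlStore {
  constructor() {
    this.entries = new Map(); // url string -> { blob, origin }
  }

  // The origin recorded is that of the caller of createObjectURL,
  // not of whoever created the Blob.
  mint(blob, callerOrigin, id, explicit) {
    const url = explicit ? `blob:${callerOrigin}/${id}` : `blob:${id}`;
    this.entries.set(url, { blob, origin: callerOrigin });
    return url;
  }

  revoke(url) {
    this.entries.delete(url);
  }

  // Implicit syntax: the origin is only reachable through the table,
  // so it is lost once the URL has been revoked.
  originFromTable(url) {
    const entry = this.entries.get(url);
    return entry ? entry.origin : null;
  }
}

// Explicit syntax: the origin is read from the string alone, without
// touching the table or the Blob.
function originFromSyntax(url) {
  const origin = new URL(url).origin;
  return origin === "null" ? null : origin;
}
```

In neither case does the lookup need to hold on to the Blob; the difference is that only the explicit form keeps giving the same answer after revocation.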

/ Jonas



Re: Blob URL Origin

2014-05-13 Thread Arun Ranganathan
On May 12, 2014, at 8:28 AM, Anne Van Kesteren ann...@annevk.nl wrote:

 It still seems a bit sad though to tie these URLs to origins in this
 fashion. Jonas is correct that there are inconsistencies in how data
 URLs and origins behave across browsers, but it seems like we should
 sort those out first then if we want a consistent story.




Since Blobs can be passed around in a number of well-known ways, it seems that 
the most legitimate origin of a Blob URL is the origin of the script that 
coined it. I’m not entirely sure how to take action on “it still seems a bit 
sad” though. Sad because of security considerations? After drying my tears, I 
can’t construct a meaningful attack, but I’d welcome more information about 
what benefits are gained by encoding certain “HTTP-reserved” components of URL 
nomenclature (and here, Chrome is inconsistent between blob: and filesystem:). 
Sad because of aesthetics? It’s pretty enough for Safari.

And really, all user agents seem to agree that the origin is that of the 
settings object today. That model seems to work. The remaining question is the 
pro and con of denoting this in the URL’s syntax. abarth’s advice is to put the 
syntax horse in front of the origin cart: 
http://krijnhoetmer.nl/irc-logs/whatwg/20140508#l-913

Also, if it’s “sad” because it doesn’t match data: URL’s way of reckoning 
origin, that doesn’t seem sad to me. 

— A*

Re: Blob URL Origin

2014-05-13 Thread Jonas Sicking
On Tue, May 13, 2014 at 11:19 AM, Arun Ranganathan a...@mozilla.com wrote:
 And really, all user agents seem to agree that the origin is that of the
 settings object today. That model seems to work. The remaining question is
 the pro and con of denoting this in the URL's syntax. abarth's advice is to
 put the syntax horse in front of the origin cart:
 http://krijnhoetmer.nl/irc-logs/whatwg/20140508#l-913

We definitely want to put the origin in the URL's syntax. The only
reason Gecko can get away with not doing that is because we are
single-process. But that's not something we want to impose on other
browsers.

I think the main issue to figure out is the exact syntax. I.e. if it's
ok that things like ':' and '//' appear in the URL. Given that Anne
said that it was, I think the most obvious answer here is to use URLs
like:

blob:https://cantaloupe.org/e55f2c33-f000-4e88-b89c-874ae09e7f93
blob:http://example.com:8080/e55f2c33-f000-4e88-b89c-874ae09e7f93

/ Jonas



Re: Blob URL Origin

2014-05-12 Thread Frederik Braun
On 09.05.2014 23:29, Arun Ranganathan wrote:
 ..
 So this is problematic: we don’t have a common syntax on the web, and
 even implementations which share code don’t do it exactly the same. Of
 course, blob: URLs aren’t supposed to be human readable, or really
 viewed by the developer. But not having a good way to denote origin
 within the URL that signifies the origin of the incumbent settings
 object is problematic for Fetch and Parse specifications that need
 origin information.

Wouldn't it be nice if there was a programmatic way to probe if a blob
URI belongs to a specified origin or not?
If they had an origin attribute one could compare them to
location.origin or so. But well, they are DOMStrings and not objects...




Re: Blob URL Origin

2014-05-12 Thread Boris Zbarsky

On 5/12/14, 5:28 AM, Anne van Kesteren wrote:

so blob:https://origin:42/uuid would be fine.


I'd really rather we didn't make web pages parse these strings to get 
the origin.  A static method on Blob that takes a valid blob: URI and 
returns its origin seems like it should be pretty easy for UAs to 
implement, though.


-Boris



Re: Blob URL Origin

2014-05-12 Thread Anne van Kesteren
On Mon, May 12, 2014 at 4:31 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/12/14, 5:28 AM, Anne van Kesteren wrote:
 so blob:https://origin:42/uuid would be fine.

 I'd really rather we didn't make web pages parse these strings to get the
 origin.  A static method on Blob that takes a valid blob: URI and returns
 its origin seems like it should be pretty easy for UAs to implement, though.

I thought the idea was to associate the origin of
URL.createObjectURL() with the Blob object (which might be different
from the Blob object's origin). And then for iframe etc. only allow
loading same-origin blob: URLs.

It seems this could also be achieved via other means, such as scoping
the minted uuids to a particular origin.

And I guess it protects against someone handing you a URL and you
assuming loading that is safe. Again, seems like that could be
achieved by scoped uuids if we think it's desirable.


-- 
http://annevankesteren.nl/



Re: Blob URL Origin

2014-05-12 Thread Boris Zbarsky

On 5/12/14, 7:46 AM, Anne van Kesteren wrote:

I thought the idea was to associate the origin of
URL.createObjectURL() with the Blob object (which might be different
from the Blob object's origin).


Er, right, these URLs come from URL.createObjectURL.  So we'd want a 
URL.getObjectURLOrigin() or some such, not a static on Blob, if we want 
a way to expose the origin to script.


-Boris



Re: Blob URL Origin

2014-05-12 Thread Arun Ranganathan

On May 12, 2014, at 10:31 AM, Boris Zbarsky bzbar...@mit.edu wrote:

 On 5/12/14, 5:28 AM, Anne van Kesteren wrote:
 so blob:https://origin:42/uuid would be fine.
 
 I'd really rather we didn't make web pages parse these strings to get the 
 origin.  A static method on Blob that takes a valid blob: URI and returns its 
 origin seems like it should be pretty easy for UAs to implement, though.


We actually aren’t obliging web pages to parse these strings to get the origin. In 
fact, blob: URL strings shouldn’t even be of interest to web pages. They aren’t 
today, and I don’t envision them being of interest even with “origin tagging.” 
That is, I can’t think of why exactly a web developer would want to look into 
the blob: URL strings. UA’s should just “do the right thing” once a Blob URL is 
coined.

The question is really whether origin should be implicit or explicit. Fx’s 
implementation makes it implicit, so that there’s no way to deduce origins from 
the Blob URL itself, but it just “does the right thing” in terms of origin 
strictures. That hasn’t been a problem, but it’s hard to spec it that way. 
Also, it makes Blob URLs only usable within APIs that are aware of them, which 
honestly is the case today.

So what if we tag origin into the strings? Would that be so bad? It’s not doing 
anything other than denoting the incumbent script setting object’s origin, no? 
Even in the Chrome/Safari cases, I can’t think of web developers using that 
information.

— A*




Re: data:, was: Blob URL Origin

2014-05-12 Thread Arun Ranganathan

On May 12, 2014, at 6:26 AM, Julian Reschke julian.resc...@gmx.de wrote:

 Could you please clarify what spec you are referring to, and in which way 
 it's not implemented correctly?


Well, I think http://tools.ietf.org/html/rfc6454#section-4 is supposed to be 
normative for data: URL origin. But, implementations don’t behave this way; the 
problem is inheriting origins. Here’s the pertinent thread:

http://lists.w3.org/Archives/Public/public-webapps/2014JanMar/0682.html

Re: Blob URL Origin

2014-05-12 Thread Jonas Sicking
On May 12, 2014 7:33 AM, Boris Zbarsky bzbar...@mit.edu wrote:

 On 5/12/14, 5:28 AM, Anne van Kesteren wrote:

 so blob:https://origin:42/uuid would be fine.


 I'd really rather we didn't make web pages parse these strings to get the
origin.  A static method on Blob that takes a valid blob: URI and returns
its origin seems like it should be pretty easy for UAs to implement, though.

(new URL(url)).origin should work, no?

But creating an even easier way to get the origin of a URL might be good.
Though I don't think this problem is specific to blob: URLs, so I
wouldn't create a solution that is specific to them.

Maybe a static URL.getOrigin(url).

/ Jonas
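
A minimal sketch of the idea above, assuming an environment whose URL parser derives a blob: URL's origin from the URL embedded in its path, as current browsers and Node.js do. The `getOrigin` function is a hypothetical stand-in for the proposed static `URL.getOrigin(url)`, which is not an existing API; the sample URLs are illustrative only.

```javascript
// Hypothetical stand-in for the proposed static URL.getOrigin(url).
// Returns the serialized origin, or null when the string does not parse
// as an absolute URL.
function getOrigin(url) {
  try {
    return new URL(url).origin;
  } catch (e) {
    return null;
  }
}

// An "explicit" blob: URL of the form discussed in this thread.
const blobOrigin = getOrigin("blob:https://origin:42/uuid");

// The same helper works for ordinary URLs, so it is not blob-specific.
const httpOrigin = getOrigin("https://example.com/path?x=1");

// Unparseable input yields null instead of throwing.
const invalidOrigin = getOrigin("not a url");

console.log(blobOrigin, httpOrigin, invalidOrigin);
```

Wrapping the parse in a try/catch is a design choice for this sketch only; a real static method could just as well throw on invalid input.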


Re: Blob URL Origin

2014-05-12 Thread Jonas Sicking
On May 12, 2014 8:57 AM, Arun Ranganathan a...@mozilla.com wrote:
 On May 12, 2014, at 10:31 AM, Boris Zbarsky bzbar...@mit.edu wrote:

  On 5/12/14, 5:28 AM, Anne van Kesteren wrote:
  so blob:https://origin:42/uuid would be fine.
 
  I'd really rather we didn't make web pages parse these strings to get
the origin.  A static method on Blob that takes a valid blob: URI and
returns its origin seems like it should be pretty easy for UAs to
implement, though.


 We actually aren't obliging web pages to parse these strings to get the
origin. In fact, blob: URL strings shouldn't even be of interest to web
pages. They aren't today, and I don't envision them being of interest even
with "origin tagging." That is, I can't think of why exactly a web
developer would want to look into the blob: URL strings. UAs should just
"do the right thing" once a Blob URL is coined.

I suspect that some pages will want to check the origin of a URL before
firing off a load to it. For example, complex app frameworks like Facebook's.

However, I agree that this won't be a core use case. So not something to
worry too much about.

The strongest reason I could see for doing anything here is that when it
comes to security it is extra important to encourage people to not do the
wrong thing.

Though, would simply `new URL(url).origin` work? If so that might be enough.

/ Jonas
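
The "check the origin before firing off a load" case mentioned above could look roughly like the following. This is only a sketch: `isFromTrustedOrigin` and the list of trusted origins are hypothetical names, and it assumes blob: URLs carry their origin explicitly so that the URL parser can report it.

```javascript
// Hypothetical pre-load check: accept a URL only if its origin is on an
// allow-list supplied by the caller.
function isFromTrustedOrigin(url, trustedOrigins) {
  let origin;
  try {
    origin = new URL(url).origin;
  } catch (e) {
    return false; // not a parseable absolute URL
  }
  // Opaque origins serialize as the string "null"; never treat them as trusted.
  return origin !== "null" && trustedOrigins.includes(origin);
}

const trusted = ["https://origin:42"];

const sameOriginBlob = isFromTrustedOrigin("blob:https://origin:42/uuid", trusted);
const foreignBlob = isFromTrustedOrigin("blob:https://evil.example/uuid", trusted);
const dataUrl = isFromTrustedOrigin("data:text/plain,hello", trusted);

console.log(sameOriginBlob, foreignBlob, dataUrl);
```

Rejecting the serialized value "null" explicitly keeps a careless allow-list entry from matching every opaque origin.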


Re: Blob URL Origin

2014-05-12 Thread Glenn Maynard
On Mon, May 12, 2014 at 11:41 AM, Jonas Sicking jo...@sicking.cc wrote:

 I'd really rather we didn't make web pages parse these strings to get the
 origin.  A static method on Blob that takes a valid blob: URI and returns
 its origin seems like it should be pretty easy for UAs to implement, though.

 (new URL(url)).origin should work, no?

I don't think there have been any real differences to argue between the
implicit and explicit approaches, but this does argue for
explicit.  Otherwise, new URL(blobURL) would have to synchronously read
the associated Blob's metadata (which might be on disk or in another
process), and the result of new URL() would change when a blob URL is
revoked.

-- 
Glenn Maynard
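
The difference described above can be illustrated with a toy model. Everything here is an assumption made for illustration: `blobUrlStore`, `registerImplicit`, `revoke`, `implicitOrigin`, and `explicitOrigin` are hypothetical names, the URLs are made up, and no browser is claimed to work this way internally.

```javascript
// Toy model of a blob URL store: maps a blob: URL string to its metadata.
const blobUrlStore = new Map();

function registerImplicit(url, origin) {
  blobUrlStore.set(url, { origin });
}

function revoke(url) {
  blobUrlStore.delete(url);
}

// Implicit design: the origin lives only in the store, so answering
// "what is the origin?" needs a lookup, and the answer changes on revocation.
function implicitOrigin(url) {
  const entry = blobUrlStore.get(url);
  return entry ? entry.origin : "null";
}

// Explicit design: the origin is part of the string, so parsing alone suffices.
function explicitOrigin(url) {
  return new URL(url).origin;
}

const implicitUrl = "blob:hypothetical-uuid";
const explicitUrl = "blob:https://origin:42/hypothetical-uuid";

registerImplicit(implicitUrl, "https://origin:42");
const before = [implicitOrigin(implicitUrl), explicitOrigin(explicitUrl)];

revoke(implicitUrl);
const after = [implicitOrigin(implicitUrl), explicitOrigin(explicitUrl)];

console.log(before, after);
```

In this model the explicit form gives the same answer before and after revocation, while the implicit form depends on store state that may live on disk or in another process.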