Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-07 Thread Jonas Sicking
On Mon, May 6, 2013 at 5:57 PM, Anne van Kesteren ann...@annevk.nl wrote:
 On Mon, May 6, 2013 at 5:45 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Mon, May 6, 2013 at 4:28 PM, Anne van Kesteren ann...@annevk.nl wrote:
 Okay. So that fails for XMLHttpRequest :-(

 What do you mean? Those are the steps we take for XHR requests too.

 So e.g. open() needs to do URL parsing (per XHR spec), send() would
 cause CSP to fail (per CSP spec), send() also does the fetch (per XHR
 spec). Overall it seems like a different model from the other APIs,
 but maybe I'm missing something?

The only thing that's different about XHR is that the first step in my
list lives in one function, and the other steps live in another
function. Doesn't seem to have any effect on the discussions here
other than that we'd need to define which of the two functions does
the step which grabs a reference to the Blob.

/ Jonas



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-07 Thread Glenn Maynard
On Mon, May 6, 2013 at 10:52 PM, Eric U er...@google.com wrote:

 I'm not really sure what you're saying, here.  If you want an URL to
 expire or otherwise be revoked, no, you can't use it multiple times
 after that.  If you want it to work multiple times, don't revoke it or
 don't set oneTimeOnly.


No, I'm saying that APIs *internally* may perform multiple fetches, such as
if you load a URL into video and the user performs multiple pauses and
seeks.  This should be completely transparent to script.  Similarly, if you
load blob URLs into srcset, the fact that srcset might load or reload the
images any number of times in the future due to changes to the environment
should be completely transparent to script.  There are lots of cases of
this, and we should have a simple, predictable approach to dealing with it.

At a high-level, my view is that (within reason) blobs should be captured
at the point of entry into the native API.  As soon as you say img.srcset
= '...; blob URL; ...', or xhr.open(objectURL), or img.src =
createObjectURL(), that point where the URL first enters the native API is
what matters.  That's a simple rule that's easy for developers to
understand in general, without needing to care about when or how many times
the fetch algorithm is run (if ever, as with srcset) on the URLs.
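
As a rough illustration of the "capture at the point of entry" rule, here is a minimal sketch in JavaScript. Everything in it is hypothetical: `store` stands in for the browser's blob URL store, and `setSrcset` with its candidate objects is a made-up model of assigning img.srcset, not real srcset syntax. It only shows that once the blobs are captured at assignment, later revocation and repeated reselection are invisible to script.

```javascript
// Hypothetical stand-in for the browser's blob URL store.
const store = new Map([
  ["blob:example/small", "small image"],
  ["blob:example/large", "large image"],
]);

// Models assigning img.srcset: every blob URL is captured right here,
// at the point where it enters the API.
function setSrcset(candidates) {
  const captured = candidates.map(({ url, minWidth }) => ({
    minWidth,
    blob: store.get(url) ?? null,
  }));
  // The returned function models the UA reselecting an image when the
  // environment changes; it may run any number of times, or never.
  return (viewportWidth) => {
    const usable = captured
      .filter((c) => c.minWidth <= viewportWidth)
      .sort((a, b) => b.minWidth - a.minWidth);
    return usable.length > 0 ? usable[0].blob : null;
  };
}

const select = setSrcset([
  { url: "blob:example/small", minWidth: 0 },
  { url: "blob:example/large", minWidth: 1000 },
]);
store.clear(); // both URLs are revoked before any fetch has happened

console.log(select(800));  // small image
console.log(select(1920)); // large image
console.log(select(800));  // small image
```

Switching back and forth between the two candidates keeps working because the selection step never goes back to the store.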

-- 
Glenn Maynard


Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-07 Thread Anne van Kesteren
On Mon, May 6, 2013 at 11:11 PM, Jonas Sicking jo...@sicking.cc wrote:
 The only thing that's different about XHR is that the first step in my
 list lives in one function, and the other steps live in another
 function. Doesn't seem to have any effect on the discussions here
 other than that we'd need to define which of the two functions does
 the step which grabs a reference to the Blob.

Fair enough. So I guess we can indeed fix this by changing
http://fetch.spec.whatwg.org/#concept-fetch to get a reference to the
Blob/MediaStream/... before returning early as Arun suggested.


--
http://annevankesteren.nl/



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-07 Thread Arun Ranganathan

On May 7, 2013, at 10:45 AM, Anne van Kesteren wrote:

 On Mon, May 6, 2013 at 11:11 PM, Jonas Sicking jo...@sicking.cc wrote:
 The only thing that's different about XHR is that the first step in my
 list lives in one function, and the other steps live in another
 function. Doesn't seem to have any effect on the discussions here
 other than that we'd need to define which of the two functions does
 the step which grabs a reference to the Blob.
 
 Fair enough. So I guess we can indeed fix this by changing
 http://fetch.spec.whatwg.org/#concept-fetch to get a reference to the
 Blob/MediaStream/... before returning early as Arun suggested.


\o/ :-)

Filed https://www.w3.org/Bugs/Public/show_bug.cgi?id=21955 

-- A*



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-07 Thread Anne van Kesteren
On Tue, May 7, 2013 at 12:17 PM, Arun Ranganathan a...@mozilla.com wrote:
 \o/ :-)

 Filed https://www.w3.org/Bugs/Public/show_bug.cgi?id=21955

So actually, after I emailed that this morning I wondered how this
would work for img srcset or image() in CSS where fetch is unlikely
to be sync. Most of those are new features which might explain why it
has not been seen as a problem thus far.


--
http://annevankesteren.nl/



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-07 Thread Glenn Maynard
On Tue, May 7, 2013 at 9:45 AM, Anne van Kesteren ann...@annevk.nl wrote:

 On Mon, May 6, 2013 at 11:11 PM, Jonas Sicking jo...@sicking.cc wrote:
  The only thing that's different about XHR is that the first step in my
  list lives in one function, and the other steps live in another
  function. Doesn't seem to have any effect on the discussions here
  other than that we'd need to define which of the two functions does
  the step which grabs a reference to the Blob.

 Fair enough. So I guess we can indeed fix this by changing
 http://fetch.spec.whatwg.org/#concept-fetch to get a reference to the
 Blob/MediaStream/... before returning early as Arun suggested.


Step 1 is resolve, step 3 is fetch.  Moving it into step 1 means it would
go in resolve, not fetch.  Putting it in fetch wouldn't help, since fetch
doesn't always start synchronously.  (I'm confused, because we've talked
about this distinction several times.)

-- 
Glenn Maynard


Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-07 Thread Jonas Sicking
I'd be worried about letting any resolved URL hold a reference to
the Blob. We are playing very fast and loose with URLs in Gecko and
it's never been intended that they hold on to any resources of
significant size.

/ Jonas





Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-07 Thread Michael Nordman
Fwiw, to the extent it may be helpful when it comes to spec writing, here
are some quick-n-dirty thoughts about how some approximation of the
'autorevoke' behavior could be implemented in chromium.

1) Extend the lifetime of the PublicBlobURL registration until *after* the
last latchee has redeemed its ticket to ride the url. The url registration
itself is refcounted. The url registration implies the underlying data is
not reclaimable.  At time of coining the url registration, a microtask is
scheduled to release it. Latchees may addref it safely prior to microtask
execution, keeping the url registration valid until a later balancing
release. This would be a non-compliant approximation of the way the spec is
shaping up... but good enough... maybe has the important
characteristics of freeing memory up eventually (generally prior to doc
unload) and guaranteeing a latchee's ticket to ride.

2) Extend the lifetime of the underlying BlobData until *after* the latchee
has redeemed its ticket to ride the url, but revoke the URL well in advance
of that. The url registration is not refcounted. The url registration
implies the underlying data is not reclaimable. There is a means to lookup
a handle to the data given the url.  The underlying blob data is
refcounted. Latchees must take and release a ref on that. The url
registration is revoked at microtask execution time after being coined.
Piggy back a handle to the blob data on future Fetch (or other network
requests) for the PublicBlobURL. Ignore the URL when processing those
requests and refer only to the piggybacked blob data handle. Probably a
more compliant approach, but maybe more tedious to implement in chromium
(since the URL is no longer useful as the identifier for what to
fetch/load/whathaveyou, we need sideband data for that in addition to
addref/release at url latch/redemption times).
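
For approach 1, a minimal sketch in JavaScript of a refcounted URL registration whose creator reference is dropped at microtask time. This is not Chromium code: `UrlRegistration`, `addRef`, `release`, and the `schedule` parameter are hypothetical names, and the Map stands in for the URL registry. The scheduler is injectable only so the example can step through the "microtask" by hand.

```javascript
// Hypothetical model of approach 1: the URL registration itself is refcounted.
class UrlRegistration {
  constructor(registry, url, blob, schedule = (fn) => queueMicrotask(fn)) {
    this.registry = registry;
    this.url = url;
    this.refs = 1; // reference held on behalf of the code that coined the URL
    registry.set(url, blob);
    // At coining time, schedule the release of that initial reference.
    schedule(() => this.release());
  }

  // A latchee takes its "ticket to ride" before the microtask runs.
  addRef() {
    if (this.refs === 0) throw new Error("registration already released");
    this.refs += 1;
  }

  // Balancing release; the URL stays valid until the last reference goes.
  release() {
    this.refs -= 1;
    if (this.refs === 0) this.registry.delete(this.url);
  }
}

// Manual queue standing in for the microtask checkpoint.
const pending = [];
const registry = new Map();
const reg = new UrlRegistration(registry, "blob:example/2", "data", (fn) => pending.push(fn));

reg.addRef();                   // latchee latches before the microtask
pending.forEach((fn) => fn());  // microtask: creator's reference is dropped
console.log(registry.has("blob:example/2")); // true
reg.release();                  // latchee redeems its ticket
console.log(registry.has("blob:example/2")); // false
```

Approach 2 would differ in that the registry entry is deleted at microtask time regardless, and the latchee carries its own reference to the blob data instead of relying on the URL.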

  (I'm confused, because we've talked about this distinction several
times.)
lol, picture a herd of cats chasing their tails, you can call me calico :)






Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-07 Thread Jonas Sicking
On Tue, May 7, 2013 at 2:54 PM, Jonas Sicking jo...@sicking.cc wrote:
 I'd be worried about letting any resolved URL hold a reference to
 the Blob. We are playing very fast and loose with URLs in Gecko and
 it's never been intended that they hold on to any resources of
 significant size.

That said, the one way to find out if this approach works is to simply try it.

/ Jonas



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-07 Thread Glenn Maynard
On Tue, May 7, 2013 at 4:54 PM, Jonas Sicking jo...@sicking.cc wrote:

 I'd be worried about letting any resolved URL hold a reference to
 the Blob. We are playing very fast and loose with URLs in Gecko and
 it's never been intended that they hold on to any resources of
 significant size.


Note that I'm not suggesting that every invocation of the resolve algorithm
start capturing blob URLs.  It'd be an explicit operation at entry points
that support it, not a catch-all happening behind the scenes any time you
resolve a URL anywhere.  (Actually, I went a bit further--entry points that
don't explicitly do this shouldn't allow autorevoke URLs at all.)

The actual change required in the particular entry points might be as
simple as saying "resolve URL with capture" instead of "resolve URL" to
invoke a wrapper algorithm, but it lets it be introduced gradually and makes
it clear exactly where it happens.
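
A minimal sketch in JavaScript of what such an opt-in wrapper could look like. The function names `resolveURL` and `resolveURLWithCapture`, the `entries` map, and the choice of a TypeError are all assumptions made for illustration; no spec defines them this way.

```javascript
// Hypothetical blob URL store: url -> { blob, autoRevoke }.
const entries = new Map([
  ["blob:example/a", { blob: "A", autoRevoke: true }],
  ["blob:example/b", { blob: "B", autoRevoke: false }],
]);

// Plain resolve, used by entry points that have not opted in.
// It never captures, and it refuses autorevoke URLs outright.
function resolveURL(url) {
  const entry = entries.get(url);
  if (entry && entry.autoRevoke) {
    throw new TypeError("autorevoke blob URL used at an entry point without capture");
  }
  return { url, blob: null };
}

// Models the wrapper: resolve and pin the Blob, so that a later
// revocation of the URL no longer matters to this entry point.
function resolveURLWithCapture(url) {
  const entry = entries.get(url);
  return { url, blob: entry ? entry.blob : null };
}

const pinned = resolveURLWithCapture("blob:example/a");
entries.delete("blob:example/a"); // the autorevoke fires
console.log(pinned.blob); // A
```

Keeping the two functions separate is what lets capture be introduced one entry point at a time.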

-- 
Glenn Maynard


Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-06 Thread Anne van Kesteren
On Sun, May 5, 2013 at 5:37 PM, Jonas Sicking jo...@sicking.cc wrote:
 What we do is that we

 1. Resolve the URL against the current base URL
 2. Perform some security checks
 3. Kick off a network fetch
 4. Return

Okay. So that fails for XMLHttpRequest :-( But if we made it part of
resolve that could work.


--
http://annevankesteren.nl/



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-06 Thread Jonas Sicking
On Mon, May 6, 2013 at 4:28 PM, Anne van Kesteren ann...@annevk.nl wrote:
 On Sun, May 5, 2013 at 5:37 PM, Jonas Sicking jo...@sicking.cc wrote:
 What we do is that we

 1. Resolve the URL against the current base URL
 2. Perform some security checks
 3. Kick off a network fetch
 4. Return

 Okay. So that fails for XMLHttpRequest :-(

What do you mean? Those are the steps we take for XHR requests too.

/ Jonas



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-06 Thread Anne van Kesteren
On Mon, May 6, 2013 at 5:45 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Mon, May 6, 2013 at 4:28 PM, Anne van Kesteren ann...@annevk.nl wrote:
 Okay. So that fails for XMLHttpRequest :-(

 What do you mean? Those are the steps we take for XHR requests too.

So e.g. open() needs to do URL parsing (per XHR spec), send() would
cause CSP to fail (per CSP spec), send() also does the fetch (per XHR
spec). Overall it seems like a different model from the other APIs,
but maybe I'm missing something?


--
http://annevankesteren.nl/



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-06 Thread Glenn Maynard
On Mon, May 6, 2013 at 7:57 PM, Anne van Kesteren ann...@annevk.nl wrote:

 On Mon, May 6, 2013 at 5:45 PM, Jonas Sicking jo...@sicking.cc wrote:
  On Mon, May 6, 2013 at 4:28 PM, Anne van Kesteren ann...@annevk.nl wrote:
  Okay. So that fails for XMLHttpRequest :-(
 
  What do you mean? Those are the steps we take for XHR requests too.

 So e.g. open() needs to do URL parsing (per XHR spec), send() would
 cause CSP to fail (per CSP spec), send() also does the fetch (per XHR
 spec). Overall it seems like a different model from the other APIs,
 but maybe I'm missing something?


XHR isn't so different from other APIs, it's just that the separation
between "URL enters the API" and "the fetch is started" is more obvious,
and more easily controlled from script.  I think that makes it a really
good test case.

-- 
Glenn Maynard


Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-06 Thread Eric U
On Wed, May 1, 2013 at 5:16 PM, Glenn Maynard gl...@zewt.org wrote:
 On Wed, May 1, 2013 at 7:01 PM, Eric U er...@google.com wrote:

 Hmm...now Glenn points out another problem: if you /never/ load the
 image, for whatever reason, you can still leak it.  How likely is that
 in good code, though?  And is it worse than the current state in good
 or bad code?


 I think it's much too easy for well-meaning developers to mess this up.  The
 example I gave is code that *does* use the URL, but the browser may or may
 not actually do anything with it.  (I wouldn't even call that author
 error--it's an interoperability failure.)  Also, the failures are both
 expensive and subtle (eg. lots of big blobs being silently leaked to disk),
 which is a pretty nasty failure mode.

True.

 Another problem is that APIs should be able to receive a URL, then use it
 multiple times.  For example, srcset can change the image being displayed
 when the environment changes.  oneTimeOnly would be weird in that case.  For
 example, it would work when you load your page on a tablet, then work again
 when your browser outputs the display to a TV and changes the srcset image.
 (The image was never used, so the URL is still valid.)  But then when you go
 back to the tablet screen and reconfigure back to the original
 configuration, it suddenly breaks, since the first URL was already used and
 discarded.  The blob capture approach can be made to work with srcset, so
 this would work reliably.

I'm not really sure what you're saying, here.  If you want an URL to
expire or otherwise be revoked, no, you can't use it multiple times
after that.  If you want it to work multiple times, don't revoke it or
don't set oneTimeOnly.



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-05 Thread Jonas Sicking
On Fri, May 3, 2013 at 6:55 AM, Anne van Kesteren ann...@annevk.nl wrote:
 On Thu, May 2, 2013 at 12:53 AM, Jonas Sicking jo...@sicking.cc wrote:
 It actually has turned out to be surprisingly easy in Gecko. But I
 realize the same might not be true everywhere.

 Can we have a description of this (and how it does not run into the
 problems Glenn mentioned). I feel that I have an incomplete
 understanding of what actually happens at the moment.

 img.src = url

What we do is that we

1. Resolve the URL against the current base URL
2. Perform some security checks
3. Kick off a network fetch
4. Return

Note that no actual network activity happens here. That is all being
done on background threads. But what we do in step 3 is to send the
signal to the network code that it should start doing all the stuff
that it needs to do.

Step 3 is where we inserted the code to grab a reference to the Blob
such that it doesn't matter if the URL is revoked.

Some of this code will change. For example I'd like to move towards
doing the security checks asynchronously. Essentially by making them
part of the stuff that the network code needs to do. But we'll
always need to fire off that algorithm from the main thread, and
generally doing that synchronously is the simplest solution.
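
The four steps above can be sketched in JavaScript as follows. This is a toy model, not Gecko code: `registry`, `createObjectURL`, `revokeObjectURL`, and `setSrc` are hypothetical stand-ins, the base URL is invented, and the security check is a placeholder. The point is only that the Blob is grabbed in step 3, before control returns to script.

```javascript
// Hypothetical in-memory stand-in for the browser's blob URL store.
const registry = new Map();
let nextId = 0;

function createObjectURL(blob) {
  const url = `blob:example/${nextId++}`;
  registry.set(url, blob);
  return url;
}

function revokeObjectURL(url) {
  registry.delete(url);
}

// Models `img.src = url`. Steps 1-4 run synchronously; the returned
// function models the work done later by the network code.
function setSrc(url) {
  const resolved = new URL(url, "https://example.org/").href; // 1. resolve against base
  if (!resolved.startsWith("blob:")) {                         // 2. security check (placeholder)
    throw new Error("only blob: URLs are modelled here");
  }
  const captured = registry.get(resolved) ?? null;             // 3. kick off fetch: grab the Blob
  return () => captured;                                       // 4. return
}

const url = createObjectURL("image bytes");
const finishLoad = setSrc(url);
revokeObjectURL(url);      // revoked right after the assignment
console.log(finishLoad()); // image bytes
```

If the URL is revoked before `setSrc` runs, step 3 finds nothing and the later load yields null.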

/ Jonas



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-05 Thread Glenn Maynard
On Sun, May 5, 2013 at 7:37 PM, Jonas Sicking jo...@sicking.cc wrote:

 What we do is that we

 1. Resolve the URL against the current base URL
 2. Perform some security checks
 3. Kick off a network fetch
 4. Return

 Note that no actual network activity happens here. That is all being
 done on background threads. But what we do in step 3 is to send the
 signal to the network code that it should start doing all the stuff
 that it needs to do.

 Step 3 is where we inserted the code to grab a reference to the Blob
 such that it doesn't matter if the URL is revoked.

 Some of this code will change. For example I'd like to move towards
 doing the security checks asynchronously. Essentially by making them
 part of the stuff that the network code needs to do. But we'll
 always need to fire off that algorithm from the main thread, and
 generally doing that synchronously is the simplest solution.


I think the only difference between this and what I'm suggesting is that
grabbing the blob happens in step #1, instead of step #3.  That way, it
still works if the fetch isn't actually started right away (srcset,
on-demand image loading, xhr.open(), etc).
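
A minimal JavaScript sketch of why step 1 versus step 3 matters when the fetch does not start right away. `FakeRequest` is a hypothetical stand-in for an XHR-like API and `blobStore` for the blob URL store; neither reflects a real implementation.

```javascript
// Hypothetical stand-in for the UA's blob URL store.
const blobStore = new Map();

class FakeRequest {
  // Entry point: the URL enters the API here, so the Blob is captured here.
  open(url) {
    this.url = url;
    this.captured = blobStore.get(url) ?? null;
  }

  // The fetch starts later and never consults the store again.
  send() {
    if (this.captured === null) throw new Error("network error");
    return this.captured;
  }
}

blobStore.set("blob:example/1", "payload");
const req = new FakeRequest();
req.open("blob:example/1");
blobStore.delete("blob:example/1"); // revoked between open() and send()
console.log(req.send()); // payload
```

Had the capture happened inside `send()` instead, the same sequence would have ended in a network error.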

-- 
Glenn Maynard


Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-05 Thread Glenn Maynard
Oops, forgot this was sitting here.

On Fri, May 3, 2013 at 8:55 AM, Anne van Kesteren ann...@annevk.nl wrote:

 Glenn has at times suggested we could make a pertinent reference to
 the Blob object from the URL object you get from the parsing. That
 might work, but requires some special casing of blob URLs and soon
 mediastream URLs (and ...) in a thin wrapper around the URL parser
 which all end points would need to use.


The special casing doesn't seem bad (the specs using it don't need to know
anything about it).  It's the need to insert something into every entry
point that's annoying, but I don't see any way around that.

-- 
Glenn Maynard


Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-03 Thread Anne van Kesteren
On Thu, May 2, 2013 at 12:53 AM, Jonas Sicking jo...@sicking.cc wrote:
 But if we can figure out this problem, then my proposal would be to
 add a new method which has a nicer name than createObjectURL as to
 encourage authors to use that and have fewer leaks.

Yeah, I've been thinking the same thing. URL.create(...) is available.
Or maybe URL.from() or some such.


 It actually has turned out to be surprisingly easy in Gecko. But I
 realize the same might not be true everywhere.

Can we have a description of this (and how it does not run into the
problems Glenn mentioned). I feel that I have an incomplete
understanding of what actually happens at the moment.

img.src = url

Will at least parse the URL. As you can synchronously request img.src
and it will return a serialization of the parsed URL (or the same
value in case parsing failed). Are any other algorithms synchronously
invoked?

img.style.backgroundImage = "url(" + url + ")"

Similarly seems to require (apparent) synchronous parsing.

Glenn has at times suggested we could make a pertinent reference to
the Blob object from the URL object you get from the parsing. That
might work, but requires some special casing of blob URLs and soon
mediastream URLs (and ...) in a thin wrapper around the URL parser
which all end points would need to use.

What am I missing?


--
http://annevankesteren.nl/



Blob URLs | autoRevoke, defaults, and resolutions

2013-05-01 Thread Arun Ranganathan
At the recent TPAC for Working Groups held in San Jose, Adrian Bateman, Jonas 
Sicking and I spent some time taking a look at how to remedy what the spec. 
says today about Blob URLs, both from the perspective of default behavior and 
in terms of what correct autoRevoke behavior should be.  This email is to 
summarize those discussions.

Blob URLs are used in different parts of the platform today, and are expected 
to work on the platform wherever URLs do.  This includes CSS, MediaStream and 
MediaSource use cases [1], along with use of 'src='.   

(Separate discussions about a v2 of the File API spec, including use of a 
Futures-based model in lieu of the event model, took place, but submitting a 
LCWD with major interoperability amongst all browsers is a good goal for this 
draft.)

Here's a summary of the Blob URL issues:

1. There's the relatively easy question of defaults.  While the spec says that 
URL.createObjectURL should create a Blob URL which has autoRevoke: true by 
default [2], there isn't any implementation that supports this, whether that's 
IE's oneTimeOnly behavior (which is related but different), or Firefox's 
autoRevoke implementation.  Chrome doesn't touch this yet :)

The spec. will roll back the default from true to false.  At least this 
matches what implementations do; there's been resistance to changing the 
default due to shipping applications relying on autoRevoke being false by 
default, or at least implementor reluctance [1].

Switching the default to false would enable IE, Chrome, and Firefox to have
interoperability with URL.createObjectURL(blobArg), though such a default 
places burdens on web developers to couple create* calls with revoke* calls to 
not leak Blobs.  Jonas proposes a separate method, 
URL.createAutoRevokeObjectURL, which creates an autoRevoke URL.  I'm lukewarm 
on that :-\

2. Regardless of the default, there's the hard question of what to do with Blob 
URL revocation.  Glenn / zewt points out that this applies, though perhaps less 
dramatically, to *manually* revoked Blob URLs, and provides some test cases 
[3].  

Options are:

2a. To meticulously special-case Blob URLs, per Bug 17765 [4].  This calls for 
a synchronous step attached to wherever URLs are used to peg Blob URL data at 
fetch, so that the chance of a concurrent revocation doesn't cause things to 
behave unpredictably.  Firefox does a variation of this with keeping channels 
open, but solving this bug interoperably is going to be very hard, and has to 
be done in different places across the platform.  And even within CSS.  This is 
hard to move forward with.

2b. To adopt an 80-20 rule, and only specify what happens for some cases that
seem common, but expressly disallow other cases.  This might be a more muted 
version of Bug 17765, especially if it can't be done within fetch [5].  

This could mean that the blob clause for basic fetch[5] only defines some 
cases where a synchronous fetch can be run (TBD) but expressly disallows others 
where synchronous fetching is not feasible.  This would limit the use of Blob 
URLs pretty drastically, but might be the only solution.  For instance, 
asynchronous calls accompanying embed, defer etc. might have to be 
expressly disallowed.  It would be great if we do this in fetch [5] :-)

Essentially, this might be to do what Firefox does but document what 
dereference means [6], and be clear about what might break.  Most 
implementors acknowledge that use of Blob URLs simply won't work in some cases 
(e.g. CSS cases, etc.).  We should formalize that; it would involve listing 
what works explicitly.  Anne?

2c. Re-use oneTimeOnly as in IE's behavior for autoRevoke (but call it 
autoRevoke).  But we jettisoned this for race conditions e.g.

// This is in IE only
 
img2.src = URL.createObjectURL(fileBlob, {oneTimeOnly: true});

// race now! then fail in IE only
img1.src = img2.src;

will fail in IE with oneTimeOnly.  It appears to fail reliably, but again, 
dereference URL may not be interoperable here.  This is probably not what we 
should do, but it was worth listing, since it carries the brute force of a 
shipping implementation, and shows how some % of the market has actively solved 
this problem :)

3. We can lift origin restrictions in v2 on Blob URL; currently, one shipping 
implementation (IE) actively relies on origin restrictions, but expressed 
willingness to phase this out.  Most use cases needing Blob data across origins 
can be met without needing Blob URLs to not be origin restricted.  Blob URLs 
must be unguessable for that to happen, and today, they aren't unguessable in 
some implementations.
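
To make the oneTimeOnly failure in 2c concrete, here is a minimal JavaScript model of a URL that is consumed by its first dereference. It is a sketch under assumptions, not IE's implementation: `createOneTimeURL`, `dereference`, and `oneTimeStore` are hypothetical, and in this model the first caller always wins, whereas in a real browser which load gets there first is exactly the race.

```javascript
// Hypothetical store for one-time blob URLs.
const oneTimeStore = new Map();
let counter = 0;

function createOneTimeURL(blob) {
  const url = `blob:example/${counter++}`;
  oneTimeStore.set(url, blob);
  return url;
}

// Models a load: the first dereference consumes the URL.
function dereference(url) {
  if (!oneTimeStore.has(url)) return null; // already used, or never created
  const blob = oneTimeStore.get(url);
  oneTimeStore.delete(url);
  return blob;
}

const img2Src = createOneTimeURL("file bytes");
const img1Src = img2Src;           // img1.src = img2.src
console.log(dereference(img2Src)); // file bytes
console.log(dereference(img1Src)); // null
```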

-- A*


[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=19594
[2] http://dev.w3.org/2006/webapi/FileAPI/#creating-revoking
[3] http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0294.html
[4] https://www.w3.org/Bugs/Public/show_bug.cgi?id=17765
[5] http://fetch.spec.whatwg.org/#concept-fetch
[6] 

Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-01 Thread Eric U
On Wed, May 1, 2013 at 3:36 PM, Arun Ranganathan a...@mozilla.com wrote:
 At the recent TPAC for Working Groups held in San Jose, Adrian Bateman, Jonas 
 Sicking and I spent some time taking a look at how to remedy what the spec. 
 says today about Blob URLs, both from the perspective of default behavior and 
 in terms of what correct autoRevoke behavior should be.  This email is to 
 summarize those discussions.

 Blob URLs are used in different parts of the platform today, and are expected 
 to work on the platform wherever URLs do.  This includes CSS, MediaStream and 
 MediaSource use cases [1], along with use of 'src='.

 (Separate discussions about a v2 of the File API spec, including use of a 
 Futures-based model in lieu of the event model, took place, but submitting a 
 LCWD with major interoperability amongst all browsers is a good goal for this 
 draft.)

 Here's a summary of the Blob URL issues:

 1. There's the relatively easy question of defaults.  While the spec says 
 that URL.createObjectURL should create a Blob URL which has autoRevoke: true 
 by default [2], there isn't any implementation that supports this, whether 
 that's IE's oneTimeOnly behavior (which is related but different), or 
 Firefox's autoRevoke implementation.  Chrome doesn't touch this yet :)

 The spec. will roll back the default from true to false.  At least this 
 matches what implementations do; there's been resistance to changing the 
 default due to shipping applications relying on autoRevoke being false by 
 default, or at least implementor reluctance [1].

Sounds good.  Let's just be consistent.

 Switching the default to false would enable IE, Chrome, and Firefox to have
 interoperability with URL.createObjectURL(blobArg), though such a default 
 places burdens on web developers to couple create* calls with revoke* calls 
 to not leak Blobs.  Jonas proposes a separate method, 
 URL.createAutoRevokeObjectURL, which creates an autoRevoke URL.  I'm lukewarm 
 on that :-\

I'd support a new method with a different default, if we could figure
out a reasonable thing for that new method to do.

 2. Regardless of the default, there's the hard question of what to do with 
 Blob URL revocation.  Glenn / zewt points out that this applies, though 
 perhaps less dramatically, to *manually* revoked Blob URLs, and provides some 
 test cases [3].

 Options are:

 2a. To meticulously special-case Blob URLs, per Bug 17765 [4].  This calls 
 for a synchronous step attached to wherever URLs are used to peg Blob URL 
 data at fetch, so that the chance of a concurrent revocation doesn't cause 
 things to behave unpredictably.  Firefox does a variation of this with 
 keeping channels open, but solving this bug interoperably is going to be very 
 hard, and has to be done in different places across the platform.  And even 
 within CSS.  This is hard to move forward with.

Hard.

 2b. To adopt an 80-20 rule, and only specify what happens for some cases that
 seem common, but expressly disallow other cases.  This might be a more muted 
 version of Bug 17765, especially if it can't be done within fetch [5].

Ugly.

 This could mean that the blob clause for basic fetch[5] only defines some 
 cases where a synchronous fetch can be run (TBD) but expressly disallows 
 others where synchronous fetching is not feasible.  This would limit the use 
 of Blob URLs pretty drastically, but might be the only solution.  For 
 instance, asynchronous calls accompanying embed, defer etc. might have to 
 be expressly disallowed.  It would be great if we do this in fetch [5] :-)

Just to be clear, this would limit the use of *autoRevoke* Blob URLs,
not all Blob URLs, yes?

 Essentially, this might be to do what Firefox does but document what 
 dereference means [6], and be clear about what might break.  Most 
 implementors acknowledge that use of Blob URLs simply won't work in some 
 cases (e.g. CSS cases, etc.).  We should formalize that; it would involve 
 listing what works explicitly.  Anne?

 2c. Re-use oneTimeOnly as in IE's behavior for autoRevoke (but call it 
 autoRevoke).  But we jettisoned this for race conditions e.g.

 // This is in IE only

 img2.src = URL.createObjectURL(fileBlob, {oneTimeOnly: true});

 // race now! then fail in IE only
 img1.src = img2.src;

 will fail in IE with oneTimeOnly.  It appears to fail reliably, but again, 
 dereference URL may not be interoperable here.  This is probably not what 
 we should do, but it was worth listing, since it carries the brute force of a 
 shipping implementation, and shows how some % of the market has actively 
 solved this problem :)

I'm not really sure this is so bad.  I know it's the case I brought
up, and I must admit that I disliked the oneTimeOnly when I first
heard about it, but all other proposals [including not having
automatic revocation at all] now seem worse.  Here you've set
something to be oneTimeOnly and used it twice; if that fails in IE,
that's correct.  If it works some of 

Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-01 Thread Jonas Sicking
On Wed, May 1, 2013 at 4:25 PM, Eric U er...@google.com wrote:
 On Wed, May 1, 2013 at 3:36 PM, Arun Ranganathan a...@mozilla.com wrote:
 Switching the default to false would enable IE, Chrome, and Firefox to have
 interoperability with URL.createObjectURL(blobArg), though such a default 
 places burdens on web developers to couple create* calls with revoke* calls 
 to not leak Blobs.  Jonas proposes a separate method, 
 URL.createAutoRevokeObjectURL, which creates an autoRevoke URL.  I'm 
 lukewarm on that :-\

 I'd support a new method with a different default, if we could figure
 out a reasonable thing for that new method to do.

Yeah, the if-condition here is quite important.

But if we can figure out this problem, then my proposal would be to
add a new method which has a nicer name than createObjectURL as to
encourage authors to use that and have fewer leaks.
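
For context, the leak comes from createObjectURL calls that are never paired with revokeObjectURL. A minimal JavaScript sketch of pairing them, assuming a hypothetical helper named `withObjectURL` (it is not one of the methods proposed in this thread); the `api` parameter and `stubApi` exist only so the example can run against a stub outside a browser.

```javascript
// Hypothetical helper: create a blob URL, hand it to `use`, always revoke it.
// `api` defaults to the global URL object; a stub can be passed instead.
function withObjectURL(blob, use, api = URL) {
  const url = api.createObjectURL(blob);
  try {
    return use(url);
  } finally {
    api.revokeObjectURL(url); // runs even if `use` throws
  }
}

// Stub standing in for URL.createObjectURL / URL.revokeObjectURL.
const live = new Set();
let stubCounter = 0;
const stubApi = {
  createObjectURL() {
    const url = `blob:example/${stubCounter++}`;
    live.add(url);
    return url;
  },
  revokeObjectURL(url) {
    live.delete(url);
  },
};

const seen = withObjectURL("blob data", (url) => live.has(url), stubApi);
console.log(seen, live.size); // true 0
```

This only fits consumers that are finished with the URL by the time `use` returns. Handing the URL to something like img.src and revoking immediately runs straight into the revocation-timing question discussed here.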

 2. Regardless of the default, there's the hard question of what to do with 
 Blob URL revocation.  Glenn / zewt points out that this applies, though 
 perhaps less dramatically, to *manually* revoked Blob URLs, and provides 
 some test cases [3].

 Options are:

 2a. To meticulously special-case Blob URLs, per Bug 17765 [4].  This calls 
 for a synchronous step attached to wherever URLs are used to peg Blob URL 
 data at fetch, so that the chance of a concurrent revocation doesn't cause 
 things to behave unpredictably.  Firefox does a variation of this with 
 keeping channels open, but solving this bug interoperably is going to be 
 very hard, and has to be done in different places across the platform.  And 
 even within CSS.  This is hard to move forward with.

 Hard.

It actually has turned out to be surprisingly easy in Gecko. But I
realize the same might not be true everywhere.

 2b. To adopt an 80-20 rule, and only specify what happens for some cases that 
 seem common, but expressly disallow other cases.  This might be a more muted 
 version of Bug 17765, especially if it can't be done within fetch [5].

 Ugly.

 This could mean that the blob clause for basic fetch [5] only defines 
 some cases where a synchronous fetch can be run (TBD) but expressly 
 disallows others where synchronous fetching is not feasible.  This would 
 limit the use of Blob URLs pretty drastically, but might be the only 
 solution.  For instance, asynchronous calls accompanying embed, defer 
 etc. might have to be expressly disallowed.  It would be great if we do this 
 in fetch [5] :-)

 Just to be clear, this would limit the use of *autoRevoke* Blob URLs,
 not all Blob URLs, yes?

No, it would limit the use of all *revokable* Blob URLs. Since you get
exactly the same issues when the page calls revokeObjectURL manually.
So that means that it applies to all Blob URLs.

 2c. Re-use oneTimeOnly as in IE's behavior for autoRevoke (but call it 
 autoRevoke).  But we jettisoned this for race conditions e.g.

 // This is in IE only

 img2.src = URL.createObjectURL(fileBlob, {oneTimeOnly: true});

 // race now! then fail in IE only
 img1.src = img2.src;

 will fail in IE with oneTimeOnly.  It appears to fail reliably, but again, 
 dereference URL may not be interoperable here.  This is probably not what 
 we should do, but it was worth listing, since it carries the brute force of 
 a shipping implementation, and shows how some % of the market has actively 
 solved this problem :)

 I'm not really sure this is so bad.  I know it's the case I brought
 up, and I must admit that I disliked the oneTimeOnly when I first
 heard about it, but all other proposals [including not having
 automatic revocation at all] now seem worse.  Here you've set
 something to be oneTimeOnly and used it twice; if that fails in IE,
 that's correct.  If it works some of the time in other browsers [after
 they implement oneTimeOnly], that's not good, but you did pretty much
 aim at your own foot.  Developers that actively try to do the right
 thing will have consistent good results without extra code, at least.
 I realize that img1.src = img2.src failing is odd, but as [IIRC]
 Adrian pointed out, if it's an uncacheable image on a server that's
 gone away, couldn't that already happen, depending on your network
 stack implementation?

I'm more worried that if implementations don't initiate the load
synchronously, which is hard per your comment above, then it can
easily be random which of the two loads succeeds and which fails. If
the revoking happens at the end of the load, both loads could even
succeed depending on timing and implementation details.
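
A minimal sketch of that timing dependence, assuming a toy model rather than
any browser's actual loader: each load has a "start" step (look up the blob)
and an "end" step, and the URL is revoked at either the start or the end of
the first successful load. The function name runLoads, the schedule format,
and the URL string are all hypothetical.

```javascript
// Toy model only: a single one-time URL and two image loads.
// schedule is a list like "img2:start"; revokeAt is "start" or "end".
function runLoads(schedule, revokeAt) {
  const url = "blob:example/0";
  const store = new Map([[url, "image bytes"]]);
  const results = {};
  for (const step of schedule) {
    const [img, phase] = step.split(":");
    if (phase === "start") {
      // A load succeeds only if the URL is still registered when it starts.
      results[img] = store.has(url);
      if (revokeAt === "start") store.delete(url);
    } else if (revokeAt === "end" && results[img]) {
      // Revoking at the end of a successful load leaves a window open.
      store.delete(url);
    }
  }
  return results;
}

// Same script, three possible schedules:
const img2First = runLoads(["img2:start", "img2:end", "img1:start", "img1:end"], "start");
const img1First = runLoads(["img1:start", "img1:end", "img2:start", "img2:end"], "start");
const overlapped = runLoads(["img2:start", "img1:start", "img2:end", "img1:end"], "end");
```

In this model img2First has only img2 succeeding, img1First has only img1
succeeding, and overlapped has both succeeding, purely as a function of the
order in which the steps run.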

/ Jonas



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-01 Thread Glenn Maynard
On Wed, May 1, 2013 at 5:36 PM, Arun Ranganathan a...@mozilla.com wrote:

 2a. To meticulously special-case Blob URLs, per Bug 17765 [4].  This calls
 for a synchronous step attached to wherever URLs are used to peg Blob URL
 data at fetch, so that the chance of a concurrent revocation doesn't cause
 things to behave unpredictably.  Firefox does a variation of this with
 keeping channels open, but solving this bug interoperably is going to be
 very hard, and has to be done in different places across the platform.  And
 even within CSS.  This is hard to move forward with.

 2b. To adopt an 80-20 rule, and only specify what happens for some cases
 that seem common, but expressly disallow other cases.  This might be a more
 muted version of Bug 17765, especially if it can't be done within fetch [5].


I'm okay with limiting this in cases where it's particularly hard to
define.  In particular, it seems like placing a hook in CSS in any
deterministic way is hard, at least today: from what I understand, the time
CSS parsing happens is unspecified.

However, we probably can't break non-autorevoke blob URLs with CSS.  So,
I'd propose:

- Start by defining that auto-revoke blob URLs may only be used with APIs
that explicitly capture the blob (putting aside the mechanics of how we do
that, for now).  Blob capture would still affect non-autorevoke blob URLs,
since it fixes race conditions, but an uncaptured blob URL would continue
to work with non-autorevoke URLs.
- Apply blob capture to one or two test cases.  I think XHR is a good place
for this, because it's easy to test, due to the xhr.open() and xhr.send()
split.  xhr.open() is where blob capture should happen, and xhr.send() is
where the fetch happens.
- Once people are comfortable with how it works, start applying it to other
major blob URL cases (eg. img).  Whether to apply it broadly to all APIs
next or not is something that could be decided at this point.

This will make autorevoke blob URLs work, gradually fix manual-revoke blob
URLs as a side-effect, and leave manual-revoke URLs unspecified but
functional for the remaining cases.  It also doesn't require us to dive in
head-first and try to apply this to every API on the platform all at once,
which nobody wants to do; it lets us test it out, then apply it to more
APIs at whatever pace makes sense.
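
A rough sketch of the open()/send() split described above, assuming a toy
in-memory URL store rather than real XHR: the classes BlobUrlStore and
CapturingRequest are hypothetical names used only for illustration, and a
plain string stands in for a Blob.

```javascript
// Toy model of a blob URL store; not how any browser implements it.
class BlobUrlStore {
  constructor() {
    this.entries = new Map();
    this.nextId = 0;
  }
  create(blob) {
    const url = `blob:example/${this.nextId++}`;
    this.entries.set(url, blob);
    return url;
  }
  revoke(url) {
    this.entries.delete(url);
  }
  resolve(url) {
    return this.entries.has(url) ? this.entries.get(url) : null;
  }
}

// Hypothetical request object: open() captures the blob, send() fetches.
class CapturingRequest {
  constructor(store) {
    this.store = store;
    this.captured = null;
  }
  open(url) {
    // Blob capture: resolve the URL synchronously at the point of entry.
    this.captured = this.store.resolve(url);
  }
  send() {
    // The fetch reads the captured blob, so a later revoke cannot affect it.
    if (this.captured === null) throw new Error("network error");
    return this.captured;
  }
}

const store = new BlobUrlStore();
const url = store.create("payload");
const req = new CapturingRequest(store);
req.open(url);
store.revoke(url);        // revoked between open() and send()
const body = req.send();  // still returns the captured blob
```

A request opened after the revoke captures nothing and fails in send(), so
the outcome depends only on the order of open() and revoke, not on when the
fetch itself runs.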

(I don't know any way to deal with the CSS case.)



 2c. Re-use oneTimeOnly as in IE's behavior for autoRevoke (but call it
 autoRevoke).  But we jettisoned this for race conditions e.g.

 // This is in IE only

 img2.src = URL.createObjectURL(fileBlob, {oneTimeOnly: true});

 // race now! then fail in IE only
 img1.src = img2.src;

 will fail in IE with oneTimeOnly.  It appears to fail reliably, but again,
 dereference URL may not be interoperable here.  This is probably not what
 we should do, but it was worth listing, since it carries the brute force of
 a shipping implementation, and shows how some % of the market has actively
 solved this problem :)


There are a lot of problems with oneTimeOnly.  It's very easy for the URL
to never actually be used, which results in a subtle and expensive blob
leak.  For example, this:

setInterval(function() {
    img.src = URL.createObjectURL(createBlob(), {oneTimeOnly: true});
}, 100);

might leak 10 blobs per second: a browser that obtains images on demand
might never fetch the blob, so the URL is never used up, while a browser
that obtains images immediately wouldn't leak.
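
To make the difference concrete, here is a small simulation, assuming a toy
store in which a oneTimeOnly URL is released on its first dereference. The
names OneTimeStore and simulate are hypothetical, and the "lazy" case is
simplified to a browser that never fetches a replaced URL.

```javascript
// Toy model: a oneTimeOnly URL pins its blob until it is dereferenced once.
class OneTimeStore {
  constructor() {
    this.entries = new Map();
    this.nextId = 0;
  }
  create(blob) {
    const url = `blob:example/${this.nextId++}`;
    this.entries.set(url, blob);
    return url;
  }
  dereference(url) {
    const blob = this.entries.has(url) ? this.entries.get(url) : null;
    this.entries.delete(url); // first use releases the URL
    return blob;
  }
}

// Returns how many blobs are still pinned after `ticks` timer callbacks.
function simulate(ticks, fetchesEagerly) {
  const store = new OneTimeStore();
  for (let i = 0; i < ticks; i++) {
    const url = store.create(`frame ${i}`);
    // An eager browser fetches on assignment; a lazy one may never fetch.
    if (fetchesEagerly) store.dereference(url);
  }
  return store.entries.size;
}

const pinnedEager = simulate(10, true);  // nothing left pinned
const pinnedLazy = simulate(10, false);  // every blob is still pinned
```

With ten callbacks per second, the lazy case in this model pins ten blobs per
second indefinitely, while the eager case pins none.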

-- 
Glenn Maynard


Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-01 Thread Eric U
On Wed, May 1, 2013 at 4:53 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, May 1, 2013 at 4:25 PM, Eric U er...@google.com wrote:
 On Wed, May 1, 2013 at 3:36 PM, Arun Ranganathan a...@mozilla.com wrote:
 Switching the default to false would enable IE, Chrome, and Firefox to 
 have interoperability with URL.createObjectURL(blobArg), though such a 
 default places burdens on web developers to couple create* calls with 
 revoke* calls to not leak Blobs.  Jonas proposes a separate method, 
 URL.createAutoRevokeObjectURL, which creates an autoRevoke URL.  I'm 
 lukewarm on that :-\

 I'd support a new method with a different default, if we could figure
 out a reasonable thing for that new method to do.

 Yeah, the if-condition here is quite important.

 But if we can figure out this problem, then my proposal would be to
 add a new method which has a nicer name than createObjectURL, so as to
 encourage authors to use that and have fewer leaks.

Heh; I wasn't even going to mention the name.

 2. Regardless of the default, there's the hard question of what to do with 
 Blob URL revocation.  Glenn / zewt points out that this applies, though 
 perhaps less dramatically, to *manually* revoked Blob URLs, and provides 
 some test cases [3].

 Options are:

 2a. To meticulously special-case Blob URLs, per Bug 17765 [4].  This calls 
 for a synchronous step attached to wherever URLs are used to peg Blob URL 
 data at fetch, so that the chance of a concurrent revocation doesn't cause 
 things to behave unpredictably.  Firefox does a variation of this with 
 keeping channels open, but solving this bug interoperably is going to be 
 very hard, and has to be done in different places across the platform.  And 
 even within CSS.  This is hard to move forward with.

 Hard.

 It actually has turned out to be surprisingly easy in Gecko. But I
 realize the same might not be true everywhere.

Right, and defining just when it happens, across browsers, may also be hard.

 2b. To adopt an 80-20 rule, and only specify what happens for some cases 
 that seem common, but expressly disallow other cases.  This might be a more 
 muted version of Bug 17765, especially if it can't be done within fetch [5].

 Ugly.

 This could mean that the blob clause for basic fetch [5] only defines 
 some cases where a synchronous fetch can be run (TBD) but expressly 
 disallows others where synchronous fetching is not feasible.  This would 
 limit the use of Blob URLs pretty drastically, but might be the only 
 solution.  For instance, asynchronous calls accompanying embed, defer 
 etc. might have to be expressly disallowed.  It would be great if we do 
 this in fetch [5] :-)

 Just to be clear, this would limit the use of *autoRevoke* Blob URLs,
 not all Blob URLs, yes?

 No, it would limit the use of all *revokable* Blob URLs. Since you get
 exactly the same issues when the page calls revokeObjectURL manually.
 So that means that it applies to all Blob URLs.

Ah, right; all revoked Blob URLs.

 2c. Re-use oneTimeOnly as in IE's behavior for autoRevoke (but call it 
 autoRevoke).  But we jettisoned this for race conditions e.g.

 // This is in IE only

 img2.src = URL.createObjectURL(fileBlob, {oneTimeOnly: true});

 // race now! then fail in IE only
 img1.src = img2.src;

 will fail in IE with oneTimeOnly.  It appears to fail reliably, but again, 
 dereference URL may not be interoperable here.  This is probably not what 
 we should do, but it was worth listing, since it carries the brute force of 
 a shipping implementation, and shows how some % of the market has actively 
 solved this problem :)

 I'm not really sure this is so bad.  I know it's the case I brought
 up, and I must admit that I disliked the oneTimeOnly when I first
 heard about it, but all other proposals [including not having
 automatic revocation at all] now seem worse.  Here you've set
 something to be oneTimeOnly and used it twice; if that fails in IE,
 that's correct.  If it works some of the time in other browsers [after
 they implement oneTimeOnly], that's not good, but you did pretty much
 aim at your own foot.  Developers that actively try to do the right
 thing will have consistent good results without extra code, at least.
 I realize that img1.src = img2.src failing is odd, but as [IIRC]
 Adrian pointed out, if it's an uncacheable image on a server that's
 gone away, couldn't that already happen, depending on your network
 stack implementation?

 I'm more worried that if implementations don't initiate the load
 synchronously, which is hard per your comment above, then it can
 easily be random which of the two loads succeeds and which fails. If
 the revoking happens at the end of the load, both loads could even
 succeed depending on timing and implementation details.

Yup; I'm just saying that if you get a failure here, you shouldn't be
surprised, no matter which img gets it.  You did something explicitly
wrong.  Ideally we'd give predictable behavior, but if we can't do

Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-01 Thread Glenn Maynard
On Wed, May 1, 2013 at 7:01 PM, Eric U er...@google.com wrote:

 Hmm...now Glenn points out another problem: if you /never/ load the
 image, for whatever reason, you can still leak it.  How likely is that
 in good code, though?  And is it worse than the current state in good
 or bad code?


I think it's much too easy for well-meaning developers to mess this up.
The example I gave is code that *does* use the URL, but the browser may or
may not actually do anything with it.  (I wouldn't even call that author
error--it's an interoperability failure.)  Also, the failures are both
expensive and subtle (eg. lots of big blobs being silently leaked to disk),
which is a pretty nasty failure mode.

Another problem is that APIs should be able to receive a URL, then use it
multiple times.  For example, srcset can change the image being displayed
when the environment changes.  oneTimeOnly would be weird in that case.
For example, it would work when you load your page on a tablet, then work
again when your browser outputs the display to a TV and changes the srcset
image.  (The image was never used, so the URL is still valid.)  But then
when you go back to the tablet screen and reconfigure back to the original
configuration, it suddenly breaks, since the first URL was already used and
discarded.  The blob capture approach can be made to work with srcset, so
this would work reliably.
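
The tablet/TV/tablet sequence can be sketched as follows, assuming a toy model
that is not the real srcset algorithm: the function names and the blob:tablet
and blob:tv URLs are hypothetical, and plain strings stand in for image blobs.

```javascript
// Build a toy URL store from a { url: blob } object.
function makeStore(blobs) {
  return new Map(Object.entries(blobs));
}

// oneTimeOnly: every environment switch dereferences the selected URL,
// and a URL is dropped after its first use.
function switchWithOneTimeOnly(store, sequence) {
  return sequence.map((url) => {
    const ok = store.has(url);
    store.delete(url);
    return ok;
  });
}

// Blob capture: resolve every candidate once, when the list is assigned.
function switchWithCapture(store, urls, sequence) {
  const captured = new Map(urls.map((url) => [url, store.get(url)]));
  urls.forEach((url) => store.delete(url)); // URLs may be revoked right away
  return sequence.map((url) => captured.get(url) !== undefined);
}

// Tablet, then TV, then back to the tablet.
const sequence = ["blob:tablet", "blob:tv", "blob:tablet"];
const oneTime = switchWithOneTimeOnly(
  makeStore({ "blob:tablet": "small", "blob:tv": "large" }), sequence);
const capture = switchWithCapture(
  makeStore({ "blob:tablet": "small", "blob:tv": "large" }),
  ["blob:tablet", "blob:tv"], sequence);
```

In this model oneTime comes out as [true, true, false], with the failure on
the return to the tablet, while capture is [true, true, true].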

-- 
Glenn Maynard