[fileapi-directories-and-system/filewriter]

2014-04-02 Thread Eric U
Status:

 The specs are clearly dead; it's just been way down on my
priority list to do anything about it.  We should funnel it off to be
a Note [or whatever the proper procedure is--Art?].

  Eric



Re: IndexedDB: Syntax for specifying persistent/temporary storage

2013-12-13 Thread Eric U
Good writeup, Jonas--I think you've hit the major points.

I think numeric priorities are both overkill and underpowered,
depending upon their specific implementation.  Without the promise
we're currently making for Persistent storage [this will never be
cleared unless you do it or the user explicitly requests it], numeric
priorities are ultimately weaker than apps want.  Unless we say that
the top priority is the same as persistent, in which case we've added
complexity without taking any away.

The idea of Default is kind of appealing, and easy to migrate over to,
but I'm not sure it's necessary.  As Kinuko says, we can just unlock
Persistent storage for apps on install, and let them migrate over
whichever data needs it.  This would work better if we supplied a tool
to do an atomic migration, though--using the current APIs, apps would
have to use 2x their storage during the transition, and browser
developers might be able to implement it internally with a simple flag
change or directory rename.

I don't have a strong opinion there, but I lean toward just the two
types rather than three.

As for Alex's "please clear up space" event--it's not clear to me how
to do that cleanly for apps that aren't currently loaded, which may
need to talk to servers that aren't currently running, which the user
may never plan to run again, or which require credentials to access
their stored data, etc.


On Wed, Dec 11, 2013 at 7:39 PM, Jonas Sicking jo...@sicking.cc wrote:
 Hi All,

 Thanks Jan for sending this.

 Now let me throw a giant wrench into this discussion :-)

 Unfortunately as we've been discussing webapps, manifests etc at
 mozilla I've slowly come to the realization that the
 temporary/persistent categorization isn't really fulfilling all the
 envisioned use cases.

 The background is that multiple platforms are now building the
 functionality to run normal websites outside of the browser.

 iOS was one of the first popular implementations of this. If you put a
 <meta name="apple-mobile-web-app-capable" content="yes"> in the markup
 of a page, and the user uses the "bookmark to homescreen" feature in iOS
 Safari, that almost turns the website into an app [1].

 Google is currently working on implementing the same feature in Chrome
 for Android. At mozilla we created a proposal [2] for what is
 essentially a standardized version of the same idea.

 I think this approach is a really awesome use of the web and something
 that I'm very interested in supporting when designing these storage
 APIs.

 To support this use case, I think it needs to be possible for a
 website to first start as a website which the user only has a casual
 connection with, then gradually grow into something that the user
 essentially treats as a trusted app.

 Such a trusted app should have much more ability to store data without
 having to ask the user for permission, or without that data being
 suddenly deleted because we're low on disk space. In short, such an
 app should be treated more like a native app when it comes to storage.

 There are a few ways we can enable this use case. In the discussion
 below I'll use IndexedDB as an example of storage API, but it applies
 to all storage APIs equally.

 A)
 The temporary/persistent split almost enables this. We could say
 that when something that's a normal website stores data in temporary
 storage we count that data towards both per-origin and global quotas.
 If the global quota fills up, then we silently delete data from
 websites in an LRU fashion.

 If the user converts the website to an app by using "bookmark to
 homescreen" then we simply start treating the data stored in the
 temporary storage as persistent. I.e. we don't count it towards the
 global temporary-storage quota and we never delete it in order to make
 room for other websites.

 For persistent databases we would for normal websites put up a
 prompt (I'll leave out details like if this happens only when the
 quota API is used, or if can happen when the database is being written
 to). If persistent storage is used by a bookmarked app we simply
 would not prompt. In neither case would data stored in persistent
 storage ever be silently deleted in order to make room for other
 storage.

 The problem with this solution is that it doesn't give bookmarked apps
 the ability to create truly temporary data. Even data that a
 bookmarked app puts in the temporary storage is effectively treated
 as persistent and so not deleted if we start to run low on disk space.
 Temporary storage for apps is a feature that Android has, and that to
 some extent *nix OSs have had through use of /tmp. It definitely is
 something that seems nice for constrained mobile devices.
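
 Option A can be sketched as a toy model. This is only an illustration
 under stated assumptions: the class and method names (TemporaryStore,
 write, bookmark) are hypothetical, eviction is per origin, and the
 per-origin quota is left out to keep the sketch short.

```javascript
// Toy model of option A, not a browser API. Temporary data counts against a
// global quota and is evicted per origin in least-recently-used order;
// bookmarked origins are exempt. All names here are hypothetical.
class TemporaryStore {
  constructor(globalQuota) {
    this.globalQuota = globalQuota;
    // origin -> { bytes, bookmarked }; a Map iterates in insertion order,
    // so the first entry is always the least recently used one.
    this.origins = new Map();
  }

  // Record a write and mark the origin as most recently used.
  write(origin, bytes) {
    const entry = this.origins.get(origin) || { bytes: 0, bookmarked: false };
    entry.bytes += bytes;
    this.origins.delete(origin); // re-insert to move it to the "recent" end
    this.origins.set(origin, entry);
    return this.evictIfNeeded(origin);
  }

  // "Bookmark to homescreen": stop counting this origin and never evict it.
  bookmark(origin) {
    const entry = this.origins.get(origin);
    if (entry) entry.bookmarked = true;
  }

  // Bytes that count towards the global temporary-storage quota.
  usage() {
    let total = 0;
    for (const e of this.origins.values()) {
      if (!e.bookmarked) total += e.bytes;
    }
    return total;
  }

  // Silently delete LRU, non-bookmarked origins until usage fits the quota.
  // The origin that was just written is skipped, so it is never evicted here.
  evictIfNeeded(justWritten) {
    const evicted = [];
    for (const [origin, e] of this.origins) {
      if (this.usage() <= this.globalQuota) break;
      if (e.bookmarked || origin === justWritten) continue;
      this.origins.delete(origin);
      evicted.push(origin);
    }
    return evicted;
  }
}
```

 Note that a bookmarked origin in this model cannot create data that is
 eligible for eviction, which is exactly the limitation described above.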

 B)
 We could create a temporary/default/persistent split. I.e. we
 create three different storage categories.

 The default is what's used if no storage category is explicitly
 specified when an IDB database is created. For normal webpages default is
 treated like temporary. I.e. it is counted towards 

Re: FileSystem API

2013-08-19 Thread Eric U
OK, I just finished making my way through the public-script-coord
thread [I'm not on that list, but someone pointed me to it].  I have
no official objections to you editing a spec based on Jonas's
proposal, but I do have a couple of questions:

1) Why is this on public-script-coord instead of public-webapps?
2) Is any vendor other than Mozilla actually interested in this
proposal?  When it was brought up on public-webapps, and at the
WebApps F2F, it dropped with a resounding thud.

Given the standardization failure of the Chrome FileSystem API, this
could be a massive waste of time.  Or it could just be a way for
Mozilla to document its filesystem API, since we've already got
documentation of the Chrome API, but then you don't need to drag
public-script-coord into that.

I may have a few small bits of feedback on the color of the bikeshed,
but mostly I'm going to stay out of it, lest I accidentally give the
impression that we're going to implement it.  As I stated at the F2F,
we'll be the last ones to do it, but if 2 major browser vendors ship
it first, we'll certainly consider it.

On Mon, Aug 19, 2013 at 3:11 PM, Arun Ranganathan a...@mozilla.com wrote:
 Greetings Eric and WG,

 The Chair and I were discussing setting up repositories for the 
 specifications discussed here 
 (http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0307.html), 
 notably the FileSystem API and File API v2.  Before creating a repository to 
 edit the FileSystem API, we thought we'd check with you about the first 
 proposal, which Chrome implements, and get the Google perspective.

 You've edited the first FileSystem API proposal, which currently lives here 
 (http://www.w3.org/TR/file-system-api/).  Can I create a repository and edit 
 the other proposal for FileSystem API, which currently exists as an email 
 thread 
 (http://lists.w3.org/Archives/Public/public-script-coord/2013JulSep/0379.html)
  ?

 Just checking to see if there are any objections or concerns that would stop 
 a draft or future WG activity.  Of course, technical nits should be heard as 
 well, and can proceed concurrently with a draft :)

 -- A*



Re: ZIP archive API?

2013-05-06 Thread Eric U
On Mon, May 6, 2013 at 5:03 AM, Glenn Maynard gl...@zewt.org wrote:
 On Mon, May 6, 2013 at 6:27 AM, Robin Berjon ro...@w3.org wrote:

 Another question to take into account here is whether this should only be
 about zip. One of the limitations of zip archives is that they aren't
 streamable. Without boiling the ocean, adding support for a streamable
 format (which I don't think needs be more complex than tar) would be a big
 plus.


 Zips are streamable.  That's what the local file headers are for.
 http://www.pkware.com/documents/casestudies/APPNOTE.TXT

This came up a few years ago; Gregg Tavares explained in [1] that only
/some/ zipfiles are streamable, and you don't know whether yours are
or not until you've seen the whole file.

 Eric

[1] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0362.html
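
One well-documented reason a zip entry can be awkward to stream is
general purpose bit 3 in the local file header: when it is set, the CRC
and sizes are zero in the header and only appear in a data descriptor
after the entry's data. The sketch below checks that bit for the first
entry only. It is an illustration of the header layout in APPNOTE.TXT,
not necessarily the argument made in [1], and the function name is
hypothetical.

```javascript
// Inspect the first local file header of a zip archive held in a Uint8Array.
// Offsets follow the local file header layout in APPNOTE.TXT.
const LOCAL_FILE_HEADER_SIGNATURE = 0x04034b50;
const FLAG_DATA_DESCRIPTOR = 0x0008; // general purpose bit 3
const LOCAL_FILE_HEADER_FIXED_SIZE = 30;

// Returns true if the first entry records its sizes in the local header,
// false if they are deferred to a trailing data descriptor.
function firstEntrySizesKnownUpFront(bytes) {
  if (bytes.length < LOCAL_FILE_HEADER_FIXED_SIZE) {
    throw new Error("too short for a local file header");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // All multi-byte zip fields are little-endian.
  if (view.getUint32(0, true) !== LOCAL_FILE_HEADER_SIGNATURE) {
    throw new Error("does not start with a local file header");
  }
  const flags = view.getUint16(6, true);
  return (flags & FLAG_DATA_DESCRIPTOR) === 0;
}
```

A true result here says nothing about later entries, and the central
directory at the end of the file remains the authoritative index, which
fits the point that you cannot be sure until you have seen the whole file.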



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-06 Thread Eric U
On Wed, May 1, 2013 at 5:16 PM, Glenn Maynard gl...@zewt.org wrote:
 On Wed, May 1, 2013 at 7:01 PM, Eric U er...@google.com wrote:

 Hmm...now Glenn points out another problem: if you /never/ load the
 image, for whatever reason, you can still leak it.  How likely is that
 in good code, though?  And is it worse than the current state in good
 or bad code?


 I think it's much too easy for well-meaning developers to mess this up.  The
 example I gave is code that *does* use the URL, but the browser may or may
 not actually do anything with it.  (I wouldn't even call that author
 error--it's an interoperability failure.)  Also, the failures are both
 expensive and subtle (eg. lots of big blobs being silently leaked to disk),
 which is a pretty nasty failure mode.

True.

 Another problem is that APIs should be able to receive a URL, then use it
 multiple times.  For example, srcset can change the image being displayed
 when the environment changes.  oneTimeOnly would be weird in that case.  For
 example, it would work when you load your page on a tablet, then work again
 when your browser outputs the display to a TV and changes the srcset image.
 (The image was never used, so the URL is still valid.)  But then when you go
 back to the tablet screen and reconfigure back to the original
 configuration, it suddenly breaks, since the first URL was already used and
 discarded.  The blob capture approach can be made to work with srcset, so
 this would work reliably.

I'm not really sure what you're saying, here.  If you want an URL to
expire or otherwise be revoked, no, you can't use it multiple times
after that.  If you want it to work multiple times, don't revoke it or
don't set oneTimeOnly.
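
The semantics being argued about can be modelled with a toy registry.
This is a sketch, not a browser API: BlobUrlRegistry and its methods are
hypothetical names, and dereferencing is synchronous here, whereas the
interoperability concern in this thread is precisely that real loads
are not.

```javascript
// Toy registry modelling a URL that is revoked on its first dereference.
// Not a browser API; every name here is hypothetical.
class BlobUrlRegistry {
  constructor() {
    this.entries = new Map(); // url -> { blob, oneTimeOnly }
    this.nextId = 0;
  }

  // Mint a URL for the blob, optionally valid for a single dereference.
  create(blob, { oneTimeOnly = false } = {}) {
    const url = `blob:example/${this.nextId++}`;
    this.entries.set(url, { blob, oneTimeOnly });
    return url;
  }

  // Manual revocation: the URL stops resolving from now on.
  revoke(url) {
    this.entries.delete(url);
  }

  // Returns the blob, or null if the URL was revoked or already consumed.
  dereference(url) {
    const entry = this.entries.get(url);
    if (!entry) return null;
    if (entry.oneTimeOnly) this.entries.delete(url);
    return entry.blob;
  }
}
```

In this model a second dereference of a oneTimeOnly URL always returns
null, while a URL created without the flag keeps working until revoke()
is called.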



Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-01 Thread Eric U
On Wed, May 1, 2013 at 3:36 PM, Arun Ranganathan a...@mozilla.com wrote:
 At the recent TPAC for Working Groups held in San Jose, Adrian Bateman, Jonas 
 Sicking and I spent some time taking a look at how to remedy what the spec. 
 says today about Blob URLs, both from the perspective of default behavior and 
 in terms of what correct autoRevoke behavior should be.  This email is to 
 summarize those discussions.

 Blob URLs are used in different parts of the platform today, and are expected 
 to work on the platform wherever URLs do.  This includes CSS, MediaStream and 
 MediaSource use cases [1], along with use of 'src='.

 (Separate discussions about a v2 of the File API spec, including use of a 
 Futures-based model in lieu of the event model, took place, but submitting a 
 LCWD with major interoperability amongst all browsers is a good goal for this 
 draft.)

 Here's a summary of the Blob URL issues:

 1. There's the relatively easy question of defaults.  While the spec says 
 that URL.createObjectURL should create a Blob URL which has autoRevoke: true 
 by default [2], there isn't any implementation that supports this, whether 
 that's IE's oneTimeOnly behavior (which is related but different), or 
 Firefox's autoRevoke implementation.  Chrome doesn't touch this yet :)

 The spec. will roll back the default from true to false.  At least this 
 matches what implementations do; there's been resistance to changing the 
 default due to shipping applications relying on autoRevoke being false by 
 default, or at least implementor reluctance [1].

Sounds good.  Let's just be consistent.

 Switching the default to false would enable IE, Chrome, and Firefox to have 
 interoperability with URL.createObjectURL(blobArg), though such a default 
 places burdens on web developers to couple create* calls with revoke* calls 
 to not leak Blobs.  Jonas proposes a separate method, 
 URL.createAutoRevokeObjectURL, which creates an autoRevoke URL.  I'm lukewarm 
 on that :-\

I'd support a new method with a different default, if we could figure
out a reasonable thing for that new method to do.
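
With the default at false, the burden of coupling create* with revoke*
can be sketched as a small wrapper. This is a minimal illustration, not
a proposal from this thread: withObjectURL is a hypothetical name, and
the urlApi parameter exists only so the pattern can be exercised
outside a browser.

```javascript
// Hypothetical helper: pair every createObjectURL with a revokeObjectURL.
// `urlApi` defaults to the global URL object; it is a parameter only so the
// pattern can be exercised with a stand-in outside a browser.
async function withObjectURL(blob, use, urlApi = URL) {
  const url = urlApi.createObjectURL(blob);
  try {
    // `use` should resolve only once the URL has been fully consumed,
    // e.g. after an image's load event has fired.
    return await use(url);
  } finally {
    // Runs whether `use` resolved or threw, so the Blob is not leaked.
    urlApi.revokeObjectURL(url);
  }
}
```

The wrapper only helps when the caller can tell that the URL has been
consumed; it does not address what happens to loads that are still in
flight when the revoke runs.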

 2. Regardless of the default, there's the hard question of what to do with 
 Blob URL revocation.  Glenn / zewt points out that this applies, though 
 perhaps less dramatically, to *manually* revoked Blob URLs, and provides some 
 test cases [3].

 Options are:

 2a. To meticulously special-case Blob URLs, per Bug 17765 [4].  This calls 
 for a synchronous step attached to wherever URLs are used to peg Blob URL 
 data at fetch, so that the chance of a concurrent revocation doesn't cause 
 things to behave unpredictably.  Firefox does a variation of this with 
 keeping channels open, but solving this bug interoperably is going to be very 
 hard, and has to be done in different places across the platform.  And even 
 within CSS.  This is hard to move forward with.

Hard.

 2b. To adopt an 80-20 rule, and only specify what happens for some cases that 
 seem common, but expressly disallow other cases.  This might be a more muted 
 version of Bug 17765, especially if it can't be done within fetch [5].

Ugly.

 This could mean that the blob clause for basic fetch[5] only defines some 
 cases where a synchronous fetch can be run (TBD) but expressly disallows 
 others where synchronous fetching is not feasible.  This would limit the use 
 of Blob URLs pretty drastically, but might be the only solution.  For 
 instance, asynchronous calls accompanying embed, defer etc. might have to 
 be expressly disallowed.  It would be great if we do this in fetch [5] :-)

Just to be clear, this would limit the use of *autoRevoke* Blob URLs,
not all Blob URLs, yes?

 Essentially, this might be to do what Firefox does but document what 
 dereference means [6], and be clear about what might break.  Most 
 implementors acknowledge that use of Blob URLs simply won't work in some 
 cases (e.g. CSS cases, etc.).  We should formalize that; it would involve 
 listing what works explicitly.  Anne?

 2c. Re-use oneTimeOnly as in IE's behavior for autoRevoke (but call it 
 autoRevoke).  But we jettisoned this for race conditions e.g.

 // This is in IE only

 img2.src = URL.createObjectURL(fileBlob, {oneTimeOnly: true});

 // race now! then fail in IE only
 img1.src = img2.src;

 will fail in IE with oneTimeOnly.  It appears to fail reliably, but again, 
 dereference URL may not be interoperable here.  This is probably not what 
 we should do, but it was worth listing, since it carries the brute force of a 
 shipping implementation, and shows how some % of the market has actively 
 solved this problem :)

I'm not really sure this is so bad.  I know it's the case I brought
up, and I must admit that I disliked the oneTimeOnly when I first
heard about it, but all other proposals [including not having
automatic revocation at all] now seem worse.  Here you've set
something to be oneTimeOnly and used it twice; if that fails in IE,
that's correct.  If it works some of 

Re: Blob URLs | autoRevoke, defaults, and resolutions

2013-05-01 Thread Eric U
On Wed, May 1, 2013 at 4:53 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, May 1, 2013 at 4:25 PM, Eric U er...@google.com wrote:
 On Wed, May 1, 2013 at 3:36 PM, Arun Ranganathan a...@mozilla.com wrote:
 Switching the default to false would enable IE, Chrome, and Firefox to 
 have interoperability with URL.createObjectURL(blobArg), though such a 
 default places burdens on web developers to couple create* calls with 
 revoke* calls to not leak Blobs.  Jonas proposes a separate method, 
 URL.createAutoRevokeObjectURL, which creates an autoRevoke URL.  I'm 
 lukewarm on that :-\

 I'd support a new method with a different default, if we could figure
 out a reasonable thing for that new method to do.

 Yeah, the if-condition here is quite important.

 But if we can figure out this problem, then my proposal would be to
 add a new method which has a nicer name than createObjectURL as to
 encourage authors to use that and have fewer leaks.

Heh; I wasn't even going to mention the name.

 2. Regardless of the default, there's the hard question of what to do with 
 Blob URL revocation.  Glenn / zewt points out that this applies, though 
 perhaps less dramatically, to *manually* revoked Blob URLs, and provides 
 some test cases [3].

 Options are:

 2a. To meticulously special-case Blob URLs, per Bug 17765 [4].  This calls 
 for a synchronous step attached to wherever URLs are used to peg Blob URL 
 data at fetch, so that the chance of a concurrent revocation doesn't cause 
 things to behave unpredictably.  Firefox does a variation of this with 
 keeping channels open, but solving this bug interoperably is going to be 
 very hard, and has to be done in different places across the platform.  And 
 even within CSS.  This is hard to move forward with.

 Hard.

 It actually has turned out to be surprisingly easy in Gecko. But I
 realize the same might not be true everywhere.

Right, and defining just when it happens, across browsers, may also be hard.

 2b. To adopt an 80-20 rule, and only specify what happens for some cases 
 that seem common, but expressly disallow other cases.  This might be a more 
 muted version of Bug 17765, especially if it can't be done within fetch [5].

 Ugly.

 This could mean that the blob clause for basic fetch[5] only defines 
 some cases where a synchronous fetch can be run (TBD) but expressly 
 disallows others where synchronous fetching is not feasible.  This would 
 limit the use of Blob URLs pretty drastically, but might be the only 
 solution.  For instance, asynchronous calls accompanying embed, defer 
 etc. might have to be expressly disallowed.  It would be great if we do 
 this in fetch [5] :-)

 Just to be clear, this would limit the use of *autoRevoke* Blob URLs,
 not all Blob URLs, yes?

 No, it would limit the use of all *revokable* Blob URLs. Since you get
 exactly the same issues when the page calls revokeObjectURL manually.
 So that means that it applies to all Blob URLs.

Ah, right; all revoked Blob URLs.

 2c. Re-use oneTimeOnly as in IE's behavior for autoRevoke (but call it 
 autoRevoke).  But we jettisoned this for race conditions e.g.

 // This is in IE only

 img2.src = URL.createObjectURL(fileBlob, {oneTimeOnly: true});

 // race now! then fail in IE only
 img1.src = img2.src;

 will fail in IE with oneTimeOnly.  It appears to fail reliably, but again, 
 dereference URL may not be interoperable here.  This is probably not what 
 we should do, but it was worth listing, since it carries the brute force of 
 a shipping implementation, and shows how some % of the market has actively 
 solved this problem :)

 I'm not really sure this is so bad.  I know it's the case I brought
 up, and I must admit that I disliked the oneTimeOnly when I first
 heard about it, but all other proposals [including not having
 automatic revocation at all] now seem worse.  Here you've set
 something to be oneTimeOnly and used it twice; if that fails in IE,
 that's correct.  If it works some of the time in other browsers [after
 they implement oneTimeOnly], that's not good, but you did pretty much
 aim at your own foot.  Developers that actively try to do the right
 thing will have consistent good results without extra code, at least.
 I realize that img1.src = img2.src failing is odd, but as [IIRC]
 Adrian pointed out, if it's an uncacheable image on a server that's
 gone away, couldn't that already happen, depending on your network
 stack implementation?

 I'm more worried that if implementations don't initiate the load
 synchronously, which is hard per your comment above, then it can
 easily be random which of the two loads succeeds and which fails. If
 the revoking happens at the end of the load, both loads could even
 succeed depending on timing and implementation details.

Yup; I'm just saying that if you get a failure here, you shouldn't be
surprised, no matter which img gets it.  You did something explicitly
wrong.  Ideally we'd give predictable behavior, but if we can't do

Re: FileSystem compromise spec

2012-12-11 Thread Eric U
On Fri, Nov 30, 2012 at 9:11 AM, SULLIVAN, BRYAN L bs3...@att.com wrote:
 -----Original Message-----
 From: Arthur Barstow [mailto:art.bars...@nokia.com]
 Sent: Friday, November 30, 2012 6:46 AM
 To: ext Eric U; Doug Schepers
 Cc: Web Applications Working Group WG
 Subject: Re: FileSystem compromise spec

 On 11/15/12 7:39 PM, ext Eric U wrote:
  As discussed at TPAC, there's little support for the current FileSystem
  API, but some support for a new API, and I promised to put forth a
  compromise proposal.  In order to do that, I'd like to hear 1) what kinds
  of changes would make it more popular; 2) who I'm trying to convince.
  There are a number of folks who have said that they're not interested in
  a FileSystem API at all, so I'd rather concentrate my efforts on those
  with skin in the game.
 

 Note that even though we are a service provider and not a browser vendor, I 
 do consider us to have skin in the game.

Sure thing; I was looking to hear from those who were interested, not
necessarily those who were implementers.

  * It's designed to handle both the sandbox and the outside-the-sandbox
    use cases.  For folks interested in just the sandbox and no future
    expansions, that seems like wasted effort, and a sandbox-only API could
    be simpler.  It's not clear to me that there is anyone interested in
    just the sandbox and no future expansions, but if there is, please
    speak up.  I've certainly heard from folks with the opposite goal.

 I am still looking for evidence that IndexedDB provides a high-performance, 
 scalable, cross-domain alternative to native filesystem access. I've seen 
 conflicting information on that, and will gather this information with 
 whatever tests can be found to validate performance of browsers for IndexedDB.

I've seen no proposals for cross-domain access.

 It seems like it would be useful to look at these various file and
 database specs from a high level use case perspective (f.ex. one way to
 address UC X is to use spec X). If anyone is aware of some related
 docs, please let me know. Doug - wondering aloud here if this is
 something webplatform.org might cover or if you know of someone that
 might be interested in creating this type of documentation?

 In the Web & TV IG I will be leading a task force specifically to address the 
 recording and storage of media use cases, where storage options are the key 
 focus. If someone can prove to us that in-the-sandbox storage addresses the 
 needs (high-performance, scalable, cross-domain) then great; otherwise we 
 will keep looking.

Isn't "in the sandbox" a bit opposed to "cross-domain"?  Or are you
suggesting some kind of a shared sandbox?

  I'd like to hear from folks who are interested, but not in the current spec.
 

 I note that this request seems to exclude (or recommend silence) of 
 counter-points from those that *want the current specs* as mentioned by Eric. 
 So if there is a lack of contribution from those that support the other use 
 cases noted (e.g. out-of-the-sandbox storage), it should not be taken as 
 consensus with the alternative as discussed in this thread.

That's because we took an informal poll at TPAC as to where folks
stood on these options:
1) the current spec
2) an evolution of the current spec to be more like the newer
proposals [the compromise spec]
3) chuck it all and start over

...and not a single person present voted for option 1.  I'll count you
as 1, but there was a lot more support for 2 or 3.  I promised to make
a proposal for 2, and 3 needs at the very least an editor and a spec
to become viable.

I'm still hoping to hear who it is that's interested in 2, so that I
can make sure to address their concerns.  I wasn't at TPAC, so I don't
know who voted that way.



FileSystem compromise spec

2012-11-15 Thread Eric U
As discussed at TPAC, there's little support for the current FileSystem API, but
some support for a new API, and I promised to put forth a compromise proposal.
In order to do that, I'd like to hear 1) what kinds of changes would make it
more popular; 2) who I'm trying to convince.  There are a number of folks who
have said that they're not interested in a FileSystem API at all, so I'd rather
concentrate my efforts on those with skin in the game.

So far I've been hearing:

  * It's too complicated.  A number of the methods aren't absolutely necessary
    if the user's willing to do a bit more work, so they should be dropped.
  * Even for what functionality we keep, it could be simpler.
  * The synchronous [worker-only] interface is superfluous.  It's not necessary
    for 1.0, and it's a lot of extra implementation work.
  * It's designed to handle both the sandbox and the outside-the-sandbox use
    cases.  For folks interested in just the sandbox and no future expansions,
    that seems like wasted effort, and a sandbox-only API could be simpler.
    It's not clear to me that there is anyone interested in just the sandbox
    and no future expansions, but if there is, please speak up.  I've
    certainly heard from folks with the opposite goal.

Does that sum it up?

I'd like to hear from folks who are interested, but not in the current spec.

Thanks,

Eric



Re: [quota-api] Need for session storage type

2012-10-31 Thread Eric U
On Tue, Oct 30, 2012 at 1:04 PM, Brady Eidson beid...@apple.com wrote:
 (Sending again as my first attempt seems to have not gone out to the list)

 On Oct 30, 2012, at 12:10 PM, Kinuko Yasuda kin...@chromium.org wrote:

 Reviving this thread as well... to give a chance to get more feedback
 before moving this forward.

 Let me briefly summarize:
 The proposal was to add 'Session' storage type to the quota API, whose data
 should get wiped when the session is closed.


 I like this.

 Past related discussion:
 * Should the data go away in an unexpected crash?
   -- It should follow the behavior of session cookies on the UA


 I'm not sure how useful it is to specify behavior in an unexpected crash.
 Almost by definition, such an event cannot have defined behavior.

Not true--databases do this all the time, as do journaling
filesystems.  While it's hard to guarantee that e.g. all data is wiped
from the system, you can certainly specify whether or not it should be
accessible to script on the next page load after the crash.

I think the bigger question is "What's a session?"
Does it end if I:

* close the window?
* close the last window in this origin?
* close the last window in this browser profile?
* quit the browser?
  - With or without "continue where I left off"/"load my same windows
    from last time"?
  - Due to an update that caused a restart?
  - Due to a crash, with automatic crash recovery?
* switch to another app on my phone/tablet?
* use enough other apps on my phone/tablet that the browser gets
purged from memory?

I doubt browsers are consistent in all these situations, given that
current Chrome doesn't behave the same as the Chrome of a year ago.
So saying it should act like session cookies doesn't work.

 * Some storage APIs implicitly have default storage types (e.g.
 sessionStorage - session, AppCache - temp) but IDB and localStorage do not
 have them. If we have more storage types we might need an explicit way to
 associate a storage API (or a data unit) to a particular storage type.
   -- would be nice, we'll need a separate proposal / design for this though


 The idea sounds useful, but I may want to hear a bit more discussion /
 opinion from other developers / vendors.


 This is an especially squirrely area.

 Even the assumed default storage types listed are not necessarily accurate.
 For example, WebKit supports making AppCache permanent and that is supported
 on Mac and iOS.

 How we should define which technology belongs to which storage type is not
 obvious to me.  It requires explicitly specifying a storage type for each
 existing and future storage technology.  It requires that storage type being
 a "must" requirement for each of those specs.  And that removes the ability
 for user agents to be flexible in managing their own storage.

 For example, today a user agent could implement AppCache as permanent… up to
 a limit… at which point the application could go over that limit but now
 only be temporary.

 We would either have to remove that flexibility or account for it in this
 API.

 Slightly tangent:
 A related question is how the new storage type should be enforced on
 various storage APIs.  No storage API other than the FileSystem API has an
 explicit way to associate their data/storage to a particular storage
 type, and the current FileSystem API only knows temporary and persistent
 types.

 Well, there's the distinction between localStorage and sessionStorage to
 keep in mind. (Not sure whether the former falls under temp or persistent,
 however).


 This is another example of the particularly squirrely area I mention above.

 As the LocalStorage spec reads today, any attempted guarantees as to the
 lifetime of the data are "should"-level guarantees and therefore not
 guarantees at all.  Therefore it is inarguably specified as a temporary
 storage.

 However, Apple treats LocalStorage as sacred as a file on the filesystem and
 we've reiterated our position on this in discussions in the past.  Will we
 have to report this in navigator.temporaryStorage anyways?

 If we're adding more storage types (with different expire options) it
 might be nice to have a better unified way to associate a group of data
 items/units to a specific storage type/options across different storage
 APIs.

 That's an interesting suggestion. It's implicit when choosing
 sessionStorage (session) or AppCache (temp) but unclear for IDB and
 localStorage.

 Maybe a standard API for this would be a good thing.


 I think we have to fully resolve this to move forward.

 Thanks,
 ~Brady



Re: Sandboxed Filesystem use cases? (was Re: Moving File API: Directories and System API to Note track?)

2012-09-26 Thread Eric U
Asking about use cases that can be served by a filesystem API, but not
by IDB, is reasonable [and I'll respond to it below], but it misses a
lot of the point.  The users I've talked to like the FS API because
it's a simple interface that everyone already understands, that's
powerful enough to handle a huge variety of use cases.

Sure, the async API makes it a bit more complicated.  Every API that
handles large data is stuck with the same overhead there.  But
underneath that, people know what to expect from it and can figure it
out very quickly.

You just need to store 100KB?
  1) Request a filesystem.
  2) Open a file.
  3) Write your data.
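
The three steps above can be sketched with the callback-based API that
Chrome ships under the webkitRequestFileSystem prefix. This is a
minimal sketch under stated assumptions: saveData is a hypothetical
name, and the request function is passed in rather than read from
window so the caller chooses the implementation.

```javascript
// Sketch of the three steps using the callback-based FileSystem API.
// saveData is a hypothetical helper name; requestFileSystem is whatever the
// browser exposes (webkitRequestFileSystem in Chrome).
function saveData(requestFileSystem, path, blob, onDone, onError) {
  const TEMPORARY = 0; // storage type constant, same value as window.TEMPORARY
  const REQUESTED_BYTES = 100 * 1024;

  // 1) Request a filesystem.
  requestFileSystem(TEMPORARY, REQUESTED_BYTES, (fs) => {
    // 2) Open a file, creating it if it does not exist yet.
    fs.root.getFile(path, { create: true }, (fileEntry) => {
      // 3) Write your data.
      fileEntry.createWriter((writer) => {
        let failed = false;
        writer.onerror = (e) => { failed = true; onError(e); };
        // writeend fires after both success and failure, so check the flag.
        writer.onwriteend = () => { if (!failed) onDone(fileEntry.toURL()); };
        writer.write(blob);
      }, onError);
    }, onError);
  }, onError);
}
```

In Chrome this might be called as
saveData(window.webkitRequestFileSystem.bind(window), "notes.txt", blob,
console.log, console.error); the URL handed to onDone is the file's
filesystem: URL.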

Need a URL for that?  Sure, it's just a file, so obviously that works.

Want it organized in directories just like your server or dev environment?
Go ahead.

You don't have to write SQL queries, learn how to organize data into
noSQL tables, or deal with version change transactions.

If you want to see what's in your data store, you don't need to write
a viewer to dump your tables; you just go to the URL of any directory
in your store and browse around.  Our URLs have a natural structure
that matches the directory tree.  If you add URLs to IDB, with its
free-form key/value arrangement, I don't foresee an immediate natural
mapping that doesn't involve lots of escaping, ugly URLs, and/or
limitations.

On to the use cases:

Things that work well in a sandboxed filesystem that don't work well
in IDB [or any of the other current storage APIs] are those that
involve nontransactional modifications of large blobs of data.  For
example, video/photo/audio editing, which involve data that's too big
to store lots of extra copies of for rollback of failed transactions,
and which you don't necessarily want to try to fit into memory.
Overwriting just the ID3 tag of an MP3, or just the comment section of
the EXIF in a JPEG, would be much more efficient via a filesystem
interface.  Larger series of modifications to those files, which you
don't want to hold in memory, would be similar.
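
To make the partial-overwrite case concrete, here is a small sketch that assumes an ID3v1 tag (the 128-byte block at the end of the file that starts with "TAG", with a 30-byte title right after the marker). A Uint8Array stands in for the file purely so the example runs anywhere; with a seekable writer only the 30-byte range would need to be written. The function name overwriteId3v1Title is illustrative.

```javascript
// Overwrite the 30-byte title field of an ID3v1 tag in place.
// ID3v1 occupies the last 128 bytes of the file and starts with "TAG";
// the title is the 30 bytes that follow the marker.
function overwriteId3v1Title(bytes, title) {
  var tagStart = bytes.length - 128;
  if (tagStart < 0 ||
      bytes[tagStart] !== 0x54 ||      // 'T'
      bytes[tagStart + 1] !== 0x41 ||  // 'A'
      bytes[tagStart + 2] !== 0x47) {  // 'G'
    throw new Error("no ID3v1 tag found");
  }
  var titleStart = tagStart + 3;
  for (var i = 0; i < 30; i++) {
    // Pad with NUL bytes; characters outside Latin-1 become '?'.
    var code = i < title.length ? title.charCodeAt(i) : 0;
    bytes[titleStart + i] = code > 0xff ? 0x3f : code;
  }
  // Report the only byte range that was touched.
  return { offset: titleStart, length: 30 };
}
```

The returned offset and length are what a filesystem-style writer would seek to and write, leaving the rest of the file untouched.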

I know Jonas wants to bolt nontransactional data onto the side of IDB
via FileHandle, but I think that the cure there is far worse than the
disease, and I don't think anyone at Google likes that proposal.  I
haven't polled everyone, but that's the impression I get.

Beyond individual use cases:

When looking at use cases for a filesystem API, people often want to
separate the sandboxed cases and the non-sandboxed cases [My Photos,
etc.].  It's also worthwhile to look at the added value of having a
single API that works for both cases.  You have a photo organizer that
works in the sandbox with downloaded files?  If your browser supports
external filesystems, you can adapt your code to run in either place
with a very small change [mainly dealing with paths that aren't legal
on the local system].  If you're using IDB in the sandbox, and have a
different API to expose media directories, you've got to start over,
and then you have to maintain both systems.

One added API?

It's pretty clear that people see the value of an API that lets one
access My Photos from the web.  That API is necessarily going to
cope with files and directories on some platforms, even if others
don't expose directories as such.  If we're going to need to add a
filesystem API of some kind to deal with that, also using the same API
to manage a sandboxed storage area seems like a very small addition to
the web platform, unlike the other storage APIs we've added in the
past.


Regarding your final note:  I'm not sure what you're talking about
with BlobBuilder; is that the EXIF overwrite case you're trying to
handle?  If so, File[Handle|Writer] with BlobBuilder and seek seems to
handle it better than anything else.

Eric

On Tue, Sep 25, 2012 at 11:57 AM, Maciej Stachowiak m...@apple.com wrote:

 On Sep 25, 2012, at 10:20 AM, James Graham jgra...@opera.com wrote:


 In addition, this would be the fourth storage API that we have tried to 
 introduce to the platform in 5 years (localStorage, WebSQL, IndexedDB being 
 the other three), and the fifth in total. Of the four APIs excluding this 
 one, one has failed over interoperability concerns (WebSQL), one has 
 significant performance issues and is discouraged from production use 
 (localStorage) and one suffers from significant problems due to its legacy 
 design (cookies). The remaining API (IndexedDB) has not yet achieved 
 widespread use. It seems to me that we don't have a great track record in 
 this area, and rushing to add yet another API probably isn't wise. I would 
 rather see JS-level implementations of a filesystem-like API on top of 
 IndexedDB in order to work out the kinks without creating a legacy that has 
 to be maintained for back-compat than native implementations at this time.

 I share your concerns about adding yet-another-storage API. (Although I 
 believe there are major websites that have adopted or are in the process of 
 adopting IndexedDB). I like my 

Re: Moving File API: Directories and System API to Note track?

2012-09-21 Thread Eric U
While I don't see any other browsers showing interest in implementing
the FileSystem API as currently specced, I do see Firefox coming
around to the belief that a filesystem-style API is a good thing,
hence their DeviceStorage API.  Rather than scrap the API that we've
put 2 years of discussion and work into, why not work with us to
evolve it to something you'd like more?  If you have objections to
specific attributes of the API, wouldn't it be more efficient to
change just those things than to start over from scratch?  Or worse,
to have the Chrome filesystem API, the Firefox filesystem API, etc.?

If I understand correctly, folks at Mozilla think having a directory
abstraction is too heavy-weight, and would prefer users to slice and
dice paths by hand.  OK, that's a small change, and the
functionality's roughly equivalent.  We could probably even make
migration fairly easy with a small polyfill.

Jonas suggests FileHandle to replace FileWriter.  That's clearly not a
move to greater simplicity, and no polyfill is possible, but it does
open up the potential for higher performance, especially in a
multi-process browser.  As I said when you proposed it, I'm
interested, and we'd also like to solve the locking use cases.

Let's talk about it, rather than throw the baby out with the bathwater.

Eric

On Tue, Sep 18, 2012 at 4:04 AM, Olli Pettay olli.pet...@helsinki.fi wrote:
 Hi all,


 I think we should discuss moving File API: Directories and System API
 from Recommendation track to Note. Mainly because the API hasn't been widely
 accepted nor
 implemented and also because there are other proposals which handle the same
 use cases.
 The problem with keeping the API in recommendation track is that people
 outside
 standardization world think that the API is the one which all the browsers
 will implement and
 as of now that doesn't seem likely.




 -Olli




Re: [File API] File behavior under modification

2012-07-11 Thread Eric U
Agreed.

On Wed, Jul 11, 2012 at 1:02 PM, Arun Ranganathan
aranganat...@mozilla.com wrote:

 On May 23, 2012, at 9:58 AM, Glenn Maynard wrote:

 On Wed, May 23, 2012 at 3:03 AM, Kinuko Yasuda kin...@chromium.org wrote:

 Just to make sure, I assume 'the underlying storage' includes memory.


 Right.  For simple Blobs without a mutable backing store, all of this
 essentially optimizes away.

 We should also make it clear whether .size and .lastModifiedDate should
 return live state or should just return the same constant values.  (I
 assume the latter)


 It would be the values at the time of the snapshot state.  (I doubt it was
 ever actually intended that lastModifiedDate always return the file's latest
 mtime.  We'll find out when one of the editors gets around to this
 thread...)


 I think the ideal behavior is that it reflects values at snapshot state, but
 that reads fail if the file has been modified since the snapshot.
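
One way to picture that behavior is the small model below. It is only a sketch under stated assumptions: SnapshotFile and the { data, mtime } backing object are hypothetical stand-ins, and NotReadableError here is a plain Error subclass that borrows the name used in the File API draft.

```javascript
// Error type named after the spec's NotReadableError; a plain Error
// subclass in this model.
function NotReadableError(message) {
  this.name = "NotReadableError";
  this.message = message;
}
NotReadableError.prototype = Object.create(Error.prototype);

// `backing` is a hypothetical mutable store: { data, mtime }.
function SnapshotFile(backing) {
  this._backing = backing;
  this.size = backing.data.length;    // frozen at snapshot time
  this.lastModified = backing.mtime;  // frozen at snapshot time
}

// Reads succeed only while the backing store still matches the snapshot.
SnapshotFile.prototype.read = function () {
  if (this._backing.mtime !== this.lastModified) {
    throw new NotReadableError("file modified since snapshot");
  }
  return this._backing.data;
};
```

With this shape, size and lastModified never change after creation, and the first read after a modification fails instead of returning data that no longer matches them.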

 -- A*



Re: Feedback on Quota Management API

2012-05-30 Thread Eric U
On Wed, May 30, 2012 at 11:59 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/30/12 2:05 PM, Eric Uhrhane wrote:

 How about session, which is guaranteed to go away when the browser
 exits


 Should it go away if the browser crashes (or is killed by an OOM killer or
 the background process killer on something like Android) and then restarts
 and restores the session?

 Should it go away if the user has explicitly set the browser to restore
 sessions and then restarts it?

Off the top of my head, I dunno.  I was just giving examples to
explain that "I can't think of any other storage types" isn't a very
solid argument that there will never be any more.  I'm not actually
proposing that we implement any of these at this time.

Also, having read Robert's blog post now, I think he makes some good
points, especially w.r.t. feature detection.



[File API] File behavior under modification

2012-05-21 Thread Eric U
According to the latest editor's draft [1], a File object must always
return an accurate lastModifiedDate if at all possible.
"On getting, if user agents can make this information available, this
MUST return a new Date[HTML] object initialized to the last modified
date of the file; otherwise, this MUST return null."

However, if the underlying file has been modified since the creation
of the File, reads processed on the File must throw exceptions or fire
error events.
"...if the file has been modified on disk since the File object
reference is created, user agents MUST throw a NotReadableError..."

These seem somewhat contradictory...you can always look at the
modification time and see that it's changed, but if you try to read it
after a change, it blows up.
The non-normative text about security concerns makes me think that
perhaps both types of operations should fail if the file has changed
["... guarding against modifications of files on disk after a
selection has taken place"].  That may not be necessary, but if it's
not, I think we should call it out in non-normative text that explains
why you can read the mod time and not the data.

This came up in https://bugs.webkit.org/show_bug.cgi?id=86811; I
believe WebKit is currently noncompliant with this part of the spec,
and we were debating the correct behavior.  Currently WebKit delays
grabbing the modification time of the file until it's been referenced
by a read or slice(), so it won't notice modifications that happen
between selection and read.  That was done because the slice creates a
File object reference, but in my reading creating the File referring
to the file should be the time of the snapshot, not creating a Blob
referring to a File.

What's the correct behavior?

Eric

[1] http://dev.w3.org/2006/webapi/FileAPI/



Re: Colliding FileWriters

2012-05-01 Thread Eric U
On Mon, Mar 19, 2012 at 3:55 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, Feb 29, 2012 at 8:44 AM, Eric U er...@google.com wrote:
 On Mon, Feb 27, 2012 at 4:40 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Mon, Feb 27, 2012 at 11:36 PM, Eric U er...@google.com wrote:
 One working subset would be:

 * Keep createFileWriter async.
 * Make it optionally exclusive [possibly by default].  If exclusive,
 its length member is trustworthy.  If not, it can go stale.
 * Add an append method [needed only for non-exclusive writes, but
 useful for logs, and a safe default].

 This sounds great to me if we make it exclusive by default and remove
 the .length member for non-exclusive writers. Or make it return
 null/undefined.

 I like exclusive-by-default.  Of course, that means that by default
 you have to remember to call close() or depend on GC, but that's
 probably OK.  I'm less sure about .length being unusable on
 non-exclusive writers, but it's growing on me.  Since by default
 writers would be exclusive, length would generally work just the same
 as it does now.  However, if it returns null/undefined in the
 nonexclusive case, users might accidentally do math on it (if (length
 > 0) => false), and get confused.  Perhaps it should throw?
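
The confusion comes from ordinary JavaScript coercion, which the first part of this sketch records. The second part shows what a throwing getter could look like; makeWriter is a hypothetical factory for illustration, not anything from the FileWriter draft.

```javascript
// How null and undefined behave when code does math or comparisons on them.
var samples = {
  nullGreaterThanZero: null > 0,            // false: null converts to 0
  nullAtLeastZero: null >= 0,               // true, which is easy to trip over
  undefinedGreaterThanZero: undefined > 0,  // false: undefined converts to NaN
  nullPlusOne: null + 1,                    // 1
  undefinedPlusOne: undefined + 1           // NaN
};

// Hypothetical writer whose length getter throws when the writer is
// non-exclusive, so a stale or missing length cannot be used silently.
function makeWriter(exclusive, length) {
  var writer = {};
  Object.defineProperty(writer, "length", {
    get: function () {
      if (!exclusive) {
        throw new Error("length is unavailable on a non-exclusive writer");
      }
      return length;
    }
  });
  return writer;
}
```

Throwing turns a silent wrong answer into an immediate, visible failure at the point where the length is read.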

 Also, what's the behavior when there's already an exclusive lock, and
 you call createFileWriter?  Should it just not call you until the
 lock's free?  Do we need a trylock that fails fast, calling
 errorCallback?  I think the former's probably more useful than the
 latter, and you can always use a timer to give up if it takes too
 long, but there's no way to cancel a request, and you might get a call
 far later, when you've forgotten that you requested it.
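
The two options can be compared with a toy lock. This is a sketch only: ExclusiveLock, acquire, tryAcquire and release are invented names, and callbacks run synchronously here, whereas a real createFileWriter would call back asynchronously.

```javascript
function ExclusiveLock() {
  this._held = false;
  this._waiters = [];
}

// Blocking style: the callback runs once the lock is free, however long
// that takes.  There is no way to cancel a queued request.
ExclusiveLock.prototype.acquire = function (callback) {
  if (this._held) {
    this._waiters.push(callback);
  } else {
    this._held = true;
    callback();
  }
};

// Trylock style: fail fast through errorCallback instead of queueing.
ExclusiveLock.prototype.tryAcquire = function (callback, errorCallback) {
  if (this._held) {
    errorCallback(new Error("lock is held"));
  } else {
    this._held = true;
    callback();
  }
};

ExclusiveLock.prototype.release = function () {
  var next = this._waiters.shift();
  if (next) {
    next();              // ownership passes straight to the next waiter
  } else {
    this._held = false;
  }
};
```

The queued callback in acquire is exactly the "call far later" case: it fires whenever the holder releases, whether or not the requester still cares.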

 However this brings up another problem, which is how to support
 clients that want to mix read and write operations. Currently this is
 supported, but as far as I can tell it's pretty awkward. Every time
 you want to read you have to nest two asynchronous function calls.
 First one to get a File reference, and then one to do the actual read
 using a FileReader object. You can reuse the File reference, but only
 if you are doing multiple reads in a row with no writing in between.

 I thought about this for a while, and realized that I had no good
 suggestion because I couldn't picture the use cases.  Do you have some
 handy that would help me think about it?

 Mixing reading and writing can be something as simple as increasing a
 counter somewhere in the file. First you need to read the counter
 value, then add one to it, then write the new value. But there's also
 more complex operations such as reordering a set of blocks to
 defragment the contents of a file. Yet another example would be
 modifying a .zip file to add a new file. When you do this you'll want
 to first read out the location of the current zip directory, then
 overwrite it with the new file and then the new directory.

 That helps, thanks.  So we'll need to be able to do efficient
 (read[-modify-write]*), and we'll need to hold the lock for the reads
 as well as the writes.  The lock should prevent any other writes
 [exclusive or not], but need not prevent unlocked reads.

 I think we'd want to prevent unlocked reads too, otherwise the read
 might read the file in an inconsistent state. See more further down.

 We sat down and did some thinking about these two issues. I.e. the
 locking and the read-write-mixed issue. The solution is good news and
 bad news. The good news is that we've come up with something that
 seems like it should work, the bad news is that it's a totally
 different design from the current FileReader and FileWriter designs.

 Hmm...it's interesting, but I don't think we necessarily have to scrap
 FR and FW to use it.

 Here's a modified version that uses the existing interfaces:

 interface LockedReaderWriter : FileReader, FileWriter {
        [all the FileReader and FileWriter members]

        readonly attribute File writeResult;
 }

 Unfortunately this doesn't make sense since the functions on
 FileReader expect a Blob to be passed to them. We could certainly use
 slightly modified versions which don't take a Blob argument, but we
 can't inherit FileReader directly.

You missed the point of the writeResult field.  You can slice it and
give it to the FileReader-derived functions to do your reads, so no
modifications to the API are needed.

 However there are two downsides with an approach like this. First off
 it means that you *always* have to nest read/write operations in
 asynchronous callbacks. I.e. you always have to write code like:

 lock.write(...);
 lock.onsuccess = function() {
  lock.write(...);
  lock.onsuccess = function() {
    lock.read(...);
    lock.onsuccess = function() {
      lock.write(..., lock.result, ...);
      lock.onsuccess = function() {
      }
    }
  }
 }

True.  That's forced by the fact that we've modeled FileWriter after
FileReader, which is modeled after XHR, which has explicit visible
state

Re: BlobBuilder.append() should take ArrayBufferView in addition to ArrayBuffer

2012-04-12 Thread Eric U
On Thu, Apr 12, 2012 at 12:54 PM, Anne van Kesteren ann...@opera.com wrote:
 On Thu, 12 Apr 2012 21:48:12 +0200, Boris Zbarsky bzbar...@mit.edu wrote:

 Because it's still in the current editor's draft and it's still in the
 Gecko code and I was just reviewing a patch to it and saw the API?  ;)


 Eric, the plan is to remove that from File Writer, no?

Yes.  The next draft I publish will mark it deprecated, and it will
eventually go away.  However, currently at least Gecko and WebKit
support BlobBuilder, and WebKit doesn't yet have the Blob constructor,
so it'll be a little while before it actually fades away.

That being said, we should be talking about making this addition to
Blob, not to BlobBuilder.

 I thought we discussed long ago it should be removed in favor of a
 constructable(sp?) Blob?


 Could be.  Like I said, it's still in the editor's draft.


 Blob with constructor is in http://dev.w3.org/2006/webapi/FileAPI/



 Also, should it not accept just ArrayBufferView then as per
 XMLHttpRequest?


 Is there existing content depending on BlobBuilder and its ArrayBufferView
 stuff?


 I thought the idea was to not have BlobBuilder at all.



 --
 Anne van Kesteren
 http://annevankesteren.nl/



Re: Delay in File * spec publications in /TR/ [Was: CfC: publish LCWD of File API; deadline March 3]

2012-03-30 Thread Eric U
On Fri, Mar 30, 2012 at 5:39 AM, Arthur Barstow art.bars...@nokia.com wrote:
 Hi All - the publication of the File API LC was delayed because of some 
 logistical issues for Arun as well as some additional edits he intends to 
 make.

 This delay also resulted in Eric's two File * specs not being published since 
 they have a dependency on the latest File API spec.

 Arun - can you please give us at least a rough idea when you expect the spec 
 to be ready for LC publication?

 Jonas - as co-Editor of File API, can you help get the File API LC published?

 Eric - your File * docs were last published in April 2011 so I think it would 
 be good to get new versions published in /TR/ soon-ish. OTOH, if they have 
 dependencies on the latest File API, it may be better to postpone their 
 publication until File API is published. WDYT?

If it's going to be more than a month to get Arun+Jonas's spec up, we
might as well go ahead and publish mine; they've had quite a bit of
change.  If it's less than that, let's just do them all together.

 -Thanks, ArtB

 On Feb 25, 2012, at 7:19 AM, Arthur Barstow wrote:

 Comments and bugs submitted during the pre-LC comment period for File API 
 spec have been addressed and since there are no open bugs, this is a Call 
 for Consensus to publish a LCWD of the File API spec using the latest ED as 
 the basis:

 http://dev.w3.org/2006/webapi/FileAPI/

 This CfC satisfies the group's requirement to record the group's decision 
 to request advancement for this LCWD.

 Note the Process Document states the following regarding the 
 significance/meaning of a LCWD:

 [[
 http://www.w3.org/2005/10/Process-20051014/tr.html#last-call

 Purpose: A Working Group's Last Call announcement is a signal that:

 * the Working Group believes that it has satisfied its relevant technical 
 requirements (e.g., of the charter or requirements document) in the Working 
 Draft;

 * the Working Group believes that it has satisfied significant dependencies 
 with other groups;

 * other groups SHOULD review the document to confirm that these dependencies 
 have been satisfied. In general, a Last Call announcement is also a signal 
 that the Working Group is planning to advance the technical report to later 
 maturity levels.
 ]]

 If you have any comments or concerns about this CfC, please send them to 
 public-webapps@w3.org by March 3 at the latest. Positive response is 
 preferred and encouraged and silence will be assumed to be agreement with 
 the proposal.

 -Thanks, AB






Re: Colliding FileWriters

2012-03-19 Thread Eric U
On Wed, Feb 29, 2012 at 8:44 AM, Eric U er...@google.com wrote:
 On Mon, Feb 27, 2012 at 4:40 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Mon, Feb 27, 2012 at 11:36 PM, Eric U er...@google.com wrote:
 One working subset would be:

 * Keep createFileWriter async.
 * Make it optionally exclusive [possibly by default].  If exclusive,
 its length member is trustworthy.  If not, it can go stale.
 * Add an append method [needed only for non-exclusive writes, but
 useful for logs, and a safe default].

 This sounds great to me if we make it exclusive by default and remove
 the .length member for non-exclusive writers. Or make it return
 null/undefined.

 I like exclusive-by-default.  Of course, that means that by default
 you have to remember to call close() or depend on GC, but that's
 probably OK.  I'm less sure about .length being unusable on
 non-exclusive writers, but it's growing on me.  Since by default
 writers would be exclusive, length would generally work just the same
 as it does now.  However, if it returns null/undefined in the
 nonexclusive case, users might accidentally do math on it (if (length
 > 0) => false), and get confused.  Perhaps it should throw?

 Also, what's the behavior when there's already an exclusive lock, and
 you call createFileWriter?  Should it just not call you until the
 lock's free?  Do we need a trylock that fails fast, calling
 errorCallback?  I think the former's probably more useful than the
 latter, and you can always use a timer to give up if it takes too
 long, but there's no way to cancel a request, and you might get a call
 far later, when you've forgotten that you requested it.

 However this brings up another problem, which is how to support
 clients that want to mix read and write operations. Currently this is
 supported, but as far as I can tell it's pretty awkward. Every time
 you want to read you have to nest two asynchronous function calls.
 First one to get a File reference, and then one to do the actual read
 using a FileReader object. You can reuse the File reference, but only
 if you are doing multiple reads in a row with no writing in between.

 I thought about this for a while, and realized that I had no good
 suggestion because I couldn't picture the use cases.  Do you have some
 handy that would help me think about it?

 Mixing reading and writing can be something as simple as increasing a
 counter somewhere in the file. First you need to read the counter
 value, then add one to it, then write the new value. But there's also
 more complex operations such as reordering a set of blocks to
 defragment the contents of a file. Yet another example would be
 modifying a .zip file to add a new file. When you do this you'll want
 to first read out the location of the current zip directory, then
 overwrite it with the new file and then the new directory.

 That helps, thanks.  So we'll need to be able to do efficient
 (read[-modify-write]*), and we'll need to hold the lock for the reads
 as well as the writes.  The lock should prevent any other writes
 [exclusive or not], but need not prevent unlocked reads.

 We sat down and did some thinking about these two issues. I.e. the
 locking and the read-write-mixed issue. The solution is good news and
 bad news. The good news is that we've come up with something that
 seems like it should work, the bad news is that it's a totally
 different design from the current FileReader and FileWriter designs.

 Hmm...it's interesting, but I don't think we necessarily have to scrap
 FR and FW to use it.

 Here's a modified version that uses the existing interfaces:

 interface LockedReaderWriter : FileReader, FileWriter {
        [all the FileReader and FileWriter members]

        readonly attribute File writeResult;
 }

This came up in an offline discussion recently regarding a
currently-unserved use case: using a web app to edit a file outside
the browser sandbox.  You can certainly drag the file into or out of
the browser, but it's nothing like the experience you get with a
native app, where if you select a file for editing you can read+write
it many times, at its true location, without additional permission
checks.  If we added something like a refresh to regain expired
locks with this object, and some way for the user to grant permissions
to a file for the session, it could take care of that use case.

What do you think?

 As with your proposal, as long as any read or write method has
 outstanding events, the lock is held.  The difference here is that
 after any write method completes, and until another one begins or the
 lock is dropped, writeResult holds the state of the File as of the
 completion of the write.  The rest of the time it's null.  That way
 you're always as up-to-date as you can easily be, but no more so [it
 doesn't show partial writes during progress events].  To read, you use
 the standard FileReader interface, slicing writeResult as needed to
 get the appropriate offset.

 A potential feature

Re: Transferable and structured clones, was: Re: [FileAPI] Deterministic release of Blob proposal

2012-03-07 Thread Eric U
On Wed, Mar 7, 2012 at 11:38 AM, Kenneth Russell k...@google.com wrote:
 On Tue, Mar 6, 2012 at 6:29 PM, Glenn Maynard gl...@zewt.org wrote:
 On Tue, Mar 6, 2012 at 4:24 PM, Michael Nordman micha...@google.com wrote:

  You can always call close() yourself, but Blob.close() should use the
  neuter mechanism already there, not make up a new one.

 Blobs aren't transferable, there is no existing mechanism that applies
 to them. Adding a blob.close() method is independent of making blobs
 transferable, the former is not a prerequisite for the latter.


 There is an existing mechanism for closing objects.  It's called
 neutering.  Blob.close should use the same terminology, whether or not the
 object is a Transferable.

 On Tue, Mar 6, 2012 at 4:25 PM, Kenneth Russell k...@google.com wrote:

 I would be hesitant to impose a close() method on all future
 Transferable types.


 Why?  All Transferable types must define how to neuter objects; all close()
 does is trigger it.

 I don't think adding one to ArrayBuffer would be a
 bad idea but I think that ideally it wouldn't be necessary. On memory
 constrained devices, it would still be more efficient to re-use large
 ArrayBuffers rather than close them and allocate new ones.


 That's often not possible, when the ArrayBuffer is returned to you from an
 API (eg. XHR2).

 This sounds like a good idea. As you pointed out offline, a key
 difference between Blobs and ArrayBuffers is that Blobs are always
 immutable. It isn't necessary to define Transferable semantics for
 Blobs in order to post them efficiently, but it was essential for
 ArrayBuffers.


 No new semantics need to be defined; the semantics of Transferable are
 defined by postMessage and are the same for all transferable objects.
 That's already done.  The only thing that needs to be defined is how to
 neuter an object, which is what Blob.close() has to define anyway.

 Using Transferable for Blob will allow Blobs, ArrayBuffers, and any future
 large, structured clonable objects to all be released with the same
 mechanisms: either pass them in the transfer argument to a postMessage
 call, or use the consistent, identical close() method inherited from
 Transferable.  This allows developers to think of the transfer list as a
 list of objects which won't be needed after the postMessage call.  It
 doesn't matter that the underlying optimizations are different; the visible
 side-effects are identical (the object can no longer be accessed).

 Closing an object, and neutering it because it was transferred to a
 different owner, are different concepts. It's already been
 demonstrated that Blobs, being read-only, do not need to be
 transferred in order to send them efficiently from one owner to
 another. It's also been demonstrated that Blobs can be resource
 intensive and that an explicit closing mechanism is needed.

 I believe that we should fix the immediate problem and add a close()
 method to Blob. I'm not in favor of adding a similar method to
 ArrayBuffer at this time and therefore not to Transferable. There is a
 high-level goal to keep the typed array specification as minimal as
 possible, and having Transferable support leak in to the public
 methods of the interfaces contradicts that goal.

This makes sense to me.  Blob needs close independent of whether it's
in Transferable, and Blob has no need to be Transferable, so let's not
mix the two.



Re: [FileAPI] Deterministic release of Blob proposal

2012-03-07 Thread Eric U
On Tue, Mar 6, 2012 at 5:12 PM, Feras Moussa fer...@microsoft.com wrote:
 From: Arun Ranganathan [mailto:aranganat...@mozilla.com]
 Sent: Tuesday, March 06, 2012 1:27 PM
 To: Feras Moussa
 Cc: Adrian Bateman; public-webapps@w3.org; Ian Hickson; Anne van Kesteren
 Subject: Re: [FileAPI] Deterministic release of Blob proposal

 Feras,

 In practice, I think this is important enough and manageable enough to 
 include in the spec., and I'm willing to slow the train down if necessary, 
 but I'd like to understand a few things first.  Below:
 
  At TPAC we discussed the ability to deterministically close blobs with a few
  others.

  As we’ve discussed in the createObjectURL thread[1], a Blob may represent
  an expensive resource (eg. expensive in terms of memory, battery, or disk
  space). At present there is no way for an application to deterministically
  release the resource backing the Blob. Instead, an application must rely on
  the resource being cleaned up through a non-deterministic garbage collector
  once all references have been released. We have found that not having a way
  to deterministically release the resource causes a performance impact for a
  certain class of applications, and is especially important for mobile
  applications or devices with more limited resources.

  In particular, we’ve seen this become a problem for media intensive
  applications which interact with a large number of expensive blobs. For
  example, a gallery application may want to cycle through displaying many
  large images downloaded through websockets, and without a deterministic way
  to immediately release the reference to each image Blob, can easily begin to
  consume vast amounts of resources before the garbage collector is executed.

  To address this issue, we propose that a close method be added to the Blob
  interface.
  When called, the close method should release the underlying resource of the
  Blob, and future operations on the Blob will return a new error, a
  ClosedError.
  This allows an application to signal when it's finished using the Blob.
 

 Do you agree that Transferable
 (http://dev.w3.org/html5/spec/Overview.html#transferable-objects) seems to
 be what we're looking for, and that Blob should implement Transferable?

 Transferable addresses the use case of copying across threads, and neuters
 the source object (though honestly, the word neuter makes me wince -- naming
 is a problem on the web).  We can have a more generic method on Transferable
 that serves our purpose here, rather than *.close(), and Blob can avail of
 that.  This is something we can work out with HTML, and might be the right
 thing to do for the platform (although this creates something to think about
 for MessagePort and for ArrayBuffer, which also implement Transferable).

 I agree with your changes, but am confused by some edge cases:
 To support this change, the following changes in the File API spec are
 needed:

 * In section 6 (The Blob Interface)
  - Addition of a close method. When called, the close method releases the
 underlying resource of the Blob. Close renders the blob invalid, and further
 operations such as URL.createObjectURL or the FileReader read methods on
 the closed blob will fail and return a ClosedError.  If there are any
 non-revoked URLs to the Blob, these URLs will continue to resolve until they
 have been revoked.
  - For the slice method, state that the returned Blob is a new Blob with its
 own lifetime semantics – calling close on the new Blob is independent of
 calling close on the original Blob.

 * In section 8 (The FileReader Interface)
 - State the FileReader reads directly over the given Blob, and not a copy
 with an independent lifetime.

 * In section 10 (Errors and Exceptions)
 - Addition of a ClosedError. If the File or Blob has had the close method
 called, then for asynchronous read methods the error attribute MUST return a
 “ClosedError” DOMError and synchronous read methods MUST throw a
 ClosedError exception.

 * In section 11.8 (Creating and Revoking a Blob URI)
 - For createObjectURL – If this method is called with a closed Blob
 argument, then user agents must throw a ClosedError exception.
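
The lifetime rules proposed above can be modeled in a few lines. This is a sketch of the proposal, not of any shipped API: ModelBlob and readSync are invented names, ClosedError is a plain Error subclass, the slice copies bytes for simplicity where a real implementation would share them, and having slice throw on a closed blob is an assumption the proposal does not spell out.

```javascript
function ClosedError(message) {
  this.name = "ClosedError";
  this.message = message;
}
ClosedError.prototype = Object.create(Error.prototype);

// Hypothetical stand-in for Blob; `bytes` is an array of numbers.
function ModelBlob(bytes) {
  this._bytes = bytes;
  this._closed = false;
  this.size = bytes.length;
}

ModelBlob.prototype.close = function () {
  this._closed = true;
  this._bytes = null;  // release the underlying resource
};

// The slice is a new blob with its own lifetime.
ModelBlob.prototype.slice = function (start, end) {
  if (this._closed) throw new ClosedError("blob is closed");
  return new ModelBlob(this._bytes.slice(start, end));
};

// Stands in for the synchronous read methods, which throw on a closed blob.
ModelBlob.prototype.readSync = function () {
  if (this._closed) throw new ClosedError("blob is closed");
  return this._bytes.slice();
};
```

Closing the original leaves an earlier slice readable, which is the independence the section 6 change asks for.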

 Similarly to how slice() clones the initial Blob to return one with its own
 independent lifetime, the same notion will be needed in other APIs which
 conceptually clone the data – namely FormData, any place the Structured
 Clone Algorithm is used, and BlobBuilder.
 Similarly to how FileReader must act directly on the Blob’s data, the same
 notion will be needed in other APIs which must act on the data - namely
 XHR.send and WebSocket. These APIs will need to throw an error if called on a
 Blob that was closed and the resources are released.

 So Blob.slice() already presumes a new Blob, but I can certainly make this 
 clearer.
 And I agree with the changes above, including the addition of 

Re: FileReader abort, again

2012-03-06 Thread Eric U
On Mon, Mar 5, 2012 at 2:01 PM, Eric U er...@google.com wrote:
 On Thu, Mar 1, 2012 at 11:20 AM, Arun Ranganathan
 aranganat...@mozilla.com wrote:
 Eric,

   So we could:
   1. Say not to fire a loadend if onloadend or onabort

 Do you mean if onload, onerror, or onabort...?


 No, actually.  I'm looking for the right sequence of steps that results in 
 abort's loadend not firing if terminated by another read*.  Since abort will 
 fire an abort event and a loadend event as spec'd 
 (http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort), if *those* event 
 handlers initiate a readAs*, we could then suppress abort's loadend.  This 
 seems messy.

 Ah, right--so a new read initiated from onload or onerror would NOT
 suppress the loadend of the first read.  And I believe that this
 matches XHR2, so we're good.  Nevermind.

No, I retract that.  In
http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1627.html
Anne confirmed that a new open in onerror or onload /would/ suppress
the loadend of the first send.  So we would want a new read/write in
onload or onerror to do the same, not just those in onabort.

 Actually, if we really want to match XHR2, we should qualify all the
 places that we fire loadend.  If the user calls XHR2's open in onerror
 or onload, that cancels its loadend.  However, a simple check on
 readyState at step 6 won't do it.  Because the user could call
 readAsText in onerror, then call abort in the second read's
 onloadstart, and we'd see readyState as DONE and fire loadend twice.

 To emulate XHR2 entirely, we'd need to have read methods dequeue any
 leftover tasks for previous read methods AND terminate the abort
 algorithm AND terminate the error algorithm of any previous read
 method.  What a mess.


 This may be the way to do it.

 The problem with emulating XHR2 is that open() and send() are distinct 
 concepts in XHR2, but in FileAPI, they are the same.  So in XHR2 an open() 
 canceling abort does make sense; abort() cancels a send(), and thus an 
 open() should cancel an abort().  But in FileAPI, our readAs* methods are 
 equivalent to *both* open() and send().  In FileAPI, an abort() cancels a 
 readAs*; we now have a scenario where a readAs* may cancel an abort().  How 
 to make that clear?

 I'm not sure why it's any more confusing that read* is open+send.
 read* can cancel abort, and abort can cancel read*.  OK.


 Perhaps there's a simpler way to say successfully calling a read
 method inhibits any previous read's loadend?

 I'm in favor of any shorthand :)  But this may not do justice to each 
 readAs* algorithm being better defined.

 Hack 1: Don't call loadend synchronously.  Enqueue it, and let read*
 methods clear the queues when they start up.  This differs from XHR,
 though, and is a little odd.

Still works, but needs to be applied in multiple places.

 Hack 2: Add a virtual generation counter/timestamp, not exposed to
 script.  Increment it in read*, check it in abort before sending
 loadend.  This is kind of complex, but works [and might be how I end
 up implementing this in Chrome].


Still works, but needs to be applied in multiple places.

 But really, I don't think either of those is better than just saying,
 in read*, something like terminate the algorithm for any abort
 sequence being processed.

...or any previously-initiated read being processed.
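
To make Hack 2 concrete, here is a minimal sketch of the generation-counter
idea. SketchReader and its fields are hypothetical names, events are
modelled as plain callbacks, and nothing here is the real FileReader or
Chrome's implementation; it only shows how a counter bumped in read* lets
abort notice that a handler started a new read.

```javascript
// Hypothetical model of a reader whose abort() suppresses its own loadend
// when an event handler starts a new read. Not the real FileReader.
class SketchReader {
  constructor() {
    this.generation = 0;      // internal counter, never exposed to script
    this.readyState = 'EMPTY';
    this.log = [];            // records fired events for inspection
    this.onabort = null;
  }

  // Record the event, then call the matching on<name> handler if one is set.
  fire(name) {
    this.log.push(name);
    const handler = this['on' + name];
    if (handler) handler.call(this);
  }

  read() {
    this.generation++;        // every read* bumps the counter
    this.readyState = 'LOADING';
    this.fire('loadstart');
  }

  abort() {
    if (this.readyState !== 'LOADING') return;
    const seen = this.generation;
    this.readyState = 'DONE';
    this.fire('abort');       // the handler may call read() again
    // Deliver loadend only if no new read began during the abort event.
    if (this.generation === seen) this.fire('loadend');
  }
}

const reader = new SketchReader();
let restarted = false;
reader.onabort = function () {
  if (!restarted) { restarted = true; this.read(); }
};
reader.read();
reader.abort();
console.log(reader.log.join(','));  // loadstart,abort,loadstart
```

The same check would have to be repeated wherever loadend is fired (the
error and load paths too), which is the "needs to be applied in multiple
places" cost noted above.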



Re: [FileAPI] Deterministic release of Blob proposal

2012-03-06 Thread Eric U
After a brief internal discussion, we like the idea over in Chrome-land.
Let's make sure that we carefully spec out the edge cases, though.
See below for some.

On Fri, Mar 2, 2012 at 4:54 PM, Feras Moussa fer...@microsoft.com wrote:
 At TPAC we discussed the ability to deterministically close blobs with a few
 others.

 As we’ve discussed in the createObjectURL thread[1], a Blob may represent
 an expensive resource (eg. expensive in terms of memory, battery, or disk
 space). At present there is no way for an application to deterministically
 release the resource backing the Blob. Instead, an application must rely on
 the resource being cleaned up through a non-deterministic garbage collector
 once all references have been released. We have found that not having a way
 to deterministically release the resource causes a performance impact for a
 certain class of applications, and is especially important for mobile
 applications or devices with more limited resources.

 In particular, we’ve seen this become a problem for media intensive
 applications which interact with a large number of expensive blobs. For
 example, a gallery application may want to cycle through displaying many
 large images downloaded through websockets, and without a deterministic way
 to immediately release the reference to each image Blob, can easily begin to
 consume vast amounts of resources before the garbage collector is executed.

 To address this issue, we propose that a close method be added to the Blob
 interface.
 When called, the close method should release the underlying resource of the
 Blob, and future operations on the Blob will return a new error, a
 ClosedError.
 This allows an application to signal when it's finished using the Blob.

 To support this change, the following changes in the File API spec are
 needed:

 * In section 6 (The Blob Interface)
   - Addition of a close method. When called, the close method releases the
 underlying resource of the Blob. Close renders the blob invalid, and further
 operations such as URL.createObjectURL or the FileReader read methods on
 the closed blob will fail and return a ClosedError.  If there are any
 non-revoked URLs to the Blob, these URLs will continue to resolve until they
 have been revoked.
   - For the slice method, state that the returned Blob is a new Blob with
 its own lifetime semantics – calling close on the new Blob is independent of
 calling close on the original Blob.

 * In section 8 (The FileReader Interface)
 - State the FileReader reads directly over the given Blob, and not a copy
 with an independent lifetime.

 * In section 10 (Errors and Exceptions)
 - Addition of a ClosedError. If the File or Blob has had the close method
 called, then for asynchronous read methods the error attribute MUST return a
 “ClosedError” DOMError and synchronous read methods MUST throw a
 ClosedError exception.

 * In section 11.8 (Creating and Revoking a Blob URI)
 - For createObjectURL – If this method is called with a closed Blob
 argument, then user agents must throw a ClosedError exception.

 Similarly to how slice() clones the initial Blob to return one with its own
 independent lifetime, the same notion will be needed in other APIs which
 conceptually clone the data – namely FormData, any place the Structured
 Clone Algorithm is used, and BlobBuilder.

What about:

XHR.send(blob);
blob.close();

or

iframe.src = createObjectURL(blob);
blob.close();

In the second example, if we say that the iframe does copy the blob,
does that mean that closing the blob doesn't automatically revoke the
URL, since it points at the new copy?  Or does it point at the old
copy and fail?
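
For reference while discussing edge cases, here is a rough sketch of the
semantics as proposed: slice() yields a blob with its own lifetime, and
reads on a closed blob fail with a ClosedError. ClosableBlob and this
ClosedError class are hypothetical stand-ins (the real Blob has no close
method), and the sketch copies bytes on slice purely for simplicity. It
does not settle the XHR.send or object-URL questions above.

```javascript
// Hypothetical stand-in for the proposed ClosedError.
class ClosedError extends Error {
  constructor() {
    super('blob is closed');
    this.name = 'ClosedError';
  }
}

// Hypothetical stand-in for a Blob with the proposed close() method.
class ClosableBlob {
  constructor(bytes) {
    this.bytes = Uint8Array.from(bytes);
    this.closed = false;
  }

  // slice() returns a new blob with its own lifetime, as proposed.
  slice(start, end) {
    if (this.closed) throw new ClosedError();
    return new ClosableBlob(this.bytes.slice(start, end));
  }

  // A reader acts directly on this blob's data, not on a copy.
  read() {
    if (this.closed) throw new ClosedError();
    return Array.from(this.bytes);
  }

  close() {
    this.bytes = null;   // release the backing resource
    this.closed = true;
  }
}

const original = new ClosableBlob([1, 2, 3, 4]);
const part = original.slice(1, 3);
original.close();
console.log(part.read());   // [ 2, 3 ]: the slice is unaffected

let errorName = null;
try {
  original.read();
} catch (e) {
  errorName = e.name;
}
console.log(errorName);     // ClosedError
```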

 Similarly to how FileReader must act directly on the Blob’s data, the same
 notion will be needed in other APIs which must act on the data - namely
 XHR.send and WebSocket. These APIs will need to throw an error if called on
 a Blob that was closed and the resources are released.

 We’ve recently implemented this in experimental builds and have seen
 measurable performance improvements.

 The feedback we heard from our discussions with others at TPAC regarding our
 proposal to add a close() method to the Blob interface was that objects in
 the web platform potentially backed by expensive resources should have a
 deterministic way to be released.

 Thanks,
 Feras

 [1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1499.html



Re: FileReader abort, again

2012-03-05 Thread Eric U
On Thu, Mar 1, 2012 at 11:20 AM, Arun Ranganathan
aranganat...@mozilla.com wrote:
 Eric,

   So we could:
   1. Say not to fire a loadend if onloadend or onabort

 Do you mean if onload, onerror, or onabort...?


 No, actually.  I'm looking for the right sequence of steps that results in 
 abort's loadend not firing if terminated by another read*.  Since abort will 
 fire an abort event and a loadend event as spec'd 
 (http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort), if *those* event handlers 
 initiate a readAs*, we could then suppress abort's loadend.  This seems messy.

Ah, right--so a new read initiated from onload or onerror would NOT
suppress the loadend of the first read.  And I believe that this
matches XHR2, so we're good.  Nevermind.



 Actually, if we really want to match XHR2, we should qualify all the
 places that we fire loadend.  If the user calls XHR2's open in onerror
 or onload, that cancels its loadend.  However, a simple check on
 readyState at step 6 won't do it.  Because the user could call
 readAsText in onerror, then call abort in the second read's
 onloadstart, and we'd see readyState as DONE and fire loadend twice.

 To emulate XHR2 entirely, we'd need to have read methods dequeue any
 leftover tasks for previous read methods AND terminate the abort
 algorithm AND terminate the error algorithm of any previous read
 method.  What a mess.


 This may be the way to do it.

 The problem with emulating XHR2 is that open() and send() are distinct 
 concepts in XHR2, but in FileAPI, they are the same.  So in XHR2 an open() 
 canceling abort does make sense; abort() cancels a send(), and thus an open() 
 should cancel an abort().  But in FileAPI, our readAs* methods are equivalent 
 to *both* open() and send().  In FileAPI, an abort() cancels a readAs*; we 
 now have a scenario where a readAs* may cancel an abort().  How to make that 
 clear?

I'm not sure why it's any more confusing that read* is open+send.
read* can cancel abort, and abort can cancel read*.  OK.


 Perhaps there's a simpler way to say successfully calling a read
 method inhibits any previous read's loadend?

 I'm in favor of any shorthand :)  But this may not do justice to each readAs* 
 algorithm being better defined.

Hack 1: Don't call loadend synchronously.  Enqueue it, and let read*
methods clear the queues when they start up.  This differs from XHR,
though, and is a little odd.
Hack 2: Add a virtual generation counter/timestamp, not exposed to
script.  Increment it in read*, check it in abort before sending
loadend.  This is kind of complex, but works [and might be how I end
up implementing this in Chrome].

But really, I don't think either of those is better than just saying,
in read*, something like "terminate the algorithm for any abort
sequence being processed".
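
For comparison, Hack 1 could look roughly like the sketch below: loadend
is enqueued rather than fired synchronously, and a new read discards
whatever is still waiting. QueuedReader, its pending array and runTasks are
hypothetical names; the array is only a stand-in for a real task queue.

```javascript
// Hypothetical model: loadend is queued as a task instead of being fired
// synchronously, and a new read discards any loadend still waiting.
class QueuedReader {
  constructor() {
    this.pending = [];   // stand-in for this reader's task queue
    this.log = [];       // records fired events for inspection
    this.onabort = null;
  }

  read() {
    this.pending.length = 0;   // drop leftover tasks from earlier reads
    this.log.push('loadstart');
  }

  abort() {
    this.log.push('abort');
    // Enqueue loadend before running the handler, so that a read() made
    // from inside the handler can clear it.
    this.pending.push(() => this.log.push('loadend'));
    if (this.onabort) this.onabort();
  }

  // Stand-in for the event loop draining this reader's queued tasks.
  runTasks() {
    while (this.pending.length > 0) this.pending.shift()();
  }
}

const queued = new QueuedReader();
queued.onabort = () => { queued.onabort = null; queued.read(); };
queued.read();
queued.abort();
queued.runTasks();
console.log(queued.log.join(','));  // loadstart,abort,loadstart
```

Here the suppression falls out of the queue handling, at the cost of
loadend no longer being synchronous with abort.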

Eric



Re: [fileapi] timing of readyState changes vs. events

2012-03-02 Thread Eric U
On Thu, Mar 1, 2012 at 11:09 PM, Anne van Kesteren ann...@opera.com wrote:
 On Fri, 02 Mar 2012 01:01:55 +0100, Eric U er...@google.com wrote:

 On Thu, Mar 1, 2012 at 3:16 PM, Arun Ranganathan
 aranganat...@mozilla.com wrote:

 OK, so the change is to ensure that these events are fired directly, and
 not queued, right?  I'll make this change.  This applies to all readAs*
 methods.


 Yup.  It should apply to any event associated with a state change [so
 e.g. onload, but not onloadend].


 Uhm. What you need to do is queue a task that changes the state and fires
 the event. You cannot just fire an event from asynchronous operations.

Pardon my ignorance, but why not?  Is it because you have to define
which task queue gets the operation?
So would that mean that e.g. the current spec for readAsDataURL would
have to queue steps 6 and 8-10?

Anyway, my point was just that load needed to be done synchronously
with the change to readyState, but loadend had no such restriction,
since it wasn't tied to the readyState change.
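
To illustrate the gap I'm worried about, here is a small model in which
a plain array stands in for the task queue. All names (finishSplit,
finishTogether, observe) are made up for the sketch. The first variant
changes readyState directly and queues load separately; the second queues
one task that does both.

```javascript
// Variant 1: state changes directly, the load event is queued separately.
function finishSplit(reader, tasks) {
  reader.readyState = 'DONE';
  tasks.push(() => { if (reader.onload) reader.onload(); });
}

// Variant 2: one queued task both changes the state and fires load.
function finishTogether(reader, tasks) {
  tasks.push(() => {
    reader.readyState = 'DONE';
    if (reader.onload) reader.onload();
  });
}

// Runs one variant and reports what an unrelated script task, already in
// the queue when the read finishes, gets to observe.
function observe(finish) {
  const reader = { readyState: 'LOADING', onload: null };
  const tasks = [];
  let loadFired = false;
  let seen = null;
  reader.onload = () => { loadFired = true; };
  tasks.push(() => { seen = `${reader.readyState}, load fired: ${loadFired}`; });
  finish(reader, tasks);
  while (tasks.length > 0) tasks.shift()();
  return seen;
}

console.log(observe(finishSplit));     // DONE, load fired: false
console.log(observe(finishTogether));  // LOADING, load fired: false
```

In the first variant script sees DONE before load has fired; in the second
the state and the event always change together.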



Re: [fileapi] timing of readyState changes vs. events

2012-03-01 Thread Eric U
On Thu, Mar 1, 2012 at 3:16 PM, Arun Ranganathan
aranganat...@mozilla.com wrote:
 Eric,

 In the readAsText in the latest draft [1] I see that readyState gets
 set to done "When the blob has been read into memory fully."
 I see that elsewhere in the progress notification description, "When
 the data from the blob has been completely read into memory, queue a
 task to fire a progress event called load."  So readyState changes
 separately from the sending of that progress event, since one is
 direct and the other queued, and script could observe the state in
 between.

 In the discussion at [2] we arranged to avoid that for FileWriter.  We
 should do the same for FileReader.

 OK, so the change is to ensure that these events are fired directly, and not 
 queued, right?  I'll make this change.  This applies to all readAs* methods.

Yup.  It should apply to any event associated with a state change [so
e.g. onload, but not onloadend].



Re: Colliding FileWriters

2012-02-29 Thread Eric U
On Mon, Feb 27, 2012 at 4:40 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Mon, Feb 27, 2012 at 11:36 PM, Eric U er...@google.com wrote:
 One working subset would be:

 * Keep createFileWriter async.
 * Make it optionally exclusive [possibly by default].  If exclusive,
 its length member is trustworthy.  If not, it can go stale.
 * Add an append method [needed only for non-exclusive writes, but
 useful for logs, and a safe default].

 This sounds great to me if we make it exclusive by default and remove
 the .length member for non-exclusive writers. Or make it return
 null/undefined.

 I like exclusive-by-default.  Of course, that means that by default
 you have to remember to call close() or depend on GC, but that's
 probably OK.  I'm less sure about .length being unusable on
 non-exclusive writers, but it's growing on me.  Since by default
 writers would be exclusive, length would generally work just the same
 as it does now.  However, if it returns null/undefined in the
 nonexclusive case, users might accidentally do math on it (if (length >
 0) evaluates to false), and get confused.  Perhaps it should throw?
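
A quick illustration of the concern, using only standard JavaScript
coercion rules. The nonExclusiveWriter object is a hypothetical stand-in
showing what a throwing length might feel like to callers; it is not part
of any FileWriter draft.

```javascript
// How a null or undefined length behaves in ordinary expressions.
const results = {
  nullGreaterThanZero: null > 0,            // false
  undefinedGreaterThanZero: undefined > 0,  // false
  nullPlusOne: null + 1,                    // 1
  undefinedPlusOne: undefined + 1,          // NaN
};
console.log(results);

// Hypothetical alternative: a getter that throws, so misuse fails loudly
// instead of silently producing false or NaN.
const nonExclusiveWriter = {
  get length() {
    throw new Error('length is unavailable on a non-exclusive writer');
  },
};
```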

 Also, what's the behavior when there's already an exclusive lock, and
 you call createFileWriter?  Should it just not call you until the
 lock's free?  Do we need a trylock that fails fast, calling
 errorCallback?  I think the former's probably more useful than the
 latter, and you can always use a timer to give up if it takes too
 long, but there's no way to cancel a request, and you might get a call
 far later, when you've forgotten that you requested it.

 However this brings up another problem, which is how to support
 clients that want to mix read and write operations. Currently this is
 supported, but as far as I can tell it's pretty awkward. Every time
 you want to read you have to nest two asynchronous function calls.
 First one to get a File reference, and then one to do the actual read
 using a FileReader object. You can reuse the File reference, but only
 if you are doing multiple reads in a row with no writing in between.

 I thought about this for a while, and realized that I had no good
 suggestion because I couldn't picture the use cases.  Do you have some
 handy that would help me think about it?

 Mixing reading and writing can be something as simple as increasing a
 counter somewhere in the file. First you need to read the counter
 value, then add one to it, then write the new value. But there's also
 more complex operations such as reordering a set of blocks to
 defragment the contents of a file. Yet another example would be
 modifying a .zip file to add a new file. When you do this you'll want
 to first read out the location of the current zip directory, then
 overwrite it with the new file and then the new directory.

That helps, thanks.  So we'll need to be able to do efficient
(read[-modify-write]*), and we'll need to hold the lock for the reads
as well as the writes.  The lock should prevent any other writes
[exclusive or not], but need not prevent unlocked reads.
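
The counter case can be modelled in a few lines. This is only a
simulation: runInterleaved is a hypothetical helper, the "file" is an
in-memory object, and each client's read-modify-write is split into two
steps so a round-robin scheduler can interleave them the way asynchronous
callbacks would.

```javascript
// Simulates two clients incrementing a counter stored in a shared file,
// with or without an exclusive lock held across the read and the write.
function runInterleaved(useLock) {
  const file = { counter: 0, lockedBy: null };
  const clients = ['A', 'B'].map((name) => {
    let value;
    return {
      name,
      steps: [
        () => { value = file.counter; },      // read
        () => { file.counter = value + 1; },  // modify and write
      ],
    };
  });

  // Round-robin: each turn, every client with work left runs one step.
  while (clients.some((c) => c.steps.length > 0)) {
    for (const client of clients) {
      if (client.steps.length === 0) continue;
      if (useLock) {
        if (file.lockedBy === null) file.lockedBy = client.name;  // acquire
        if (file.lockedBy !== client.name) continue;              // wait
      }
      client.steps.shift()();
      if (useLock && client.steps.length === 0) file.lockedBy = null;  // release
    }
  }
  return file.counter;
}

console.log(runInterleaved(false));  // 1: one increment is lost
console.log(runInterleaved(true));   // 2: the lock spans read and write
```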

 We sat down and did some thinking about these two issues. I.e. the
 locking and the read-write-mixed issue. The solution is good news and
 bad news. The good news is that we've come up with something that
 seems like it should work, the bad news is that it's a totally
 different design from the current FileReader and FileWriter designs.

Hmm...it's interesting, but I don't think we necessarily have to scrap
FR and FW to use it.

Here's a modified version that uses the existing interfaces:

interface LockedReaderWriter : FileReader, FileWriter {
[all the FileReader and FileWriter members]

readonly attribute File writeResult;
}

As with your proposal, as long as any read or write method has
outstanding events, the lock is held.  The difference here is that
after any write method completes, and until another one begins or the
lock is dropped, writeResult holds the state of the File as of the
completion of the write.  The rest of the time it's null.  That way
you're always as up-to-date as you can easily be, but no more so [it
doesn't show partial writes during progress events].  To read, you use
the standard FileReader interface, slicing writeResult as needed to
get the appropriate offset.

A potential feature of this design is that you could use it to read a
Blob that didn't come from writeResult, letting you pull in other data
while still holding the lock.  I'm not sure if we need that, but it's
there if we want it.

 To do the locking without requiring calls to .close() or relying on GC
 we use a similar setup to IndexedDB transactions. I.e. you get an
 object which represents a locked file. As long as you use that lock to
 read from and write to the file the lock keeps being held. However as
 soon as you return to the event loop from the last progress
 notification from the last read/write operation, the lock is
 automatically released.
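
As a rough model of the release rule described above, assuming nothing
about the actual proposal beyond what is quoted: the lock counts
outstanding operations and lets go once a completion callback returns
without having started another one. AutoLock, start and deliverNext are
hypothetical names, and deliverNext stands in for the event loop. The
sketch does not model a lock that is created and never used.

```javascript
// Hypothetical model of an auto-releasing file lock.
class AutoLock {
  constructor() {
    this.outstanding = 0;   // operations started but not yet completed
    this.held = true;
    this.queue = [];        // completion callbacks waiting to be delivered
  }

  // Start a read or write; onDone runs when it completes.
  start(onDone) {
    if (!this.held) throw new Error('lock already released');
    this.outstanding++;
    this.queue.push(onDone);
  }

  // Stand-in for the event loop delivering one completion callback.
  deliverNext() {
    const onDone = this.queue.shift();
    onDone();               // the callback may call start() again
    this.outstanding--;
    // Back in the "event loop": release if nothing else is pending.
    if (this.outstanding === 0) this.held = false;
  }
}

const lock = new AutoLock();
const trace = [];
lock.start(() => {
  trace.push('read done');
  lock.start(() => trace.push('write done'));  // keeps the lock alive
});
lock.deliverNext();
trace.push(`held after read: ${lock.held}`);
lock.deliverNext();
trace.push(`held after write: ${lock.held}`);
console.log(trace.join('; '));
```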

I love that your design is [I believe] deadlock-free, as the
write

Re: FileReader abort, again

2012-02-29 Thread Eric U
Incidentally, the way XHR gets around this is to have open cancel any
in-progress abort.  We could certainly do the same thing, having any
readAs* cancel abort().

On Tue, Feb 28, 2012 at 4:15 PM, Eric U er...@google.com wrote:
 I like the Event Invariants writeup at the end.  It's only
 informative, but it is, indeed, informative.

 However, I'm not sure it quite matches the normative text in one
 respect.  Where you say [8.5.6 step 4]: Terminate any steps while
 processing a read method.  Does that also terminate the steps
 associated with an abort that terminated the read method?  Basically
 I'm not sure what steps while processing a read method means.

 Otherwise, if you start a new read in onabort [8.5.6 step 5], you'll
 still deliver the loadend [8.5.6 step 6].
 This contradicts 8.5.9.2.1 Once a loadstart has been fired, a
 corresponding loadend fires at completion of the read, EXCEPT if the
 read method has been cancelled using abort() and a new read method has
 been invoked.

        Eric [copying this into FileWriter]



Re: FileReader abort, again

2012-02-29 Thread Eric U
On Wed, Feb 29, 2012 at 1:43 PM, Arun Ranganathan
aranganat...@mozilla.com wrote:
 FileReader.abort is like a bad penny :)



 However, I'm not sure it quite matches the normative text in one
 respect.  Where you say [8.5.6 step 4]: Terminate any steps while
 processing a read method.  Does that also terminate the steps
 associated with an abort that terminated the read method?  Basically
 I'm not sure what steps while processing a read method means.

 I've changed this to terminate only the read algorithm (and hopefully it is 
 clear this isn't the same as the abort steps):

 http://dev.w3.org/2006/webapi/FileAPI/#terminate-an-algorithm and




 Otherwise, if you start a new read in onabort [8.5.6 step 5], you'll
 still deliver the loadend [8.5.6 step 6].
 This contradicts 8.5.9.2.1 Once a loadstart has been fired, a
 corresponding loadend fires at completion of the read, EXCEPT if the
 read method has been cancelled using abort() and a new read method has
 been invoked.

 This seems like familiar ground, and I'm sorry this contradiction still 
 exists.

 So we could:

 1. Say not to fire a loadend if onloadend or onabort re-initiate a read.  But 
 this may be odd in terms of analyzing a program before.

 2. Simply not fire loadend on abort.  I'm not sure this is a good idea.

 What's your counsel?  Have I missed something easier?

 -- A*

My email must have crossed yours mid-flight, but just in case, how
about speccing that read* methods terminate the abort algorithm?
That's what XHR2 does, and it looks like it works.  It's not the
easiest thing to figure out when reading the spec.  It took me a while
to get my mind around it in XHR2, but then that's a much more
complicated spec.  FileReader's small enough that I think it's not
unreasonable, and of course matching XHR2 means fewer surprises all
around.

Eric



Re: FileReader abort, again

2012-02-29 Thread Eric U
On Wed, Feb 29, 2012 at 2:57 PM, Arun Ranganathan
aranganat...@mozilla.com wrote:
 On Wed, Feb 29, 2012 at 1:43 PM, Arun Ranganathan
  Otherwise, if you start a new read in onabort [8.5.6 step 5], you'll
  still deliver the loadend [8.5.6 step 6].
  This contradicts 8.5.9.2.1 Once a loadstart has been fired, a
  corresponding loadend fires at completion of the read, EXCEPT if the
  read method has been cancelled using abort() and a new read method has
  been invoked.
 
  This seems like familiar ground, and I'm sorry this contradiction
  still exists.
 
  So we could:
 
  1. Say not to fire a loadend if onloadend or onabort re-initiate a
  read.  But this may be odd in terms of analyzing a program before.

Do you mean if onload, onerror, or onabort...?

  2. Simply not fire loadend on abort.  I'm not sure this is a good
  idea.

Agreed.  It should be there unless another read starts.

  What's your counsel?  Have I missed something easier?
 
  -- A*

 My email must have crossed yours mid-flight, but just in case, how
 about speccing that read* methods terminate the abort algorithm?
 That's what XHR2 does, and it looks like it works.  It's not the
 easiest thing to figure out when reading the spec.  It took me a while
 to get my mind around it in XHR2, but then that's a much more
 complicated spec.  FileReader's small enough that I think it's not
 unreasonable, and of course matching XHR2 means fewer surprises all
 around.


 OK, I'll study XHR2 and figure this out.  Spec'ing this isn't a quick win, 
 though, since abort's role is to terminate a read*!  So to have a 
 re-initiated read* terminate an abort will require some thought on invocation 
 order.

I don't see a conflict--abort terminates read, and read terminates abort.

Actually, if we really want to match XHR2, we should qualify all the
places that we fire loadend.  If the user calls XHR2's open in onerror
or onload, that cancels its loadend.  However, a simple check on
readyState at step 6 won't do it.  Because the user could call
readAsText in onerror, then call abort in the second read's
onloadstart, and we'd see readyState as DONE and fire loadend twice.

To emulate XHR2 entirely, we'd need to have read methods dequeue any
leftover tasks for previous read methods AND terminate the abort
algorithm AND terminate the error algorithm of any previous read
method.  What a mess.

Perhaps there's a simpler way to say "successfully calling a read
method inhibits any previous read's loadend"?

[steps 5 and 6 there are missing trailing periods, BTW]



[fileapi] timing of readyState changes vs. events

2012-02-29 Thread Eric U
In the readAsText in the latest draft [1] I see that readyState gets
set to done "When the blob has been read into memory fully."
I see that elsewhere in the progress notification description, "When
the data from the blob has been completely read into memory, queue a
task to fire a progress event called load."  So readyState changes
separately from the sending of that progress event, since one is
direct and the other queued, and script could observe the state in
between.

In the discussion at [2] we arranged to avoid that for FileWriter.  We
should do the same for FileReader.

Eric

[1] http://dev.w3.org/2006/webapi/FileAPI/
[2] http://lists.w3.org/Archives/Public/public-webapps/2010OctDec/0912.html



Re: [file-writer] WebIDL / References

2012-02-28 Thread Eric U
On Sat, Feb 25, 2012 at 5:02 AM, Ms2ger ms2...@gmail.com wrote:
 Hi all,

 There are a number of bugs in the WebIDL blocks in
 http://dev.w3.org/2009/dap/file-system/file-writer.html.

 * The 'in' token has been removed; void append (in Blob data); should
  be void append (Blob data);.

Fixed.

 * Event handlers should be [TreatNonCallableAsNull] Function? onfoo,
  not just Function.

Fixed.

 * Interfaces should not have [NoInterfaceObject] without a good reason.

Fixed.

 * FileException doesn't exist anymore; use DOMException.

Still to come.

 Also, the References section is severely out of date.

Fixed.

 HTH
 Ms2ger




FileReader abort, again

2012-02-28 Thread Eric U
I like the Event Invariants writeup at the end.  It's only
informative, but it is, indeed, informative.

However, I'm not sure it quite matches the normative text in one
respect.  Where you say [8.5.6 step 4]: "Terminate any steps while
processing a read method."  Does that also terminate the steps
associated with an abort that terminated the read method?  Basically
I'm not sure what "steps while processing a read method" means.

Otherwise, if you start a new read in onabort [8.5.6 step 5], you'll
still deliver the loadend [8.5.6 step 6].
This contradicts 8.5.9.2.1: "Once a loadstart has been fired, a
corresponding loadend fires at completion of the read, EXCEPT if the
read method has been cancelled using abort() and a new read method has
been invoked."

Eric [copying this into FileWriter]



Re: [file-writer] WebIDL / References

2012-02-27 Thread Eric U
Thanks!  I'll take care of those.

On Sat, Feb 25, 2012 at 5:02 AM, Ms2ger ms2...@gmail.com wrote:
 Hi all,

 There are a number of bugs in the WebIDL blocks in
 http://dev.w3.org/2009/dap/file-system/file-writer.html.

 * The 'in' token has been removed; void append (in Blob data); should
  be void append (Blob data);.
 * Event handlers should be [TreatNonCallableAsNull] Function? onfoo,
  not just Function.
 * Interfaces should not have [NoInterfaceObject] without a good reason.
 * FileException doesn't exist anymore; use DOMException.

 Also, the References section is severely out of date.

 HTH
 Ms2ger




Re: CfC: publish WD of File API: Writer + File API: Directories and System; deadline March 3

2012-02-27 Thread Eric U
Yeah, the reason is that Arun's more on-the-ball than I am.  I'll be
updating the spec quite soon, I hope.

On Mon, Feb 27, 2012 at 2:35 AM, Felix-Johannes Jendrusch
felix-johannes.jendru...@fokus.fraunhofer.de wrote:
 Hi,

 is there any reason why the File API: Writer and File API: Directories and 
 System specifications still use FileException/FileError-Objects? The File API 
 uses DOM4's DOMException/DOMError [1].

 Best regards,
 Felix

 [1] http://dev.w3.org/2006/webapi/FileAPI/#ErrorAndException




Re: Colliding FileWriters

2012-02-27 Thread Eric U
Sorry about the slow response; I've been busy with dev work, and am
now getting back to spec work.

On Sat, Jan 21, 2012 at 9:57 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, Jan 11, 2012 at 1:41 PM, Eric U er...@google.com wrote:
 On Wed, Jan 11, 2012 at 12:25 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, Jan 10, 2012 at 1:32 PM, Eric U er...@google.com wrote:
 On Tue, Jan 10, 2012 at 1:08 PM, Jonas Sicking jo...@sicking.cc wrote:
 Hi All,

 We've been looking at implementing FileWriter and had a couple of 
 questions.

 First of all, what happens if multiple pages create a FileWriter for
 the same FileEntry at the same time? Will both be able to write to the
 file at the same time and whoever writes last to a given byte wins?

 This isn't currently specified, and that's a hole we should fill.  By
 not having it in the spec, my assumption would be that last-wins would
 hold, but it would be good to clarify it if that's the behavior we
 want.  It's especially important given that there's nothing like
 fflush(), which would help users know what "last" meant.  Speaking of
 which, should we add a flushing mechanism?

 This is different from how file systems normally work since as long as
 a file is open for writing, that tends to prevent other processes from
 opening the same file.

 You're perhaps thinking of windows, where by default files are opened
 in exclusive mode?  On other operating systems, and on windows when
 you specify FILE_SHARE_WRITE in dwShareMode in CreateFile, multiple
 writers can exist simultaneously.

 Ah. I didn't realize this was different on other OSs. It still seems
 risky to not provide any means to get exclusive access. The only way I
 can see websites dealing with this is to create their own locking
 mechanism backed by using IndexedDB transactions as low-level atomic
 primitive (local-storage doesn't work since you can't implement
 compare-and-swap in an atomic manner).

 Having an 'exclusive' flag for createFileWriter seems much easier and
 removes the IndexedDB dependency. I'd probably even say that it should
 default to true since on the web defaulting to safe rather than fast
 generally results in fewer bugs.

 I don't think I'd generally be averse to this.  However, it would then
 require some sort of a revocation mechanism as well.  If you're done
 with your FileWriter, you want to be able to get rid of it without
 depending on GC, so that another context can create one.  And if you
 forget to revoke it, behavior in the second context presumably depends
 on GC, which is a bit ugly.

 I definitely agree that we need an explicit revoking mechanism. We
 have a similar situation in IndexedDB where as long as an IDBDatabase
 object is alive for a given database, no one can upgrade the database
 version. Here we do have an explicit .close() method, but if you
 forget to call it you end up waiting for GC. It's possibly somewhat
 less of a problem in IndexedDB though since upgrading database
 versions should be pretty rare.

 I'm not quite sure how urgent this is yet, though.  I've been assuming
 that if you have transactional/synchronization semantics you want to
 maintain, you'll be using IDB anyway, or a server handshake, etc.  But
 of course it's easy to write a naive app that the user loads in two
 windows, with bad effect.

 Yeah, it's the "user opens page in two windows" scenario that I'm
 concerned about. As well as similar conditions if you for example have
 a Worker thread which holds a connection to the server and
 occasionally writes data to a file based on information from the
 server, and code in a window which reads data from the file and acts
 on it.

If the window is only reading, not writing, I don't see the problem
with the current design.
If the worker and window are both reading and writing, in the same
file, the problem might be in the app's design.

 I don't think we can relegate synchronization semantics to IDB. I
 think we should have synchronization semantics at least as the default
 mode for all data that is shared between Workers and Windows which can
 be running on different threads. One great example is localStorage
 which we spent a lot of effort on trying to make synchronized using
 the storage mutex. We failed there, but not due to a lack of desire,
 but due to the way the API is structured.

 Though if we add the 'exclusive' flag described above, then we'll need
 to keep createFileWriter async anyway.

 Right--I think we should pick whatever subset of these suggestions
 seems the most useful, since they overlap a bit.

 Agreed.

 One working subset would be:

 * Keep createFileWriter async.
 * Make it optionally exclusive [possibly by default].  If exclusive,
 its length member is trustworthy.  If not, it can go stale.
 * Add an append method [needed only for non-exclusive writes, but
 useful for logs, and a safe default].

 This sounds great to me if we make it exclusive by default and remove
 the .length member for non-exclusive writers. Or make

Re: [FileAPI, common] UTF-16 to UTF-8 conversion

2012-02-27 Thread Eric U
What I can do is procrastinate until we agree that BlobBuilder is
deprecated, and this is now the problem of the Blob constructor.  Over
to you, Arun and Jonas.

On Mon, Sep 26, 2011 at 11:45 AM, Eric U er...@google.com wrote:
 Thanks Glenn and Simon--I'll see what I can do.


 On Fri, Sep 23, 2011 at 1:34 AM, Simon Pieters sim...@opera.com wrote:

 On Fri, 23 Sep 2011 01:40:44 +0200, Glenn Maynard gl...@zewt.org wrote:

 BlobBuilder.append(text) says:

 Appends the supplied text to the current contents of the BlobBuilder,

 writing it as UTF-8, converting newlines as specified in endings.

 It doesn't elaborate any further.  The conversion from UTF-16 to UTF-8
 needs
 to be defined, in particular for the edge case of invalid UTF-16
 surrogates.  If this is already defined somewhere, it isn't referenced.

 I suppose this would belong in Common infrastructure, next to the
 existing
 section on UTF-8, not in FileAPI itself.


 WebSocket send() throws SYNTAX_ERR if its argument contains unpaired
 surrogates. It would be nice to be consistent.

 --
 Simon Pieters
 Opera Software
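
A minimal sketch of the two policies discussed above for unpaired
surrogates, written in JavaScript. The function names are hypothetical
and not part of any spec. The strict variant rejects the string, in the
spirit of the WebSocket send() behavior Simon mentions; the lossy
variant relies on TextEncoder, which replaces each unpaired surrogate
with U+FFFD before encoding.

```javascript
// Returns the index of the first unpaired UTF-16 surrogate in `str`, or -1.
function findLoneSurrogate(str) {
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: it must be followed by a low surrogate.
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the second half of a valid pair
      } else {
        return i;
      }
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      return i; // low surrogate with no preceding high surrogate
    }
  }
  return -1;
}

// Policy 1 (hypothetical): refuse to convert ill-formed UTF-16.
function toUtf8Strict(str) {
  const at = findLoneSurrogate(str);
  if (at !== -1) {
    throw new SyntaxError("unpaired surrogate at index " + at);
  }
  return new TextEncoder().encode(str);
}

// Policy 2: let TextEncoder substitute U+FFFD for each unpaired surrogate.
function toUtf8Lossy(str) {
  return new TextEncoder().encode(str);
}
```

Either policy gives a defined result for every input string; the open
question in the thread is only which one the spec should pick.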





Re: [XHR] Invoking open() from event listeners

2012-01-17 Thread Eric U
On Tue, Dec 20, 2011 at 9:24 AM, Anne van Kesteren ann...@opera.com wrote:
 Sorry for restarting this thread, but it seems we did not reach any
 conclusions last time around.

 On Thu, 03 Nov 2011 00:07:48 +0100, Eric U er...@google.com wrote:

 I think I may have missed something important.  XHR2 specs just this
 behavior w.r.t. abort [another open will stop the abort's loadend] but
 /doesn't/ spec that for error or load.  That is, an open() in onerror
 or onload does not appear to cancel the pending loadend.  Anne, can
 you comment on why?


 I think I did not consider that scenario closely enough when I added support
 for these progress events.

 open() does terminate both abort() and send() (the way it does so is not
 very clear), but maybe it would be clearer if invoking open() set some kind
 of flag that is checked by both send() and abort() from the moment they
 start dispatching events.

 http://dvcs.w3.org/hg/xhr/raw-file/tip/Overview.html

Ah, I see how that works now.  So if you call open from
onerror/onabort/onload, there's no loadend from the terminated XHR.
And if you call open before onerror/onabort/onload, you don't get any
of those either?

If you call open from onerror, do other listeners later in the chain
get their onerror calls?
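
One way to picture the flag Anne describes is a counter that open()
bumps and that the dispatching code checks between events. The sketch
below is a toy model, not XMLHttpRequest: SketchRequest, fire() and
failRequest() are hypothetical names, and it assumes that listeners
already queued for the current event still run after one of them calls
open(), which is exactly the point the question above asks about.

```javascript
class SketchRequest {
  constructor() {
    this.listeners = {};
    this.generation = 0; // bumped by every open()
    this.log = [];       // record of what happened, for inspection
  }

  on(type, fn) {
    if (!this.listeners[type]) this.listeners[type] = [];
    this.listeners[type].push(fn);
  }

  fire(type) {
    this.log.push(type);
    for (const fn of this.listeners[type] || []) fn.call(this);
  }

  open() {
    this.generation++;
    this.log.push("open");
  }

  // Simulates a request that fails: loadstart, error, then loadend,
  // unless open() was called from a handler along the way.
  failRequest() {
    const started = this.generation;
    this.fire("loadstart");
    if (this.generation !== started) return;
    this.fire("error");
    if (this.generation !== started) return; // open() terminated this request
    this.fire("loadend");
  }
}
```

With no open() call the log reads loadstart, error, loadend; with an
open() inside an error handler it stops after the error event.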

 Glenn suggested not allowing open() at all, but I think for XMLHttpRequest
 we are past that (we have e.g. the readystatechange event which has been
 around since XMLHttpRequest support was added and open() is definitely
 called from it in the wild).


 --
 Anne van Kesteren
 http://annevankesteren.nl/



Re: File modification

2012-01-11 Thread Eric U
On Wed, Jan 11, 2012 at 12:22 PM, Charles Pritchard ch...@jumis.com wrote:
 On 1/11/2012 9:00 AM, Glenn Maynard wrote:


 This isn't properly specced anywhere and may be impossible to implement
 perfectly, but previous discussions indicated that Chrome, at least, wanted
 File objects loaded from input elements to only represent access for the
 file as it is when the user opened it.  That is, the File is immutable (like
 a Blob), and if the underlying OS file changes (thus making the original
 data no longer available), attempting to read the File would fail.  (This
 was in the context of storing File in structured clone persistent storage,
 like IndexedDB.)


 Mozilla seems to only take a snapshot when the user opens the file. Chrome
 goes in the other direction, and does so intentionally with FileEntry.
 I'd prefer everyone follow Chrome.

We do so with FileEntry, in the sandbox, because it's intended to be a
much more powerful API than File, and the security aspects of it are
much simpler.  When the user drags a File into the browser, it's much
less clear that they intend to give the web app persistent access to
that File, including all future changes until the page is closed.  I
don't think we'd rush to make that change to the spec.  And if our
implementation isn't snapshotting currently, that's a bug.
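
A minimal sketch of the snapshot behavior described above, using an
in-memory map as a stand-in for the OS file system. SnapshotFile,
writeDiskFile and the version counter are hypothetical; a real
implementation would more likely compare modification times or sizes.

```javascript
// In-memory stand-in for the underlying OS files (hypothetical).
const disk = new Map();

function writeDiskFile(path, data) {
  const prev = disk.get(path);
  disk.set(path, { data, version: prev ? prev.version + 1 : 1 });
}

class SnapshotFile {
  constructor(path) {
    const entry = disk.get(path);
    if (!entry) throw new Error("NotFoundError: " + path);
    this.path = path;
    this.version = entry.version; // state at the time the user handed it over
    this.size = entry.data.length;
  }

  // Reading succeeds only while the underlying file is unchanged.
  read() {
    const entry = disk.get(this.path);
    if (!entry || entry.version !== this.version) {
      throw new Error("NotReadableError: file changed since it was selected");
    }
    return entry.data;
  }
}
```

The File never shows new data: it either returns what the user
originally selected or fails.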

 The spec on this could be nudged slightly to support Chrome's existing
 behavior.

 From dragdrop:
 http://www.whatwg.org/specs/web-apps/current-work/multipage/dnd.html
 The files attribute must return a live FileList sequence

 http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#live
 If a DOM object is said to be live, then the attributes and methods on that
 object must operate on the actual underlying data, not a snapshot of the
 data.

 Dragdrop continues:
 for a given FileList object and a given underlying file, the same File
 object must be used each time.

 Given that the underlying file can change, and the FileList sequence is
 live, it seems reasonable that subsequent reads of FileList would access a
 different File object when the underlying file has changed.

 FileList.onchanged would be appropriate. File.onupdated would not be
 appropriate. Entry.onupdated would be appropriate.


 I have one major technical concern: monitoring files for changes isn't
 free.  With only a DOM event, all instantiated Files (or Entries) would have
 to monitor changes; you don't want to depend on "do something if an event
 handler is registered", since that violates the principle of event handler
 registration having no other side-effects.  Monitoring should be enabled
 explicitly.

 I also wonder whether this could be implemented everywhere, eg. on mobile
 systems.


 At this point, iOS still doesn't allow <input type=file> nor dataTransfer
 of file. So, we're looking far ahead.

 A system may send a FileList.onchanged() event when it notices that the
 FileList has been updated. It can be done on access of a live FileList when
 a mutation is detected. It could be done by occasional polling, or it could
 be done via notify-style OS hooks. In the first case, there is no
 significant overhead. webkitdirectory returns a FileList object that can be
 monitored via directory notification hooks; again, if the OS supports it.

 Event handlers have some side effects, but not in the scripting environment.
 onclick, for example, may mean that an element responds to touch events in
 the mobile environment.


 -Charles





Re: Colliding FileWriters

2012-01-11 Thread Eric U
On Wed, Jan 11, 2012 at 12:25 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, Jan 10, 2012 at 1:32 PM, Eric U er...@google.com wrote:
 On Tue, Jan 10, 2012 at 1:08 PM, Jonas Sicking jo...@sicking.cc wrote:
 Hi All,

 We've been looking at implementing FileWriter and had a couple of questions.

 First of all, what happens if multiple pages create a FileWriter for
 the same FileEntry at the same time? Will both be able to write to the
 file at the same time and whoever writes last to a given byte wins?

 This isn't currently specified, and that's a hole we should fill.  By
 not having it in the spec, my assumption would be that last-wins would
 hold, but it would be good to clarify it if that's the behavior we
 want.  It's especially important given that there's nothing like
 fflush(), which would help users know what "last" meant.  Speaking of
 which, should we add a flushing mechanism?

 This is different from how file systems normally work, since as long as
 a file is open for writing that tends to prevent other processes from
 opening the same file.

 You're perhaps thinking of Windows, where by default files are opened
 in exclusive mode?  On other operating systems, and on Windows when
 you specify FILE_SHARE_WRITE in dwShareMode in CreateFile, multiple
 writers can exist simultaneously.

 Ah. I didn't realize this was different on other OSs. It still seems
 risky to not provide any means to get exclusive access. The only way I
 can see websites dealing with this is to create their own locking
 mechanism backed by using IndexedDB transactions as low-level atomic
 primitive (local-storage doesn't work since you can't implement
 compare-and-swap in an atomic manner).

 Having an 'exclusive' flag for createFileWriter seems much easier and
 removes the IndexedDB dependency. I'd probably even say that it should
 default to true since on the web defaulting to safe rather than fast
 generally results in fewer bugs.

I don't think I'd generally be averse to this.  However, it would then
require some sort of a revocation mechanism as well.  If you're done
with your FileWriter, you want to be able to get rid of it without
depending on GC, so that another context can create one.  And if you
forget to revoke it, behavior in the second context presumably depends
on GC, which is a bit ugly.

I'm not quite sure how urgent this is yet, though.  I've been assuming
that if you have transactional/synchronization semantics you want to
maintain, you'll be using IDB anyway, or a server handshake, etc.  But
of course it's easy to write a naive app that the user loads in two
windows, with bad effect.

 A second question is why is FileEntry.createWriter asynchronous? It
 doesn't actually do any IO and so it seems like it could return an
 answer synchronously.

 FileWriter has a synchronous length property, just as Blob does, so it
 needs to do IO at creation time to look it up.

 So how does this work if you have two tabs running in different
 processes create FileWriters for the same FileEntry. Each tab could
 end up changing the file's size, in which case the other tab's
 FileWriter will either have to synchronously update its .length, or it
 will have an outdated length.

 So the IO you do when creating the FileWriter is basically unreliable
 as soon as it's done.

 So it seems like you could get the size when creating the FileEntry
 and then use that cached size when creating FileWriter instance.

The size in the FileEntry is no more reliable than that in the
FileWriter, of course.  But if you know you're the only writer,
either's good.

 Though I wonder if it wouldn't be better to remove the .length
 property. If anything we could add an asynchronous length getter or a
 write method which appends to the end of the file (since writing is
 already asynchronous).

A new async length getter's not needed; you can use file() for that already.
I didn't originally add append due to its apparent redundancy with
seek+write, but as you point out, seek+write doesn't guarantee to
append if there are multiple writers.
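
To make the difference concrete, here is a toy model in JavaScript of
two writers sharing one file. SharedFile and SketchWriter are
hypothetical stand-ins, not the FileWriter interface; the point is only
that seek(length)+write trusts a cached length, while append looks up
the real end of the file at write time.

```javascript
class SharedFile {
  constructor() { this.bytes = []; }
  contents() { return this.bytes.join(""); }
}

class SketchWriter {
  constructor(file) {
    this.file = file;
    this.length = file.bytes.length; // cached at creation; may go stale
    this.position = 0;
  }

  seek(offset) { this.position = offset; }

  write(data) {
    for (let i = 0; i < data.length; i++) {
      this.file.bytes[this.position + i] = data[i];
    }
    this.position += data.length;
  }

  // Always targets the file's actual end at the moment of the write.
  append(data) {
    this.seek(this.file.bytes.length);
    this.write(data);
  }
}
```

If writers a and b are both created on an empty file, then
a.seek(a.length); a.write("AAA") followed by b.seek(b.length);
b.write("BB") leaves "BBA" in the file, because b's cached length is
still 0. The same two calls made through append() leave "AAABB".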

 Though if we add the 'exclusive' flag described above, then we'll need
 to keep createFileWriter async anyway.

Right--I think we should pick whatever subset of these suggestions
seems the most useful, since they overlap a bit.

One working subset would be:

* Keep createFileWriter async.
* Make it optionally exclusive [possibly by default].  If exclusive,
its length member is trustworthy.  If not, it can go stale.
* Add an append method [needed only for non-exclusive writes, but
useful for logs, and a safe default].
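
A rough sketch of how the exclusive option and the revocation mechanism
mentioned earlier might fit together. Everything here is hypothetical:
createWriterSketch, the close() method and the error text are invented
for illustration, and the call is synchronous only to keep the example
short, whereas the proposal keeps createFileWriter async.

```javascript
// path -> { exclusive: boolean, holders: number }
const locks = new Map();

function createWriterSketch(path, { exclusive = true } = {}) {
  const lock = locks.get(path);
  // An exclusive request conflicts with any existing writer, and any
  // request conflicts with an existing exclusive writer.
  if (lock && (lock.exclusive || exclusive)) {
    throw new Error("InvalidStateError: " + path + " already has a writer");
  }
  if (lock) lock.holders++;
  else locks.set(path, { exclusive, holders: 1 });

  let open = true;
  return {
    exclusive,
    // Explicit revocation, so the lock does not depend on GC.
    close() {
      if (!open) return;
      open = false;
      const current = locks.get(path);
      if (--current.holders === 0) locks.delete(path);
    },
  };
}
```

A second context that asks for the same path while an exclusive writer
is open fails immediately, and succeeds again once close() has been
called.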

 Would this also explain why FileEntry.getFile is asynchronous? I.e. it
 won't call its callback until all current FileWriters have been
 closed?

 Nope.  It's asynchronous because a File is a Blob, and has a
 synchronous length accessor, so we look up the length when we mint the
 File.  Note that the length can go stale if you have multiple writers,
 as we want to keep it fast.

 This reminds me

Re: File modification

2012-01-10 Thread Eric U
On Tue, Jan 10, 2012 at 1:29 PM, Charles Pritchard ch...@visc.us wrote:
 Modern operating systems have efficient mechanisms to send a signal when a 
 watched file or directory is modified.

 File and FileEntry have a last modified date-- currently we must poll entries 
 to see if the modification date changes. That works completely fine in 
 practice, but it doesn't give us a chance to exploit the efficiency of some 
 operating systems in notifying applications about file updates.

 So as a strawman: a File.onupdated event handler may be useful.

It seems like it would be most useful if the File or FileEntry points
to a file outside the sandbox defined by the FileSystem spec.  Does
any browser currently supply such a thing?  Chrome currently
implements this [with FileEntry] only for ChromeOS components that are
implemented as extensions.  Does any browser let you have a File
outside the sandbox *and* update its modification time?

If you're dealing only with FileEntries inside the sandbox, there are
already more efficient ways to tell yourself that you've changed
something.
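
For reference, the polling approach Charles describes boils down to
comparing two snapshots of modification times. A minimal sketch, with
hypothetical names and plain Maps standing in for whatever the page
collects from its File or FileEntry objects:

```javascript
// previous / current: Map of file name -> last-modified timestamp (ms).
function diffModificationTimes(previous, current) {
  const changes = [];
  for (const [name, mtime] of current) {
    if (!previous.has(name)) {
      changes.push({ name, kind: "added" });
    } else if (previous.get(name) !== mtime) {
      changes.push({ name, kind: "modified" });
    }
  }
  for (const name of previous.keys()) {
    if (!current.has(name)) changes.push({ name, kind: "removed" });
  }
  return changes;
}
```

An OS-level notification would replace the periodic snapshot, not the
comparison; the page still has to decide what counts as a change.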



Re: Colliding FileWriters

2012-01-10 Thread Eric U
On Tue, Jan 10, 2012 at 1:08 PM, Jonas Sicking jo...@sicking.cc wrote:
 Hi All,

 We've been looking at implementing FileWriter and had a couple of questions.

 First of all, what happens if multiple pages create a FileWriter for
 the same FileEntry at the same time? Will both be able to write to the
 file at the same time and whoever writes last to a given byte wins?

This isn't currently specified, and that's a hole we should fill.  By
not having it in the spec, my assumption would be that last-wins would
hold, but it would be good to clarify it if that's the behavior we
want.  It's especially important given that there's nothing like
fflush(), which would help users know what "last" meant.  Speaking of
which, should we add a flushing mechanism?

 This is different from how file systems normally work, since as long as
 a file is open for writing that tends to prevent other processes from
 opening the same file.

You're perhaps thinking of Windows, where by default files are opened
in exclusive mode?  On other operating systems, and on Windows when
you specify FILE_SHARE_WRITE in dwShareMode in CreateFile, multiple
writers can exist simultaneously.
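
The last-write-wins behavior is easy to observe outside the browser.
The sketch below uses Node.js and its fs module, which is an assumption
of this example and has nothing to do with the FileWriter API itself.
It opens the same temporary file through two descriptors and writes
overlapping ranges.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

function demoSharedWriters() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "writers-"));
  const file = path.join(dir, "shared.txt");
  fs.writeFileSync(file, "..........");

  // Two independent descriptors on the same file, both open for writing.
  const a = fs.openSync(file, "r+");
  const b = fs.openSync(file, "r+");
  try {
    fs.writeSync(a, "AAAAAA", 0); // bytes 0-5
    fs.writeSync(b, "BBBB", 4);   // bytes 4-7, overlapping a's last two bytes
  } finally {
    fs.closeSync(a);
    fs.closeSync(b);
  }

  const result = fs.readFileSync(file, "utf8");
  fs.rmSync(dir, { recursive: true, force: true });
  return result;
}
```

Neither open fails, and for the overlapping bytes whichever write
lands last is what stays in the file.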

 A second question is why is FileEntry.createWriter asynchronous? It
 doesn't actually do any IO and so it seems like it could return an
 answer synchronously.

FileWriter has a synchronous length property, just as Blob does, so it
needs to do IO at creation time to look it up.

 Is the intended way for this to work that FileEntry.createWriter acts
 as a choke point and ensures that only one active FileWriter for a
 given FileEntry exists at the same time. I.e. if one page creates a
 FileWriter for a FileEntry and starts writing to it, any other caller
 to FileEntry.createWriter will wait to fire its callback until the
 first caller was done with its FileWriter. If that is the intended
 design I would have expected FileWriter to have an explicit .close()
 function though. Having to wait for GC to free a lock is always a bad
 idea.

Agreed re: GC, but currently in Chromium there is no choke point, and
one can create multiple writers, which can stomp on each others'
writes if that's what the user requests.  That being said, we don't
really hold files open right now, except during a write call.  In
between writes, we close the file, so while collisions are possible,
more likely one write will win entirely.  But we are opening the files
in shared mode.

 Would this also explain why FileEntry.getFile is asynchronous? I.e. it
 won't call its callback until all current FileWriters have been
 closed?

Nope.  It's asynchronous because a File is a Blob, and has a
synchronous length accessor, so we look up the length when we mint the
File.  Note that the length can go stale if you have multiple writers,
as we want to keep it fast.

 These questions both apply to what's the intended behavior spec-wise,
 as well as what does Google Chrome do in the current implementation.

I'm personally OK with the current Chrome implementation, which does
no locking.  If users want transactional behavior, there are better
ways to get that via databases.  But I'm open to discussion.



Re: Bug in file system Api specification

2011-12-21 Thread Eric U
Bronislav:

Thanks for the tip; it's already fixed in the latest editor's
draft, so the fix will get published the next time the document is.
See the latest at
http://dev.w3.org/2009/dap/file-system/file-dir-sys.html.

 Eric

On Wed, Dec 21, 2011 at 12:21 AM, Bronislav Klučka
bronislav.klu...@bauglir.com wrote:
 Hi
 http://www.w3.org/TR/file-system-api/#widl-FileEntry-file
 says that successCallback is "A callback that is called with the
 new FileWriter"; it should be "A callback that is called with the File".

 BTW I was trying to file that bug myself, but I could not find a suitable
 component in WebAppsWG product.

 Brona



Re: Is BlobBuilder needed?

2011-11-15 Thread Eric U
On Tue, Nov 15, 2011 at 5:41 AM, Rich Tibbett ri...@opera.com wrote:
 Jonas Sicking wrote:

 Hi everyone,

 It was pointed out to me on twitter that BlobBuilder can be replaced
 with simply making Blob constructable. I.e. the following code:

 var bb = new BlobBuilder();
 bb.append(blob1);
 bb.append(blob2);
 bb.append("some string");
 bb.append(myArrayBuffer);
 var b = bb.getBlob();

 would become

 b = new Blob([blob1, blob2, "some string", myArrayBuffer]);

 or look at it another way:

 var x = new BlobBuilder();
 becomes
 var x = [];

 x.append(y);
 becomes
 x.push(y);

 var b = x.getBlob();
 becomes
 var b = new Blob(x);

 So at worst there is a one-to-one mapping in code required to simply
 have |new Blob|. At best it requires much fewer lines if the page has
 several parts available at once.

 And we'd save a whole class since Blobs already exist.

 Following the previous discussion (which seemed to raise no major
 objections) can we expect to see this in the File API spec sometime soon
 (assuming that spec is the right home for this)?

 This will require a coordinated edit to coincide with the removal of
 BlobBuilder from the File Writer API, right?

It need not be all that coordinated.  I can take it out [well...mark
it deprecated, pending implementation changes] any time after the Blob
constructor goes into the File API.
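
During such a transition, a page could keep old call sites working with
a thin wrapper over the Blob constructor. This is only a sketch:
BlobBuilderShim is a hypothetical name, it ignores the endings
argument, and it assumes an environment with a global Blob whose
constructor accepts an array of parts and a { type } option.

```javascript
class BlobBuilderShim {
  constructor() {
    this.parts = [];
  }

  // Accepts anything the Blob constructor accepts as a part:
  // strings, Blobs, ArrayBuffers and typed arrays.
  append(part) {
    this.parts.push(part);
  }

  getBlob(contentType) {
    const options = contentType ? { type: contentType } : {};
    return new Blob(this.parts, options);
  }
}
```

The wrapper is nothing more than the array-plus-constructor mapping
Jonas lists above, which is the argument for dropping the class.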

 Thanks,

 Rich


 / Jonas




Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-11-02 Thread Eric U
On Mon, Oct 3, 2011 at 6:13 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Mon, Oct 3, 2011 at 5:57 PM, Glenn Maynard gl...@zewt.org wrote:
 On Mon, Oct 3, 2011 at 8:10 PM, Jonas Sicking jo...@sicking.cc wrote:

 1. Make loadend not fire in case a new load is started from
 onabort/onload/onerror. Thus loadend and loadstart aren't always
 paired up. Though there is always a loadend fired after every
 loadstart.
 2. Make FileReader/FileWriter/FileSaver not behave like XHR. This also

 leaves the problem unsolved for XHR.

 Are there other options I'm missing?

 Or do both, improving XHR as much as backwards-compatibility allows and
 don't try to match other APIs to it exactly.  I'd much prefer weirdness be
 isolated to XHR than be perpetuated through every PE-based API.

 So what exactly are you proposing we do for XHR and for FileReader/FileWriter?

 I'm still not convinced that it's better for authors to require them
 to use setTimeout to start a new load as opposed to let them restart
 the new load from within an event and cancel all following events. I
 agree that this introduces some inconsistency, but it only does so
 when authors explicitly reuse a FileReader/XHR/FileWriter for
 multiple requests.

 And it only weakens the invariant, not removes it. So instead of

 * There's exactly one 'loadend' event for each 'loadstart' event.

 we'll have

 * There's always a 'loadend' event fired after each 'loadstart' event.
 However there might be other 'loadstart' events fired in between.

I'm for this.  It lets FileReader and FileWriter match XHR, avoids [in
the odd case] long strings of stacked-up loadend events, and users can
avoid all the issues either by creating a new FileReader or by
wrapping nested calls in timers if they care.  I believe Jonas is in
favor of this as well.

Can we put this one to bed?

 Eric



Re: [File API] Calling requestFileSystem with bad filesystem type

2011-11-02 Thread Eric U
On Fri, Oct 7, 2011 at 12:02 PM, Mark Pilgrim pilg...@google.com wrote:
 What should this do?

 requestFileSystem(2, 100, successCallback); // assume successCallback
 is defined properly

requestFileSystem doesn't throw, so you should get an errorCallback
call.  You haven't provided an errorCallback, so you should get a
silent failure.

It does seem like an error we could identify quickly enough to throw,
though, and in general I favor fail-fast for obviously bad parameters.
 Opinions?
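
The two behaviors can be compared side by side in a small sketch.
requestFileSystemSketch and its failFast parameter are hypothetical,
the callbacks are invoked synchronously only to keep the example short,
and the constants mirror the TEMPORARY = 0 and PERSISTENT = 1 values
from the draft.

```javascript
const TEMPORARY = 0;
const PERSISTENT = 1;

function requestFileSystemSketch(type, size, successCallback, errorCallback, failFast) {
  const valid = type === TEMPORARY || type === PERSISTENT;

  if (!valid && failFast) {
    // Fail-fast variant: an obviously bad parameter throws right away.
    throw new TypeError("unknown filesystem type: " + type);
  }
  if (!valid) {
    // Current variant: report through errorCallback, or fail silently
    // when the caller did not supply one.
    if (errorCallback) errorCallback(new Error("unknown filesystem type: " + type));
    return;
  }
  successCallback({ name: type === TEMPORARY ? "temporary" : "persistent", size });
}
```

With the call from Mark's question, type 2 and no errorCallback, the
current variant does nothing observable, while the fail-fast variant
raises an exception at the call site.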

Eric



Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-11-02 Thread Eric U
On Wed, Nov 2, 2011 at 3:56 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, Nov 2, 2011 at 9:56 AM, Eric U er...@google.com wrote:
 On Mon, Oct 3, 2011 at 6:13 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Mon, Oct 3, 2011 at 5:57 PM, Glenn Maynard gl...@zewt.org wrote:
 On Mon, Oct 3, 2011 at 8:10 PM, Jonas Sicking jo...@sicking.cc wrote:

 1. Make loadend not fire in case a new load is started from
 onabort/onload/onerror. Thus loadend and loadstart aren't always
 paired up. Though there is always a loadend fired after every
 loadstart.
 2. Make FileReader/FileWriter/FileSaver not behave like XHR. This also

 leaves the problem unsolved for XHR.

 Are there other options I'm missing?

 Or do both, improving XHR as much as backwards-compatibility allows and
 don't try to match other APIs to it exactly.  I'd much prefer weirdness be
 isolated to XHR than be perpetuated through every PE-based API.

 So what exactly are you proposing we do for XHR and for 
 FileReader/FileWriter?

 I'm still not convinced that it's better for authors to require them
 to use setTimeout to start a new load as opposed to let them restart
 the new load from within an event and cancel all following events. I
 agree that this introduces some inconsistency, but it only does so
 when authors explicitly reuse a FileReader/XHR/FileWriter for
 multiple requests.

 And it only weakens the invariant, not removes it. So instead of

 * There's exactly one 'loadend' event for each 'loadstart' event.

 we'll have

 * There's always a 'loadend' event fired after each 'loadstart' event.
 However there might be other 'loadstart' events fired in between.

 I'm for this.  It lets FileReader and FileWriter match XHR, avoids [in
 the odd case] long strings of stacked-up loadend events, and users can
 avoid all the issues either by creating a new FileReader or by
 wrapping nested calls in timers if they care.  I believe Jonas is in
 favor of this as well.

 Can we put this one to bed?

 So the proposal here is to allow new loads to be started from within
 abort/error/load event handlers, and for loadend to *not* fire if a
 new load has already started by the time the abort/error/load event is
 done firing. And the goal is that XMLHttpRequest, FileReader and
 FileWriter all behave this way. Is this correct?

I think I may have missed something important.  XHR2 specs just this
behavior w.r.t. abort [another open will stop the abort's loadend] but
/doesn't/ spec that for error or load.  That is, an open() in onerror
or onload does not appear to cancel the pending loadend.  Anne, can
you comment on why?

 If so, I agree that this sounds like a good solution.

 / Jonas




Re: Is BlobBuilder needed?

2011-10-26 Thread Eric U
On Wed, Oct 26, 2011 at 4:14 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, Oct 25, 2011 at 12:57 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 On Tue, Oct 25, 2011 at 12:53 PM, Ojan Vafai o...@chromium.org wrote:
 The new API is smaller and simpler. Less to implement and less for web
 developers to understand. If it can meet all our use-cases without
 significant performance problems, then it's a win and we should do it.

 For line-endings, you could have the Blob constructor also take an optional
 endings argument:
 new Blob(String|Array|Blob|ArrayBuffer data, [optional] String contentType,
 [optional] String endings);

 I believe (or at least, I maintain) that we're trying to do
 dictionaries for this sort of thing.  Multiple optional arguments are
 *horrible* unless they are truly, actually, order-dependent such that
 you wouldn't ever specify a later one without already specifying a
 former one.

 I don't have a super strong opinion. I will however note that I think
 it'll be very common to specify a content-type, but much much more
 rare to specify any of the other types. But maybe using the syntax

 b = new Blob([foo, bar], { contentType: "text/plain" });

 isn't too bad. The other properties that I could think of that we'd
 want to add sometime in the future would be encoding for strings,
 including endianness for utf16 strings.

That looks good to me.  Endings can go in there, if we keep it.
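
As a sketch of the dictionary form with endings included: the thread
uses contentType as the key, whereas the example below assumes the
key names type and endings, which is how the Blob constructor's options
bag is spelled in the File API as later standardized. makeTextBlob is a
hypothetical helper.

```javascript
// Builds a text Blob from an array of string parts.
// `endings` is "transparent" (leave newlines alone) or "native".
function makeTextBlob(parts, endings) {
  return new Blob(parts, {
    type: "text/plain",
    endings: endings || "transparent",
  });
}

const blob = makeTextBlob(["line one\n", "line two\n"]);
```

New options can be added to the dictionary later without touching the
positional arguments, which is the advantage Tab points out.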



Re: Is BlobBuilder needed?

2011-10-24 Thread Eric U
On Mon, Oct 24, 2011 at 3:52 PM, Jonas Sicking jo...@sicking.cc wrote:
 Hi everyone,

 It was pointed out to me on twitter that BlobBuilder can be replaced
 with simply making Blob constructable. I.e. the following code:

 var bb = new BlobBuilder();
 bb.append(blob1);
 bb.append(blob2);
 bb.append("some string");
 bb.append(myArrayBuffer);
 var b = bb.getBlob();

 would become

 b = new Blob([blob1, blob2, "some string", myArrayBuffer]);

 or look at it another way:

 var x = new BlobBuilder();
 becomes
 var x = [];

 x.append(y);
 becomes
 x.push(y);

 var b = x.getBlob();
 becomes
 var b = new Blob(x);

 So at worst there is a one-to-one mapping in code required to simply
 have |new Blob|. At best it requires much fewer lines if the page has
 several parts available at once.

 And we'd save a whole class since Blobs already exist.

It does look cleaner this way, and getting rid of a whole class would
be very nice.

The only things that this lacks that BlobBuilder has are the endings
parameter for '\n' conversion in text and the content type.  The
varargs constructor makes it awkward to pass in flags of any
sort...any thoughts on how to do that cleanly?

Eric



Re: FileSystem API - The Flags interface

2011-10-09 Thread Eric U
The exception is thrown by getFile on DirectoryEntrySync, not by the
Flags constructor; both the example and the flags interface are
correct.
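
A toy model of that division of labor, with an in-memory directory.
DirectorySketch and the error strings are hypothetical; the flags
object is plain data and cannot fail, and only getFile inspects it.

```javascript
class DirectorySketch {
  constructor() {
    this.entries = new Map();
  }

  getFile(name, flags = {}) {
    const exists = this.entries.has(name);
    if (flags.create && flags.exclusive && exists) {
      // create + exclusive means "create, and fail if it is already there".
      throw new Error("PathExistsError: " + name);
    }
    if (!flags.create && !exists) {
      throw new Error("NotFoundError: " + name);
    }
    if (!exists) this.entries.set(name, { name });
    return this.entries.get(name);
  }
}
```

Calling getFile twice with { create: true, exclusive: true } succeeds
the first time and throws the second, which is how the example in 4.2.2
learns that the file already existed.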

On Sat, Oct 8, 2011 at 11:54 AM, Bronislav Klučka
bronislav.klu...@bauglir.com wrote:
 Hello,
 http://www.w3.org/TR/file-system-api/#the-flags-interface
 If you look at the description of exclusive flag (4.2.1), the description
 states no exception, but the example (4.2.2) uses exception to determine
 whether file already existed.
 So the question is, what is wrong: the description or example?

 Brona Klucka






Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-09-29 Thread Eric U
On Thu, Sep 29, 2011 at 12:22 PM, Arun Ranganathan a...@mozilla.com wrote:

 On 9/21/11 8:07 PM, Eric U wrote:

 Update: I have made the changes to FileWriter/FileSaver's event
 sequences; they now match FileReader.
 That's not to say it won't change pending discussion, but FileWriter
 should continue to match FileReader whatever else happens.

       Eric

 Eric:

 After reading this email thread, and looking at your changes, I think I'll 
 make the following changes:

 1. Tighten requirement on onprogress such that we mandate firing *at least 
 one* progress event with a must.  Right now this is unclear as you point out, 
 not least of all because we don't mandate the user agent calling onprogress.
 2. Include a discussion of the invariants Jonas mentions [1], so that event 
 order is fleshed out in the event section.
 3. Clarify exceptions to the 50ms event dispatch timeframe (notably for 
 progress events before load+loadend).

 To be clear, you've decided we're NOT going to veer from XHR2's abort/open 
 behavior (and thus what FileReader says now) in FileWriter/FileSaver right?

 Is this a good summary of changes that we should make?

 -- A*
 [1] http://lists.w3.org/Archives/Public/public-webapps/2011JulSep/1512.html

I think that works; #2 will be especially important.
However, if I read this right, we *don't* have the invariant that a
loadstart will always have a loadend.
Now that Anne's explained XHR2's model, it seems that an open can
cancel the loadend that an abort would have sent.  So the invariants
need to be a bit more complex.

I've updated FileWriter to take most of this into account, but *not*
that last bit yet; as written, I've got Jonas's original invariants,
which would lead to the stacked up loadend events at the end.
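
To see how the two formulations differ, here is a sketch of checkers
over a recorded list of event names. Both functions are hypothetical
test helpers. The strict one encodes "exactly one loadend per
loadstart"; the weak one encodes only "every loadstart is eventually
followed by a loadend, possibly with other loadstarts in between".

```javascript
// Strict: loadstart and loadend must alternate and end balanced.
function checkStrictPairing(events) {
  let open = false;
  for (const e of events) {
    if (e === "loadstart") {
      if (open) return false;
      open = true;
    } else if (e === "loadend") {
      if (!open) return false;
      open = false;
    }
  }
  return !open;
}

// Weak: a loadend needs some earlier, still-unanswered loadstart, and
// the sequence must not end with a loadstart that never got a loadend.
function checkWeakPairing(events) {
  let pending = false;
  for (const e of events) {
    if (e === "loadstart") {
      pending = true;
    } else if (e === "loadend") {
      if (!pending) return false;
      pending = false;
    }
  }
  return !pending;
}
```

A sequence such as loadstart, abort, loadstart, load, loadend, where a
new read was started from the abort handler, fails the strict check and
passes the weak one.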



Re: Publishing specs before TPAC; Oct 14 is last day to start a CfC

2011-09-27 Thread Eric U
On Mon, Sep 26, 2011 at 2:38 PM, Arthur Barstow art.bars...@nokia.com wrote:
 The upcoming TPAC meeting (Oct 31 - Nov 01) provides an opportunity for
 joint WG meetings and lots of informal sharing. As such, some groups make
 spec publications right before TPAC.

 Note there is a 2-week publication blackout period around the TPAC week and
 Oct 24 is the last day to request publication before TPAC.  Given our 1-week
 CfC for new publications, weekends, etc., the schedule is:

 * Oct 14 - last day to start a CfC to publish
 * Oct 24 - last day to request publication
 * Oct 27 - last publications before TPAC
 * Nov 07 - publications resume

 A lot of groups wait until the deadline so if you want to publish before
 TPAC, I encourage you to propose publication as soon as possible and by
 October 14 at the latest.

 Some specs I'd like to see published before TPAC
 (http://www.w3.org/2008/webapps/wiki/PubStatus):

 * Clipboard APIs and Events - I think Hallvord has made quite a few changes
 since last publication on 12-Apr-2011. WDYT Hallvord?

 * D3E - not sure if next pub is CR or LC. Doug, Jacob?

 * File API (last published in 26-Oct-2010) - Arun, Jonas - what's up with
 this spec?

 * File API: Writer and Directories & System - WDYT Eric? Are the changes
 since the April 2011 publication significant?

There have been a few small changes since then, but I'm going to be
pretty tied up through TPAC; let's do a draft in November some time.

 * Indexed Database API - is this ready for LC?

 * Server-sent Events - 8 open bugs so I presume a new WD at this point.

 * Web Messaging - 6 open bugs so I presume a new WD at this point.

 -AB









Re: [FileAPI, common] UTF-16 to UTF-8 conversion

2011-09-26 Thread Eric U
Thanks Glenn and Simon--I'll see what I can do.

On Fri, Sep 23, 2011 at 1:34 AM, Simon Pieters sim...@opera.com wrote:

 On Fri, 23 Sep 2011 01:40:44 +0200, Glenn Maynard gl...@zewt.org wrote:

  BlobBuilder.append(text) says:

  Appends the supplied text to the current contents of the BlobBuilder,

 writing it as UTF-8, converting newlines as specified in endings.

 It doesn't elaborate any further.  The conversion from UTF-16 to UTF-8
 needs
 to be defined, in particular for the edge case of invalid UTF-16
 surrogates.  If this is already defined somewhere, it isn't referenced.

 I suppose this would belong in Common infrastructure, next to the
 existing
 section on UTF-8, not in FileAPI itself.


 WebSocket send() throws SYNTAX_ERR if its argument contains unpaired
 surrogates. It would be nice to be consistent.

 --
 Simon Pieters
 Opera Software




Re: [FileAPI] BlobBuilder.append(native)

2011-09-26 Thread Eric U
On Thu, Sep 22, 2011 at 4:47 PM, Glenn Maynard gl...@zewt.org wrote:
 "native": Newlines must be transformed to the default line-ending
 representation of the underlying host filesystem. For example, if the
 underlying filesystem is FAT32, newlines would be transformed into \r\n
 pairs as the text was appended to the state of the BlobBuilder.

 This is a bit odd: most programs write newlines according to the convention
 of the host system, not based on peeking at the underlying filesystem.  You
 won't even know the filesystem if you're writing to a network drive.  I'd
 suggest "must be transformed according to the conventions of the local
 system", and let implementations decide what that is.  It should probably be
 explicit that the only valid options are \r\n and \n, or reading files back
 in which were transformed in this way will be difficult.

Good catch--I'll fix that.

 Also, in the Issue above that, it seems to mean "native" where it says
 "transparent".

Yup.  That too.

Thanks!
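
A rough sketch of the suggested behavior, assuming the only two valid
native conventions are \n and \r\n.  The convertEndings function and
its nativeEnding parameter are hypothetical names for illustration;
how a user agent determines the local convention is left to the
implementation, and treating a bare \r as a newline is an assumption
here, not something the thread settles.

```javascript
// Convert newlines in `text` according to an `endings` mode.
// `nativeEnding` stands in for whatever the local system convention is
// and must be "\n" or "\r\n".
function convertEndings(text, endings, nativeEnding) {
  if (endings === "transparent") {
    return text; // leave the text untouched
  }
  if (endings === "native") {
    if (nativeEnding !== "\n" && nativeEnding !== "\r\n") {
      throw new RangeError("native ending must be \\n or \\r\\n");
    }
    // Match \r\n first so a pair is never converted twice.
    return text.replace(/\r\n|\r|\n/g, nativeEnding);
  }
  throw new TypeError("unknown endings mode: " + endings);
}
```

Restricting the result to \n or \r\n is what keeps a later read of the
same data easy to undo.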



Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-09-21 Thread Eric U
On Wed, Sep 21, 2011 at 2:28 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, Sep 21, 2011 at 11:12 AM, Glenn Maynard gl...@zewt.org wrote:
 On Tue, Sep 20, 2011 at 8:40 PM, Eric U er...@google.com wrote:

 Indeed--however, from a quick skim of XHR and XHR2, that's not what
 they do.  They let open() terminate abort(), however far along it's
 gotten.  If we did that, then an abort killed by a read might lead to
 the aborted read never getting an onloadend.  But you could still get
 the stack-busting chain of onloadstart/onabort.

 Yuck.  I agree that's not a good thing to mimic for the sake of
 consistency.  Anne, is this intentional, or something XHR is just stuck
 with for compatibility?  It looks like a new problem in XHR2--this couldn't
 happen in XHR1, because there was no abort event fired before loadend.

 If we wanted to prevent read methods from being called during abort,
 we'd probably want to do that by setting an aborting flag or mucking
 around with yet another readyState of ABORTING.

 That's annoying, but it's better than the current situation, and I think
 better than the XHR situation.  Receiving loadstart should guarantee the
 receipt of loadend.

 On Tue, Sep 20, 2011 at 7:43 PM, Jonas Sicking jo...@sicking.cc wrote:

 1. onloadstart fires exactly once
 2. There will be one onprogress event fired when 100% progress is reached
 3. Exactly one of onabort, onload and onerror fires
 4. onloadend fires exactly once.
 5. no onprogress events fire before onloadstart
 6. no onprogress events fire after onabort/onload/onerror
 7. no onabort/onload/onerror events fire after onloadend

 8. after loadstart is fired, loadstart is not fired again until loadend has
 been fired (ie. only one set of progress events can be active on an object
 at one time).

 More precisely: loadstart should not be fired again until the dispatch of
 loadend *has completed*.  That is, you can't start a new progress sequence
 from within loadend, either, because there may be other listeners on the
 object that haven't yet received the loadend.

 I don't think we can do that for XHR without breaking backwards compat.

I just spent a bit more time with the XHR2 spec, and it looks like the
same looping behavior's legal there too, bouncing between
onreadystatechange and onabort, and stacking up a pending call to
onloadend for each loop.  When open terminates abort, abort completes
the step of the algorithm [here step 5], which includes a subsequent
call to onloadend.  It's not a queued task to be cancelled, as it's
all synchronous calls back and forth.

If we want the file specs to match the XHR spec, then we can just
leave this as it is in File Reader, and I'll match it in File Writer.
Recursion depth limit is up to the UA to set.  But I look forward to
hearing what Anne has to say about it before we settle on copying it.



Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-09-21 Thread Eric U
On Wed, Sep 21, 2011 at 3:09 PM, Glenn Maynard gl...@zewt.org wrote:
 On Wed, Sep 21, 2011 at 5:44 PM, Eric U er...@google.com wrote:

 If we want the file specs to match the XHR spec, then we can just
 leave this as it is in File Reader, and I'll match it in File Writer.
 Recursion depth limit is up to the UA to set.  But I look forward to
 hearing what Anne has to say about it before we settle on copying it.

 In my opinion, providing the "no nesting" guarantee is more useful than
 being consistent with XHR, if all new APIs provide it.

If we eliminate it entirely, then you can't ever start a new read on
the same object from the abort handler.  That seems like a reasonable
use case.

 This sort of thing seems obviously useful:

 function showActivity(obj)
 {
     obj.addEventListener("loadstart", function() { div.hidden = false; }, false);
     obj.addEventListener("loadend", function() { div.hidden = true; }, false);
 }

 With the currently specced behavior, this doesn't work--the div would end up
 hidden when it should be shown.  You shouldn't have to care how other code
 is triggering reads to do something this simple.

Adding a number-of-reads-outstanding counter isn't that much more
code.  And if you're really trying to keep things simple, you're not
aborting and then starting another read during the abort, so the above
code works in your app.
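
For reference, the counter version might look something like this.
It is a sketch only: trackActivity and its onChange callback are
hypothetical names, and the target is assumed to be anything with an
addEventListener method that fires loadstart/loadend.

```javascript
// Report "busy" while at least one loadstart has not yet been matched
// by a loadend, even if the sequences nest.
function trackActivity(target, onChange) {
  let outstanding = 0; // number of loadstarts still waiting for a loadend

  target.addEventListener("loadstart", function () {
    outstanding++;
    if (outstanding === 1) onChange(true); // went from idle to busy
  }, false);

  target.addEventListener("loadend", function () {
    outstanding--;
    if (outstanding === 0) onChange(false); // last outstanding read ended
  }, false);
}
```

Usage would be along the lines of
trackActivity(reader, function (busy) { div.hidden = !busy; }).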



Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-09-21 Thread Eric U
On Wed, Sep 21, 2011 at 3:29 PM, Glenn Maynard gl...@zewt.org wrote:
 On Wed, Sep 21, 2011 at 6:14 PM, Eric U er...@google.com wrote:

 If we eliminate it entirely, then you can't ever start a new read on
 the same object from the abort handler.  That seems like a reasonable
 use case.

 It's trivial to stuff it into a zero-second timeout to knock it out of the
 event handler.  This is such a common and useful pattern that libraries have
 shorthand for it, eg. Prototype's Function#defer.  I don't think that's an
 onerous requirement at all; it's basically the same as specs saying "queue
 an event".

While it's certainly not hard to work around, as you say, it seems
more complex and less likely to be obvious than the
counter-for-activity example, which feels like the classic push-pop
paradigm.  And expecting users to write their event handlers one way
for XHR and a different way for FileReader/FileWriter seems like
asking for trouble--you're going to get issues that only come up in
exceptional cases, and involve a fairly subtle reading of several
specs to get right.  I think we're better off going with consistency.

 Adding a number-of-reads-outstanding counter isn't that much more
 code.

 It's not much more code, but it's code dealing with a case that doesn't have
 to exist, working around a very ugly and unobvious sequence of events, and
 it's something that you really shouldn't have to worry about every single
 time you use loadstart/loadend pairs.

 And if you're really trying to keep things simple, you're not
 aborting and then starting another read during the abort, so the above
 code works in your app.

 The above code and the code triggering the reads might not even be written
 by the same people--the activity display might be a third-party component
 (who very well might not have thought of this; I wouldn't have, before this
 discussion).

 --
 Glenn Maynard
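
The zero-second-timeout pattern mentioned above, in minimal form.
The defer helper and the log array are hypothetical and exist only to
show ordering; a real handler would start the new read inside the
deferred function.

```javascript
// Run `fn` after the current event handler has returned.
function defer(fn) {
  setTimeout(fn, 0);
}

const log = [];

function onAbort() {
  log.push("abort handler start");
  // The new read is pushed out of the handler, so it cannot nest
  // inside the abort dispatch.
  defer(function () {
    log.push("new read started");
  });
  log.push("abort handler end");
}

onAbort();
```

The deferred function runs only after onAbort has returned, so the
abort dispatch finishes before any new loadstart.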





Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-09-21 Thread Eric U
On Wed, Sep 21, 2011 at 4:45 PM, Glenn Maynard gl...@zewt.org wrote:
 On Wed, Sep 21, 2011 at 6:51 PM, Eric U er...@google.com wrote:

 While it's certainly not hard to work around, as you say, it seems
 more complex and less likely to be obvious than the
 counter-for-activity example, which feels like the classic push-pop
 paradigm.

 The *need* to have counters to use loadstart/loadend at all isn't obvious,
 and it's a guarantee that many (perhaps most) users won't do this.  Pulling
 code out of events with a timer isn't at all complex--it's a simple, common
 pattern.  I think it's much more obvious, since if you don't do it an
 exception is raised (that you can search for if you don't know what to do),
 instead of a subtle bug being introduced.

 Also note that XHR cancels the abort method entirely if you start a new
 request during onabort, which means loadend isn't fired.  Having mismatched
 loadstart/loadend events seems equally ugly, and not something to try to be
 consistent with even if we're stuck with it for XHR.

Again, that's not what the XHR2 spec says.  See my summary up-thread
about the actual behavior, and Anne can correct my interpretation if
I'm wrong.

 And expecting users to write their event handlers one way
 for XHR and a different way for FileReader/FileWriter seems like
 asking for trouble--you're going to get issues that only come up in
 exceptional cases, and involve a fairly subtle reading of several
 specs to get right.  I think we're better off going with consistency.

 You can write code for XHR in the same way, if you want, punting open()
 calls out of abort/loadend event handlers with a timer.  It'd be depressing
 to see PE turn so ugly in an attempt to be consistent with a flawed legacy
 API; better to isolate the problem as much as possible.  (Are there any
 other APIs with this problem besides XHR that couldn't be fixed?)

Expecting users to rewrite handlers for XHR to match a new API, where
it's not necessary for XHR's use, seems wildly optimistic.

 Another way to deal with this is to make loadstart (and other parts of those
 calls, the error paths in particular) async, so if you start a new read
 within onabort, it won't actually start until the abort finishes.  (That's a
 more invasive behavioral change, of course, and I'm not sure I'd like it
 myself, but it's worth at least mentioning.)

That has other issues.  If you change readyState and call the handler
in separate actions [one immediate, one queued] you've got a strange
state in between.  On the other hand, if you don't change readyState
until the queued handler runs, you might try calling readAsText
multiple times, since nothing will [visibly] have changed.  Either of
those seems bad.



Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-09-20 Thread Eric U
On Mon, May 23, 2011 at 6:19 PM, Arun Ranganathan a...@mozilla.com wrote:
 On 5/23/11 6:14 PM, Arun Ranganathan wrote:

 On 5/23/11 1:20 PM, Kyle Huey wrote:

 To close the loop a bit here, Firefox 6 will make the change to
 FileReader.abort()'s throwing behavior agreed upon here.
 (https://bugzilla.mozilla.org/show_bug.cgi?id=657964)

 We have not changed the timing of the events, which are still dispatched
 synchronously.

 The editor's draft presently does nothing when readyState is EMPTY, but if
 readyState is DONE it is specified to set result to null and fire events
 (but flush any pending tasks that are queued).

 http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort

 Also note that we're NOT firing *both* error and abort; we should only fire
 abort, and *not* error.

 I should change the spec. to throw.  Eric, you might change the spec. (and
 Chrome) to NOT fire error and abort events :)

 Sorry, to be a bit clearer: I'm talking about Eric changing
 http://dev.w3.org/2009/dap/file-system/file-writer.html#widl-FileSaver-abort-void
 to match http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort

 -- A*




Sorry about the long delay here--a big release and a new baby absorbed
a lot of my time.  I'm going through the abort sequence right now, and
it turns out that there are a number of places in various algorithms
in FileWriter that should match FileReader more closely than they do.
However, there are a couple of edge cases I'm unsure about.

1) Do you expect there to be an event called progress that indicates a
complete read, before the load event?
"user agents MUST return at least one such result while processing
this read method, with the last returned value at completion of the
read" -- Does that mean during onprogress, or would during onloadend
be sufficient?  What if the whole blob is read in a single backend
operation--could there be no calls to onprogress at all?

[Side note--the phrasing there is odd.  You say that user agents MUST
return, but the app's not required to call for the value, and it
can't return it if not asked.  Did you want to require the user agent
to make at least one onprogress call?]

2) The load and loadend events are queued "When the data from the blob
has been completely read into memory".  If the user agent fires an
onprogress indicating all the data's been loaded, and the app calls
abort in that event handler, should those queued events be fired or
not?  "If there are any tasks from the object's FileReader task source
in one of the task queues, then remove those tasks." makes it look
like "no", but I wanted to make sure.  If #1 above is "no" or "not
necessarily", then this might not ever come up anyway.

Thanks,

Eric



Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-09-20 Thread Eric U
On Tue, Sep 20, 2011 at 3:36 PM, Eric U er...@google.com wrote:
 On Mon, May 23, 2011 at 6:19 PM, Arun Ranganathan a...@mozilla.com wrote:
 On 5/23/11 6:14 PM, Arun Ranganathan wrote:

 On 5/23/11 1:20 PM, Kyle Huey wrote:

 To close the loop a bit here, Firefox 6 will make the change to
 FileReader.abort()'s throwing behavior agreed upon here.
 (https://bugzilla.mozilla.org/show_bug.cgi?id=657964)

 We have not changed the timing of the events, which are still dispatched
 synchronously.

 The editor's draft presently does nothing when readyState is EMPTY, but if
 readyState is DONE it is specified to set result to null and fire events
 (but flush any pending tasks that are queued).

 http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort

 Also note that we're NOT firing *both* error and abort; we should only fire
 abort, and *not* error.

 I should change the spec. to throw.  Eric, you might change the spec. (and
 Chrome) to NOT fire error and abort events :)

 Sorry, to be a bit clearer: I'm talking about Eric changing
 http://dev.w3.org/2009/dap/file-system/file-writer.html#widl-FileSaver-abort-void
 to match http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort

 -- A*




 Sorry about the long delay here--a big release and a new baby absorbed
 a lot of my time.  I'm going through the abort sequence right now, and
 it turns out that there are a number of places in various algorithms
 in FileWriter that should match FileReader more closely than they do.
 However, there are a couple of edge cases I'm unsure about.

 1) Do you expect there to be an event called progress that indicates a
 complete read, before the load event?

On further reflection, another requirement prevents this in some
cases.  If you've made a non-terminal progress event less than 50ms
before completion, you're not permitted to make another at completion,
so I think you'd go straight to load and loadend.  However, if the
entire load took place in a single underlying operation that took less
than 50ms, do you have your choice of whether or not to fire
onprogress once before onload?

 "user agents MUST return at least one such result while processing
 this read method, with the last returned value at completion of the
 read" -- Does that mean during onprogress, or would during onloadend
 be sufficient?  What if the whole blob is read in a single backend
 operation--could there be no calls to onprogress at all?

 [Side note--the phrasing there is odd.  You say that user agents MUST
 return, but the app's not required to call for the value, and it
 can't return it if not asked.  Did you want to require the user agent
 to make at least one onprogress call?]

 2) The load and loadend events are queued "When the data from the blob
 has been completely read into memory".  If the user agent fires an
 onprogress indicating all the data's been loaded, and the app calls
 abort in that event handler, should those queued events be fired or
 not?  "If there are any tasks from the object's FileReader task source
 in one of the task queues, then remove those tasks." makes it look
 like "no", but I wanted to make sure.  If #1 above is "no" or "not
 necessarily", then this might not ever come up anyway.

 Thanks,

    Eric




Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-09-20 Thread Eric U
On Tue, Sep 20, 2011 at 4:43 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, Sep 20, 2011 at 4:28 PM, Eric U er...@google.com wrote:
 On Tue, Sep 20, 2011 at 3:36 PM, Eric U er...@google.com wrote:
 On Mon, May 23, 2011 at 6:19 PM, Arun Ranganathan a...@mozilla.com wrote:
 On 5/23/11 6:14 PM, Arun Ranganathan wrote:

 On 5/23/11 1:20 PM, Kyle Huey wrote:

 To close the loop a bit here, Firefox 6 will make the change to
 FileReader.abort()'s throwing behavior agreed upon here.
 (https://bugzilla.mozilla.org/show_bug.cgi?id=657964)

 We have not changed the timing of the events, which are still dispatched
 synchronously.

 The editor's draft presently does nothing when readyState is EMPTY, but if
 readyState is DONE it is specified to set result to null and fire events
 (but flush any pending tasks that are queued).

 http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort

 Also note that we're NOT firing *both* error and abort; we should only fire
 abort, and *not* error.

 I should change the spec. to throw.  Eric, you might change the spec. (and
 Chrome) to NOT fire error and abort events :)

 Sorry, to be a bit clearer: I'm talking about Eric changing
 http://dev.w3.org/2009/dap/file-system/file-writer.html#widl-FileSaver-abort-void
 to match http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort

 -- A*




 Sorry about the long delay here--a big release and a new baby absorbed
 a lot of my time.  I'm going through the abort sequence right now, and
 it turns out that there are a number of places in various algorithms
 in FileWriter that should match FileReader more closely than they do.
 However, there are a couple of edge cases I'm unsure about.

 1) Do you expect there to be an event called progress that indicates a
 complete read, before the load event?

 On further reflection, another requirement prevents this in some
 cases.  If you've made a non-terminal progress event less than 50ms
 before completion, you're not permitted to make another at completion,
 so I think you'd go straight to load and loadend.  However, if the
 entire load took place in a single underlying operation that took less
 than 50ms, do you have your choice of whether or not to fire
 onprogress once before onload?

 This is a spec-bug. We need to make an exception from the 50ms rule
 for the last onprogress event.
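
 One way to picture the exception is a throttle that always lets the
 final event through.  This is a sketch, not spec text:
 makeProgressThrottle and its arguments are hypothetical, and
 timestamps are passed in explicitly so the logic is easy to follow.

```javascript
// Returns a function that decides whether a progress event may fire.
// Non-final events are limited to one per `minIntervalMs`; the final
// (100%) event is always allowed.
function makeProgressThrottle(minIntervalMs) {
  let lastFiredMs = -Infinity;
  return function shouldFire(nowMs, isFinal) {
    if (isFinal || nowMs - lastFiredMs >= minIntervalMs) {
      lastFiredMs = nowMs;
      return true;
    }
    return false;
  };
}
```

 With a 50ms interval, an event at 0ms fires, one at 10ms is dropped,
 and a final one at 20ms still fires.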

 From the webpage point of view, the following invariants should hold
 for each load:

 1. onloadstart fires exactly once
 2. There will be one onprogress event fired when 100% progress is reached
 3. Exactly one of onabort, onload and onerror fires
 4. onloadend fires exactly once.
 5. no onprogress events fire before onloadstart
 6. no onprogress events fire after onabort/onload/onerror
 7. no onabort/onload/onerror events fire after onloadend

 The reason for 2 is so that the page always renders a complete
 progress bar if it only does progressbar updates from the onprogress
 event.

 Hope that makes sense?
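
 As an illustration, most of these invariants can be checked
 mechanically against a recorded list of event names.  The
 checkLoadSequence function below is a hypothetical test helper; it
 looks only at event types, so it does not cover the 100% progress
 rule, which needs the loaded/total values.

```javascript
// Check one load's event names against the ordering invariants.
// Returns a list of human-readable problems (empty if none).
function checkLoadSequence(events) {
  const problems = [];
  const terminal = ["abort", "load", "error"];
  const isTerminal = (e) => terminal.includes(e);
  const count = (type) => events.filter((e) => e === type).length;

  if (count("loadstart") !== 1) problems.push("loadstart must fire exactly once");
  if (count("loadend") !== 1) problems.push("loadend must fire exactly once");
  if (events.filter(isTerminal).length !== 1) {
    problems.push("exactly one of abort/load/error must fire");
  }

  const startIndex = events.indexOf("loadstart");
  const terminalIndex = events.findIndex(isTerminal);
  const endIndex = events.indexOf("loadend");

  events.forEach(function (e, i) {
    if (e === "progress") {
      if (startIndex === -1 || i < startIndex) {
        problems.push("progress before loadstart");
      }
      if (terminalIndex !== -1 && i > terminalIndex) {
        problems.push("progress after abort/load/error");
      }
    }
    if (isTerminal(e) && endIndex !== -1 && i > endIndex) {
      problems.push("abort/load/error after loadend");
    }
  });
  return problems;
}
```

 A plain loadstart, progress, load, loadend run passes; a nested
 abort/loadstart run fails the "exactly once" checks.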

It makes sense, and in general I like it.  But the sequence can get
more complicated [specifically, nested] if you have multiple read
calls, which is the kind of annoyance that brought me to send the
email.

I have a read running, and at some point I abort it--it could be in
onprogress or elsewhere.  In onabort I start another read.  In
onloadstart I abort again.  Repeat as many times as you like, then let
a read complete.  I believe we've specced that the event sequence
should look like this:

loadstart
[progress]*
--[events from here to XXX happen synchronously, with no queueing]
abort
loadstart
abort
loadstart
abort
loadstart
loadend
loadend
loadend
--[XXX]
[progress]+
load
loadend

Does that look like what you'd expect?  Am I reading it right?  Yes,
this is a wacky fringe case.  But it's certainly reasonable to expect
someone to start a new read in onabort, so you have to implement at
least enough bookkeeping for that case.  And UAs will want to defend
against stack overflow, in the event that a bad app sticks in an
abort/loadstart loop.
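
To make the synchronous part concrete, here is a toy model that
reproduces it.  It is not the real FileReader: createToyReader and its
fire/read/abort methods are hypothetical, and they only mimic the
"dispatch synchronously, loadend after the abort handler returns"
ordering described above.

```javascript
// Toy object that dispatches events synchronously and records them.
function createToyReader() {
  return {
    log: [],
    onloadstart: null,
    onabort: null,
    onloadend: null,
    fire(type) {
      this.log.push(type);
      const handler = this["on" + type];
      if (handler) handler.call(this);
    },
    read() {
      this.fire("loadstart");
    },
    abort() {
      this.fire("abort");   // handler may start a new read here
      this.fire("loadend"); // runs only after the abort handler returns
    },
  };
}

const reader = createToyReader();
reader.read(); // initial read: logs "loadstart"

let aborts = 0;
function abortOnce() {
  aborts++;
  reader.abort();
}
reader.onabort = function () { reader.read(); };                  // new read in onabort
reader.onloadstart = function () { if (aborts < 3) abortOnce(); }; // abort in onloadstart

abortOnce(); // the app aborts the running read
```

After this runs, reader.log holds loadstart, then abort/loadstart
three times, then three loadends, with each abort call still on the
stack until its loadend fires.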



Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-09-20 Thread Eric U
On Tue, Sep 20, 2011 at 5:32 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, Sep 20, 2011 at 5:26 PM, Glenn Maynard gl...@zewt.org wrote:
 On Tue, Sep 20, 2011 at 8:01 PM, Eric U er...@google.com wrote:

 I have a read running, and at some point I abort it--it could be in
 onprogress or elsewhere.  In onabort I start another read.  In
 onloadstart I abort again.  Repeat as many times as you like, then let
 a read complete.  I believe we've specced that the event sequence
 should look like this:

 loadstart
 [progress]*
 --[events from here to XXX happen synchronously, with no queueing]
 abort
 loadstart

 abort
 loadstart

 XHR handles this by not allowing a new request to be opened until the
 abort() method terminates.  Could that be done here?  It seems like an
 important thing to be consistent about.

 http://dev.w3.org/2006/webapi/XMLHttpRequest/#the-abort-method

 Ooh, that's a good idea.

 / Jonas

Indeed--however, from a quick skim of XHR and XHR2, that's not what
they do.  They let open() terminate abort(), however far along it's
gotten.  If we did that, then an abort killed by a read might lead to
the aborted read never getting an onloadend.  But you could still get
the stack-busting chain of onloadstart/onabort.

If we wanted to prevent read methods from being called during abort,
we'd probably want to do that by setting an aborting flag or mucking
around with yet another readyState of ABORTING.
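
A sketch of the flag variant, again on a toy object rather than the
real API.  createGuardedReader and its aborting field are
hypothetical; the thrown Error merely stands in for whatever exception
a spec would choose.

```javascript
// Toy reader that refuses to start a read while abort() is running.
function createGuardedReader() {
  return {
    aborting: false,
    log: [],
    onabort: null,
    read() {
      if (this.aborting) {
        throw new Error("InvalidStateError: read() called during abort()");
      }
      this.log.push("loadstart");
    },
    abort() {
      this.aborting = true;
      try {
        this.log.push("abort");
        if (this.onabort) this.onabort();
        this.log.push("loadend");
      } finally {
        this.aborting = false; // reads are allowed again once abort is done
      }
    },
  };
}
```

With this in place the abort/loadstart chain cannot nest, at the cost
of making a read from onabort an error the app has to work around.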



Re: [whatwg] File API Streaming Blobs

2011-08-08 Thread Eric U
Sorry about the very slow response; I've been on leave, and am now
catching up on my email.

On Wed, Jun 22, 2011 at 11:54 AM, Arun Ranganathan a...@mozilla.com wrote:
 Greetings Adam,

 Ian, I wish I knew that earlier when I originally posted the idea,
 there was lots of discussion and good ideas but then it suddenly
 dropped off the face of the earth. Essentially I am forwarding this
 suggestion to public-webapps@w3.org on the basis as apparently most
 discussion of File API specs happen there, and would like to know how
 to move forward with this suggestion.

 The original suggestion and following comments are on the whatwg list
 archive, starting with

 http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-January/029973.html

 Summing up, the problem with the current implementation of Blobs is
 that once a URI has been generated for them, by design changes are no
 longer reflected in the object URL. In a streaming scenario, this is
 not what is needed, rather a long-living Blob that can be appended is
 needed and 'streamed' to other parts of the browser, e.g. the <video>
 or <audio> element.
 The original use case was:  make an application which will download
 media files from a server and cache them locally, as well as playing
 them without making the user wait for the entire file to be
 downloaded, converted to a blob, then saved and played, however such
 an API covers many other use cases such as on-the-fly on-device
 decryption of streamed media content (ie live streams either without
 end or static large files that to download completely would be a waste
 when only the first couple of seconds need to be buffered and
 decrypted before playback can begin)

 Some suggestions were to modify or create a new type of Blob, the
 StreamingBlob which can be changed without its object url changing and
 appended to as new data is downloaded or decoded, and using a similar
 process to how large files may start to be decoded/played by a browser
 before they are fully downloaded. Other suggestions suggested using a
 pull API on the Blob so browsers can request for new data
 asynchronously, such as in

 http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-January/029998.html

 Some problems however that a browser may face is what to do with urls
 which are opened twice, and whether the object url should start from
 the beginning (which would be needed for decoding encrypted, on-demand
 audio) or start from the end (similar to `tail`, for live streaming
 events that need decryption, etc.).

 Thanks,
 P.S. Sorry if I've not done this the right way by forwarding like
 this, I'm not usually active on mailing lists.



 I actually think moving to a streaming mode for file reads in general is
 desirable, but I'm not entirely sure extending Blobs is the way to go for
 *that* use case, which honestly is the main use case I'm interested in.  We
 may improve upon ideas after this API goes to Last Call for streaming file
 reads; hopefully we'll do a better job than other non-JavaScript APIs out
 there :) [1].  Blob objects as they are currently specified live in memory
 and represent in memory File objects as well.  A change to the underlying
 file isn't captured in the Blob snapshot; moreover, if the file moves or is
 no longer present at time of read, an error event is fired while processing
 a read operation.  The object URL may be dereferenced, but will result in a
 404.

 The Streaming API explored by WHATWG uses the Object URL scheme for
 videoconferencing use cases [2], and so the scheme itself is suitable for
 resources that are more dynamic than memory-resident Blob objects.
  Segment-plays/segment dereferencing in general can be handled through media
 fragments; the scheme can naturally be accompanied by fragment identifiers.

 I agree that it may be desirable to extend Blobs to do a few other things in
 general, maybe independent of better file reads.  You've Cc'd the right
 listserv :)  I'd be interested in what Eric has to say, since BlobBuilder
 evolves under his watch.

Having reviewed the threads, I'm not absolutely sure that we want to
add this stuff to Blob.  It seems like streaming is quite a bit
different than a lot of the problems people want to solve with Blobs,
and we may end up with a bit of a mess if we mash them together.
BlobBuilder does seem a decent match as a StreamBuilder, though.
Since Blobs are specifically non-mutable, it sounds like what you're
looking for is more like createObjectURL(blobBuilder) than
createObjectURL(blobBuilder.getBlob()).

From the threads and from my head, here are some questions:

1) Would reading from a stream always start at the beginning, or would
it start at the current point [e.g. in a live video stream]?
2) Would this have to support infinite streams?
3) Would we be expected to keep around data from the very beginning of
a stream, even if e.g. it's a live broadcast and you're now watching
hour 7?  If not, who controls the buffer size and what's the API for

[File API: FileSystem] Removed mimeType from toURL

2011-06-06 Thread Eric U
The optional mimeType parameter to Entry[Sync].toURL is redundant with
url.createObjectURL.  It also doesn't work with the URL format
proposed in the notes and now implemented in Chromium.  I have removed
it from the spec.

 Eric



Re: [File API: FileSystem] Path restrictions and case-sensitivity

2011-05-22 Thread Eric U
On Thu, May 12, 2011 at 1:34 AM, timeless timel...@gmail.com wrote:
 On Thu, May 12, 2011 at 3:02 AM, Eric U er...@google.com wrote:
 There are a few things going on here:

 yes

 1) Does the filesystem preserve case?  If it's case-sensitive, then
 yes.  If it's case-insensitive, then maybe.
 2) Is it case-sensitive?  If not, you have to decide how to do case
 folding, and that's locale-specific.  As I understand it, Unicode
 case-folding isn't locale specific, except when you choose to use the
 Turkish rules, which is exactly the problem we're talking about.
 3) If you're case folding, are you going to go with a single locale
 everywhere, or are you going to use the locale of the user?
 4) [I think this is what you're talking about w.r.t. not allowing both
 dotted and dotless i]: Should we attempt to detect filenames that are
 /too similar/ for some definition of /too similar/, ostensibly to
 avoid confusing the user.

 As I read what you wrote, you wanted:
 1) yes

 correct

 2) no

 correct

 3) a new locale in which İ, ı, I and i all fold to the same letter,
 everywhere

 I'm pretty sure Unicode's locale insensitive behavior is precisely
 what i want. I've included the section from Unicode 6 at the end.

 4) yes, possibly only for the case of İ, ı, I and i

 4 is, in the general case, impossible.

 yes.

 It's not well-defined, and is just as likely to cause problems as solve them.

 There are some defined ways to solve them (accepting that perfect is
 the enemy of the good),
 - one is to take the definitions of too similar selected for idn
 registration...
 - another is to just accept the recommendation from unicode 6: "text
 can be normalized to Normalization Form NFKC or NFKD after case
 folding"

 If you *just* want to
 check for ı vs. i, it's possible, but it's still not clear to me that
 what you're doing will be the correct behavior in Turkish locales [are
 there any Turkish words, names abbreviations, etc. that only differ in
 that character?]

 Well, the classic example of this is sıkısınca / sikisince [1],
 but technically those differ in more than just the 'i' (they differ in
 the a/e at the end).

 My point is that if two things differ by such a small thing, it's
 better to force them to have visibly different names, this could be a
 '(2)' tacked onto the end of a file if the name is auto generated, or
 if the name is something a human is picking, it could be "please pick
 another name, it looks too close to preview of other object object
 name".

This again is really oriented towards the file-picker use case which
we've agreed [I think?] isn't the most common use case.  Most of the
time we expect the filenames to be generated by an application that's
using the filesystem for a backing store.  Changing the filenames out
from under it 1) won't improve anything; 2) may break things.

Given that we're talking about a problem that's subjective and thus
can't really be solved, and the solution you propose is so
complicated, I really don't see that this is a win over just saying
"we support all valid UTF-8 sequences; build whatever you want on top
of that".  There are ways to add some of the behavior you're asking
for in JavaScript libraries on top, as long as you're willing to have
a central coordinator for your filesystem access.  Let's let people
experiment with that as they wish.

It appears to me that a majority of those who've spoken up support
this conclusion, and will try to update the spec this week.  As
before, I'm still only speccing out the sandboxed filesystem, so
expansions into access outside the sandbox, and serialization of these
filenames into local filesystem names, can be dealt with later.
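
As an example of what such a library layer might do, here is a
sketch of a comparison key built from lowercasing plus NFKC
normalization.  comparisonKey is a hypothetical helper, and
toLowerCase() is only an approximation of Unicode case folding, so
treat this as an experiment rather than a recommendation.

```javascript
// Build a key for "do these two names look the same?" checks.
// Names are stored unchanged; only the key is normalized.
function comparisonKey(name) {
  // Normalize compatibility forms first, lowercase (locale-independent),
  // then normalize again in case lowercasing produced new sequences.
  return name.normalize("NFKC").toLowerCase().normalize("NFKC");
}

// A coordinator could keep a map from key to the stored name and
// refuse (or rename) a new entry whose key is already present.
function wouldCollide(existingNames, candidate) {
  const key = comparisonKey(candidate);
  return existingNames.some(function (n) { return comparisonKey(n) === key; });
}
```

Note that dotless ı and i still get different keys here, so the
Turkish case would need its own rule if a library wanted one.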

 The other instances I've run into all seem to be cases where there's a
 canonical spelling and then a folded for Latin users writing. I
 certainly can't speak for all cases.

 and it doesn't matter elsewhere.

 Actually, i think we ended up trying to compile blacklists while
 developing punycode [2] for IDN [3]. I guess rfc 4290 [4], 4713 [5],
 5564 [6], and 5992 [7], have tables which while not complete are
 certainly referencable, and given that UAs already have to deal with
 punycode, it's likely that they'd have access to those tables.

 I think the relevant section from unicode 6 [8] is probably 5.18 Case
 Mappings (page 171?)
 Where case distinctions are not important, other distinctions between
 Unicode characters (in particular, compatibility distinctions) are
 generally ignored as well. In such circumstances, text can be normalized
 to Normalization Form NFKC or NFKD after case folding, thereby producing
 a normalized form that erases both compatibility distinctions and case
 distinctions.

 I think this is probably what I want

 However, such normalization should generally be done only on a restricted
 repertoire, such as identifiers (alphanumerics).

 Yes, I'm hand waving at this requirement - filenames are in a way
 identifiers, you aren't supposed to encode an essay

Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-05-17 Thread Eric U
On Tue, May 17, 2011 at 2:41 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, May 17, 2011 at 2:35 PM, Kyle Huey m...@kylehuey.com wrote:
 The abort behaviors of FileReader and File[Saver|Writer] differ.  The
 writing objects throw if the abort method is called when a write is not
 currently under way, while the reading object does not throw.

 The behaviors should be consistent.  I don't particularly care either way,
 but I believe Jonas would like to change FileReader to match
 File[Saver|Writer].

 Yeah, since we made FileReader.readAsX throw when called in the wrong
 state, I believe doing the same for abort() is the better option.

 / Jonas

Sounds good to me.



Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-05-17 Thread Eric U
It was likely just an oversight on my part that they differ.
It does seem a bit odd to dispatch error/abort/loadend if aborting
with no write in progress, so I favor the FileWriter/FileSaver
behavior, but as long as they match, I'm not too bothered.

On Tue, May 17, 2011 at 2:35 PM, Kyle Huey m...@kylehuey.com wrote:
 The abort behaviors of FileReader and File[Saver|Writer] differ.  The
 writing objects throw if the abort method is called when a write is not
 currently under way, while the reading object does not throw.

 The behaviors should be consistent.  I don't particularly care either way,
 but I believe Jonas would like to change FileReader to match
 File[Saver|Writer].

 - Kyle




Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors

2011-05-17 Thread Eric U
On Tue, May 17, 2011 at 2:48 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, May 17, 2011 at 2:42 PM, Eric U er...@google.com wrote:
 It was likely just an oversight on my part that they differ.
 It does seem a bit odd to dispatch error/abort/loadend if aborting
 with no write in progress, so I favor the FileWriter/FileSaver
 behavior, but as long as they match, I'm not too bothered.

 For what it's worth, FileReader.abort() currently follows what
 XHR.abort() does, which is to do nothing if called in the wrong
 state. I.e. no events are dispatched and no exceptions are thrown.

Ah, my mistake; I was reading http://www.w3.org/TR/FileAPI/#abort
instead of http://dev.w3.org/2006/webapi/FileAPI/#abort.
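
For readers following the thread, the two abort() behaviours under discussion can be modelled with a small state machine. This is a sketch only: the class, the state names and the recorded event names are hypothetical stand-ins, not the File API or FileWriter interfaces themselves.

```python
class InvalidStateError(Exception):
    """Stand-in for the exception a writer-style abort() would throw."""


class Operation:
    """Hypothetical model of an object with start() and abort()."""

    def __init__(self, throw_when_idle: bool):
        self.active = False
        self.throw_when_idle = throw_when_idle
        self.events = []  # names of events "dispatched", for illustration

    def start(self) -> None:
        self.active = True

    def abort(self) -> None:
        if not self.active:
            if self.throw_when_idle:
                # FileSaver/FileWriter-style: calling abort() with nothing
                # in progress is an error.
                raise InvalidStateError("no operation in progress")
            # XHR-style: silently do nothing, dispatch nothing.
            return
        self.active = False
        self.events += ["abort", "loadend"]


reader_like = Operation(throw_when_idle=False)
reader_like.abort()            # no exception, no events

writer_like = Operation(throw_when_idle=True)
try:
    writer_like.abort()
    threw = False
except InvalidStateError:
    threw = True
```

Either behaviour is easy to implement; the point of the thread is only that the reading and writing objects should agree.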



Re: [File API: FileSystem] Path restrictions and case-sensitivity

2011-05-11 Thread Eric U
I've grouped responses to bits of this thread so far below:

Glenn said:
 If *this API's* concept of filenames is case-insensitive, then IMAGE.JPG
 and image.jpg represent the same file on English systems and two different
 files on Turkish systems, which is an interop problem.

Timeless replied:
 no, if the api is case insensitive, then it's case insensitive
 *everywhere*, both on Turkish and on English systems. Things could
 only be case sensitive when serialized to a real file system outside
 of the API. I'm not proposing a case insensitive system which is
 locale aware, i'm proposing one which always folds.

You're proposing not just a case-insensitive system, but one that forces e.g. an
English locale on all users, even those in a Turkish locale.  I don't think
that's an acceptable solution.

I also don't think having code that works in one locale and not another
[Glenn's image.jpg example] is fantastic.  It was what we were stuck with when
I was trying to allow implementers the choice of a pass-through implementation,
but given that that's fallen to the realities of path lengths on Windows, I feel
like we should try to do better.

Glenn:
 This can be solved at the application layer in applications that want
 it, without baking it into the filesystem API.

This is mostly true; you'd have to make sure that all alterations to the
filesystem went through a single choke-point or you'd have the potential for
race conditions [or you'd need to store the original-case filenames yourself,
and send the folded case down to the filesystem API].
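
A minimal sketch of the "single choke-point" idea, assuming the underlying API is case-sensitive and is represented here by a plain dict. The class name and methods are hypothetical; the wrapper keeps the original-case name itself and sends only the folded name down, as described in the bracketed alternative above.

```python
class CaseInsensitiveIndex:
    """Hypothetical application-layer wrapper over a case-sensitive store."""

    def __init__(self, backend: dict):
        self._backend = backend   # folded name -> file contents
        self._display = {}        # folded name -> original-case name

    def create(self, name: str, data: bytes = b"") -> None:
        key = name.casefold()
        if key in self._display:
            # A name differing only in case already exists.
            raise FileExistsError(self._display[key])
        self._display[key] = name
        self._backend[key] = data

    def read(self, name: str) -> bytes:
        return self._backend[name.casefold()]

    def listing(self) -> list:
        return sorted(self._display.values())


store = {}
index = CaseInsensitiveIndex(store)
index.create("Image.JPG", b"pixels")
```

This only stays consistent if every change goes through the one wrapper instance; two independent writers could still race, which is the caveat raised above.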

Glenn:
 A virtual FS as the backing for the filesystem API does not resolve that core
 issue.  It makes sense to encourage authors to gracefully handle errors thrown
 by creating files and directories.  Such a need has already been introduced
 via Google Chrome's unfortunate limitation of a 255 byte max path length.

That limitation grew out of the OS-dependent passthrough implementation.  We're
fixing that right now, with this proposal.

 The one take-away I have from that bug: it would have been nice to have a more
 descriptive error message.  It took a while to figure out that the path length
 was too long for the implementation.

I apologize for that--it was an oversight.  If we can relax the restrictions to
a small set, it'll be more obvious what the problems are.  IIRC this problem was
particularly confusing because we were stopping you well short of the allowed
255 bytes, due to your profile's nesting depth.

I'd like to obviate the need for complicated exceptions or APIs that suggest
better names, by leaving naming up to the app developer as much as possible.

[segue into other topics]

Glenn asked about future expansions of IndexedDB to handle Blobs, specifically
with respect to FileWriter and efficient incremental writes.

Jonas replied:
 A combination of FileWriter and IndexedDB should be able to handle
 this without problem. This would go beyond what is currently in the
 IndexedDB spec, but it's this part that we're planning on
 experimenting with.

 The way I have envisioned it to work is to add a function called
 createFileEntry somewhere, for example the IDBFactory interface. This
 would return a fileEntry which you could then write to using
 FileWriter as well as store in the database using normal database
 operations.

As Jonas and I have discussed in the past, I think that storing Blobs via
reference in IDB works fine, but when you make them modifiable FileEntries
instead, you either have to give up IDB's transactional nature or you have to
give up efficiency.  For large mutable Blobs, I don't think there's going to be
a clean interface there.  Still, I look forward to seeing what you come up with.

Eric



Re: [File API: FileSystem] Path restrictions and case-sensitivity

2011-05-11 Thread Eric U
On Wed, May 11, 2011 at 4:47 PM, timeless timel...@gmail.com wrote:
 On Thu, May 12, 2011 at 2:08 AM, Eric U er...@google.com wrote:
 Timeless replied:
 no, if the api is case insensitive, then it's case insensitive
 *everywhere*, both on Turkish and on English systems. Things could
 only be case sensitive when serialized to a real file system outside
 of the API. I'm not proposing a case insensitive system which is
 locale aware, i'm proposing one which always folds.

 You're proposing not just a case-insensitive system, but one that forces
 e.g. an English locale on all users, even those in a Turkish locale.  I
 don't think that's an acceptable solution.

 No, I proposed case preserving. If the file is first created with a
 dotless i, that hint is preserved and a user agent could and should
 retain this (e.g. for when it serializes to a real file system). I'm
 just suggesting not allowing an application to ask for distinct dotted
 and dotless instances of the same approximate file name. There's a
 reasonable chance that case collisions will be disastrous when
 serialized, thus it's better to prevent case collisions when an
 application tries to create the file - the application can accept a
 suggested filename or generate a new one.

There are a few things going on here:

1) Does the filesystem preserve case?  If it's case-sensitive, then
yes.  If it's case-insensitive, then maybe.
2) Is it case-sensitive?  If not, you have to decide how to do case
folding, and that's locale-specific.  As I understand it, Unicode
case-folding isn't locale specific, except when you choose to use the
Turkish rules, which is exactly the problem we're talking about.
3) If you're case folding, are you going to go with a single locale
everywhere, or are you going to use the locale of the user?
4) [I think this is what you're talking about w.r.t. not allowing both
dotted and dotless i]: Should we attempt to detect filenames that are
/too similar/ for some definition of /too similar/, ostensibly to
avoid confusing the user.

As I read what you wrote, you wanted:
1) yes
2) no
3) a new locale in which I, ı, İ and i all fold to the same letter, everywhere
4) yes, possibly only for the case of I, ı, İ and i

4 is, in the general case, impossible.  It's not well-defined, and is
just as likely to cause problems as solve them.  If you *just* want to
check for ı vs. i, it's possible, but it's still not clear to me that
what you're doing will be the correct behavior in Turkish locales [are
there any Turkish words, names, abbreviations, etc. that only differ in
that character?] and it doesn't matter elsewhere.



Re: [File API: FileSystem] Path restrictions and case-sensitivity

2011-05-11 Thread Eric U
On Wed, May 11, 2011 at 4:52 PM, Glenn Maynard gl...@zewt.org wrote:
 On Wed, May 11, 2011 at 7:08 PM, Eric U er...@google.com wrote:

  no, if the api is case insensitive, then it's case insensitive
  *everywhere*, both on Turkish and on English systems. Things could
  only be case sensitive when serialized to a real file system outside
  of the API. I'm not proposing a case insensitive system which is
  locale aware, i'm proposing one which always folds.

 You're proposing not just a case-insensitive system, but one that forces
 e.g. an English locale on all users, even those in a Turkish locale.  I
 don't think that's an acceptable solution.

 I also don't think having code that works in one locale and not another
 [Glenn's image.jpg example] is fantastic.  It was what we were stuck with
 when I was trying to allow implementers the choice of a pass-through
 implementation, but given that that's fallen to the realities of path
 lengths on Windows, I feel like we should try to do better.

 To clarify something which I wasn't aware of before digging into this
 deeper: Unicode case folding is *not* locale-sensitive.  Unlike lowercasing,
 it uses the same rules in all locales, except Turkish.  Turkish isn't just
 an easy-to-explain example of one of many differences (as it is with Unicode
 lowercasing); it is, as far as I see, the *only* exception.  Unicode's case
 folding rules have a special flag to enable Turkish in case folding, which
 we can safely ignore here--nobody uses it for filenames.  (Windows filenames
 don't honor that special case on Turkish systems, so those users are already
 accustomed to that.)

So it's not locale-sensitive unless it is, but nobody does that
anyway, so don't worry about it?  I'm a bit uneasy about that in
general, but Windows not supporting it is a good point.  Anyone know
about Mac or Linux systems?
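
To make the IMAGE.JPG example concrete, here is a small sketch comparing default folding with the Turkic variant. It assumes Python's str.casefold() implements the default (locale-independent) Unicode folding; the Turkic mappings (I to dotless ı, İ to i) are hand-coded here because Python has no built-in flag for them, and both function names are hypothetical.

```python
DOTLESS_I = "\u0131"          # ı
DOTTED_CAPITAL_I = "\u0130"   # İ


def default_fold(name: str) -> str:
    """Locale-independent folding, as proposed for use everywhere."""
    return name.casefold()


def turkic_fold(name: str) -> str:
    """Folding with the Turkic special case applied first (illustrative)."""
    name = name.replace("I", DOTLESS_I).replace(DOTTED_CAPITAL_I, "i")
    return name.casefold()


same_by_default = default_fold("IMAGE.JPG") == default_fold("image.jpg")
same_in_turkic = turkic_fold("IMAGE.JPG") == turkic_fold("image.jpg")
```

Under default folding the two names collide; with the Turkic rule enabled they are distinct files, which is the interop problem Glenn described.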

 That said, it's still uncomfortable having a dependency on the Unicode
 folding table here: if it ever changes, it'll cause both interop problems
 and data consistency problems (two files which used to be distinct filenames
 turning into two files with the same filenames due to a browser update
 updating its Unicode data).  Granted, either case would probably be
 vanishingly rare in practice at this point.

Agreed [both in the discomfort and the rarity], but I think it's a
very ugly dependency anyway.

 All that aside, I think a much stronger argument for case-sensitive
 filenames is the ability to import files from essentially any environment;
 this API's filename rules are almost entirely a superset of all other
 filesystems and file containers.  For example, sites can allow importing
 (once the needed APIs are in place) directories of data into the sandbox,
 without having to modify any filenames to make it fit a more constrained
 API.  Similarly, sites can extract tarballs directly into the sandbox.
 (I've seen tars containing both Makefile and makefile; maybe people only
 do that to confound Windows users, but they exist.)

I've actually ended up in that situation on Linux, with tools that
autogenerated makefiles, but were run from Makefiles.  It's not a
situation I really wanted to be in, but it was nice that it actually
worked without me having to hack around it.

 I'm not liking the backslash exception.  It's the only thing that prevents
 this API from being a complete superset, as far as I can see, of all
 production filesystems.  Can we drop that rule?  It might be a little
 surprising to developers who have only worked in Windows, but they'll be
 surprised anyway, and it shouldn't lead to latent bugs.

It can't be a complete superset of all filesystems in that it doesn't
allow forward slash in filenames either.
However, I see your point.  You could certainly have a filename with a
backslash in it on a Linux/ext2 system.  Does anyone else have an
opinion on whether it's worth the confusion potential?

 Glenn:
  This can be solved at the application layer in applications that want
  it, without baking it into the filesystem API.

 This is mostly true; you'd have to make sure that all alterations to the
 filesystem went through a single choke-point or you'd have the potential
 for race conditions [or you'd need to store the original-case filenames
 yourself, and send the folded case down to the filesystem API].

 Yeah, it's not necessarily easy to get right, particularly if you have
 multiple threads running...



 (The rest was Charles, by the way.)

Ah, sorry Glenn and Charles.

  A virtual FS as the backing for the filesystem API does not resolve that
  core issue.  It makes sense to encourage authors to gracefully handle
  errors thrown by creating files and directories.  Such a need has already
  been introduced via Google Chrome's unfortunate limitation of a 255 byte
  max path length.


 --
 Glenn Maynard






Re: [File API: FileSystem] Path restrictions and case-sensitivity

2011-05-11 Thread Eric U
On Wed, May 11, 2011 at 7:14 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wednesday, May 11, 2011, Eric U er...@google.com wrote:
 I've grouped responses to bits of this thread so far below:

 Glenn said:
 If *this API's* concept of filenames is case-insensitive, then IMAGE.JPG
 and image.jpg represent the same file on English systems and two different
 files on Turkish systems, which is an interop problem.

 Timeless replied:
 no, if the api is case insensitive, then it's case insensitive
 *everywhere*, both on Turkish and on English systems. Things could
 only be case sensitive when serialized to a real file system outside
 of the API. I'm not proposing a case insensitive system which is
 locale aware, i'm proposing one which always folds.

 You're proposing not just a case-insensitive system, but one that forces
 e.g. an English locale on all users, even those in a Turkish locale.  I
 don't think that's an acceptable solution.

 I also don't think having code that works in one locale and not another
 [Glenn's image.jpg example] is fantastic.  It was what we were stuck with
 when I was trying to allow implementers the choice of a pass-through
 implementation, but given that that's fallen to the realities of path
 lengths on Windows, I feel like we should try to do better.

 Glenn:
 This can be solved at the application layer in applications that want
 it, without baking it into the filesystem API.

 This is mostly true; you'd have to make sure that all alterations to the
 filesystem went through a single choke-point or you'd have the potential for
 race conditions [or you'd need to store the original-case filenames yourself,
 and send the folded case down to the filesystem API].

 Glenn:
 A virtual FS as the backing for the filesystem API does not resolve that
 core issue.  It makes sense to encourage authors to gracefully handle
 errors thrown by creating files and directories.  Such a need has already
 been introduced via Google Chrome's unfortunate limitation of a 255 byte
 max path length.

 That limitation grew out of the OS-dependent passthrough implementation.
 We're fixing that right now, with this proposal.

 The one take-away I have from that bug: it would have been nice to have a
 more descriptive error message.  It took a while to figure out that the
 path length was too long for the implementation.

 I apologize for that--it was an oversight.  If we can relax the restrictions
 to a small set, it'll be more obvious what the problems are.  IIRC this
 problem was particularly confusing because we were stopping you well short
 of the allowed 255 bytes, due to your profile's nesting depth.

 I'd like to obviate the need for complicated exceptions or APIs that suggest
 better names, by leaving naming up to the app developer as much as possible.

 [segue into other topics]

 Glenn asked about future expansions of IndexedDB to handle Blobs,
 specifically with respect to FileWriter and efficient incremental writes.

 Jonas replied:
 A combination of FileWriter and IndexedDB should be able to handle
 this without problem. This would go beyond what is currently in the
 IndexedDB spec, but it's this part that we're planning on
 experimenting with.

 The way I have envisioned it to work is to add a function called
 createFileEntry somewhere, for example the IDBFactory interface. This
 would return a fileEntry which you could then write to using
 FileWriter as well as store in the database using normal database
 operations.

 As Jonas and I have discussed in the past, I think that storing Blobs via
 reference in IDB works fine, but when you make them modifiable FileEntries
 instead, you either have to give up IDB's transactional nature or you have
 to give up efficiency.  For large mutable Blobs, I don't think there's
 going to be a clean interface there.  Still, I look forward to seeing what
 you come up with.

 Why not simply make the API case sensitive and allow *any* filename
 that can be expressed in JavaScript strings.

That's the way I'm leaning.

 Implementations can do their best to make the on-filesystem-filename
 match as close as they can to the filename exposed in the API and keep
 a map which maps between OS filename and API filename for the cases
 when the two can't be the same.

We're not speccing out anything outside the sandbox yet, and we've
decided that a pass-through implementation is impractical, so we don't
need this approach yet, there being no on-filesystem-filename.  It
certainly could work for the oft-mentioned My Photos extension, when
we get around to that.

 So if the page creates two files named Makefile and makefile on a
 system that is case insensitive, the implementation could call the
 second file makefile(2) and keep track of that mapping.
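
The mapping described above might look roughly like the following. It is a sketch under assumptions: the backing filesystem is taken to be case-insensitive (modelled with casefold()), the "(2)" suffix scheme simply follows the makefile(2) example, and the NameMapper class is hypothetical rather than part of any proposal.

```python
class NameMapper:
    """Hypothetical map from case-sensitive API names to on-disk names."""

    def __init__(self):
        self._api_to_os = {}       # exact API name -> on-disk name
        self._used_folded = set()  # folded on-disk names already taken

    def os_name_for(self, api_name: str) -> str:
        if api_name in self._api_to_os:
            return self._api_to_os[api_name]
        candidate = api_name
        n = 1
        # Add "(2)", "(3)", ... until the name no longer collides on a
        # case-insensitive filesystem.
        while candidate.casefold() in self._used_folded:
            n += 1
            candidate = f"{api_name}({n})"
        self._used_folded.add(candidate.casefold())
        self._api_to_os[api_name] = candidate
        return candidate


mapper = NameMapper()
first = mapper.os_name_for("Makefile")
second = mapper.os_name_for("makefile")
```

The API keeps seeing Makefile and makefile as two distinct files; only the implementation knows the second one lives under a different on-disk name.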

 This removes any concerns about case, internationalization and system
 limitation issues and thereby makes things very easy for web authors.

 I might be missing something obvious as I haven't

[File API: FileSystem] Path restrictions and case-sensitivity

2011-05-03 Thread Eric U
I'd like to bring back up the discussion that went on at [1] and [2].

In particular, I'd like to propose a minimal set of restrictions for
file names and paths, punt on the issue of what happens in later
layers of the API, and discuss case-sensitivity rules.

For the sandboxed filesystem, I propose that we disallow only:
* Embedded null characters [will likely break something somewhere]
* Embedded forward slash (/) [it's our delimiter]
* Embedded backslash (\) [will likely confuse people if we permit it]
* Files called '.' [has a meaning for us already]
* Files called '..' [has a meaning for us already]
* Path segments longer than 1KB [probably long enough, and I feel
better having a limit]
...and explicitly support anything other than that.  I'm not proposing
a maximum path length at this time...perhaps we should just say MUST
support at least X for some large X?
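
As a rough illustration, the proposed restrictions amount to a short check per path segment. This is a sketch, not spec text: the function name is hypothetical, and measuring the 1KB limit in UTF-8 bytes is an assumption, since the proposal doesn't say how the length is counted.

```python
from typing import Optional

# "1KB" from the proposal; counting UTF-8 bytes is an assumption.
MAX_SEGMENT_BYTES = 1024


def segment_problem(name: str) -> Optional[str]:
    """Return a reason the segment is disallowed, or None if it is allowed."""
    if "\0" in name:
        return "embedded null character"
    if "/" in name:
        return "embedded forward slash"
    if "\\" in name:
        return "embedded backslash"
    if name in (".", ".."):
        return "reserved name"
    if len(name.encode("utf-8", "surrogatepass")) > MAX_SEGMENT_BYTES:
        return "segment longer than 1KB"
    # Everything else is explicitly supported, including names that a
    # particular host OS might reject.
    return None


problems = {n: segment_problem(n) for n in ["Makefile", "a/b", "..", "CON"]}
```

Returning a specific reason, rather than a bare failure, also addresses the earlier complaint about undescriptive errors.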

Regarding case sensitivity: I originally specced it as
case-insensitive-case-preserving to make it easier to support a
passthrough implementation on Windows and Mac.  However, as
passthroughs have turned out to be unfeasible [see previous thread on
path length problems], all case insensitivity really gets us is
potential locale issues.  I suggest we drop it and just go with a
case-sensitive filesystem.

Eric

[1] http://lists.w3.org/Archives/Public/public-webapps/2010OctDec/1031.html
[2] http://lists.w3.org/Archives/Public/public-webapps/2011JanMar/0704.html