[fileapi-directories-and-system/filewriter]
Status: The specs are clearly dead; it's just been way down on my priority list to do anything about it. We should funnel it off to be a Note [or whatever the proper procedure is--Art?]. Eric
Re: IndexedDB: Syntax for specifying persistent/temporary storage
Good writeup, Jonas--I think you've hit the major points. I think numeric priorities are both overkill and underpowered, depending upon their specific implementation. Without the promise we're currently making for Persistent storage [this will never be cleared unless you do it or the user explicitly requests it], numeric priorities are ultimately weaker than apps want. Unless we say that the top priority is the same as persistent, in which case we've added complexity without taking any away.

The idea of Default is kind of appealing, and easy to migrate over to, but I'm not sure it's necessary. As Kinuko says, we can just unlock Persistent storage for apps on install, and let them migrate over whichever data needs it. This would work better if we supplied a tool to do an atomic migration, though--using the current APIs, apps would have to use 2x their storage during the transition, whereas browser developers might be able to implement it internally with a simple flag change or directory rename. I don't have a strong opinion there, but I lean toward just the two types rather than three.

As for Alex's "please clear up space" event--it's not clear to me how to do that cleanly for apps that aren't currently loaded, which may need to talk to servers that aren't currently running, which the user may never plan to run again, or which require credentials to access their stored data, etc.

On Wed, Dec 11, 2013 at 7:39 PM, Jonas Sicking jo...@sicking.cc wrote:

Hi All, Thanks Jan for sending this. Now let me throw a giant wrench into this discussion :-) Unfortunately, as we've been discussing webapps, manifests, etc. at Mozilla, I've slowly come to the realization that the temporary/persistent categorization isn't really fulfilling all the envisioned use cases.

The background is that multiple platforms are now building the functionality to run normal websites outside of the browser. iOS was one of the first popular implementations of this.
If you put a meta name=apple-mobile-web-app-capable content=yes in the markup of a page, and the user uses the bookmark-to-homescreen feature in iOS Safari, that almost turns the website into an app [1]. Google is currently working on implementing the same feature in Chrome for Android. At Mozilla we created a proposal [2] for what is essentially a standardized version of the same idea. I think this approach is a really awesome use of the web and something that I'm very interested in supporting when designing these storage APIs.

To support this use case, I think it needs to be possible for a website to first start as a website which the user only has a casual connection with, then gradually grow into something that the user essentially treats as a trusted app. Such a trusted app should have much more ability to store data without having to ask the user for permission, and without that data being suddenly deleted because we're low on disk space. In short, such an app should be treated more like a native app when it comes to storage.

There are a few ways we can enable this use case. In the discussion below I'll use IndexedDB as an example of a storage API, but it applies to all storage APIs equally.

A) The temporary/persistent split almost enables this. We could say that when something that's a normal website stores data in temporary storage, we count that data towards both per-origin and global quotas. If the global quota fills up, then we silently delete data from websites in an LRU fashion. If the user converts the website to an app by using bookmark-to-homescreen, then we simply start treating the data stored in the temporary storage as persistent. I.e. we don't count it towards the global temporary-storage quota and we never delete it in order to make room for other websites.
For persistent databases, for normal websites we would put up a prompt (I'll leave out details like whether this happens only when the quota API is used, or whether it can happen when the database is being written to). If persistent storage is used by a bookmarked app, we simply would not prompt. In neither case would data stored in persistent storage ever be silently deleted in order to make room for other storage.

The problem with this solution is that it doesn't give bookmarked apps the ability to create truly temporary data. Even data that a bookmarked app puts in the temporary storage is effectively treated as persistent, and so not deleted if we start to run low on disk space. Temporary storage for apps is a feature that Android has, and that to some extent *nix OSs have had through the use of /tmp. It definitely is something that seems nice for constrained mobile devices.

B) We could create a temporary/default/persistent split. I.e. we create three different storage categories. The default is what's used if no storage category is explicitly specified when an IDB database is created. For normal webpages, default is treated like temporary. I.e. it is counted towards
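As an illustration of option (A)'s eviction rule, here is a toy model I'm adding (not code from the thread): temporary-storage origins are evicted least-recently-used-first when the global quota overflows, while origins promoted to persistent (e.g. after bookmark-to-homescreen) are exempt and don't count toward the global quota.

```javascript
// Toy model of option (A): temporary data is evicted LRU when the global
// quota overflows; origins marked persistent (bookmarked apps) are exempt.
// The shape of this policy is from the thread; the code is illustrative.
class QuotaManager {
  constructor(globalQuota) {
    this.globalQuota = globalQuota;
    this.origins = new Map(); // origin -> { bytes, persistent, lastUsed }
    this.clock = 0;
  }
  store(origin, bytes, persistent = false) {
    const prev = this.origins.get(origin);
    this.origins.set(origin, {
      bytes: (prev ? prev.bytes : 0) + bytes,
      persistent: persistent || (prev && prev.persistent) || false,
      lastUsed: ++this.clock,
    });
    this.evictIfNeeded();
  }
  temporaryUsage() { // persistent data doesn't count toward the global quota
    let sum = 0;
    for (const o of this.origins.values()) if (!o.persistent) sum += o.bytes;
    return sum;
  }
  evictIfNeeded() {
    while (this.temporaryUsage() > this.globalQuota) {
      let victim = null;
      for (const [origin, o] of this.origins) {
        if (!o.persistent &&
            (!victim || o.lastUsed < this.origins.get(victim).lastUsed)) {
          victim = origin;
        }
      }
      if (victim === null) break;        // nothing evictable remains
      this.origins.delete(victim);       // silently dropped, LRU first
    }
  }
}

const qm = new QuotaManager(100);
qm.store('https://app.example', 60, true); // bookmarked: persistent, exempt
qm.store('https://a.example', 80);
qm.store('https://b.example', 80);         // over quota: a.example is evicted
console.log([...qm.origins.keys()]); // ['https://app.example', 'https://b.example']
```

The point of the toy is that the same write from the same origin survives or not depending only on the origin's persistent flag, which is what "start treating the data as persistent" on bookmarking buys.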
Re: FileSystem API
OK, I just finished making my way through the public-script-coord thread [I'm not on that list, but someone pointed me to it]. I have no official objections to you editing a spec based on Jonas's proposal, but I do have a couple of questions:

1) Why is this on public-script-coord instead of public-webapps?

2) Is any vendor other than Mozilla actually interested in this proposal? When it was brought up on public-webapps, and at the WebApps F2F, it dropped with a resounding thud. Given the standardization failure of the Chrome FileSystem API, this could be a massive waste of time. Or it could just be a way for Mozilla to document its filesystem API, since we've already got documentation of the Chrome API, but then you don't need to drag public-script-coord into that.

I may have a few small bits of feedback on the color of the bikeshed, but mostly I'm going to stay out of it, lest I accidentally give the impression that we're going to implement it. As I stated at the F2F, we'll be the last ones to do it, but if 2 major browser vendors ship it first, we'll certainly consider it.

On Mon, Aug 19, 2013 at 3:11 PM, Arun Ranganathan a...@mozilla.com wrote:

Greetings Eric and WG, The Chair and I were discussing setting up repositories for the specifications discussed here (http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0307.html), notably the FileSystem API and File API v2. Before creating a repository to edit the FileSystem API, we thought we'd check with you about the first proposal, which Chrome implements, and get the Google perspective. You've edited the first FileSystem API proposal, which currently lives here (http://www.w3.org/TR/file-system-api/). Can I create a repository and edit the other proposal for FileSystem API, which currently exists as an email thread (http://lists.w3.org/Archives/Public/public-script-coord/2013JulSep/0379.html)? Just checking to see if there are any objections or concerns that would stop a draft or future WG activity.
Of course, technical nits should be heard as well, and can proceed concurrently with a draft :) -- A*
Re: ZIP archive API?
On Mon, May 6, 2013 at 5:03 AM, Glenn Maynard gl...@zewt.org wrote:

On Mon, May 6, 2013 at 6:27 AM, Robin Berjon ro...@w3.org wrote: Another question to take into account here is whether this should only be about zip. One of the limitations of zip archives is that they aren't streamable. Without boiling the ocean, adding support for a streamable format (which I don't think needs to be more complex than tar) would be a big plus.

Zips are streamable. That's what the local file headers are for. http://www.pkware.com/documents/casestudies/APPNOTE.TXT

This came up a few years ago; Gregg Tavares explained in [1] that only /some/ zipfiles are streamable, and you don't know whether yours are or not until you've seen the whole file. Eric

[1] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0362.html
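To make the streamability caveat concrete, here is a minimal sketch I'm adding (not from the thread) that inspects the start of a zip buffer. A streaming reader can begin at the local file header signature PK\x03\x04, but if general-purpose flag bit 3 is set, the entry's sizes live in a data descriptor *after* the compressed data, so a streaming consumer can't know the entry length up front; field offsets follow PKWARE's APPNOTE.

```javascript
// Sketch: peek at a zip's first local file header (layout per PKWARE APPNOTE).
// A streaming consumer can start reading at this header, but if bit 3 of the
// general-purpose flags is set, the compressed/uncompressed sizes are zero
// here and only appear in a trailing data descriptor -- the case Gregg
// Tavares pointed out that breaks naive streaming.
function inspectZipStart(buf) {
  if (buf.length < 30 || buf.readUInt32LE(0) !== 0x04034b50) {
    return { localHeader: false };            // no PK\x03\x04 signature
  }
  const flags = buf.readUInt16LE(6);          // general-purpose bit flags
  return {
    localHeader: true,
    usesDataDescriptor: (flags & 0x08) !== 0, // sizes deferred past the data
    compressedSize: buf.readUInt32LE(18),     // 0 when deferred
  };
}

// Minimal hand-built header for a stored, empty entry (no data descriptor).
const header = Buffer.alloc(30);
header.writeUInt32LE(0x04034b50, 0); // local file header signature
header.writeUInt16LE(0, 6);          // flags: bit 3 clear, sizes are up front
console.log(inspectZipStart(header).usesDataDescriptor); // false
```

The catch the thread describes is exactly the `usesDataDescriptor` branch: a zip writer that streamed its output sets that bit, and a reader can't tell it will need the central directory until it has already committed to streaming.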
Re: Blob URLs | autoRevoke, defaults, and resolutions
On Wed, May 1, 2013 at 5:16 PM, Glenn Maynard gl...@zewt.org wrote: On Wed, May 1, 2013 at 7:01 PM, Eric U er...@google.com wrote: Hmm...now Glenn points out another problem: if you /never/ load the image, for whatever reason, you can still leak it. How likely is that in good code, though? And is it worse than the current state in good or bad code?

I think it's much too easy for well-meaning developers to mess this up. The example I gave is code that *does* use the URL, but the browser may or may not actually do anything with it. (I wouldn't even call that author error--it's an interoperability failure.) Also, the failures are both expensive and subtle (e.g. lots of big blobs being silently leaked to disk), which is a pretty nasty failure mode. True.

Another problem is that APIs should be able to receive a URL, then use it multiple times. For example, srcset can change the image being displayed when the environment changes. oneTimeOnly would be weird in that case. For example, it would work when you load your page on a tablet, then work again when your browser outputs the display to a TV and changes the srcset image. (The image was never used, so the URL is still valid.) But then when you go back to the tablet screen and reconfigure back to the original configuration, it suddenly breaks, since the first URL was already used and discarded. The blob capture approach can be made to work with srcset, so this would work reliably.

I'm not really sure what you're saying here. If you want a URL to expire or otherwise be revoked, no, you can't use it multiple times after that. If you want it to work multiple times, don't revoke it or don't set oneTimeOnly.
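The manual create/revoke coupling that the thread says burdens developers looks like this in practice. This is a minimal sketch I'm adding, not code from the discussion; `withBlobUrl` is a hypothetical helper, and the snippet runs under Node 18+, where `Blob` and `URL.createObjectURL` are available, though the pattern is the same in a page.

```javascript
// Sketch: the pairing authors must get right when autoRevoke defaults to
// false -- every createObjectURL needs a matching revokeObjectURL, or the
// blob's backing storage leaks for the document's lifetime.
// withBlobUrl is a hypothetical helper, not a platform API.
function withBlobUrl(blob, use) {
  const url = URL.createObjectURL(blob); // "blob:..." -- valid until revoked
  try {
    return use(url);   // URL is usable any number of times inside here
  } finally {
    URL.revokeObjectURL(url); // without this line, the blob leaks
  }
}

const seen = withBlobUrl(new Blob(['hello']), (url) => {
  // In a page you would assign this to img.src, pass it to fetch(), etc.
  return url.startsWith('blob:');
});
console.log(seen); // true
```

Note the helper is only safe for synchronous consumers: revoking in `finally` right after kicking off an async load (img.src, srcset) reintroduces exactly the revoke-vs-fetch race the rest of this thread is about.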
Re: Blob URLs | autoRevoke, defaults, and resolutions
On Wed, May 1, 2013 at 3:36 PM, Arun Ranganathan a...@mozilla.com wrote: At the recent TPAC for Working Groups held in San Jose, Adrian Bateman, Jonas Sicking and I spent some time taking a look at how to remedy what the spec says today about Blob URLs, both from the perspective of default behavior and in terms of what correct autoRevoke behavior should be. This email is to summarize those discussions.

Blob URLs are used in different parts of the platform today, and are expected to work on the platform wherever URLs do. This includes CSS, MediaStream and MediaSource use cases [1], along with use of 'src='. (Separate discussions about a v2 of the File API spec, including use of a Futures-based model in lieu of the event model, took place, but submitting a LCWD with major interoperability amongst all browsers is a good goal for this draft.)

Here's a summary of the Blob URL issues:

1. There's the relatively easy question of defaults. While the spec says that URL.createObjectURL should create a Blob URL which has autoRevoke: true by default [2], there isn't any implementation that supports this, whether that's IE's oneTimeOnly behavior (which is related but different) or Firefox's autoRevoke implementation. Chrome doesn't touch this yet :) The spec will roll back the default from true to false. At least this matches what implementations do; there's been resistance to changing the default due to shipping applications relying on autoRevoke being false by default, or at least implementor reluctance [1].

Sounds good. Let's just be consistent.

Switching the default to false would enable IE, Chrome, and Firefox to have interoperability with URL.createObjectURL(blobArg), though such a default places burdens on web developers to couple create* calls with revoke* calls to not leak Blobs. Jonas proposes a separate method, URL.createAutoRevokeObjectURL, which creates an autoRevoke URL.
I'm lukewarm on that :-\ I'd support a new method with a different default, if we could figure out a reasonable thing for that new method to do.

2. Regardless of the default, there's the hard question of what to do with Blob URL revocation. Glenn / zewt points out that this applies, though perhaps less dramatically, to *manually* revoked Blob URLs, and provides some test cases [3]. Options are:

2a. To meticulously special-case Blob URLs, per Bug 17765 [4]. This calls for a synchronous step attached to wherever URLs are used, to peg Blob URL data at fetch, so that the chance of a concurrent revocation doesn't cause things to behave unpredictably. Firefox does a variation of this with keeping channels open, but solving this bug interoperably is going to be very hard, and has to be done in different places across the platform. And even within CSS. This is hard to move forward with. Hard.

2b. To adopt an 80-20 rule, and only specify what happens for some cases that seem common, but expressly disallow other cases. This might be a more muted version of Bug 17765, especially if it can't be done within fetch [5]. Ugly. This could mean that the blob clause for basic fetch [5] only defines some cases where a synchronous fetch can be run (TBD) but expressly disallows others where synchronous fetching is not feasible. This would limit the use of Blob URLs pretty drastically, but might be the only solution. For instance, asynchronous calls accompanying embed, defer, etc. might have to be expressly disallowed. It would be great if we do this in fetch [5] :-)

Just to be clear, this would limit the use of *autoRevoke* Blob URLs, not all Blob URLs, yes?

Essentially, this might be to do what Firefox does but document what dereference means [6], and be clear about what might break. Most implementors acknowledge that use of Blob URLs simply won't work in some cases (e.g. CSS cases, etc.). We should formalize that; it would involve listing what works explicitly. Anne?

2c.
Re-use oneTimeOnly as in IE's behavior for autoRevoke (but call it autoRevoke). But we jettisoned this for race conditions, e.g.:

// This is IE-only
img2.src = URL.createObjectURL(fileBlob, {oneTimeOnly: true});
// race now! then fail in IE only
img1.src = img2.src;

will fail in IE with oneTimeOnly. It appears to fail reliably, but again, dereference URL may not be interoperable here. This is probably not what we should do, but it was worth listing, since it carries the brute force of a shipping implementation, and shows how some % of the market has actively solved this problem :)

I'm not really sure this is so bad. I know it's the case I brought up, and I must admit that I disliked oneTimeOnly when I first heard about it, but all other proposals [including not having automatic revocation at all] now seem worse. Here you've set something to be oneTimeOnly and used it twice; if that fails in IE, that's correct. If it works some of
Re: Blob URLs | autoRevoke, defaults, and resolutions
On Wed, May 1, 2013 at 4:53 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, May 1, 2013 at 4:25 PM, Eric U er...@google.com wrote: On Wed, May 1, 2013 at 3:36 PM, Arun Ranganathan a...@mozilla.com wrote: Switching the default to false would enable IE, Chrome, and Firefox to have interoperability with URL.createObjectURL(blobArg), though such a default places burdens on web developers to couple create* calls with revoke* calls to not leak Blobs. Jonas proposes a separate method, URL.createAutoRevokeObjectURL, which creates an autoRevoke URL.

I'm lukewarm on that :-\ I'd support a new method with a different default, if we could figure out a reasonable thing for that new method to do.

Yeah, the if-condition here is quite important. But if we can figure out this problem, then my proposal would be to add a new method which has a nicer name than createObjectURL, so as to encourage authors to use that and have fewer leaks.

Heh; I wasn't even going to mention the name.

2. Regardless of the default, there's the hard question of what to do with Blob URL revocation. Glenn / zewt points out that this applies, though perhaps less dramatically, to *manually* revoked Blob URLs, and provides some test cases [3]. Options are:

2a. To meticulously special-case Blob URLs, per Bug 17765 [4]. This calls for a synchronous step attached to wherever URLs are used, to peg Blob URL data at fetch, so that the chance of a concurrent revocation doesn't cause things to behave unpredictably. Firefox does a variation of this with keeping channels open, but solving this bug interoperably is going to be very hard, and has to be done in different places across the platform. And even within CSS. This is hard to move forward with. Hard.

It actually has turned out to be surprisingly easy in Gecko. But I realize the same might not be true everywhere.

Right, and defining just when it happens, across browsers, may also be hard.
2b. To adopt an 80-20 rule, and only specify what happens for some cases that seem common, but expressly disallow other cases. This might be a more muted version of Bug 17765, especially if it can't be done within fetch [5]. Ugly. This could mean that the blob clause for basic fetch [5] only defines some cases where a synchronous fetch can be run (TBD) but expressly disallows others where synchronous fetching is not feasible. This would limit the use of Blob URLs pretty drastically, but might be the only solution. For instance, asynchronous calls accompanying embed, defer, etc. might have to be expressly disallowed. It would be great if we do this in fetch [5] :-)

Just to be clear, this would limit the use of *autoRevoke* Blob URLs, not all Blob URLs, yes?

No, it would limit the use of all *revokable* Blob URLs. Since you get exactly the same issues when the page calls revokeObjectURL manually. So that means that it applies to all Blob URLs.

Ah, right; all revoked Blob URLs.

2c. Re-use oneTimeOnly as in IE's behavior for autoRevoke (but call it autoRevoke). But we jettisoned this for race conditions, e.g.:

// This is IE-only
img2.src = URL.createObjectURL(fileBlob, {oneTimeOnly: true});
// race now! then fail in IE only
img1.src = img2.src;

will fail in IE with oneTimeOnly. It appears to fail reliably, but again, dereference URL may not be interoperable here. This is probably not what we should do, but it was worth listing, since it carries the brute force of a shipping implementation, and shows how some % of the market has actively solved this problem :)

I'm not really sure this is so bad. I know it's the case I brought up, and I must admit that I disliked oneTimeOnly when I first heard about it, but all other proposals [including not having automatic revocation at all] now seem worse. Here you've set something to be oneTimeOnly and used it twice; if that fails in IE, that's correct.
If it works some of the time in other browsers [after they implement oneTimeOnly], that's not good, but you did pretty much aim at your own foot. Developers who actively try to do the right thing will have consistent good results without extra code, at least. I realize that img1.src = img2.src failing is odd, but as [IIRC] Adrian pointed out, if it's an uncacheable image on a server that's gone away, couldn't that already happen, depending on your network stack implementation?

I'm more worried that if implementations don't initiate the load synchronously, which is hard per your comment above, then it can easily be random which of the two loads succeeds and which fails. If the revoking happens at the end of the load, both loads could even succeed depending on timing and implementation details.

Yup; I'm just saying that if you get a failure here, you shouldn't be surprised, no matter which img gets it. You did something explicitly wrong. Ideally we'd give predictable behavior, but if we can't do
Re: FileSystem compromise spec
On Fri, Nov 30, 2012 at 9:11 AM, SULLIVAN, BRYAN L bs3...@att.com wrote:

-----Original Message-----
From: Arthur Barstow [mailto:art.bars...@nokia.com]
Sent: Friday, November 30, 2012 6:46 AM
To: ext Eric U; Doug Schepers
Cc: Web Applications Working Group WG
Subject: Re: FileSystem compromise spec

On 11/15/12 7:39 PM, ext Eric U wrote: As discussed at TPAC, there's little support for the current FileSystem API, but some support for a new API, and I promised to put forth a compromise proposal. In order to do that, I'd like to hear 1) what kinds of changes would make it more popular; 2) who I'm trying to convince. There are a number of folks who have said that they're not interested in a FileSystem API at all, so I'd rather concentrate my efforts on those with skin in the game.

Note that even though we are a service provider and not a browser vendor, I do consider us to have skin in the game.

Sure thing; I was looking to hear from those who were interested, not necessarily those who were implementers.

* It's designed to handle both the sandbox and the outside-the-sandbox use cases. For folks interested in just the sandbox and no future expansions, that seems like wasted effort, and a sandbox-only API could be simpler. It's not clear to me that there is anyone interested in just the sandbox and no future expansions, but if there is, please speak up. I've certainly heard from folks with the opposite goal.

I am still looking for evidence that IndexedDB provides a high-performance, scalable, cross-domain alternative to native filesystem access. I've seen conflicting information on that, and will gather this information with whatever tests can be found to validate performance of browsers for IndexedDB.

I've seen no proposals for cross-domain access.

It seems like it would be useful to look at these various file and database specs from a high-level use-case perspective (f.ex. one way to address UC X is to use spec X). If anyone is aware of some related docs, please let me know.
Doug - wondering aloud here if this is something webplatform.org might cover, or if you know of someone that might be interested in creating this type of documentation?

In the Web TV IG I will be leading a task force specifically to address the recording and storage of media use cases, where storage options are the key focus. If someone can prove to us that in-the-sandbox storage addresses the needs (high-performance, scalable, cross-domain) then great; otherwise we will keep looking.

Isn't in the sandbox a bit opposed to cross-domain? Or are you suggesting some kind of a shared sandbox?

I'd like to hear from folks who are interested, but not in the current spec.

I note that this request seems to exclude (or recommend silence from) those that *want the current specs* as mentioned by Eric. So if there is a lack of contribution from those that support the other use cases noted (e.g. out-of-the-sandbox storage), it should not be taken as consensus with the alternative as discussed in this thread.

That's because we took an informal poll at TPAC as to where folks stood on these options:

1) the current spec
2) an evolution of the current spec to be more like the newer proposals [the compromise spec]
3) chuck it all and start over

...and not a single person present voted for option 1. I'll count you as 1, but there was a lot more support for 2 or 3. I promised to make a proposal for 2, and 3 needs at the very least an editor and a spec to become viable. I'm still hoping to hear who it is that's interested in 2, so that I can make sure to address their concerns.

I wasn't at TPAC, so I don't know who voted that way.
FileSystem compromise spec
As discussed at TPAC, there's little support for the current FileSystem API, but some support for a new API, and I promised to put forth a compromise proposal. In order to do that, I'd like to hear 1) what kinds of changes would make it more popular; 2) who I'm trying to convince. There are a number of folks who have said that they're not interested in a FileSystem API at all, so I'd rather concentrate my efforts on those with skin in the game.

So far I've been hearing:

* It's too complicated. A number of the methods aren't absolutely necessary if the user's willing to do a bit more work, so they should be dropped.
* Even for what functionality we keep, it could be simpler.
* The synchronous [worker-only] interface is superfluous. It's not necessary for 1.0, and it's a lot of extra implementation work.
* It's designed to handle both the sandbox and the outside-the-sandbox use cases. For folks interested in just the sandbox and no future expansions, that seems like wasted effort, and a sandbox-only API could be simpler. It's not clear to me that there is anyone interested in just the sandbox and no future expansions, but if there is, please speak up. I've certainly heard from folks with the opposite goal.

Does that sum it up? I'd like to hear from folks who are interested, but not in the current spec.

Thanks, Eric
Re: [quota-api] Need for session storage type
On Tue, Oct 30, 2012 at 1:04 PM, Brady Eidson beid...@apple.com wrote: (Sending again as my first attempt seems to have not gone out to the list)

On Oct 30, 2012, at 12:10 PM, Kinuko Yasuda kin...@chromium.org wrote: Reviving this thread as well... to give a chance to get more feedback before moving this forward. Let me briefly summarize: the proposal was to add a 'Session' storage type to the quota API, whose data should get wiped when the session is closed.

I like this.

Past related discussion: * Should the data go away in an unexpected crash? -- It should follow the behavior of session cookies on the UA

I'm not sure how useful it is to specify behavior in an unexpected crash. Almost by definition, such an event cannot have defined behavior.

Not true--databases do this all the time, as do journaling filesystems. While it's hard to guarantee that e.g. all data is wiped from the system, you can certainly specify whether or not it should be accessible to script on the next page load after the crash.

I think the bigger question is: what's a session? Does it end if I:

* close the window?
* close the last window in this origin?
* close the last window in this browser profile?
* quit the browser?
  - With or without "continue where I left off"/"load my same windows from last time"?
  - Due to an update that caused a restart?
  - Due to a crash, with automatic crash recovery?
* switch to another app on my phone/tablet?
* use enough other apps on my phone/tablet that the browser gets purged from memory?

I doubt browsers are consistent in all these situations, given that current Chrome doesn't behave the same as the Chrome of a year ago. So saying it should act like session cookies doesn't work.

* Some storage APIs implicitly have default storage types (e.g. sessionStorage - session, AppCache - temp) but IDB and localStorage do not have them. If we have more storage types we might need an explicit way to associate a storage API (or a data unit) with a particular storage type.
-- would be nice; we'll need a separate proposal / design for this though. The idea sounds useful, but I may want to hear a bit more discussion / opinion from other developers / vendors.

This is an especially squirrely area. Even the assumed default storage types listed are not necessarily accurate. For example, WebKit supports making AppCache permanent, and that is supported on Mac and iOS. How we should define which technology belongs to which storage type is not obvious to me. It requires explicitly specifying a storage type for each existing and future storage technology. It requires that storage type being a must-level requirement for each of those specs. And that removes the ability for user agents to be flexible in managing their own storage. For example, today a user agent could implement AppCache as permanent… up to a limit… at which point the application could go over that limit but now only be temporary. We would either have to remove that flexibility or account for it in this API.

Slightly tangential: a related question is how the new storage type should be enforced on various storage APIs. No storage API other than the FileSystem API has an explicit way to associate its data/storage with a particular storage type, and the current FileSystem API only knows temporary and persistent types.

Well, there's the distinction between localStorage and sessionStorage to keep in mind. (Not sure whether the former falls under temp or persistent, however.)

This is another example of the particularly squirrely area I mention above. As the LocalStorage spec reads today, any attempted guarantees as to the lifetime of the data are should-level guarantees, and therefore not guarantees at all. Therefore it is inarguably specified as temporary storage. However, Apple treats LocalStorage as sacred as a file on the filesystem, and we've reiterated our position on this in discussions in the past.

Will we have to report this in navigator.temporaryStorage anyway?
If we're adding more storage types (with different expiry options) it might be nice to have a better unified way to associate a group of data items/units with a specific storage type/options across different storage APIs.

That's an interesting suggestion. It's implicit when choosing sessionStorage (session) or AppCache (temp) but unclear for IDB and localStorage. Maybe a standard API for this would be a good thing.

I think we have to fully resolve this to move forward. Thanks, ~Brady
Re: Sandboxed Filesystem use cases? (was Re: Moving File API: Directories and System API to Note track?)
Asking about use cases that can be served by a filesystem API, but not by IDB, is reasonable [and I'll respond to it below], but it misses a lot of the point. The users I've talked to like the FS API because it's a simple interface that everyone already understands, one that's powerful enough to handle a huge variety of use cases. Sure, the async API makes it a bit more complicated. Every API that handles large data is stuck with the same overhead there. But underneath that, people know what to expect from it and can figure it out very quickly.

You just need to store 100KB? 1) Request a filesystem. 2) Open a file. 3) Write your data. Need a URL for that? Sure, it's just a file, so obviously that works. Want it organized in directories just like your server or dev environment? Go ahead. You don't have to write SQL queries, learn how to organize data into noSQL tables, or deal with version-change transactions. If you want to see what's in your data store, you don't need to write a viewer to dump your tables; you just go to the URL of any directory in your store and browse around. Our URLs have a natural structure that matches the directory tree. If you add URLs to IDB, with its free-form key/value arrangement, I don't foresee an immediate natural mapping that doesn't involve lots of escaping, ugly URLs, and/or limitations.

On to the use cases: things that work well in a sandboxed filesystem but don't work well in IDB [or any of the other current storage APIs] are those that involve nontransactional modifications of large blobs of data. For example, video/photo/audio editing, which involves data that's too big to store lots of extra copies of for rollback of failed transactions, and which you don't necessarily want to try to fit into memory. Overwriting just the ID3 tag of an MP3, or just the comment section of the EXIF in a JPEG, would be much more efficient via a filesystem interface.
Larger series of modifications to those files, which you don't want to hold in memory, would be similar. I know Jonas wants to bolt nontransactional data onto the side of IDB via FileHandle, but I think that the cure there is far worse than the disease, and I don't think anyone at Google likes that proposal. I haven't polled everyone, but that's the impression I get.

Beyond individual use cases: when looking at use cases for a filesystem API, people often want to separate the sandboxed cases and the non-sandboxed cases [My Photos, etc.]. It's also worthwhile to look at the added value of having a single API that works for both cases. You have a photo organizer that works in the sandbox with downloaded files? If your browser supports external filesystems, you can adapt your code to run in either place with a very small change [mainly dealing with paths that aren't legal on the local system]. If you're using IDB in the sandbox, and have a different API to expose media directories, you've got to start over, and then you have to maintain both systems.

One added API? It's pretty clear that people see the value of an API that lets one access My Photos from the web. That API is necessarily going to cope with files and directories on some platforms, even if others don't expose directories as such. If we're going to need to add a filesystem API of some kind to deal with that, also using the same API to manage a sandboxed storage area seems like a very small addition to the web platform, unlike the other storage APIs we've added in the past.

Regarding your final note: I'm not sure what you're talking about with BlobBuilder; is that the EXIF overwrite case you're trying to handle? If so, File[Handle|Writer] with BlobBuilder and seek seems to handle it better than anything else.
Eric On Tue, Sep 25, 2012 at 11:57 AM, Maciej Stachowiak m...@apple.com wrote: On Sep 25, 2012, at 10:20 AM, James Graham jgra...@opera.com wrote: In addition, this would be the fourth storage API that we have tried to introduce to the platform in 5 years (localStorage, WebSQL, IndexedDB being the other three), and the fifth in total. Of the four APIs excluding this one, one has failed over interoperability concerns (WebSQL), one has significant performance issues and is discouraged from production use (localStorage) and one suffers from significant problems due to its legacy design (cookies). The remaining API (IndexedDB) has not yet achieved widespread use. It seems to me that we don't have a great track record in this area, and rushing to add yet another API probably isn't wise. I would rather see JS-level implementations of a filesystem-like API on top of IndexedDB in order to work out the kinks without creating a legacy that has to be maintained for back-compat than native implementations at this time. I share your concerns about adding yet-another-storage API. (Although I believe there are major websites that have adopted or are in the process of adopting IndexedDB). I like my
Re: Moving File API: Directories and System API to Note track?
While I don't see any other browsers showing interest in implementing the FileSystem API as currently specced, I do see Firefox coming around to the belief that a filesystem-style API is a good thing, hence their DeviceStorage API. Rather than scrap the API that we've put 2 years of discussion and work into, why not work with us to evolve it to something you'd like more? If you have objections to specific attributes of the API, wouldn't it be more efficient to change just those things than to start over from scratch? Or worse, to have the Chrome filesystem API, the Firefox filesystem API, etc.? If I understand correctly, folks at Mozilla think having a directory abstraction is too heavy-weight, and would prefer users to slice and dice paths by hand. OK, that's a small change, and the functionality's roughly equivalent. We could probably even make migration fairly easy with a small polyfill. Jonas suggests FileHandle to replace FileWriter. That's clearly not a move to greater simplicity, and no polyfill is possible, but it does open up the potential for higher performance, especially in a multi-process browser. As I said when you proposed it, I'm interested, and we'd also like to solve the locking use cases. Let's talk about it, rather than throw the baby out with the bathwater. Eric On Tue, Sep 18, 2012 at 4:04 AM, Olli Pettay olli.pet...@helsinki.fi wrote: Hi all, I think we should discuss moving File API: Directories and System API from Recommendation track to Note. Mainly because the API hasn't been widely accepted nor implemented and also because there are other proposals which handle the same use cases. The problem with keeping the API in recommendation track is that people outside the standardization world think that the API is the one which all the browsers will implement and as of now that doesn't seem likely. -Olli
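[Editor's note: the "small polyfill" for the slice-and-dice-paths-by-hand style mentioned above might consist of helpers roughly like these. The function names are hypothetical illustrations, not from any spec:]

```javascript
// Hypothetical helpers for the "paths as plain strings" style,
// roughly what a polyfill bridging to a directory-based API would need.
function splitPath(path) {
  // Drop empty segments so "/a//b/" -> ["a", "b"].
  return path.split("/").filter(segment => segment.length > 0);
}

function joinPath(segments) {
  return "/" + segments.join("/");
}

function dirname(path) {
  const segments = splitPath(path);
  return joinPath(segments.slice(0, -1));
}

console.log(splitPath("/photos/2012/trip.jpg")); // ["photos", "2012", "trip.jpg"]
console.log(dirname("/photos/2012/trip.jpg"));   // "/photos/2012"
```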
Re: [File API] File behavior under modification
Agreed. On Wed, Jul 11, 2012 at 1:02 PM, Arun Ranganathan aranganat...@mozilla.com wrote: On May 23, 2012, at 9:58 AM, Glenn Maynard wrote: On Wed, May 23, 2012 at 3:03 AM, Kinuko Yasuda kin...@chromium.org wrote: Just to make sure, I assume 'the underlying storage' includes memory. Right. For simple Blobs without a mutable backing store, all of this essentially optimizes away. We should also make it clear whether .size and .lastModifiedDate should return live state or should just return the same constant values. (I assume the latter) It would be the values at the time of the snapshot state. (I doubt it was ever actually intended that lastModifiedDate always return the file's latest mtime. We'll find out when one of the editors gets around to this thread...) I think the ideal behavior is that it reflects values at snapshot state, but that reads fail if the snapshot state has been modified. -- A*
Re: Feedback on Quota Management API
On Wed, May 30, 2012 at 11:59 AM, Boris Zbarsky bzbar...@mit.edu wrote: On 5/30/12 2:05 PM, Eric Uhrhane wrote: How about session, which is guaranteed to go away when the browser exits Should it go away if the browser crashes (or is killed by an OOM killer or the background process killer on something like Android) and then restarts and restores the session? Should it go away if the user has explicitly set the browser to restore sessions and then restarts it? Off the top of my head, I dunno. I was just giving examples to explain that "I can't think of any other storage types" isn't a very solid argument that there will never be any more. I'm not actually proposing that we implement any of these at this time. Also, having read Robert's blog post now, I think he makes some good points, especially w.r.t. feature detection.
[File API] File behavior under modification
According to the latest editor's draft [1], a File object must always return an accurate lastModifiedDate if at all possible. On getting, if user agents can make this information available, this MUST return a new Date[HTML] object initialized to the last modified date of the file; otherwise, this MUST return null. However, if the underlying file has been modified since the creation of the File, reads processed on the File must throw exceptions or fire error events. ...if the file has been modified on disk since the File object reference is created, user agents MUST throw a NotReadableError... These seem somewhat contradictory...you can always look at the modification time and see that it's changed, but if you try to read it after a change, it blows up. The non-normative text about security concerns makes me think that perhaps both types of operations should fail if the file has changed [... guarding against modifications of files on disk after a selection has taken place]. That may not be necessary, but if it's not, I think we should call it out in non-normative text that explains why you can read the mod time and not the data. This came up in https://bugs.webkit.org/show_bug.cgi?id=86811; I believe WebKit is currently noncompliant with this part of the spec, and we were debating the correct behavior. Currently WebKit delays grabbing the modification time of the file until it's been referenced by a read or slice(), so it won't notice modifications that happen between selection and read. That was done because the slice creates a File object reference, but in my reading creating the File referring to the file should be the time of the snapshot, not creating a Blob referring to a File. What's the correct behavior? Eric [1] http://dev.w3.org/2006/webapi/FileAPI/
Re: Colliding FileWriters
On Mon, Mar 19, 2012 at 3:55 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Feb 29, 2012 at 8:44 AM, Eric U er...@google.com wrote: On Mon, Feb 27, 2012 at 4:40 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Feb 27, 2012 at 11:36 PM, Eric U er...@google.com wrote: One working subset would be: * Keep createFileWriter async. * Make it optionally exclusive [possibly by default]. If exclusive, its length member is trustworthy. If not, it can go stale. * Add an append method [needed only for non-exclusive writes, but useful for logs, and a safe default]. This sounds great to me if we make it exclusive by default and remove the .length member for non-exclusive writers. Or make it return null/undefined. I like exclusive-by-default. Of course, that means that by default you have to remember to call close() or depend on GC, but that's probably OK. I'm less sure about .length being unusable on non-exclusive writers, but it's growing on me. Since by default writers would be exclusive, length would generally work just the same as it does now. However, if it returns null/undefined in the nonexclusive case, users might accidentally do math on it (e.g. (length > 0) evaluating to false), and get confused. Perhaps it should throw? Also, what's the behavior when there's already an exclusive lock, and you call createFileWriter? Should it just not call you until the lock's free? Do we need a trylock that fails fast, calling errorCallback? I think the former's probably more useful than the latter, and you can always use a timer to give up if it takes too long, but there's no way to cancel a request, and you might get a call far later, when you've forgotten that you requested it. However this brings up another problem, which is how to support clients that want to mix read and write operations. Currently this is supported, but as far as I can tell it's pretty awkward. Every time you want to read you have to nest two asynchronous function calls. 
First one to get a File reference, and then one to do the actual read using a FileReader object. You can reuse the File reference, but only if you are doing multiple reads in a row with no writing in between. I thought about this for a while, and realized that I had no good suggestion because I couldn't picture the use cases. Do you have some handy that would help me think about it? Mixing reading and writing can be something as simple as increasing a counter somewhere in the file. First you need to read the counter value, then add one to it, then write the new value. But there's also more complex operations such as reordering a set of blocks to defragment the contents of a file. Yet another example would be modifying a .zip file to add a new file. When you do this you'll want to first read out the location of the current zip directory, then overwrite it with the new file and then the new directory. That helps, thanks. So we'll need to be able to do efficient (read[-modify-write]*), and we'll need to hold the lock for the reads as well as the writes. The lock should prevent any other writes [exclusive or not], but need not prevent unlocked reads. I think we'd want to prevent unlocked reads too, otherwise the read might read the file in an inconsistent state. See more further down. We sat down and did some thinking about these two issues. I.e. the locking and the read-write-mixed issue. The solution is good news and bad news. The good news is that we've come up with something that seems like it should work, the bad news is that it's a totally different design from the current FileReader and FileWriter designs. Hmm...it's interesting, but I don't think we necessarily have to scrap FR and FW to use it. 
Here's a modified version that uses the existing interfaces: interface LockedReaderWriter : FileReader, FileWriter { [all the FileReader and FileWriter members] readonly attribute File writeResult; } Unfortunately this doesn't make sense since the functions on FileReader expect a Blob to be passed to them. We could certainly use slightly modified versions which don't take a Blob argument, but we can't inherit FileReader directly. You missed the point of the writeResult field. You can slice it and give it to the FileReader-derived functions to do your reads, so no modifications to the API are needed. However there are two downsides with an approach like this. First off it means that you *always* have to nest read/write operations in asynchronous callbacks. I.e. you always have to write code like:

lock.write(...);
lock.onsuccess = function() {
  lock.write(...);
  lock.onsuccess = function() {
    lock.read(...);
    lock.onsuccess = function() {
      lock.write(..., lock.result, ...);
      lock.onsuccess = function() {
      }
    }
  }
}

True. That's forced by the fact that we've modeled FileWriter after FileReader, which is modeled after XHR, which has explicit visible state
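[Editor's note: the increment-a-counter case discussed in this thread can be made concrete with a runnable sketch. The `lock` object below is a minimal in-memory stub standing in for the proposed (never-standardized) lock-style API, just to show the control flow and the nesting it forces:]

```javascript
// Minimal in-memory stub standing in for the proposed lock object, only to
// make the control flow runnable; real reads/writes would be async file I/O.
function makeStubLock(initial) {
  let stored = initial;
  return {
    result: undefined,
    onsuccess: null,
    read() { this.result = stored; queueMicrotask(() => this.onsuccess()); },
    write(value) { stored = value; queueMicrotask(() => this.onsuccess()); },
    get value() { return stored; }
  };
}

// The read-modify-write counter from the thread: note the one level of
// nesting that every mixed read/write step incurs with this callback style.
function incrementCounter(lock, done) {
  lock.read();
  lock.onsuccess = function () {
    lock.write(lock.result + 1);             // modify and write back
    lock.onsuccess = function () { done(lock.value); };
  };
}

const lock = makeStubLock(41);
incrementCounter(lock, v => console.log(v)); // logs 42
```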
Re: BlobBuilder.append() should take ArrayBufferView in addition to ArrayBuffer
On Thu, Apr 12, 2012 at 12:54 PM, Anne van Kesteren ann...@opera.com wrote: On Thu, 12 Apr 2012 21:48:12 +0200, Boris Zbarsky bzbar...@mit.edu wrote: Because it's still in the current editor's draft and it's still in the Gecko code and I was just reviewing a patch to it and saw the API? ;) Eric, the plan is to remove that from File Writer, no? Yes. The next draft I publish will mark it deprecated, and it will eventually go away. However, currently at least Gecko and WebKit support BlobBuilder, and WebKit doesn't yet have the Blob constructor, so it'll be a little while before it actually fades away. That being said, we should be talking about making this addition to Blob, not to BlobBuilder. I thought we discussed long ago it should be removed in favor of a constructable(sp?) Blob? Could be. Like I said, it's still in the editor's draft. Blob with constructor is in http://dev.w3.org/2006/webapi/FileAPI/ Also, should it not accept just ArrayBufferView then as per XMLHttpRequest? Is there existing content depending on BlobBuilder and its ArrayBufferView stuff? I thought the idea was to not have BlobBuilder at all. -- Anne van Kesteren http://annevankesteren.nl/
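[Editor's note: for reference, the migration under discussion replaces BlobBuilder's append() calls with a single Blob constructor call, which accepts ArrayBuffer, ArrayBufferView, Blob, and string parts directly:]

```javascript
// Old, deprecated style (BlobBuilder, append() taking ArrayBuffer):
//   var bb = new BlobBuilder();
//   bb.append(view.buffer);
//   var blob = bb.getBlob();
// New style: the Blob constructor takes an array of parts, including
// ArrayBufferView parts directly, so no builder object is needed.
const view = new Uint8Array([0x42, 0x4c, 0x4f, 0x42]); // bytes for "BLOB"
const blob = new Blob([view, " parts", new Blob(["!"])]);
console.log(blob.size); // 4 + 6 + 1 = 11
```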
Re: Delay in File * spec publications in /TR/ [Was: CfC: publish LCWD of File API; deadline March 3]
On Fri, Mar 30, 2012 at 5:39 AM, Arthur Barstow art.bars...@nokia.com wrote: Hi All - the publication of the File API LC was delayed because of some logistical issues for Arun as well as some additional edits he intends to make. This delay also resulted in Eric's two File * specs not being published since they have a dependency on the latest File API spec. Arun - can you please give us at least a rough idea when you expect the spec to be ready for LC publication? Jonas - as co-Editor of File API, can you help get the File API LC published? Eric - your File * docs were last published in April 2011 so I think it would be good to get new versions published in /TR/ soon-ish. OTOH, if they have dependencies on the latest File API, it may be better to postpone their publication until File API is published. WDYT? If it's going to be more than a month to get Arun+Jonas's spec up, we might as well go ahead and publish mine; they've had quite a bit of change. If it's less than that, let's just do them all together. -Thanks, ArtB On Feb 25, 2012, at 7:19 AM, Arthur Barstow wrote: Comments and bugs submitted during the pre-LC comment period for File API spec have been addressed and since there are no open bugs, this is a Call for Consensus to publish a LCWD of the File API spec using the latest ED as the basis: http://dev.w3.org/2006/webapi/FileAPI/ This CfC satisfies the group's requirement to record the group's decision to request advancement for this LCWD. 
Note the Process Document states the following regarding the significance/meaning of a LCWD: [[ http://www.w3.org/2005/10/Process-20051014/tr.html#last-call Purpose: A Working Group's Last Call announcement is a signal that: * the Working Group believes that it has satisfied its relevant technical requirements (e.g., of the charter or requirements document) in the Working Draft; * the Working Group believes that it has satisfied significant dependencies with other groups; * other groups SHOULD review the document to confirm that these dependencies have been satisfied. In general, a Last Call announcement is also a signal that the Working Group is planning to advance the technical report to later maturity levels. ]] If you have any comments or concerns about this CfC, please send them to public-webapps@w3.org by March 3 at the latest. Positive response is preferred and encouraged and silence will be assumed to be agreement with the proposal. -Thanks, AB
Re: Colliding FileWriters
On Wed, Feb 29, 2012 at 8:44 AM, Eric U er...@google.com wrote: On Mon, Feb 27, 2012 at 4:40 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Feb 27, 2012 at 11:36 PM, Eric U er...@google.com wrote: One working subset would be: * Keep createFileWriter async. * Make it optionally exclusive [possibly by default]. If exclusive, its length member is trustworthy. If not, it can go stale. * Add an append method [needed only for non-exclusive writes, but useful for logs, and a safe default]. This sounds great to me if we make it exclusive by default and remove the .length member for non-exclusive writers. Or make it return null/undefined. I like exclusive-by-default. Of course, that means that by default you have to remember to call close() or depend on GC, but that's probably OK. I'm less sure about .length being unusable on non-exclusive writers, but it's growing on me. Since by default writers would be exclusive, length would generally work just the same as it does now. However, if it returns null/undefined in the nonexclusive case, users might accidentally do math on it (e.g. (length > 0) evaluating to false), and get confused. Perhaps it should throw? Also, what's the behavior when there's already an exclusive lock, and you call createFileWriter? Should it just not call you until the lock's free? Do we need a trylock that fails fast, calling errorCallback? I think the former's probably more useful than the latter, and you can always use a timer to give up if it takes too long, but there's no way to cancel a request, and you might get a call far later, when you've forgotten that you requested it. However this brings up another problem, which is how to support clients that want to mix read and write operations. Currently this is supported, but as far as I can tell it's pretty awkward. Every time you want to read you have to nest two asynchronous function calls. First one to get a File reference, and then one to do the actual read using a FileReader object. 
You can reuse the File reference, but only if you are doing multiple reads in a row with no writing in between. I thought about this for a while, and realized that I had no good suggestion because I couldn't picture the use cases. Do you have some handy that would help me think about it? Mixing reading and writing can be something as simple as increasing a counter somewhere in the file. First you need to read the counter value, then add one to it, then write the new value. But there's also more complex operations such as reordering a set of blocks to defragment the contents of a file. Yet another example would be modifying a .zip file to add a new file. When you do this you'll want to first read out the location of the current zip directory, then overwrite it with the new file and then the new directory. That helps, thanks. So we'll need to be able to do efficient (read[-modify-write]*), and we'll need to hold the lock for the reads as well as the writes. The lock should prevent any other writes [exclusive or not], but need not prevent unlocked reads. We sat down and did some thinking about these two issues. I.e. the locking and the read-write-mixed issue. The solution is good news and bad news. The good news is that we've come up with something that seems like it should work, the bad news is that it's a totally different design from the current FileReader and FileWriter designs. Hmm...it's interesting, but I don't think we necessarily have to scrap FR and FW to use it. Here's a modified version that uses the existing interfaces: interface LockedReaderWriter : FileReader, FileWriter { [all the FileReader and FileWriter members] readonly attribute File writeResult; } This came up in an offline discussion recently regarding a currently-unserved use case: using a web app to edit a file outside the browser sandbox. 
You can certainly drag the file into or out of the browser, but it's nothing like the experience you get with a native app, where if you select a file for editing you can read+write it many times, at its true location, without additional permission checks. If we added something like a refresh to regain expired locks with this object, and some way for the user to grant permissions to a file for the session, it could take care of that use case. What do you think? As with your proposal, as long as any read or write method has outstanding events, the lock is held. The difference here is that after any write method completes, and until another one begins or the lock is dropped, writeResult holds the state of the File as of the completion of the write. The rest of the time it's null. That way you're always as up-to-date as you can easily be, but no more so [it doesn't show partial writes during progress events]. To read, you use the standard FileReader interface, slicing writeResult as needed to get the appropriate offset. A potential feature
Re: Transferable and structured clones, was: Re: [FileAPI] Deterministic release of Blob proposal
On Wed, Mar 7, 2012 at 11:38 AM, Kenneth Russell k...@google.com wrote: On Tue, Mar 6, 2012 at 6:29 PM, Glenn Maynard gl...@zewt.org wrote: On Tue, Mar 6, 2012 at 4:24 PM, Michael Nordman micha...@google.com wrote: You can always call close() yourself, but Blob.close() should use the neuter mechanism already there, not make up a new one. Blobs aren't transferable, there is no existing mechanism that applies to them. Adding a blob.close() method is independent of making blob's transferable, the former is not prerequisite on the latter. There is an existing mechanism for closing objects. It's called neutering. Blob.close should use the same terminology, whether or not the object is a Transferable. On Tue, Mar 6, 2012 at 4:25 PM, Kenneth Russell k...@google.com wrote: I would be hesitant to impose a close() method on all future Transferable types. Why? All Transferable types must define how to neuter objects; all close() does is trigger it. I don't think adding one to ArrayBuffer would be a bad idea but I think that ideally it wouldn't be necessary. On memory constrained devices, it would still be more efficient to re-use large ArrayBuffers rather than close them and allocate new ones. That's often not possible, when the ArrayBuffer is returned to you from an API (eg. XHR2). This sounds like a good idea. As you pointed out offline, a key difference between Blobs and ArrayBuffers is that Blobs are always immutable. It isn't necessary to define Transferable semantics for Blobs in order to post them efficiently, but it was essential for ArrayBuffers. No new semantics need to be defined; the semantics of Transferable are defined by postMessage and are the same for all transferable objects. That's already done. The only thing that needs to be defined is how to neuter an object, which is what Blob.close() has to define anyway. 
Using Transferable for Blob will allow Blobs, ArrayBuffers, and any future large, structured clonable objects to all be released with the same mechanisms: either pass them in the transfer argument to a postMessage call, or use the consistent, identical close() method inherited from Transferable. This allows developers to think of the transfer list as a list of objects which won't be needed after the postMessage call. It doesn't matter that the underlying optimizations are different; the visible side-effects are identical (the object can no longer be accessed). Closing an object, and neutering it because it was transferred to a different owner, are different concepts. It's already been demonstrated that Blobs, being read-only, do not need to be transferred in order to send them efficiently from one owner to another. It's also been demonstrated that Blobs can be resource intensive and that an explicit closing mechanism is needed. I believe that we should fix the immediate problem and add a close() method to Blob. I'm not in favor of adding a similar method to ArrayBuffer at this time and therefore not to Transferable. There is a high-level goal to keep the typed array specification as minimal as possible, and having Transferable support leak in to the public methods of the interfaces contradicts that goal. This makes sense to me. Blob needs close independent of whether it's in Transferable, and Blob has no need to be Transferable, so let's not mix the two.
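[Editor's note: the neutering semantics debated here are observable today with ArrayBuffer. This sketch uses the modern structuredClone API as a stand-in for postMessage's transfer argument; the visible side effect is the same:]

```javascript
// Transferring an ArrayBuffer neuters the source: the data moves to the
// new owner and the original object is left detached, with zero length.
const buffer = new ArrayBuffer(16);
const transferred = structuredClone(buffer, { transfer: [buffer] });

console.log(transferred.byteLength); // 16 - the data lives here now
console.log(buffer.byteLength);      // 0  - the source was neutered
```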
Re: [FileAPI] Deterministic release of Blob proposal
On Tue, Mar 6, 2012 at 5:12 PM, Feras Moussa fer...@microsoft.com wrote: From: Arun Ranganathan [mailto:aranganat...@mozilla.com] Sent: Tuesday, March 06, 2012 1:27 PM To: Feras Moussa Cc: Adrian Bateman; public-webapps@w3.org; Ian Hickson; Anne van Kesteren Subject: Re: [FileAPI] Deterministic release of Blob proposal Feras, In practice, I think this is important enough and manageable enough to include in the spec., and I'm willing to slow the train down if necessary, but I'd like to understand a few things first. Below: At TPAC we discussed the ability to deterministically close blobs with a few others. As we’ve discussed in the createObjectURL thread[1], a Blob may represent an expensive resource (eg. expensive in terms of memory, battery, or disk space). At present there is no way for an application to deterministically release the resource backing the Blob. Instead, an application must rely on the resource being cleaned up through a non-deterministic garbage collector once all references have been released. We have found that not having a way to deterministically release the resource causes a performance impact for a certain class of applications, and is especially important for mobile applications or devices with more limited resources. In particular, we’ve seen this become a problem for media intensive applications which interact with a large number of expensive blobs. For example, a gallery application may want to cycle through displaying many large images downloaded through websockets, and without a deterministic way to immediately release the reference to each image Blob, can easily begin to consume vast amounts of resources before the garbage collector is executed. To address this issue, we propose that a close method be added to the Blob interface. When called, the close method should release the underlying resource of the Blob, and future operations on the Blob will return a new error, a ClosedError. 
This allows an application to signal when it's finished using the Blob. Do you agree that Transferable (http://dev.w3.org/html5/spec/Overview.html#transferable-objects) seems to be what we're looking for, and that Blob should implement Transferable? Transferable addresses the use case of copying across threads, and neuters the source object (though honestly, the word neuter makes me wince -- naming is a problem on the web). We can have a more generic method on Transferable that serves our purpose here, rather than *.close(), and Blob can avail of that. This is something we can work out with HTML, and might be the right thing to do for the platform (although this creates something to think about for MessagePort and for ArrayBuffer, which also implement Transferable). I agree with your changes, but am confused by some edge cases: To support this change, the following changes in the File API spec are needed: * In section 6 (The Blob Interface) - Addition of a close method. When called, the close method releases the underlying resource of the Blob. Close renders the blob invalid, and further operations such as URL.createObjectURL or the FileReader read methods on the closed blob will fail and return a ClosedError. If there are any non-revoked URLs to the Blob, these URLs will continue to resolve until they have been revoked. - For the slice method, state that the returned Blob is a new Blob with its own lifetime semantics – calling close on the new Blob is independent of calling close on the original Blob. * In section 8 (The FileReader Interface) - State the FileReader reads directly over the given Blob, and not a copy with an independent lifetime. * In section 10 (Errors and Exceptions) - Addition of a ClosedError. If the File or Blob has had the close method called, then for asynchronous read methods the error attribute MUST return a “ClosedError” DOMError and synchronous read methods MUST throw a ClosedError exception. 
* In section 11.8 (Creating and Revoking a Blob URI) - For createObjectURL – If this method is called with a closed Blob argument, then user agents must throw a ClosedError exception. Similarly to how slice() clones the initial Blob to return one with its own independent lifetime, the same notion will be needed in other APIs which conceptually clone the data – namely FormData, any place the Structured Clone Algorithm is used, and BlobBuilder. Similarly to how FileReader must act directly on the Blob’s data, the same notion will be needed in other APIs which must act on the data - namely XHR.send and WebSocket. These APIs will need to throw an error if called on a Blob that was closed and the resources are released. So Blob.slice() already presumes a new Blob, but I can certainly make this clearer. And I agree with the changes above, including the addition of
Re: FileReader abort, again
On Mon, Mar 5, 2012 at 2:01 PM, Eric U er...@google.com wrote: On Thu, Mar 1, 2012 at 11:20 AM, Arun Ranganathan aranganat...@mozilla.com wrote: Eric, So we could: 1. Say not to fire a loadend if onloadend or onabort Do you mean if onload, onerror, or onabort...? No, actually. I'm looking for the right sequence of steps that results in abort's loadend not firing if terminated by another read*. Since abort will fire an abort event and a loadend event as spec'd (http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort), if *those* event handlers initiate a readAs*, we could then suppress abort's loadend. This seems messy. Ah, right--so a new read initiated from onload or onerror would NOT suppress the loadend of the first read. And I believe that this matches XHR2, so we're good. Nevermind. No, I retract that. In http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1627.html Anne confirmed that a new open in onerror or onload /would/ suppress the loadend of the first send. So we would want a new read/write in onload or onerror to do the same, not just those in onabort. Actually, if we really want to match XHR2, we should qualify all the places that we fire loadend. If the user calls XHR2's open in onerror or onload, that cancels its loadend. However, a simple check on readyState at step 6 won't do it. Because the user could call readAsText in onerror, then call abort in the second read's onloadstart, and we'd see readyState as DONE and fire loadend twice. To emulate XHR2 entirely, we'd need to have read methods dequeue any leftover tasks for previous read methods AND terminate the abort algorithm AND terminate the error algorithm of any previous read method. What a mess. This may be the way to do it. The problem with emulating XHR2 is that open() and send() are distinct concepts in XHR2, but in FileAPI, they are the same. So in XHR2 an open() canceling abort does make sense; abort() cancels a send(), and thus an open() should cancel an abort(). 
But in FileAPI, our readAs* methods are equivalent to *both* open() and send(). In FileAPI, an abort() cancels a readAs*; we now have a scenario where a readAs* may cancel an abort(). How to make that clear? I'm not sure why it's any more confusing that read* is open+send. read* can cancel abort, and abort can cancel read*. OK. Perhaps there's a simpler way to say successfully calling a read method inhibits any previous read's loadend? I'm in favor of any shorthand :) But this may not do justice to each readAs* algorithm being better defined. Hack 1: Don't call loadend synchronously. Enqueue it, and let read* methods clear the queues when they start up. This differs from XHR, though, and is a little odd. Still works, but needs to be applied in multiple places. Hack 2: Add a virtual generation counter/timestamp, not exposed to script. Increment it in read*, check it in abort before sending loadend. This is kind of complex, but works [and might be how I end up implementing this in Chrome]. Still works, but needs to be applied in multiple places. But really, I don't think either of those is better than just saying, in read*, something like terminate the algorithm for any abort sequence being processed. ...or any previously-initiated read being processed.
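[Editor's note: "Hack 2" above, the internal generation counter not exposed to script, might look roughly like this inside an implementation. A sketch with hypothetical names; real implementations track far more state:]

```javascript
// Sketch of the generation-counter idea: each read* bumps the counter,
// and a pending abort only fires loadend if no newer read has started.
class ReaderState {
  constructor() {
    this.generation = 0;
    this.events = [];   // stand-in for events actually dispatched to script
  }
  startRead() {
    this.generation += 1; // any in-flight abort now refers to an old generation
    this.events.push("loadstart");
    return this.generation;
  }
  finishAbort(generation) {
    this.events.push("abort");
    // Suppress the stale loadend if a newer read* was initiated meanwhile
    // (e.g. from an onabort handler).
    if (generation === this.generation) {
      this.events.push("loadend");
    }
  }
}

const state = new ReaderState();
const gen = state.startRead();
state.startRead();         // a new read begins before the abort completes...
state.finishAbort(gen);    // ...so the stale loadend is suppressed
console.log(state.events); // ["loadstart", "loadstart", "abort"]
```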
Re: [FileAPI] Deterministic release of Blob proposal
After a brief internal discussion, we like the idea over in Chrome-land. Let's make sure that we carefully spec out the edge cases, though. See below for some. On Fri, Mar 2, 2012 at 4:54 PM, Feras Moussa fer...@microsoft.com wrote: At TPAC we discussed the ability to deterministically close blobs with a few others. As we’ve discussed in the createObjectURL thread[1], a Blob may represent an expensive resource (eg. expensive in terms of memory, battery, or disk space). At present there is no way for an application to deterministically release the resource backing the Blob. Instead, an application must rely on the resource being cleaned up through a non-deterministic garbage collector once all references have been released. We have found that not having a way to deterministically release the resource causes a performance impact for a certain class of applications, and is especially important for mobile applications or devices with more limited resources. In particular, we’ve seen this become a problem for media intensive applications which interact with a large number of expensive blobs. For example, a gallery application may want to cycle through displaying many large images downloaded through websockets, and without a deterministic way to immediately release the reference to each image Blob, can easily begin to consume vast amounts of resources before the garbage collector is executed. To address this issue, we propose that a close method be added to the Blob interface. When called, the close method should release the underlying resource of the Blob, and future operations on the Blob will return a new error, a ClosedError. This allows an application to signal when it's finished using the Blob. To support this change, the following changes in the File API spec are needed: * In section 6 (The Blob Interface) - Addition of a close method. When called, the close method releases the underlying resource of the Blob. 
Close renders the blob invalid, and further operations such as URL.createObjectURL or the FileReader read methods on the closed blob will fail and return a ClosedError. If there are any non-revoked URLs to the Blob, these URLs will continue to resolve until they have been revoked. - For the slice method, state that the returned Blob is a new Blob with its own lifetime semantics – calling close on the new Blob is independent of calling close on the original Blob. * In section 8 (The FileReader Interface) - State that the FileReader reads directly over the given Blob, and not a copy with an independent lifetime. * In section 10 (Errors and Exceptions) - Addition of a ClosedError. If the File or Blob has had the close method called, then for asynchronous read methods the error attribute MUST return a “ClosedError” DOMError and synchronous read methods MUST throw a ClosedError exception. * In section 11.8 (Creating and Revoking a Blob URI) - For createObjectURL – If this method is called with a closed Blob argument, then user agents must throw a ClosedError exception. Similarly to how slice() clones the initial Blob to return one with its own independent lifetime, the same notion will be needed in other APIs which conceptually clone the data – namely FormData, any place the Structured Clone Algorithm is used, and BlobBuilder. What about: XHR.send(blob); blob.close(); or iframe.src = createObjectURL(blob); blob.close(); In the second example, if we say that the iframe does copy the blob, does that mean that closing the blob doesn't automatically revoke the URL, since it points at the new copy? Or does it point at the old copy and fail? Similarly to how FileReader must act directly on the Blob’s data, the same notion will be needed in other APIs which must act on the data - namely XHR.send and WebSocket. These APIs will need to throw an error if called on a Blob that was closed and the resources are released.
We’ve recently implemented this in experimental builds and have seen measurable performance improvements. The feedback we heard from our discussions with others at TPAC regarding our proposal to add a close() method to the Blob interface was that objects in the web platform potentially backed by expensive resources should have a deterministic way to be released. Thanks, Feras [1] http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1499.html
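The proposed close() semantics can be sketched in script. Since close() and ClosedError never shipped as proposed, the sketch below models them with a hypothetical wrapper class (all names here are illustrative, not from the spec), so the slice-has-independent-lifetime rule and the post-close error behavior can be seen concretely:

```javascript
// Hypothetical wrapper modeling the proposed Blob.close() semantics.
// ClosableBlob and ClosedError are illustrative names, not spec'd APIs.
class ClosedError extends Error {
  constructor() {
    super("Blob has been closed");
    this.name = "ClosedError";
  }
}

class ClosableBlob {
  constructor(blob) {
    this.blob = blob;
    this.closed = false;
  }
  close() {
    // Release the underlying resource; further operations fail.
    this.closed = true;
    this.blob = null;
  }
  slice(start, end) {
    if (this.closed) throw new ClosedError();
    // Per the proposal, slice() returns a new Blob with its own
    // independent lifetime.
    return new ClosableBlob(this.blob.slice(start, end));
  }
  get size() {
    if (this.closed) throw new ClosedError();
    return this.blob.size;
  }
}

const b = new ClosableBlob(new Blob(["hello"]));
const part = b.slice(0, 2);
b.close();
console.log(part.size);                         // 2 — the slice survives close()
try { b.size; } catch (e) { console.log(e.name); } // "ClosedError"
```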
Re: FileReader abort, again
On Thu, Mar 1, 2012 at 11:20 AM, Arun Ranganathan aranganat...@mozilla.com wrote: Eric, So we could: 1. Say not to fire a loadend if onloadend or onabort Do you mean if onload, onerror, or onabort...? No, actually. I'm looking for the right sequence of steps that results in abort's loadend not firing if terminated by another read*. Since abort will fire an abort event and a loadend event as spec'd (http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort), if *those* event handlers initiate a readAs*, we could then suppress abort's loadend. This seems messy. Ah, right--so a new read initiated from onload or onerror would NOT suppress the loadend of the first read. And I believe that this matches XHR2, so we're good. Nevermind. Actually, if we really want to match XHR2, we should qualify all the places that we fire loadend. If the user calls XHR2's open in onerror or onload, that cancels its loadend. However, a simple check on readyState at step 6 won't do it. Because the user could call readAsText in onerror, then call abort in the second read's onloadstart, and we'd see readyState as DONE and fire loadend twice. To emulate XHR2 entirely, we'd need to have read methods dequeue any leftover tasks for previous read methods AND terminate the abort algorithm AND terminate the error algorithm of any previous read method. What a mess. This may be the way to do it. The problem with emulating XHR2 is that open() and send() are distinct concepts in XHR2, but in FileAPI, they are the same. So in XHR2 an open() canceling abort does make sense; abort() cancels a send(), and thus an open() should cancel an abort(). But in FileAPI, our readAs* methods are equivalent to *both* open() and send(). In FileAPI, an abort() cancels a readAs*; we now have a scenario where a readAs* may cancel an abort(). How to make that clear? I'm not sure why it's any more confusing: read* is open+send, so read* can cancel abort, and abort can cancel read*. OK.
Perhaps there's a simpler way to say "successfully calling a read method inhibits any previous read's loadend"? I'm in favor of any shorthand :) But this may not do justice to each readAs* algorithm being better defined. Hack 1: Don't call loadend synchronously. Enqueue it, and let read* methods clear the queues when they start up. This differs from XHR, though, and is a little odd. Hack 2: Add a virtual generation counter/timestamp, not exposed to script. Increment it in read*, check it in abort before sending loadend. This is kind of complex, but works [and might be how I end up implementing this in Chrome]. But really, I don't think either of those is better than just saying, in read*, something like "terminate the algorithm for any abort sequence being processed." Eric
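Hack 2 above — the generation counter — can be sketched in script. This is purely illustrative (the real implementation detail would live inside the user agent, and the modeling of reads and aborts here is simplified), but it shows how a newer read* invalidates a stale abort's loadend:

```javascript
// Illustrative sketch of "Hack 2": a generation counter, not exposed to
// script, that lets a new read* suppress a stale abort's loadend.
class ReaderSketch {
  constructor() {
    this.generation = 0;
    this.events = [];
  }
  readAsText() {
    this.generation++;            // invalidates any pending abort sequence
    const gen = this.generation;
    this.events.push("loadstart");
    return gen;                   // the abort path remembers which read it targets
  }
  abort(gen) {
    this.events.push("abort");
    // Only fire loadend if no newer read has started since this abort began.
    if (gen === this.generation) this.events.push("loadend");
  }
}

const r = new ReaderSketch();
const g1 = r.readAsText();  // first read
r.readAsText();             // a second read starts before the abort completes
r.abort(g1);                // stale: the check suppresses loadend
console.log(r.events);      // ["loadstart", "loadstart", "abort"]
```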
Re: [fileapi] timing of readyState changes vs. events
On Thu, Mar 1, 2012 at 11:09 PM, Anne van Kesteren ann...@opera.com wrote: On Fri, 02 Mar 2012 01:01:55 +0100, Eric U er...@google.com wrote: On Thu, Mar 1, 2012 at 3:16 PM, Arun Ranganathan aranganat...@mozilla.com wrote: OK, so the change is to ensure that these events are fired directly, and not queued, right? I'll make this change. This applies to all readAs* methods. Yup. It should apply to any event associated with a state change [so e.g. onload, but not onloadend]. Uhm. What you need to do is queue a task that changes the state and fires the event. You cannot just fire an event from asynchronous operations. Pardon my ignorance, but why not? Is it because you have to define which task queue gets the operation? So would that mean that e.g. the current spec for readAsDataURL would have to queue steps 6 and 8-10? Anyway, my point was just that load needed to be done synchronously with the change to readyState, but loadend had no such restriction, since it wasn't tied to the readyState change.
Re: [fileapi] timing of readyState changes vs. events
On Thu, Mar 1, 2012 at 3:16 PM, Arun Ranganathan aranganat...@mozilla.com wrote: Eric, In the readAsText in the latest draft [1] I see that readyState gets set to done "When the blob has been read into memory fully." I see that elsewhere in the progress notification description, "When the data from the blob has been completely read into memory, queue a task to fire a progress event called load." So readyState changes separately from the sending of that progress event, since one is direct and the other queued, and script could observe the state in between. In the discussion at [2] we arranged to avoid that for FileWriter. We should do the same for FileReader. OK, so the change is to ensure that these events are fired directly, and not queued, right? I'll make this change. This applies to all readAs* methods. Yup. It should apply to any event associated with a state change [so e.g. onload, but not onloadend].
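The observable gap being discussed — readyState changed directly while the load event is merely queued — can be sketched like this (a simplified stand-in object, not the real FileReader):

```javascript
// Sketch of the race: readyState is set synchronously, but the load
// event is queued as a task, so script can observe DONE before onload
// has fired. "reader" is a stand-in, not a real FileReader.
const DONE = 2;
const reader = { readyState: 0, onload: null };

function finishRead() {
  reader.readyState = DONE;                // direct state change
  setTimeout(() => reader.onload?.(), 0);  // queued "load" task
}

finishRead();
// Any script running between the state change and the queued task
// observes the intermediate state:
console.log(reader.readyState === DONE); // true, yet no load event has fired
```

Firing the event directly at the point of the state change, as agreed above, closes this window.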
Re: Colliding FileWriters
On Mon, Feb 27, 2012 at 4:40 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Feb 27, 2012 at 11:36 PM, Eric U er...@google.com wrote: One working subset would be: * Keep createFileWriter async. * Make it optionally exclusive [possibly by default]. If exclusive, its length member is trustworthy. If not, it can go stale. * Add an append method [needed only for non-exclusive writes, but useful for logs, and a safe default]. This sounds great to me if we make it exclusive by default and remove the .length member for non-exclusive writers. Or make it return null/undefined. I like exclusive-by-default. Of course, that means that by default you have to remember to call close() or depend on GC, but that's probably OK. I'm less sure about .length being unusable on non-exclusive writers, but it's growing on me. Since by default writers would be exclusive, length would generally work just the same as it does now. However, if it returns null/undefined in the nonexclusive case, users might accidentally do math on it (e.g. (length > 0) evaluating to false), and get confused. Perhaps it should throw? Also, what's the behavior when there's already an exclusive lock, and you call createFileWriter? Should it just not call you back until the lock's free? Do we need a trylock that fails fast, calling errorCallback? I think the former's probably more useful than the latter, and you can always use a timer to give up if it takes too long, but there's no way to cancel a request, and you might get a call far later, when you've forgotten that you requested it. However this brings up another problem, which is how to support clients that want to mix read and write operations. Currently this is supported, but as far as I can tell it's pretty awkward. Every time you want to read you have to nest two asynchronous function calls. First one to get a File reference, and then one to do the actual read using a FileReader object.
You can reuse the File reference, but only if you are doing multiple reads in a row with no writing in between. I thought about this for a while, and realized that I had no good suggestion because I couldn't picture the use cases. Do you have some handy that would help me think about it? Mixing reading and writing can be something as simple as increasing a counter somewhere in the file. First you need to read the counter value, then add one to it, then write the new value. But there's also more complex operations such as reordering a set of blocks to defragment the contents of a file. Yet another example would be modifying a .zip file to add a new file. When you do this you'll want to first read out the location of the current zip directory, then overwrite it with the new file and then the new directory. That helps, thanks. So we'll need to be able to do efficient (read[-modify-write]*), and we'll need to hold the lock for the reads as well as the writes. The lock should prevent any other writes [exclusive or not], but need not prevent unlocked reads. We sat down and did some thinking about these two issues. I.e. the locking and the read-write-mixed issue. The solution is good news and bad news. The good news is that we've come up with something that seems like it should work, the bad news is that it's a totally different design from the current FileReader and FileWriter designs. Hmm...it's interesting, but I don't think we necessarily have to scrap FR and FW to use it. Here's a modified version that uses the existing interfaces: interface LockedReaderWriter : FileReader, FileWriter { [all the FileReader and FileWriter members] readonly attribute File writeResult; } As with your proposal, as long as any read or write method has outstanding events, the lock is held. The difference here is that after any write method completes, and until another one begins or the lock is dropped, writeResult holds the state of the File as of the completion of the write. 
The rest of the time it's null. That way you're always as up-to-date as you can easily be, but no more so [it doesn't show partial writes during progress events]. To read, you use the standard FileReader interface, slicing writeResult as needed to get the appropriate offset. A potential feature of this design is that you could use it to read a Blob that didn't come from writeResult, letting you pull in other data while still holding the lock. I'm not sure if we need that, but it's there if we want it. To do the locking without requiring calls to .close() or relying on GC we use a similar setup to IndexedDB transactions. I.e. you get an object which represents a locked file. As long as you use that lock to read from and write to the file the lock keeps being held. However as soon as you return to the event loop from the last progress notification from the last read/write operation, the lock is automatically released. I love that your design is [I believe] deadlock-free, as the write
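The read-modify-write counter example from earlier in the thread shows why one lock must span both the read and the write. A minimal sketch (simulated in-memory file and a hypothetical promise-chain lock, not the proposed LockedReaderWriter API):

```javascript
// Sketch of the counter use case: a lock held across the whole
// read-modify-write, or two concurrent increments can read the same
// value and lose an update. FileLock is hypothetical, not spec'd.
class FileLock {
  constructor() {
    this.tail = Promise.resolve();
  }
  // Queue a critical section; it runs only after earlier ones finish.
  run(criticalSection) {
    const result = this.tail.then(criticalSection);
    this.tail = result.catch(() => {}); // keep the chain alive on errors
    return result;
  }
}

let fileContents = "0"; // simulated file holding a counter
const lock = new FileLock();

function incrementCounter() {
  return lock.run(async () => {
    const value = parseInt(fileContents, 10); // read
    fileContents = String(value + 1);         // modify + write
  });
}

// Two concurrent increments serialize correctly under the lock:
Promise.all([incrementCounter(), incrementCounter()]).then(() => {
  console.log(fileContents); // "2"
});
```

Releasing the lock automatically when control returns to the event loop after the last operation, as in the IndexedDB-transaction-style design above, avoids relying on an explicit close() or GC.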
Re: FileReader abort, again
Incidentally, the way XHR gets around this is to have open cancel any in-progress abort. We could certainly do the same thing, having any readAs* cancel abort(). On Tue, Feb 28, 2012 at 4:15 PM, Eric U er...@google.com wrote: I like the Event Invariants writeup at the end. It's only informative, but it is, indeed, informative. However, I'm not sure it quite matches the normative text in one respect. Where you say [8.5.6 step 4]: "Terminate any steps while processing a read method." Does that also terminate the steps associated with an abort that terminated the read method? Basically I'm not sure what "steps while processing a read method" means. Otherwise, if you start a new read in onabort [8.5.6 step 5], you'll still deliver the loadend [8.5.6 step 6]. This contradicts 8.5.9.2.1: "Once a loadstart has been fired, a corresponding loadend fires at completion of the read, EXCEPT if the read method has been cancelled using abort() and a new read method has been invoked." Eric [copying this into FileWriter]
Re: FileReader abort, again
On Wed, Feb 29, 2012 at 1:43 PM, Arun Ranganathan aranganat...@mozilla.com wrote: FileReader.abort is like a bad penny :) However, I'm not sure it quite matches the normative text in one respect. Where you say [8.5.6 step 4]: "Terminate any steps while processing a read method." Does that also terminate the steps associated with an abort that terminated the read method? Basically I'm not sure what "steps while processing a read method" means. I've changed this to terminate only the read algorithm (and hopefully it is clear this isn't the same as the abort steps): http://dev.w3.org/2006/webapi/FileAPI/#terminate-an-algorithm and Otherwise, if you start a new read in onabort [8.5.6 step 5], you'll still deliver the loadend [8.5.6 step 6]. This contradicts 8.5.9.2.1: "Once a loadstart has been fired, a corresponding loadend fires at completion of the read, EXCEPT if the read method has been cancelled using abort() and a new read method has been invoked." This seems like familiar ground, and I'm sorry this contradiction still exists. So we could: 1. Say not to fire a loadend if onloadend or onabort re-initiate a read. But this may be odd in terms of analyzing a program beforehand. 2. Simply not fire loadend on abort. I'm not sure this is a good idea. What's your counsel? Have I missed something easier? -- A* My email must have crossed yours mid-flight, but just in case, how about speccing that read* methods terminate the abort algorithm? That's what XHR2 does, and it looks like it works. It's not the easiest thing to figure out when reading the spec. It took me a while to get my mind around it in XHR2, but then that's a much more complicated spec. FileReader's small enough that I think it's not unreasonable, and of course matching XHR2 means fewer surprises all around. Eric
Re: FileReader abort, again
On Wed, Feb 29, 2012 at 2:57 PM, Arun Ranganathan aranganat...@mozilla.com wrote: On Wed, Feb 29, 2012 at 1:43 PM, Arun Ranganathan Otherwise, if you start a new read in onabort [8.5.6 step 5], you'll still deliver the loadend [8.5.6 step 6]. This contradicts 8.5.9.2.1: "Once a loadstart has been fired, a corresponding loadend fires at completion of the read, EXCEPT if the read method has been cancelled using abort() and a new read method has been invoked." This seems like familiar ground, and I'm sorry this contradiction still exists. So we could: 1. Say not to fire a loadend if onloadend or onabort re-initiate a read. But this may be odd in terms of analyzing a program beforehand. Do you mean if onload, onerror, or onabort...? 2. Simply not fire loadend on abort. I'm not sure this is a good idea. Agreed. It should be there unless another read starts. What's your counsel? Have I missed something easier? -- A* My email must have crossed yours mid-flight, but just in case, how about speccing that read* methods terminate the abort algorithm? That's what XHR2 does, and it looks like it works. It's not the easiest thing to figure out when reading the spec. It took me a while to get my mind around it in XHR2, but then that's a much more complicated spec. FileReader's small enough that I think it's not unreasonable, and of course matching XHR2 means fewer surprises all around. OK, I'll study XHR2 and figure this out. Spec'ing this isn't a quick win, though, since abort's role is to terminate a read*! So to have a re-initiated read* terminate an abort will require some thought on invocation order. I don't see a conflict--abort terminates read, and read terminates abort. Actually, if we really want to match XHR2, we should qualify all the places that we fire loadend. If the user calls XHR2's open in onerror or onload, that cancels its loadend. However, a simple check on readyState at step 6 won't do it.
Because the user could call readAsText in onerror, then call abort in the second read's onloadstart, and we'd see readyState as DONE and fire loadend twice. To emulate XHR2 entirely, we'd need to have read methods dequeue any leftover tasks for previous read methods AND terminate the abort algorithm AND terminate the error algorithm of any previous read method. What a mess. Perhaps there's a simpler way to say "successfully calling a read method inhibits any previous read's loadend"? [steps 5 and 6 there are missing trailing periods, BTW]
[fileapi] timing of readyState changes vs. events
In the readAsText in the latest draft [1] I see that readyState gets set to done "When the blob has been read into memory fully." I see that elsewhere in the progress notification description, "When the data from the blob has been completely read into memory, queue a task to fire a progress event called load." So readyState changes separately from the sending of that progress event, since one is direct and the other queued, and script could observe the state in between. In the discussion at [2] we arranged to avoid that for FileWriter. We should do the same for FileReader. Eric [1] http://dev.w3.org/2006/webapi/FileAPI/ [2] http://lists.w3.org/Archives/Public/public-webapps/2010OctDec/0912.html
Re: [file-writer] WebIDL / References
On Sat, Feb 25, 2012 at 5:02 AM, Ms2ger ms2...@gmail.com wrote: Hi all, There are a number of bugs in the WebIDL blocks in http://dev.w3.org/2009/dap/file-system/file-writer.html. * The 'in' token has been removed; void append (in Blob data); should be void append (Blob data);. Fixed. * Event handlers should be [TreatNonCallableAsNull] Function? onfoo, not just Function. Fixed. * Interfaces should not have [NoInterfaceObject] without a good reason. Fixed. * FileException doesn't exist anymore; use DOMException. Still to come. Also, the References section is severely out of date. Fixed. HTH Ms2ger
FileReader abort, again
I like the Event Invariants writeup at the end. It's only informative, but it is, indeed, informative. However, I'm not sure it quite matches the normative text in one respect. Where you say [8.5.6 step 4]: "Terminate any steps while processing a read method." Does that also terminate the steps associated with an abort that terminated the read method? Basically I'm not sure what "steps while processing a read method" means. Otherwise, if you start a new read in onabort [8.5.6 step 5], you'll still deliver the loadend [8.5.6 step 6]. This contradicts 8.5.9.2.1: "Once a loadstart has been fired, a corresponding loadend fires at completion of the read, EXCEPT if the read method has been cancelled using abort() and a new read method has been invoked." Eric [copying this into FileWriter]
Re: [file-writer] WebIDL / References
Thanks! I'll take care of those. On Sat, Feb 25, 2012 at 5:02 AM, Ms2ger ms2...@gmail.com wrote: Hi all, There are a number of bugs in the WebIDL blocks in http://dev.w3.org/2009/dap/file-system/file-writer.html. * The 'in' token has been removed; void append (in Blob data); should be void append (Blob data);. * Event handlers should be [TreatNonCallableAsNull] Function? onfoo, not just Function. * Interfaces should not have [NoInterfaceObject] without a good reason. * FileException doesn't exist anymore; use DOMException. Also, the References section is severely out of date. HTH Ms2ger
Re: CfC: publish WD of File API: Writer + File API: Directories and System; deadline March 3
Yeah, the reason is that Arun's more on-the-ball than I am. I'll be updating the spec quite soon, I hope. On Mon, Feb 27, 2012 at 2:35 AM, Felix-Johannes Jendrusch felix-johannes.jendru...@fokus.fraunhofer.de wrote: Hi, is there any reason why the File API: Writer and File API: Directories and System specifications still use FileException/FileError-Objects? The File API uses DOM4's DOMException/DOMError [1]. Best regards, Felix [1] http://dev.w3.org/2006/webapi/FileAPI/#ErrorAndException
Re: Colliding FileWriters
Sorry about the slow response; I've been busy with dev work, and am now getting back to spec work. On Sat, Jan 21, 2012 at 9:57 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jan 11, 2012 at 1:41 PM, Eric U er...@google.com wrote: On Wed, Jan 11, 2012 at 12:25 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jan 10, 2012 at 1:32 PM, Eric U er...@google.com wrote: On Tue, Jan 10, 2012 at 1:08 PM, Jonas Sicking jo...@sicking.cc wrote: Hi All, We've been looking at implementing FileWriter and had a couple of questions. First of all, what happens if multiple pages create a FileWriter for the same FileEntry at the same time? Will both be able to write to the file at the same time and whoever writes last to a given byte wins? This isn't currently specified, and that's a hole we should fill. By not having it in the spec, my assumption would be that last-wins would hold, but it would be good to clarify it if that's the behavior we want. It's especially important given that there's nothing like fflush(), which would help users know what last meant. Speaking of which, should we add a flushing mechanism? This is different from how file systems normally work since as long as a file is open for writing that tends to prevent other processes from opening the same file. You're perhaps thinking of windows, where by default files are opened in exclusive mode? On other operating systems, and on windows when you specify FILE_SHARE_WRITE in dwShareMode in CreateFile, multiple writers can exist simultaneously. Ah. I didn't realize this was different on other OSs. It still seems risky to not provide any means to get exclusive access. The only way I can see websites dealing with this is to create their own locking mechanism backed by using IndexedDB transactions as a low-level atomic primitive (local-storage doesn't work since you can't implement compare-and-swap in an atomic manner). Having an 'exclusive' flag for createFileWriter seems much easier and removes the IndexedDB dependency.
I'd probably even say that it should default to true since on the web defaulting to safe rather than fast generally results in fewer bugs. I don't think I'd generally be averse to this. However, it would then require some sort of a revocation mechanism as well. If you're done with your FileWriter, you want to be able to get rid of it without depending on GC, so that another context can create one. And if you forget to revoke it, behavior in the second context presumably depends on GC, which is a bit ugly. I definitely agree that we need an explicit revoking mechanism. We have a similar situation in IndexedDB where as long as an IDBDatabase object is alive for a given database, no one can upgrade the database version. Here we do have an explicit .close() method, but if you forget to call it you end up waiting for GC. It's possibly somewhat less of a problem in IndexedDB though since upgrading database versions should be pretty rare. I'm not quite sure how urgent this is yet, though. I've been assuming that if you have transactional/synchronization semantics you want to maintain, you'll be using IDB anyway, or a server handshake, etc. But of course it's easy to write a naive app that the user loads in two windows, with bad effect. Yeah, it's the "user opens page in two windows" scenario that I'm concerned about. As well as similar conditions if, for example, you have a Worker thread which holds a connection to the server and occasionally writes data to a file based on information from the server, and code in a window which reads data from the file and acts on it. If the window is only reading, not writing, I don't see the problem with the current design. If the worker and window are both reading and writing, in the same file, the problem might be in the app's design. I don't think we can relegate synchronization semantics to IDB.
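The 'exclusive' flag plus explicit revocation being discussed can be sketched with an in-memory registry (all names here are hypothetical; the real createFileWriter is asynchronous and takes callbacks, which this sketch omits for brevity):

```javascript
// In-memory sketch of exclusive-by-default writers with explicit
// revocation. createFileWriter here is a synchronous stand-in for the
// real async API; the names and error text are illustrative only.
const activeWriters = new Map(); // path -> writer holding the lock

function createFileWriter(path, { exclusive = true } = {}) {
  if (exclusive && activeWriters.has(path)) {
    throw new Error("InvalidStateError: file is locked by another writer");
  }
  const writer = {
    close() {
      // Explicit revocation: release the lock without waiting for GC.
      activeWriters.delete(path);
    },
  };
  if (exclusive) activeWriters.set(path, writer);
  return writer;
}

const w1 = createFileWriter("/log.txt");
let refused = false;
try { createFileWriter("/log.txt"); } catch (e) { refused = true; }
console.log(refused);  // true — a second exclusive writer is refused
w1.close();            // releasing the lock lets another context in
createFileWriter("/log.txt");
```

The open question above — refuse immediately (trylock) versus wait for the lock to free — corresponds to whether this sketch throws or queues the request.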
I think we should have synchronization semantics at least as the default mode for all data that is shared between Workers and Windows which can be running on different threads. One great example is localStorage which we spent a lot of effort on trying to make synchronized using the storage mutex. We failed there, but not due to a lack of desire, but due to the way the API is structured. Though if we add the 'exclusive' flag described above, then we'll need to keep createFileWriter async anyway. Right--I think we should pick whatever subset of these suggestions seems the most useful, since they overlap a bit. Agreed. One working subset would be: * Keep createFileWriter async. * Make it optionally exclusive [possibly by default]. If exclusive, its length member is trustworthy. If not, it can go stale. * Add an append method [needed only for non-exclusive writes, but useful for logs, and a safe default]. This sounds great to me if we make it exclusive by default and remove the .length member for non-exclusive writers. Or make it return null/undefined.
Re: [FileAPI, common] UTF-16 to UTF-8 conversion
What I can do is procrastinate until we agree that BlobBuilder is deprecated, and this is now the problem of the Blob constructor. Over to you, Arun and Jonas. On Mon, Sep 26, 2011 at 11:45 AM, Eric U er...@google.com wrote: Thanks Glenn and Simon--I'll see what I can do. On Fri, Sep 23, 2011 at 1:34 AM, Simon Pieters sim...@opera.com wrote: On Fri, 23 Sep 2011 01:40:44 +0200, Glenn Maynard gl...@zewt.org wrote: BlobBuilder.append(text) says: Appends the supplied text to the current contents of the BlobBuilder, writing it as UTF-8, converting newlines as specified in endings. It doesn't elaborate any further. The conversion from UTF-16 to UTF-8 needs to be defined, in particular for the edge case of invalid UTF-16 surrogates. If this is already defined somewhere, it isn't referenced. I suppose this would belong in Common infrastructure, next to the existing section on UTF-8, not in FileAPI itself. WebSocket send() throws SYNTAX_ERR if its argument contains unpaired surrogates. It would be nice to be consistent. -- Simon Pieters Opera Software
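The edge case Glenn raises — what UTF-8 conversion does with unpaired UTF-16 surrogates — was eventually pinned down by the Encoding Standard: each lone surrogate is replaced with U+FFFD rather than throwing (the WebSocket SYNTAX_ERR behavior mentioned above did not win out). Today's TextEncoder shows the settled behavior:

```javascript
// The Encoding Standard's UTF-8 encoder replaces each unpaired
// surrogate with U+FFFD (bytes EF BF BD) instead of throwing.
const encoder = new TextEncoder(); // always UTF-8

// A properly paired surrogate encodes normally: U+1F600 -> F0 9F 98 80.
const paired = encoder.encode("\uD83D\uDE00");
console.log([...paired]); // [240, 159, 152, 128]

// A lone high surrogate becomes the replacement character U+FFFD.
const lone = encoder.encode("\uD800");
console.log([...lone]);   // [239, 191, 189]
```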
Re: [XHR] Invoking open() from event listeners
On Tue, Dec 20, 2011 at 9:24 AM, Anne van Kesteren ann...@opera.com wrote: Sorry for restarting this thread, but it seems we did not reach any conclusions last time around. On Thu, 03 Nov 2011 00:07:48 +0100, Eric U er...@google.com wrote: I think I may have missed something important. XHR2 specs just this behavior w.r.t. abort [another open will stop the abort's loadend] but /doesn't/ spec that for error or load. That is, an open() in onerror or onload does not appear to cancel the pending loadend. Anne, can you comment on why? I think I did not consider that scenario closely enough when I added support for these progress events. open() does terminate both abort() and send() (the way it does so is not very clear), but maybe it would be clearer if invoking open() set some kind of flag that is checked by both send() and abort() from the moment they start dispatching events. http://dvcs.w3.org/hg/xhr/raw-file/tip/Overview.html Ah, I see how that works now. So if you call open from onerror/onabort/onload, there's no loadend from the terminated XHR. And if you call open before onerror/onabort/onload, you don't get any of those either? If you call open from onerror, do other listeners later in the chain get their onerror calls? Glenn suggested not allowing open() at all, but I think for XMLHttpRequest we are past that (we have e.g. the readystatechange event which has been around since XMLHttpRequest support was added and open() is definitely called from it in the wild). -- Anne van Kesteren http://annevankesteren.nl/
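Anne's suggestion — open() sets a flag that send()/abort() check while dispatching events — can be sketched like this (illustrative only, not spec text; a counter stands in for the flag, and the event dispatch is simplified):

```javascript
// Sketch of flag-based termination: open() bumps a counter, and the
// event-dispatch loop in abort() re-checks it before each event, so an
// open() from a listener suppresses the trailing loadend.
const xhr = { openCount: 0, fired: [] };
let reopenFromAbortHandler = true; // simulates a listener calling open()

function open() {
  xhr.openCount++;
  xhr.fired.push("open");
}

function dispatchAbortEvents() {
  const startedAt = xhr.openCount;
  for (const type of ["abort", "loadend"]) {
    if (xhr.openCount !== startedAt) return; // open() ran mid-dispatch
    xhr.fired.push(type);
    if (type === "abort" && reopenFromAbortHandler) open();
  }
}

open();                 // initial request
dispatchAbortEvents();  // abort fires; its listener calls open()
console.log(xhr.fired); // ["open", "abort", "open"] — no trailing loadend
```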
Re: File modification
On Wed, Jan 11, 2012 at 12:22 PM, Charles Pritchard ch...@jumis.com wrote: On 1/11/2012 9:00 AM, Glenn Maynard wrote: This isn't properly specced anywhere and may be impossible to implement perfectly, but previous discussions indicated that Chrome, at least, wanted File objects loaded from input elements to only represent access for the file as it is when the user opened it. That is, the File is immutable (like a Blob), and if the underlying OS file changes (thus making the original data no longer available), attempting to read the File would fail. (This was in the context of storing File in structured clone persistent storage, like IndexedDB.) Mozilla seems to only take a snapshot when the user opens the file. Chrome goes in the other direction, and does so intentionally with FileEntry. I'd prefer everyone follow Chrome. We do so with FileEntry, in the sandbox, because it's intended to be a much more powerful API than File, and the security aspects of it are much simpler. When the user drags a File into the browser, it's much less clear that they intend to give the web app persistent access to that File, including all future changes until the page is closed. I don't think we'd rush to make that change to the spec. And if our implementation isn't snapshotting currently, that's a bug. The spec on this could be nudged slightly to support Chrome's existing behavior. From dragdrop: http://www.whatwg.org/specs/web-apps/current-work/multipage/dnd.html The files attribute must return a live FileList sequence http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#live If a DOM object is said to be live, then the attributes and methods on that object must operate on the actual underlying data, not a snapshot of the data. Dragdrop continues: for a given FileList object and a given underlying file, the same File object must be used each time. 
Given that the underlying file can change, and the FileList sequence is live, it seems reasonable that subsequent reads of FileList would access a different File object when the underlying file has changed. FileList.onchanged would be appropriate. File.onupdated would not be appropriate. Entry.onupdated would be appropriate. I have one major technical concern: monitoring files for changes isn't free. With only a DOM event, all instantiated Files (or Entries) would have to monitor changes; you don't want to depend on do something if an event handler is registered, since that violates the principle of event handler registration having no other side-effects. Monitoring should be enabled explicitly. I also wonder whether this could be implemented everywhere, eg. on mobile systems. At this point, iOS still doesn't allow input type=file nor dataTransfer of file. So, we're looking far ahead. A system may send a FileList.onchanged() event when it notices that the FileList has been updated. It can be done on access of a live FileList when a mutation is detected. It could be done by occasional polling, or it could be done via notify-style OS hooks. In the first case, there is no significant overhead. webkitdirectory returns a FileList object that can be monitored via directory notification hooks; again, if the OS supports it. Event handlers have some side effects, but not in the scripting environment. onclick, for example, may mean that an element responds to touch events in the mobile environment. -Charles
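A polling-based version of the change detection Charles describes — comparing lastModified snapshots rather than registering OS notification hooks — might look like this (FileList.onchanged is a proposal in this thread, not a shipping API, so plain objects stand in for File entries):

```javascript
// Sketch of polling-based change detection: snapshot each entry's
// lastModified and compare on every poll. Plain objects stand in for
// File objects; makeWatcher is a hypothetical name.
function makeWatcher(files) {
  let snapshot = files.map(f => f.lastModified);
  return function check() { // call periodically, e.g. via setInterval
    const current = files.map(f => f.lastModified);
    const changed = current.some((t, i) => t !== snapshot[i]);
    snapshot = current; // re-snapshot so each change is reported once
    return changed;
  };
}

const files = [{ name: "a.txt", lastModified: 1000 }];
const check = makeWatcher(files);
console.log(check()); // false — nothing has changed yet
files[0].lastModified = 2000; // simulate the underlying file changing
console.log(check()); // true — the watcher reports the modification
```

This makes the monitoring cost explicit and opt-in, addressing the concern above that mere event-handler registration should not have side effects.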
Re: Colliding FileWriters
On Wed, Jan 11, 2012 at 12:25 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jan 10, 2012 at 1:32 PM, Eric U er...@google.com wrote: On Tue, Jan 10, 2012 at 1:08 PM, Jonas Sicking jo...@sicking.cc wrote: Hi All, We've been looking at implementing FileWriter and had a couple of questions. First of all, what happens if multiple pages create a FileWriter for the same FileEntry at the same time? Will both be able to write to the file at the same time and whoever writes last to a given byte wins? This isn't currently specified, and that's a hole we should fill. By not having it in the spec, my assumption would be that last-wins would hold, but it would be good to clarify it if that's the behavior we want. It's especially important given that there's nothing like fflush(), which would help users know what last meant. Speaking of which, should we add a flushing mechanism? This is different from how file systems normally work since as long as a file is open for writing that tends to prevent other processes from opening the same file. You're perhaps thinking of windows, where by default files are opened in exclusive mode? On other operating systems, and on windows when you specify FILE_SHARE_WRITE in dwShareMode in CreateFile, multiple writers can exist simultaneously. Ah. I didn't realize this was different on other OSs. It still seems risky to not provide any means to get exclusive access. The only way I can see websites dealing with this is to create their own locking mechanism backed by using IndexedDB transactions as a low-level atomic primitive (local-storage doesn't work since you can't implement compare-and-swap in an atomic manner). Having an 'exclusive' flag for createFileWriter seems much easier and removes the IndexedDB dependency. I'd probably even say that it should default to true since on the web defaulting to safe rather than fast generally results in fewer bugs. I don't think I'd generally be averse to this.
However, it would then require some sort of a revocation mechanism as well. If you're done with your FileWriter, you want to be able to get rid of it without depending on GC, so that another context can create one. And if you forget to revoke it, behavior in the second context presumably depends on GC, which is a bit ugly. I'm not quite sure how urgent this is yet, though. I've been assuming that if you have transactional/synchronization semantics you want to maintain, you'll be using IDB anyway, or a server handshake, etc. But of course it's easy to write a naive app that the user loads in two windows, with bad effect. A second question is why is FileEntry.createWriter asynchronous? It doesn't actually do any IO and so it seems like it could return an answer synchronously. FileWriter has a synchronous length property, just as Blob does, so it needs to do IO at creation time to look it up. So how does this work if two tabs running in different processes create FileWriters for the same FileEntry? Each tab could end up changing the file's size, in which case the other tab's FileWriter will either have to synchronously update its .length, or it will have an outdated length. So the IO you do when creating the FileWriter is basically unreliable as soon as it's done. So it seems like you could get the size when creating the FileEntry and then use that cached size when creating the FileWriter instance. The size in the FileEntry is no more reliable than that in the FileWriter, of course. But if you know you're the only writer, either's good. Though I wonder if it wouldn't be better to remove the .length property. If anything we could add an asynchronous length getter or a write method which appends to the end of the file (since writing is already asynchronous). A new async length getter's not needed; you can use file() for that already.
I didn't originally add append due to its apparent redundancy with seek+write, but as you point out, seek+write doesn't guarantee to append if there are multiple writers. Though if we add the 'exclusive' flag described above, then we'll need to keep createFileWriter async anyway. Right--I think we should pick whatever subset of these suggestions seems the most useful, since they overlap a bit. One working subset would be: * Keep createFileWriter async. * Make it optionally exclusive [possibly by default]. If exclusive, its length member is trustworthy. If not, it can go stale. * Add an append method [needed only for non-exclusive writes, but useful for logs, and a safe default]. Would this also explain why FileEntry.getFile is asynchronous? I.e. it won't call its callback until all current FileWriters have been closed? Nope. It's asynchronous because a File is a Blob, and has a synchronous length accessor, so we look up the length when we mint the File. Note that the length can go stale if you have multiple writers, as we want to keep it fast. This reminds me
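The 'exclusive' flag with explicit revocation discussed in this thread never shipped in this form, but the intended semantics can be sketched with a hypothetical in-memory lock table (LockTable, acquire, release, and the path are all invented names for illustration, not any real File API):

```javascript
// Hypothetical lock table modeling "exclusive createFileWriter":
// one writer per path, with an explicit release so nothing waits on GC.
class LockTable {
  constructor() { this.held = new Set(); }
  // Returns a release function on success, or null if another
  // writer already holds the exclusive lock for this path.
  acquire(path) {
    if (this.held.has(path)) return null;
    this.held.add(path);
    return () => this.held.delete(path);
  }
}

const locks = new LockTable();
const release = locks.acquire("/log.txt");    // first writer wins
const second = locks.acquire("/log.txt");     // second writer is refused
console.log(second === null); // true
release();                                    // explicit close, no GC dependence
console.log(locks.acquire("/log.txt") !== null); // true
```

Without the explicit release, the second context could only retry until the first writer was collected, which is exactly the GC-dependent behavior the thread calls ugly.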
Re: File modification
On Tue, Jan 10, 2012 at 1:29 PM, Charles Pritchard ch...@visc.us wrote: Modern operating systems have efficient mechanisms to send a signal when a watched file or directory is modified. File and FileEntry have a last modified date-- currently we must poll entries to see if the modification date changes. That works completely fine in practice, but it doesn't give us a chance to exploit the efficiency of some operating systems in notifying applications about file updates. So as a strawman: a File.onupdated event handler may be useful. It seems like it would be most useful if the File or FileEntry points to a file outside the sandbox defined by the FileSystem spec. Does any browser currently supply such a thing? Chrome currently implements this [with FileEntry] only for ChromeOS components that are implemented as extensions. Does any browser let you have a File outside the sandbox *and* update its modification time? If you're dealing only with FileEntries inside the sandbox, there are already more efficient ways to tell yourself that you've changed something.
Re: Colliding FileWriters
On Tue, Jan 10, 2012 at 1:08 PM, Jonas Sicking jo...@sicking.cc wrote: Hi All, We've been looking at implementing FileWriter and had a couple of questions. First of all, what happens if multiple pages create a FileWriter for the same FileEntry at the same time? Will both be able to write to the file at the same time, with whoever writes last to a given byte winning? This isn't currently specified, and that's a hole we should fill. By not having it in the spec, my assumption would be that last-wins would hold, but it would be good to clarify it if that's the behavior we want. It's especially important given that there's nothing like fflush(), which would help users know what "last" meant. Speaking of which, should we add a flushing mechanism? This is different from how file systems normally work, since as long as a file is open for writing that tends to prevent other processes from opening the same file. You're perhaps thinking of Windows, where by default files are opened in exclusive mode? On other operating systems, and on Windows when you specify FILE_SHARE_WRITE in dwShareMode in CreateFile, multiple writers can exist simultaneously. A second question is why is FileEntry.createWriter asynchronous? It doesn't actually do any IO and so it seems like it could return an answer synchronously. FileWriter has a synchronous length property, just as Blob does, so it needs to do IO at creation time to look it up. Is the intended way for this to work that FileEntry.createWriter acts as a choke point and ensures that only one active FileWriter for a given FileEntry exists at the same time? I.e. if one page creates a FileWriter for a FileEntry and starts writing to it, any other caller to FileEntry.createWriter will wait to fire its callback until the first caller is done with its FileWriter. If that is the intended design, I would have expected FileWriter to have an explicit .close() function though. Having to wait for GC to free a lock is always a bad idea.
Agreed re: GC, but currently in Chromium there is no choke point, and one can create multiple writers, which can stomp on each other's writes if that's what the user requests. That being said, we don't really hold files open right now, except during a write call. In between writes, we close the file, so while collisions are possible, more likely one write will win entirely. But we are opening the files in shared mode. Would this also explain why FileEntry.getFile is asynchronous? I.e. it won't call its callback until all current FileWriters have been closed? Nope. It's asynchronous because a File is a Blob, and has a synchronous length accessor, so we look up the length when we mint the File. Note that the length can go stale if you have multiple writers, as we want to keep it fast. These questions both apply to what's the intended behavior spec-wise, as well as what does Google Chrome do in the current implementation. I'm personally OK with the current Chrome implementation, which does no locking. If users want transactional behavior, there are better ways to get that via databases. But I'm open to discussion.
Re: Bug in file system Api specification
Bronislav: Thanks for the tip; it's already fixed in the latest editor's draft, so the fix will get published the next time the document is. See the latest at http://dev.w3.org/2009/dap/file-system/file-dir-sys.html. Eric On Wed, Dec 21, 2011 at 12:21 AM, Bronislav Klučka bronislav.klu...@bauglir.com wrote: Hi, http://www.w3.org/TR/file-system-api/#widl-FileEntry-file says that successCallback is "A callback that is called with the new FileWriter." It should instead read "A callback that is called with the File". BTW, I was trying to file that bug myself, but I could not find a suitable component in the WebAppsWG product. Brona
Re: Is BlobBuilder needed?
On Tue, Nov 15, 2011 at 5:41 AM, Rich Tibbett ri...@opera.com wrote: Jonas Sicking wrote: Hi everyone, It was pointed out to me on twitter that BlobBuilder can be replaced with simply making Blob constructable. I.e. the following code: var bb = new BlobBuilder(); bb.append(blob1); bb.append(blob2); bb.append("some string"); bb.append(myArrayBuffer); var b = bb.getBlob(); would become b = new Blob([blob1, blob2, "some string", myArrayBuffer]); or look at it another way: var x = new BlobBuilder(); becomes var x = []; x.append(y); becomes x.push(y); var b = x.getBlob(); becomes var b = new Blob(x); So at worst there is a one-to-one mapping in code required to simply have |new Blob|. At best it requires many fewer lines if the page has several parts available at once. And we'd save a whole class since Blobs already exist. Following the previous discussion (which seemed to raise no major objections) can we expect to see this in the File API spec sometime soon (assuming that spec is the right home for this)? This will require a coordinated edit to coincide with the removal of BlobBuilder from the File Writer API, right? It need not be all that coordinated. I can take it out [well...mark it deprecated, pending implementation changes] any time after the Blob constructor goes into the File API. Thanks, Rich / Jonas
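The array-based mapping proposed above is directly runnable today, since the Blob constructor did ultimately ship (modern browsers, Node 18+); the part values here are just placeholders:

```javascript
// BlobBuilder                      // array + Blob constructor
const parts = [];                   // var x = new BlobBuilder();
parts.push(new Blob(["abc"]));      // x.append(blob1);   (3 bytes)
parts.push("some string");          // x.append("some string"); (11 bytes)
parts.push(new ArrayBuffer(4));     // x.append(myArrayBuffer); (4 bytes)
const b = new Blob(parts);          // var b = x.getBlob();
console.log(b.size); // 18
```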
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Mon, Oct 3, 2011 at 6:13 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Oct 3, 2011 at 5:57 PM, Glenn Maynard gl...@zewt.org wrote: On Mon, Oct 3, 2011 at 8:10 PM, Jonas Sicking jo...@sicking.cc wrote: 1. Make loadend not fire in case a new load is started from onabort/onload/onerror. Thus loadend and loadstart aren't always paired up. Though there is always a loadend fired after every loadstart. 2. Make FileReader/FileWriter/FileSaver not behave like XHR. This also leaves the problem unsolved for XHR. Are there other options I'm missing? Or do both, improving XHR as much as backwards-compatibility allows and don't try to match other APIs to it exactly. I'd much prefer weirdness be isolated to XHR than be perpetuated through every PE-based API. So what exactly are you proposing we do for XHR and for FileReader/FileWriter? I'm still not convinced that it's better to require authors to use setTimeout to start a new load, as opposed to letting them restart the new load from within an event and cancel all following events. I agree that this introduces some inconsistency, but it only does so when authors explicitly reuse a FileReader/XHR/FileWriter for multiple requests. And it only weakens the invariant, it doesn't remove it. So instead of * There's exactly one 'loadend' event for each 'loadstart' event. we'll have * There's always a 'loadend' event fired after each 'loadstart' event. However there might be other 'loadstart' events fired in between. I'm for this. It lets FileReader and FileWriter match XHR, avoids [in the odd case] long strings of stacked-up loadend events, and users can avoid all the issues either by creating a new FileReader or by wrapping nested calls in timers if they care. I believe Jonas is in favor of this as well. Can we put this one to bed? Eric
Re: [File API] Calling requestFileSystem with bad filesystem type
On Fri, Oct 7, 2011 at 12:02 PM, Mark Pilgrim pilg...@google.com wrote: What should this do? requestFileSystem(2, 100, successCallback); // assume successCallback is defined properly requestFileSystem doesn't throw, so you should get an errorCallback call. You haven't provided an errorCallback, so you should get a silent failure. It does seem like an error we could identify quickly enough to throw, though, and in general I favor fail-fast for obviously bad parameters. Opinions? Eric
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Wed, Nov 2, 2011 at 3:56 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Nov 2, 2011 at 9:56 AM, Eric U er...@google.com wrote: On Mon, Oct 3, 2011 at 6:13 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Oct 3, 2011 at 5:57 PM, Glenn Maynard gl...@zewt.org wrote: On Mon, Oct 3, 2011 at 8:10 PM, Jonas Sicking jo...@sicking.cc wrote: 1. Make loadend not fire in case a new load is started from onabort/onload/onerror. Thus loadend and loadstart isn't always paired up. Though there is always a loadend fired after every loadstart. 2. Make FileReader/FileWriter/FileSaver not behave like XHR. This also leaves the problem unsolved for XHR. Are there other options I'm missing? Or do both, improving XHR as much as backwards-compatibility allows and don't try to match other APIs to it exactly. I'd much prefer weirdness be isolated to XHR than be perpetuated through every PE-based API. So what exactly are you proposing we do for XHR and for FileReader/FileWriter? I'm still not convinced that it's better for authors to require them to use setTimeout to start a new load as opposed to let them restart the new load from within an event and cancel all following events. I agree that this introduces some inconsistency, but it only does so when authors explicitly reuses a FileReader/XHR/FileWriter for multiple requests. And it only weakens the invariant, not removes it. So instead of * There's exactly one 'loadend' event for each 'loadstart' event. we'll have * There's always a 'loadend' event fired after each 'loadstart' event. However there might be other 'loadstart' events fired in between. I'm for this. It lets FileReader and FileWriter match XHR, avoids [in the odd case] long strings of stacked-up loadend events, and users can avoid all the issues either by creating a new FileReader or by wrapping nested calls in timers if they care. I believe Jonas is in favor of this as well. Can we put this one to bed? 
So the proposal here is to allow new loads to be started from within abort/error/load event handlers, and for loadend to *not* fire if a new load has already started by the time the abort/error/load event is done firing. And the goal is that XMLHttpRequest, FileReader and FileWriter all behave this way. Is this correct? I think I may have missed something important. XHR2 specs just this behavior w.r.t. abort [another open will stop the abort's loadend] but /doesn't/ spec that for error or load. That is, an open() in onerror or onload does not appear to cancel the pending loadend. Anne, can you comment on why? If so, I agree that this sounds like a good solution. / Jonas
Re: Is BlobBuilder needed?
On Wed, Oct 26, 2011 at 4:14 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Oct 25, 2011 at 12:57 PM, Tab Atkins Jr. jackalm...@gmail.com wrote: On Tue, Oct 25, 2011 at 12:53 PM, Ojan Vafai o...@chromium.org wrote: The new API is smaller and simpler. Less to implement and less for web developers to understand. If it can meet all our use-cases without significant performance problems, then it's a win and we should do it. For line-endings, you could have the Blob constructor also take an optional endings argument: new Blob(String|Array|Blob|ArrayBuffer data, [optional] String contentType, [optional] String endings); I believe (or at least, I maintain) that we're trying to do dictionaries for this sort of thing. Multiple optional arguments are *horrible* unless they are truly, actually, order-dependent such that you wouldn't ever specify a later one without already specifying a former one. I don't have a super strong opinion. I will however note that I think it'll be very common to specify a content-type, but much, much rarer to specify any of the other options. But maybe using the syntax b = new Blob(["foo", "bar"], { contentType: "text/plain" }); isn't too bad. The other properties that I could think of that we'd want to add sometime in the future would be encoding for strings, including endianness for UTF-16 strings. That looks good to me. Endings can go in there, if we keep it.
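For what it's worth, the dictionary form sketched in this thread is essentially what shipped, except the final File API named the member type rather than contentType (and later added endings). A runnable example (modern browsers, Node 18+):

```javascript
// Dictionary-style options instead of positional optional arguments:
// order-independent, self-documenting, and extensible later.
const b = new Blob(["foo", "bar"], { type: "text/plain" });
console.log(b.type); // "text/plain"
console.log(b.size); // 6
```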
Re: Is BlobBuilder needed?
On Mon, Oct 24, 2011 at 3:52 PM, Jonas Sicking jo...@sicking.cc wrote: Hi everyone, It was pointed out to me on twitter that BlobBuilder can be replaced with simply making Blob constructable. I.e. the following code: var bb = new BlobBuilder(); bb.append(blob1); bb.append(blob2); bb.append("some string"); bb.append(myArrayBuffer); var b = bb.getBlob(); would become b = new Blob([blob1, blob2, "some string", myArrayBuffer]); or look at it another way: var x = new BlobBuilder(); becomes var x = []; x.append(y); becomes x.push(y); var b = x.getBlob(); becomes var b = new Blob(x); So at worst there is a one-to-one mapping in code required to simply have |new Blob|. At best it requires many fewer lines if the page has several parts available at once. And we'd save a whole class since Blobs already exist. It does look cleaner this way, and getting rid of a whole class would be very nice. The only things that this lacks that BlobBuilder has are the endings parameter for '\n' conversion in text and the content type. The varargs constructor makes it awkward to pass in flags of any sort...any thoughts on how to do that cleanly? Eric
Re: FileSystem API - The Flags interface
The exception is thrown by getFile on DirectoryEntrySync, not by the Flags constructor; both the example and the flags interface are correct. On Sat, Oct 8, 2011 at 11:54 AM, Bronislav Klučka bronislav.klu...@bauglir.com wrote: Hello, http://www.w3.org/TR/file-system-api/#the-flags-interface If you look at the description of exclusive flag (4.2.1), the description states no exception, but the example (4.2.2) uses exception to determine whether file already existed. So the question is, what is wrong: the description or example? Brona Klucka
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Thu, Sep 29, 2011 at 12:22 PM, Arun Ranganathan a...@mozilla.com wrote: On 9/21/11 8:07 PM, Eric U wrote: Update: I have made the changes to FileWriter/FileSaver's event sequences; they now match FileReader. That's not to say it won't change pending discussion, but FileWriter should continue to match FileReader whatever else happens. Eric Eric: After reading this email thread, and looking at your changes, I think I'll make the following changes: 1. Tighten the requirement on onprogress such that we mandate firing *at least one* progress event with a MUST. Right now this is unclear, as you point out, not least of all because we don't mandate the user agent calling onprogress. 2. Include a discussion of the invariants Jonas mentions [1], so that event order is fleshed out in the event section. 3. Clarify exceptions to the 50ms event dispatch timeframe (notably for progress events before load+loadend). To be clear, you've decided we're NOT going to veer from XHR2's abort/open behavior (and thus what FileReader says now) in FileWriter/FileSaver, right? Is this a good summary of changes that we should make? -- A* [1] http://lists.w3.org/Archives/Public/public-webapps/2011JulSep/1512.html I think that works; #2 will be especially important. However, if I read this right, we *don't* have the invariant that a loadstart will always have a loadend. Now that Anne's explained XHR2's model, it seems that an open can cancel the loadend that an abort would have sent. So the invariants need to be a bit more complex. I've updated FileWriter to take most of this into account, but *not* that last bit yet; as written, I've got Jonas's original invariants, which would lead to the stacked-up loadend events at the end.
Re: Publishing specs before TPAC; Oct 14 is last day to start a CfC
On Mon, Sep 26, 2011 at 2:38 PM, Arthur Barstow art.bars...@nokia.com wrote: The upcoming TPAC meeting (Oct 31 - Nov 01) provides an opportunity for joint WG meetings and lots of informal sharing. As such, some groups make spec publications right before TPAC. Note there is a 2-week publication blackout period around the TPAC week and Oct 24 is the last day to request publication before TPAC. Given our 1-week CfC for new publications, weekends, etc., the schedule is: * Oct 14 - last day to start a CfC to publish * Oct 24 - last day to request publication * Oct 27 - last publications before TPAC * Nov 07 - publications resume *A lot of groups wait until the deadline so if you want to publish before TPAC, I encourage you to propose publication as soon as possible and by October 14 at the latest. * Some specs I'd like to see published before TPAC (http://www.w3.org/2008/webapps/wiki/PubStatus): * Clipboard APIs and Events - I think Hallvord has made quite a few changes since last publication on 12-Apr-2011. WDYT Hallvord? * D3E - not sure if next pub is CR or LC. Doug, Jacob? * File API (last published in 26-Oct-2010) - Arun, Jonas - what's up with this spec? * File API: Writer and Directories System - WDYT Eric? Are the changes since the April 2011 publication significant? There have been a few small changes since then, but I'm going to be pretty tied up through TPAC; let's do a draft in November some time. * Indexed Database API - is this ready for LC? * Server-sent Events - 8 open bugs so I presume a new WD at this point. * Web Messaging - 6 open bugs so I presume a new WD at this point. -AB
Re: [FileAPI, common] UTF-16 to UTF-8 conversion
Thanks Glenn and Simon--I'll see what I can do. On Fri, Sep 23, 2011 at 1:34 AM, Simon Pieters sim...@opera.com wrote: On Fri, 23 Sep 2011 01:40:44 +0200, Glenn Maynard gl...@zewt.org wrote: BlobBuilder.append(text) says: Appends the supplied text to the current contents of the BlobBuilder, writing it as UTF-8, converting newlines as specified in endings. It doesn't elaborate any further. The conversion from UTF-16 to UTF-8 needs to be defined, in particular for the edge case of invalid UTF-16 surrogates. If this is already defined somewhere, it isn't referenced. I suppose this would belong in Common infrastructure, next to the existing section on UTF-8, not in FileAPI itself. WebSocket send() throws SYNTAX_ERR if its argument contains unpaired surrogates. It would be nice to be consistent. -- Simon Pieters Opera Software
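For reference, the behavior the platform eventually settled on (via the WHATWG Encoding standard, rather than in the File API itself) is replacement, not throwing: an unpaired surrogate is not a valid Unicode scalar value, so TextEncoder emits U+FFFD (bytes EF BF BD) in its place. Runnable in browsers or Node:

```javascript
// 'a', an unpaired high surrogate, 'b' -- the surrogate cannot be
// encoded as UTF-8, so it becomes the replacement character U+FFFD.
const enc = new TextEncoder();
const bytes = enc.encode("a\uD800b");
console.log(Array.from(bytes)); // [97, 239, 191, 189, 98]
```

WebSocket's throw-on-unpaired-surrogates behavior mentioned by Simon was itself later changed to match this replacement model.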
Re: [FileAPI] BlobBuilder.append(native)
On Thu, Sep 22, 2011 at 4:47 PM, Glenn Maynard gl...@zewt.org wrote: native: "Newlines must be transformed to the default line-ending representation of the underlying host filesystem. For example, if the underlying filesystem is FAT32, newlines would be transformed into \r\n pairs as the text was appended to the state of the BlobBuilder." This is a bit odd: most programs write newlines according to the convention of the host system, not based on peeking at the underlying filesystem. You won't even know the filesystem if you're writing to a network drive. I'd suggest "must be transformed according to the conventions of the local system", and let implementations decide what that is. It should probably be explicit that the only valid options are \r\n and \n, or reading files back in which were transformed in this way will be difficult. Good catch--I'll fix that. Also, in the Issue above that, it seems to mean "native" where it says "transparent". Yup. That too. Thanks!
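The suggested "conventions of the local system" transformation amounts to normalizing every newline style and re-emitting one of the two valid options. A sketch, where the eol argument stands in for whatever the host convention is ("\n" or "\r\n"):

```javascript
// Normalize \r\n, lone \r, and lone \n, then emit the host convention.
// Normalizing first avoids double-converting an existing \r\n pair.
function toNativeEndings(text, eol) {
  return text.replace(/\r\n|\r|\n/g, eol);
}

console.log(JSON.stringify(toNativeEndings("a\nb\r\nc\rd", "\r\n")));
// "a\r\nb\r\nc\r\nd"
```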
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Wed, Sep 21, 2011 at 2:28 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Sep 21, 2011 at 11:12 AM, Glenn Maynard gl...@zewt.org wrote: On Tue, Sep 20, 2011 at 8:40 PM, Eric U er...@google.com wrote: Indeed--however, from a quick skim of XHR and XHR2, that's not what they do. They let open() terminate abort(), however far along it's gotten. If we did that, then an abort killed by a read might lead to the aborted read never getting an onloadend. But you could still get the stack-busting chain of onloadstart/onabort. Yuck. I agree that's not a good thing to mimic for the sake of consistency. Anne, is this intentional, or just something XHR is stuck with for compatibility? It looks like a new problem in XHR2--this couldn't happen in XHR1, because there was no abort event fired before loadend. If we wanted to prevent read methods from being called during abort, we'd probably want to do that by setting an aborting flag or mucking around with yet another readyState of ABORTING. That's annoying, but it's better than the current situation, and I think better than the XHR situation. Receiving loadstart should guarantee the receipt of loadend. On Tue, Sep 20, 2011 at 7:43 PM, Jonas Sicking jo...@sicking.cc wrote: 1. onloadstart fires exactly once 2. There will be one onprogress event fired when 100% progress is reached 3. Exactly one of onabort, onload and onerror fires 4. onloadend fires exactly once. 5. no onprogress events fire before onloadstart 6. no onprogress events fire after onabort/onload/onerror 7. no onabort/onload/onerror events fire after onloadend 8. after loadstart is fired, loadstart is not fired again until loadend has been fired (ie. only one set of progress events can be active on an object at one time). More precisely: loadstart should not be fired again until the dispatch of loadend *has completed*. 
That is, you can't start a new progress sequence from within loadend, either, because there may be other listeners on the object that haven't yet received the loadend. I don't think we can do that for XHR without breaking backwards compat. I just spent a bit more time with the XHR2 spec, and it looks like the same looping behavior's legal there too, bouncing between onreadystatechange and onabort, and stacking up a pending call to onloadend for each loop. When open terminates abort, abort completes the step of the algorithm [here step 5], which includes a subsequent call to onloadend. It's not a queued task to be cancelled, as it's all synchronous calls back and forth. If we want the file specs to match the XHR spec, then we can just leave this as it is in File Reader, and I'll match it in File Writer. Recursion depth limit is up to the UA to set. But I look forward to hearing what Anne has to say about it before we settle on copying it.
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Wed, Sep 21, 2011 at 3:09 PM, Glenn Maynard gl...@zewt.org wrote: On Wed, Sep 21, 2011 at 5:44 PM, Eric U er...@google.com wrote: If we want the file specs to match the XHR spec, then we can just leave this as it is in File Reader, and I'll match it in File Writer. Recursion depth limit is up to the UA to set. But I look forward to hearing what Anne has to say about it before we settle on copying it. In my opinion, providing the no-nesting guarantee is more useful than being consistent with XHR, if all new APIs provide it. If we eliminate it entirely, then you can't ever start a new read on the same object from the abort handler. That seems like a reasonable use case. This sort of thing seems obviously useful: function showActivity(obj) { obj.addEventListener("loadstart", function() { div.hidden = false; }, false); obj.addEventListener("loadend", function() { div.hidden = true; }, false); } With the currently specced behavior, this doesn't work--the div would end up hidden when it should be shown. You shouldn't have to care how other code is triggering reads to do something this simple. Adding a number-of-reads-outstanding counter isn't that much more code. And if you're really trying to keep things simple, you're not aborting and then starting another read during the abort, so the above code works in your app.
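Eric's counter suggestion, sketched out: count outstanding reads instead of assuming loadstart and loadend strictly alternate. This is a hedged illustration, with a plain EventTarget standing in for a FileReader and a boolean standing in for the div:

```javascript
const obj = new EventTarget();
let outstanding = 0;
let hidden = true;

obj.addEventListener("loadstart", () => { outstanding++; hidden = false; });
obj.addEventListener("loadend", () => {
  if (--outstanding === 0) hidden = true; // hide only when all reads settle
});

// Two overlapping "reads": one loadend arrives while another read is live.
obj.dispatchEvent(new Event("loadstart"));
obj.dispatchEvent(new Event("loadstart"));
obj.dispatchEvent(new Event("loadend"));
console.log(hidden); // false -- one read still outstanding
obj.dispatchEvent(new Event("loadend"));
console.log(hidden); // true
```

With the naive show/hide handlers, the first loadend would have hidden the indicator while a read was still in flight; the counter version survives interleaved loadstart events regardless of which spec option wins.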
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Wed, Sep 21, 2011 at 3:29 PM, Glenn Maynard gl...@zewt.org wrote: On Wed, Sep 21, 2011 at 6:14 PM, Eric U er...@google.com wrote: If we eliminate it entirely, then you can't ever start a new read on the same object from the abort handler. That seems like a reasonable use case. It's trivial to stuff it into a zero-second timeout to knock it out of the event handler. This is such a common and useful pattern that libraries have shorthand for it, eg. Prototype's Function#defer. I don't think that's an onerous requirement at all; it's basically the same as specs saying queue an event. While it's certainly not hard to work around, as you say, it seems more complex and less likely to be obvious than the counter-for-activity example, which feels like the classic push-pop paradigm. And expecting users to write their event handlers one way for XHR and a different way for FileReader/FileWriter seems like asking for trouble--you're going to get issues that only come up in exceptional cases, and involve a fairly subtle reading of several specs to get right. I think we're better off going with consistency. Adding a number-of-reads-outstanding counter isn't that much more code. It's not much more code, but it's code dealing with a case that doesn't have to exist, working around a very ugly and unobvious sequence of events, and it's something that you really shouldn't have to worry about every single time you use loadstart/loadend pairs. And if you're really trying to keep things simple, you're not aborting and then starting another read during the abort, so the above code works in your app. The above code and the code triggering the reads might not even be written by the same people--the activity display might be a third-party component (who very well might not have thought of this; I wouldn't have, before this discussion). -- Glenn Maynard
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Mon, May 23, 2011 at 6:19 PM, Arun Ranganathan a...@mozilla.com wrote: On 5/23/11 6:14 PM, Arun Ranganathan wrote: On 5/23/11 1:20 PM, Kyle Huey wrote: To close the loop a bit here, Firefox 6 will make the change to FileReader.abort()'s throwing behavior agreed upon here. (https://bugzilla.mozilla.org/show_bug.cgi?id=657964) We have not changed the timing of the events, which are still dispatched synchronously. The editor's draft presently does nothing when readyState is EMPTY, but if readyState is DONE it is specified to set result to null and fire events (but flush any pending tasks that are queued). http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort Also note that we're NOT firing *both* error and abort; we should only fire abort, and *not* error. I should change the spec. to throw. Eric, you might change the spec. (and Chrome) to NOT fire error and abort events :) Sorry, to be a bit clearer: I'm talking about Eric changing http://dev.w3.org/2009/dap/file-system/file-writer.html#widl-FileSaver-abort-void to match http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort -- A* Sorry about the long delay here--a big release and a new baby absorbed a lot of my time. I'm going through the abort sequence right now, and it turns out that there are a number of places in various algorithms in FileWriter that should match FileReader more closely than they do. However, there are a couple of edge cases I'm unsure about. 1) Do you expect there to be an event called "progress" that indicates a complete read, before the load event? "user agents MUST return at least one such result while processing this read method, with the last returned value at completion of the read" -- Does that mean during onprogress, or would during onloadend be sufficient? What if the whole blob is read in a single backend operation--could there be no calls to onprogress at all? [Side note--the phrasing there is odd. 
You say that user agents MUST return, but the app's not required to call for the value, and it can't return it if not asked. Did you want to require the user agent to make at least one onprogress call?] 2) The load and loadend events are queued "When the data from the blob has been completely read into memory." If the user agent fires an onprogress indicating all the data's been loaded, and the app calls abort in that event handler, should those queued events be fired or not? "If there are any tasks from the object's FileReader task source in one of the task queues, then remove those tasks." makes it look like no, but I wanted to make sure. If #1 above is "no" or "not necessarily", then this might not ever come up anyway. Thanks, Eric
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Tue, Sep 20, 2011 at 3:36 PM, Eric U er...@google.com wrote: On Mon, May 23, 2011 at 6:19 PM, Arun Ranganathan a...@mozilla.com wrote: On 5/23/11 6:14 PM, Arun Ranganathan wrote: On 5/23/11 1:20 PM, Kyle Huey wrote: To close the loop a bit here, Firefox 6 will make the change to FileReader.abort()'s throwing behavior agreed upon here. (https://bugzilla.mozilla.org/show_bug.cgi?id=657964) We have not changed the timing of the events, which are still dispatched synchronously. The editor's draft presently does nothing when readyState is EMPTY, but if readyState is DONE it is specified to set result to null and fire events (but flush any pending tasks that are queued). http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort Also note that we're NOT firing *both* error and abort; we should only fire abort, and *not* error. I should change the spec. to throw. Eric, you might change the spec. (and Chrome) to NOT fire error and abort events :) Sorry, to be a bit clearer: I'm talking about Eric changing http://dev.w3.org/2009/dap/file-system/file-writer.html#widl-FileSaver-abort-void to match http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort -- A* Sorry about the long delay here--a big release and a new baby absorbed a lot of my time. I'm going through the abort sequence right now, and it turns out that there are a number of places in various algorithms in FileWriter that should match FileReader more closely than they do. However, there a couple of edge cases I'm unsure about. 1) Do you expect there to be an event called progress that indicates a complete read, before the load event? On further reflection, another requirement prevents this in some cases. If you've made a non-terminal progress event less than 50ms before completion, you're not permitted to make another at completion, so I think you'd go straight to load and loadend. 
However, if the entire load took place in a single underlying operation that took less than 50ms, do you have your choice of whether or not to fire onprogress once before onload? user agents MUST return at least one such result while processing this read method, with the last returned value at completion of the read -- Does that mean during onprogress, or would during onloadend be sufficient? What if the whole blob is read in a single backend operation--could there be no calls to onprogress at all? [Side note--the phrasing there is odd. You say that useragents MUST return, but the app's not required to call for the value, and it can't return it if not asked. Did you want to require the useragent to make at least one onprogress call?] 2) The load and loadend events are queued When the data from the blob has been completely read into memory. If the user agent fires an onprogress indicating all the data's been loaded, and the app calls abort in that event handler, should those queued events be fired or not? If there are any tasks from the object's FileReader task source in one of the task queues, then remove those tasks. makes it look like no, but I wanted to make sure. If #1 above is no or not necessarily, then this might not ever come up anyway. Thanks, Eric
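The 50ms pacing question Eric raises can be made concrete. Below is a minimal sketch (the helper name is mine, not from any spec) of a gate a user agent might use: suppress progress events fired within 50ms of the previous one, but always let the terminal 100% event through.

```javascript
// Hypothetical sketch: pace progress events at >= 50ms apart, but always
// allow the final (100%) event. Names are illustrative, not from the spec.
function makeProgressGate(minGapMs) {
  let lastFiredAt = -Infinity;
  return function shouldFire(nowMs, isFinal) {
    if (isFinal || nowMs - lastFiredAt >= minGapMs) {
      lastFiredAt = nowMs;
      return true;
    }
    return false;
  };
}

const gate = makeProgressGate(50);
console.log(gate(0, false));  // true: first progress event
console.log(gate(30, false)); // false: only 30ms since the last one
console.log(gate(40, true));  // true: terminal event fires regardless
```

Without the `isFinal` exception, a read completing within 50ms of its last progress event would finish with the progress bar short of 100%, which is exactly the spec-bug discussed below.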
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Tue, Sep 20, 2011 at 4:43 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Sep 20, 2011 at 4:28 PM, Eric U er...@google.com wrote: On Tue, Sep 20, 2011 at 3:36 PM, Eric U er...@google.com wrote: On Mon, May 23, 2011 at 6:19 PM, Arun Ranganathan a...@mozilla.com wrote: On 5/23/11 6:14 PM, Arun Ranganathan wrote: On 5/23/11 1:20 PM, Kyle Huey wrote: To close the loop a bit here, Firefox 6 will make the change to FileReader.abort()'s throwing behavior agreed upon here. (https://bugzilla.mozilla.org/show_bug.cgi?id=657964) We have not changed the timing of the events, which are still dispatched synchronously. The editor's draft presently does nothing when readyState is EMPTY, but if readyState is DONE it is specified to set result to null and fire events (but flush any pending tasks that are queued). http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort Also note that we're NOT firing *both* error and abort; we should only fire abort, and *not* error. I should change the spec. to throw. Eric, you might change the spec. (and Chrome) to NOT fire error and abort events :) Sorry, to be a bit clearer: I'm talking about Eric changing http://dev.w3.org/2009/dap/file-system/file-writer.html#widl-FileSaver-abort-void to match http://dev.w3.org/2006/webapi/FileAPI/#dfn-abort -- A* Sorry about the long delay here--a big release and a new baby absorbed a lot of my time. I'm going through the abort sequence right now, and it turns out that there are a number of places in various algorithms in FileWriter that should match FileReader more closely than they do. However, there a couple of edge cases I'm unsure about. 1) Do you expect there to be an event called progress that indicates a complete read, before the load event? On further reflection, another requirement prevents this in some cases. If you've made a non-terminal progress event less than 50ms before completion, you're not permitted to make another at completion, so I think you'd go straight to load and loadend. 
However, if the entire load took place in a single underlying operation that took less than 50ms, do you have your choice of whether or not to fire onprogress once before onload? This is a spec-bug. We need to make an exception from the 50ms rule for the last onprogress event. From the webpage point of view, the following invariants should hold for each load: 1. onloadstart fires exactly once 2. There will be one onprogress event fired when 100% progress is reached 3. Exactly one of onabort, onload and onerror fires 4. onloadend fires exactly once 5. no onprogress events fire before onloadstart 6. no onprogress events fire after onabort/onload/onerror 7. no onabort/onload/onerror events fire after onloadend The reason for 2 is so that the page always renders a complete progress bar if it only does progressbar updates from the onprogress event. Hope that makes sense? It makes sense, and in general I like it. But the sequence can get more complicated [specifically, nested] if you have multiple read calls, which is the kind of annoyance that brought me to send the email. I have a read running, and at some point I abort it--it could be in onprogress or elsewhere. In onabort I start another read. In onloadstart I abort again. Repeat as many times as you like, then let a read complete. I believe we've specced that the event sequence should look like this: loadstart [progress]* --[events from here to XXX happen synchronously, with no queueing] abort loadstart abort loadstart abort loadstart loadend loadend loadend --[XXX] [progress]+ load loadend Does that look like what you'd expect? Am I reading it right? Yes, this is a wacky fringe case. But it's certainly reasonable to expect someone to start a new read in onabort, so you have to implement at least enough bookkeeping for that case. And UAs will want to defend against stack overflow, in the event that a bad app sticks in an abort/loadstart loop.
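The nested sequence Eric describes can be checked mechanically. Here is a toy mock (illustrative only, not a real FileReader) in which abort fires its abort event synchronously, so the handler may start a new read and recurse, and the aborted read's loadend fires only after that nesting unwinds:

```javascript
// Toy dispatcher mimicking the specced abort behavior: 'abort' fires
// synchronously (its handler may call read(), recursing), and the
// aborted read's 'loadend' fires only after the handler returns.
class MockReader {
  constructor() { this.handlers = {}; this.log = []; this.reading = false; }
  on(type, fn) { this.handlers[type] = fn; }
  fire(type) { this.log.push(type); if (this.handlers[type]) this.handlers[type](); }
  read() { this.reading = true; this.fire('loadstart'); }
  abort() {
    if (!this.reading) return;
    this.reading = false;
    this.fire('abort');    // handler may start a new read here
    this.fire('loadend');  // loadend for the aborted read, after unwinding
  }
  complete() { this.fire('progress'); this.fire('load'); this.fire('loadend'); }
}

const r = new MockReader();
r.read();                                                  // initial read
let restarts = 0;
r.on('abort', () => { if (restarts++ < 3) r.read(); });    // restart from onabort
r.on('loadstart', () => { if (restarts < 3) r.abort(); }); // abort again
r.abort();
r.complete();
console.log(r.log.join(' '));
```

Running this logs `loadstart abort loadstart abort loadstart abort loadstart loadend loadend loadend progress load loadend`, matching the sequence sketched in the email, with the three queued loadends emerging together once the synchronous nesting finishes.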
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Tue, Sep 20, 2011 at 5:32 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Sep 20, 2011 at 5:26 PM, Glenn Maynard gl...@zewt.org wrote: On Tue, Sep 20, 2011 at 8:01 PM, Eric U er...@google.com wrote: I have a read running, and at some point I abort it--it could be in onprogress or elsewhere. In onabort I start another read. In onloadstart I abort again. Repeat as many times as you like, then let a read complete. I believe we've specced that the event sequence should look like this: loadstart [progress]* --[events from here to XXX happen synchronously, with no queueing] abort loadstart abort loadstart XHR handles this by not allowing a new request to be opened until the abort() method terminates. Could that be done here? It seems like an important thing to be consistent about. http://dev.w3.org/2006/webapi/XMLHttpRequest/#the-abort-method Ooh, that's a good idea. / Jonas Indeed--however, from a quick skim of XHR and XHR2, that's not what they do. They let open() terminate abort(), however far along it's gotten. If we did that, then an abort killed by a read might lead to the aborted read never getting an onloadend. But you could still get the stack-busting chain of onloadstart/onabort. If we wanted to prevent read methods from being called during abort, we'd probably want to do that by setting an aborting flag or mucking around with yet another readyState of ABORTING.
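The "aborting flag / ABORTING readyState" idea at the end of that message might look like the following hypothetical sketch (class and state names are mine): read methods called while an abort is being dispatched throw, so an onabort handler cannot terminate the abort with a new read.

```javascript
// Hypothetical guard sketching the proposed ABORTING state: any read
// method invoked while abort() is dispatching its event throws, so the
// aborted read always reaches its terminal state. Names are illustrative.
class GuardedReader {
  constructor() { this.readyState = 'EMPTY'; this.onabort = null; }
  readAsText() {
    if (this.readyState === 'ABORTING' || this.readyState === 'LOADING') {
      throw new Error('InvalidStateError');
    }
    this.readyState = 'LOADING';
  }
  abort() {
    if (this.readyState !== 'LOADING') return;
    this.readyState = 'ABORTING';
    try { if (this.onabort) this.onabort(); } // read() here would throw
    finally { this.readyState = 'DONE'; }
  }
}

const reader = new GuardedReader();
reader.readAsText();
let threw = false;
reader.onabort = () => { try { reader.readAsText(); } catch (e) { threw = true; } };
reader.abort();
console.log(threw);             // true: read during abort was rejected
console.log(reader.readyState); // "DONE": the abort still completed
```

This trades the stack-busting loadstart/onabort chain for a synchronous error in the handler, at the cost of another observable state.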
Re: [whatwg] File API Streaming Blobs
Sorry about the very slow response; I've been on leave, and am now catching up on my email. On Wed, Jun 22, 2011 at 11:54 AM, Arun Ranganathan a...@mozilla.com wrote: Greetings Adam, Ian, I wish I knew that earlier when I originally posted the idea, there was lots of discussion and good ideas but then it suddenly dropped off the face of the earth. Essentially I am forwarding this suggestion to public-webapps@w3.org on the basis that apparently most discussion of File API specs happens there, and would like to know how to move forward with this suggestion. The original suggestion and following comments are on the whatwg list archive, starting with http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-January/029973.html Summing up, the problem with the current implementation of Blobs is that once a URI has been generated for them, by design changes are no longer reflected in the object URL. In a streaming scenario, this is not what is needed; rather, a long-living Blob that can be appended to is needed and 'streamed' to other parts of the browser, e.g. the video or audio element. The original use case was: make an application which will download media files from a server and cache them locally, as well as playing them without making the user wait for the entire file to be downloaded, converted to a blob, then saved and played. However, such an API covers many other use cases, such as on-the-fly on-device decryption of streamed media content (i.e. live streams either without end, or static large files that would be a waste to download completely when only the first couple of seconds need to be buffered and decrypted before playback can begin). Some suggestions were to modify or create a new type of Blob, the StreamingBlob, which can be changed without its object URL changing and appended to as new data is downloaded or decoded, using a similar process to how large files may start to be decoded/played by a browser before they are fully downloaded. 
Other suggestions proposed using a pull API on the Blob so browsers can request new data asynchronously, such as in http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-January/029998.html Some problems, however, that a browser may face are what to do with URLs which are opened twice, and whether the object URL should start from the beginning (which would be needed for decoding encrypted, on-demand audio) or start from the end (similar to `tail`, for live streaming events that need decryption, etc.). Thanks, P.S. Sorry if I've not done this the right way by forwarding like this, I'm not usually active on mailing lists. I actually think moving to a streaming mode for file reads in general is desirable, but I'm not entirely sure extending Blobs is the way to go for *that* use case, which honestly is the main use case I'm interested in. We may improve upon ideas after this API goes to Last Call for streaming file reads; hopefully we'll do a better job than other non-JavaScript APIs out there :) [1]. Blob objects as they are currently specified live in memory and represent in-memory File objects as well. A change to the underlying file isn't captured in the Blob snapshot; moreover, if the file moves or is no longer present at time of read, an error event is fired while processing a read operation. The object URL may be dereferenced, but will result in a 404. The Streaming API explored by WHATWG uses the Object URL scheme for videoconferencing use cases [2], and so the scheme itself is suitable for resources that are more dynamic than memory-resident Blob objects. Segment-plays/segment dereferencing in general can be handled through media fragments; the scheme can naturally be accompanied by fragment identifiers. I agree that it may be desirable to extend Blobs to do a few other things in general, maybe independent of better file reads. You've Cc'd the right listserv :) I'd be interested in what Eric has to say, since BlobBuilder evolves under his watch. 
Having reviewed the threads, I'm not absolutely sure that we want to add this stuff to Blob. It seems like streaming is quite a bit different from a lot of the problems people want to solve with Blobs, and we may end up with a bit of a mess if we mash them together. BlobBuilder does seem a decent match as a StreamBuilder, though. Since Blobs are specifically non-mutable, it sounds like what you're looking for is more like createObjectURL(blobBuilder) than createObjectURL(blobBuilder.getBlob()). From the threads and from my head, here are some questions: 1) Would reading from a stream always start at the beginning, or would it start at the current point [e.g. in a live video stream]? 2) Would this have to support infinite streams? 3) Would we be expected to keep around data from the very beginning of a stream, even if e.g. it's a live broadcast and you're now watching hour 7? If not, who controls the buffer size and what's the API for
[File API: FileSystem] Removed mimeType from toURL
The optional mimeType parameter to Entry[Sync].toURL is redundant with URL.createObjectURL. It also doesn't work with the URL format proposed in the notes and now implemented in Chromium. I have removed it from the spec. Eric
Re: [File API: FileSystem] Path restrictions and case-sensitivity
On Thu, May 12, 2011 at 1:34 AM, timeless timel...@gmail.com wrote: On Thu, May 12, 2011 at 3:02 AM, Eric U er...@google.com wrote: There are a few things going on here: yes 1) Does the filesystem preserve case? If it's case-sensitive, then yes. If it's case-insensitive, then maybe. 2) Is it case-sensitive? If not, you have to decide how to do case folding, and that's locale-specific. As I understand it, Unicode case-folding isn't locale-specific, except when you choose to use the Turkish rules, which is exactly the problem we're talking about. 3) If you're case folding, are you going to go with a single locale everywhere, or are you going to use the locale of the user? 4) [I think this is what you're talking about w.r.t. not allowing both dotted and dotless i]: Should we attempt to detect filenames that are /too similar/ for some definition of /too similar/, ostensibly to avoid confusing the user. As I read what you wrote, you wanted: 1) yes correct 2) no correct 3) a new locale in which I, ı, İ and i all fold to the same letter, everywhere I'm pretty sure Unicode's locale-insensitive behavior is precisely what i want. I've included the section from Unicode 6 at the end. 4) yes, possibly only for the case of I, ı, İ and i 4 is, in the general case, impossible. yes. It's not well-defined, and is just as likely to cause problems as solve them. There are some defined ways to solve them (accepting that perfect is the enemy of the good), - one is to take the definitions of too similar selected for idn registration... - another is to just accept the recommendation from unicode 6 text can be normalized to Normalization Form NFKC or NFKD after case folding If you *just* want to check for ı vs. i, it's possible, but it's still not clear to me that what you're doing will be the correct behavior in Turkish locales [are there any Turkish words, names, abbreviations, etc. that only differ in that character?] 
Well, the classic example of this is sıkısınca / sikisince [1], but technically those differ in more than just the 'i' (they differ in the a/e at the end). My point is that if two things differ by such a small thing, it's better to force them to have visibly different names, this could be a '(2)' tacked onto the end of a file if the name is auto generated, or if the name is something a human is picking, it could be please pick another name, it looks too close to preview of other object object name. This again is really oriented towards the file-picker use case which we've agreed [I think?] isn't the most common use case. Most of the time we expect the filenames to be generated by an application that's using the filesystem for a backing store. Changing the filenames out from under it 1) won't improve anything; 2) may break things. Given that we're talking about a problem that's subjective and thus can't really be solved, and the solution you propose is so complicated, I really don't see that this is a win over just saying we support all valid UTF-8 sequences; build whatever you want on top of that. There are ways to add some of the behavior you're asking for in JavaScript libraries on top, as long as you're willing to have a central coordinator for your filesystem access. Let's let people experiment with that as they wish. It appears to me that a majority of those who've spoken up support this conclusion, and I will try to update the spec this week. As before, I'm still only speccing out the sandboxed filesystem, so expansions into access outside the sandbox, and serialization of these filenames into local filesystem names, can be dealt with later. The other instances I've run into all seem to be cases where there's a canonical spelling and then a folded for Latin users writing. I certainly can't speak for all cases. and it doesn't matter elsewhere. Actually, i think we ended up trying to compile blacklists while developing punycode [2] for IDN [3]. 
I guess rfc 4290 [4], 4713 [5], 5564 [6], and 5992 [7], have tables which while not complete are certainly referencable, and given that UAs already have to deal with punycode, it's likely that they'd have access to those tables. I think the relevant section from unicode 6 [8] is probably 5.18 Case Mappings (page 171?) Where case distinctions are not important, other distinctions between Unicode characters (in particular, compatibility distinctions) are generally ignored as well. In such circumstances, text can be normalized to Normalization Form NFKC or NFKD after case folding, thereby producing a normalized form that erases both compatibility distinctions and case distinctions. I think this is probably what I want However, such normalization should generally be done only on a restricted repertoire, such as identifiers (alphanumerics). Yes, I'm hand waving at this requirement - filenames are in a way identifiers, you aren't supposed to encode an essay
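The folding behavior under discussion is easy to poke at from JavaScript. A quick sketch; the Turkish-locale results assume a runtime with full ICU data (as in stock Node.js):

```javascript
// Default (locale-independent) lowercasing vs. the Turkish tailoring:
// the default mapping sends "I" to "i" everywhere; only the Turkish
// rules produce the dotless ı. (Turkish results assume full ICU data.)
console.log('I'.toLowerCase());            // "i"
console.log('I'.toLocaleLowerCase('tr'));  // "ı" (dotless i)
console.log('İ'.toLocaleLowerCase('tr'));  // "i"

// The Unicode 6 recommendation quoted above: case fold, then NFKC, to
// erase compatibility distinctions too (e.g. the "ﬁ" ligature).
const fold = (s) => s.toLowerCase().normalize('NFKC');
console.log(fold('ﬁle') === fold('FILE')); // true
```

This illustrates why folding plus NFKC gives a single, locale-independent equivalence class for identifiers, and why enabling the Turkish tailoring would split that class.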
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Tue, May 17, 2011 at 2:41 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, May 17, 2011 at 2:35 PM, Kyle Huey m...@kylehuey.com wrote: The abort behaviors of FileReader and File[Saver|Writer] differ. The writing objects throw if the abort method is called when a write is not currently under way, while the reading object does not throw. The behaviors should be consistent. I don't particularly care either way, but I believe Jonas would like to change FileReader to match File[Saver|Writer]. Yeah, since we made FileReader.readAsX throw when called in the wrong state, I believe doing the same for abort() is the better option. / Jonas Sounds good to me.
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
It was likely just an oversight on my part that they differ. It does seem a bit odd to dispatch error/abort/loadend if aborting with no write in progress, so I favor the FileWriter/FileSaver behavior, but as long as they match, I'm not too bothered. On Tue, May 17, 2011 at 2:35 PM, Kyle Huey m...@kylehuey.com wrote: The abort behaviors of FileReader and File[Saver|Writer] differ. The writing objects throw if the abort method is called when a write is not currently under way, while the reading object does not throw. The behaviors should be consistent. I don't particularly care either way, but I believe Jonas would like to change FileReader to match File[Saver|Writer]. - Kyle
Re: [FileAPI] FileReader.abort() and File[Saver|Writer].abort have different behaviors
On Tue, May 17, 2011 at 2:48 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, May 17, 2011 at 2:42 PM, Eric U er...@google.com wrote: It was likely just an oversight on my part that they differ. It does seem a bit odd to dispatch error/abort/loadend if aborting with no write in progress, so I favor the FileWriter/FileSaver behavior, but as long as they match, I'm not too bothered. For what it's worth, FileReader.abort() currently follows what XHR.abort() does, which is to do nothing if called in the wrong state. I.e. no events are aborted and no exceptions are thrown. Ah, my mistake; I was reading http://www.w3.org/TR/FileAPI/#abort instead of http://dev.w3.org/2006/webapi/FileAPI/#abort.
Re: [File API: FileSystem] Path restrictions and case-sensitivity
I've grouped responses to bits of this thread so far below: Glenn said: If *this API's* concept of filenames is case-insensitive, then IMAGE.JPG and image.jpg represent the same file on English systems and two different files on Turkish systems, which is an interop problem. Timeless replied: no, if the api is case insensitive, then it's case insensitive *everywhere*, both on Turkish and on English systems. Things could only be case sensitive when serialized to a real file system outside of the API. I'm not proposing a case insensitive system which is locale aware, i'm proposing one which always folds. You're proposing not just a case-insensitive system, but one that forces e.g. an English locale on all users, even those in a Turkish locale. I don't think that's an acceptable solution. I also don't think having code that works in one locale and not another [Glenn's image.jpg example] is fantastic. It was what we were stuck with when I was trying to allow implementers the choice of a pass-through implementation, but given that that's fallen to the realities of path lengths on Windows, I feel like we should try to do better. Glenn: This can be solved at the application layer in applications that want it, without baking it into the filesystem API. This is mostly true; you'd have to make sure that all alterations to the filesystem went through a single choke-point or you'd have the potential for race conditions [or you'd need to store the original-case filenames yourself, and send the folded case down to the filesystem API]. Glenn: A virtual FS as the backing for the filesystem API does not resolve that core issue. It makes sense to encourage authors to gracefully handle errors thrown by creating files and directories. Such a need has already been introduced via Google Chrome's unfortunate limitation of a 255 byte max path length. That limitation grew out of the OS-dependent passthrough implementation. We're fixing that right now, with this proposal. 
The one take-away I have from that bug: it would have been nice to have a more descriptive error message. It took a while to figure out that the path length was too long for the implementation. I apologize for that--it was an oversight. If we can relax the restrictions to a small set, it'll be more obvious what the problems are. IIRC this problem was particularly confusing because we were stopping you well short of the allowed 255 bytes, due to your profile's nesting depth. I'd like to obviate the need for complicated exceptions or APIs that suggest better names, by leaving naming up to the app developer as much as possible. [segue into other topics] Glenn asked about future expansions of IndexedDB to handle Blobs, specifically with respect to FileWriter and efficient incremental writes. Jonas replied: A combination of FileWriter and IndexedDB should be able to handle this without problem. This would go beyond what is currently in the IndexedDB spec, but it's this part that we're planning on experimenting with. The way I have envisioned it to work is to add a function called createFileEntry somewhere, for example the IDBFactory interface. This would return a fileEntry which you could then write to using FileWriter as well as store in the database using normal database operations. As Jonas and I have discussed in the past, I think that storing Blobs via reference in IDB works fine, but when you make them modifiable FileEntries instead, you either have to give up IDB's transactional nature or you have to give up efficiency. For large mutable Blobs, I don't think there's going to be a clean interface there. Still, I look forward to seeing what you come up with. Eric
Re: [File API: FileSystem] Path restrictions and case-sensitivity
On Wed, May 11, 2011 at 4:47 PM, timeless timel...@gmail.com wrote: On Thu, May 12, 2011 at 2:08 AM, Eric U er...@google.com wrote: Timeless replied: no, if the api is case insensitive, then it's case insensitive *everywhere*, both on Turkish and on English systems. Things could only be case sensitive when serialized to a real file system outside of the API. I'm not proposing a case insensitive system which is locale aware, i'm proposing one which always folds. You're proposing not just a case-insensitive system, but one that forces e.g. an English locale on all users, even those in a Turkish locale. I don't think that's an acceptable solution. No, I proposed case preserving. If the file is first created with a dotless i, that hint is preserved and a user agent could and should retain this (e.g. for when it serializes to a real file system). I'm just suggesting not allowing an application to ask for distinct dotted and dotless instances of the same approximate file name. There's a reasonable chance that case collisions will be disastrous when serialized, thus it's better to prevent case collisions when an application tries to create the file - the application can accept a suggested filename or generate a new one. There are a few things going on here: 1) Does the filesystem preserve case? If it's case-sensitive, then yes. If it's case-insensitive, then maybe. 2) Is it case-sensitive? If not, you have to decide how to do case folding, and that's locale-specific. As I understand it, Unicode case-folding isn't locale specific, except when you choose to use the Turkish rules, which is exactly the problem we're talking about. 3) If you're case folding, are you going to go with a single locale everywhere, or are you going to use the locale of the user? 4) [I think this is what you're talking about w.r.t. 
not allowing both dotted and dotless i]: Should we attempt to detect filenames that are /too similar/ for some definition of /too similar/, ostensibly to avoid confusing the user. As I read what you wrote, you wanted: 1) yes 2) no 3) a new locale in which I, ı, İ and i all fold to the same letter, everywhere 4) yes, possibly only for the case of I, ı, İ and i 4 is, in the general case, impossible. It's not well-defined, and is just as likely to cause problems as solve them. If you *just* want to check for ı vs. i, it's possible, but it's still not clear to me that what you're doing will be the correct behavior in Turkish locales [are there any Turkish words, names, abbreviations, etc. that only differ in that character?] and it doesn't matter elsewhere.
Re: [File API: FileSystem] Path restrictions and case-sensitivity
On Wed, May 11, 2011 at 4:52 PM, Glenn Maynard gl...@zewt.org wrote: On Wed, May 11, 2011 at 7:08 PM, Eric U er...@google.com wrote: *everywhere*, both on Turkish and on English systems. Things could only be case sensitive when serialized to a real file system outside of the API. I'm not proposing a case insensitive system which is locale aware, i'm proposing one which always folds. no, if the api is case insensitive, then it's case insensitive You're proposing not just a case-insensitive system, but one that forces e.g. an English locale on all users, even those in a Turkish locale. I don't think that's an acceptable solution. I also don't think having code that works in one locale and not another [Glenn's image.jpg example] is fantastic. It was what we were stuck with when I was trying to allow implementers the choice of a pass-through implementation, but given that that's fallen to the realities of path lengths on Windows, I feel like we should try to do better. To clarify something which I wasn't aware of before digging into this deeper: Unicode case folding is *not* locale-sensitive. Unlike lowercasing, it uses the same rules in all locales, except Turkish. Turkish isn't just an easy-to-explain example of one of many differences (as it is with Unicode lowercasing); it is, as far as I see, the *only* exception. Unicode's case folding rules have a special flag to enable Turkish in case folding, which we can safely ignore here--nobody uses it for filenames. (Windows filenames don't honor that special case on Turkish systems, so those users are already accustomed to that.) So it's not locale-sensitive unless it is, but nobody does that anyway, so don't worry about it? I'm a bit uneasy about that in general, but Windows not supporting it is a good point. Anyone know about Mac or Linux systems? 
That said, it's still uncomfortable having a dependency on the Unicode folding table here: if it ever changes, it'll cause both interop problems and data consistency problems (two files which used to be distinct filenames turning into two files with the same filenames due to a browser update updating its Unicode data). Granted, either case would probably be vanishingly rare in practice at this point. Agreed [both in the discomfort and the rarity], but I think it's a very ugly dependency anyway. All that aside, I think a much stronger argument for case-sensitive filenames is the ability to import files from essentially any environment; this API's filename rules are almost entirely a superset of all other filesystems and file containers. For example, sites can allow importing (once the needed APIs are in place) directories of data into the sandbox, without having to modify any filenames to make it fit a more constrained API. Similarly, sites can extract tarballs directly into the sandbox. (I've seen tars containing both Makefile and makefile; maybe people only do that to confound Windows users, but they exist.) I've actually ended up in that situation on Linux, with tools that autogenerated makefiles, but were run from Makefiles. It's not a situation I really wanted to be in, but it was nice that it actually worked without me having to hack around it. I'm not liking the backslash exception. It's the only thing that prevents this API from being a complete superset, as far as I can see, of all production filesystems. Can we drop that rule? It might be a little surprising to developers who have only worked in Windows, but they'll be surprised anyway, and it shouldn't lead to latent bugs. It can't be a complete superset of all filesystems in that it doesn't allow forward slash in filenames either. However, I see your point. You could certainly have a filename with a backslash in it on a Linux/ext2 system. 
Does anyone else have an opinion on whether it's worth the confusion potential? Glenn: This can be solved at the application layer in applications that want it, without baking it into the filesystem API. This is mostly true; you'd have to make sure that all alterations to the filesystem went through a single choke-point or you'd have the potential for race conditions [or you'd need to store the original-case filenames yourself, and send the folded case down to the filesystem API]. Yeah, it's not necessarily easy to get right, particularly if you have multiple threads running... (The rest was Charles, by the way.) Ah, sorry Glenn and Charles. A virtual FS as the backing for the filesystem API does not resolve that core issue. It makes sense to encourage authors to gracefully handle errors thrown by creating files and directories. Such a need has already been introduced via Google Chrome's unfortunate limitation of a 255 byte max path length. -- Glenn Maynard
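The single-choke-point approach Eric sketches in prose could look something like this (class and method names are hypothetical): every create goes through one map from folded name to original-case spelling, so case collisions are detected before they reach the case-sensitive store, and no two threads of logic can race past each other.

```javascript
// Application-layer case-insensitivity over a case-sensitive filesystem
// API: one choke point maps folded names to the original-case spelling
// and rejects names that collide after folding. Names are illustrative.
class NameRegistry {
  constructor() { this.byFolded = new Map(); } // folded -> original name
  claim(name) {
    const folded = name.toLowerCase().normalize('NFKC');
    const existing = this.byFolded.get(folded);
    if (existing !== undefined && existing !== name) {
      throw new Error(`"${name}" collides with existing "${existing}"`);
    }
    this.byFolded.set(folded, name);
    return folded; // what you'd actually hand to the filesystem API
  }
}

const ns = new NameRegistry();
ns.claim('IMAGE.JPG');                 // fine: first claim wins
let collided = false;
try { ns.claim('image.jpg'); } catch (e) { collided = true; }
console.log(collided);                 // true: caught at the choke point
```

The price, as noted above, is that every filesystem mutation must go through this one object; a write that bypasses it reintroduces the race.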
Re: [File API: FileSystem] Path restrictions and case-sensitivity
On Wed, May 11, 2011 at 7:14 PM, Jonas Sicking jo...@sicking.cc wrote: On Wednesday, May 11, 2011, Eric U er...@google.com wrote: I've grouped responses to bits of this thread so far below: Glenn said: If *this API's* concept of filenames is case-insensitive, then IMAGE.JPG and image.jpg represent the same file on English systems and two different files on Turkish systems, which is an interop problem. Timeless replied: no, if the api is case insensitive, then it's case insensitive *everywhere*, both on Turkish and on English systems. Things could only be case sensitive when serialized to a real file system outside of the API. I'm not proposing a case insensitive system which is locale aware, i'm proposing one which always folds. You're proposing not just a case-insensitive system, but one that forces e.g. an English locale on all users, even those in a Turkish locale. I don't think that's an acceptable solution. I also don't think having code that works in one locale and not another [Glenn's image.jpg example] is fantastic. It was what we were stuck with when I was trying to allow implementers the choice of a pass-through implementation, but given that that's fallen to the realities of path lengths on Windows, I feel like we should try to do better. Glenn: This can be solved at the application layer in applications that want it, without baking it into the filesystem API. This is mostly true; you'd have to make sure that all alterations to the filesystem went through a single choke-point or you'd have the potential for race conditions [or you'd need to store the original-case filenames yourself, and send the folded case down to the filesystem API]. Glenn: A virtual FS as the backing for the filesystem API does not resolve that core issue. It makes sense to encourage authors to gracefully handle errors thrown by creating files and directories. Such a need has already been introduced via Google Chrome's unfortunate limitation of a 255 byte max path length. 
That limitation grew out of the OS-dependent passthrough implementation. We're fixing that right now, with this proposal. The one take-away I have from that bug: it would have been nice to have a more descriptive error message. It took a while to figure out that the path length was too long for the implementation. I apologize for that; it was an oversight. If we can relax the restrictions to a small set, it'll be more obvious what the problems are. IIRC this problem was particularly confusing because we were stopping you well short of the allowed 255 bytes, due to your profile's nesting depth. I'd like to obviate the need for complicated exceptions or APIs that suggest better names, by leaving naming up to the app developer as much as possible.

[segue into other topics]

Glenn asked about future expansions of IndexedDB to handle Blobs, specifically with respect to FileWriter and efficient incremental writes.

Jonas replied: A combination of FileWriter and IndexedDB should be able to handle this without problem. This would go beyond what is currently in the IndexedDB spec, but it's this part that we're planning on experimenting with. The way I have envisioned it working is to add a function called createFileEntry somewhere, for example on the IDBFactory interface. This would return a FileEntry which you could then write to using FileWriter, as well as store in the database using normal database operations.

As Jonas and I have discussed in the past, I think that storing Blobs by reference in IDB works fine, but when you make them modifiable FileEntries instead, you either have to give up IDB's transactional nature or you have to give up efficiency. For large mutable Blobs, I don't think there's going to be a clean interface there. Still, I look forward to seeing what you come up with.

Why not simply make the API case-sensitive and allow *any* filename that can be expressed in a JavaScript string? That's the way I'm leaning.
Implementations can do their best to make the on-filesystem filename match the filename exposed in the API as closely as they can, and keep a map between OS filename and API filename for the cases when the two can't be the same.

We're not speccing out anything outside the sandbox yet, and we've decided that a pass-through implementation is impractical, so we don't need this approach yet, there being no on-filesystem filename. It certainly could work for the oft-mentioned My Photos extension, when we get around to that.

So if the page creates two files named Makefile and makefile on a system that is case-insensitive, the implementation could call the second file makefile(2) and keep track of that mapping. This removes any concerns about case, internationalization, and system-limitation issues, and thereby makes things very easy for web authors. I might be missing something obvious as I haven't
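The mapping scheme described above (Makefile and makefile coexisting via a makefile(2) rename) can be sketched as a small bookkeeping table. This is a hypothetical illustration of the idea, not code from any implementation; the class and method names are invented:

```javascript
// Sketch: map case-sensitive API filenames onto names a
// case-insensitive OS can store without collisions.
class NameMapper {
  constructor() {
    this.apiToOs = new Map();  // API name -> on-disk name
    this.osNames = new Set();  // case-folded on-disk names in use
  }

  // Return the on-disk filename to use for a given API filename.
  osNameFor(apiName) {
    if (this.apiToOs.has(apiName)) return this.apiToOs.get(apiName);
    let candidate = apiName;
    let n = 2;
    // The OS is case-insensitive, so collisions are checked on the
    // case-folded form; on collision, append "(2)", "(3)", ...
    while (this.osNames.has(candidate.toLowerCase())) {
      candidate = `${apiName}(${n++})`;
    }
    this.apiToOs.set(apiName, candidate);
    this.osNames.add(candidate.toLowerCase());
    return candidate;
  }
}

const mapper = new NameMapper();
console.log(mapper.osNameFor("Makefile")); // → "Makefile"
console.log(mapper.osNameFor("makefile")); // → "makefile(2)"
```

A real implementation would also have to persist the map and handle deletions, but the core trick is just this collision-renaming table kept out of the page's sight.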
[File API: FileSystem] Path restrictions and case-sensitivity
I'd like to bring back up the discussion that went on at [1] and [2]. In particular, I'd like to propose a minimal set of restrictions for file names and paths, punt on the issue of what happens in later layers of the API, and discuss case-sensitivity rules.

For the sandboxed filesystem, I propose that we disallow only:

* Embedded null characters [will likely break something somewhere]
* Embedded forward slash (/) [it's our delimiter]
* Embedded backslash (\) [will likely confuse people if we permit it]
* Files called '.' [has a meaning for us already]
* Files called '..' [has a meaning for us already]
* Path segments longer than 1KB [probably long enough, and I feel better having a limit]

...and explicitly support anything other than that. I'm not proposing a maximum path length at this time...perhaps we should just say MUST support at least X, for some large X?

Regarding case sensitivity: I originally specced it as case-insensitive-case-preserving to make it easier to support a passthrough implementation on Windows and Mac. However, as passthroughs have turned out to be infeasible [see previous thread on path-length problems], all case insensitivity really gets us is potential locale issues. I suggest we drop it and just go with a case-sensitive filesystem.

Eric

[1] http://lists.w3.org/Archives/Public/public-webapps/2010OctDec/1031.html
[2] http://lists.w3.org/Archives/Public/public-webapps/2011JanMar/0704.html
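The proposed restrictions are small enough to sketch as a single validator. This is an illustrative reading of the proposal, not spec text; the function name is invented, and treating the 1KB segment limit as UTF-8 bytes (and rejecting empty names) are assumptions the proposal doesn't pin down:

```javascript
// Sketch: validate one path segment against the proposed sandbox rules.
function isValidSegment(name) {
  if (name === "." || name === "..") return false;  // reserved meanings
  if (/[\u0000\/\\]/.test(name)) return false;      // null, '/', '\'
  // "Path segments longer than 1KB" -- measured here as UTF-8 bytes
  // (assumption); empty names are also rejected (assumption).
  const bytes = new TextEncoder().encode(name).length;
  return bytes > 0 && bytes <= 1024;
}

console.log(isValidSegment("IMAGE.JPG")); // → true
console.log(isValidSegment("a/b"));       // → false
console.log(isValidSegment(".."));        // → false
```

Everything else, including names that differ only by case, would pass, consistent with the case-sensitive filesystem suggested above.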