RE: flush() | was Re: FileSystem API Comments
Hi Arun,

We believe that the flush property should be specified when getting the file handle, as in option 1. One benefit of this is that it will enable buffering of both reads and writes for the same handle. On the other hand, if we specify it on every write operation (as in option 2), we could run into inconsistencies when invoking the write method with and without the flag.

Based on our interpretation of option 1, it seems as though the flush() function would not be available. This is how we understood its usage would look:

    navigator.getFileSystem().then(function(root) {
      return root.openWrite("path/to/file.txt", {autoFlushing: true});
    }).then(function(fileHandle) {
      // data is written to the system cache and is flushed to disk without delay
      fileHandle.write(blob);
    });

Thank you,
Ali

From: Arun Ranganathan [mailto:a...@mozilla.com]
Sent: Friday, October 31, 2014 11:19 AM
To: Ali Alabbas
Cc: Web Applications Working Group WG
Subject: flush() | was Re: FileSystem API Comments

Greetings Ali! I've been thinking about the discussion of flush(), and would like to see if I can make my previous statement a bit more nuanced. It turns out that flush() (in the vein of fsync/sync) is pretty useful, and after discussion with a few folks within Mozilla, I realize that it isn't as simple as tacking it on to the write-family of Promises - as you point out, it is a potentially expensive operation.

Something like a flush feature might help the following use cases:

1. Creating a database technology on top of the filesystem technology. This might include IndexedDB, but also WebSQL (as a hypothetical example). Most transactional operations like this need the ability to do something like flush.

2. Then, there's the use case of compiling C++ codebases to JS. Well-known examples of this are games, leveraging asm.js. In this genre of use case, sometimes a large database is brought over (e.g. sqlite). It could be memory backed, but it is a definite bonus if it could be filesystem backed.
Something like flush helps make that a possibility.

Now the question is how to do this in a WebAPI, allowing for the power along with the mitigations that a web app might need, notably for performance? A few ideas below:

On Oct 21, 2014, at 4:36 PM, Ali Alabbas a...@microsoft.com wrote:

* flush() - This is costly functionality to expose and is likely to be overused by callers. It would be beneficial to automatically flush changes to disk by allowing the default file write behavior by the OS. For example, on Windows, we would leave it up to the filesystem cache to determine the best time to flush to disk. This is non-deterministic from the app's point of view, but the only time it is a potential problem is when there's a hard power-off. Most apps should not be concerned with this; only apps that have very high data reliability requirements would need the granular control of flushing to disk. In those cases a developer should use IndexedDB. So we should consider obscuring this functionality since it's not a common requirement and has a performance impact if it's widely used.

I agree with the idea of obscuring the functionality a bit, especially given that it might not be necessary for a large class of operations. A few ways to do that:

1. Add this to a dictionary option when coining the FileHandleWritable from the Directory (e.g. add it to something like the OpenWriteOptions: http://w3c.github.io/filesystem-api/Overview.html#widl-Directory-openWrite-Promise-FileHandleWritable--DOMString-File-path-OpenWriteOptions-options). This way, the developer has the ability to coin a more expensive promise, if that particular set of write operations needs this feature.

2. Add this to the set of options on the FileHandleWritable. This could be by dictionary, again. Or, it could be a boolean on the FileHandleWritable's write(). This latter might not be specific enough. Like other implementations, ours is not going to buffer anything, but rely on the underlying operating system's buffer for writes and reads.

3. Stick with the idea of a method, like flush(). In this case, we might have to caveat the use of this, since the possibility of inexperienced developer misuse is high :-) It might help to see if we can determine some boundaries on this.

Any feedback on some of these options would be valuable. I am thinking of 1. and 2.

- A*
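To make the difference between option 1 and option 2 in this exchange easier to see, here is a minimal in-memory sketch. It is not the draft API: `MockDirectory`, `MockHandle`, the per-write `flush` flag, and `flushDemo` are hypothetical names invented for illustration, and `autoFlushing` is simply the option name used in Ali's example above. The "cache" array stands in for the OS write cache and the "disk" array for durable storage.

```javascript
// Hypothetical in-memory model; none of these names come from the draft spec.
class MockHandle {
  constructor(autoFlushing) {
    this.autoFlushing = autoFlushing;
    this.cache = []; // written, but not yet durable
    this.disk = [];  // durable
  }

  // Option 2 shape: an optional per-write flag overrides the handle default.
  write(chunk, options = {}) {
    this.cache.push(chunk);
    const flushNow = options.flush ?? this.autoFlushing;
    if (flushNow) this.flushToDisk();
    return Promise.resolve();
  }

  flushToDisk() {
    this.disk.push(...this.cache);
    this.cache.length = 0;
  }

  // Simulate a hard power-off: whatever is still in the cache is lost.
  powerOff() {
    this.cache.length = 0;
    return this.disk.slice();
  }
}

class MockDirectory {
  // Option 1 shape: durability is chosen once, when the handle is obtained.
  openWrite(path, options = {}) {
    return Promise.resolve(new MockHandle(options.autoFlushing === true));
  }
}

async function flushDemo() {
  const root = new MockDirectory();

  // Option 1: every write on this handle is flushed.
  const a = await root.openWrite("path/to/file.txt", { autoFlushing: true });
  await a.write("a1");
  await a.write("a2");

  // Option 2: the flag is passed per write; omitting it once leaves a gap.
  const b = await root.openWrite("path/to/file.txt");
  await b.write("b1", { flush: true });
  await b.write("b2"); // no flag: stays in the cache

  return { option1: a.powerOff(), option2: b.powerOff() };
}
```

In this toy model the option 1 handle keeps both writes after the simulated power-off, while the option 2 handle keeps only the write that carried the flag, which is the kind of inconsistency Ali's message is concerned about.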
flush() | was Re: FileSystem API Comments
Greetings Ali! I’ve been thinking about the discussion of flush(), and would like to see if I can make my previous statement a bit more nuanced. It turns out that flush() (in the vein of fsync/sync) is pretty useful, and after discussion with a few folks within Mozilla, I realize that it isn’t as simple as tacking it on to the “write-family” of Promises - as you point out, it is a potentially expensive operation.

Something like a flush feature might help the following use cases:

1. Creating a database technology on top of the filesystem technology. This might include IndexedDB, but also WebSQL (as a hypothetical example). Most transactional operations like this need the ability to do something like flush.

2. Then, there’s the use case of compiling C++ codebases to JS. Well-known examples of this are games, leveraging asm.js. In this genre of use case, sometimes a large database is brought over (e.g. sqlite). It could be memory backed, but it is a definite bonus if it could be filesystem backed. Something like flush helps make that a possibility.

Now the question is how to do this in a WebAPI, allowing for the power along with the mitigations that a web app might need, notably for performance? A few ideas below:

On Oct 21, 2014, at 4:36 PM, Ali Alabbas a...@microsoft.com wrote:

* flush() - This is costly functionality to expose and is likely to be overused by callers. It would be beneficial to automatically flush changes to disk by allowing the default file write behavior by the OS. For example, on Windows, we would leave it up to the filesystem cache to determine the best time to flush to disk. This is non-deterministic from the app's point of view, but the only time it is a potential problem is when there's a hard power-off. Most apps should not be concerned with this; only apps that have very high data reliability requirements would need the granular control of flushing to disk. In those cases a developer should use IndexedDB. So we should consider obscuring this functionality since it's not a common requirement and has a performance impact if it's widely used.

I agree with the idea of obscuring the functionality a bit, especially given that it might not be necessary for a large class of operations. A few ways to do that:

1. Add this to a dictionary option when coining the FileHandleWritable from the Directory (e.g. add it to something like the OpenWriteOptions: http://w3c.github.io/filesystem-api/Overview.html#widl-Directory-openWrite-Promise-FileHandleWritable--DOMString-File-path-OpenWriteOptions-options). This way, the developer has the ability to “coin” a “more expensive” promise, if that particular set of write operations needs this feature.

2. Add this to the set of options on the FileHandleWritable. This could be by dictionary, again. Or, it could be a boolean on the FileHandleWritable’s write(). This latter might not be specific enough. Like other implementations, ours is not going to buffer anything, but rely on the underlying operating system’s buffer for writes and reads.

3. Stick with the idea of a method, like flush(). In this case, we might have to caveat the use of this, since the possibility of inexperienced developer misuse is high :-) It might help to see if we can determine some boundaries on this.

Any feedback on some of these options would be valuable. I am thinking of 1. and 2.

- A*
RE: FileSystem API Comments
On Tue Oct 21 09:36 PM, Jonas Sicking wrote:

1.1 Use cases (3. Audio/Photo editor with offline access or local cache for speed)
* Edited files should be accessible by other client-side applications
  - Having the sandboxed file system share its contents between all apps would allow apps to tamper with the files of another app. This could result in corrupted files and perhaps an invalid state for some apps that expect certain contents to exist in a file. This makes us wonder: should we warn users about files that are being opened and written to?

Each origin has a separate sandboxed filesystem. There is no way for websites to read each other's filesystems. This is no different from IndexedDB or localStorage. This also means that we have the same prompting behavior, the same Quota Management dependency and the same security model as IndexedDB and localStorage.

That contradicts:
- Edited files should be accessible by other client-side applications

The API should allow for editing a 'shared folder' which multiple applications / web apps can access. That implies a sort of locking/unlocking API, e.g.

photo editor:

    fs = api.getFileSystem({shareName: "photos"}).then((dir) => { dir.openWrite("pic.jpeg") });

super photo viewer:

    fs = api.getFileSystem({shareName: "photos"}).then((dir) => { dir.openRead("pic.jpeg") });

What happens with the pic.jpeg?
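The "locking/unlocking API" that a shared folder would imply can be sketched roughly as follows. This is only an illustration of one possible policy (many readers or one writer per path, granted in arrival order); `PathLock`, `acquire`, `release`, and `lockDemo` are hypothetical names, and nothing like this is defined in the draft being discussed.

```javascript
// Hypothetical per-path lock: many readers OR one writer at a time.
class PathLock {
  constructor() {
    this.readers = 0;
    this.writer = false;
    this.waiting = []; // queued { mode, grant } requests, in arrival order
  }

  // Resolves once the lock has been granted in the requested mode.
  acquire(mode) {
    return new Promise((resolve) => {
      this.waiting.push({ mode, grant: resolve });
      this.drain();
    });
  }

  release(mode) {
    if (mode === "write") this.writer = false;
    else this.readers -= 1;
    this.drain();
  }

  // Grant queued requests in order until one cannot be granted yet.
  drain() {
    while (this.waiting.length > 0) {
      const next = this.waiting[0];
      if (next.mode === "write") {
        if (this.writer || this.readers > 0) return;
        this.writer = true;
      } else {
        if (this.writer) return;
        this.readers += 1;
      }
      this.waiting.shift();
      next.grant();
    }
  }
}

async function lockDemo() {
  const lock = new PathLock(); // imagine one lock per shared path, e.g. pic.jpeg
  const log = [];

  // The "photo editor" takes the write lock first.
  await lock.acquire("write");
  log.push("editor: writing");

  // The "photo viewer" asks to read while the write is still in progress.
  const viewer = lock.acquire("read").then(() => {
    log.push("viewer: reading");
    lock.release("read");
  });

  log.push("editor: done");
  lock.release("write");
  await viewer;
  return log;
}
```

Under this policy the viewer's read is simply delayed until the editor releases the file; other answers to "what happens with the pic.jpeg?" (rejecting the read, or handing out a read-only snapshot) would need a different design.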
Re: FileSystem API Comments
I don't see a contradiction. Each *web* app sees only files accessible from its domain (so your two apps have distinct pic.jpeg). Each *native* app has access to whatever the operating system says.

Or am I missing something in your message?

Cheers,
David

On 22/10/14 12:23, Jonathan Bond-Caron wrote:

That contradicts:
- Edited files should be accessible by other client-side applications

The API should allow for editing a 'shared folder' which multiple applications / web apps can access. That implies a sort of locking/unlocking API, e.g.

photo editor:

    fs = api.getFileSystem({shareName: "photos"}).then((dir) => { dir.openWrite("pic.jpeg") });

super photo viewer:

    fs = api.getFileSystem({shareName: "photos"}).then((dir) => { dir.openRead("pic.jpeg") });

What happens with the pic.jpeg?

--
David Rajchenbach-Teller, PhD
Performance Team, Mozilla
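David's point that two web apps from different origins each see their own pic.jpeg can be illustrated with a small model. This is a sketch under stated assumptions, not the proposed API: `getFileSystem(origin)` here takes the origin as an explicit argument purely so the example can run outside a browser, and the `read`/`write` helpers and the example origins are hypothetical.

```javascript
// Hypothetical model of per-origin sandboxes: each origin gets its own root.
const sandboxes = new Map(); // origin -> Map(path -> contents)

function getFileSystem(origin) {
  if (!sandboxes.has(origin)) sandboxes.set(origin, new Map());
  const files = sandboxes.get(origin);
  return Promise.resolve({
    write: (path, data) => { files.set(path, data); },
    read: (path) => files.get(path),
  });
}

async function sandboxDemo() {
  const editor = await getFileSystem("https://editor.example");
  const viewer = await getFileSystem("https://viewer.example");

  // The editor writes pic.jpeg into its own sandbox only.
  editor.write("pic.jpeg", "edited pixels");

  return {
    editorSees: editor.read("pic.jpeg"),
    viewerSees: viewer.read("pic.jpeg"), // a different sandbox: nothing there
  };
}
```

In this model the viewer never observes the editor's file, which is why no cross-app locking question arises under the per-origin design; it also shows why the "accessible by other client-side applications" use case is not met by it.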
Re: FileSystem API Comments
22.10.2014, 12:32, David Rajchenbach-Teller dtel...@mozilla.com:

I don't see a contradiction. Each *web* app sees only files accessible from its domain (so your two apps have distinct pic.jpeg). Each *native* app has access to whatever the operating system says.

There are a lot of use cases for sharing data with apps of *different* origins, although there is of course a more complex security story than when everything goes into a potentially opaque sandbox. (And to make the basic security story work it makes sense to have some level of opacity in the sandbox.)

The lack of a mechanism to do so is a huge difference with native - I have directories in my filesystem that are autosynched to things online, but are also visible.

The idea behind web intents/activities/etc generalises obviously to remove the distinction between web and native - I should be able to use a web-based image manipulation tool on stuff in my filesystem. Or several. At the moment that can be done in a somewhat hacky way by uploading files, manipulating them, then asking the user to save them back. But whereas I have mail clients that store each email message on the filesystem, so I can import stuff into a different program myself instead of having to go through a service provider, that doesn't work for web-based email systems even when those are designed to be functional offline.

etc etc.

cheers

Chaals

Or am I missing something in your message?

Cheers,
David

On 22/10/14 12:23, Jonathan Bond-Caron wrote:

That contradicts:
- Edited files should be accessible by other client-side applications

The API should allow for editing a 'shared folder' which multiple applications / web apps can access. That implies a sort of locking/unlocking API, e.g.

photo editor:

    fs = api.getFileSystem({shareName: "photos"}).then((dir) => { dir.openWrite("pic.jpeg") });

super photo viewer:

    fs = api.getFileSystem({shareName: "photos"}).then((dir) => { dir.openRead("pic.jpeg") });

What happens with the pic.jpeg?

--
David Rajchenbach-Teller, PhD
Performance Team, Mozilla

--
Charles McCathie Nevile - web standards - CTO Office, Yandex
cha...@yandex-team.ru - - - Find more at http://yandex.com
Re: FileSystem API Comments
Ali,

First, thanks for your timely comments :) I’m in the process of editing the FileSystem API. Responses inline:

On Oct 21, 2014, at 4:36 PM, Ali Alabbas a...@microsoft.com wrote:

1.1 Use cases (3. Audio/Photo editor with offline access or local cache for speed)
* Edited files should be accessible by other client-side applications
  - Having the sandboxed file system share its contents between all apps would allow apps to tamper with the files of another app.

[snip]

Admittedly, these use cases have been borrowed from the “File API: Directories and System” specification (which is now a W3C Note), at least for the purpose of providing equivalent functionality. In practice, everything you’ve pointed out makes it a hard problem to solve. The per-origin sandbox model also raises file lock issues on multiple access, but they are probably easier to solve, and not as prevalent.

Also, we’re going to forego the “temporary” and “persistent” distinctions that are in the draft I think. And while there’s a technical dependency on Quota Manager, I don’t think there’s a spec. dependency in terms of API. Of course, certain Directory operations may reject a promise with a quota error.

3. The Directory Interface
* Change events - I would like to revisit the discussion on apps getting notifications of changes to files/directories.

This is a good point; right now there’s no way to do this. I’m open to suggestions. An early scratch pad version of my spec. changes proposed Directory as an EventTarget also, but… this won’t work for a variety of reasons.

* removeDeep() and move() - Do these support links or junctions? If not, what is the expected behavior?

No; but the entire API doesn’t support these right now.

* enumerate() - It would be useful to have pre-filtering support for the following: file/directory, ranges, wildcard search. Currently we would be forced to enumerate the entire set and manually filter out the items we want from the result set.

I completely agree this would be useful, but there’s a problem to solve even before we get there! Right now, we say that we’ll fulfill the enumerate promise with something called “EventStream” which was initially a Tab Atkins proposal, and which would be really useful to get right. It’s underspecified right now, but I’m a fan of it ;-) We’ll have to think about how to return wildcard searches, etc., and how to annotate results in the result set.

4. The FileHandle Interface
* FileHandles - Is the model basically that the first caller to get the handle gets to use it, and all subsequent calls need to wait for the file handle to become available again? Are there more details about the locking model used here?

Yes; essentially the “first invocation” uses then releases it. A version of this problem was encountered when specifying FileReader (http://dev.w3.org/2006/webapi/FileAPI/#dfn-filereader) which used the internal state (but also accessible to the developer) “LOADING” to prevent multiple concurrent reads.

* Auto-closing of FileHandles - This may cause confusion as it does not match the common developer mental model of a file handle which is “opened” and then available for use until it's “closed”. Perhaps it would be advantageous to have an explicit close function as part of the FileHandle interface?

There are pros and cons either way. I’d be interested in solving this for the lion’s share of use cases. I’m not strongly opinionated on the matter of an explicit close function (we have one on Blob, for example), but it seems even this has drawbacks.

* AbortableProgressPromise - It is not clear how a developer would define the abort callback of an AbortableProgressPromise. It seems that the user agent would be responsible for setting the abort callback since it instantiates and returns the AbortableProgressPromise.

We’re going to not use an AbortableProgressPromise, but we will probably have a new beast called CancelablePromise.

5. The FileHandleWritable Interface
* write() and flush() - It might be useful to have support for “transacted” streams where the caller can write to a copy of the file and then have it atomically replaced: swap the old file with the new one and then delete the old file.

Agreed.

* flush() - This is costly functionality to expose and is likely to be overused by callers.

Agreed - let’s flush flush().

6. FileSystem Configuration Parameters
* Dictionary DestinationDict - The DestinationDict seems to exist to facilitate the renaming of a directory in conjunction with a move. However, the same operation is done differently for files which makes the functionality non-uniform. Perhaps we can add a rename() function to make it more intuitive?

I’ll commit to sample code and more “spec text” to make this clearer in my next
FileSystem API Comments
Hello,

I'm with the IE Platform team at Microsoft. We have a few comments on the latest editor's draft of the newly proposed FileSystem API [1].

1.1 Use cases (3. Audio/Photo editor with offline access or local cache for speed)
* Edited files should be accessible by other client-side applications
  - Having the sandboxed file system share its contents between all apps would allow apps to tamper with the files of another app. This could result in corrupted files and perhaps an invalid state for some apps that expect certain contents to exist in a file. This makes us wonder: should we warn users about files that are being opened and written to? If an app is just doing a read, can it open a file or directory without the user's permission, or could this pose a possible issue as well? Also, is the Quota Management API going to be a dependency? It's unclear what we would do with regards to requesting permission to access files. Will this spec be responsible for defining what questions/permission inquiries are presented and when they are presented to the user? For example, what happens when one file is locked for use by a different application? Is the user notified and given the option to open a read-only copy of that file?

3. The Directory Interface
* Change events - I would like to revisit the discussion on apps getting notifications of changes to files/directories. There are many scenarios where an application would want to react to renames/moves of a file/directory. There would also be value in being notified of a change to a directory's structure. If an app has a file browser that allows a user to select files and/or directories and another app makes changes to the sandboxed filesystem, then it would be expected that the first app should be notified and would be able to refresh its directory tree. Otherwise it would require the user to somehow force a refresh, which would not be a good user experience since the user would expect the file browser to update on its own.
* removeDeep() and move() - Do these support links or junctions? If not, what is the expected behavior?
* enumerate() - It would be useful to have pre-filtering support for the following: file/directory, ranges, wildcard search. Currently we would be forced to enumerate the entire set and manually filter out the items we want from the result set.
  - Callers often know exactly whether or not they want to enumerate files or folders. For example, an image upload service may only be interested in the files present in a directory rather than all of its directories. Perhaps it would be useful to have enumerateFiles() and enumerateDirectories() for this purpose? Or we could have another argument for enumerate() that is an enum (directory, file).
  - Supporting optimized pagination of large directories. We could have arguments for a starting index and length so that we would be able to specify a range of items to retrieve within the result set.
  - Supporting the wildcard character to pre-filter a list of files/directories (e.g. *.jpg).

4. The FileHandle Interface
* FileHandles - Is the model basically that the first caller to get the handle gets to use it, and all subsequent calls need to wait for the file handle to become available again? Are there more details about the locking model used here?
* Auto-closing of FileHandles - This may cause confusion as it does not match the common developer mental model of a file handle which is opened and then available for use until it's closed. Perhaps it would be advantageous to have an explicit close function as part of the FileHandle interface? With the current behavior there can be overhead with the unintended closure of the FileHandle that would require a developer to continuously open/close a FileHandle. The currently defined behavior assumes that a developer is done with all their file manipulations when they have completed a promise chain. However, a developer may want to keep the FileHandle open to be used elsewhere at some other point in time that is not related to the current promise chain. An example of the usefulness of having an explicit close function is if you were to implement a word processor and wanted to lock down the file that it currently has open for the period of its editing. This way you are free to continue operating on that file for the duration that it is open, protecting the file from other processes, and not having to undergo the costly setup and teardown of a file handle.
* AbortableProgressPromise - It is not clear how a developer would define the abort callback of an AbortableProgressPromise. It seems that the user agent would be responsible for setting the abort callback since it instantiates and returns the AbortableProgressPromise.

5. The FileHandleWritable Interface
* write() and flush() - It might be useful to have support for transacted streams where
Re: FileSystem API Comments
On 10/21/14 4:36 PM, Ali Alabbas wrote: Hello, I'm with the IE Platform team at Microsoft. We have a few comments on the latest editor's draft of the newly proposed FileSystem API [1]. I believe [1] is Arun's http://w3c.github.io/filesystem-api/Overview.html. 1.1 Use cases (3. Audio/Photo editor with offline access or local cache for speed) * Edited files should be accessible by other client-side applications - Having the sandboxed file system share its contents between all apps would allow apps to tamper with the files of another app. This could result in corrupted files and perhaps an invalid state for some apps that expect certain contents to exist in a file. This makes us wonder: should we warn users about files that are being opened and written to? If an app is just doing a read, can it open a file or directory without the user's permission, or could this pose a possible issue as well? Also, is the Quota Management API going to be a dependency? It's unclear what we would do with regards to requesting permission to access files. Will this spec be responsible for defining what questions/permission inquiries are presented and when they are presented to the user? For example, what happens when one file is locked for use by a different application? Is the user notified and given the option to open a read-only copy of that file? 3. The Directory Interface * Change events - I would like to revisit the discussion on apps getting notifications of changes to files/directories. There are many scenarios where an application would want to react to renames/moves of a file/directory. There would also be value in being notified of a change to a directory's structure. If an app has a file browser that allows a user to select files and/or directories and another app makes changes to the sandboxed filesystem, then it would be expected that the first app should be notified and would be able to refresh its directory tree. 
Otherwise it would require the user to somehow force a refresh which would not be a good user experience since the user would expect the file browser to update on its own. * removeDeep() move() - Do these support links or junctions? If not, what is the expected behavior? * enumerate() - It would be useful to have pre-filtering support for the following: file/directory, ranges, wildcard search. Currently we would be forced to enumerate the entire set and manually filter out the items we want from the result set. - Callers often know exactly whether or not they want to enumerate files or folders. For example, an image upload service may only be interested in the files present in a directory rather than all of its directories. Perhaps it would be useful to have enumerateFiles() and enumerateDirectories() for this purpose? Or we could have another argument for enumerate() that is an enum (directory, file). - Supporting optimized pagination of large directories. We could have arguments for a starting index and length we would be able to specify a range of items to retrieve within the result set. - Supporting the wildcard character to pre-filter a list of files/directories (e.g. *.jpg). 4. The FileHandle Interface * FileHandles - Is this basically going to be the first to get the handle gets to use it and all subsequent calls need to wait for the file handle to become available again? Are there more details about the locking model used here? * Auto-closing of FileHandles - This may cause confusion as it does not match the common developer mental model of a file handle which is “opened” and then available for use until it's “closed”. Perhaps it would be advantageous to have an explicit close function as part of the FileHandle interface? With the current behavior there can be overhead with the unintended closure of the FileHandle that would require a developer to continuously open/close a FileHandle. 
The currently defined behavior assumes that a developer is done with all their file manipulations when they have completed a promise chain. However, a developer may want to keep the FileHandle open to be used elsewhere at some other point in time that is not related to the current promise chain. An example of the usefulness of having an explicit close function is if you were to implement a word processor and wanted to lock down the file that it currently has open for the period of its editing. This way you are free to continue operating on that file for the duration that it is open, protecting the file from other processes, and not having to undergo the costly setup and teardown of a file handle. * AbortableProgressPromise - It is not clear how a developer would define the abort callback of an AbortableProgressPromise. It seems that the user agent would be responsible for setting the abort callback since it instantiates and returns the AbortableProgressPromise. 5. The FileHandleWritable Interface * write() flush()