RE: flush() | was Re: FileSystem API Comments

2014-11-06 Thread Ali Alabbas
Hi Arun,

We believe that the flush property should be specified when getting the file 
handle, as in option 1. One benefit of this is that it enables buffering of 
both reads and writes for the same handle. If, on the other hand, we specify it 
on every write operation (as in option 2), we could run into inconsistencies 
when the write method is invoked both with and without the flag.

Based on our interpretation of option 1, it seems as though the flush() 
function would not be available. This is how we envision its usage:

navigator.getFileSystem().then(function(root) {
  return root.openWrite("path/to/file.txt", {autoFlushing: true});
}).then(function(fileHandle) {
  // Data is written to the system cache and flushed to disk without delay.
  fileHandle.write(blob);
});
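
To make the difference between the two options concrete, here is a minimal
sketch that uses an in-memory stand-in for the disk. MockDisk,
openWritePerCall and the per-call "flush" flag are hypothetical names invented
for this illustration; only openWrite and autoFlushing appear in the
discussion, and neither is settled spec text.

```javascript
// Hypothetical in-memory model; not the draft FileSystem API.
class MockDisk {
  constructor() {
    this.cache = [];   // accepted by write() but not yet durable
    this.durable = []; // would survive a hard power-off
  }
  flush() {
    this.durable.push(...this.cache);
    this.cache = [];
  }
}

// Option 1: the flag is fixed once, when the handle is opened.
function openWrite(disk, options = {}) {
  const autoFlushing = options.autoFlushing === true;
  return {
    write(data) {
      disk.cache.push(data);
      if (autoFlushing) disk.flush();
    },
  };
}

// Option 2: the flag is passed on each write() call.
function openWritePerCall(disk) {
  return {
    write(data, options = {}) {
      disk.cache.push(data);
      if (options.flush === true) disk.flush();
    },
  };
}

const diskA = new MockDisk();
const handleA = openWrite(diskA, { autoFlushing: true });
handleA.write("a1");
handleA.write("a2");

const diskB = new MockDisk();
const handleB = openWritePerCall(diskB);
handleB.write("b1", { flush: true });
handleB.write("b2"); // flag omitted: this write stays in the cache
```

With option 1 every write on the handle behaves the same way. With option 2
the same handle ends up with a mix of durable and cached data, which is the
inconsistency described above.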

Thank you,
Ali


From: Arun Ranganathan [mailto:a...@mozilla.com] 
Sent: Friday, October 31, 2014 11:19 AM
To: Ali Alabbas
Cc: Web Applications Working Group WG
Subject: flush() | was Re: FileSystem API Comments







flush() | was Re: FileSystem API Comments

2014-10-31 Thread Arun Ranganathan
Greetings Ali!

I’ve been thinking about the discussion of flush(), and would like to see if I 
can make my previous statement a bit more nuanced. It turns out that flush() 
(in the vein of fsync/sync) is pretty useful, and after discussion with a few 
folks within Mozilla, I realize that it isn’t as simple as tacking it on to the 
“write-family” of Promises — as you point out, it is a potentially expensive 
operation.

Something like a flush feature might help the following use cases:

1. Creating a database technology on top of the filesystem technology. This 
might include IndexedDB, but also WebSQL (as a hypothetical example). Most 
transactional operations like this need the ability to do something like flush.

2. Then, there’s the use case of compiling C++ codebases to JS. Well-known 
examples of this are games, leveraging asm.js. In this genre of use case, 
sometimes a large database is brought over (e.g. sqlite). It could be memory 
backed, but it is a definite bonus if it could be filesystem backed. Something 
like flush helps make that a possibility.

Now the question is how to do this in a WebAPI, allowing for the power along 
with the mitigations that a web app might need, notably for performance? A few 
ideas below:

On Oct 21, 2014, at 4:36 PM, Ali Alabbas <a...@microsoft.com> wrote:

  * flush()
  - This is costly functionality to expose and is likely to be overused by 
 callers. It would be beneficial to automatically flush changes to disk by 
 allowing the default file write behavior by the OS. For example, on Windows, 
 we would leave it up to the filesystem cache to determine the best time to 
 flush to disk. This is non-deterministic from the app's point of view, but 
 the only time it is a potential problem is when there's a hard power-off. 
 Most apps should not be concerned with this; only apps that have very high 
 data reliability requirements would need the granular control of flushing to 
 disk. In those cases a developer should use IndexedDB. So we should consider 
 obscuring this functionality since it's not a common requirement and has a 
 performance impact if it's widely used.


I agree with the idea of obscuring the functionality a bit, especially given 
that it might not be necessary for a large class of operations. A few ways to 
do that:

1. Add this to a dictionary option when coining the FileHandleWritable from the 
Directory (e.g. add it to something like the OpenWriteOptions: 
http://w3c.github.io/filesystem-api/Overview.html#widl-Directory-openWrite-Promise-FileHandleWritable--DOMString-File-path-OpenWriteOptions-options).

This way, the developer has the ability to “coin” a “more expensive” promise, 
if that particular set of write operations needs this feature.

2. Add this to the set of options on the FileHandleWritable.

This could be by dictionary, again. Or, it could be a boolean on the 
FileHandleWritable’s write(). This latter might not be specific enough. Like 
other implementations, ours is not going to buffer anything, but rely on the 
underlying operating system’s buffer for writes and reads.

3. Stick with the idea of a method, like flush(). In this case, we might have 
to caveat the use of this, since the possibility of inexperienced developer 
misuse is high :-) It might help to see if we can determine some boundaries on 
this.

Any feedback on some of these options would be valuable. I am thinking of 1. 
and 2.
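
As a rough illustration of use case 1 above (a database layered on the
filesystem), the sketch below models why a transactional store wants a flush
point at commit time rather than on every write. ModelFile, commitTransaction
and recover are hypothetical names for this example only; the model is
in-memory and does not reflect any implementation.

```javascript
// Hypothetical model: write() lands in a volatile cache, flush() makes the
// cache durable, and crash() simulates a hard power-off by dropping the cache.
class ModelFile {
  constructor() {
    this.cache = [];
    this.durable = [];
    this.flushCount = 0;
  }
  write(record) {
    this.cache.push(record);
  }
  flush() {
    this.durable.push(...this.cache);
    this.cache = [];
    this.flushCount += 1;
  }
  crash() {
    this.cache = [];
  }
}

// A transaction appends its records and a commit marker, then flushes once.
function commitTransaction(file, id, records) {
  for (const value of records) file.write({ tx: id, value });
  file.write({ tx: id, commit: true });
  file.flush(); // one flush per transaction, not one per write
}

// Recovery keeps only records whose transaction has a durable commit marker.
function recover(file) {
  const committed = new Set(
    file.durable.filter((r) => r.commit).map((r) => r.tx)
  );
  return file.durable
    .filter((r) => !r.commit && committed.has(r.tx))
    .map((r) => r.value);
}

const log = new ModelFile();
commitTransaction(log, 1, ["a", "b"]);
log.write({ tx: 2, value: "c" }); // transaction 2 never commits
log.crash();
const recovered = recover(log);
```

In this model the expensive operation happens once per transaction, which is
the kind of boundary option 3 would have to encourage.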

— A*





RE: FileSystem API Comments

2014-10-22 Thread Jonathan Bond-Caron
On Tue Oct 21 09:36 PM, Jonas Sicking wrote:
  1.1 Use cases (3. Audio/Photo editor with offline access or local
  cache for speed)

  * Edited files should be accessible by other client-side applications

  - Having the sandboxed file system share its contents between all
  apps would allow apps to tamper with the files of another app. This
  could result in corrupted files and perhaps an invalid state for some
  apps that expect certain contents to exist in a file. This makes us
  wonder: should we warn users about files that are being opened and
  written to?
 
 Each origin has a separate sandboxed filesystem. There is no way for websites
 to read each other's filesystems. This is no different from IndexedDB or
 localStorage. This also means that we have the same prompting behavior, the
 same Quota Management dependency and the same security model as
 IndexedDB and localStorage.
 

That contradicts:
- Edited files should be accessible by other client-side applications

The API should allow for editing a 'shared folder' which multiple applications 
/ web apps can access.
That implies a sort of locking/unlocking API:

e.g.
photo editor
fs = api.getFileSystem({shareName: 'photos'}).then((dir) => {
  dir.openWrite('pic.jpeg');
});

super photo viewer
fs = api.getFileSystem({shareName: 'photos'}).then((dir) => {
  dir.openRead('pic.jpeg');
});

What happens with the pic.jpeg?
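
One possible answer, sketched under the assumption of a simple "many readers
or one writer" rule per file. ShareLocks and its tryOpen/close methods are
hypothetical names; nothing like them exists in the draft.

```javascript
// Hypothetical lock table for a shared folder.
class ShareLocks {
  constructor() {
    this.locks = new Map(); // file name -> { readers, writer }
  }
  entry(name) {
    if (!this.locks.has(name)) {
      this.locks.set(name, { readers: 0, writer: false });
    }
    return this.locks.get(name);
  }
  // A read succeeds unless a writer currently holds the file.
  tryOpenRead(name) {
    const e = this.entry(name);
    if (e.writer) return false;
    e.readers += 1;
    return true;
  }
  // A write succeeds only when nobody else has the file open.
  tryOpenWrite(name) {
    const e = this.entry(name);
    if (e.writer || e.readers > 0) return false;
    e.writer = true;
    return true;
  }
  closeRead(name) {
    const e = this.entry(name);
    if (e.readers > 0) e.readers -= 1;
  }
  closeWrite(name) {
    this.entry(name).writer = false;
  }
}

const photos = new ShareLocks();
const editorGotLock = photos.tryOpenWrite("pic.jpeg");     // photo editor
const viewerWhileEditing = photos.tryOpenRead("pic.jpeg"); // super photo viewer
photos.closeWrite("pic.jpeg");
const viewerAfterClose = photos.tryOpenRead("pic.jpeg");
```

Under this rule the viewer is refused while the editor holds the file and
succeeds once the editor closes it. Whether a refused open should fail, wait,
or fall back to a read-only copy is exactly the open question.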




Re: FileSystem API Comments

2014-10-22 Thread David Rajchenbach-Teller
I don't see a contradiction.
Each *web* app sees only files accessible from its domain (so your two
apps have distinct pic.jpeg).
Each *native* app has access to whatever the operating system says.
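
A minimal sketch of that per-origin model, assuming the user agent keeps one
private file map per origin. The getFileSystem helper and the example.* origins
are made up for the illustration.

```javascript
// Hypothetical model: each origin gets its own private file map.
const sandboxes = new Map();

function getFileSystem(origin) {
  if (!sandboxes.has(origin)) sandboxes.set(origin, new Map());
  return sandboxes.get(origin);
}

const editorFs = getFileSystem("https://editor.example");
const viewerFs = getFileSystem("https://viewer.example");

editorFs.set("pic.jpeg", "edited bytes");

// The viewer's sandbox is a different map, so it has no pic.jpeg at all.
const viewerSeesFile = viewerFs.has("pic.jpeg");
const editorSeesFile = getFileSystem("https://editor.example").has("pic.jpeg");
```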

Or am I missing something in your message?

Cheers,
 David



-- 
David Rajchenbach-Teller, PhD
 Performance Team, Mozilla





Re: FileSystem API Comments

2014-10-22 Thread chaals


22.10.2014, 12:32, David Rajchenbach-Teller <dtel...@mozilla.com>:
 I don't see a contradiction.
 Each *web* app sees only files accessible from its domain (so your two
 apps have distinct pic.jpeg).
 Each *native* app has access to whatever the operating system says.

There are a lot of use cases for sharing data with apps of *different* origins, 
although there is of course a more complex security story than when everything 
goes into a potentially opaque sandbox. (And to make the basic security story 
work it makes sense to have some level of opacity in the sandbox).

The lack of a mechanism to do so is a huge difference with native - I have 
directories in my filesystem that are autosynched to things online, but are 
also visible.

The idea behind web intents/activities/etc generalises obviously to remove the 
distinction between web and native - I should be able to use a web-based image 
manipulation tool on stuff in my filesystem. Or several.

At the moment that can be done in a somewhat hacky way by uploading files, 
manipulating them, then asking the user to save them back. But whereas I have 
mail clients that store each email message on the filesystem, so I can import 
stuff into a different program myself instead of having to go through a service 
provider, that doesn't work for web-based email systems even when those are 
designed to be functional offline.

etc etc.

cheers

Chaals


--
Charles McCathie Nevile - web standards - CTO Office, Yandex
cha...@yandex-team.ru - - - Find more at http://yandex.com



Re: FileSystem API Comments

2014-10-22 Thread Arun Ranganathan
Ali,

First, thanks for your timely comments :) I’m in the process of editing the 
FileSystem API.

Responses inline:

On Oct 21, 2014, at 4:36 PM, Ali Alabbas <a...@microsoft.com> wrote:

  
 1.1 Use cases (3. Audio/Photo editor with offline access or local cache for 
 speed)
  
   * Edited files should be accessible by other client-side applications
  - Having the sandboxed file system share its contents between all apps 
 would allow apps to tamper with the files of another app. 


<snip/>

Admittedly, these use cases have been borrowed from the “File API: Directories 
and System” specification (which is now a W3C Note), at least for the purpose 
of providing equivalent functionality. In practice, everything you’ve pointed 
out makes it a hard problem to solve. 

The per-origin sandbox model also raises file lock issues on multiple access, 
but they are probably easier to solve, and not as prevalent.

Also, I think we’re going to forgo the “temporary” and “persistent” distinctions 
that are in the draft. And while there’s a technical dependency on Quota 
Manager, I don’t think there’s a spec. dependency in terms of API. Of course, 
certain Directory operations may reject a promise with a quota error.


  
  
 3. The Directory Interface
  
   * Change events
  - I would like to revisit the discussion on apps getting notifications 
 of changes to files/directories. 


This is a good point; right now there’s no way to do this. I’m open to 
suggestions. An early scratch pad version of my spec. changes proposed 
Directory as an EventTarget also, but… this won’t work for a variety of reasons.



  
   * removeDeep() & move()
  - Do these support links or junctions? If not, what is the expected 
 behavior?


No; but the entire API doesn’t support these right now.


  
   * enumerate()
  - It would be useful to have pre-filtering support for the following: 
 file/directory, ranges, wildcard search. Currently we would be forced to 
 enumerate the entire set and manually filter out the items we want from the 
 result set.


I completely agree this would be useful, but there’s a problem to solve even 
before we get there! Right now, we say that we’ll fulfill the enumerate 
promise with something called “EventStream” which was initially a Tab Atkins 
proposal, and which would be really useful to get right. It’s underspecified 
right now, but I’m a fan of it ;-)

We’ll have to think about how to return wildcard searches, etc., and how to 
annotate results in the result set.


  
 4. The FileHandle Interface
  
   * FileHandles
  - Is this basically going to be the first to get the handle gets to use 
 it and all subsequent calls need to wait for the file handle to become 
 available again? Are there more details about the locking model used here?


Yes; essentially the “first invocation” uses then releases it.

A version of this problem was encountered when specifying FileReader 
(http://dev.w3.org/2006/webapi/FileAPI/#dfn-filereader) which used the internal 
state (but also accessible to the developer) “LOADING” to prevent multiple 
concurrent reads.
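
A small sketch of the "first invocation uses it, later callers wait" behavior
described above. HandleQueue is a hypothetical name and the callback style is
chosen only to keep the example synchronous; the draft itself is
promise-based.

```javascript
// Hypothetical model: one caller holds the handle, the rest wait in order.
class HandleQueue {
  constructor() {
    this.busy = false;
    this.waiting = [];
  }
  // Run task(release) immediately if the handle is free, otherwise queue it.
  acquire(task) {
    if (this.busy) {
      this.waiting.push(task);
      return;
    }
    this.busy = true;
    task(() => this.release());
  }
  // Hand the handle to the next waiting caller, or mark it free.
  release() {
    const next = this.waiting.shift();
    if (next) next(() => this.release());
    else this.busy = false;
  }
}

const order = [];
const queue = new HandleQueue();
let releaseFirst;

queue.acquire((release) => {
  order.push("first starts");
  releaseFirst = release; // the first caller keeps the handle for now
});
queue.acquire((release) => {
  order.push("second starts");
  release();
});
order.push("second is waiting");
releaseFirst();
```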


  
   * Auto-closing of FileHandles
  - This may cause confusion as it does not match the common developer 
 mental model of a file handle which is “opened” and then available for use 
 until it's “closed”. Perhaps it would be advantageous to have an explicit 
 close function as part of the FileHandle interface? 


There are pros and cons either way. I’d be interested in solving this for the 
lion’s share of use cases. I’m not strongly opinionated on the matter of an 
explicit close function (we have one on Blob, for example), but it seems even 
this has drawbacks.


   * AbortableProgressPromise
  - It is not clear how a developer would define the abort callback of an 
 AbortableProgressPromise. It seems that the user agent would be responsible 
 for setting the abort callback since it instantiates and returns the 
 AbortableProgressPromise.


We are not going to use an AbortableProgressPromise, but we will probably have 
a new beast called CancelablePromise.


  
  
 5. The FileHandleWritable Interface
  
   * write() & flush()
  - It might be useful to have support for “transacted” streams where the 
 caller can write to a copy of the file and then have it atomically replaced: 
 swap the old file with the new one and then delete the old file. 


Agreed.
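
For comparison, this is roughly how the "write a copy, then swap" pattern looks
today in Node.js, outside the browser. It is not the proposed web API:
replaceFileAtomically is a hypothetical helper, and the sketch relies on
rename() replacing the target in a single step, so the "delete the old file"
part is implicit.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Write the new contents to a sibling temporary file, flush it, then rename it
// over the target. On POSIX systems rename() replaces the target atomically
// when both paths are on the same filesystem.
function replaceFileAtomically(targetPath, contents) {
  const tempPath = targetPath + ".tmp";
  const fd = fs.openSync(tempPath, "w");
  try {
    fs.writeSync(fd, contents);
    fs.fsyncSync(fd); // flush the copy before it replaces the original
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tempPath, targetPath);
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "transacted-"));
const target = path.join(dir, "document.txt");
fs.writeFileSync(target, "old contents");

replaceFileAtomically(target, "new contents");

const finalContents = fs.readFileSync(target, "utf8");
const leftovers = fs.readdirSync(dir);
```

A reader that opens the target at any point sees either the old contents or
the new contents, never a partial write.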


  
   * flush()
  - This is costly functionality to expose and is likely to be overused by 
 callers. 


Agreed — let’s flush flush().


 6. FileSystem Configuration Parameters
  
   * Dictionary DestinationDict
  - The DestinationDict seems to exist to facilitate the renaming of a 
 directory in conjunction with a move. However, the same operation is done 
 differently for files which makes the functionality non-uniform. Perhaps we 
 can add a rename() function to make it more intuitive?
  


I’ll commit to sample code and more “spec text” to make this clearer in my next 

FileSystem API Comments

2014-10-21 Thread Ali Alabbas
Hello,

I'm with the IE Platform team at Microsoft. We have a few comments on the 
latest editor's draft of the newly proposed FileSystem API [1].

1.1 Use cases (3. Audio/Photo editor with offline access or local cache for 
speed)

  * Edited files should be accessible by other client-side applications
 - Having the sandboxed file system share its contents between all apps 
would allow apps to tamper with the files of another app. This could result in 
corrupted files and perhaps an invalid state for some apps that expect certain 
contents to exist in a file. This makes us wonder: should we warn users about 
files that are being opened and written to? If an app is just doing a read, can 
it open a file or directory without the user's permission, or could this pose a 
possible issue as well? Also, is the Quota Management API going to be a 
dependency? It's unclear what we would do with regards to requesting permission 
to access files. Will this spec be responsible for defining what 
questions/permission inquiries are presented and when they are presented to the 
user? For example, what happens when one file is locked for use by a different 
application? Is the user notified and given the option to open a read-only copy 
of that file?


3. The Directory Interface

  * Change events
 - I would like to revisit the discussion on apps getting notifications of 
changes to files/directories. There are many scenarios where an application 
would want to react to renames/moves of a file/directory. There would also be 
value in being notified of a change to a directory's structure. If an app has a 
file browser that allows a user to select files and/or directories and another 
app makes changes to the sandboxed filesystem, then it would be expected that 
the first app should be notified and would be able to refresh its directory 
tree. Otherwise it would require the user to somehow force a refresh which 
would not be a good user experience since the user would expect the file 
browser to update on its own.
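
Without change notifications, an app is left with polling: take a directory
listing, take another one later, and compare. The sketch below shows only that
comparison step, as a fallback and not as the notification mechanism being
requested. diffSnapshots is a hypothetical helper that works on plain arrays
of entry names.

```javascript
// Compare two directory snapshots (arrays of entry names) and report changes.
function diffSnapshots(before, after) {
  const beforeSet = new Set(before);
  const afterSet = new Set(after);
  return {
    added: after.filter((name) => !beforeSet.has(name)),
    removed: before.filter((name) => !afterSet.has(name)),
  };
}

const changes = diffSnapshots(["a.txt", "b.txt"], ["b.txt", "c.txt"]);
```

Note that a rename shows up here as one removal plus one addition, which is
part of why real change events would be more useful than polling.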

  * removeDeep() & move()
 - Do these support links or junctions? If not, what is the expected 
behavior?

  * enumerate()
 - It would be useful to have pre-filtering support for the following: 
file/directory, ranges, wildcard search. Currently we would be forced to 
enumerate the entire set and manually filter out the items we want from the 
result set.
 - Callers often know exactly whether or not they want to enumerate 
files or folders. For example, an image upload service may only be interested 
in the files present in a directory rather than all of its directories. Perhaps 
it would be useful to have enumerateFiles() and enumerateDirectories() for this 
purpose? Or we could have another argument for enumerate() that is an enum 
(directory, file).
 - Supporting optimized pagination of large directories. With arguments for a 
starting index and a length, we would be able to specify a range of items to 
retrieve within the result set.
 - Supporting the wildcard character to pre-filter a list of 
files/directories (e.g. *.jpg).
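
The three kinds of pre-filtering above could be combined in a single options
argument. The sketch below runs over a plain array instead of a real
directory. The option names (kind, pattern, start, length) and the
wildcardToRegExp helper are assumptions made for this example, not names from
the draft.

```javascript
// Convert a simple wildcard pattern (only "*" is special) into a RegExp.
function wildcardToRegExp(pattern) {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + escaped + "$");
}

// Hypothetical options: kind ("file" | "directory"), pattern, start, length.
function enumerate(entries, options = {}) {
  const { kind, pattern, start = 0, length = Infinity } = options;
  const matcher = pattern ? wildcardToRegExp(pattern) : null;
  const filtered = entries.filter(
    (e) => (!kind || e.kind === kind) && (!matcher || matcher.test(e.name))
  );
  return filtered.slice(start, start + length);
}

const entries = [
  { name: "a.jpg", kind: "file" },
  { name: "notes.txt", kind: "file" },
  { name: "b.jpg", kind: "file" },
  { name: "albums", kind: "directory" },
  { name: "c.jpg", kind: "file" },
];

// Files only, matching *.jpg, skipping the first match and taking two.
const secondPageOfJpegs = enumerate(entries, {
  kind: "file",
  pattern: "*.jpg",
  start: 1,
  length: 2,
}).map((e) => e.name);
```

In a real implementation the point would be to apply these filters before the
result set is built, so the caller never pays for entries it does not want.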

4. The FileHandle Interface

  * FileHandles
 - Is this basically a model where the first caller to get the handle gets to 
use it, and all subsequent calls need to wait for the file handle to become 
available again? Are there more details about the locking model used here?

  * Auto-closing of FileHandles
 - This may cause confusion as it does not match the common developer 
mental model of a file handle which is "opened" and then available for use 
until it's "closed". Perhaps it would be advantageous to have an explicit close 
function as part of the FileHandle interface? With the current behavior there 
can be overhead with the unintended closure of the FileHandle that would 
require a developer to continuously open/close a FileHandle. The currently 
defined behavior assumes that a developer is done with all their file 
manipulations when they have completed a promise chain. However, a developer 
may want to keep the FileHandle open to be used elsewhere at some other point 
in time that is not related to the current promise chain. An example of the 
usefulness of having an explicit close function is if you were to implement a 
word processor and wanted to lock down the file that it currently has open for 
the period of its editing. This way you are free to continue operating on that 
file for the duration that it is open, protecting the file from other 
processes, and not having to undergo the costly setup and teardown of a file 
handle.

  * AbortableProgressPromise
 - It is not clear how a developer would define the abort callback of an 
AbortableProgressPromise. It seems that the user agent would be responsible for 
setting the abort callback since it instantiates and returns the 
AbortableProgressPromise.


5. The FileHandleWritable Interface

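  * write() & flush()
 - It might be useful to have support for "transacted" streams where the 
caller can write to a copy of the file and then have it atomically replaced: 
swap the old file with the new one and then delete the old file.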

Re: FileSystem API Comments

2014-10-21 Thread Arthur Barstow

On 10/21/14 4:36 PM, Ali Alabbas wrote:


Hello,

I'm with the IE Platform team at Microsoft. We have a few comments on 
the latest editor's draft of the newly proposed FileSystem API [1].





I believe [1] is Arun's http://w3c.github.io/filesystem-api/Overview.html.
