Re: File API: Blob and underlying file changes.

2010-02-01 Thread Dmitry Titov
Going back a bit to the current spec and changing underlying files - here is an
update on our thinking (and current implementation plan). We played with the
File/Blob ideas a little more and talked with some of our app developers. In
regard to the problem of a changing file, most folks feel the Blob is best
thought of as a 'snapshot of a byte range' with a delayed promise to deliver
the actual bytes in that range from the underlying data storage. It is a
'delayed promise' because all the actual 'reading' methods are async.
Basically, in terms of implementation, the Blob is not a 'container of
bytes' but rather a 'reference' to a byte range.

As such, the async read operations may later fail, for many reasons - the
file can be deleted, renamed, modified, etc. It seems developers sometimes
want to be oblivious to those problems, while in other scenarios they want to
handle them. Basically, it's an app-specific choice. It appears that the
following implementation goes along with the current edition of the spec but
also provides the ability to detect a file change:

1. File derives from Blob, so there is a File.size that performs synchronous
file I/O. Not ideal, but easy to use and compatible with current forms
upload.
2. File.slice() also does a synchronous IO and captures the current size and
modification time of the underlying file - and caches it in the resulting
Blob.
3. Subsequent Blob.slice() and Blob.size calls do not do any file IO, but
merely operate on cached values. So the only Blob methods that do sync IO
are those on the File object. Subsequent slicing operates on the file
information captured from File and propagates it to derived Blobs.
4. In xhr.send() and FileReader, if the UA discovers that the underlying
file has changed, it behaves just like when other file errors are discovered
- returning an 'error' progress event and setting the FileReader.error
attribute, for example. We might need another FileError code for that if the
existing ones do not feel adequate.

This way, folks who don't care about changing files could simply ignore
the error results - because they likely do not worry about other errors
either (such as NOT_FOUND_ERR). At the same time, folks who do worry about
such things could simply handle the errors already specified. It also doesn't
add new exceptions to the picture, so no special code is needed in simple
cases.
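
To make the contract concrete, here is a minimal sketch of a chunked reader
under this scheme. It assumes the draft's Blob.slice(offset, length)
signature and an <input id='f' type='file'> element; only the onerror
handler is specific to the changed-file case.

  var file = document.getElementById('f').files[0];
  var blob = file.slice(0, file.size); // sync IO here: captures size + mtime
  var CHUNK = 1024 * 1024;
  var offset = 0;

  function readNextChunk() {
    if (offset >= blob.size) return;   // cached value, no file IO
    var reader = new FileReader();
    reader.onload = function() {
      offset += CHUNK;
      readNextChunk();
    };
    reader.onerror = function() {
      // Fires if the file was deleted, renamed or modified since slice() -
      // the same path as NOT_FOUND_ERR and other read errors.
      alert('upload aborted, error code ' + reader.error.code);
    };
    reader.readAsBinaryString(blob.slice(offset, CHUNK)); // no IO on slice
  }
  readNextChunk();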

One obvious difficulty here is the synchronous file IO on File.size and
File.slice(). Trying to eliminate it requires some API complexity that is
not obviously better. It either leads to some strange APIs, like a getSize()
with a callback that delivers the size, and/or breaks the behavior of the
currently implemented File (and most developers' expectations). In any case,
an attempt to completely avoid sync IO and preserve correctness seems to be
calling for a far more involved API. Considering that most uploaders which
slice the file and send it in pieces will likely do it in a worker thread,
sync IO in these places is perhaps a lesser evil than a complicated (or dual)
API...

Thanks,
Dmitry

On Wed, Jan 27, 2010 at 4:40 AM, Juan Lanus juan.la...@gmail.com wrote:

 On Wed, Jan 27, 2010 at 01:16, Robert O'Callahan rob...@ocallahan.org
 wrote:
  On Wed, Jan 27, 2010 at 5:38 AM, Juan Lanus juan.la...@gmail.com
 wrote:
 
  Quite right Bob. But still the lock is the way to go. At least as of
  today.
 
  HTML5 might be mainstream for the next 10 years, starting rather soon.
 
  In the meanwhile OSs will also evolve, in a way that we can't tell
  now. But if there are common issues, like this one, somebody will come
  up with a smart solution maybe soon.
  For example feeding an image of the file as of the instant it was
  opened (like relational databases do to provide stable queries) by
  keeping a temporary map to the original disk segments that comprised
  the file before it was changed.
  For example Apple is encouraging advisory locks
 
 
 http://developer.apple.com/mac/library/technotes/tn/tn2037.html#OSSolutions
  asking developers to design in an environment-aware mood.
 
  In my experience, almost no code uses advisory locking unless it is being
  explicitly designed for some kind of concurrent usage, i.e., Apple's advice
  is not being followed. If that's not going to suddenly change --- and I see
  no evidence it will --- then asking the UA to apply a mandatory lock is
  asking the UA to do something impossible, which is generally not a good
  idea.
  Rob

 Right, not talking about locks any more because it would be telling
 HOW the UA should do it, and what is best for the UA developers is to
 be told WHAT to do.
 Not writing a tutorial but a specification. Let the developer find out
 how to do it, this year, and with the tools that will be available by
 2020.

 Now, out of the locks subject, what I want to be sure of is that the
 specification does not specify the mutating blob, the origin of this
 thread.
 --
 Juan


  He was pierced for our transgressions, he was crushed 

Re: File API: Blob and underlying file changes.

2010-02-01 Thread Jonas Sicking
On Mon, Feb 1, 2010 at 12:27 PM, Dmitry Titov dim...@chromium.org wrote:
 Basically, it's an app-specific choice. It appears that the
 following implementation goes along with the current edition of the spec but
 also provides the ability to detect the file change:
 1. File derives from Blob, so there is a File.size that performs synchronous
 file I/O. Not ideal, but easy to use and compatible with current forms
 upload.
 2. File.slice() also does a synchronous IO and captures the current size and
 modification time of the underlying file - and caches it in the resulting
 Blob.

Note that the sync IO is not required by the spec. You can just cache
the file size when the File object is created, which always happens
asynchronously. Then use that cached value through all calls to
File.size and Blob.slice().
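
In UA-internal terms the bookkeeping could be as simple as the following
illustrative pseudo-JS (all names here are invented for the sketch):

  // stat() was taken asynchronously, when the user picked the file;
  // every later size/slice call just reads the cached values.
  function makeBlob(path, offset, length, mtime) {
    return {
      size: length, // cached; property access never re-stats the file
      slice: function(o, l) { return makeBlob(path, offset + o, l, mtime); }
    };
  }
  function makeFileObject(path, stat) {
    var f = makeBlob(path, 0, stat.size, stat.mtime);
    f.name = path;  // File adds name etc.; File.size is just the cache
    return f;
  }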

 4. In xhr.send() and FileReader, if the UA discovers that the underlying
 file has changed, it behaves just like when other file errors are discovered
 - returning an 'error' progress event and setting the FileReader.error
 attribute, for example. We might need another FileError code for that if the
 existing ones do not feel adequate.

This is definitely an interesting idea, possibly even something that
we should standardize. I don't really feel strongly either way, though
I am curious about platform support if the file lives on NFS or Samba
or some such.

 One obvious difficulty here is the synchronous file IO on File.size and
 File.slice(). Trying to eliminate it requires some API complexity that is
 not obviously better.

See above.

/ Jonas



Re: File API: Blob and underlying file changes.

2010-01-26 Thread Juan Lanus
On Sun, Jan 24, 2010 at 8:04 AM, Juan Lanus juan.la...@gmail.com wrote:

 ** Locking
 What's wrong with file locking?


 Rob O'Callahan answered that:
 One problem is that mandatory locking is not supported on Mac or most Linux
 installs.

Quite right Bob. But still the lock is the way to go. At least as of today.

HTML5 might be mainstream for the next 10 years, starting rather soon.

In the meanwhile, OSs will also evolve, in ways that we can't foresee
now. But if there are common issues, like this one, somebody will come
up with a smart solution, maybe soon.
For example, feeding an image of the file as of the instant it was
opened (like relational databases do to provide stable queries) by
keeping a temporary map to the original disk segments that comprised
the file before it was changed.
For example, Apple is encouraging advisory locks
http://developer.apple.com/mac/library/technotes/tn/tn2037.html#OSSolutions
asking developers to design in an environment-aware mode.

Maybe, now that I think about it a bit more, specifying that the UA
should get a lock is telling it HOW to do things, while use-case practice
teaches us that at the requirements level one should say WHAT to do.

What if the specification only said that the UA has to do its best to
get an integral copy of the input file, or else, after doing whatever
it can, it MUST raise an error?
This would leave headroom for the UA designers, and it is also what the
specification says now, isn't it? I got scared by the mutating blob
solution.
--
Juan Lanus



Re: File API: Blob and underlying file changes.

2010-01-26 Thread Robert O'Callahan
On Wed, Jan 27, 2010 at 5:38 AM, Juan Lanus juan.la...@gmail.com wrote:

 Quite right Bob. But still the lock is the way to go. At least as of today.

 HTML5 might be mainstream for the next 10 years, starting rather soon.

 In the meanwhile OSs will also evolve, in a way that we can't tell
 now. But if there are common issues, like this one, somebody will come
 up with a smart solution maybe soon.
 For example feeding an image of the file as of the instant it was
 opened (like relational databases do to provide stable queries) by
 keeping a temporary map to the original disk segments that comprised
 the file before it was changed.
 For example Apple is encouraging advisory locks
 http://developer.apple.com/mac/library/technotes/tn/tn2037.html#OSSolutions
 asking developers to design in an environment-aware mood.


In my experience, almost no code uses advisory locking unless it is being
explicitly designed for some kind of concurrent usage, i.e., Apple's advice
is not being followed. If that's not going to suddenly change --- and I see
no evidence it will --- then asking the UA to apply a mandatory lock is
asking the UA to do something impossible, which is generally not a good
idea.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: File API: Blob and underlying file changes.

2010-01-24 Thread Robert O'Callahan
On Sun, Jan 24, 2010 at 8:04 AM, Juan Lanus juan.la...@gmail.com wrote:

 ** Locking
 What's wrong with file locking?


One problem is that mandatory locking is not supported on Mac or most Linux
installs.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: File API: Blob and underlying file changes.

2010-01-23 Thread Juan Lanus
I'm new to this list and to all the W3C work, so I might be completely
wrong. That said, here goes.

Dmitry posed a simple question: should a file's blob be kept in
sync with the file's content on disk, or not? He did not get a yes
or no answer, but instead triggered a nearly 30-post thread that, as I
see it, denotes a certain lack of definition so far.
This is what I think, after having read only the draft and this thread:

** The mutating blob
The idea of keeping the disk file in sync with its working version,
the mutating blob, strikes me as too risky and impractical. IMO doing so
will raise a lot of issues while solving none. What is the scenario that
calls for such a feature? I can't see any, but I can see lots of
scenarios where data stability is desirable.
For example, a disk file holding the data of an active relational
database. The scenario is uploading a big file where possibly many
concurrent applications introduce changes anywhere in the file, every
few seconds. I know that this example is contrived, but there might be
many others with similar characteristics, albeit not so clear and
dramatic. In this scenario the UA might be kept completely busy trying
to keep current with the changes, as during a DoS attack.
Another requirement for a database file is that it has to be
consistent, so sending a slice of one version lumped with a slice of a
later version is unacceptable.
If, and only if, there is an unavoidable requirement for such a
feature, then I strongly suggest that the API specify a flag
informing the application that the original file changed during the
operation, but without doing anything else.
Let the developer decide if she wants to take any action, instead of
trying in advance to solve for her a problem that might not exist. In one
post Dmitry says that he found out that developers expect Blob to be
a 'snapshot'. This is the way to go: talking with developers and also
with software architects who already solved issues like this years
ago.

** Locking
What's wrong with file locking? Maybe it was discussed in prior
threads I didn't read, because it seems to have been discarded already.
But locking is the universally accepted solution in multitasking
operating systems. The API should lock the files to prevent them from
being written by other applications, whether for a short while or for a
long time. It is a must, to make the read atomic (atomicity is not merely
desirable but a must):
1. The UA SHOULD lock the file (a mandatory lock preventing writes
   by other apps) and open it.
   1a. The file refuses to be locked:
       1a1. The operation fails with a file is locked error.
       1a2. The use case fails.
2. The UA uses the file.
3. The UA unlocks the file by issuing a close method.
For small files this does not make a difference. But what happens if
the file is huge? In this case, leave the problem to the developer, the
one who knows about the environment and the particular requirements.
For example, the developer could choose to swiftly copy the file into a
blob and close it to release the brief lock if it is a busy file
(database ...), or to keep it locked during a lengthy transfer
operation if the file content is static (video, or backup ...).
It is not possible to solve all the developers' issues at this point;
we can only provide tools, the simpler the better, for the developers
to leverage.
For very special cases there might be an option locking=no to open a
file while allowing other applications to change it.
Intuitively I perceive this as a security crack. Such a file could
become a communication area between the computer contents and the web.
A trojan could repeatedly paste information into the file for the UA to
send to the bad guy's server.
This could be achieved by setting a trojan listener in the OS to
detect when the user selects a file.
As I see it, when the user allows the UA to grab a file, she means
what the file contains right now, and we MUST not deceive her.

** Avoid involving technology limitations in the design
The File API is sort of an impedance adapter between the latency of
Internet connections and the speed of disk drives (disks or
whatever; think of the future).
As such, it must be able to handle any speed difference. In the future
the speed difference might even change its sign.
Also, the API must consider that what today is regarded as big might
be regular in the future and small after a while. For example,
making a memory copy of a 300MB file is possible today, but was not when
computers, even mainframes, sported a few MB of RAM.
The virtual memory that most OSs have is an existing implementation
of an in-memory file backed by disk storage. This issue has been
solved since the seventies. A program, like the UA, can pump lots of
data into RAM and the OS will use the disk to store the bytes in case
of a shortage of real RAM. This way computers, like PCs, appear to
have twice as much RAM as they have physically installed, at the cost
of some performance loss that is completely compatible with Internet
Re: File API: Blob and underlying file changes.

2010-01-21 Thread Jian Li
Treating blobs as snapshots sounds like a reasonable approach, and it will
make life easier for the chunked upload and other scenarios. Now the
problem is: how do we get the blob (snapshot) out of the file?

1) We can still keep the current relationship between File and Blob. When we
slice a file by calling File.slice, a new blob that captures the current
file size and modification time is returned. The following Blob operations,
like slice, will simply inherit the cached size and modification time. When
we access the underlying file data in XHR.send() or FileReader, the
modification time will be verified and an exception could be thrown.

2) We can remove the inheritance of Blob from File and introduce
File.getAsBlob() as dimich suggested. This seems to be more elegant.
However, it requires changing the File API spec a lot.
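
For concreteness, this is how the two options would look to page script
(getAsBlob() is the proposed method, not existing API):

  var file = myInput.files[0];          // some <input type='file'>

  // Option 1: File.slice() itself takes the snapshot (one sync stat).
  var blob = file.slice(0, file.size);  // captures size + modification time
  var piece1 = blob.slice(0, 1000);     // inherits cached values, no IO

  // Option 2 (hypothetical): snapshotting is an explicit, separate step,
  // and File no longer derives from Blob.
  var snapshot = file.getAsBlob();      // proposed method
  var piece2 = snapshot.slice(0, 1000); // same inheritance of cached state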


On Wed, Jan 20, 2010 at 3:44 PM, Eric Uhrhane er...@google.com wrote:

 On Wed, Jan 20, 2010 at 3:23 PM, Dmitry Titov dim...@chromium.org wrote:
  On Wed, Jan 20, 2010 at 2:30 PM, Eric Uhrhane er...@google.com wrote:
 
  I think it could.  Here's a third option:
  Make all blobs, file-based or not, just as async as the blobs in
  option 2.  They never do sync IO, but could potentially fail future
  read operations if their metadata is out of date [e.g. reading beyond
  EOF].  However, expose the modification time on File via an async
  method and allow the user to pass it in to a read call to enforce
  fail if changed since this time.  This keeps all file accesses
  async, but still allows for chunked uploads without mixing files
  accidentally.  If we allow users to refresh the modification time
  asynchronously, it also allows for adding a file to a form, changing
  the file on disk, and then uploading the new file.  The user would
  look up the mod time when starting the upload, rather than when the
  file's selected.
 
  It would be great to avoid sync file I/O on calls like Blob.size. They
 would
  simply return cached value. Actual mismatch would be detected during
 actual
  read operation.
  However then I'm not sure how to keep File derived from Blob, since:
  1) Currently, in FF and WebKit File.fileSize is a sync I/O that returns
  current file size. The current spec says File is derived from Blob and
 Blob
  has Blob.size property that is likely going to co-exist with
 File.fileSize
  for a while, for compat reasons. It's weird for file.size and
 file.fileSize
  to return different things.

 True, but we'd probably want to deprecate file.fileSize anyway and
 then get rid of it, since it's synchronous.

  2) Currently, xhr.send(file) does not fail and sends the version of the
 file
  that exists somewhere around xhr.send(file) call was issued. Since File
 is
  also a Blob, xhr.send(blob) would behave the same which means if we want
 to
  preserve this behavior the Blob can not fail async read operation if file
  has changed.
  There is a contradiction here. One way to resolve it would be to break
 File
  is Blob and to be able to capture the File as Blob by having
  file.getAsBlob(). The latter would make a snapshot of the state of the
 file,
  to be able to fail subsequent async read operations if the file has been
  changed.
  I've asked a few people around in a non-scientific poll and it seems
  developers expect Blob to be a 'snapshot', reflecting the state of the
 file
  (or Canvas if we get Canvas.getBlob(...)) at the moment of Blob creation.
  Since it's obviously bad to actually copy data, it seems acceptable to
  capture enough information (like mod time) so the read operations later
 can
  fail if underlying storage has been changed. It feels really strange if
  reading the Blob can yield some data from one version of a file (or
 Canvas)
  mixed with some data from newer version, without any indication that this
 is
  happening.
  All that means there is an option 3:
  3. Treat all Blobs as 'snapshots' that refer to the range of underlying
 data
  at the moment of creation of the Blob. Blobs produced further by
  Blob.slice() operation inherit the captured state w/o actually verifying
 it
  against 'live' underlying objects like files. All Blobs can be 'read' (or
  'sent') via operations that can fail if the underlying content has
 changed.
  Optionally, expose snapshotTime property and perhaps read if not changed
  since parameter to read operations. Do not derive File from Blob, rather
  have File.getAsBlob() that produces a Blob which is a snapshot of the
 file
  at the moment of call. The advantage here is that it removes need for
 sync
  operations from Blob and provides mechanism to ensure the changing
  underlying storage is detectable. The disadvantage is a bit more
 complexity
  and bigger change to File spec.

 That sounds good to me.  If we're treating blobs as snapshots, I
 retract my suggestion of the read-if-not-changed-since parameter.  All
 reads after the data has changed should fail.  If you want to do a
 chunked upload, don't snapshot your 

Re: File API: Blob and underlying file changes.

2010-01-21 Thread Jonas Sicking
One thing to remember here is that if we require snapshotting, that
will mean paying potentially very high costs every time the
snapshotting operation is used. Potentially copying hundreds of
megabytes of data (think video).

But if we don't require snapshotting, things will only break if the
user takes the action to modify a file after giving the page access to
it.

Also, in general, snapshotting is something that UAs can experiment
with without requiring changes to the spec. Even though File.slice is
a synchronous function, the UA can implement snapshotting without
using synchronous IO. The UA could simply do an asynchronous file copy
in the background. If any read operations are performed on the slice,
those could simply be stalled until the copy is finished, since reads
are always asynchronous.

/ Jonas

On Thu, Jan 21, 2010 at 11:22 AM, Eric Uhrhane er...@google.com wrote:
 On Thu, Jan 21, 2010 at 11:15 AM, Jian Li jia...@chromium.org wrote:
 Treating blobs as snapshots sounds like a reasonable approach and it will
 make the life of the chunked upload and other scenarios easier. Now the
 problem is: how do we get the blob (snapshot) out of the file?
 1) We can still keep the current relationship between File and Blob. When we
 slice a file by calling File.slice, a new blob that captures the current
 file size and modification time is returned. The following Blob operations,
 like slice, will simply inherit the cached size and modification time. When
 we access the underlying file data in XHR.send() or FileReader, the
 modification time will be verified and an exception could be thrown.

 This would require File.slice to do synchronous file IO, whereas
 Blob.slice doesn't do that.

 2) We can remove the inheritance of Blob from File and introduce
 File.getAsBlob() as dimich suggested. This seems to be more elegant.
 However, it requires changing the File API spec a lot.

 On Wed, Jan 20, 2010 at 3:44 PM, Eric Uhrhane er...@google.com wrote:

 On Wed, Jan 20, 2010 at 3:23 PM, Dmitry Titov dim...@chromium.org wrote:
  On Wed, Jan 20, 2010 at 2:30 PM, Eric Uhrhane er...@google.com wrote:
 
  I think it could.  Here's a third option:
  Make all blobs, file-based or not, just as async as the blobs in
  option 2.  They never do sync IO, but could potentially fail future
  read operations if their metadata is out of date [e.g. reading beyond
  EOF].  However, expose the modification time on File via an async
  method and allow the user to pass it in to a read call to enforce
  fail if changed since this time.  This keeps all file accesses
  async, but still allows for chunked uploads without mixing files
  accidentally.  If we allow users to refresh the modification time
  asynchronously, it also allows for adding a file to a form, changing
  the file on disk, and then uploading the new file.  The user would
  look up the mod time when starting the upload, rather than when the
  file's selected.
 
  It would be great to avoid sync file I/O on calls like Blob.size. They
  would
  simply return cached value. Actual mismatch would be detected during
  actual
  read operation.
  However then I'm not sure how to keep File derived from Blob, since:
  1) Currently, in FF and WebKit File.fileSize is a sync I/O that returns
  current file size. The current spec says File is derived from Blob and
  Blob
  has Blob.size property that is likely going to co-exist with
  File.fileSize
  for a while, for compat reasons. It's weird for file.size and
  file.fileSize
  to return different things.

 True, but we'd probably want to deprecate file.fileSize anyway and
 then get rid of it, since it's synchronous.

  2) Currently, xhr.send(file) does not fail and sends the version of the
  file
  that exists somewhere around xhr.send(file) call was issued. Since File
  is
  also a Blob, xhr.send(blob) would behave the same which means if we want
  to
  preserve this behavior the Blob can not fail async read operation if
  file
  has changed.
  There is a contradiction here. One way to resolve it would be to break
  File
  is Blob and to be able to capture the File as Blob by having
  file.getAsBlob(). The latter would make a snapshot of the state of the
  file,
  to be able to fail subsequent async read operations if the file has been
  changed.
  I've asked a few people around in a non-scientific poll and it seems
  developers expect Blob to be a 'snapshot', reflecting the state of the
  file
  (or Canvas if we get Canvas.getBlob(...)) at the moment of Blob
  creation.
  Since it's obviously bad to actually copy data, it seems acceptable to
  capture enough information (like mod time) so the read operations later
  can
  fail if underlying storage has been changed. It feels really strange if
  reading the Blob can yield some data from one version of a file (or
  Canvas)
  mixed with some data from newer version, without any indication that
  this is
  happening.
  All that means there is an option 3:
  3. Treat all Blobs as 'snapshots' 

Re: File API: Blob and underlying file changes.

2010-01-21 Thread Michael Nordman
On Thu, Jan 21, 2010 at 12:49 PM, Jonas Sicking jo...@sicking.cc wrote:

 One thing to remember here is that if we require snapshotting, that
 will mean paying potentially very high costs every time the
 snapshotting operation is used. Potentially copying hundreds of
 megabytes of data (think video).


I was thinking of different semantics. If the underlying bits change
sometime after a 'snapshot' is taken, the 'snapshot' becomes invalid and you
cannot access the underlying bits. If an application wants guaranteed access
to the 'snapshot', it would have to explicitly save a copy somewhere
(sandboxed file system / coin a new transient 'Blob' via a new blob.copy()
method) and refer to the copy.

So no costly copies are made w/o explicit direction to do so from the app.
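
In script, under these semantics, pinning the bytes would be an explicit
step, roughly like this (getAsBlob() and copy() are both hypothetical names
from this thread):

  var file = myInput.files[0];       // some <input type='file'>
  var xhr = new XMLHttpRequest();
  xhr.open('POST', '/upload', true);
  var snapshot = file.getAsBlob();   // cheap: no data is copied here
  snapshot.copy(function(pinned) {   // async copy into UA-owned storage
    xhr.send(pinned);                // 'pinned' can no longer go stale
  });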

 But if we don't require snapshotting, things will only break if the
 user takes the action to modify a file after giving the page access to
 it.

 Also, in general snapshotting is something that UAs can experiment
 with without requiring changes to the spec. Even though File.slice is
 a synchronous function, the UA can implement snapshotting without
 using synchronous IO. The UA could simply do an asynchronous file copy
 in the background. If any read operations are performed on the slice
 those could simply be stalled until the copy is finished since reads
 are always asynchronous.

 / Jonas

 On Thu, Jan 21, 2010 at 11:22 AM, Eric Uhrhane er...@google.com wrote:
  On Thu, Jan 21, 2010 at 11:15 AM, Jian Li jia...@chromium.org wrote:
  Treating blobs as snapshots sounds like a reasonable approach and it
 will
  make the life of the chunked upload and other scenarios easier. Now the
  problem is: how do we get the blob (snapshot) out of the file?
  1) We can still keep the current relationship between File and Blob.
 When we
  slice a file by calling File.slice, a new blob that captures the current
  file size and modification time is returned. The following Blob
 operations,
  like slice, will simply inherit the cached size and modification time.
 When
  we access the underlying file data in XHR.send() or FileReader, the
  modification time will be verified and an exception could be thrown.
 
  This would require File.slice to do synchronous file IO, whereas
  Blob.slice doesn't do that.
 
  2) We can remove the inheritance of Blob from File and introduce
  File.getAsBlob() as dimich suggested. This seems to be more elegant.
  However, it requires changing the File API spec a lot.
 
  On Wed, Jan 20, 2010 at 3:44 PM, Eric Uhrhane er...@google.com wrote:
 
  On Wed, Jan 20, 2010 at 3:23 PM, Dmitry Titov dim...@chromium.org
 wrote:
   On Wed, Jan 20, 2010 at 2:30 PM, Eric Uhrhane er...@google.com
 wrote:
  
   I think it could.  Here's a third option:
   Make all blobs, file-based or not, just as async as the blobs in
   option 2.  They never do sync IO, but could potentially fail future
   read operations if their metadata is out of date [e.g. reading
 beyond
   EOF].  However, expose the modification time on File via an async
   method and allow the user to pass it in to a read call to enforce
   fail if changed since this time.  This keeps all file accesses
   async, but still allows for chunked uploads without mixing files
   accidentally.  If we allow users to refresh the modification time
   asynchronously, it also allows for adding a file to a form, changing
   the file on disk, and then uploading the new file.  The user would
   look up the mod time when starting the upload, rather than when the
   file's selected.
  
   It would be great to avoid sync file I/O on calls like Blob.size.
 They
   would
   simply return cached value. Actual mismatch would be detected during
   actual
   read operation.
   However then I'm not sure how to keep File derived from Blob, since:
   1) Currently, in FF and WebKit File.fileSize is a sync I/O that
 returns
   current file size. The current spec says File is derived from Blob
 and
   Blob
   has Blob.size property that is likely going to co-exist with
   File.fileSize
   for a while, for compat reasons. It's weird for file.size and
   file.fileSize
   to return different things.
 
  True, but we'd probably want to deprecate file.fileSize anyway and
  then get rid of it, since it's synchronous.
 
   2) Currently, xhr.send(file) does not fail and sends the version of
 the
   file
   that exists somewhere around xhr.send(file) call was issued. Since
 File
   is
   also a Blob, xhr.send(blob) would behave the same which means if we
 want
   to
   preserve this behavior the Blob can not fail async read operation if
   file
   has changed.
   There is a contradiction here. One way to resolve it would be to
 break
   File
   is Blob and to be able to capture the File as Blob by having
   file.getAsBlob(). The latter would make a snapshot of the state of
 the
   file,
   to be able to fail subsequent async read operations if the file has
 been
   changed.
   I've asked a few people around in a non-scientific 

Re: File API: Blob and underlying file changes.

2010-01-21 Thread Dmitry Titov
I think the 'snapshotting' discussed above does not imply an actual copy of
the data, sync or async. The proposal seems to be to 'snapshot' enough
information (in the case of a file on disk, the modification time is enough)
so that later read operations can fail reliably if the Blob is out of sync
with the underlying storage. Making copies of large video files will probably
never be a feasible option, due to size/time issues and the potentially quite
complicated lifetime of such copies... We might provide a separate API for
file manipulation that can be used to make temporary copies of files in
cases where it is a good idea, and that could be used in conjunction with
the Blob API perhaps, but it seems to be separate functionality. It is also
interesting to think of Blobs backed by some other objects, Canvas for
example.

Perhaps 'snapshotting' is not an ideal name, but I think the discussion above
means it as capture the state of the underlying object so the data can be
read in the future, but w/o a guarantee that the read operation will
actually succeed - since there can be no guarantee that the underlying
object is still there.

On Thu, Jan 21, 2010 at 12:49 PM, Jonas Sicking jo...@sicking.cc wrote:

 One thing to remember here is that if we require snapshotting, that
 will mean paying potentially very high costs every time the
 snapshotting operation is used. Potentially copying hundreds of
 megabytes of data (think video).

 But if we don't require snapshotting, things will only break if the
 user takes the action to modify a file after giving the page access to
 it.

 Also, in general snapshotting is something that UAs can experiment
 with without requiring changes to the spec. Even though File.slice is
 a synchronous function, the UA can implement snapshotting without
 using synchronous IO. The UA could simply do an asynchronous file copy
 in the background. If any read operations are performed on the slice
 those could simply be stalled until the copy is finished since reads
 are always asynchronous.

 / Jonas

 On Thu, Jan 21, 2010 at 11:22 AM, Eric Uhrhane er...@google.com wrote:
  On Thu, Jan 21, 2010 at 11:15 AM, Jian Li jia...@chromium.org wrote:
  Treating blobs as snapshots sounds like a reasonable approach and it
 will
  make the life of the chunked upload and other scenarios easier. Now the
  problem is: how do we get the blob (snapshot) out of the file?
  1) We can still keep the current relationship between File and Blob.
 When we
  slice a file by calling File.slice, a new blob that captures the current
  file size and modification time is returned. The following Blob
 operations,
  like slice, will simply inherit the cached size and modification time.
 When
  we access the underlying file data in XHR.send() or FileReader, the
  modification time will be verified and an exception could be thrown.
 
  This would require File.slice to do synchronous file IO, whereas
  Blob.slice doesn't do that.
 
  2) We can remove the inheritance of Blob from File and introduce
  File.getAsBlob() as dimich suggested. This seems to be more elegant.
  However, it requires changing the File API spec a lot.
 
  On Wed, Jan 20, 2010 at 3:44 PM, Eric Uhrhane er...@google.com wrote:
 
  On Wed, Jan 20, 2010 at 3:23 PM, Dmitry Titov dim...@chromium.org
 wrote:
   On Wed, Jan 20, 2010 at 2:30 PM, Eric Uhrhane er...@google.com
 wrote:
  
   I think it could.  Here's a third option:
   Make all blobs, file-based or not, just as async as the blobs in
   option 2.  They never do sync IO, but could potentially fail future
   read operations if their metadata is out of date [e.g. reading
 beyond
   EOF].  However, expose the modification time on File via an async
   method and allow the user to pass it in to a read call to enforce
   fail if changed since this time.  This keeps all file accesses
   async, but still allows for chunked uploads without mixing files
   accidentally.  If we allow users to refresh the modification time
   asynchronously, it also allows for adding a file to a form, changing
   the file on disk, and then uploading the new file.  The user would
   look up the mod time when starting the upload, rather than when the
   file's selected.
  
   It would be great to avoid sync file I/O on calls like Blob.size.
 They
   would
   simply return cached value. Actual mismatch would be detected during
   actual
   read operation.
   However then I'm not sure how to keep File derived from Blob, since:
   1) Currently, in FF and WebKit File.fileSize is a sync I/O that
 returns
   current file size. The current spec says File is derived from Blob
 and
   Blob
   has Blob.size property that is likely going to co-exist with
   File.fileSize
   for a while, for compat reasons. It's weird for file.size and
   file.fileSize
   to return different things.
 
  True, but we'd probably want to deprecate file.fileSize anyway and
  then get rid of it, since it's synchronous.
 
   2) Currently, xhr.send(file) does not fail and sends the 

Re: File API: Blob and underlying file changes.

2010-01-21 Thread Jian Li
What we mean by snapshotting here is not copying all the underlying data.
Instead, we only intend to capture the minimum information needed in order to
verify whether the underlying data has been changed.

I agree with Eric that the first option could cause inconsistent semantics
between File.slice and Blob.slice. But how are we going to address the
synchronous call to get the file size for Blob.size if the blob is a file?


On Thu, Jan 21, 2010 at 12:49 PM, Jonas Sicking jo...@sicking.cc wrote:

 One thing to remember here is that if we require snapshotting, that
 will mean paying potentially very high costs every time the
 snapshotting operation is used. Potentially copying hundreds of
 megabytes of data (think video).

 But if we don't require snapshotting, things will only break if the
 user takes the action to modify a file after giving the page access to
 it.

 Also, in general snapshotting is something that UAs can experiment
 with without requiring changes to the spec. Even though File.slice is
 a synchronous function, the UA can implement snapshotting without
 using synchronous IO. The UA could simply do an asynchronous file copy
 in the background. If any read operations are performed on the slice
 those could simply be stalled until the copy is finished since reads
 are always asynchronous.

 / Jonas

 On Thu, Jan 21, 2010 at 11:22 AM, Eric Uhrhane er...@google.com wrote:
  On Thu, Jan 21, 2010 at 11:15 AM, Jian Li jia...@chromium.org wrote:
  Treating blobs as snapshots sounds like a reasonable approach and it
 will
  make the life of the chunked upload and other scenarios easier. Now the
  problem is: how do we get the blob (snapshot) out of the file?
  1) We can still keep the current relationship between File and Blob.
 When we
  slice a file by calling File.slice, a new blob that captures the current
  file size and modification time is returned. The following Blob
 operations,
  like slice, will simply inherit the cached size and modification time.
 When
  we access the underlying file data in XHR.send() or FileReader, the
  modification time will be verified and an exception could be thrown.
 
  This would require File.slice to do synchronous file IO, whereas
  Blob.slice doesn't do that.
 
  2) We can remove the inheritance of Blob from File and introduce
  File.getAsBlob() as dimich suggested. This seems to be more elegant.
  However, it requires changing the File API spec a lot.
 
  On Wed, Jan 20, 2010 at 3:44 PM, Eric Uhrhane er...@google.com wrote:
 
  On Wed, Jan 20, 2010 at 3:23 PM, Dmitry Titov dim...@chromium.org
 wrote:
   On Wed, Jan 20, 2010 at 2:30 PM, Eric Uhrhane er...@google.com
 wrote:
  
   I think it could.  Here's a third option:
   Make all blobs, file-based or not, just as async as the blobs in
   option 2.  They never do sync IO, but could potentially fail future
   read operations if their metadata is out of date [e.g. reading
 beyond
   EOF].  However, expose the modification time on File via an async
   method and allow the user to pass it in to a read call to enforce
   fail if changed since this time.  This keeps all file accesses
   async, but still allows for chunked uploads without mixing files
   accidentally.  If we allow users to refresh the modification time
   asynchronously, it also allows for adding a file to a form, changing
   the file on disk, and then uploading the new file.  The user would
   look up the mod time when starting the upload, rather than when the
   file's selected.
  
   It would be great to avoid sync file I/O on calls like Blob.size.
 They
   would
   simply return cached value. Actual mismatch would be detected during
   actual
   read operation.
   However then I'm not sure how to keep File derived from Blob, since:
   1) Currently, in FF and WebKit File.fileSize is a sync I/O that
 returns
   current file size. The current spec says File is derived from Blob
 and
   Blob
   has Blob.size property that is likely going to co-exist with
   File.fileSize
   for a while, for compat reasons. It's weird for file.size and
   file.fileSize
   to return different things.
 
  True, but we'd probably want to deprecate file.fileSize anyway and
  then get rid of it, since it's synchronous.
 
   2) Currently, xhr.send(file) does not fail and sends the version of
 the
   file
   that exists somewhere around xhr.send(file) call was issued. Since
 File
   is
   also a Blob, xhr.send(blob) would behave the same which means if we
 want
   to
   preserve this behavior the Blob can not fail async read operation if
   file
   has changed.
   There is a contradiction here. One way to resolve it would be to
 break
   File
   is Blob and to be able to capture the File as Blob by having
   file.getAsBlob(). The latter would make a snapshot of the state of
 the
   file,
   to be able to fail subsequent async read operations if the file has
 been
   changed.
   I've asked a few people around in a non-scientific poll and it seems
   developers expect Blob to be 

Re: File API: Blob and underlying file changes.

2010-01-20 Thread Dmitry Titov
So it seems there are two ideas on how to handle the underlying file changes
in the case of File and Blob objects, nicely captured by Arun above:

1. Keep all Blobs 'mutating', following the underlying file change. In
particular, it means that Blob.size and similar properties may change from
query to query, reflecting the current file state. In case the Blob was
sliced and the corresponding portion of the file does not exist anymore, it
would be clamped, potentially to 0, as currently specified. Read operations
would simply read the clamped portion. That would provide similar behavior
for all Blobs regardless of whether they are Files or obtained via slice().
It also has a slight disadvantage that every access to Blob.size or
Blob.slice() will incur synchronous file I/O. Note that the current
File.fileSize is already implemented like that in FF and WebKit and uses
sync file I/O.

2. Treat Blobs that are Files and Blobs that are produced by slice() as
different blobs, semantically. While the former would 'mutate' with the
file on disk (to keep compat with form submission), the latter would
simply 'inherit' the file information and never do sync IO. Instead, they
would fail later, during async read operations. This has the disadvantage of
Blobs behaving differently in some cases, making it hard for web developers
to produce correct code. The synchronous file IO would be reduced but not
completely eliminated, because the Blobs that are Files would continue to
'sync' with the underlying file stats during sync JS calls. One benefit is
that it allows detection of file content changes, via checks of the
modification time captured when the first slice() operation is performed and
verified during async read operations, which provides a way to implement
reliable file operations in the face of changing files, if the developer
wants to spend the effort to do so.

It seems folks on the thread do not like the duality of Blobs (hard to
program and debug), and there is also a desire to avoid synchronous file IO.
It seems the spec today leans more toward #1. The only problem with it is
that it's hard to implement some scenarios, like a big file upload in chunks
- in case the file changes, the result of the upload may actually be a mix
of new and old file contents and there is no way to check... Perhaps we can
expose File.modificationTime? It still does not get rid of sync I/O...
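
If it were exposed, the uploader's check would at least be easy to write.
A sketch, where modificationTime is the hypothetical property (reading it
live is exactly the residual sync I/O mentioned) and uploadAllChunks stands
in for the app's own upload routine:

  var file = myInput.files[0];        // some <input type='file'>
  var before = file.modificationTime; // hypothetical property (sync I/O)
  uploadAllChunks(file, function() {
    if (file.modificationTime !== before) {
      // The file changed mid-upload: the stitched result may mix old and
      // new content, so discard it on the server and start over.
    }
  });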

Dmitry

On Fri, Jan 15, 2010 at 12:10 PM, Dmitry Titov dim...@chromium.org wrote:

 On Fri, Jan 15, 2010 at 11:50 AM, Jonas Sicking jo...@sicking.cc wrote:


 This doesn't address the problem that authors are unlikely to even
 attempt to deal with this situation, given how rare it is. And even
 less likely to deal with it successfully given how hard the situation
 is to reproduce while testing.


 I don't know how rare the case is. It might become less rare if there is an
 uploader of big movie files and it's easy to overwrite the big movie file by
 hitting the 'save' button in a movie editor while it is still uploading...
 Perhaps such an uploader can use other means to detect the file change
 though...

 It would be nice to spell out *some* behavior though, or we can end up
 with some incompatible implementations. Speaking about Blob.slice(), what is
 the recommended behavior of the resultant Blobs when the underlying file
 changes?




 / Jonas





Re: File API: Blob and underlying file changes.

2010-01-20 Thread Eric Uhrhane
On Wed, Jan 20, 2010 at 1:45 PM, Dmitry Titov dim...@chromium.org wrote:
 So it seems there are two ideas on how to handle the underlying file changes
 in the case of File and Blob objects, nicely captured by Arun above:
 1. Keep all Blobs 'mutating', following the underlying file change. In
 particular, it means that Blob.size and similar properties may change from
 query to query, reflecting the current file state. In case the Blob was
 sliced and corresponding portion of the file does not exist anymore, it
 would be clamped, potentially to 0, as currently specified. Read operations
 would simply read the clamped portion. That would provide similar behavior
 of all Blobs regardless if they are the Files or obtained via slice(). It
 also has a slight disadvantage that every access to Blob.size or
 Blob.slice() will incur synchronous file I/O. Note that current
 File.fileSize is already implemented like that in FF and WebKit and uses
 sync file I/O.
 2. Treat Blobs that are Files and Blobs that are produced by slice() as
 different blobs, semantically. While former ones would 'mutate' with the
 file on the disk (to keep compat with form submission), the latter would
 simply 'inherit' the file information and never do sync IO. Instead, they
 would fail later during async read operations. This has disadvantage of Blob
 behaving differently in some cases, making it hard for web developers to
 produce correct code. The synchronous file IO would be reduced but not
 completely eliminated, because the Blobs that are Files would continue to
 'sync' with the underlying file stats during sync JS calls. One benefit is
 that it allows detection of file content change, via checks of modification
 time captured when the first slice() operation is performed and verified
 during async read operations, which provides a way to implement reliable
 file operations in the face of changing files, if the developer wants to spend
 an effort to do so.

 It seems folks on the thread do not like the duality of Blobs (hard to
 program and debug), and there is also a desire to avoid synchronous file IO.
 It seems the spec today leans more to the #1. The only problem with it is
 that it's hard to implement some scenarios, like a big file upload in chunks
 - in case the file changes, the result of upload may actually be a mix of
 new and old file contents and there is no way to check... Perhaps we can
 expose File.modificationTime? It still does not get rid of sync I/O...

I think it could.  Here's a third option:

Make all blobs, file-based or not, just as async as the blobs in
option 2.  They never do sync IO, but could potentially fail future
read operations if their metadata is out of date [e.g. reading beyond
EOF].  However, expose the modification time on File via an async
method and allow the user to pass it in to a read call to enforce
fail if changed since this time.  This keeps all file accesses
async, but still allows for chunked uploads without mixing files
accidentally.  If we allow users to refresh the modification time
asynchronously, it also allows for adding a file to a form, changing
the file on disk, and then uploading the new file.  The user would
look up the mod time when starting the upload, rather than when the
file's selected.
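
In script, this option might look roughly like the following
(getModificationTime and the ifNotModifiedSince guard are illustrative
names only, and sendChunk stands in for the app's upload code):

  file.getModificationTime(function(mtime) { // async stat, no sync IO
    var reader = new FileReader();
    reader.onerror = function() {
      // Fails if the file changed after 'mtime', or on ordinary IO errors.
    };
    reader.onload = function(e) { sendChunk(e.target.result); };
    // Hypothetical guard: read only if unmodified since 'mtime'.
    reader.readAsBinaryString(file.slice(0, 1024 * 1024),
                              {ifNotModifiedSince: mtime});
  });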

Eric

 Dmitry
 On Fri, Jan 15, 2010 at 12:10 PM, Dmitry Titov dim...@chromium.org wrote:

 On Fri, Jan 15, 2010 at 11:50 AM, Jonas Sicking jo...@sicking.cc wrote:

 This doesn't address the problem that authors are unlikely to even
 attempt to deal with this situation, given how rare it is. And even
 less likely to deal with it successfully given how hard the situation
 is to reproduce while testing.

 I don't know how rare the case is. It might become less rare if there is
 an uploader of big movie files and it's easy to overwrite the big movie file
 by hitting 'save' button in movie editor while it is still uploading...
 Perhaps such uploader can use other means to detect the file change
 though...
 It would be nice to spell out some behavior though, or we can end up with
 some incompatible implementations. Speaking about Blob.slice(), what is
 recommended behavior of resultant Blobs on the underlying file change?



 / Jonas






Re: File API: Blob and underlying file changes.

2010-01-20 Thread Dmitry Titov
On Wed, Jan 20, 2010 at 2:30 PM, Eric Uhrhane er...@google.com wrote:

 I think it could.  Here's a third option:

 Make all blobs, file-based or not, just as async as the blobs in
 option 2.  They never do sync IO, but could potentially fail future
 read operations if their metadata is out of date [e.g. reading beyond
 EOF].  However, expose the modification time on File via an async
 method and allow the user to pass it in to a read call to enforce
 fail if changed since this time.  This keeps all file accesses
 async, but still allows for chunked uploads without mixing files
 accidentally.  If we allow users to refresh the modification time
 asynchronously, it also allows for adding a file to a form, changing
 the file on disk, and then uploading the new file.  The user would
 look up the mod time when starting the upload, rather than when the
 file's selected.


It would be great to avoid sync file I/O on calls like Blob.size. They would
simply return cached value. Actual mismatch would be detected during actual
read operation.

However then I'm not sure how to keep File derived from Blob, since:

1) Currently, in FF and WebKit File.fileSize is a sync I/O that returns
current file size. The current spec says File is derived from Blob and Blob
has Blob.size property that is likely going to co-exist with File.fileSize
for a while, for compat reasons. It's weird for file.size and file.fileSize
to return different things.

2) Currently, xhr.send(file) does not fail and sends the version of the file
that exists at around the time the xhr.send(file) call was issued. Since File
is also a Blob, xhr.send(blob) would behave the same, which means if we want
to preserve this behavior the Blob cannot fail an async read operation if the
file has changed.

There is a contradiction here. One way to resolve it would be to break File
is Blob and to be able to capture the File as Blob by having
file.getAsBlob(). The latter would make a snapshot of the state of the file,
to be able to fail subsequent async read operations if the file has been
changed.

I've asked a few people around in a non-scientific poll and it seems
developers expect Blob to be a 'snapshot', reflecting the state of the file
(or Canvas if we get Canvas.getBlob(...)) at the moment of Blob creation.
Since it's obviously bad to actually copy data, it seems acceptable to
capture enough information (like mod time) so the read operations later can
fail if underlying storage has been changed. It feels really strange if
reading the Blob can yield some data from one version of a file (or Canvas)
mixed with some data from a newer version, without any indication that this
is happening.

All that means there is an option 3:

3. Treat all Blobs as 'snapshots' that refer to the range of underlying data
at the moment of creation of the Blob. Blobs produced further by
Blob.slice() operation inherit the captured state w/o actually verifying it
against 'live' underlying objects like files. All Blobs can be 'read' (or
'sent') via operations that can fail if the underlying content has changed.
Optionally, expose a snapshotTime property and perhaps a
read-if-not-changed-since parameter to read operations. Do not derive File
from Blob; rather, have File.getAsBlob() produce a Blob which is a snapshot
of the file at the moment of the call. The advantage here is that it removes
the need for sync operations from Blob and provides a mechanism to ensure
that changing underlying storage is detectable. The disadvantage is a bit
more complexity and a bigger change to the File spec.
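
A short sketch of what option 3 would look like to a page, with getAsBlob()
and snapshotTime being the proposed names from the paragraph above:

  var file = myInput.files[0];        // some <input type='file'>
  var snapshot = file.getAsBlob();    // captures size + mod time, no copy
  var t = snapshot.snapshotTime;      // optional proposed property
  var part = snapshot.slice(0, 1000); // inherits state, verifies nothing yet
  var reader = new FileReader();      // (or a BlobReader, if File is not Blob)
  reader.onerror = function() {
    // Reads fail reliably if the file changed after the snapshot was taken.
  };
  reader.readAsBinaryString(part);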


Re: File API: Blob and underlying file changes.

2010-01-20 Thread Eric Uhrhane
On Wed, Jan 20, 2010 at 3:23 PM, Dmitry Titov dim...@chromium.org wrote:
 On Wed, Jan 20, 2010 at 2:30 PM, Eric Uhrhane er...@google.com wrote:

 I think it could.  Here's a third option:
 Make all blobs, file-based or not, just as async as the blobs in
 option 2.  They never do sync IO, but could potentially fail future
 read operations if their metadata is out of date [e.g. reading beyond
 EOF].  However, expose the modification time on File via an async
 method and allow the user to pass it in to a read call to enforce
 fail if changed since this time.  This keeps all file accesses
 async, but still allows for chunked uploads without mixing files
 accidentally.  If we allow users to refresh the modification time
 asynchronously, it also allows for adding a file to a form, changing
 the file on disk, and then uploading the new file.  The user would
 look up the mod time when starting the upload, rather than when the
 file's selected.

 It would be great to avoid sync file I/O on calls like Blob.size. They would
 simply return cached value. Actual mismatch would be detected during actual
 read operation.
 However then I'm not sure how to keep File derived from Blob, since:
 1) Currently, in FF and WebKit File.fileSize is a sync I/O that returns
 current file size. The current spec says File is derived from Blob and Blob
 has Blob.size property that is likely going to co-exist with File.fileSize
 for a while, for compat reasons. It's weird for file.size and file.fileSize
 to return different things.

True, but we'd probably want to deprecate file.fileSize anyway and
then get rid of it, since it's synchronous.

 2) Currently, xhr.send(file) does not fail and sends the version of the file
 that exists somewhere around xhr.send(file) call was issued. Since File is
 also a Blob, xhr.send(blob) would behave the same which means if we want to
 preserve this behavior the Blob can not fail async read operation if file
 has changed.
 There is a contradiction here. One way to resolve it would be to break File
 is Blob and to be able to capture the File as Blob by having
 file.getAsBlob(). The latter would make a snapshot of the state of the file,
 to be able to fail subsequent async read operations if the file has been
 changed.
 I've asked a few people around in a non-scientific poll and it seems
 developers expect Blob to be a 'snapshot', reflecting the state of the file
 (or Canvas if we get Canvas.getBlob(...)) at the moment of Blob creation.
 Since it's obviously bad to actually copy data, it seems acceptable to
 capture enough information (like mod time) so the read operations later can
 fail if underlying storage has been changed. It feels really strange if
 reading the Blob can yield some data from one version of a file (or Canvas)
 mixed with some data from newer version, without any indication that this is
 happening.
 All that means there is an option 3:
 3. Treat all Blobs as 'snapshots' that refer to the range of underlying data
 at the moment of creation of the Blob. Blobs produced further by
 Blob.slice() operation inherit the captured state w/o actually verifying it
 against 'live' underlying objects like files. All Blobs can be 'read' (or
 'sent') via operations that can fail if the underlying content has changed.
 Optionally, expose snapshotTime property and perhaps read if not changed
 since parameter to read operations. Do not derive File from Blob, rather
 have File.getAsBlob() that produces a Blob which is a snapshot of the file
 at the moment of call. The advantage here is that it removes need for sync
 operations from Blob and provides mechanism to ensure the changing
 underlying storage is detectable. The disadvantage is a bit more complexity
 and bigger change to File spec.

That sounds good to me.  If we're treating blobs as snapshots, I
retract my suggestion of the read-if-not-changed-since parameter.  All
reads after the data has changed should fail.  If you want to do a
chunked upload, don't snapshot your file into a blob until you're
ready to start.  Once you've done that, just slice off parts of the
blob, not the file.



Re: File API: Blob and underlying file changes.

2010-01-15 Thread Darin Fisher
I don't think we should worry about underlying file changes.

If the app wants to cut a file into parts and copy them separately, then
perhaps the app should first copy the file into a private area.  (I'm
presuming that one day, we'll have the concept of a chroot'd private file
storage area for a web app.)

I think we should avoid solutions that involve file locking since it is bad
for the user (loss of control) if their files are locked by the browser on
behalf of a web app.

It might be reasonable, however, to lock a file while sending it.

-Darin


On Thu, Jan 14, 2010 at 2:41 PM, Jian Li jia...@chromium.org wrote:

 It seems that we feel that when a File object is sent via either a form or
 XHR, the latest underlying version should be used. When we get a slice via
 Blob.slice, we assume that the underlying file data is stable from then on.

 So for the uploader scenario, we need to cut a big file into multiple
 pieces. With the current File API spec, we will have to do something like
 the following to make sure that all pieces are cut from a stable file.
 var file = myInputElement.files[0];
 var blob = file.slice(0, file.size);
 var piece1 = blob.slice(0, 1000);
 var piece2 = blob.slice(1000, 1000);
 ...

 The above seems a bit ugly. If we want to make it clean, what Dmitry
 proposed above seems to be reasonable. But it would require a non-trivial
 spec change.




Re: File API: Blob and underlying file changes.

2010-01-15 Thread Jonas Sicking
On Thu, Jan 14, 2010 at 11:58 PM, Darin Fisher da...@chromium.org wrote:
 I don't think we should worry about underlying file changes.
 If the app wants to cut a file into parts and copy them separately, then
 perhaps the app should first copy the file into a private area.  (I'm
 presuming that one day, we'll have the concept of a chroot'd private file
 storage area for a web app.)
 I think we should avoid solutions that involve file locking since it is bad
 for the user (loss of control) if their files are locked by the browser on
 behalf of a web app.
 It might be reasonable, however, to lock a file while sending it.

I largely agree. Though I think it'd be reasonable to lock the file
while reading it too.

/ Jonas


Re: File API: Blob and underlying file changes.

2010-01-15 Thread Darin Fisher
On Fri, Jan 15, 2010 at 10:19 AM, Dmitry Titov dim...@chromium.org wrote:

 Nobody proposed locking the file. Sorry for being unclear if it sounded
 like that. Basically it's all about timestamps.

 As Chris proposed earlier, a read operation can grab the timestamp of the
 file before and after reading its content and throw an exception if the
 timestamps do not match. This is a pretty good approximation of an 'atomic'
 read: although it cannot guarantee success, it can at least provide
 reliable detection of a change.


but doesn't that imply some degree of unpredictability for web developers?
must they always handle that exception even though it is an extremely rare
occurrence?  also, what about normal form submission, in which the file
reading happens asynchronously to form.submit()?




 Same thing with the Blob - the slice() may capture the timestamp of the
 content it's based on. The Blob can throw an exception later if the
 modification timestamp of the underlying data has changed since the time of
 the Blob's creation.


also note that we MUST NOT design APIs that involve synchronous file access.
 no stat calls allowed on the main UI thread please!  (remember the
network filesystem case.)

in other words, assuming detection of file changes happens asynchronously,
we'll have trouble producing exceptions as you describe.
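the failure would instead have to surface through the async error path,
something like this (a sketch, assuming a detected change is reported as an
ordinary error event, and with blob being the snapshot under discussion):

var xhr = new XMLHttpRequest();
xhr.open('POST', '/upload');
xhr.onerror = function() {
  // an mtime mismatch found later on the IO thread lands here,
  // not as a synchronous exception from send()
};
xhr.send(blob);  // returns immediately; no stat call on the UI thread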




 Both actual OS locking and requiring copying files to a safe location
 before slice() seem to be problematic, for different reasons. A good
 example is a YouTube uploader that needs to slice and send a 1GB file while
 having a way to reliably detect a change of the underlying file, terminate
 the current upload and potentially request another one. Copying is hard
 because of the size, and locking, even if provided by the OS, may get in
 the way of the user's workflow.
 Dmitry


Re: File API: Blob and underlying file changes.

2010-01-15 Thread Jonas Sicking
On Fri, Jan 15, 2010 at 10:19 AM, Dmitry Titov dim...@chromium.org wrote:
 Nobody proposed locking the file. Sorry for being unclear if it sounded
 like that. Basically it's all about timestamps.
 As Chris proposed earlier, a read operation can grab the timestamp of the
 file before and after reading its content and throw an exception if the
 timestamps do not match. This is a pretty good approximation of an 'atomic'
 read: although it cannot guarantee success, it can at least provide
 reliable detection of a change.

I don't understand how you intend to use the timestamp. Consider the
following scenario:

1. User drops a 10MB File onto a page.
2. Page requests to read the file using FileReader.readAsBinaryString
and installs a 'progress' event listener.
3. Implementation grabs the current timestamp and then starts reading the file.
4. After 2MB of data is read the implementation updates
FileReader.result with the partial read and fires a 'progress' event.
5. Page grabs the partial result and processes it.
6. After another 1MB of data is read, but before another 'progress'
event has been fired, the user modifies the file such that the
timestamp changes
7. The implementation detects that the timestamp has changed.

Now what?

You can't throw an exception since part of the file has already been
delivered. You could raise an error event, but that's unlikely to be
treated correctly by the page as this is a very rare condition and
hard to test for, so the page author has likely not written correct
code to deal with it. It's additionally not atomic since the read
started, but was interrupted.
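Concretely (a sketch; process() stands in for whatever the page does with
partial data):

var reader = new FileReader();
reader.onprogress = function() {
  process(reader.result);  // steps 4-5: partial data is already consumed
};
reader.onerror = function() {
  // step 7 can only surface here, after partial data was delivered;
  // rare and hard to test, so pages are unlikely to handle it correctly
};
reader.readAsBinaryString(file);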

/ Jonas



Re: File API: Blob and underlying file changes.

2010-01-15 Thread Jonas Sicking
On Fri, Jan 15, 2010 at 11:42 AM, Dmitry Titov dim...@chromium.org wrote:


 On Fri, Jan 15, 2010 at 10:36 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Fri, Jan 15, 2010 at 10:19 AM, Dmitry Titov dim...@chromium.org
 wrote:
  Nobody proposed locking the file. Sorry for being unclear if it sounded
  like that. Basically it's all about timestamps.
  As Chris proposed earlier, a read operation can grab the timestamp of the
  file before and after reading its content and throw an exception if the
  timestamps do not match. This is a pretty good approximation of an
  'atomic' read: although it cannot guarantee success, it can at least
  provide reliable detection of a change.

 I don't understand how you intend to use the timestamp. Consider the
 following scenario:

 1. User drops a 10MB File onto a page.
 2. Page requests to read the file using FileReader.readAsBinaryString
 and installs a 'progress' event listener.
 3. Implementation grabs the current timestamp and then starts reading
 the file.
 4. After 2MB of data is read the implementation updates
 FileReader.result with the partial read and fires a 'progress' event.
 5. Page grabs the partial result and processes it.
 6. After another 1MB of data is read, but before another 'progress'
 event has been fired, the user modifies the file such that the
 timestamp changes
 7. The implementation detects that the timestamp has changed.

 Now what?

 You can't throw an exception since part of the file has already been
 delivered. You could raise an error event, but that's unlikely to be
 treated correctly by the page as this is a very rare condition and
 hard to test for, so the page author has likely not written correct
 code to deal with it.

 FileReader has both 'error' and 'abort' events, in addition to 'progress'.
 It seems we can just use those? There is always a possibility that an async
 operation that comes with partial results may fail as a whole - the only
 real way to ensure its atomicity would be to reliably lock the file and/or
 make a copy - which, as this thread indicates, are both not always possible.
 So yeah, in case FileReader returned 2MB and the file suddenly changed to be
 only 1MB, the next event the page should get is 'error'.
 What would be the other possibility?

This doesn't address the problem that authors are unlikely to even
attempt to deal with this situation, given how rare it is. And even
less likely to deal with it successfully, given how hard the situation
is to reproduce while testing.

/ Jonas



Re: File API: Blob and underlying file changes.

2010-01-14 Thread Jian Li
It seems that we feel that when a File object is sent via either a form or
XHR, the latest underlying version should be used. When we get a slice via
Blob.slice, we assume that the underlying file data is stable from then on.

So for the uploader scenario, we need to cut a big file into multiple pieces.
With the current File API spec, we will have to do something like the
following to make sure that all pieces are cut from a stable file.
var file = myInputElement.files[0];
var blob = file.slice(0, file.size);
var piece1 = blob.slice(0, 1000);
var piece2 = blob.slice(1000, 1000);
...

The above seems a bit ugly. If we want to make it clean, what Dmitry
proposed above (as sketched below) seems to be reasonable. But it would
require a non-trivial spec change.
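For comparison, the cleaner shape under that proposal might read (a sketch
only; getAsBlob() is the hypothetical snapshot API, not the current spec):

var file = myInputElement.files[0];
var blob = file.getAsBlob();  // explicit snapshot, no dummy full-range slice
var piece1 = blob.slice(0, 1000);
var piece2 = blob.slice(1000, 1000);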


On Wed, Jan 13, 2010 at 11:28 AM, Dmitry Titov dim...@chromium.org wrote:

 Atomic read is obviously a nice thing - it would be hard to program against
 an API that behaves as unpredictably as a single read operation that reads
 half of the old content and half of the new content.

 On the same note, it would likely be very hard to program against Blob
 objects if they could change underneath unpredictably. Imagine that we need
 to build an uploader that cuts a big file into multiple pieces and sends
 those pieces to the servers so they will be stitched together later. If
 during this operation the underlying file changes, this changes all the
 pieces that the Blobs refer to (due to clamping and a silent change of
 content); all the slicing/stitching assumptions become invalid, and it's
 hard to even notice, since the blobs are simply 'clamped' silently. Some
 degree of mess is possible then.

 Another use case could be a JPEG image processor that uses slice() to cut
 the headers from the image file and then uses info from the headers to cut
 further JFIF fields from the file (reading EXIF and populating a local
 database of images, for example). Changing the file in the middle of that
 is bad.

 It seems the typical use cases that will need Blob.slice() functionality
 form 'units of work', where Blob.slice() is used with the likely assumption
 that the underlying data is stable and does not change silently. Such a
 'unit of work' should fail as a whole if the underlying file changes. One
 way to achieve that is to reliably fail operations with 'derived' Blobs,
 and even perhaps have an 'isValid' property on them. 'Derived' Blobs are
 those obtained via slice(), as opposed to 'original' Blobs that are also
 Files.

 One disadvantage of this approach is that it implies that the same Blob has
 two possible behaviors, depending on whether it was obtained via
 Blob.slice() (or other methods) or is itself a File.

 It could all be a bit cleaner if File did not derive from Blob but instead
 had a getAsBlob() method - then it would be possible to say that Blobs are
 always immutable but may become 'invalid' over time if the underlying data
 changes. The FileReader could then be just a BlobReader and have cleaner
 semantics.

 If that were the case, then xhr.send(file) would capture the state of the
 file at the moment of sending, while xhr.send(blob) would fail with an
 exception if the blob is 'invalid' at the moment of the send() operation.
 This would keep compatibility with current behavior and avoid the duality
 of Blob behavior. Quite a change to the spec though...

 Dmitry




Re: File API: Blob and underlying file changes.

2010-01-13 Thread Jonas Sicking
On Tue, Jan 12, 2010 at 5:28 PM, Chris Prince cpri...@google.com wrote:
 For the record, I'd like to make the read atomic, such that you can
 never get half a file before a change, and half after. But it likely
 depends on what OSs can enforce here.

 I think *enforcing* atomicity is difficult across all OSes.

 But implementations can get nearly the same effect by checking the
 file's last modification time at the start + end of the API call.  If
 it has changed, the read operation can throw an exception.

I'm talking about during the actual read. I.e. not related to the
lifetime of the File object, just related to the time between the
first 'progress' event, and the 'loadend' event. If the file changes
during this time there is no way to fake atomicity since the partial
file has already been returned.

/ Jonas



Re: File API: Blob and underlying file changes.

2010-01-13 Thread Dmitry Titov
Atomic read is obviously a nice thing - it would be hard to program against
an API that behaves as unpredictably as a single read operation that reads
half of the old content and half of the new content.

On the same note, it would likely be very hard to program against Blob
objects if they could change underneath unpredictably. Imagine that we need
to build an uploader that cuts a big file into multiple pieces and sends
those pieces to the servers so they will be stitched together later. If
during this operation the underlying file changes, this changes all the
pieces that the Blobs refer to (due to clamping and a silent change of
content); all the slicing/stitching assumptions become invalid, and it's
hard to even notice, since the blobs are simply 'clamped' silently. Some
degree of mess is possible then.

Another use case could be a JPEG image processor that uses slice() to cut
the headers from the image file and then uses info from the headers to cut
further JFIF fields from the file (reading EXIF and populating a local
database of images, for example). Changing the file in the middle of that
is bad.

It seems the typical use cases that will need Blob.slice() functionality
form 'units of work', where Blob.slice() is used with the likely assumption
that the underlying data is stable and does not change silently. Such a
'unit of work' should fail as a whole if the underlying file changes. One
way to achieve that is to reliably fail operations with 'derived' Blobs,
and even perhaps have an 'isValid' property on them. 'Derived' Blobs are
those obtained via slice(), as opposed to 'original' Blobs that are also
Files.

One disadvantage of this approach is that it implies that the same Blob has
two possible behaviors, depending on whether it was obtained via
Blob.slice() (or other methods) or is itself a File.

It could all be a bit cleaner if File did not derive from Blob but instead
had a getAsBlob() method - then it would be possible to say that Blobs are
always immutable but may become 'invalid' over time if the underlying data
changes. The FileReader could then be just a BlobReader and have cleaner
semantics.

If that were the case, then xhr.send(file) would capture the state of the
file at the moment of sending, while xhr.send(blob) would fail with an
exception if the blob is 'invalid' at the moment of the send() operation.
This would keep compatibility with current behavior and avoid the duality
of Blob behavior. Quite a change to the spec though...
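To illustrate the split (a sketch only; getAsBlob() and isValid are the
hypothetical names from above, and xhr1/xhr2 stand for two already-opened
XMLHttpRequests):

var file = myInputElement.files[0];
var blob = file.getAsBlob();  // immutable snapshot of the file's current state
// ... the file on disk changes here ...
xhr1.send(file);              // File keeps today's semantics: sends latest bytes
try {
  xhr2.send(blob);            // Blob is a snapshot: throws, since the underlying
} catch (e) {                 // data changed (blob.isValid would now be false)
  // restart the unit of work with a fresh snapshot
}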

Dmitry

On Wed, Jan 13, 2010 at 2:38 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Jan 12, 2010 at 5:28 PM, Chris Prince cpri...@google.com wrote:
  For the record, I'd like to make the read atomic, such that you can
  never get half a file before a change, and half after. But it likely
  depends on what OSs can enforce here.
 
  I think *enforcing* atomicity is difficult across all OSes.
 
  But implementations can get nearly the same effect by checking the
  file's last modification time at the start + end of the API call.  If
  it has changed, the read operation can throw an exception.

 I'm talking about during the actual read. I.e. not related to the
 lifetime of the File object, just related to the time between the
 first 'progress' event, and the 'loadend' event. If the file changes
 during this time there is no way to fake atomicity since the partial
 file has already been returned.

 / Jonas



Re: File API: Blob and underlying file changes.

2010-01-12 Thread Chris Prince
 For the record, I'd like to make the read atomic, such that you can
 never get half a file before a change, and half after. But it likely
 depends on what OSs can enforce here.

I think *enforcing* atomicity is difficult across all OSes.

But implementations can get nearly the same effect by checking the
file's last modification time at the start + end of the API call.  If
it has changed, the read operation can throw an exception.
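In pseudo-code, the check would amount to something like this (a sketch of
UA-internal logic, not a script-visible API; both helpers are illustrative):

// Approximate atomicity: compare the file's mtime before and after the read.
function readWholeFile(file) {
  var before = modificationTime(file);  // illustrative helper
  var bytes = readAllBytes(file);       // illustrative helper
  if (modificationTime(file) != before)
    throw 'file changed during read';   // surface as a read error
  return bytes;
}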



File API: Blob and underlying file changes.

2010-01-08 Thread Dmitry Titov
Hi,

Does a Blob which is obtained as a File (so it refers to an actual file on
disk) track changes in the underlying file and 'mutate', does it represent
a 'snapshot' of the file, or does it become 'invalid'?

Today, if a user selects a file using <input type="file">, and the file on
disk changes before 'submit' is clicked, the form will submit the latest
version of the file.
This may be a surprisingly popular use case: a user submits a file via a
form and wants to make 'last moment' changes to the file after partially
pre-populating the form. It works 'intuitively' today.

Now, if the page decides to use XHR to upload the file, I think

var file = myInputElement.files[0];
var xhr = ...
xhr.send(file);

should also send the version of the file that exists at the moment of
xhr.send(file), not the one that existed when the user picked the file (for
consistency with the form behavior).

Assuming this is the desired behavior, what should the following do:

var file = myInputElement.files[0];
var blob = file.slice(0, file.size);
// ... now file on the disk changes ...
xhr.send(blob);

Will it:
- send the new version of the whole file (and update blob.size?)
- send the captured number of bytes from the new version of the file
(perhaps truncated, since the file may be shorter now)
- send the original bytes from the previous version of the file that existed
when the Blob was created (a sort of 'copy on write')
- throw an exception?


Thanks,
Dmitry


Re: File API: Blob and underlying file changes.

2010-01-08 Thread Dmitry Titov
Adding a reply from Jonas Sicking from another list (which I used first by
mistake :( )

Technically, you should send this email to the webapps mailing list,
since that is where this spec is being developed.

That said, this is a really hard problem, and one that is hard to
test. One thing that we decided when we did the security review on this
stuff at Mozilla is that if a File object is ever passed cross-origin
using postMessage, then the File object that the other origin has
should not work if the file is changed on disk. For some definition of
'not work'.
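For example (a sketch; this assumes File objects can be passed through
postMessage at all, which is itself not specified yet):

// origin A hands a File to origin B
otherWindow.postMessage(file, 'http://origin-b.example.com');
// if the file on disk then changes, any read that origin B attempts on its
// copy of the File object should fail - for some definition of 'fail'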

