Re: File API: Blob and underlying file changes.

2010-01-15 Thread Darin Fisher
I don't think we should worry about underlying file changes.

If the app wants to cut a file into parts and copy them separately, then
perhaps the app should first copy the file into a private area.  (I'm
presuming that one day, we'll have the concept of a chroot'd private file
storage area for a web app.)

I think we should avoid solutions that involve file locking since it is bad
for the user (loss of control) if their files are locked by the browser on
behalf of a web app.

It might be reasonable, however, to lock a file while sending it.

-Darin


On Thu, Jan 14, 2010 at 2:41 PM, Jian Li jia...@chromium.org wrote:

 It seems that we feel that when a File object is sent via either a form or
 XHR, the latest underlying version should be used. When we get a slice via
 Blob.slice, we assume that the underlying file data is stable from that point on.

 So for the uploader scenario, we need to cut a big file into multiple pieces.
 With the current File API spec, we will have to do something like the following
 to make sure that all pieces are cut from a stable file.
 var file = myInputElement.files[0];
 var blob = file.slice(0, file.size);    // snapshot of the whole file
 var piece1 = blob.slice(0, 1000);       // bytes 0-999 (slice takes start, length)
 var piece2 = blob.slice(1000, 1000);    // bytes 1000-1999
 ...

 The above seems a bit ugly. If we want to make it clean, what Dmitry
 proposed above seems to be reasonable. But it would require a non-trivial
 spec change.
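
 Generalizing the workaround above into a loop is straightforward. A minimal
 sketch, assuming the draft's slice(start, length) signature and a
 hypothetical uploadPiece() helper:

     // Snapshot the file once, then cut fixed-size pieces from the snapshot,
     // relying on the assumption that sliced Blobs see stable data.
     var PIECE_SIZE = 1000;
     var file = myInputElement.files[0];
     var blob = file.slice(0, file.size);               // stable snapshot
     for (var offset = 0; offset < blob.size; offset += PIECE_SIZE) {
       var length = Math.min(PIECE_SIZE, blob.size - offset);
       uploadPiece(blob.slice(offset, length), offset); // hypothetical sender
     }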


 On Wed, Jan 13, 2010 at 11:28 AM, Dmitry Titov dim...@chromium.org wrote:

 Atomic read is obviously a nice thing: it would be hard to program
 against an API where a single read operation can unpredictably return half
 of the old content and half of the new.

 On the same note, it would likely be very hard to program against Blob
 objects if they could change underneath unpredictably. Imagine that we need
 to build an uploader that cuts a big file into multiple pieces and sends those
 pieces to the server so they can be stitched together later. If the
 underlying file changes during this operation, all the pieces the Blobs
 refer to change with it (due to clamping and silent changes of content); the
 slicing/stitching assumptions become invalid, and it's hard even to notice,
 since the Blobs are simply 'clamped' silently. Some degree of mess is
 possible then.

 Another use case could be a JPEG image processor that uses slice() to cut
 the headers from the image file and then uses info from the headers to cut
 further JFIF fields from the file (reading EXIF and populating a local
 database of images, for example). Changing the file in the middle of that is
 bad.

 It seems the typical use cases that will need Blob.slice() functionality
 form 'units of work' where Blob.slice() is used with the likely assumption
 that the underlying data is stable and does not change silently. Such a
 'unit of work' should fail as a whole if the underlying file changes. One
 way to achieve that is to reliably fail operations with 'derived' Blobs, and
 perhaps even have an 'isValid' property on them. 'Derived' Blobs are those
 obtained via slice(), as opposed to 'original' Blobs that are also Files.
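
 As a rough illustration, a minimal sketch of how the proposed isValid
 property (hypothetical, not in the spec) might be used inside such a unit
 of work:

     // 'Derived' Blobs would fail reliably once the underlying file changes.
     var header = file.slice(0, 64);   // derived Blob over the file's header
     // ... time passes; the user may overwrite the file on disk ...
     if (!header.isValid) {            // proposed property, purely illustrative
       restartUnitOfWork();            // hypothetical recovery path
     } else {
       xhr.send(header);               // would throw if invalidated mid-flight
     }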

 One disadvantage of this approach is that it implies that the same Blob
 has two possible behaviors: one when it is obtained via Blob.slice() (or
 other methods), another when it is a File.

 It all could be a bit cleaner if File did not derive from Blob, but
 instead had a getAsBlob() method. Then it would be possible to say that
 Blobs are always immutable but may become 'invalid' over time if the
 underlying data changes. The FileReader can then be just a BlobReader and
 have cleaner semantics.

 If that were the case, then xhr.send(file) would capture the state of the
 file at the moment of sending, while xhr.send(blob) would fail with an
 exception if the blob is 'invalid' at the moment of the send() operation.
 This would keep compatibility with current behavior and avoid the duplicity
 of Blob behavior. Quite a change to the spec, though...
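
 A minimal sketch of that proposed model (getAsBlob() is hypothetical and
 not in the current spec):

     var file = myInputElement.files[0];
     xhr1.send(file);              // File: captures state at send() time
     var blob = file.getAsBlob();  // proposed method: immutable snapshot
     // ... the underlying file changes on disk ...
     try {
       xhr2.send(blob);            // proposal: throws, blob is now 'invalid'
     } catch (e) {
       // re-snapshot and retry, or surface the conflict to the user
     }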

 Dmitry

 On Wed, Jan 13, 2010 at 2:38 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Jan 12, 2010 at 5:28 PM, Chris Prince cpri...@google.com
 wrote:
  For the record, I'd like to make the read atomic, such that you can
  never get half a file before a change, and half after. But it likely
  depends on what OSs can enforce here.
 
  I think *enforcing* atomicity is difficult across all OSes.
 
  But implementations can get nearly the same effect by checking the
  file's last modification time at the start + end of the API call.  If
  it has changed, the read operation can throw an exception.

 I'm talking about during the actual read. I.e. not related to the
 lifetime of the File object, just related to the time between the
 first 'progress' event and the 'loadend' event. If the file changes
 during this time there is no way to fake atomicity since the partial
 file has already been returned.

 / Jonas






Re: File API: Blob and underlying file changes.

2010-01-15 Thread Jonas Sicking
On Thu, Jan 14, 2010 at 11:58 PM, Darin Fisher da...@chromium.org wrote:
 I don't think we should worry about underlying file changes.
 If the app wants to cut a file into parts and copy them separately, then
 perhaps the app should first copy the file into a private area.  (I'm
 presuming that one day, we'll have the concept of a chroot'd private file
 storage area for a web app.)
 I think we should avoid solutions that involve file locking since it is bad
 for the user (loss of control) if their files are locked by the browser on
 behalf of a web app.
 It might be reasonable, however, to lock a file while sending it.

I largely agree. Though I think it'd be reasonable to lock the file
while reading it too.

/ Jonas

 On Thu, Jan 14, 2010 at 2:41 PM, Jian Li jia...@chromium.org wrote:

 It seems that we feel that when a File object is sent via either a form or
 XHR, the latest underlying version should be used. When we get a slice via
 Blob.slice, we assume that the underlying file data is stable from that point on.
 So for the uploader scenario, we need to cut a big file into multiple pieces.
 With the current File API spec, we will have to do something like the following
 to make sure that all pieces are cut from a stable file.
     var file = myInputElement.files[0];
     var blob = file.slice(0, file.size);    // snapshot of the whole file
     var piece1 = blob.slice(0, 1000);       // bytes 0-999 (slice takes start, length)
     var piece2 = blob.slice(1000, 1000);    // bytes 1000-1999
     ...
 The above seems a bit ugly. If we want to make it clean, what Dmitry
 proposed above seems to be reasonable. But it would require a non-trivial
 spec change.

 On Wed, Jan 13, 2010 at 11:28 AM, Dmitry Titov dim...@chromium.org
 wrote:

 Atomic read is obviously a nice thing: it would be hard to program
 against an API where a single read operation can unpredictably return half
 of the old content and half of the new.
 On the same note, it would likely be very hard to program against Blob
 objects if they could change underneath unpredictably. Imagine that we need
 to build an uploader that cuts a big file into multiple pieces and sends those
 pieces to the server so they can be stitched together later. If the
 underlying file changes during this operation, all the pieces the Blobs
 refer to change with it (due to clamping and silent changes of content); the
 slicing/stitching assumptions become invalid, and it's hard even to notice,
 since the Blobs are simply 'clamped' silently. Some degree of mess is
 possible then.
 Another use case could be a JPEG image processor that uses slice() to cut
 the headers from the image file and then uses info from the headers to cut
 further JFIF fields from the file (reading EXIF and populating a local
 database of images, for example). Changing the file in the middle of that is
 bad.
 It seems the typical use cases that will need Blob.slice() functionality
 form 'units of work' where Blob.slice() is used with the likely assumption
 that the underlying data is stable and does not change silently. Such a
 'unit of work' should fail as a whole if the underlying file changes. One
 way to achieve that is to reliably fail operations with 'derived' Blobs, and
 perhaps even have an 'isValid' property on them. 'Derived' Blobs are those
 obtained via slice(), as opposed to 'original' Blobs that are also Files.
 One disadvantage of this approach is that it implies that the same Blob
 has two possible behaviors: one when it is obtained via Blob.slice() (or
 other methods), another when it is a File.
 It all could be a bit cleaner if File did not derive from Blob, but
 instead had a getAsBlob() method. Then it would be possible to say that
 Blobs are always immutable but may become 'invalid' over time if the
 underlying data changes. The FileReader can then be just a BlobReader and
 have cleaner semantics.
 If that were the case, then xhr.send(file) would capture the state of the
 file at the moment of sending, while xhr.send(blob) would fail with an
 exception if the blob is 'invalid' at the moment of the send() operation.
 This would keep compatibility with current behavior and avoid the duplicity
 of Blob behavior. Quite a change to the spec, though...
 Dmitry
 On Wed, Jan 13, 2010 at 2:38 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Jan 12, 2010 at 5:28 PM, Chris Prince cpri...@google.com
 wrote:
  For the record, I'd like to make the read atomic, such that you can
  never get half a file before a change, and half after. But it likely
  depends on what OSs can enforce here.
 
  I think *enforcing* atomicity is difficult across all OSes.
 
  But implementations can get nearly the same effect by checking the
  file's last modification time at the start + end of the API call.  If
  it has changed, the read operation can throw an exception.

 I'm talking about during the actual read. I.e. not related to the
 lifetime of the File object, just related to the time between the
 first 'progress' event and the 'loadend' event. If the file changes
 during this time there is no way to fake atomicity since the partial
 file has already been returned.

 

Restart the group

2010-01-15 Thread Rokesh Jankie
Hi there,

I just found this group. It is very interesting, but it seems to have
stopped for several reasons in 2007.
We are in 2010 now and I think the momentum is there for declarative
application development.
We have a proposal and want to turn it into an open specification.

The question: how do we continue from here? Is it possible to get people
from certain groups/companies to join?

Hope to hear from you soon.

Kind regards,
Rokesh Jankie



---
QAFE
Enterprise Applications Made Easy

Website : http://www.qafe.com/
LinkedIn: http://www.linkedin.com/in/rjankie
Twitter : http://twitter.com/qafe
Company : http://www.qualogy.com
Youtube : http://youtube.com/qafechannel
Google profile: http://www.google.com/profiles/qafeframework


Re: Restart the group

2010-01-15 Thread Rokesh Jankie
I'm sorry for my mistake then. I would like to participate.
Nice to hear that the group is very much alive and active.

Thanks for the note.

Regards,
Rokesh Jankie


---
QAFE
powered by experience & quality

Website : http://www.qafe.com/
Youtube : http://youtube.com/qafechannel
LinkedIn: http://www.linkedin.com/groups?gid=134874

Sent from Voorburg, ZH, Netherlands

On Fri, Jan 15, 2010 at 14:21, Lachlan Hunt lachlan.h...@lachy.id.au wrote:

 Rokesh Jankie wrote:

 I just found this group. It is very interesting, but it seems to have
 stopped for several reasons in 2007.


 You must be mistaken.  This group is very much alive and active.  While the
 Web API WG charter ended in 2007, the Web Apps group was rechartered [1],
 merging the Web API and App Formats WGs.

 [1] http://www.w3.org/2008/webapps/charter/

 --
 Lachlan Hunt - Opera Software
 http://lachy.id.au/
 http://www.opera.com/



Re: Restart the group

2010-01-15 Thread Arthur Barstow

Rokesh,

On Jan 15, 2010, at 6:43 AM, ext Rokesh Jankie wrote:
I just found this group. It is very interesting, but it seems to have
stopped for several reasons in 2007.


Perhaps you are thinking of the Web Applications Format (WAF) WG and
Web API WG, which both ended in 2008 (and, as Lachlan indicated, they
were merged to form the Web Applications WG).


We are in 2010 now and I think the momentum is there for
declarative application development.

We have a proposal and want to turn it into an open specification.

The question: how do we continue from here? Is it possible to get
people from certain groups/companies to join?


FYI, in 2006, the WAF WG began work on a declarative format for  
applications and user interfaces (aka DFAUI) spec and that work  
ended with the publication of the following Working Group Note in  
September 2007:


 Declarative Formats for Applications and User Interfaces
 http://www.w3.org/TR/dfaui/

-Art Barstow





Re: File API: Blob and underlying file changes.

2010-01-15 Thread Darin Fisher
On Fri, Jan 15, 2010 at 10:19 AM, Dmitry Titov dim...@chromium.org wrote:

 Nobody proposed locking the file. Sorry for being unclear if it sounded
 like that. Basically it's all about timestamps.

 As Chris proposed earlier, a read operation can grab the timestamp of the
 file before and after reading its content and throw an exception if the
 timestamps do not match. This is a pretty good approximation of an 'atomic'
 read: although it cannot guarantee success, it can at least detect failure
 reliably.
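
 As a pattern, that check simply brackets the read with two timestamp
 probes. A minimal sketch, with hypothetical statModificationTime() and
 readBytes() helpers standing in for the implementation's internals:

     // Approximately-atomic read: compare the modification time before and
     // after reading; a mismatch means the content may be torn, so fail.
     function readStable(path) {
       var before = statModificationTime(path);  // hypothetical stat helper
       var bytes = readBytes(path);              // hypothetical read helper
       var after = statModificationTime(path);
       if (before !== after) {
         // Reliable detection of a change, not a guarantee of success.
         throw new Error('file modified during read');
       }
       return bytes;
     }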


But doesn't that imply some degree of unpredictability for web developers?
Must they always handle that exception even though it is an extremely rare
occurrence? Also, what about normal form submission, in which the file
reading happens asynchronously to form.submit()?




 Same thing with the Blob: the slice() may capture the timestamp of the
 content it's based on. The Blob can throw an exception later if the
 modification timestamp of the underlying data has changed since the time of
 the Blob's creation.


Also note that we MUST NOT design APIs that involve synchronous file access.
No stat calls allowed on the main UI thread, please! (Remember the
network filesystem case.)

In other words, assuming detection of file changes happens asynchronously,
we'll have trouble producing exceptions as you describe.




 Both actual OS locking and requiring files to be copied to a safe location
 before slice() seem problematic, for different reasons. A good example is a
 YouTube uploader that needs to slice and send a 1GB file while having a way
 to reliably detect a change of the underlying file, terminate the current
 upload, and potentially request another one. Copying is hard because of the
 size, and locking, even if provided by the OS, may get in the way of the
 user's workflow.

 Dmitry

 On Thu, Jan 14, 2010 at 11:58 PM, Darin Fisher da...@chromium.org wrote:

 I don't think we should worry about underlying file changes.

 If the app wants to cut a file into parts and copy them separately, then
 perhaps the app should first copy the file into a private area.  (I'm
 presuming that one day, we'll have the concept of a chroot'd private file
 storage area for a web app.)

 I think we should avoid solutions that involve file locking since it is
 bad for the user (loss of control) if their files are locked by the browser
 on behalf of a web app.

 It might be reasonable, however, to lock a file while sending it.

 -Darin


 On Thu, Jan 14, 2010 at 2:41 PM, Jian Li jia...@chromium.org wrote:

 It seems that we feel that when a File object is sent via either a form or
 XHR, the latest underlying version should be used. When we get a slice via
 Blob.slice, we assume that the underlying file data is stable from that point on.

 So for the uploader scenario, we need to cut a big file into multiple pieces.
 With the current File API spec, we will have to do something like the following
 to make sure that all pieces are cut from a stable file.
 var file = myInputElement.files[0];
 var blob = file.slice(0, file.size);    // snapshot of the whole file
 var piece1 = blob.slice(0, 1000);       // bytes 0-999 (slice takes start, length)
 var piece2 = blob.slice(1000, 1000);    // bytes 1000-1999
 ...

 The above seems a bit ugly. If we want to make it clean, what Dmitry
 proposed above seems to be reasonable. But it would require a non-trivial
 spec change.


 On Wed, Jan 13, 2010 at 11:28 AM, Dmitry Titov dim...@chromium.org wrote:

 Atomic read is obviously a nice thing: it would be hard to program
 against an API where a single read operation can unpredictably return half
 of the old content and half of the new.

 On the same note, it would likely be very hard to program against Blob
 objects if they could change underneath unpredictably. Imagine that we need
 to build an uploader that cuts a big file into multiple pieces and sends those
 pieces to the server so they can be stitched together later. If the
 underlying file changes during this operation, all the pieces the Blobs
 refer to change with it (due to clamping and silent changes of content); the
 slicing/stitching assumptions become invalid, and it's hard even to notice,
 since the Blobs are simply 'clamped' silently. Some degree of mess is
 possible then.

 Another use case could be a JPEG image processor that uses slice() to cut
 the headers from the image file and then uses info from the headers to cut
 further JFIF fields from the file (reading EXIF and populating a local
 database of images, for example). Changing the file in the middle of that is
 bad.

 It seems the typical use cases that will need Blob.slice() functionality
 form 'units of work' where Blob.slice() is used with the likely assumption
 that the underlying data is stable and does not change silently. Such a
 'unit of work' should fail as a whole if the underlying file changes. One
 way to achieve that is to reliably fail operations with 'derived' Blobs, and
 perhaps even have an 'isValid' property on them. 'Derived' Blobs are those
 obtained via slice(), as opposed to 'original' Blobs that are also Files.

 One 

Re: File API: Blob and underlying file changes.

2010-01-15 Thread Jonas Sicking
On Fri, Jan 15, 2010 at 10:19 AM, Dmitry Titov dim...@chromium.org wrote:
 Nobody proposed locking the file. Sorry for being unclear if it sounded
 like that. Basically it's all about timestamps.
 As Chris proposed earlier, a read operation can grab the timestamp of the
 file before and after reading its content and throw an exception if the
 timestamps do not match. This is a pretty good approximation of an 'atomic'
 read: although it cannot guarantee success, it can at least detect failure
 reliably.

I don't understand how you intend to use the timestamp. Consider the
following scenario:

1. User drops a 10MB File onto a page.
2. Page requests to read the file using FileReader.readAsBinaryString
and installs a 'progress' event listener.
3. Implementation grabs the current timestamp and then starts reading the file.
4. After 2MB of data is read the implementation updates
FileReader.result with the partial read and fires a 'progress' event.
5. Page grabs the partial result and processes it.
6. After another 1MB of data is read, but before another 'progress'
event has been fired, the user modifies the file such that the
timestamp changes
7. The implementation detects that the timestamp has changed.

Now what?

You can't throw an exception since part of the file has already been
delivered. You could raise an error event, but that's unlikely to be
treated correctly by the page as this is a very rare condition and
hard to test for, so the page author has likely not written correct
code to deal with it. It's additionally not atomic since the read
started but was interrupted.
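
From the page's side, the scenario looks roughly like this. A minimal
sketch, assuming partial results are exposed in FileReader.result during
'progress' events (as in step 4 above) and using a hypothetical
handleChunk() callback:

    var reader = new FileReader();
    var consumed = 0;
    reader.onprogress = function () {
      // Process whatever has arrived so far; these bytes cannot be
      // recalled if the file changes later in the read.
      var chunk = reader.result.slice(consumed);
      consumed = reader.result.length;
      handleChunk(chunk);               // hypothetical application callback
    };
    reader.onerror = function () {
      // Even if the implementation detects a mid-read modification and
      // fires 'error', handleChunk() has already seen the earlier bytes.
    };
    reader.readAsBinaryString(file);    // 'file' from the drop in step 1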

/ Jonas



IndexedDB and MVCC

2010-01-15 Thread Chris Anderson
Hi,

I've been reading the new IndexedDB spec as published here:
http://www.w3.org/TR/IndexedDB/

My first impression is that this is simpler than WebSimpleDB, but not too
simple. I'm happy to see detached readers being mentioned.

There's one other piece of the concurrency story that could be useful.

In section 3.2.2, Object Store Storage steps:

step 7: If the no-overwrite flag was passed to these steps and is set,
and a record already exists with its key being key, then terminate
these steps and set error code CONSTRAINT_ERR.

I think it wouldn't add much complexity to use a compare-and-swap
pattern, instead of a no-write-if-exists pattern. This would allow for
better concurrency via optimistic updates, and look a lot like HTTP
etags.

It could be accomplished by allowing an object store to take a
key-path for the update-token. Then subsequent updates could require
that the key-path match. (Some additional complexity: we'd need the
ability to check for a matching update-token, then change it, in a
transaction).
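
A minimal sketch of that compare-and-swap pattern, written against the
standard event-based IndexedDB API; the 'docs' store name, the _rev
update-token field, and the onConflict callback are assumptions for
illustration:

    // The update succeeds only if the stored record's revision token still
    // matches the one the caller read earlier (like an HTTP If-Match).
    function casUpdate(db, doc, expectedRev, onDone, onConflict) {
      var tx = db.transaction('docs', 'readwrite');
      var store = tx.objectStore('docs');
      var get = store.get(doc.id);
      get.onsuccess = function () {
        var current = get.result;
        if (!current || current._rev !== expectedRev) {
          tx.abort();                 // token mismatch: lose the race cleanly
          onConflict(current);
          return;
        }
        doc._rev = expectedRev + 1;   // bump the update token
        store.put(doc);
      };
      tx.oncomplete = function () { onDone(doc._rev); };
    }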

CouchDB uses an MVCC token that must match to allow updates. This
allows us to avoid locking. But even more important are the parallels
with HTTP ETags (If-Match for idempotence, If-None-Match for
caching).

The CouchDB style of MVCC can be accomplished by updates in a
compare-and-swap transaction, so technically I can do what I want in
the spec as it stands. But I still think the parallels to HTTP ETags
can be instructive.

Chris


-- 
Chris Anderson
http://jchrisa.net
http://couch.io



Re: File API: Blob and underlying file changes.

2010-01-15 Thread Jonas Sicking
On Fri, Jan 15, 2010 at 11:42 AM, Dmitry Titov dim...@chromium.org wrote:


 On Fri, Jan 15, 2010 at 10:36 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Fri, Jan 15, 2010 at 10:19 AM, Dmitry Titov dim...@chromium.org
 wrote:
  Nobody proposed locking the file. Sorry for being unclear if it sounded
  like that. Basically it's all about timestamps.
  As Chris proposed earlier, a read operation can grab the timestamp of the
  file before and after reading its content and throw an exception if the
  timestamps do not match. This is a pretty good approximation of an
  'atomic' read: although it cannot guarantee success, it can at least
  detect failure reliably.

 I don't understand how you intend to use the timestamp. Consider the
 following scenario:

 1. User drops a 10MB File onto a page.
 2. Page requests to read the file using FileReader.readAsBinaryString
 and installs a 'progress' event listener.
 3. Implementation grabs the current timestamp and then starts reading
 the file.
 4. After 2MB of data is read the implementation updates
 FileReader.result with the partial read and fires a 'progress' event.
 5. Page grabs the partial result and processes it.
 6. After another 1MB of data is read, but before another 'progress'
 event has been fired, the user modifies the file such that the
 timestamp changes
 7. The implementation detects that the timestamp has changed.

 Now what?

 You can't throw an exception since part of the file has already been
 delivered. You could raise an error event, but that's unlikely to be
 treated correctly by the page as this is a very rare condition and
 hard to test for, so the page author has likely not written correct
 code to deal with it.

 FileReader has both 'error' and 'abort' events, in addition to 'progress'.
 It seems we can just use those? There is always a possibility that an async
 operation that comes with partial results may fail as a whole; the only
 real way to ensure its atomicity would be to reliably lock the file and/or
 make a copy, which, as this thread indicates, are both not always possible.
 So yeah, in case the FileReader has returned 2MB and the file suddenly
 changes to be only 1MB, the next event the page should get is 'error'.
 What other possibility would there be?

This doesn't address the problem that authors are unlikely to even
attempt to deal with this situation, given how rare it is. And even
less likely to deal with it successfully, given how hard the situation
is to reproduce while testing.

/ Jonas