Re: File API: Blob and underlying file changes.
I don't think we should worry about underlying file changes. If the app
wants to cut a file into parts and copy them separately, then perhaps the
app should first copy the file into a private area. (I'm presuming that
one day we'll have the concept of a chroot'd private file storage area for
a web app.)

I think we should avoid solutions that involve file locking, since it is
bad for the user (loss of control) if their files are locked by the
browser on behalf of a web app. It might be reasonable, however, to lock a
file while sending it.

-Darin

On Thu, Jan 14, 2010 at 2:41 PM, Jian Li jia...@chromium.org wrote:
> It seems that we feel that when a File object is sent via either Form or
> XHR, the latest underlying version should be used. When we get a slice
> via Blob.slice, we assume that the underlying file data is stable from
> then on.
>
> So for the uploader scenario, we need to cut a big file into multiple
> pieces. With the current File API spec, we would have to do something
> like the following to make sure that all pieces are cut from a stable
> file:
>
>   var file = myInputElement.files[0];
>   var blob = file.slice(0, file.size);
>   var piece1 = blob.slice(0, 1000);
>   var piece2 = blob.slice(1000, 1000);
>   ...
>
> The above seems a bit ugly. If we want to make it clean, what Dmitry
> proposed above seems reasonable. But it would require a non-trivial spec
> change.
>
> On Wed, Jan 13, 2010 at 11:28 AM, Dmitry Titov dim...@chromium.org wrote:
>> Atomic read is obviously a nice thing - it would be hard to program
>> against an API that behaves as unpredictably as a single read operation
>> that returns half of the old content and half of the new content. By
>> the same token, it would likely be very hard to program against Blob
>> objects if they could change underneath unpredictably.
>>
>> Imagine that we need to build an uploader that cuts a big file into
>> multiple pieces and sends those pieces to the servers so they can be
>> stitched together later. If during this operation the underlying file
>> changes, and this changes all the pieces that the Blobs refer to (due
>> to clamping and silent change of content), all the slicing/stitching
>> assumptions become invalid, and that is hard to even notice, since the
>> blobs are simply 'clamped' silently. Some degree of mess is possible
>> then.
>>
>> Another use case could be a JPEG image processor that uses slice() to
>> cut the headers from the image file and then uses info from the headers
>> to cut further JFIF fields from the file (reading EXIF and populating a
>> local database of images, for example). Changing the file in the middle
>> of that is bad.
>>
>> It seems the typical use cases that need Blob.slice() functionality
>> form 'units of work', where Blob.slice() is used on the assumption that
>> the underlying data is stable and does not change silently. Such a
>> 'unit of work' should fail as a whole if the underlying file changes.
>> One way to achieve that is to reliably fail operations on 'derived'
>> Blobs, and perhaps even have an 'isValid' property on them. 'Derived'
>> Blobs are those obtained via slice(), as opposed to 'original' Blobs
>> that are also Files.
>>
>> One disadvantage of this approach is that it implies that the same Blob
>> has two possible behaviors - when it is obtained via Blob.slice() (or
>> other methods) vs. when it is a File. It would all be a bit cleaner if
>> File did not derive from Blob but instead had a getAsBlob() method -
>> then it would be possible to say that Blobs are always immutable but
>> may become 'invalid' over time if the underlying data changes. The
>> FileReader could then be just a BlobReader, with cleaner semantics. If
>> that were the case, then xhr.send(file) would capture the state of the
>> file at the moment of sending, while xhr.send(blob) would fail with an
>> exception if the blob is 'invalid' at the moment of the send()
>> operation. This would keep compatibility with current behavior and
>> avoid the dual Blob behavior. Quite a change to the spec, though...
>>
>> Dmitry
>>
>> On Wed, Jan 13, 2010 at 2:38 AM, Jonas Sicking jo...@sicking.cc wrote:
>>> On Tue, Jan 12, 2010 at 5:28 PM, Chris Prince cpri...@google.com wrote:
>>>>> For the record, I'd like to make the read atomic, such that you can
>>>>> never get half a file before a change, and half after. But it likely
>>>>> depends on what OSes can enforce here.
>>>>
>>>> I think *enforcing* atomicity is difficult across all OSes. But
>>>> implementations can get nearly the same effect by checking the file's
>>>> last modification time at the start + end of the API call. If it has
>>>> changed, the read operation can throw an exception.
>>>
>>> I'm talking about during the actual read. I.e. not related to the
>>> lifetime of the File object, just to the time between the first
>>> 'progress' event and the 'loadend' event. If the file changes during
>>> this time, there is no way to fake atomicity, since the partial file
>>> has already been returned.
>>>
>>> / Jonas
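The manual slicing in Jian Li's example can be factored into a small
helper. The sketch below is an editorial illustration, not code from the
thread: it assumes the draft-era slice(start, length) signature, and the
names computeChunks and sliceFile are hypothetical.

```javascript
// Compute (start, length) pairs covering `size` bytes in chunks of at
// most `chunkSize` bytes. Pure arithmetic, so it is easy to test.
function computeChunks(size, chunkSize) {
  var chunks = [];
  for (var start = 0; start < size; start += chunkSize) {
    chunks.push({ start: start, length: Math.min(chunkSize, size - start) });
  }
  return chunks;
}

// Take one snapshot of the file's full range, then derive every piece
// from that snapshot, per the pattern in the quoted message. Assumes the
// draft's slice(start, length) semantics.
function sliceFile(file, chunkSize) {
  var blob = file.slice(0, file.size);
  return computeChunks(file.size, chunkSize).map(function (c) {
    return blob.slice(c.start, c.length);
  });
}
```

For a 2500-byte file with 1000-byte chunks, computeChunks produces the
ranges (0, 1000), (1000, 1000), and (2000, 500) - covering the file with
no gaps or overlaps.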
Re: File API: Blob and underlying file changes.
On Thu, Jan 14, 2010 at 11:58 PM, Darin Fisher da...@chromium.org wrote:
> I don't think we should worry about underlying file changes. If the app
> wants to cut a file into parts and copy them separately, then perhaps
> the app should first copy the file into a private area. (I'm presuming
> that one day we'll have the concept of a chroot'd private file storage
> area for a web app.)
>
> I think we should avoid solutions that involve file locking, since it is
> bad for the user (loss of control) if their files are locked by the
> browser on behalf of a web app. It might be reasonable, however, to lock
> a file while sending it.

I largely agree. Though I think it'd be reasonable to lock the file while
reading it too.

/ Jonas

> On Thu, Jan 14, 2010 at 2:41 PM, Jian Li jia...@chromium.org wrote:
>> [...]
Restart the group
Hi there,

I just found out that this group is very interesting and that it stopped
for several reasons in 2007. We are in 2010 now, and I think the momentum
is there for declarative application development. We have a proposal and
want to turn it into an open specification.

The question: how do we continue from here? Is it possible to get people
from certain groups/companies to join?

Hope to hear from you soon.

Kind regards,
Rokesh Jankie
---
QAFE Enterprise Applications Made Easy
Website: http://www.qafe.com/
LinkedIn: http://www.linkedin.com/in/rjankie
Twitter: http://twitter.com/qafe
Company: http://www.qualogy.com
Youtube: http://youtube.com/qafechannel
Google profile: http://www.google.com/profiles/qafeframework
Re: Restart the group
I'm sorry for my mistake then. I would like to participate. Nice to hear
that the group is very much alive and active. Thanks for the note.

Regards,
Rokesh Jankie
---
QAFE powered by experience, quality
Website: http://www.qafe.com/
Youtube: http://youtube.com/qafechannel
LinkedIn: http://www.linkedin.com/groups?gid=134874

Sent from Voorburg, ZH, Netherlands

On Fri, Jan 15, 2010 at 14:21, Lachlan Hunt lachlan.h...@lachy.id.au wrote:
> Rokesh Jankie wrote:
>> I just found out that this group is very interesting and it stopped
>> because of several reasons in 2007.
>
> You must be mistaken. This group is very much alive and active. While
> the Web API WG charter ended in 2007, the Web Apps group was rechartered
> [1], merging the Web API and App Formats WGs.
>
> [1] http://www.w3.org/2008/webapps/charter/
>
> --
> Lachlan Hunt - Opera Software
> http://lachy.id.au/
> http://www.opera.com/
Re: Restart the group
Rokesh,

On Jan 15, 2010, at 6:43 AM, ext Rokesh Jankie wrote:
> I just found out that this group is very interesting and it stopped
> because of several reasons in 2007.

Perhaps you are thinking of the Web Applications Format (WAF) WG and the
Web API WG, which both ended in 2008 (and, as Lachlan indicated, they were
merged to form the Web Applications WG).

> We are in 2010 now and I think the momentum is there for declarative
> application development. We have a proposal and want to make it to an
> open specification. The question: how to continue from here? Is it
> possible to trigger people from certain groups/companies to join?

FYI, in 2006 the WAF WG began work on a declarative format for
applications and user interfaces (aka DFAUI) spec, and that work ended
with the publication of the following Working Group Note in September
2007:

Declarative Formats for Applications and User Interfaces
http://www.w3.org/TR/dfaui/

-Art Barstow
Re: File API: Blob and underlying file changes.
On Fri, Jan 15, 2010 at 10:19 AM, Dmitry Titov dim...@chromium.org wrote:
> Nobody proposed locking the file. Sorry for being unclear if it sounded
> like that. Basically it's all about timestamps. As Chris proposed
> earlier, a read operation can grab the timestamp of the file before and
> after reading its content and throw an exception if the timestamps do
> not match. This is a pretty good approximation of an 'atomic' read -
> although it cannot guarantee success, it can at least provide reliable
> detection of failure.

but doesn't that imply some degree of unpredictability for web developers?
must they always handle that exception even though it is an extremely rare
occurrence? also, what about normal form submission, in which the file
reading happens asynchronously to form.submit()?

> Same thing with the Blob - slice() may capture the timestamp of the
> content it's based on. The Blob can throw an exception later if the
> modification timestamp of the underlying data has changed since the time
> of the Blob's creation.

also note that we MUST NOT design APIs that involve synchronous file
access. no stat calls allowed on the main UI thread please! (remember the
network filesystem case.) in other words, assuming detection of file
changes happens asynchronously, we'll have trouble producing exceptions as
you describe.

> Both actual OS locking and requiring files to be copied to a safe
> location before slice() seem problematic, for different reasons. A good
> example is a YouTube uploader that needs to slice and send a 1GB file
> while having a way to reliably detect a change to the underlying file,
> terminate the current upload, and potentially request another one.
> Copying is hard because of the size, and locking, even if provided by
> the OS, may get in the way of the user's workflow.
>
> Dmitry
>
> On Thu, Jan 14, 2010 at 11:58 PM, Darin Fisher da...@chromium.org wrote:
>> [...]
Re: File API: Blob and underlying file changes.
On Fri, Jan 15, 2010 at 10:19 AM, Dmitry Titov dim...@chromium.org wrote:
> Nobody proposed locking the file. Sorry for being unclear if it sounded
> like that. Basically it's all about timestamps. As Chris proposed
> earlier, a read operation can grab the timestamp of the file before and
> after reading its content and throw an exception if the timestamps do
> not match. This is a pretty good approximation of an 'atomic' read -
> although it cannot guarantee success, it can at least provide reliable
> detection of failure.

I don't understand how you intend to use the timestamp. Consider the
following scenario:

1. User drops a 10MB File onto a page.
2. Page requests to read the file using FileReader.readAsBinaryString and
   installs a 'progress' event listener.
3. Implementation grabs the current timestamp and then starts reading the
   file.
4. After 2MB of data is read, the implementation updates FileReader.result
   with the partial read and fires a 'progress' event.
5. Page grabs the partial result and processes it.
6. After another 1MB of data is read, but before another 'progress' event
   has been fired, the user modifies the file such that the timestamp
   changes.
7. The implementation detects that the timestamp has changed. Now what?

You can't throw an exception, since part of the file has already been
delivered. You could raise an error event, but that's unlikely to be
treated correctly by the page, as this is a very rare condition and hard
to test for, so the page author has likely not written correct code to
deal with it. It's additionally not atomic, since the read started but was
interrupted.

/ Jonas
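From the page author's side, the scenario above suggests one defensive
policy: buffer 'progress' payloads and commit them only when the read
completes, discarding everything if an error arrives. The helper below
simulates that policy over a recorded event sequence; it is an editorial
sketch (consumeReadEvents and the event shapes are illustrative, not any
spec's API), and buffering of course gives up incremental processing of
partial results.

```javascript
// Replay a FileReader-style event sequence: accumulate 'progress'
// payloads, commit them on 'load', and drop everything on 'error' so
// that no partial, possibly-torn data is ever acted upon.
function consumeReadEvents(events) {
  var buffered = [];
  for (var i = 0; i < events.length; i++) {
    var e = events[i];
    if (e.type === 'progress') {
      buffered.push(e.data);
    } else if (e.type === 'load') {
      return { ok: true, data: buffered.join('') };
    } else if (e.type === 'error') {
      return { ok: false, data: null };  // discard partial results
    }
  }
  return { ok: false, data: null };      // read never completed
}
```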
IndexedDB and MVCC
Hi,

I've been reading the new IndexedDB spec as published here:
http://www.w3.org/TR/IndexedDB/

My first impression is that this is simpler than WebSimpleDB, but not too
simple. I'm happy to see detached readers being mentioned.

There's one other piece of the concurrency story that could be useful. In
section 3.2.2 "Object Store Storage steps", step 7 reads:

  If the no-overwrite flag was passed to these steps and is set, and a
  record already exists with its key being key, then terminate these
  steps and set error code CONSTRAINT_ERR.

I think it wouldn't add much complexity to use a compare-and-swap pattern
instead of a no-write-if-exists pattern. This would allow for better
concurrency via optimistic updates, and look a lot like HTTP ETags. It
could be accomplished by allowing an object store to take a key-path for
the update-token. Subsequent updates could then require that the key-path
match. (Some additional complexity: we'd need the ability to check for a
matching update-token, then change it, in a transaction.)

CouchDB uses an MVCC token that must match for updates to be allowed. This
allows us to avoid locking. But even more important are the parallels with
HTTP ETags (If-Match for idempotence, If-None-Match for caching).

The CouchDB style of MVCC can be accomplished by doing updates in a
compare-and-swap transaction, so technically I can do what I want with the
spec as it stands. But I still think the parallels to HTTP ETags can be
instructive.

Chris

--
Chris Anderson
http://jchrisa.net
http://couch.io
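Chris's compare-and-swap suggestion can be sketched against an in-memory
stand-in for an object store. Everything below is a hypothetical
illustration (casPut, the _rev token field, and the Map store are not
IndexedDB API): a write succeeds only when the caller presents the
record's current update-token, mirroring an HTTP If-Match conditional
request.

```javascript
// In-memory stand-in for an object store: a Map from primary key to a
// record carrying an update-token (`_rev` here, echoing CouchDB's MVCC).
function casPut(store, key, value, expectedRev) {
  var existing = store.get(key);
  if (!existing) {
    // Creation: caller must assert "no record exists" by passing null,
    // analogous to If-None-Match.
    if (expectedRev != null) return { ok: false, error: 'CONSTRAINT_ERR' };
  } else if (existing._rev !== expectedRev) {
    // Stale token: someone else updated the record since we read it.
    return { ok: false, error: 'CONSTRAINT_ERR' };
  }
  var rev = existing ? existing._rev + 1 : 1;
  store.set(key, { _rev: rev, value: value });
  return { ok: true, rev: rev };
}
```

A second writer holding a stale token gets CONSTRAINT_ERR instead of
silently clobbering the first write, which is the optimistic-concurrency
behavior the ETag comparison points at.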
Re: File API: Blob and underlying file changes.
On Fri, Jan 15, 2010 at 11:42 AM, Dmitry Titov dim...@chromium.org wrote:
> On Fri, Jan 15, 2010 at 10:36 AM, Jonas Sicking jo...@sicking.cc wrote:
>> On Fri, Jan 15, 2010 at 10:19 AM, Dmitry Titov dim...@chromium.org wrote:
>>> [...]
>>
>> I don't understand how you intend to use the timestamp. Consider the
>> following scenario: [...]
>>
>> 7. The implementation detects that the timestamp has changed. Now what?
>>
>> You can't throw an exception, since part of the file has already been
>> delivered. You could raise an error event, but that's unlikely to be
>> treated correctly by the page, as this is a very rare condition and
>> hard to test for, so the page author has likely not written correct
>> code to deal with it.
>
> FileReader has both 'error' and 'abort' events, in addition to
> 'progress'. It seems we can just use those?
>
> There is always a possibility that an async operation that comes with
> partial results may fail as a whole - the only real way to ensure its
> atomicity would be to reliably lock the file and/or make a copy, which,
> as this thread indicates, are both not always possible. So yes, in case
> the FileReader returned 2MB and the file suddenly changed to be only
> 1MB, the next event the page should get is 'error'. What would be the
> alternative?

This doesn't address the problem that authors are unlikely to even attempt
to deal with this situation, given how rare it is. And even less likely to
deal with it successfully, given how hard the situation is to reproduce
while testing.

/ Jonas