[whatwg] [selectors4] drag-and-drop pseudo-classes

2012-08-14 Thread fantasai

The CSSWG discussed drag-and-drop pseudo-classes today. The current
proposal is to have three pseudo-classes:

  * One for the element representing the drop target that
    would receive the item if it were dropped.
  * One for all elements representing possible drop targets
    that could receive the item.
  * One for all elements representing drop targets that do
    not accept this type of item.

We'd like comments on
  a) whether this is a correct and useful set of pseudo-classes
  b) what these pseudo-classes should be called, so that the names
     best (most clearly and succinctly) represent their functionality
     to authors using them

Name sets being considered:

  Set A          Set B       Set C            Set D

  :active-drop   :drop       :current-drop    :active-drop
  :drop          :can-drop   :valid-drop      :valid-drop
  :no-drop       :no-drop    :invalid-drop    :invalid-drop

Thanks~
~fantasai


Re: [whatwg] [selectors4] drag-and-drop pseudo-classes

2012-08-14 Thread Ryosuke Niwa
On Mon, Aug 13, 2012 at 9:19 PM, fantasai fantasai.li...@inkedblade.net wrote:

 The CSSWG discussed drag-and-drop pseudo-classes today. The current
 proposal is to have three pseudo-classes:

   * One for the element representing the drop target that
 would receive the item if it were dropped.
   * One for all elements representing possible drop targets
 that could receive the item.


How do we find these elements? On one hand, if we're only supporting the
dropzone attribute, then adding a new pseudo-element seems unnecessary. On
the other hand, I can't think of a way to detect whether an element could
return false or prevent the default action on dragover/dragenter events
without firing those events.

- Ryosuke


Re: [whatwg] Was it considered to use JSON-LD instead of creating application/microdata+json?

2012-08-14 Thread Henri Sivonen
On Fri, Aug 10, 2012 at 1:39 PM, Markus Lanthaler
markus.lantha...@gmx.net wrote:
  Well, I would say there are several advantages. First of all, JSON-LD is
  more flexible and expressive.

 More flexible and expressive than what?

 Than application/microdata+json.

That's a problem right there. It means that JSON-LD requires more
consumer complexity than application/microdata+json.

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/


Re: [whatwg] StringEncoding: Allowed encodings for TextEncoder

2012-08-14 Thread Simon Pieters
On Thu, 09 Aug 2012 19:42:07 +0200, Joshua Bell jsb...@chromium.org wrote:

 http://wiki.whatwg.org/wiki/StringEncoding has been updated to restrict the
 supported encodings for encoding to UTF-8, UTF-16 and UTF-16BE.

 I'm tempted to take it further to just UTF-8 and see if anyone complains.

I was going to suggest doing so. We've gone UTF-8-only for new features
(workers, webvtt, appcache manifest, etc). The Encoding spec says "New
content and formats must exclusively use the utf-8 encoding." Is there a
use case for utf-16/utf-16be?


--
Simon Pieters
Opera Software


Re: [whatwg] StringEncoding: Allowed encodings for TextEncoder

2012-08-14 Thread Jonas Sicking
I think the main reason would be if there are modern formats which use
UTF16 which we want to allow people to create documents in. I asked on
twitter for such formats and got some responses:

https://twitter.com/SickingJ/status/234060964058763264

/ Jonas



Re: [whatwg] StringEncoding open issues

2012-08-14 Thread Joshua Bell
On Mon, Aug 6, 2012 at 5:06 PM, Glenn Maynard gl...@zewt.org wrote:

 I agree with Jonas that encoding should just use a replacement character
 (U+FFFD for Unicode encodings, '?' otherwise), and that we should put off
 other modes (eg. exceptions and user-specified replacement characters)
 until there's a clear need.

 My intuition is that encoding DOMString to UTF-16 should never have errors;
 if there are dangling surrogates, pass them through unchanged.  There's no
 point in using a placeholder that says "an error occurred here", when the
 error can be passed through in exactly the same form (not possible with eg.
 DOMString->SJIS).  I don't feel strongly about this only because outputting
 UTF-16 is so rare to begin with.
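
 (Illustration of the pass-through idea above: a minimal sketch, not part of
 the proposed API. The function name encodeUTF16 is made up for this example.
 It writes each 16-bit code unit of the string unchanged, so a dangling
 surrogate comes out as the same two bytes it went in as.)

```javascript
// Sketch: encode a JavaScript string (a sequence of UTF-16 code units) as
// UTF-16LE or UTF-16BE bytes. Each code unit is copied as-is, so a dangling
// surrogate is passed through rather than replaced with a placeholder.
// encodeUTF16 is a hypothetical helper, not an API from the proposal.
function encodeUTF16(str, littleEndian = true) {
  const bytes = new Uint8Array(str.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < str.length; i++) {
    view.setUint16(i * 2, str.charCodeAt(i), littleEndian);
  }
  return bytes;
}

// "a" followed by a lone high surrogate (U+D800).
const encoded = encodeUTF16("a\uD800");
console.log(Array.from(encoded)); // bytes 0x61 0x00 0x00 0xD8
```

 Decoding those bytes as UTF-16LE yields the original code units again,
 which is the sense in which no error needs to be reported.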

 On Mon, Aug 6, 2012 at 1:29 PM, Joshua Bell jsb...@chromium.org wrote:

  - if the document is encoded in UTF-8, UTF-16LE or UTF-16BE and includes
  the byte order mark (the encoding-specific serialization of U+FEFF).


 This rarely detects the wrong type, but that doesn't mean it's not the
 wrong answer.  If my input is meant to be UTF-8, and someone hands me
 BOM-marked UTF-16, I want it to fail in the same way it would if someone
 passed in SJIS.  I don't want it silently translated.

 On the other hand, it probably does make sense for UTF-16 to switch to
 UTF-16BE, since that's by definition the original purpose of the BOM.

 The convention iconv uses, which I think is a useful one, is decoding from
 "UTF-16" means "try to figure out the encoding from the BOM, if any", and
 "UTF-16LE" and "UTF-16BE" mean "always use this exact encoding".


Let me take a crack at making this into an algorithm:

In the TextDecoder constructor:

   - If encoding is not specified, set an internal useBOM flag.
   - If encoding is specified and is a case-insensitive match for "utf-16",
     set an internal useBOM flag.

NOTE: This means if "utf-8", "utf-16le" or "utf-16be" is explicitly
specified the flag is not set.

When decode() is called:

   - If useBOM is set and the stream offset is 0, then
      - If there are not enough bytes to test for a BOM then return without
        emitting anything (NOTE: if not streaming, an EOF byte would be
        present in the stream, which would be a negative match for a BOM).
      - If encoding is "utf-16" and the first bytes match 0xFF 0xFE or 0xFE
        0xFF then set current encoding to "utf-16le" or "utf-16be"
        respectively and advance the stream past the BOM. The current
        encoding is used until the stream is reset.
      - Otherwise, if the first bytes match 0xFF 0xFE, 0xFE 0xFF, or 0xEF
        0xBB 0xBF then set current encoding to "utf-16le", "utf-16be" or
        "utf-8" respectively and advance the stream past the BOM. The
        current encoding is used until the stream is reset.
   - Otherwise, if useBOM is not set and the stream offset is 0, then if the
     encoding is "utf-8", "utf-16le" or "utf-16be":
      - If the first bytes match 0xFF 0xFE, 0xFE 0xFF, or 0xEF 0xBB 0xBF
        then let detected encoding be "utf-16le", "utf-16be" or "utf-8"
        respectively. If the detected encoding matches the object's
        encoding, advance the stream past the BOM. Otherwise, if the fatal
        flag is set then throw an EncodingError DOMException. Otherwise, the
        decoding algorithm proceeds.
      - If there are not enough bytes to test for a BOM then return without
        emitting anything (NOTE: if not streaming, an EOF byte would be
        inserted, which would be a negative match for a BOM).
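
The steps above can be sketched roughly as follows. This is only an
illustration under simplifying assumptions: the helper names sniffBOM and
resolveEncoding are invented here, a plain Error stands in for the
EncodingError DOMException, and the streaming case (not enough bytes yet to
test for a BOM) is left out.

```javascript
// Known BOMs as byte sequences, keyed by encoding label.
const BOMS = [
  { encoding: "utf-8", bytes: [0xef, 0xbb, 0xbf] },
  { encoding: "utf-16le", bytes: [0xff, 0xfe] },
  { encoding: "utf-16be", bytes: [0xfe, 0xff] },
];

// Return the BOM entry that prefixes `bytes`, or null if there is none.
function sniffBOM(bytes) {
  for (const bom of BOMS) {
    if (bom.bytes.every((b, i) => bytes[i] === b)) return bom;
  }
  return null;
}

// Decide which encoding to decode with and how many leading bytes to skip.
// `label` is the encoding passed to the constructor (undefined if omitted).
function resolveEncoding(bytes, label, fatal = false) {
  const bom = sniffBOM(bytes);
  const encoding = label === undefined ? undefined : label.toLowerCase();

  if (encoding === undefined) {
    // useBOM set, no label: any BOM wins, otherwise default to UTF-8.
    return bom
      ? { encoding: bom.encoding, skip: bom.bytes.length }
      : { encoding: "utf-8", skip: 0 };
  }
  if (encoding === "utf-16") {
    // useBOM set, generic UTF-16: only a UTF-16 BOM is honoured.
    if (bom && bom.encoding !== "utf-8") {
      return { encoding: bom.encoding, skip: bom.bytes.length };
    }
    return { encoding: "utf-16le", skip: 0 };
  }
  // BOM checks only apply to the three exact UTF labels.
  if (!BOMS.some((b) => b.encoding === encoding)) {
    return { encoding, skip: 0 };
  }
  // Exact label: consume a matching BOM, reject a mismatching one if fatal.
  if (bom && bom.encoding === encoding) {
    return { encoding, skip: bom.bytes.length };
  }
  if (bom && fatal) {
    throw new Error("EncodingError: BOM does not match " + encoding);
  }
  return { encoding, skip: 0 };
}
```

For example, resolveEncoding([0xFE, 0xFF, 0x00, 0x41], "UTF-16") selects
"utf-16be" and skips two bytes, while the same bytes with "utf-8" and the
fatal flag set throw.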

Working the current encoding switcheroo into the spec will require some
refactoring, so trying to get consensus here first.

In English:

   - Create a decoder with TextDecoder() and, if present, a BOM will be
     respected (and consumed); otherwise default to UTF-8.
   - Create a decoder with TextDecoder("utf-16") and either a UTF-16LE or
     UTF-16BE BOM will be respected (and consumed); otherwise default to
     UTF-16LE (which may decode garbage if a UTF-8 BOM or other non-UTF-16
     data is present).
   - Create a decoder with TextDecoder("utf-8", {fatal:true}),
     TextDecoder("utf-16le", {fatal:true}) or
     TextDecoder("utf-16be", {fatal:true}) and a matching BOM will be
     consumed; a mismatching BOM will throw an EncodingError.
   - Create a decoder with TextDecoder("utf-8"), TextDecoder("utf-16le") or
     TextDecoder("utf-16be") and a matching BOM will be consumed; a
     mismatching BOM will be blithely decoded (probably giving you
     replacement characters), without throwing.

 * If one of the UTF encodings is specified AND the BOM matches then the
   leading BOM character (U+FEFF) MUST NOT be emitted in the output
   character sequence (i.e. it is silently consumed)
 

 It's a little weird that

 data = readFile("user-supplied-file.txt"); // shortcutting for brevity
 var s = new TextDecoder("utf-16").decode(data); // or "utf-8"
 s = s.replace("a", "b");
 var data2 = new TextEncoder("utf-16").encode(s);
 writeFile("user-supplied-file.txt", data2);

 causes the BOM to be quietly stripped away.  Normally if you're modifying a
 file, you want to pass through the BOM (or lack 

Re: [whatwg] [selectors4] drag-and-drop pseudo-classes

2012-08-14 Thread Tab Atkins Jr.
On Mon, Aug 13, 2012 at 11:55 PM, Ryosuke Niwa rn...@webkit.org wrote:
 On Mon, Aug 13, 2012 at 9:19 PM, fantasai fantasai.li...@inkedblade.net wrote:
 The CSSWG discussed drag-and-drop pseudo-classes today. The current
 proposal is to have three pseudo-classes:

   * One for the element representing the drop target that
 would receive the item if it were dropped.
   * One for all elements representing possible drop targets
 that could receive the item.

 How do we find these elements? On one hand, if we're only supporting
 dropzone attribute, then adding new pseudo element seems unnecessary. On
 the other hand, I can't think of ways to detect whether an element could
 return false or prevents the default action on dragover/dragenter events
 without firing those events.

Just using [dropzone], yes.

We're not adding a pseudo-element, we're adding pseudo-classes.

I'm not sure how we can possibly do these without pseudo-classes.  Can
you outline what you think it would be?  We have to (a) only trigger
these pseudo-classes while a drag is happening, and (b) trigger the
valid/invalid distinction based on what type was declared in JS or
assumed by OS-level data for the dragged thing.

As well, the pseudo that matches "the drop target that will be used if
you dropped right now" might not be expressible in pure CSS even given
the above.  It's probably equivalent to "when you :hover it", but
there are applications that basically have this functionality that
work differently - for example, I think that the built-in Windows
solitaire game highlights the closest drop target to the current mouse
pointer, even if you're nowhere near the actual drop zone.

~TJ


Re: [whatwg] [selectors4] drag-and-drop pseudo-classes

2012-08-14 Thread Tab Atkins Jr.
On Tue, Aug 14, 2012 at 3:03 AM, Sebastian Zartner
sebastianzart...@gmail.com wrote:
   * One for all elements representing possible drop targets
 that could receive the item.
   * One for all elements representing drop targets that do
 not accept this type of item.

 This sounds like these two pseudo-classes would do exactly the opposite. So
 why not use :not() for this case?

Nope, the distinction is similar to :valid/:invalid - usually, most
elements will match neither, because they're not drop targets at all,
so they can't be a valid drop target *or* an invalid one.

~TJ


Re: [whatwg] Archive API - proposal

2012-08-14 Thread Jonas Sicking
On Tue, Jul 17, 2012 at 7:23 PM, Andrea Marchesini b...@mozilla.com wrote:
 Hi All,

 I would like to propose a new JavaScript/web API that provides the ability to
 read the content of an archive file through DOMFile objects.
 I have started to work on this API because it was requested during a
 Mozilla Game Meeting by game developers, who often use ZIP files as a storage
 system.

 What I'm describing is a read-only and asynchronous API built on top of
 FileAPI ( http://dev.w3.org/2006/webapi/FileAPI/ ).

 Here is a draft written in WebIDL:

 interface ArchiveRequest : DOMRequest
 {
   // this is the ArchiveReader:
   readonly attribute nsIDOMArchiveReader reader;
 };

 [Constructor(Blob blob)]
 interface ArchiveReader
 {
   // any method is supposed to be asynchronous

   // The ArchiveRequest.result is an array of strings (the filenames)
   ArchiveRequest getFilenames();

   // The ArchiveRequest.result is a DOMFile
   // (http://dev.w3.org/2006/webapi/FileAPI/#dfn-file)
   ArchiveRequest getFile(DOMString filename);
 };

 Here is an example of how to use it:

 function startRead() {
   // Starting from an <input type="file" id="file" />:
   var file = document.getElementById('file').files[0];

   if (file.type != 'application/zip') {
     alert("This archive format is not supported");
     return;
   }

   // The ArchiveReader object works with Blob objects:
   var archiveReader = new ArchiveReader(file);

   // Any request is asynchronous:
   var handler = archiveReader.getFilenames();
   handler.onsuccess = getFilenamesSuccess;
   handler.onerror = errorHandler;

   // Multiple requests can run at the same time:
   var handler2 = archiveReader.getFile("levels/1.txt");
   handler2.onsuccess = getFileSuccess;
   handler2.onerror = errorHandler;
 }

 // The getFilenames handler receives a list of DOMString:
 function getFilenamesSuccess() {
   for (var i = 0; i < this.result.length; ++i) {
     /* this.reader is the ArchiveReader:
     var handle = this.reader.getFile(this.result[i]);
     handle.onsuccess = ...
     */
   }
 }

 // The getFile handler receives a File/Blob object (and it can be used with
 // FileReader):
 function getFileSuccess() {
   var reader = new FileReader();
   reader.readAsText(this.result);
   reader.onload = function(event) {
     // alert(event.target.result);
   }
 }

 function errorHandler() {
   // ...
 }

 I would like to receive feedback about this. In particular:
 . Do you think it can be useful?
 . Do you see any limitation, any feature missing?

FWIW, this API is now available in Firefox nightly builds. It's
currently on track to ship in Firefox 17. Feedback would still be
greatly appreciated!

/ Jonas


Re: [whatwg] [selectors4] drag-and-drop pseudo-classes

2012-08-14 Thread fantasai

On 08/13/2012 11:55 PM, Ryosuke Niwa wrote:

On Mon, Aug 13, 2012 at 9:19 PM, fantasai fantasai.li...@inkedblade.net wrote:

The CSSWG discussed drag-and-drop pseudo-classes today. The current
proposal is to have three pseudo-classes:

   * One for the element representing the drop target that
 would receive the item if it were dropped.
   * One for all elements representing possible drop targets
 that could receive the item.

How do we find these elements? On one hand, if we're only supporting dropzone 
attribute, then adding new pseudo element seems
unnecessary. On the other hand, I can't think of ways to detect whether an 
element could return false or prevents the default
action on dragover/dragenter events without firing those events.


I don't know. I'm just going on what was asked for in the following thread. :)
  http://lists.w3.org/Archives/Public/www-style/2011Sep/0402.html

The spec prose so far is this:
  http://dev.w3.org/csswg/selectors4/#drag-pseudos
The definition is pretty generic; I'm happy to add details on how
exactly it should work with HTML, if someone can provide them.

~fantasai


Re: [whatwg] [selectors4] drag-and-drop pseudo-classes

2012-08-14 Thread fantasai

On 08/14/2012 03:03 AM, Sebastian Zartner wrote:

   * One for all elements representing possible drop targets
 that could receive the item.
   * One for all elements representing drop targets that do
 not accept this type of item.

This sounds like these two pseudo-classes would do exactly the opposite.
So why not use :not() for this case?


That question was asked and answered here:
  http://lists.w3.org/Archives/Public/www-style/2011Sep/0417.html

Apparently an invalid dropzone is one that accepts drops, but not of this
type, so it's not exactly the same as :not(:valid-drop). I'm unsure if
it's necessary to add at this point, but can add it and mark at-risk for
the time being.

~fantasai


Re: [whatwg] [selectors4] drag-and-drop pseudo-classes

2012-08-14 Thread Ryosuke Niwa
On Tue, Aug 14, 2012 at 11:04 AM, Tab Atkins Jr. jackalm...@gmail.com wrote:

 On Mon, Aug 13, 2012 at 11:55 PM, Ryosuke Niwa rn...@webkit.org wrote:
  On Mon, Aug 13, 2012 at 9:19 PM, fantasai fantasai.li...@inkedblade.net
 wrote:
  The CSSWG discussed drag-and-drop pseudo-classes today. The current
  proposal is to have three pseudo-classes:
 
* One for the element representing the drop target that
  would receive the item if it were dropped.
* One for all elements representing possible drop targets
  that could receive the item.
 
  How do we find these elements? On one hand, if we're only supporting
  dropzone attribute, then adding new pseudo element seems unnecessary. On
  the other hand, I can't think of ways to detect whether an element could
  return false or prevents the default action on dragover/dragenter events
  without firing those events.

 Just using [dropzone], yes.

 We're not adding a pseudo-element, we're adding pseudo-classes.

 I'm not sure how we can possibly do these without pseudo-classes.  Can
 you outline what you think it would be?


I'm asking how we're supposed to implement these pseudo-classes given that
the only way to know whether an element can receive the item is by firing
dragenter and/or dragover events. e.g.

http://dev.w3.org/csswg/selectors4/#drag-pseudos says
"The :valid-drop-target pseudo-class represents an element that is a
possible drop target for an item that is currently being dragged in a
drag-and-drop interface."

How are we going to figure out whether a given element is a possible drop
target for an item, when the element can dynamically decide whether to
accept the item or not in dragenter/dragover events?

 As well, the pseudo that matches the drop target that will be used if
 you dropped right now might not be expressible in pure CSS even given
 the above.  It's probably equivalent to when you :hover it, but
 there are applications that basically have this functionality that
 work differently - for example, I think that the built-in Windows
 solitaire game highlight the closest drop target to the current mouse
 pointer, even if you're nowhere near the actual drop zone.


Yeah, and that's not compatible with how drag and drop are implemented on
the Web.

- Ryosuke


Re: [whatwg] StringEncoding: Allowed encodings for TextEncoder

2012-08-14 Thread Glenn Maynard
On Tue, Aug 14, 2012 at 9:42 AM, Simon Pieters sim...@opera.com wrote:

 On Thu, 09 Aug 2012 19:42:07 +0200, Joshua Bell jsb...@chromium.org
 wrote:

 http://wiki.whatwg.org/wiki/StringEncoding has been updated to restrict the
 supported encodings for encoding to UTF-8, UTF-16 and UTF-16BE.

 I'm tempted to take it further to just UTF-8 and see if anyone complains.


 I was going to suggest doing so. We've gone UTF-8-only for new features
 (workers, webvtt, appcache manifest, etc). The Encoding spec says "New
 content and formats must exclusively use the utf-8 encoding." Is there a
 use case for utf-16/utf-16be?


Specs can't (meaningfully) place normative requirements on all new content
and formats.  This should be a note.

-- 
Glenn Maynard


Re: [whatwg] [selectors4] drag-and-drop pseudo-classes

2012-08-14 Thread Tab Atkins Jr.
On Tue, Aug 14, 2012 at 12:13 PM, Ryosuke Niwa rn...@webkit.org wrote:
 I'm asking how we're supposed to implement this pseudo-classes given that
 the only way to know whether an element can receive the item is by firing
 dragenter and/or dragover events. e.g.

No, we can know it declaratively via the dropzone attribute.  That's
what these will key off of in HTML.  In @dropzone, you can declare the
types of data that it will accept, and you know the type of the data
as soon as the drag starts, so you have all the info you need.

We obviously can't address dropzones that are only detectable during
the dragover event.  That's fine - they just won't respond to these
pseudo-classes.  Consider it an inducement to use the new, better
model that @dropzone allows.
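
To make the declarative idea concrete, here is a rough sketch of how an
element could be classified when a drag starts. It is an illustration only:
parseDropzone and classifyDropTarget are invented names, and the token
syntax ("copy"/"move"/"link" plus "string:<type>" and "file:<type>") is my
assumption about the dropzone attribute, so check the HTML spec before
relying on it.

```javascript
// Parse a dropzone attribute value into the data types it accepts.
// Assumed token syntax: "copy" / "move" / "link" name the operation and are
// ignored here; "string:<type>" and "file:<type>" name accepted data.
function parseDropzone(value) {
  const accepted = [];
  for (const token of value.trim().toLowerCase().split(/\s+/)) {
    const match = /^(string|file):(.+)$/.exec(token);
    if (match) accepted.push({ kind: match[1], type: match[2] });
  }
  return accepted;
}

// Classify one element for the duration of a drag.
// `dropzone` is the attribute value, or null if the attribute is absent;
// `dragged` lists the { kind, type } items known when the drag starts.
// Returns "valid", "invalid", or null for elements that are not declared
// drop targets and so match neither pseudo-class.
function classifyDropTarget(dropzone, dragged) {
  if (dropzone === null) return null;
  const accepted = parseDropzone(dropzone);
  const ok = dragged.some((item) =>
    accepted.some((a) => a.kind === item.kind && a.type === item.type));
  return ok ? "valid" : "invalid";
}

const draggedItems = [{ kind: "file", type: "image/png" }];
console.log(classifyDropTarget("copy file:image/png", draggedItems));    // "valid"
console.log(classifyDropTarget("copy string:text/plain", draggedItems)); // "invalid"
console.log(classifyDropTarget(null, draggedItems));                     // null
```

Nothing here fires dragenter or dragover; the answer depends only on the
attribute and on the types known at drag start.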


 As well, the pseudo that matches the drop target that will be used if
 you dropped right now might not be expressible in pure CSS even given
 the above.  It's probably equivalent to when you :hover it, but
 there are applications that basically have this functionality that
 work differently - for example, I think that the built-in Windows
 solitaire game highlight the closest drop target to the current mouse
 pointer, even if you're nowhere near the actual drop zone.

 Yeah, and that's not compatible with how drag and drop are implemented on
 the Web.

I know.  You'll notice that I didn't suggest we somehow change to
that.  ^_^  However, other languages might want this kind of model,
and we could in the future add a switch to allow this kind of behavior
in HTML. My point is just that, even if you solve the other problems,
you still might not be able to implement that pseudoclass in existing
CSS.

~TJ


Re: [whatwg] [selectors4] drag-and-drop pseudo-classes

2012-08-14 Thread Ryosuke Niwa
On Tue, Aug 14, 2012 at 12:53 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:

 On Tue, Aug 14, 2012 at 12:13 PM, Ryosuke Niwa rn...@webkit.org wrote:
  I'm asking how we're supposed to implement this pseudo-classes given that
  the only way to know whether an element can receive the item is by firing
  dragenter and/or dragover events. e.g.

 No, we can know it declaratively via the dropzone attribute.  That's
 what these will key off of in HTML.  In @dropzone, you can declare the
 types of data that it will accept, and you know the type of the data
 as soon as the drag starts, so you have all the info you need.


Okay, thanks for the clarification.

 We obviously can't address dropzones that are only detectable during
 the dragover event.  That's fine - they just won't respond to these
 pseudo-classes.  Consider it an inducement to use the new, better
 model that @dropzone allows.


I'm not sure if I'm a big fan of that idea given that I haven't seen people
using the dropzone attribute in the wild. Have other browser vendors even
implemented it yet? (We haven't prefixed it in WebKit) All in all, I feel
like it's premature to build more features on top of it.

- Ryosuke


Re: [whatwg] Archive API - proposal

2012-08-14 Thread Glenn Maynard
(I've reordered my responses to give a more logical progression.)

On Tue, Jul 17, 2012 at 9:23 PM, Andrea Marchesini b...@mozilla.com wrote:

 // The getFilenames handler receives a list of DOMString:
 var handle = this.reader.getFile(this.result[i]);


This interface is problematic.  Since ZIP files don't have a standard
encoding, filenames in ZIPs are often garbage.  This API requires that
filenames round-trip uniquely, or else files aren't accessible at all.  For
example, if you have two filenames in CP932, 日 and 本, but the encoding
isn't determined correctly, you may end up with two files both with a
filename of "??".  Either you can't open either file, or you can only open
one of them.  This isn't theoretical; I hit ZIP files like this in the wild
regularly.
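
A small simulation of the collision, for illustration.  misdecode is a
made-up stand-in for a decoder that guessed the wrong encoding (it turns
every non-ASCII character into "?"), and the entry list is invented; no real
archive is read here.

```javascript
// Simulate a misdetected filename encoding: every non-ASCII character is
// replaced by "?", so distinct names can decode to the same string.
// misdecode is a hypothetical helper used only for this demonstration.
function misdecode(name) {
  return name.replace(/[^\x00-\x7f]/g, "?");
}

const entries = [
  { name: "日.txt", contents: "first" },
  { name: "本.txt", contents: "second" },
];

// Name-based lookup, as with getFile(filename): a later entry overwrites an
// earlier one that decodes to the same name.
const byName = new Map();
for (const entry of entries) byName.set(misdecode(entry.name), entry);

console.log(byName.size);                    // 1
console.log(byName.get("?.txt").contents);   // "second"

// Index-based access, as with a list of File objects, still reaches both.
console.log(entries.map((e) => e.contents)); // both "first" and "second"
```

With name-based lookup the first file is unreachable; with positional access
nothing is lost, whatever the names decode to.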

Instead, I'd recommend that the primary API simply returns File objects
directly from the ZIP.  For example:

var reader = archive.getFiles();
reader.onsuccess = function(result) {
    // result = [File, File, File, File...];

    console.log(result[0].name);
    // read the file
    var fileReader = new FileReader();
    fileReader.readAsArrayBuffer(result[0]);
}

This allows opening files without any dependency on the filename.  Since
File objects are by design lightweight--no decompression should happen
until you actually read from the file--this isn't expensive and won't
perform any extra I/O.  All the information you need to expose a File
object is in the central directory (filename, mtime, decompressed size).

 I would like to receive feedback about this. In particular:
 . Do you think it can be useful?
 . Do you see any limitation, any feature missing?


It should be possible to get the CRC32 of files, which ZIP stores in the
central directory.  This allows users to perform checksum verification
themselves if wanted, and enables all the other useful things that come
from being able to get a file's checksum without having to read the whole
file.

(I don't think CRC32 checks should be performed automatically, since it's
too hard for that to make sense when random access is involved.)
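
For reference, user-side verification could look something like this.  It
is a sketch: crc32 is the usual table-driven CRC-32 (reflected polynomial
0xEDB88320, the variant ZIP uses), and verify is a made-up helper that
compares the result with whatever value the API exposes for the entry.

```javascript
// Build the 256-entry lookup table for the reflected polynomial 0xEDB88320.
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  }
  CRC_TABLE[n] = c >>> 0;
}

// Compute the CRC-32 of a byte sequence (Uint8Array or array of numbers).
function crc32(bytes) {
  let crc = 0xffffffff;
  for (const b of bytes) {
    crc = CRC_TABLE[(crc ^ b) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Compare the computed checksum with the one stored in the archive.
// `expectedCRC` is a hypothetical value taken from the central directory.
function verify(bytes, expectedCRC) {
  return crc32(bytes) === expectedCRC;
}

// The standard check string "123456789" has CRC-32 0xCBF43926.
const sample = Array.from("123456789", (ch) => ch.charCodeAt(0));
console.log(crc32(sample).toString(16)); // "cbf43926"
```

The caller decides when to read the whole file and run the check, which
avoids the random-access problem mentioned above.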

  // The ArchiveReader object works with Blob objects:
   var archiveReader = new ArchiveReader(file);

   // Any request is asynchronous:


The only operation that needs to be asynchronous is creating the
ArchiveReader itself.  It should parse the ZIP central record before
returning a result.  Once you've done that you can do the rest
synchronously, because no further I/O is necessary until you actually read
data from a file.

This gives the following, simpler interface:

var opener = new ZipOpener(file);
opener.onerror = function() { console.error("Loading failed"); }
opener.onsuccess = function(zipFile)
{
    // .files is a FileList, representing each file in the archive.
    if(zipFile.files.length == 0) { console.error("ZIP file is empty"); return; }

    var example_file = zipFile.files[0];
    console.log("The first filename is", example_file.name,
                "with an expected CRC of", example_file.expectedCRC);

    // Read from the file:
    var reader = new FileReader();
    reader.readAsArrayBuffer(example_file);

    // For convenience, add "getter File? (DOMString name)" to FileList, to
    // find a file by name.  This is equivalent to iterating through files[]
    // and comparing .name.  If no match is found, return null.  This could
    // be a function instead of a getter.
    var example_file2 = zipFile.files["file.txt"];
    if(example_file2 == null) { console.error("file.txt not found in ZIP"); return; }
}

(To fit expectedCRC in there, it would actually need to use a subclass of
File, not File itself.)

This also eliminates an error condition (no getFile error callback), and
since .files looks just like HTMLInputElement.files, it can be used
directly with code written for it.  For example, if you have a function
uploadAllFiles(files), you can pass in either an <input type=file
multiple>'s .files or a zipFile.files, and they'll both work.

-- 
Glenn Maynard


Re: [whatwg] Archive API - proposal

2012-08-14 Thread Tobie Langel
How are nested directories handled in your counter proposal?

--tobie