date:20120111

On Wed, Jan 11, 2012 at 1:00 AM, Simon Pieters sim...@opera.com wrote:

 On Tue, 10 Jan 2012 21:50:34 +0100, Ryosuke Niwa rn...@webkit.org wrote:

  On Tue, Jan 10, 2012 at 12:46 PM, Aryeh Gregor a...@aryeh.name wrote:


  On Tue, Jan 10, 2012 at 3:40 PM, Ryosuke Niwa rn...@webkit.org wrote:
  A single br tag is shorter than a pair of div tags when serialized.

 True, but only slightly, and the difference is even smaller if you use
 p instead of div.  This isn't enough of a reason by itself to
 justify the extra complexity of another mode.  Are there other
 reasons?


 p has default margins.


 This is why we implemented opera-defaultblock. Apps were manually
 converting our output to use divs because they didn't want margins, which
 is non-trivial to do and often leaves bugs in edge cases.


Right. I think that's a good idea.

 That alone is enough for us not to adopt p as
 the default paragraph separator. Also, unfortunately, there is a lot of
 legacy content that relies on the fact that WebKit uses div as the
 paragraph separator, so we need a global or per-editing-host switch
 regardless.


 Do you suggest that all browsers adopt div as the separator by
 default? Or that it will be impossible to reach interop? Or something else?
 :-)


It might be possible eventually once we introduce commands
like defaultblock and everyone starts using that since at that point, the
default doesn't really matter. On the other hand, at that point, the
default doesn't really matter so not sure if it's really worth converging.

 I almost want a global switch to toggle between legacy UA-specific behavior
 and new spec-compliant behavior.


 That would rather miss the point of having the spec IMHO. If we all
 implement a global switch to opt in to a different behavior, let's design a
 new, sane editing API instead. But I think the editing spec should try to
 reach interop for the legacy feature first.


That makes sense, but I think we need to make sure we don't break
existing content during the transition. I really wish we could all converge
on one behavior, but not at the cost of backward compatibility.

- Ryosuke

Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization

2012-01-11 Thread Satish S


  It doesn't matter too much to me in which group the API will be developed
  (except that I'm against doing it in HTML WG).
  WebApps is reasonably good place (if there won't be any IP issues.)

 Starting the work in a Community Group is another option to consider. A
 really good option, actually. It's certainly the quickest way to get it
 started and to get a W3C draft actually published, and the route that would
 entail the least amount of unnecessary process overhead. The work could
 later be graduated to, e.g., the WebApps WG if/when needed.


The Community Groups [1] page says they are for anyone to socialize their
ideas for the Web at the W3C for possible future standardization.

The HTML Speech Incubator Group has done a considerable amount of work and
the final report [2] is quite detailed with requirements, use cases and API
proposals. Since we are interested in transitioning to the standards track
now, working with the relevant WGs seems more appropriate than forming a
new Community Group.

[1] http://www.w3.org/community/about/#cg
[2] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/

Re: Pressing Enter in contenteditable: p or br or div?

2012-01-11 Thread Simon Pieters


On Wed, 11 Jan 2012 10:43:24 +0100, Markus Ernst derer...@gmx.ch wrote:


p has default margins.


This is why we implemented opera-defaultblock. Apps were manually
converting our output to use divs because they didn't want margins,
which is non-trivial to do and often leaves bugs in edge cases.


Actually, applying p {margin:0} looks quite trivial.


Sure, but some apps like to send their stuff in HTML email to clients that  
don't support styling, or some such.



That would rather miss the point of having the spec IMHO. If we all
implement a global switch to opt in to a different behavior, let's
design a new, sane editing API instead. But I think the editing spec
should try to reach interop for the legacy feature first.


IMO the ability to create clean, state-of-the-art HTML code should be  
one of the main goals of a new spec. That means: Editor implementations  
should be able to get p on Enter, and br on Shift-Enter (as people  
are used to from commonly used word processors) without additional  
scripting.


That's nice but it's not clear how to go from here to there. There is web  
content that relies on quirks of each browser and might stop working  
completely if we change things. I value interop higher than clean code,  
so if, for instance, we can converge on div but not on p, then that's  
what we should spec, IMHO. (However I'm not convinced yet that it's easier  
to converge on div rather than p, which is why Opera still uses p by  
default. We have generally aimed for matching IE for contenteditable.)


I don't know the use cases Ryosuke mentions, where apps rely on  
webkit-specific behavior (or other ua-specific behaviors), and I don't  
know how harmful a change could be for them (paragraph margins appearing  
in a forum I would not consider very harmful).


It's more that a subtle change can break functionality completely in  
some apps. Even if it's just margins appearing where they didn't before,  
it might still not be accepted by many users and web developers.


In the long term, from a developer's and client supporter's POV I'd  
prefer to have a standard behavior that works the same in all UAs, and  
all common editor applications, by default.


I agree.

Offering a default paragraph separator setting means that editor  
behaviors will remain different across applications,


It already is different across applications, because they implement the  
default paragraph separator setting themselves.



which is confusing for many users.

It might be less of a hassle to have maintainers of existing applications  
insert a line of code that triggers legacy behavior, if this is crucial  
for their application.


--
Simon Pieters
Opera Software

Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization

2012-01-11 Thread Michael[tm] Smith

Satish S sat...@google.com, 2012-01-11 10:04 +:

 The Community Groups [1] page says they are for anyone to socialize their
 ideas for the Web at the W3C for possible future standardization.

I don't think that page adequately describes the potential value of the
Community Group option. A CG can be used for much more than just
socializing ideas for some hope of standardization someday.

 The HTML Speech Incubator Group has done a considerable amount of work and
 the final report [2] is quite detailed with requirements, use cases and API
 proposals. Since we are interested in transitioning to the standards track
 now, working with the relevant WGs seems more appropriate than forming a
 new Community Group.

I can understand you seeing it that way, but I hope you can also understand
me saying that I'm not at all sure it's more appropriate for this work.

I think everybody could agree that the point is not just to produce a spec
that is nominally on the W3C standards track. Having something on the W3C
standards track doesn't necessarily do anything magical to ensure that
anybody actually implements it.

I think what we all want is for Web-platform technologies to actually get
implemented across multiple browsers, interoperably -- preferably sooner
rather than later. Starting from the WG option is not absolutely always the
best way to cause that to happen. It is almost certainly not the best way
to ensure it will get done more quickly.

You can start up a CG and have the work formally going on within that CG in
a matter of days, literally. In contrast, getting it going formally as a
deliverable within a WG requires a matter of months.

Among the things that are valuable about formal deliverables in WGs is that
they get you RF commitments from participants in the WG. But one thing that
I think not everybody understands about CGs is that they also get you RF
commitments from participants in the CG; everybody in the CG has to agree
to the terms of the W3C Community Contributor License Agreement -

  http://www.w3.org/community/about/agreements/cla/

Excerpt: I agree to license my Essential Claims under the W3C CLA RF
Licensing Requirements. This requirement includes Essential Claims that I own

Anyway, despite what it may seem like from what I've said above, I'm not
trying to do a hard sell here. It's up to you all what you choose to do.
But I would like to help make sure you're making a fully informed decision
based on what the actual benefits and costs of the different options are.

  --Mike

 [1] http://www.w3.org/community/about/#cg
 [2] http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech/

-- 
Michael[tm] Smith
http://people.w3.org/mike/+

Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization

2012-01-11 Thread Andrei Popescu

Hi Michael,

Thanks for the info!

On Wed, Jan 11, 2012 at 11:36 AM, Michael[tm] Smith m...@w3.org wrote:
 Satish S sat...@google.com, 2012-01-11 10:04 +:

 The Community Groups [1] page says they are for anyone to socialize their
 ideas for the Web at the W3C for possible future standardization.

 I don't think that page adequately describes the potential value of the
 Community Group option. A CG can be used for much more than just
 socializing ideas for some hope of standardization someday.

 The HTML Speech Incubator Group has done a considerable amount of work and
 the final report [2] is quite detailed with requirements, use cases and API
 proposals. Since we are interested in transitioning to the standards track
 now, working with the relevant WGs seems more appropriate than forming a
 new Community Group.

 I can understand you seeing it that way, but I hope you can also understand
 me saying that I'm not at all sure it's more appropriate for this work.

 I think everybody could agree that the point is not just to produce a spec
 that is nominally on the W3C standards track. Having something on the W3C
 standards track doesn't necessarily do anything magical to ensure that
 anybody actually implements it.


We have strong interest from Mozilla and Google to implement. Would
this not be sufficient to have this API designed in this group?

Thanks,
Andrei

Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization

2012-01-11 Thread Michael[tm] Smith

Michael[tm] Smith m...@w3.org, 2012-01-11 20:36 +0900:

 Satish S sat...@google.com, 2012-01-11 10:04 +:
 
  The Community Groups [1] page says they are for anyone to socialize their
  ideas for the Web at the W3C for possible future standardization.
 
 I don't think that page adequately describes the potential value of the
 Community Group option. A CG can be used for much more than just
 socializing ideas for some hope of standardization someday.
 
  The HTML Speech Incubator Group has done a considerable amount of work and
  the final report [2] is quite detailed with requirements, use cases and API
  proposals. Since we are interested in transitioning to the standards track
  now, working with the relevant WGs seems more appropriate than forming a
  new Community Group.

Another data point to consider is, we have a precedent of a CG that's
already far along with work on a spec that already has multiple
implementations: The Web Media Text Tracks CG, which is working on the
WebVTT format for text tracks (captions, subtitles, etc.) for HTML video:

  http://www.w3.org/community/texttracks/

They're well beyond the stage of documenting use cases and requirements and
providing proposals; they already have a complete spec:

  http://dev.w3.org/html5/webvtt/

And the WebVTT spec is already implemented in IE10 and partially in WebKit,
with active implementation work continuing -

  http://msdn.microsoft.com/en-us/library/hh673566.aspx#WebVTT
  https://bugs.webkit.org/showdependencytree.cgi?id=43668&hide_resolved=1

That CG was started only a little over 3 months ago. So it is in fact
possible for a CG to be producing work that's actually already getting
actively implemented in current browsers.

  --Mike

-- 
Michael[tm] Smith
http://people.w3.org/mike/+

Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization

2012-01-11 Thread Charles McCathieNevile


On Wed, 11 Jan 2012 22:36:28 +1100, Michael[tm] Smith m...@w3.org wrote:


Satish S sat...@google.com, 2012-01-11 10:04 +:

The Community Groups [1] page says they are for anyone to socialize their
ideas for the Web at the W3C for possible future standardization.


I don't think that page adequately describes the potential value of the
Community Group option. A CG can be used for much more than just
socializing ideas for some hope of standardization someday.

The HTML Speech Incubator Group has done a considerable amount of work
and the final report [2] is quite detailed with requirements, use cases
and API proposals. Since we are interested in transitioning to the
standards track now, working with the relevant WGs seems more
appropriate than forming a new Community Group.


I can understand you seeing it that way, but I hope you can also
understand me saying that I'm not at all sure it's more appropriate for
this work.


And I hope you all understand me saying that I think it is indeed more  
appropriate to move it to a formal working group, for reasons explained  
below...


I think everybody could agree that the point is not just to produce a
spec that is nominally on the W3C standards track. Having something on
the W3C standards track doesn't necessarily do anything magical to ensure
that anybody actually implements it.


Indeed. But the same goes for a community group. Implementation commitment  
doesn't come from people writing a spec.



I think what we all want is for Web-platform technologies to actually get
implemented across multiple browsers, interoperably -- preferably sooner
rather than later. Starting from the WG option is not absolutely always
the best way to cause that to happen. It is almost certainly not the best
way to ensure it will get done more quickly.


Actually, I don't think that what kind of group the work happens in is  
relevant one way or another to how fast it gets implemented - and not very  
relevant to the rate of developing the spec.


You can start up a CG and have the work formally going on within that CG
in a matter of days, literally. In contrast, getting it going formally as
a deliverable within a WG requires a matter of months.


In the general case this is true. But *starting* work is easy - as Mike  
said above the goal is to get stuff interoperably implemented, in other  
words, *finished*. And the startup time only has an impact on the finish  
time in very trivial cases.


Among the things that are valuable about formal deliverables in WGs is
that they get you RF commitments from participants in the WG. But one
thing that I think not everybody understands about CGs is that they also
get you RF commitments from participants in the CG; everybody in the CG
has to agree to the terms of the W3C Community Contributor License
Agreement -

  http://www.w3.org/community/about/agreements/cla/

Excerpt: I agree to license my Essential Claims under the W3C CLA RF
Licensing Requirements. This requirement includes Essential Claims that  
I own


There are important differences in what WGs and CGs offer, and each has  
both advantages and disadvantages in terms of the overall level of  
protection offered. A fair criticism of the process applied to HTML5 is  
that the editor claims to accept input from the working group, plus the  
WHAT-WG (whose participants have made no commitment on patents at all)  
plus anything he reads in email, blogs, the side of milk cartons, etc.  
There is a theoretical risk that he will read something placed in front of  
him by someone who has avoided joining the WG (and therefore makes no  
patent commitment) and introduce it into the spec not knowing it carries a  
patent liability. I think that in practice this is unlikely to be a real  
problem for HTML - but that doesn't mean it is unlikely to be a real  
problem for any Web technology. In particular, I think that the work being  
proposed here would benefit from being in a real working group - either  
the Voice WG or the Web Apps WG seem like sensible candidate groups, a  
priori. Web Apps has the benefit that we are in the middle of the  
rechartering process, so adding deliverables is as painless now as it can  
ever be (and the truth is that this doesn't mean trivial - broad patent  
licensing doesn't always come without some effort, which is why it is  
considered valuable).



Anyway, despite what it may seem like from what I've said above, I'm not
trying to do a hard sell here. It's up to you all what you choose to do.
But I would like to help make sure you're making a fully informed
decision based on what the actual benefits and costs of the different
options are.


Indeed.

cheers

Chaals

--
Charles 'chaals' McCathieNevile  Opera Software, Standards Group
je parle français -- hablo español -- jeg kan litt norsk
http://my.opera.com/chaals   Try Opera: http://www.opera.com

Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization

2012-01-11 Thread Arthur Barstow


On 1/10/12 11:25 AM, ext Glen Shires wrote:
Per #4 Testing commitment(s): can you elaborate on what you would like 
to see at this point?


At this point, I think a `warm fuzzy` like "if/when the spec advances to
Candidate Recommendation, we will contribute to a test suite that is
sufficient to exit the CR" would be useful.



Also, what is the next step?


WRT the API you proposed, I think we have enough preliminary feedback 
for me to start a CfC to add the API to WebApps charter. My only concern 
is the open question (at least to me) re the markup part. It seems like 
it would be useful to review the proposed API and markup together. 
However, a CfC for the markup can be done separately (provided 
sufficient interest/commitment is expressed).


If I don't see any objection from Chaals or Doug, today or tomorrow I'll 
start a CfC for the API proposal.


-AB

Re: Speech Recognition and Text-to-Speech Javascript API - seeking feedback for eventual standardization

2012-01-11 Thread Satish S


 Per #4 Testing commitment(s): can you elaborate on what you would like to
 see at this point?


 At this point, I think a `warm fuzzy` like if/when the spec advances to
 Candidate Recommendation, we will contribute to a test suite that is
 sufficient to exit the CR would be useful.


Yes, we will contribute to a test suite that is sufficient for the Candidate
Recommendation.


 Also, what is the next step?


 WRT the API you proposed, I think we have enough preliminary feedback for
 me to start a CfC to add the API to WebApps charter. My only concern is the
 open question (at least to me) re the markup part. It seems like it would
 be useful to review the proposed API and markup together. However, a CfC
 for the markup can be done separately (provided sufficient
 interest/commitment is expressed).


In the spirit of starting with the basics and iterating, we did not include
markup in the proposed API. Markup support also renders cleanly as a layer
on top of the JS API with few additions, so, as you suggest, if there is
sufficient interest/commitment a separate CfC could be done.

Re: Pressing Enter in contenteditable: p or br or div?

On Tue, Jan 10, 2012 at 3:50 PM, Ryosuke Niwa rn...@webkit.org wrote:
 p has default margins. That alone is enough for us not to adopt p as
 the default paragraph separator.

On Wed, Jan 11, 2012 at 5:15 AM, Simon Pieters sim...@opera.com wrote:
 Sure, but some apps like to send their stuff in HTML email to clients that
 don't support styling, or some such.

I used to think that this was a strong argument, but then I realized
blockquote and ol and ul have default margins too.  So if you
want it to look right, you'll have to use a stylesheet.  Also, it's
worth pointing out that recent versions of Word have margins by
default when you hit Enter.

But Simon makes a good point: for the e-mail use-case, styling might
not be available.  So this is a decent reason to support div.

 Also, unfortunately, there is a lot of legacy
 content that relies on the fact that WebKit uses div as the paragraph
 separator, so we need a global or per-editing-host switch regardless.

This is also a good reason -- it lets preexisting apps that expect
div opt into that behavior in new browsers, instead of being
rewritten to support p.

Okay, so what API should we use?  I'd really prefer this be
per-editing host.  In which case, how about we make it a content
attribute on the editing host?  It can be a DOMSettableTokenList.
Maybe something like

  <div editoptions="tab-indent">

where the attribute is a whitespace-separated list of tokens.  To
start with, we can maybe have tab-indent (hitting Tab indents) and
div-separator (hitting Enter produces div).  Does this sound like a
good approach?  If so, what should we call the attribute?  And should
it imply contenteditable=true, or should the author have to specify
that separately?
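
As a rough illustration of the token-list idea above, here is a minimal sketch of how a script might read such an attribute value. Everything in it is an assumption: editoptions, tab-indent and div-separator are only the tentative names floated in this message, parseEditOptions and paragraphSeparatorFor are hypothetical helpers, and falling back to p when no token is given is a guess at the default, not settled behavior.

```javascript
// Hypothetical helper: split a whitespace-separated token list into a Set.
// Splits on ASCII whitespace only (space, tab, LF, FF, CR), as HTML token
// lists do. Not a shipped browser API.
function parseEditOptions(attributeValue) {
  const tokens = (attributeValue || "").split(/[ \t\n\f\r]+/).filter(Boolean);
  return new Set(tokens);
}

// Hypothetical helper: which element should Enter produce for a host whose
// editoptions attribute has the given value? Defaulting to "p" is an
// assumption made for this sketch.
function paragraphSeparatorFor(attributeValue) {
  return parseEditOptions(attributeValue).has("div-separator") ? "div" : "p";
}
```

With this sketch, a host declared with both tokens would get div on Enter, while a host with only tab-indent (or no attribute at all) would keep p.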

Also: are there any good use-cases for br?  Allowing div instead
of p adds basically no extra complexity, but allowing br would
make things significantly more complicated.

 I almost want a global switch to toggle between legacy UA-specific behavior
 and new spec-compliant behavior.

That's something we definitely shouldn't have.  If WebKit wants to go
down the IE route and keep its legacy behavior for WebKit-specific
content, it's welcome to, but web-facing behavior should be entirely
standard.  If we had a nonstandard mode for editing, it would be
quirks mode all over again -- eventually we'd have to standardize that
too so browsers are interoperable on pages that don't opt in to the
standard behavior, and we'd just make everything more painful in the
end.

There's really no way to make this painless.  We just have to be
careful to make it as painless as possible.

On Wed, Jan 11, 2012 at 4:43 AM, Markus Ernst derer...@gmx.ch wrote:
 IMO the ability to create clean, state-of-the-art HTML code should be one of
 the main goals of a new spec.

The overriding goal of the spec is to get interop as quickly and
painlessly as possible.  Everything else is secondary.  Once we have
interop, we can talk about significantly improving the utility of the
features.

[Bug 15522] New: Add execCommand() to Element

https://www.w3.org/Bugs/Public/show_bug.cgi?id=15522

   Summary: Add execCommand() to Element
   Product: WebAppsWG
   Version: unspecified
  Platform: All
OS/Version: All
Status: NEW
  Severity: enhancement
  Priority: P2
 Component: HTML Editing APIs
AssignedTo: a...@aryeh.name
ReportedBy: a...@aryeh.name
 QAContact: sideshowbarker+html-editing-...@gmail.com
CC: m...@w3.org, public-webapps@w3.org


Suggested by Ojan Vafai:
http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/0090.html

I think we should expose it something like this:

* If the element does not have contenteditable set to true, throw.  (This means
we should also throw for the body of a document with designMode, unless that
body also has contenteditable=true.)
* For things like styleWithCSS, set the flag for that editing host and its
descendants only.
* For regular commands like bold, run the command restricted to the descendants
of that editing host.

This solves the problem of clicking the B button next to one editing host and
making text in another editing host bold, or styleWithCss etc. changes leaking
between editing hosts.  If we do this, I'd also be okay with adding new flags.

There are two important changes we'd need here, I think:

1) Make a concept of editing flags or something (not a good name; they might
contain data, not just booleans).  The document always has a value for each
editing flag, and you can set them on an editing host too by calling
execCommand() on it, but by default they're all set to inherit for
non-document editing hosts.  To get the editing flag for an element, go up the
DOM until you hit something with the flag set.  Then change things like "If the
CSS styling flag is false:" to "If the CSS styling flag for node is false:".

2) Change things like "Let element list be all editable Elements effectively
contained in the active range" to exclude anything outside the node you called
execCommand() on.
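
The flag lookup described in 1) can be sketched roughly as follows. This is a toy model and not the real DOM: makeNode and getEditingFlag are hypothetical names, nodes are plain objects with a parent link, and the flag name styleWithCSS is used only as an example.

```javascript
// Toy node: a parent link plus the flags set directly on this node.
// A node with no entry for a flag is treated as "inherit".
function makeNode(parent, flags = {}) {
  return { parent, flags };
}

// Walk up from the node until something has the flag set. The document
// (the root, whose parent is null) is assumed to define every flag.
function getEditingFlag(node, name) {
  for (let current = node; current !== null; current = current.parent) {
    if (Object.prototype.hasOwnProperty.call(current.flags, name)) {
      return current.flags[name];
    }
  }
  return undefined; // only reached if the root does not define the flag
}

const doc = makeNode(null, { styleWithCSS: false });
const hostA = makeNode(doc, { styleWithCSS: true }); // as if set via execCommand() on hostA
const hostB = makeNode(doc);                          // inherits from the document
const paragraphInA = makeNode(hostA);
```

In this model, getEditingFlag(paragraphInA, "styleWithCSS") finds the value set on hostA, while getEditingFlag(hostB, "styleWithCSS") falls through to the document's value, so the setting no longer leaks between editing hosts.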

While I'm at it, it might make sense to fix bug 13911.

This should be flagged prominently as a new unimplemented feature, so that
implementers are sure to critique the design if they don't completely
like it.

-- 
Configure bugmail: https://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.

Re: [editing] tab in an editable area WAS: [whatwg] behavior when typing in contentEditable elements

On Tue, Jan 10, 2012 at 4:48 PM, Charles Pritchard ch...@jumis.com wrote:
 Would users press Esc to get out of the tab lock?

Do they need to be able to get out of it?  They can't in a regular
word processor, so why should they be able to in Google Docs?  If some
users need to be able to override the feature, that's a good reason to
have it supported by browsers, so browsers can override it.  If the
page just intercepts tab, you can't get around it.

On Tue, Jan 10, 2012 at 7:28 PM, Ojan Vafai o...@chromium.org wrote:
 I agree the API is not the best. We should put execCommand, et al. on
 Element. That would solve the global flag thing for useCss/styleWithCss as
 well. It's also more often what a website actually wants. They have a
 toolbar associated with each editing host. They don't want a click on the
 toolbar to modify content in a different editing host. This is a change we
 should make regardless of what we decide for tabbing behavior IMO.

What would be the behavior on Element?  Something like

* If the element is not an editing host, throw.
* For things like styleWithCSS, set the flag for that editing host and
its descendants only.
* For regular commands like bold, run the command restricted to the
descendants of that editing host.

Whereas calling it on document would affect all nodes in the document.
This sounds like an interesting idea.  You're right that you don't
want the bold button for one editing host affecting other editing
hosts, which in my spec it currently does.

I've filed a bug: https://www.w3.org/Bugs/Public/show_bug.cgi?id=15522

 Calling indent doesn't actually match tabbing behavior (e.g. inserting a
 tab/spaces or, in a table cell, going to the next cell), right? I guess
 another way we could approach this is to add document.execCommand('Tab')
 that does the text-editing tabbing behavior. I'd be OK with that (the
 command name could probably be better).

Current indentation behavior is here:

http://dvcs.w3.org/hg/editing/raw-file/tip/editing.html#indenting-and-outdenting

You're right that it doesn't match up with how tab works at all.  The
way I make other keystrokes work (Enter, Delete, etc.) is by mapping
them to some command, following WebKit:

http://dvcs.w3.org/hg/editing/raw-file/tip/editing.html#additional-requirements

So I need to define a tab command.  I've filed a bug:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=15523

 The bitmask is not a great idea, but there are certainly editors that would
 want tabbing in lists to work, but tab outside of lists to do the normal web
 tabbing behavior.

What are examples, and why?

 Historically, one of my biggest frustrations with contentEditable is that
 you have to take it all or none. The lack of configurability is frustrating
 as a developer. Maybe the solution is to come up with a lower level set of
 editing primitives in place of contentEditable instead of trying to extend
 it though.

Yes, that's definitely something we need to do.  There are algorithms
I've defined that would probably be really useful to web authors, like
"wrap a list of nodes" or some version of "set the value of the
selection" (= inline formatting algorithm).  I've been holding off on
exposing these to authors because I don't know if these algorithms are
correct yet, and I don't want implementers jumping the gun and
exposing them before using them internally so they're well-tested.  I
expect they'll need to be refactored a bunch once implementers try
actually reimplementing their editing commands in terms of them, and
don't want to break them for authors when that happens.

[Bug 15523] New: Define tab command

https://www.w3.org/Bugs/Public/show_bug.cgi?id=15523

   Summary: Define tab command
   Product: WebAppsWG
   Version: unspecified
  Platform: All
OS/Version: All
Status: NEW
  Severity: enhancement
  Priority: P2
 Component: HTML Editing APIs
AssignedTo: a...@aryeh.name
ReportedBy: a...@aryeh.name
 QAContact: sideshowbarker+html-editing-...@gmail.com
CC: m...@w3.org, public-webapps@w3.org, o...@chromium.org


Ojan points out there's a need for a tab command, since indent doesn't
match the behavior of tabbing:
http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/0090.html

This needs to behave like indent for lists, but in regular text probably needs
to insert something like <span style="white-space:pre-wrap">&#9;</span>, and in
tables should tab between cells.
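
One possible shape for that three-way decision, as a sketch only: tabActionFor is a hypothetical helper, the action names are made up for illustration, and letting the innermost list item or table cell win (for example, a list inside a table cell indents) is an assumption that the bug does not settle.

```javascript
// Hypothetical helper: given the tag names of the caret's ancestors, ordered
// from innermost to outermost, pick what the tab command should do.
function tabActionFor(ancestorTags) {
  for (const tag of ancestorTags) {
    const name = tag.toLowerCase();
    if (name === "li") return "indent";                      // behave like indent in lists
    if (name === "td" || name === "th") return "next-cell";  // move between table cells
  }
  return "insert-tab"; // regular text: insert a pre-wrapped tab character
}
```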

Do we also want an untab command that behaves like Shift-Tab?  Presumably
yes, for table cells at least.  How should it behave in regular text?  Or
lists?  LibreOffice Writer behaves oddly.

[editing] Avoiding selections with no corresponding range, to simplify authoring

Anne asked me to investigate how exactly Ranges are added to
Selections (bug:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15470).  It turns out
browsers mostly don't interoperate.  One interesting thing I found out
is that in Gecko, if no one calls
addRange/removeRange/removeAllRanges, rangeCount is always exactly
one.  This means getRangeAt(0) will never throw.  This is actually
great, because it avoids a common authoring bug -- rangeCount is
rarely 0 in any browser, so authors often will call getRangeAt(0)
unconditionally, which risks throwing IndexSizeError.  I plan to
change the spec to match Gecko, in requiring that user-created
selections always have exactly one range (which is initially collapsed
at (document, 0)).
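
The authoring bug described above can also be avoided on the author side by checking rangeCount before calling getRangeAt(0). A minimal sketch, assuming only the standard rangeCount and getRangeAt() members of Selection; the names firstRangeOrNull and makeMockSelection are hypothetical, and the mock exists only so the snippet runs outside a browser:

```javascript
// Hypothetical helper: return the first range of a selection, or null
// when the selection currently has no range (rangeCount === 0).
function firstRangeOrNull(selection) {
  return selection.rangeCount > 0 ? selection.getRangeAt(0) : null;
}

// Stand-in for a Selection object, for illustration only. In a page you
// would pass window.getSelection() instead.
function makeMockSelection(ranges) {
  return {
    get rangeCount() {
      return ranges.length;
    },
    getRangeAt(index) {
      if (index < 0 || index >= ranges.length) {
        // Browsers throw a DOMException named IndexSizeError here;
        // the mock approximates that with a RangeError.
        throw new RangeError("IndexSizeError");
      }
      return ranges[index];
    },
  };
}
```

Calling firstRangeOrNull(makeMockSelection([])) yields null, whereas calling getRangeAt(0) directly on the same object throws.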

I'd like to go further, though.  addRange() already doesn't allow more
than one range per spec -- if there's an existing range, it replaces
it.  How about removeRange() and removeAllRanges() remove the range
and then add a new one collapsed at (document, 0)?  The common pattern
of remove(All)Range(s) followed by addRange will still work the same,
because addRange will replace the dummy range.  But now rangeCount
will *always* be 1, so getRangeAt(0) will *never* throw.  This seems
like it would prevent an entire class of authoring bugs (although I'm
admittedly not totally sure about compat impact).

Also, while I'm at it, how about collapsing at
(document.documentElement, 0) instead of (document, 0)?  This has the
minor added benefit of avoiding Selection boundary points that aren't
in an Element or Text node, which again makes things simpler for
authors.

If implementers are okay with this, I'll update the spec.

[Bug 15470] Changing the selection creates a Range object

https://www.w3.org/Bugs/Public/show_bug.cgi?id=15470

Aryeh Gregor a...@aryeh.name changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #2 from Aryeh Gregor a...@aryeh.name 2012-01-11 17:12:49 UTC ---
I clarified behavior requirements for how to associate ranges with selections,
matching Firefox as noted in comment 1:
http://dvcs.w3.org/hg/editing/rev/c989dd9e441d

Started a discussion for possible further change:
http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/0107.html


[Bug 15524] New: Specify something about drag and drop behavior

https://www.w3.org/Bugs/Public/show_bug.cgi?id=15524

   Summary: Specify something about drag and drop behavior
   Product: WebAppsWG
   Version: unspecified
  Platform: All
OS/Version: All
Status: NEW
  Severity: enhancement
  Priority: P2
 Component: HTML Editing APIs
AssignedTo: a...@aryeh.name
ReportedBy: a...@aryeh.name
 QAContact: sideshowbarker+html-editing-...@gmail.com
CC: m...@w3.org, public-webapps@w3.org, rn...@webkit.org


Browsers allow the user to drag and drop HTML snippets within contenteditable
regions.  Ryosuke requested (via private e-mail a couple of months ago) that
this be required, along with saying exactly what gets deleted/inserted when the
user does this.


Re: Pressing Enter in contenteditable: p or br or div?

On Wed, Jan 11, 2012 at 1:43 AM, Markus Ernst derer...@gmx.ch wrote:

 Am 11.01.2012 10:00 schrieb Simon Pieters:

  On Tue, 10 Jan 2012 21:50:34 +0100, Ryosuke Niwa rn...@webkit.org
 wrote:

  On Tue, Jan 10, 2012 at 12:46 PM, Aryeh Gregor a...@aryeh.name wrote:


 On Tue, Jan 10, 2012 at 3:40 PM, Ryosuke Niwa rn...@webkit.org wrote:
  Single br tag is shorter than pairs of div tags when serialized.

 True, but only slightly, and the difference is even smaller if you use
 p instead of div. This isn't enough of a reason by itself to
 justify the extra complexity of another mode. Are there other
 reasons?


 p has default margins.


 This is why we implemented opera-defaultblock. Apps were manually
 converting our output to use divs because they didn't want margins,
 which is non-trivial to do and often leaves bugs in edge cases.


 Actually, applying p {margin:0} looks quite trivial.


The problem is that much existing content doesn't have that CSS rule, and we
obviously don't want to create markup like <p style="margin: 0px;"> because it
is too verbose.

 In the long term, from a developer's and client supporter's POV I'd prefer
 to have a standard behavior that works the same in all UAs, and all common
 editor applications, by default. Offering a default paragraph separator
 setting means that editor behaviors will remain different across
 applications, which is confusing for many users.


That's just not gonna happen. Each application uses a different paragraph
separator for a reason.

 It might be less of a hassle to have maintainers of existing applications
 insert a line of code that triggers legacy behavior, if this is crucial for
 their application.


That doesn't solve any backward compatibility problems.

- Ryosuke

Re: [editing] Avoiding selections with no corresponding range, to simplify authoring

On Wed, Jan 11, 2012 at 8:41 AM, Aryeh Gregor a...@aryeh.name wrote:

 Anne asked me to investigate how exactly Ranges are added to
 Selections (bug:
 https://www.w3.org/Bugs/Public/show_bug.cgi?id=15470).  It turns out
 browsers mostly don't interoperate.  One interesting thing I found out
 is that in Gecko, if no one calls
 addRange/removeRange/removeAllRanges, rangeCount is always exactly
 one.  This means getRangeAt(0) will never throw.  This is actually
 great, because it avoids a common authoring bug -- rangeCount is
 rarely 0 in any browser, so authors often will call getRangeAt(0)
 unconditionally, which risks throwing IndexSizeError.  I plan to
 change the spec to match Gecko, in requiring that user-created
 selections always have exactly one range (which is initially collapsed
 at (document, 0)).


Does Gecko return a Range at (document, 0) for getRangeAt(0) in such cases?

I'd like to go further, though.  addRange() already doesn't allow more
 than one range per spec -- if there's an existing range, it replaces
 it.  How about removeRange() and removeAllRanges() remove the range
 and then add a new one collapsed at (document, 0)?  The common pattern
 of remove(All)Range(s) followed by addRange will still work the same,
 because addRange will replace the dummy range.  But now rangeCount
 will *always* be 1, so getRangeAt(0) will *never* throw.  This seems
 like it would prevent an entire class of authoring bugs (although I'm
 admittedly not totally sure about compat impact).

 Also, while I'm at it, how about collapsing at
 (document.documentElement, 0) instead of (document, 0)?  This has the
 minor added benefit of avoiding Selection boundary points that aren't
 in an Element or Text node, which again makes things simpler for
 authors.


This would change the behavior of removing ranges in design mode. Removing
the range will move the caret to the top of the document.

- Ryosuke

Re: [editing] Avoiding selections with no corresponding range, to simplify authoring

2012-01-11 Thread Boris Zbarsky


On 1/11/12 11:41 AM, Aryeh Gregor wrote:

Also, while I'm at it, how about collapsing at
(document.documentElement, 0) instead of (document, 0)?


Then you have to handle the case when document.documentElement is null.

And yes, this has come up before; there are scripts out there that 
remove documentElements, do some stuff, insert new documentElements, etc.



This has the minor added benefit of avoiding Selection boundary points that 
aren't
in an Element or Text node


This would happen anyway if you set up a selection inside 
document.documentElement and someone removes the documentElement; the 
normal range algorithm will give you endpoints inside the Document.  so 
you really can't enforce this condition.


-Boris

Re: [editing] Avoiding selections with no corresponding range, to simplify authoring

On Wed, Jan 11, 2012 at 12:27 PM, Ryosuke Niwa rn...@webkit.org wrote:
 Does Gecko return a Range at (document, 0) for getRangeAt(0) in such cases?

Okay, it looks like my testing before was off.  Actually, all browsers
have no range in the selection initially.  But I was testing in Live
DOM Viewer, which didn't fully reset the document state when the
source code changed, because not all browsers clear the selection's
range on unload.  I fixed the spec to require the range to initially
be null (like all browsers), and specified that the range has to be
reset to null when the document is unloaded (like IE/Opera, not like
Gecko/WebKit):

http://dvcs.w3.org/hg/editing/rev/6aaa4b8455c9

I also added a test for the latter condition, and filed a Gecko bug
(WebKit is also now buggy per spec):

http://dvcs.w3.org/hg/editing/raw-file/6aaa4b8455c9/selecttest/unload.html
https://bugzilla.mozilla.org/show_bug.cgi?id=717339


Since we seem to have interop on the selection's rangeCount initially
being 0, I'm no longer enthusiastic about changing that.  I'm fine
with leaving the spec as-is now, unless implementers would prefer to
change.

On Wed, Jan 11, 2012 at 11:54 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 Then you have to handle the case when document.documentElement is null.

 And yes, this has come up before; there are scripts out there that remove
 documentElements, do some stuff, insert new documentElements, etc.

 . . .

 This would happen anyway if you set up a selection inside
 document.documentElement and someone removes the documentElement; the normal
 range algorithm will give you endpoints inside the Document.  so you really
 can't enforce this condition.

Well, yes, and you can also do addRange() with whatever you like.  But
we can at least try to make the condition rarer, so bugs are less
likely to crop up in practice when authors inevitably write incorrect
code.

Anyway, as noted, I retract my suggestion for other reasons, unless
someone else is still interested.

Re: Pressing Enter in contenteditable: p or br or div?

On Wed, Jan 11, 2012 at 12:38 PM, Ryosuke Niwa rn...@webkit.org wrote:
 That sounds like a great idea.

 . . .

 I'm not sure if we should add just editoptions though given we might need
 to add more elaborate options in the future. It might make more sense to
 add a new attribute per option as in:

 <div contentEditable paragraphSeparator=p tabIndentation>

Ojan suggested in the other thread that we instead allow calling
execCommand() on Element, and have the result restricted to that
Element.  That solves the global-flags problem too, and doesn't
require new attributes.  So you'd do

  div.execCommand("tabindent", false, true);

or whatever.  Someone could still call
document.execCommand("tabindent", false, false), but that would be
overridden if it was called on the editing host.  I filed a bug on it:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=15522

Does that sound good too?

 Should enter behave like shift+enter when br is the default
 paragraph separator?

Default paragraph separators are used in a couple of other places too,
so it would be a little more work than that.  But I just looked, and
it wouldn't be as bad as I thought.  So this is doable if people have
any good use-cases.

Re: [editing] tab in an editable area WAS: [whatwg] behavior when typing in contentEditable elements


On 1/11/2012 8:15 AM, Aryeh Gregor wrote:

On Tue, Jan 10, 2012 at 4:48 PM, Charles Pritchard ch...@jumis.com wrote:

Would users press Esc to get out of the tab lock?

Do they need to be able to get out of it?  They can't in a regular
word processor, so why should they be able to in Google Docs?  If some
users need to be able to override the feature, that's a good reason to
have it supported by browsers, so browsers can override it.  If the
page just intercepts tab, you can't get around it.


The reason is listed in WCAG2 section 2.1.2 and CR5.
http://www.w3.org/TR/WCAG/

The items suggest that a standard means of moving focus be maintained. 
Users should be given simple instructions on how to move focus if the 
keyboard is trapped.


When the tab key is trapped, I recommend having the escape key move 
focus and untrap tab. That said, that can interfere with full screen 
mode, which may also use escape with varying success.




On Tue, Jan 10, 2012 at 7:28 PM, Ojan Vafai o...@chromium.org wrote:

Historically, one of my biggest frustrations with contentEditable is that
you have to take it all or none. The lack of configurability is frustrating
as a developer. Maybe the solution is to come up with a lower level set of
editing primitives in place of contentEditable instead of trying to extend
it though.

Yes, that's definitely something we need to do.  There are algorithms
I've defined that would probably be really useful to web authors, like
"wrap a list of nodes" or some version of "set the value of the
selection" (= inline formatting algorithm).  I've been holding off on
exposing these to authors because I don't know if these algorithms are
correct yet, and I don't want implementers jumping the gun and
exposing them before using them internally so they're well-tested.  I
expect they'll need to be refactored a bunch once implementers try
actually reimplementing their editing commands in terms of them, and
don't want to break them for authors when that happens.



We look to contentEditable as a means of programming and testing RTE.

In theory, we can implement RTE through the scripting environment, 
expose a feature-full execCommand set, and simply bolt-on existing 
editors such as the CKEditor.


Author-implemented RTE is a bit taboo, but it's quite useful in
prototyping and developing authoring tools.


Aryeh, your work on contentEditable is quite valuable, as it gives us a 
standard means to expose functions and a spec to follow. I understand 
and appreciate your careful deliberation.



-Charles

Re: Colliding FileWriters

2012-01-11 Thread Jonas Sicking

On Tue, Jan 10, 2012 at 1:32 PM, Eric U er...@google.com wrote:
 On Tue, Jan 10, 2012 at 1:08 PM, Jonas Sicking jo...@sicking.cc wrote:
 Hi All,

 We've been looking at implementing FileWriter and had a couple of questions.

 First of all, what happens if multiple pages create a FileWriter for
 the same FileEntry at the same time? Will both be able to write to the
 file at the same time and whoever writes last to a given byte wins?

 This isn't currently specified, and that's a hole we should fill.  By
 not having it in the spec, my assumption would be that last-wins would
 hold, but it would be good to clarify it if that's the behavior we
 want.  It's especially important given that there's nothing like
 fflush(), which would help users know what "last" meant.  Speaking of
 which, should we add a flushing mechanism?

 This is different from how file systems normally work since as long as
 file is open for writing that tends to prevent other processes from
 opening the same file.

 You're perhaps thinking of Windows, where by default files are opened
 in exclusive mode?  On other operating systems, and on Windows when
 you specify FILE_SHARE_WRITE in dwShareMode in CreateFile, multiple
 writers can exist simultaneously.

Ah. I didn't realize this was different on other OSs. It still seems
risky to not provide any means to get exclusive access. The only way I
can see websites dealing with this is to create their own locking
mechanism backed by IndexedDB transactions as a low-level atomic
primitive (local-storage doesn't work since you can't implement
compare-and-swap in an atomic manner).

Having an 'exclusive' flag for createFileWriter seems much easier and
removes the IndexedDB dependency. I'd probably even say that it should
default to true since on the web defaulting to safe rather than fast
generally results in fewer bugs.

 A second question is why is FileEntry.createWriter asynchronous? It
 doesn't actually do any IO and so it seems like it could return an
 answer synchronously.

 FileWriter has a synchronous length property, just as Blob does, so it
 needs to do IO at creation time to look it up.

So how does this work if you have two tabs running in different
processes that create FileWriters for the same FileEntry? Each tab could
end up changing the file's size, in which case the other tab's
FileWriter will either have to synchronously update its .length, or it
will have an outdated length.

So the IO you do when creating the FileWriter is basically unreliable
as soon as it's done.

So it seems like you could get the size when creating the FileEntry
and then use that cached size when creating FileWriter instance.

Though I wonder if it wouldn't be better to remove the .length
property. If anything we could add an asynchronous length getter or a
write method which appends to the end of the file (since writing is
already asynchronous).

Though if we add the 'exclusive' flag described above, then we'll need
to keep createFileWriter async anyway.

 Would this also explain why FileEntry.getFile is asynchronous? I.e. it
 won't call its callback until all current FileWriters have been
 closed?

 Nope.  It's asynchronous because a File is a Blob, and has a
 synchronous length accessor, so we look up the length when we mint the
 File.  Note that the length can go stale if you have multiple writers,
 as we want to keep it fast.

This reminds me of something else that I intended to ask. I seem to
recall that you guys invalidate existing File instances pointing to a
FileEntry if the file is modified after the File object is
instantiated? How is this implemented? Especially given that the
FileWriter which modified the file might live in a different process
than the File reference. Do you guys grab a time-stamp when the File
instance is created and then check that against the last-modified time
of the os-file? What happens if the user modifies the OS time?

/ Jonas

Re: File modification

2012-01-11 Thread Eric U

On Wed, Jan 11, 2012 at 12:22 PM, Charles Pritchard ch...@jumis.com wrote:
 On 1/11/2012 9:00 AM, Glenn Maynard wrote:


 This isn't properly specced anywhere and may be impossible to implement
 perfectly, but previous discussions indicated that Chrome, at least, wanted
 File objects loaded from input elements to only represent access for the
 file as it is when the user opened it.  That is, the File is immutable (like
 a Blob), and if the underlying OS file changes (thus making the original
 data no longer available), attempting to read the File would fail.  (This
 was in the context of storing File in structured clone persistent storage,
 like IndexedDB.)


 Mozilla seems to only take a snapshot when the user opens the file. Chrome
 goes in the other direction, and does so intentionally with FileEntry.
 I'd prefer everyone follow Chrome.

We do so with FileEntry, in the sandbox, because it's intended to be a
much more powerful API than File, and the security aspects of it are
much simpler.  When the user drags a File into the browser, it's much
less clear that they intend to give the web app persistent access to
that File, including all future changes until the page is closed.  I
don't think we'd rush to make that change to the spec.  And if our
implementation isn't snapshotting currently, that's a bug.

 The spec on this could be nudged slightly to support Chrome's existing
 behavior.

 From dragdrop:
 http://www.whatwg.org/specs/web-apps/current-work/multipage/dnd.html
 The files attribute must return a live FileList sequence

 http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#live
 If a DOM object is said to be live, then the attributes and methods on that
 object must operate on the actual underlying data, not a snapshot of the
 data.

 Dragdrop continues:
 for a given FileList object and a given underlying file, the same File
 object must be used each time.

 Given that the underlying file can change, and the FileList sequence is
 live, it seems reasonable that subsequent reads of FileList would access a
 different File object when the underlying file has changed.

 FileList.onchanged would be appropriate. File.onupdated would not be
 appropriate. Entry.onupdated would be appropriate.


 I have one major technical concern: monitoring files for changes isn't
 free.  With only a DOM event, all instantiated Files (or Entries) would have
 to monitor changes; you don't want to depend on "do something if an event
 handler is registered", since that violates the principle of event handler
 registration having no other side-effects.  Monitoring should be enabled
 explicitly.

 I also wonder whether this could be implemented everywhere, eg. on mobile
 systems.


 At this point, iOS still doesn't allow <input type=file> nor dataTransfer
 of file. So, we're looking far ahead.

 A system may send a FileList.onchanged() event when it notices that the
 FileList has been updated. It can be done on access of a live FileList when
 a mutation is detected. It could be done by occasional polling, or it could
 be done via notify-style OS hooks. In the first case, there is no
 significant overhead. webkitdirectory returns a FileList object that can be
 monitored via directory notification hooks; again, if the OS supports it.

 Event handlers have some side effects, but not in the scripting environment.
 onclick, for example, may mean that an element responds to touch events in
 the mobile environment.


 -Charles

Re: File modification


On 1/11/2012 12:27 PM, Eric U wrote:

On Wed, Jan 11, 2012 at 12:22 PM, Charles Pritchard ch...@jumis.com wrote:

On 1/11/2012 9:00 AM, Glenn Maynard wrote:


This isn't properly specced anywhere and may be impossible to implement
perfectly, but previous discussions indicated that Chrome, at least, wanted
File objects loaded from input elements to only represent access for the
file as it is when the user opened it.  That is, the File is immutable (like
a Blob), and if the underlying OS file changes (thus making the original
data no longer available), attempting to read the File would fail.  (This
was in the context of storing File in structured clone persistent storage,
like IndexedDB.)


Mozilla seems to only take a snapshot when the user opens the file. Chrome
goes in the other direction, and does so intentionally with FileEntry.
I'd prefer everyone follow Chrome.

We do so with FileEntry, in the sandbox, because it's intended to be a
much more powerful API than File, and the security aspects of it are
much simpler.  When the user drags a File into the browser, it's much
less clear that they intend to give the web app persistent access to
that File, including all future changes until the page is closed.  I
don't think we'd rush to make that change to the spec.  And if our
implementation isn't snapshotting currently, that's a bug.


In my reading of the spec, UAs are explicitly instructed not to implement a
snapshot. Everything in the specs talks about underlying data.
They are to keep the FileList live, and failing that, or should the
underlying file be removed, they should throw an error when the File
object is used with FileReader or the like.


I've written code for Chrome that detects file changes, so it'd be a bit 
of a bummer to see this feature removed without suitable replacement.


FileEntry does not currently work with D&D as well as File. I don't
think it's caught up yet. I can't drag a FileEntry into <input
type=file> on another site.



 From dragdrop:
http://www.whatwg.org/specs/web-apps/current-work/multipage/dnd.html
The files attribute must return a live FileList sequence

http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#live
If a DOM object is said to be live, then the attributes and methods on that
object must operate on the actual underlying data, not a snapshot of the
data.

Dragdrop continues:
for a given FileList object and a given underlying file, the same File
object must be used each time.



Underlying is an important distinction here.

For persistent directory access, we talked about using 
requestFileSystem(MOUNT).


-Charles

Re: File modification

2012-01-11 Thread Glenn Maynard

(Pardon the top-quoting and poor editing; working off a phone today.)

This isn't properly specced anywhere and may be impossible to implement
perfectly, but previous discussions indicated that Chrome, at least, wanted
File objects loaded from input elements to only represent access for the
file as it is when the user opened it.  That is, the File is immutable
(like a Blob), and if the underlying OS file changes (thus making the
original data no longer available), attempting to read the File would
fail.  (This was in the context of storing File in structured clone
persistent storage, like IndexedDB.)

(I don't know if this was thought to apply to FSAPI-acquired Files as well,
eg. requiring the user to request a new File from the Entry after modifying
it.  That would be annoying, but it would preserve the invariant that Blobs
are immutable, which shouldn't be sacrificed lightly.)

That would make onchanged not meaningful for File.  However, it would be
useful on Entry instead.

I have one major technical concern: monitoring files for changes isn't
free.  With only a DOM event, all instantiated Files (or Entries) would
have to monitor changes; you don't want to depend on "do something if an
event handler is registered", since that violates the principle of event
handler registration having no other side-effects.  Monitoring should be
enabled explicitly.

I also wonder whether this could be implemented everywhere, eg. on mobile
systems.
 On Jan 10, 2012 1:58 PM, Charles Pritchard ch...@jumis.com wrote:





 On Jan 10, 2012, at 1:53 PM, Eric U er...@google.com wrote:

 On Tue, Jan 10, 2012 at 1:29 PM, Charles Pritchard ch...@visc.us wrote:

 Modern operating systems have efficient mechanisms to send a signal when a
 watched file or directory is modified.


 File and FileEntry have a last modified date-- currently we must poll
 entries to see if the modification date changes. That works completely fine
 in practice, but it doesn't give us a chance to exploit the efficiency of
 some operating systems in notifying applications about file updates.
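
The polling approach mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not anyone's actual implementation: makeModificationPoller is a hypothetical name, and how the modification timestamp is obtained (from a File, a FileEntry's metadata, or elsewhere) is left to the caller-supplied getLastModified function.

```javascript
// Hypothetical helper: build a poller that reports whether the
// modification timestamp returned by getLastModified() differs from the
// one seen on the previous check.
function makeModificationPoller(getLastModified) {
  // Remember the timestamp observed when the poller is created.
  let previous = getLastModified();
  return function hasChanged() {
    const current = getLastModified();
    const changed = current !== previous;
    previous = current; // the next call compares against this reading
    return changed;
  };
}
```

An application would call the returned function on a timer. A native change notification, as proposed in this thread, would replace that timer with an event.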


 So as a strawman: a File.onupdated event handler may be useful.


 It seems like it would be most useful if the File or FileEntry points
 to a file outside the sandbox defined by the FileSystem spec.  Does
 any browser currently supply such a thing?  Chrome currently
 implements this [with FileEntry] only for ChromeOS components that are
 implemented as extensions.  Does any browser let you have a File
 outside the sandbox *and* update its modification time?

 If you're dealing only with FileEntries inside the sandbox, there are
 already more efficient ways to tell yourself that you've changed
 something.


 Far as I can tell, File is live, and it's supposed to be live from <input
 type=file>.

 For FileEntry-- I'd imagine we'll see cross-origin communication with the
 objects at some point. In those cases, onupdated would be simpler than an
 additional postMessage layer for update notifications.

Re: [Bug 15434] New: [IndexedDB] Detail steps for assigning a key to a value

On Wed, Jan 11, 2012 at 12:40 PM, Joshua Bell jsb...@chromium.org wrote:

I thought this issue was theoretical when I filed it, but it appears to be
the reason behind the difference in results for IE10 vs. Chrome 17 when
running this test:

http://samples.msdn.microsoft.com/ietestcenter/indexeddb/indexeddb_harness.htm?url=idbobjectstore_add8.htm

If I'm reading the test script right, the IDB implementation is being
asked to assign a key (autogenerated, so a number, say 1) using the key
path "test.obj.key" to a value { property: "data" }

The Chromium/WebKit implementation follows the steps I outlined below.
Namely, at step 4 the algorithm would abort when the value is found to not
have a test attribute.

To be clear, in Chromium the *algorithm* aborts, leaving the value
unchanged. The request and transaction carry on just fine.

If IE10 is passing, then it must be synthesizing new JS objects as it
walks the key path, until it gets to the final step in the path, yielding
something like { property: "data", test: { obj: { key: 1 } } }
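
The synthesizing behavior guessed at above could look roughly like this. It is a sketch of the speculation in this message, not a description of what IE10 actually does; the function name assignKeySynthesizing is hypothetical, and stopping when an existing intermediate value is not an object is an assumption made here.

```javascript
// Hypothetical variant: create missing intermediate objects while walking
// the key path, instead of aborting when an attribute is absent.
function assignKeySynthesizing(keyPath, key, value) {
  if (keyPath === "") return value; // nothing to assign
  const attributes = keyPath.split(".");
  const last = attributes.pop();
  let object = value;
  for (const attribute of attributes) {
    const next = object[attribute];
    if (next === undefined) {
      object[attribute] = {}; // synthesize the missing step
    } else if (next === null || typeof next !== "object") {
      return value; // assumption: cannot descend into a primitive, leave value as-is
    }
    object = object[attribute];
  }
  object[last] = key;
  return value;
}
```

With keyPath "test.obj.key", key 1 and value { property: "data" }, this produces the nested object shown above.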

Thoughts?

On Thu, Jan 5, 2012 at 1:44 PM, bugzi...@jessica.w3.org wrote:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=15434

Summary: [IndexedDB] Detail steps for assigning a key to a
value
Product: WebAppsWG
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: minor
Priority: P2
Component: Indexed Database API
AssignedTo: dave.n...@w3.org
ReportedBy: jsb...@chromium.org
QAContact: member-webapi-...@w3.org
CC: m...@w3.org, public-webapps@w3.org

In section 5.1 Object Store Storage Operation, step 2: when a key generator
is used with store with in line keys, the spec says: "set the property in
value pointed to by store's key path to the new value for key"

The steps for extracting a key from a value using a key path are called out
explicitly under Algorithms in 4.7. Should the steps for assigning a key to
a value using a key path be similarly documented?

Cribbing from the spec, this could read as:

4.X Steps for assigning a key to a value using a key path

When taking the steps for assigning a key to a value using a key path, the
implementation must run the following algorithm. The algorithm takes a key
path named /keyPath/, a key named /key/, and a value named /value/ which may
be modified by the steps of the algorithm.

1. If /keyPath/ is the empty string, skip the remaining steps and /value/ is
   not modified.
2. Let /remainingKeyPath/ be /keyPath/ and /object/ be /value/.
3. If /remainingKeyPath/ has a period in it, assign /remainingKeyPath/ to be
   everything after the first period and assign /attribute/ to be everything
   before that first period. Otherwise, go to step 7.
4. If /object/ does not have an attribute named /attribute/, then skip the
   rest of these steps and /value/ is not modified.
5. Assign /object/ to be the /value/ of the attribute named /attribute/ on
   /object/.
6. Go to step 3.
7. NOTE: The steps leading here ensure that /remainingKeyPath/ is a single
   attribute name (i.e. string without periods) by this step.
8. Let /attribute/ be /remainingKeyPath/
9. If /object/ has an attribute named /attribute/ which is not modifiable,
   then skip the remaining steps and /value/ is not modified.
10. Set an attribute named /attribute/ on /object/ with the value /key/.
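
The steps above can be sketched in script as follows. This is an illustrative reading of the proposal, not spec text: the names assignKeyToValue and isObjectLike are hypothetical, the boolean return value is added for convenience, "not modifiable" is approximated by looking only at the own-property descriptor, and bailing out when /object/ is not an object is an assumption the steps do not spell out.

```javascript
// True for values that can carry properties (objects and functions).
function isObjectLike(v) {
  return v !== null && (typeof v === "object" || typeof v === "function");
}

// Returns true if value was modified, false if the steps were skipped.
function assignKeyToValue(keyPath, key, value) {
  if (keyPath === "") return false;                 // step 1
  let remainingKeyPath = keyPath;                    // step 2
  let object = value;
  let dot = remainingKeyPath.indexOf(".");
  while (dot !== -1) {                               // step 3
    const attribute = remainingKeyPath.slice(0, dot);
    remainingKeyPath = remainingKeyPath.slice(dot + 1);
    // Step 4: abort when an intermediate attribute is missing.
    if (!isObjectLike(object) || !(attribute in object)) return false;
    object = object[attribute];                      // step 5
    dot = remainingKeyPath.indexOf(".");             // step 6
  }
  const attribute = remainingKeyPath;                // steps 7-8
  if (!isObjectLike(object)) return false;           // assumption, see lead-in
  // Step 9 (approximation): a read-only data property or a getter
  // without a setter counts as "not modifiable".
  const desc = Object.getOwnPropertyDescriptor(object, attribute);
  const readOnly = desc !== undefined &&
    ("writable" in desc ? !desc.writable : desc.set === undefined);
  if (readOnly) return false;
  object[attribute] = key;                           // step 10
  return true;
}
```

For a value of { property: "data" } and the key path "test.obj.key", the function returns false at step 4 and leaves the value untouched.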

Notes:

The above talks in terms of a mutable value. It could be amended to have an
initial step which produces a clone of the value, which is later returned,
but given how this algorithm is used the difference is not observable, since
the value stored should already be a clone that doesn't have any other
references.

Step 9 is present in case the key path refers to a special property, e.g. a
String/Array length, Blob/File properties, etc.


Re: File modification


On 1/11/2012 9:00 AM, Glenn Maynard wrote:


This isn't properly specced anywhere and may be impossible to 
implement perfectly, but previous discussions indicated that Chrome, 
at least, wanted File objects loaded from input elements to only 
represent access for the file as it is when the user opened it.  That 
is, the File is immutable (like a Blob), and if the underlying OS file 
changes (thus making the original data no longer available), 
attempting to read the File would fail.  (This was in the context of 
storing File in structured clone persistent storage, like IndexedDB.)




Mozilla seems to only take a snapshot when the user opens the file. 
Chrome goes in the other direction, and does so intentionally with 
FileEntry.

I'd prefer everyone follow Chrome.

The spec on this could be nudged slightly to support Chrome's existing 
behavior.


From dragdrop:
http://www.whatwg.org/specs/web-apps/current-work/multipage/dnd.html
The files attribute must return a live FileList sequence

http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#live
If a DOM object is said to be live, then the attributes and methods on 
that object must operate on the actual underlying data, not a snapshot 
of the data.


Dragdrop continues:
for a given FileList object and a given underlying file, the same File 
object must be used each time.


Given that the underlying file can change, and the FileList sequence is 
live, it seems reasonable that subsequent reads of FileList would access 
a different File object when the underlying file has changed.


FileList.onchanged would be appropriate. File.onupdated would not be 
appropriate. Entry.onupdated would be appropriate.


I have one major technical concern: monitoring files for changes isn't
free.  With only a DOM event, all instantiated Files (or Entries)
would have to monitor changes; you don't want to depend on "do
something if an event handler is registered", since that violates the
principle of event handler registration having no other side-effects.
Monitoring should be enabled explicitly.


I also wonder whether this could be implemented everywhere, eg. on 
mobile systems.




At this point, iOS still doesn't allow input type=file nor 
dataTransfer of file. So, we're looking far ahead.


A system may send a FileList.onchanged() event when it notices that the 
FileList has been updated. It can be done on access of a live FileList 
when a mutation is detected. It could be done by occasional polling, or 
it could be done via notify-style OS hooks. In the first case, there is 
no significant overhead. webkitdirectory returns a FileList object that 
can be monitored via directory notification hooks; again, if the OS 
supports it.


Event handlers have some side effects, but not in the scripting 
environment. onclick, for example, may mean that an element responds to 
touch events in the mobile environment.



-Charles

Re: Pressing Enter in contenteditable: p or br or div?

On Wed, Jan 11, 2012 at 3:15 PM, Ryosuke Niwa rn...@webkit.org wrote:
 That sounds workable. Presumably it's only available on the editing host (as
 opposed to any element or any element with contenteditable content
 attribute).

Right.

Re: File modification

2012-01-11 Thread Kyle Huey

On Tue, Jan 10, 2012 at 10:57 PM, Charles Pritchard ch...@jumis.com wrote:

 Far as I can tell, File is live, and it's supposed to be live from input
 type=file.


 FWIW, I (and I believe others at Mozilla) consider the fact that File
objects are live in Gecko a bug.  Fixing this is kind of complicated in
our implementation though, and I've had other things to work on, so it's
still there.

- Kyle

Re: [File API]: Determining encoding

2012-01-11 Thread Arun Ranganathan

Glenn,

Sorry about letting this one get by unanswered -- I was OOTO at the time you 
sent it.


 Questions and thoughts while reading
 http://dev.w3.org/2006/webapi/FileAPI/#enctype:

 is this spec actually
 requiring that every registered encoding be supported?

What's required is that UAs support as many of the encodings in [IANACHARSET] 
as possible -- I think that's fair.  I've rewritten the algorithm to allow for 
what's not supported to be treated as UTF-8.  

Upon reflection, it might be prudent to decide a minimum subset of supported 
encodings, but I'm also comfortable leaving this to implementations and not 
saying anything about it.  What do you think?

 It would be clearer if steps 1 and 2 used the same terminology for an
 invalid character set.  

[snip]

I really liked your version -- much clearer than the original text -- and so 
I've rewritten the editor's draft to reflect the change.  Many thanks :)

http://dev.w3.org/2006/webapi/FileAPI/#encoding-determination

-- A*

 When reading blob objects using the readAsText() read method, the
following encoding determination steps MUST be followed:

 1. Let charset be null.
 2. If the encoding parameter is specified, and is the name or alias of a
character set used on the Internet [IANACHARSET], let charset be encoding
parameter.
 3. If charset is null, and the blob's type attribute is present, and its
Charset Parameter [RFC2046] is the name or alias of a character set used on
the Internet, let charset be its Charset Parameter.
 4. If charset is null, then for each of the rows in the following table,
starting with the first one and going down, if the first bytes of blob
match the bytes given in the first column, then let charset be the encoding
given in the cell in the second column of that row.  [table]
 5. If charset is null, let charset be UTF-8.
 6. Return the result of decoding ...
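
Steps 1 through 5 above can be sketched roughly as follows. This is only an illustration under stated assumptions: KNOWN_CHARSETS is a tiny hypothetical stand-in for the [IANACHARSET] registry, BOM_TABLE is a guess at the elided [table] (byte order marks), and the function names are made up for the example. Step 6 (the actual decoding) is left out.

```javascript
// Hypothetical stand-in for a lookup against the [IANACHARSET] registry.
const KNOWN_CHARSETS = new Set([
  "utf-8", "utf-16le", "utf-16be", "iso-8859-1", "windows-1252",
]);

// Assumed contents of the elided [table]: byte order marks.
const BOM_TABLE = [
  { bytes: [0xfe, 0xff], charset: "utf-16be" },
  { bytes: [0xff, 0xfe], charset: "utf-16le" },
  { bytes: [0xef, 0xbb, 0xbf], charset: "utf-8" },
];

function isKnownCharset(name) {
  return typeof name === "string" &&
    KNOWN_CHARSETS.has(name.trim().toLowerCase());
}

// Pulls the charset parameter out of a MIME type string, or returns null.
function charsetParameter(mimeType) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(mimeType || "");
  return match ? match[1] : null;
}

function determineCharset(encoding, blobType, firstBytes) {
  let charset = null;                                   // step 1
  if (isKnownCharset(encoding)) {                       // step 2
    charset = encoding.trim().toLowerCase();
  }
  if (charset === null) {                               // step 3
    const param = charsetParameter(blobType);
    if (isKnownCharset(param)) charset = param.toLowerCase();
  }
  if (charset === null) {                               // step 4
    for (const row of BOM_TABLE) {
      if (row.bytes.every((b, i) => firstBytes[i] === b)) {
        charset = row.charset;
        break;
      }
    }
  }
  if (charset === null) charset = "utf-8";              // step 5
  return charset;
}
```

Note that an unrecognized encoding argument simply falls through to the later steps, which matches the "treat what's not supported as UTF-8" intent only when nothing else identifies the charset.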

[IANACHARSET] http://www.iana.org/assignments/character-sets

-- 
Glenn Maynard

Re: Colliding FileWriters

2012-01-11 Thread Eric U

On Wed, Jan 11, 2012 at 12:25 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, Jan 10, 2012 at 1:32 PM, Eric U er...@google.com wrote:
 On Tue, Jan 10, 2012 at 1:08 PM, Jonas Sicking jo...@sicking.cc wrote:
 Hi All,

 We've been looking at implementing FileWriter and had a couple of questions.

 First of all, what happens if multiple pages create a FileWriter for
 the same FileEntry at the same time? Will both be able to write to the
 file at the same time and whoever writes last to a given byte wins?

 This isn't currently specified, and that's a hole we should fill.  By
 not having it in the spec, my assumption would be that last-wins would
 hold, but it would be good to clarify it if that's the behavior we
 want.  It's especially important given that there's nothing like
 fflush(), which would help users know what last meant.  Speaking of
 which, should we add a flushing mechanism?

 This is different from how file systems normally work since as long as
 file is open for writing that tends to prevent other processes from
 opening the same file.

 You're perhaps thinking of windows, where by default files are opened
 in exclusive mode?  On other operating systems, and on windows when
 you specify FILE_SHARE_WRITE in dwShareMode in CreateFile, multiple
 writers can exist simultaneously.

 Ah. I didn't realize this was different on other OSs. It still seems
 risky to not provide any means to get exclusive access. The only way I
 can see websites dealing with this is to create their own locking
 mechanism backed by using IndexedDB transactions as low-level atomic
 primitive (local-storage doesn't work since you can't implement
 compare-and-swap in an atomic manner).

 Having an 'exclusive' flag for createFileWriter seems much easier and
 removes the IndexedDB dependency. I'd probably even say that it should
 default to true since on the web defaulting to safe rather than fast
 generally results in fewer bugs.

I don't think I'd generally be averse to this.  However, it would then
require some sort of a revocation mechanism as well.  If you're done
with your FileWriter, you want to be able to get rid of it without
depending on GC, so that another context can create one.  And if you
forget to revoke it, behavior in the second context presumably depends
on GC, which is a bit ugly.

I'm not quite sure how urgent this is yet, though.  I've been assuming
that if you have transactional/synchronization semantics you want to
maintain, you'll be using IDB anyway, or a server handshake, etc.  But
of course it's easy to write a naive app that the user loads in two
windows, with bad effect.

 A second question is why is FileEntry.createWriter asynchronous? It
 doesn't actually do any IO and so it seems like it could return an
 answer synchronously.

 FileWriter has a synchronous length property, just as Blob does, so it
 needs to do IO at creation time to look it up.

 So how does this work if you have two tabs running in different
 processes create FileWriters for the same FileEntry. Each tab could
 end up changing the file's size in which case the other tab's
 FileWriter will either have to synchronously update its .length, or it
 will have an outdated length.

 So the IO you do when creating the FileWriter is basically unreliable
 as soon as it's done.

 So it seems like you could get the size when creating the FileEntry
 and then use that cached size when creating FileWriter instance.

The size in the FileEntry is no more reliable than that in the
FileWriter, of course.  But if you know you're the only writer,
either's good.

 Though I wonder if it wouldn't be better to remove the .length
 property. If anything we could add an asynchronous length getter or a
 write method which appends to the end of the file (since writing is
 already asynchronous).

A new async length getter's not needed; you can use file() for that already.
I didn't originally add append due to its apparent redundancy with
seek+write, but as you point out, seek+write doesn't guarantee to
append if there are multiple writers.

 Though if we add the 'exclusive' flag described above, then we'll need
 to keep createFileWriter async anyway.

Right--I think we should pick whatever subset of these suggestions
seems the most useful, since they overlap a bit.

One working subset would be:

* Keep createFileWriter async.
* Make it optionally exclusive [possibly by default].  If exclusive,
its length member is trustworthy.  If not, it can go stale.
* Add an append method [needed only for non-exclusive writes, but
useful for logs, and a safe default].
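
To make the stale-length point concrete, here is a small in-memory model. It is not the FileWriter API; SharedFile, ModelWriter and the append() method are hypothetical names used only to show why seek+write can clobber data with two non-exclusive writers while an append resolved at write time does not.

```javascript
// Hypothetical in-memory stand-in for a file shared by several writers.
class SharedFile {
  constructor() {
    this.bytes = [];
  }
}

class ModelWriter {
  constructor(file) {
    this.file = file;
    this.length = file.bytes.length; // read once at creation; can go stale
    this.position = 0;
  }
  seek(offset) {
    this.position = offset;
  }
  write(data) {
    for (const byte of data) {
      this.file.bytes[this.position++] = byte;
    }
  }
  append(data) {
    // The end of the file is resolved when the write happens.
    this.file.bytes.push(...data);
  }
}

function seekThenWrite() {
  const file = new SharedFile();
  const a = new ModelWriter(file);
  const b = new ModelWriter(file);
  a.seek(a.length);
  a.write([1, 2]);
  b.seek(b.length); // b.length is still 0, so b overwrites a's bytes
  b.write([3, 4]);
  return file.bytes;
}

function appendOnly() {
  const file = new SharedFile();
  const a = new ModelWriter(file);
  const b = new ModelWriter(file);
  a.append([1, 2]);
  b.append([3, 4]);
  return file.bytes;
}
```

In this model seekThenWrite() loses the first writer's data, while appendOnly() keeps both writes in order.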

 Would this also explain why FileEntry.getFile is asynchronous? I.e. it
 won't call its callback until all current FileWriters have been
 closed?

 Nope.  It's asynchronous because a File is a Blob, and has a
 synchronous length accessor, so we look up the length when we mint the
 File.  Note that the length can go stale if you have multiple writers,
 as we want to keep it fast.

 This reminds me of

Re: File modification


On 1/11/2012 12:37 PM, Kyle Huey wrote:
On Tue, Jan 10, 2012 at 10:57 PM, Charles Pritchard ch...@jumis.com wrote:


Far as I can tell, File is live, and it's supposed to be live from
input type=file.


 FWIW, I (and I believe others at Mozilla) consider the fact that File 
objects are live in Gecko a bug.  Fixing this is kind of complicated 
in our implementation though, and I've had other things to work on, so 
it's still there.




Sorry I misspoke on this one. FileList is supposed to be live. File 
objects are immutable.
The bug (in my opinion) is that FileList returns the same File object 
when the underlying length/date has changed.


It ought to return a new File object (in my opinion), and references to 
the old File object should return an error when used with items like 
FileReader.

They're dirty blobs, out of control masses of green goo.

-Charles

String to ArrayBuffer

Currently, we can asynchronously use BlobBuilder with FileReader to get 
an array buffer from a string.
We can of course, use code to convert String.fromCharCode into a 
Uint8Array, but it's ugly.


The StringEncoding proposal seems a bit much for most web use:
http://wiki.whatwg.org/wiki/StringEncoding

All we really ever do is work on DOMString, and that's covered by UTF8.

As the following file shows, DOMString to ArrayBuffer conversion is about 30 
lines of code (start at line 125):

http://code.google.com/p/stringencoding/source/browse/encoding.js

It seems like this kind of type conversion could be handled more 
efficiently and be less error prone on programmers like myself, who 
often forget to test with multibyte strings.
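
For illustration, a hand-rolled conversion of this kind might look like the sketch below. It is not the code from the linked encoding.js; stringToUtf8Buffer is a made-up name, and the choice to throw on unpaired surrogates is an assumption about the desired behavior.

```javascript
// Encodes a JS string as UTF-8 and returns the bytes as an ArrayBuffer.
// Throws if the string contains an unpaired surrogate.
function stringToUtf8Buffer(str) {
  const out = [];
  for (let i = 0; i < str.length; i++) {
    let cp = str.charCodeAt(i);
    if (cp >= 0xd800 && cp <= 0xdbff) {
      // High surrogate: it must be followed by a low surrogate.
      const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) {
        throw new Error("lone high surrogate at index " + i);
      }
      cp = 0x10000 + ((cp - 0xd800) << 10) + (next - 0xdc00);
      i++;
    } else if (cp >= 0xdc00 && cp <= 0xdfff) {
      throw new Error("lone low surrogate at index " + i);
    }
    if (cp < 0x80) {
      out.push(cp);
    } else if (cp < 0x800) {
      out.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp < 0x10000) {
      out.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {
      out.push(
        0xf0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3f),
        0x80 | ((cp >> 6) & 0x3f),
        0x80 | (cp & 0x3f)
      );
    }
  }
  return new Uint8Array(out).buffer;
}
```

The multibyte branches are exactly the part that tends to go untested when only ASCII input is tried.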


I'm sure this has popped up many times before on the list. Thought I'd 
put it out there again.
We could just tweak the ArrayBuffer constructor to support DOMString as 
an argument.

Currently, it supports length.

-Charles

Re: String to ArrayBuffer


On 1/11/2012 2:49 PM, James Robinson wrote:



On Wed, Jan 11, 2012 at 2:45 PM, Charles Pritchard ch...@jumis.com wrote:


Currently, we can asynchronously use BlobBuilder with FileReader
to get an array buffer from a string.
We can of course, use code to convert String.fromCharCode into a
Uint8Array, but it's ugly.

The StringEncoding proposal seems a bit much for most web use:
http://wiki.whatwg.org/wiki/StringEncoding

All we really ever do is work on DOMString, and that's covered by
UTF8.


DOMString is not UTF8 or necessarily unicode.  It's a sequence of 16 
bit integers and a length.




To clarify, I'd want ArrayBuffer(DOMString) to work with unicode and 
throw an error if the DOMString is not valid unicode.

This is consistent with other Web Apps APIs.

For feature detection, the method should be wrapped in a try-catch block 
anyway.


-Charles

Re: String to ArrayBuffer


On 1/11/2012 2:49 PM, James Robinson wrote:



On Wed, Jan 11, 2012 at 2:45 PM, Charles Pritchard ch...@jumis.com wrote:


Currently, we can asynchronously use BlobBuilder with FileReader
to get an array buffer from a string.
We can of course, use code to convert String.fromCharCode into a
Uint8Array, but it's ugly.

The StringEncoding proposal seems a bit much for most web use:
http://wiki.whatwg.org/wiki/StringEncoding

All we really ever do is work on DOMString, and that's covered by
UTF8.


DOMString is not UTF8 or necessarily unicode.  It's a sequence of 16 
bit integers and a length.


Is there any instance in practice where DOMString as exposed to the 
scripting environment is not implemented as a unicode string?
I realize that internally, DOMString may be implemented as a 16 bit 
integer + length;



As following file shows, DOMString to ArrayBuffer conversion is
about 30 lines of code (start at line 125):
http://code.google.com/p/stringencoding/source/browse/encoding.js


This only seems correct for valid unicode strings, which does not 
cover all DOMStrings.




Sure, they're checking for correctness. And it's really only about 15 lines.

Browsers do the same thing with WindowBase64, though it's specified as 
DOMString, in practice (as the notes say), it's unicode.

http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#atob

Web Storage, also, only works with unicode.

-Charles

Re: String to ArrayBuffer

2012-01-11 Thread James Robinson

On Wed, Jan 11, 2012 at 2:45 PM, Charles Pritchard ch...@jumis.com wrote:

 Currently, we can asynchronously use BlobBuilder with FileReader to get an
 array buffer from a string.
 We can of course, use code to convert String.fromCharCode into a
 Uint8Array, but it's ugly.

 The StringEncoding proposal seems a bit much for most web use:
 http://wiki.whatwg.org/wiki/StringEncoding

 All we really ever do is work on DOMString, and that's covered by UTF8.


DOMString is not UTF8 or necessarily unicode.  It's a sequence of 16 bit
integers and a length.



 As following file shows, DOMString to ArrayBuffer conversion is about 30
 lines of code (start at line 125):
 http://code.google.com/p/stringencoding/source/browse/encoding.js


This only seems correct for valid unicode strings, which does not cover all
DOMStrings.

- James



 It seems like this kind of type conversion could be handled more
 efficiently and be less error prone on programmers like myself, who often
 forget to test with multibyte strings.

 I'm sure this has popped up many times before on the list. Thought I'd put
 it out there again.
 We could just tweak the ArrayBuffer constructor to support DOMString as an
 argument.
 Currently, it supports length.

 -Charles

RE: [Bug 15434] New: [IndexedDB] Detail steps for assigning a key to a value

2012-01-11 Thread Israel Hilerio

We updated Section 3.1.3 with examples to capture the behavior you are seeing 
in IE.  Based on this section, if the attribute doesn't exist and autogen is 
set to true, the attribute is added to the structure and can be used to access 
the generated value. The use case for this is to have the system auto-generate 
a key value in a well-defined attribute. This allows devs to access their 
primary keys from a well-known attribute.  This is easier than having to add 
the attribute yourself with an empty value before adding the object. This was 
agreed on in a previous email thread last year.

I agree with you that we should probably add a section with steps for 
assigning a key to a value using a key path.  However, I would change step #4 
and add #8.5 to reflect the approach described in section 3.1.3 and #9 to 
reflect that you can't add attributes to entities which are not objects.  In my 
mind this is what the new section should look like:

When taking the steps for assigning a key to a value using a key path, the
implementation must run the following algorithm. The algorithm takes a key path
named /keyPath/, a key named /key/, and a value named /value/ which may be
modified by the steps of the algorithm.

1. If /keyPath/ is the empty string, skip the remaining steps and /value/ is
not modified.
2. Let /remainingKeyPath/ be /keyPath/ and /object/ be /value/.
3. If /remainingKeyPath/ has a period in it, assign /remainingKeyPath/ to be
everything after the first period and assign /attribute/ to be everything
before that first period. Otherwise, go to step 7.
4. If /object/ does not have an attribute named /attribute/, then create the
attribute and assign it an empty object.  If there is an error creating the
attribute, then skip the remaining steps, /value/ is not modified, and throw a
DOMException of type InvalidStateError.
5. Assign /object/ to be the value of the attribute named /attribute/ on
/object/.
6. Go to step 3.
7. NOTE: The steps leading here ensure that /remainingKeyPath/ is a single
attribute name (i.e. string without periods) by this step.
8. Let /attribute/ be /remainingKeyPath/.
8.5. If /object/ does not have an attribute named /attribute/, then create the
attribute.  If there is an error creating the attribute, then skip the
remaining steps, /value/ is not modified, and throw a DOMException of type
InvalidStateError.
9. If /object/ has an attribute named /attribute/ which is not modifiable, then
skip the remaining steps, /value/ is not modified, and throw a DOMException of 
type InvalidStateError.
10. Set an attribute named /attribute/ on /object/ with the value /key/.
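
As a rough sketch of how the proposed steps behave, assuming plain JS objects: the function and helper names below are hypothetical, a plain Error named InvalidStateError stands in for the DOMException, and Reflect.set is used so that a failed write is reported rather than silently ignored.

```javascript
// Stand-in for "throw a DOMException of type InvalidStateError".
function invalidState(message) {
  const error = new Error(message);
  error.name = "InvalidStateError";
  return error;
}

function isObject(x) {
  return (typeof x === "object" && x !== null) || typeof x === "function";
}

function assignKeyUsingKeyPath(keyPath, key, value) {
  if (keyPath === "") return;                        // step 1
  const names = keyPath.split(".");                  // steps 2, 3 and 6 as a loop
  const last = names.pop();                          // steps 7 and 8
  let object = value;
  for (const attribute of names) {
    if (!isObject(object)) {
      throw invalidState("cannot add attributes to a non-object");
    }
    // Step 4: create a missing intermediate attribute as an empty object.
    if (!(attribute in object) && !Reflect.set(object, attribute, {})) {
      throw invalidState("could not create attribute " + attribute);
    }
    object = object[attribute];                      // step 5
  }
  if (!isObject(object)) {
    throw invalidState("cannot add attributes to a non-object");
  }
  // Steps 8.5, 9 and 10: Reflect.set returns false when the attribute cannot
  // be created or is not modifiable.
  if (!Reflect.set(object, last, key)) {
    throw invalidState("attribute " + last + " is not modifiable");
  }
}
```

One thing the sketch makes visible: if a later step throws after step 4 has already created intermediate objects, /value/ has in fact been modified, so the "is not modified" wording would need either a clone up front or a rollback.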

What do you think?

Israel

On Wednesday, January 11, 2012 12:42 PM, Joshua Bell wrote:
From: jsb...@google.com [mailto:jsb...@google.com] On Behalf Of Joshua Bell
Sent: Wednesday, January 11, 2012 12:42 PM
To: public-webapps@w3.org
Subject: Re: [Bug 15434] New: [IndexedDB] Detail steps for assigning a key to a 
value

On Wed, Jan 11, 2012 at 12:40 PM, Joshua Bell jsb...@chromium.org wrote:
I thought this issue was theoretical when I filed it, but it appears to be the 
reason behind the difference in results for IE10 vs. Chrome 17 when running 
this test:

http://samples.msdn.microsoft.com/ietestcenter/indexeddb/indexeddb_harness.htm?url=idbobjectstore_add8.htm

If I'm reading the test script right, the IDB implementation is being asked to 
assign a key (autogenerated, so a number, say 1) using the key path 
test.obj.key to a value { property: data }

The Chromium/WebKit implementation follows the steps I outlined below. Namely, 
at step 4 the algorithm would abort when the value is found to not have a 
test attribute.

To be clear, in Chromium the *algorithm* aborts, leaving the value unchanged. 
The request and transaction carry on just fine.

If IE10 is passing, then it must be synthesizing new JS objects as it walks the 
key path, until it gets to the final step in the path, yielding something like 
{ property: data, test: { obj: { key: 1 } } }
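
The aborting behavior described above for Chromium could be modeled like this. It is a sketch only, not WebKit code; the function name is made up, and returning a boolean to signal "algorithm aborted" is an assumption for the sake of the example.

```javascript
// Walks the key path without synthesizing objects. Returns false (leaving
// value untouched) as soon as an intermediate attribute is missing.
function assignKeyAbortingOnMissing(keyPath, key, value) {
  if (keyPath === "") return false;
  const names = keyPath.split(".");
  const last = names.pop();
  let object = value;
  for (const attribute of names) {
    if (object === null || typeof object !== "object" || !(attribute in object)) {
      return false; // abort: the intermediate attribute does not exist
    }
    object = object[attribute];
  }
  if (object === null || typeof object !== "object") return false;
  // Reflect.set returns false if the attribute is not modifiable.
  return Reflect.set(object, last, key);
}
```

With the value { property: "data" } and the key path test.obj.key, this returns false and leaves the value as it was, which is the difference from the object-synthesizing behavior.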

Thoughts?

On Thu, Jan 5, 2012 at 1:44 PM, bugzi...@jessica.w3.org wrote:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15434

           Summary: [IndexedDB] Detail steps for assigning a key to a value
           Product: WebAppsWG
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: minor
          Priority: P2
         Component: Indexed Database API
        AssignedTo: dave.n...@w3.org
        ReportedBy: jsb...@chromium.org
         QAContact: member-webapi-...@w3.org
                CC: m...@w3.org, public-webapps@w3.org


In section 5.1 Object Store Storage Operation, step 2: when a key generator
is used with store with in line

Re: String to ArrayBuffer

2012-01-11 Thread Kenneth Russell

The StringEncoding proposal is the best path forward because it
provides correct behavior in all cases. Adding String conversions
directly to the typed array spec will introduce dependencies that are
strongly undesirable, and make it much harder to implement the core
spec. Hopefully Josh can provide an update on how the StringEncoding
proposal is going.

-Ken

On Wed, Jan 11, 2012 at 3:05 PM, Charles Pritchard ch...@jumis.com wrote:
 On 1/11/2012 2:49 PM, James Robinson wrote:



 On Wed, Jan 11, 2012 at 2:45 PM, Charles Pritchard ch...@jumis.com wrote:

 Currently, we can asynchronously use BlobBuilder with FileReader to get an
 array buffer from a string.
 We can of course, use code to convert String.fromCharCode into a
 Uint8Array, but it's ugly.

 The StringEncoding proposal seems a bit much for most web use:
 http://wiki.whatwg.org/wiki/StringEncoding

 All we really ever do is work on DOMString, and that's covered by UTF8.


 DOMString is not UTF8 or necessarily unicode.  It's a sequence of 16 bit
 integers and a length.



 To clarify, I'd want ArrayBuffer(DOMString) to work with unicode and throw
 an error if the DOMString is not valid unicode.
 This is consistent with other Web Apps APIs.

 For feature detection, the method should be wrapped in a try-catch block
 anyway.

 -Charles

Re: String to ArrayBuffer


On 1/11/2012 3:12 PM, Kenneth Russell wrote:

The StringEncoding proposal is the best path forward because it
provides correct behavior in all cases. Adding String conversions
directly to the typed array spec will introduce dependencies that are
strongly undesirable, and make it much harder to implement the core
spec. Hopefully Josh can provide an update on how the StringEncoding
proposal is going.


Looking forward to it.
I'm not particularly worried about the dependencies, but, what I 
proposed is likely to do the wrong thing.
I'd want the DOMString processed as a UTF8 string, and at that point, 
we're stepping out of the way that other Web Apps APIs operate.


Is base64 encoding at all appropriate for a StringEncoding type?
Browser implementations of atob are not very good, and it's an extra 
step to run  StringEncoding(atob()).



-Charles

Re: [Bug 15434] New: [IndexedDB] Detail steps for assigning a key to a value

On Wed, Jan 11, 2012 at 3:17 PM, Israel Hilerio isra...@microsoft.com wrote:

We updated Section 3.1.3 with examples to capture the behavior you are
seeing in IE.

Ah, I missed this, looking for normative text. :)

Based on this section, if the attribute doesn’t exist and autogen is set to
true, the attribute is added to the structure and can be used to access the
generated value. The use case for this is to have the system auto-generate a
key value in a well-defined attribute. This allows devs to access their
primary keys from a well-known attribute. This is easier than having to add
the attribute yourself with an empty value before adding the object. This was
agreed on in a previous email thread last year.

I agree with you that we should probably add a section with “steps for
assigning a key to a value using a key path.” However, I would change step
#4 and add #8.5 to reflect the approach described in section 3.1.3 and #9
to reflect that you can’t add attributes to entities which are not
objects. In my mind this is what the new section should look like:

When taking the steps for assigning a key to a value using a key path, the
implementation must run the following algorithm. The algorithm takes a key
path named /keyPath/, a key named /key/, and a value named /value/ which may
be modified by the steps of the algorithm.

1. If /keyPath/ is the empty string, skip the remaining steps and /value/ is
not modified.
2. Let /remainingKeyPath/ be /keyPath/ and /object/ be /value/.
3. If /remainingKeyPath/ has a period in it, assign /remainingKeyPath/ to be
everything after the first period and assign /attribute/ to be everything
before that first period. Otherwise, go to step 7.
4. If /object/ does not have an attribute named /attribute/, then create
the attribute and assign it an empty object. If there is an error creating
the attribute, then skip the remaining steps, /value/ is not modified, and
throw a DOMException of type InvalidStateError.
5. Assign /object/ to be the value of the attribute named /attribute/ on
/object/.
6. Go to step 3.
7. NOTE: The steps leading here ensure that /remainingKeyPath/ is a single
attribute name (i.e. string without periods) by this step.
8. Let /attribute/ be /remainingKeyPath/.
8.5. If /object/ does not have an attribute named /attribute/, then create
the attribute. If there is an error creating the attribute, then skip the
remaining steps, /value/ is not modified, and throw a DOMException of type
InvalidStateError.
9. If /object/ has an attribute named /attribute/ which is not modifiable,
then skip the remaining steps, /value/ is not modified, and throw a
DOMException of type InvalidStateError.
10. Set an attribute named /attribute/ on /object/ with the value /key/.

What do you think?

Overall looks good to me. Obviously needs to be renumbered. Steps 4 and 8.5
talk about first creating an attribute, then later assigning it a
value. In contrast, step 10 phrases it as a single operation (set an
attribute named /attribute/ on /object/ with the value /key/). We should
unify the language; I'm not sure if there's precedent for one step vs. two
step attribute assignment.

Israel

On Wednesday, January 11, 2012 12:42 PM, Joshua Bell wrote:

From: jsb...@google.com [mailto:jsb...@google.com] On Behalf Of Joshua Bell
Sent: Wednesday, January 11, 2012 12:42 PM
To: public-webapps@w3.org
Subject: Re: [Bug 15434] New: [IndexedDB] Detail steps for assigning a
key to a value

On Wed, Jan 11, 2012 at 12:40 PM, Joshua Bell jsb...@chromium.org wrote:

I thought this issue was theoretical when I filed it, but it appears to be
the reason behind the difference in results for IE10 vs. Chrome 17 when
running this test:

http://samples.msdn.microsoft.com/ietestcenter/indexeddb/indexeddb_harness.htm?url=idbobjectstore_add8.htm

If I'm reading the test script right, the IDB implementation is being
asked to assign a key (autogenerated, so a number, say 1) using the key
path test.obj.key to a value { property: data }

The Chromium/WebKit implementation follows the steps I outlined below.
Namely, at step 4 the algorithm would abort when the value is found to not
have a test attribute.

To be clear, in Chromium the *algorithm* aborts, leaving the value
unchanged. The request and transaction carry on just fine.

Thoughts?

On Thu, Jan 5, 2012 at 1:44 PM, bugzi...@jessica.w3.org wrote:

Re: String to ArrayBuffer

On Wed, Jan 11, 2012 at 3:12 PM, Kenneth Russell k...@google.com wrote:

 The StringEncoding proposal is the best path forward because it
 provides correct behavior in all cases. Adding String conversions
 directly to the typed array spec will introduce dependencies that are
 strongly undesirable, and make it much harder to implement the core
 spec. Hopefully Josh can provide an update on how the StringEncoding
 proposal is going.

 -Ken


Thanks for the cue, Ken. :)

As background for folks on public-webapps, the StringEncoding proposal
linked to by Charles grew out of discussions similar to this one on the
public_we...@khronos.org list. The most recent thread can be found at
http://www.khronos.org/webgl/public-mailing-list/archives//msg00017.html


If you read that thread it should be clear why the proposal is as heavy
as it is (although, being mired in IndexedDB lately, it looks so tiny).
Dealing with text encoding is also never as trivial or easy as it seems.

As far as current status: I haven't done much work on the proposal in the
last month or so, but plan to pick that up again soon, and it should be
shopped around for the appropriate WG (public-webapps or otherwise) for
feedback, gauging implementer interest, etc. Anne's work over on whatwg
around encoding detection and BOM handling in browsers is valuable so I've
been watching that closely, although this is a new API and callers will
have access to the raw bits so we don't have to spec the kitchen sink or
match legacy behavior. There are a few open issues called out in the
proposal, perhaps most notably the default handling of invalid data.



 On Wed, Jan 11, 2012 at 3:05 PM, Charles Pritchard ch...@jumis.com
 wrote:
  On 1/11/2012 2:49 PM, James Robinson wrote:
 
 
 
  On Wed, Jan 11, 2012 at 2:45 PM, Charles Pritchard ch...@jumis.com
 wrote:
 
  Currently, we can asynchronously use BlobBuilder with FileReader to get
 an
  array buffer from a string.
  We can of course, use code to convert String.fromCharCode into a
  Uint8Array, but it's ugly.
 
  The StringEncoding proposal seems a bit much for most web use:
  http://wiki.whatwg.org/wiki/StringEncoding
 
  All we really ever do is work on DOMString, and that's covered by UTF8.
 
 
  DOMString is not UTF8 or necessarily unicode.  It's a sequence of 16 bit
  integers and a length.
 
 
 
  To clarify, I'd want ArrayBuffer(DOMString) to work with unicode and
 throw
  an error if the DOMString is not valid unicode.
  This is consistent with other Web Apps APIs.
 
  For feature detection, the method should be wrapped in a try-catch block
  anyway.
 
  -Charles

Re: Pressing Enter in contenteditable: p or br or div?

On Wed, Jan 11, 2012 at 12:09 PM, Aryeh Gregor a...@aryeh.name wrote:

 On Wed, Jan 11, 2012 at 12:38 PM, Ryosuke Niwa rn...@webkit.org wrote:
  That sounds like a great idea.
 
  . . .
 
  I'm not sure if we should add just editoptions though given we might need
  to add more elaborate options in the future. It might make more sense to
  add a new attribute per option as in:
 
  <div contentEditable paragraphSeparator=p tabIndentation>

 Ojan suggested in the other thread that we instead allow calling
 execCommand() on Element, and have the result restricted to that
 Element.  That solves the global-flags problem too, and doesn't
 require new attributes.  So you'd do

  div.execCommand("tabindent", false, true);

 or whatever.  Someone could still call
 document.execCommand("tabindent", false, false), but that would be
 overridden if it was called on the editing host.  I filed a bug on it:

 https://www.w3.org/Bugs/Public/show_bug.cgi?id=15522

 Does that sound good too?


That sounds workable. Presumably it's only available on the editing host
(as opposed to any element or any element with contenteditable content
attribute).

 Should enter behave like shift+enter when br is the default
  paragraph separator?

 Default paragraph separators are used in a couple of other places too,
 so it would be a little more work than that.  But I just looked, and
 it wouldn't be as bad as I thought.  So this is doable if people have
 any good use-cases.


Great.

- Ryosuke

Re: String to ArrayBuffer

2012-01-11 Thread Boris Zbarsky


On 1/11/12 6:03 PM, Charles Pritchard wrote:

Is there any instance in practice where DOMString as exposed to the
scripting environment is not implemented as a unicode string?


I don't know what you mean by that.

The point is, it's trivial to construct JS strings that contain 
arbitrary sequences of 16-bit units (using fromCharCode or \u escapes). 
 Nothing anywhere in JS or the DOM per se enforces that strings are 
valid UTF-16 (which is the way that an actual Unicode string would be 
encoded as a JS string).
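
A short demonstration of the point, with a hypothetical helper isWellFormedUtf16 that checks surrogate pairing. Both ways of building an ill-formed string mentioned above are shown.

```javascript
// Returns true only if every surrogate code unit in str is correctly paired.
function isWellFormedUtf16(str) {
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      const next = str.charCodeAt(i + 1); // NaN when past the end
      if (!(next >= 0xdc00 && next <= 0xdfff)) return false;
      i++; // skip the low surrogate that completes the pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      return false; // low surrogate with no preceding high surrogate
    }
  }
  return true;
}

// Perfectly legal JS strings that are not valid UTF-16:
const loneViaFromCharCode = String.fromCharCode(0xd800);
const loneViaEscape = "\udc00abc";

// A valid surrogate pair (U+1F600) for comparison:
const validPair = "\ud83d\ude00";
```

Nothing complains when the first two strings are created; the problem only shows up when some API tries to interpret them as Unicode text.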



I realize that internally, DOMString may be implemented as a 16 bit
integer + length;


Not just internally.  The JS spec and the DOM spec both explicitly say 
that this is what strings are: an array of 16-bit integers.



Browsers do the same thing with WindowBase64, though it's specified as
DOMString, in practice (as the notes say), it's unicode.
http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#atob


If you look at the actual processing model, you take the input array of 
16-bit integers, throw if any is not in the set { 0x2B, 0x2F, 0x3D } union
[0x30,0x39] union [0x41,0x5A] union [0x61,0x7A] and then treat the rest as ASCII
data (which at that point it is).


It defines this in terms of Unicode but that's just because any JS 
string that satisfies the above constraints can be considered a 
Unicode string if one wishes.
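
A rough sketch of that kind of code-unit check, assuming the usual base64
alphabet (A-Z, a-z, 0-9, "+", "/" and "=" padding). The function name is
hypothetical, and the sketch deliberately ignores the length, padding-position
and whitespace rules that the real atob() processing model also has.

```javascript
// Returns true if every 16-bit unit of str is in the assumed base64 alphabet.
// It works on code units only, so no Unicode interpretation is involved.
function hasOnlyBase64Units(str) {
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    var allowed =
      unit === 0x2B || unit === 0x2F || unit === 0x3D || // "+", "/", "="
      (unit >= 0x30 && unit <= 0x39) ||                  // "0".."9"
      (unit >= 0x41 && unit <= 0x5A) ||                  // "A".."Z"
      (unit >= 0x61 && unit <= 0x7A);                    // "a".."z"
    if (!allowed) return false;
  }
  return true;
}
```

Any string that passes is pure ASCII, which is why it can also be read as a
Unicode string.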



Web Storage, also, only works with unicode.


I'm not familiar with the relevant part of Web Storage.  Can you cite 
the relevant part please?


-Boris

Re: [editing] tab in an editable area WAS: [whatwg] behavior when typing in contentEditable elements

2012-01-11 Thread Ojan Vafai

On Wed, Jan 11, 2012 at 8:15 AM, Aryeh Gregor a...@aryeh.name wrote:

 On Tue, Jan 10, 2012 at 4:48 PM, Charles Pritchard ch...@jumis.com
 wrote:
  Historically, one of my biggest frustrations with contentEditable is that
  you have to take it all or none. The lack of configurability is frustrating
  as a developer. Maybe the solution is to come up with a lower level set of
  editing primitives in place of contentEditable instead of trying to extend
  it though.

 Yes, that's definitely something we need to do.  There are algorithms
 I've defined that would probably be really useful to web authors, like
 "wrap a list of nodes" or some version of "set the value of the
 selection" (= inline formatting algorithm).  I've been holding off on
 exposing these to authors because I don't know if these algorithms are
 correct yet, and I don't want implementers jumping the gun and
 exposing them before using them internally so they're well-tested.  I
 expect they'll need to be refactored a bunch once implementers try
 actually reimplementing their editing commands in terms of them, and
 don't want to break them for authors when that happens.


Yup. Makes sense. I agree that with editing we're not at a point where it's
at all clear what a good lower-level API would be.

RE: [Bug 15434] New: [IndexedDB] Detail steps for assigning a key to a value

2012-01-11 Thread Israel Hilerio

Great! I will work with Eliot to unify the language and update the spec.

Israel

On Wednesday, January 11, 2012 3:45 PM, Joshua Bell wrote:
On Wed, Jan 11, 2012 at 3:17 PM, Israel Hilerio
isra...@microsoft.com wrote:
We updated Section 3.1.3 with examples to capture the behavior you are seeing
in IE.

Ah, I missed this, looking for normative text. :)

Based on this section, if the attribute doesn't exist and autogen is set to
true, the attribute is added to the structure and can be used to access the
generated value. The use case for this is to let the system auto-generate a
key value in a well-defined attribute. This allows devs to access their
primary keys from a well-known attribute. This is easier than having to add
the attribute yourself with an empty value before adding the object. This was
agreed on in a previous email thread last year.

I agree with you that we should probably add a section with steps for
assigning a key to a value using a key path. However, I would change step #4
and add #8.5 to reflect the approach described in section 3.1.3 and #9 to
reflect that you can't add attributes to entities which are not objects. In my
mind this is what the new section should look like:

When taking the steps for assigning a key to a value using a key path, the
implementation must run the following algorithm. The algorithm takes a key path
named /keyPath/, a key named /key/, and a value named /value/ which may be
modified by the steps of the algorithm.

1. If /keyPath/ is the empty string, skip the remaining steps and /value/ is
not modified.
2. Let /remainingKeyPath/ be /keyPath/ and /object/ be /value/.
3. If /remainingKeyPath/ has a period in it, assign /remainingKeyPath/ to be
everything after the first period and assign /attribute/ to be everything
before that first period. Otherwise, go to step 7.
4. If /object/ does not have an attribute named /attribute/, then create the
attribute and assign it an empty object. If there is an error creating the
attribute, then skip the remaining steps, /value/ is not modified, and throw a
DOMException of type InvalidStateError.
5. Assign /object/ to be the value of the attribute named /attribute/ on
/object/.
6. Go to step 3.
7. NOTE: The steps leading here ensure that /remainingKeyPath/ is a single
attribute name (i.e. string without periods) by this step.
8. Let /attribute/ be /remainingKeyPath/
8.5. If /object/ does not have an attribute named /attribute/, then create the
attribute. If there is an error creating the attribute, then skip the remaining
steps, /value/ is not modified, and throw a DOMException of type
InvalidStateError.
9. If /object/ has an attribute named /attribute/ which is not modifiable, then
skip the remaining steps, /value/ is not modified, and throw a DOMException of
type InvalidStateError.
10. Set an attribute named /attribute/ on /object/ with the value /key/.
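
As a rough illustration, the steps above could be sketched in script like
this. All function names are hypothetical, a plain Error named
"InvalidStateError" stands in for the DOMException, and "attribute" is taken
to mean an own property. The sketch does not undo intermediate objects it has
already created when a later step fails; a real implementation would work on
a clone.

```javascript
// Hypothetical stand-in for a DOMException of type InvalidStateError.
function invalidStateError(message) {
  var error = new Error(message);
  error.name = "InvalidStateError";
  return error;
}

// Hypothetical helper: sets object[name] = val, or throws if that is impossible.
function setAttributeOrThrow(object, name, val) {
  "use strict"; // strict mode makes writes to frozen or read-only targets throw
  if (object === null || typeof object !== "object") {
    throw invalidStateError("cannot add attribute '" + name + "' to a non-object");
  }
  try {
    object[name] = val;
  } catch (e) {
    throw invalidStateError("attribute '" + name + "' is not modifiable");
  }
}

// Walks keyPath, creating missing intermediate objects, then stores key.
function assignKeyCreatingPath(value, keyPath, key) {
  if (keyPath === "") return;                        // step 1
  var names = keyPath.split(".");
  var object = value;                                // step 2
  for (var i = 0; i < names.length - 1; i++) {       // steps 3 to 6
    var attribute = names[i];
    if (object === null || typeof object !== "object" ||
        !Object.prototype.hasOwnProperty.call(object, attribute)) {
      setAttributeOrThrow(object, attribute, {});    // step 4
    }
    object = object[attribute];                      // step 5
  }
  // steps 8 to 10: create or overwrite the final attribute
  setAttributeOrThrow(object, names[names.length - 1], key);
}

var sample = { property: "data" };
assignKeyCreatingPath(sample, "test.obj.key", 1);
// sample is now { property: "data", test: { obj: { key: 1 } } }
```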

What do you think?

Overall looks good to me. Obviously needs to be renumbered. Steps 4 and 8.5
talk about first creating an attribute, then later assigning it a value.
In contrast, step 10 phrases it as a single operation (set an attribute named
/attribute/ on /object/ with the value /key/). We should unify the language;
I'm not sure if there's precedent for one step vs. two step attribute
assignment.

Israel
On Wednesday, January 11, 2012 12:42 PM, Joshua Bell wrote:
On Wed, Jan 11, 2012 at 12:40 PM, Joshua Bell
jsb...@chromium.org wrote:
I thought this issue was theoretical when I filed it, but it appears to be the
reason behind the difference in results for IE10 vs. Chrome 17 when running
this test:

http://samples.msdn.microsoft.com/ietestcenter/indexeddb/indexeddb_harness.htm?url=idbobjectstore_add8.htm

If I'm reading the test script right, the IDB implementation is being asked to
assign a key (autogenerated, so a number, say 1) using the key path
test.obj.key to a value { property: data }

The Chromium/WebKit implementation follows the steps I outlined below. Namely,
at step 4 the algorithm would abort when the value is found to not have a
test attribute.

To be clear, in Chromium the *algorithm* aborts, leaving the value unchanged.
The request and transaction carry on just fine.

If IE10 is passing, then it must be synthesizing new JS objects as it walks the
key path, until it gets to the final step in the path, yielding something like
{ property: data, test: { obj: { key: 1 } } }

Thoughts?

On Thu, Jan 5, 2012 at 1:44 PM,
bugzi...@jessica.w3.org wrote:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15434

Re: Pressing Enter in contenteditable: p or br or div?

On Wed, Jan 11, 2012 at 7:39 AM, Aryeh Gregor a...@aryeh.name wrote:

 Okay, so what API should we use?  I'd really prefer this be
 per-editing host.  In which case, how about we make it a content
 attribute on the editing host?


That sounds like a great idea.


 It can be a DOMSettableTokenList. Maybe something like

  <div editoptions=tab-indent>

 where the attribute is a whitespace-separated list of tokens.  To
 start with, we can maybe have tab-indent (hitting Tab indents) and
 div-separator (hitting Enter produces div).  Does this sound like a
 good approach?  If so, what should we call the attribute?  And should
 it imply contenteditable=true, or should the author have to specify
 that separately?
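
For illustration only, a script-side reading of such a whitespace-separated
token list might look like the sketch below. The editoptions attribute is just
the proposal from this message, and the function name and result shape are
hypothetical.

```javascript
// Parses a hypothetical editoptions value such as "tab-indent div-separator".
// Unknown tokens are ignored so that new options could be added later.
function parseEditOptions(attributeValue) {
  var options = { tabIndent: false, divSeparator: false };
  var tokens = attributeValue.split(/[ \t\n\f\r]+/); // HTML space characters
  for (var i = 0; i < tokens.length; i++) {
    if (tokens[i] === "tab-indent") options.tabIndent = true;
    if (tokens[i] === "div-separator") options.divSeparator = true;
  }
  return options;
}

var parsed = parseEditOptions("  tab-indent   div-separator ");
```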


I'm not sure if we should add just editoptions though, given we might need
to add more elaborate options in the future. It might make more sense to
add a new attribute per option, as in:

<div contentEditable paragraphSeparator=p tabIndentation>

Also: are there any good use-cases for br?  Allowing div instead
 of p adds basically no extra complexity, but allowing br would
 make things significantly more complicated.


Should enter behave like shift+enter when br is the default
paragraph separator?

- Ryosuke

Re: [Bug 15434] New: [IndexedDB] Detail steps for assigning a key to a value

I thought this issue was theoretical when I filed it, but it appears to be
the reason behind the difference in results for IE10 vs. Chrome 17 when
running this test:

http://samples.msdn.microsoft.com/ietestcenter/indexeddb/indexeddb_harness.htm?url=idbobjectstore_add8.htm

If I'm reading the test script right, the IDB implementation is being asked
to assign a key (autogenerated, so a number, say 1) using the key path
test.obj.key to a value { property: data }

The Chromium/WebKit implementation follows the steps I outlined below.
Namely, at step 4 the algorithm would abort when the value is found to not
have a test attribute. If IE10 is passing, then it must be synthesizing
new JS objects as it walks the key path, until it gets to the final step in
the path, yielding something like { property: data, test: { obj: { key: 1
} } }

Thoughts?

On Thu, Jan 5, 2012 at 1:44 PM, bugzi...@jessica.w3.org wrote:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=15434

Cribbing from the spec, this could read as:

4.X Steps for assigning a key to a value using a key path

When taking the steps for assigning a key to a value using a key path, the
implementation must run the following algorithm. The algorithm takes a key path
named /keyPath/, a key named /key/, and a value named /value/ which may be
modified by the steps of the algorithm.

1. If /keyPath/ is the empty string, skip the remaining steps and /value/ is
not modified.
2. Let /remainingKeyPath/ be /keyPath/ and /object/ be /value/.
3. If /remainingKeyPath/ has a period in it, assign /remainingKeyPath/ to be
everything after the first period and assign /attribute/ to be everything
before that first period. Otherwise, go to step 7.
4. If /object/ does not have an attribute named /attribute/, then skip the rest
of these steps and /value/ is not modified.
5. Assign /object/ to be the value of the attribute named /attribute/ on
/object/.
6. Go to step 3.
7. NOTE: The steps leading here ensure that /remainingKeyPath/ is a single
attribute name (i.e. string without periods) by this step.
8. Let /attribute/ be /remainingKeyPath/.
9. If /object/ has an attribute named /attribute/ which is not modifiable, then
skip the remaining steps and /value/ is not modified.
10. Set an attribute named /attribute/ on /object/ with the value /key/.
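
A minimal script sketch of the steps above, for illustration. The function
name and the boolean return value are assumptions added here (the steps
themselves return nothing), and "attribute" is taken to mean an own property.

```javascript
// Walks keyPath over existing attributes only; never creates intermediates.
// Returns true if key was stored, false if the steps were abandoned.
function assignKeyAlongExistingPath(value, keyPath, key) {
  "use strict"; // strict mode makes writes to read-only attributes throw
  if (keyPath === "") return false;                  // step 1
  var names = keyPath.split(".");
  var object = value;                                // step 2
  for (var i = 0; i < names.length - 1; i++) {       // steps 3 to 6
    if (object === null || typeof object !== "object" ||
        !Object.prototype.hasOwnProperty.call(object, names[i])) {
      return false;                                  // step 4: value unchanged
    }
    object = object[names[i]];                       // step 5
  }
  if (object === null || typeof object !== "object") return false;
  try {
    object[names[names.length - 1]] = key;           // step 10
  } catch (e) {
    return false;                                    // step 9: not modifiable
  }
  return true;
}

var missingPath = { property: "data" };
var stored = assignKeyAlongExistingPath(missingPath, "test.obj.key", 1);
// stored is false and missingPath is unchanged, matching the Chromium result
```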

Notes:

The above talks in terms of a mutable value. It could be amended to have an
initial step which produces a clone of the value, which is later returned, but
given how this algorithm is used the difference is not observable, since the
value stored should already be a clone that doesn't have any other references.

Step 9 is present in case the key path refers to a special property, e.g. a
String/Array length, Blob/File properties, etc.


Re: [File API]: Determining encoding

2012-01-11 Thread Glenn Maynard

You may want to coordinate with Anne regarding charset support requirements
and his in-progress encodings spec.
On Jan 11, 2012 1:58 PM, Arun Ranganathan aranganat...@mozilla.com
wrote:

 Glenn,

 Sorry about letting this one get by unanswered -- I was OOTO at the time
 you sent it.


  Questions and thoughts while reading
  http://dev.w3.org/2006/webapi/FileAPI/#enctype:

  is this spec actually
  requiring that every registered encoding be supported?

 What's required is that UAs support as many of the encodings in
 [IANACHARSET] as possible -- I think that's fair.  I've rewritten the
 algorithm to allow for what's not supported to be treated as UTF-8.

 Upon reflection, it might be prudent to decide a minimum subset of
 supported encodings, but I'm also comfortable leaving this to
 implementations and not saying anything about it.  What do you think?

  It would be clearer if steps 1 and 2 used the same terminology for an
  invalid character set.

 <snip />

 I really liked your version -- much clearer than the original text -- and
 so I've rewritten the editor's draft to reflect the change.  Many thanks :)

 http://dev.w3.org/2006/webapi/FileAPI/#encoding-determination

 -- A*

  When reading blob objects using the readAsText() read method, the
 following encoding determination steps MUST be followed:
 
  1. Let charset be null.
  2. If the encoding parameter is specified, and is the name or alias of a
 character set used on the Internet [IANACHARSET], let charset be encoding
 parameter.
  3. If charset is null, and the blob's type attribute is present, and its
 Charset Parameter [RFC2046] is the name or alias of a character set used on
 the Internet, let charset be its Charset Parameter.
  4. If charset is null, then for each of the rows in the following table,
 starting with the first one and going down, if the first bytes of blob
 match the bytes given in the first column, then let charset be the encoding
 given in the cell in the second column of that row.  [table]
  5. If charset is null, let charset be UTF-8.
  6. Return the result of decoding ...
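
Steps 1 to 5 above could be sketched roughly as follows. The function and
parameter names are hypothetical. The byte table and the charset lookup are
passed in because the table is elided in the message and the registry lookup
is left to the implementation. The sample rows use well-known byte order marks
and may differ from the draft's table.

```javascript
// Returns the charset name chosen by steps 1 to 5; decoding (step 6) is omitted.
function determineCharset(encodingParam, typeCharset, firstBytes, table, isKnownCharset) {
  var charset = null;                                              // step 1
  if (encodingParam != null && isKnownCharset(encodingParam)) {
    charset = encodingParam;                                       // step 2
  }
  if (charset === null && typeCharset != null && isKnownCharset(typeCharset)) {
    charset = typeCharset;                                         // step 3
  }
  for (var r = 0; charset === null && r < table.length; r++) {     // step 4
    var prefix = table[r].bytes;
    var matches = firstBytes.length >= prefix.length;
    for (var i = 0; matches && i < prefix.length; i++) {
      if (firstBytes[i] !== prefix[i]) matches = false;
    }
    if (matches) charset = table[r].encoding;
  }
  return charset === null ? "UTF-8" : charset;                     // step 5
}

// Illustrative stand-ins, not taken from the draft.
var sampleTable = [
  { bytes: [0xFE, 0xFF], encoding: "UTF-16BE" },
  { bytes: [0xFF, 0xFE], encoding: "UTF-16LE" },
  { bytes: [0xEF, 0xBB, 0xBF], encoding: "UTF-8" }
];
function sampleIsKnownCharset(name) {
  var known = ["utf-8", "utf-16be", "utf-16le", "iso-8859-1"];
  return known.indexOf(String(name).toLowerCase()) !== -1;
}

var chosen = determineCharset("bogus", null, [0xFF, 0xFE, 0x41, 0x00],
                              sampleTable, sampleIsKnownCharset);
```

Here the unrecognised encoding parameter is skipped and the leading bytes
select the charset, which is the fallback order the steps describe.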

 [IANACHARSET] http://www.iana.org/assignments/character-sets

 --
 Glenn Maynard

Re: String to ArrayBuffer


On 1/11/2012 4:22 PM, Boris Zbarsky wrote:

On 1/11/12 6:03 PM, Charles Pritchard wrote:

Is there any instance in practice where DOMString as exposed to the
scripting environment is not implemented as a unicode string?


I don't know what you mean by that.

The point is, it's trivial to construct JS strings that contain 
arbitrary sequences of 16-bit units (using fromCharCode or \u 
escapes).  Nothing anywhere in JS or the DOM per se enforces that 
strings are valid UTF-16 (which is the way that an actual Unicode 
string would be encoded as a JS string).



My [wrong] understanding was that DOMString referred to valid unicode.

WebIDL:
The DOMString type corresponds to the set of all possible sequences of 
16 bit unsigned integer code units. Such sequences are commonly 
interpreted as UTF-16 encoded strings [RFC2781] although this is not 
required... Nothing in this specification requires a DOMString value to 
be a valid UTF-16 string.

http://www.w3.org/TR/WebIDL/#idl-DOMString
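
Since nothing enforces validity, a script that needs real UTF-16 has to check
for itself. A minimal sketch of such a check follows; the function name is
hypothetical and not part of any of the specifications quoted here.

```javascript
// Returns true if str contains no unpaired surrogate code units.
function isWellFormedUTF16(str) {
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // High surrogate: must be followed by a low surrogate.
      var next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next < 0xDC00 || next > 0xDFFF) return false;
      i++; // skip the low surrogate that completes the pair
    } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
      return false; // low surrogate with no preceding high surrogate
    }
  }
  return true;
}
```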

DOM3:
The DOMString type is used to store [Unicode] characters as a sequence 
of 16-bit units using UTF-16 as defined in [Unicode] and Amendment 1 of 
[ISO/IEC 10646]. There are some normalization notes, but otherwise, 
it's close enough to saying it stores Unicode, but it can handle all 
16bit combinations.

http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-C74D1578

For historic reasons WindowBase64 throws an error if input is not 
within Unicode range.

http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#atob



I realize that internally, DOMString may be implemented as a 16 bit
integer + length;


Not just internally.  The JS spec and the DOM spec both explicitly say 
that this is what strings are: an array of 16-bit integers.


WebIDL and DOM define DOMString, of course. JS defines The String 
Type in 8.4. They are intended to be the same.

http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf

The String type is the set of all finite ordered sequences of zero or
more 16-bit unsigned integer values ... When a String contains actual
textual data, each element is considered to be a single UTF-16 code 
unit.  Whether or not this is the actual storage format of a String, the 
characters within a String are numbered by their initial code unit 
element position as though they were represented using UTF-16.



Browsers do the same thing with WindowBase64, though it's specified as
DOMString, in practice (as the notes say), it's unicode.
http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#atob 



If you look at the actual processing model, you take the input array 
of 16-bit integers, throw if any is not in the set { 0x2B, 0x2F, 0x3D }
union [0x30,0x39] union [0x41,0x5A] union [0x61,0x7A] and then treat the rest as ASCII
data (which at that point it is).


It defines this in terms of Unicode but that's just because any JS 
string that satisfies the above constraints can be considered a 
Unicode string if one wishes.



Web Storage, also, only works with unicode.


I'm not familiar with the relevant part of Web Storage.  Can you cite 
the relevant part please?


The character code conversion gets weird. If you'd explain this in the 
proper terms, I'd appreciate it.


Load a binary resource via the old charset hack.

Save the resulting string into localStorage. There are some conversion 
issues. I am not using the right vocabulary.
I know the list has seen the issue before, and I'll bet someone here can 
explain it succinctly.


Example:
// Image files are easiest to try this with.
// https://developer.mozilla.org/En/XMLHttpRequest/Using_XMLHttpRequest#Receiving_binary_data_in_older_browsers
// From the article:
function load_binary_resource(url) {
  var req = new XMLHttpRequest();
  // Synchronous request: blocks until the response has arrived.
  req.open('GET', url, false);
  // XHR binary charset opt by Marcus Granado 2006 [http://mgran.blogspot.com]
  // Read the response as text, with each byte kept in its own code unit.
  req.overrideMimeType('text/plain; charset=x-user-defined');
  req.send(null);
  if (req.status != 200) return '';
  return req.responseText;
}
var x = load_binary_resource('imageurl.png');
localStorage.fail = x;
localStorage.fail == x; // will return false.
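
For reference, a string loaded with the x-user-defined hack is usually turned
back into bytes by keeping only the low 8 bits of each code unit, as the
article linked above describes. A minimal sketch, with a hypothetical function
name:

```javascript
// Recovers byte values from text that was decoded as x-user-defined.
// Bytes 0x80 to 0xFF arrive as U+F780 to U+F7FF, so masking restores them.
function bytesFromUserDefinedText(text) {
  var bytes = new Array(text.length);
  for (var i = 0; i < text.length; i++) {
    bytes[i] = text.charCodeAt(i) & 0xFF; // discard the high-order byte
  }
  return bytes;
}

// "\uF789PNG" is how the first four bytes of a PNG file (0x89 "PNG") appear.
var pngSignatureStart = bytesFromUserDefinedText("\uF789PNG");
```

This only shows the byte mapping; it does not explain or work around the
localStorage mismatch described above.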

Re: String to ArrayBuffer