Re: [whatwg] Make quoted attributes a conformance criteria
On Sat, 25 Jul 2009 12:08:23 +0200, Eduard Pascual herenva...@gmail.com wrote:

On Fri, Jul 24, 2009 at 9:52 PM, Keryx Web webmas...@keryx.se wrote:

On 2009-07-23 20:32, Eduard Pascual wrote: While I don't consider a hard requirement appropriate, there is an audience sector this discussion seems to be ignoring: authoring tools' developers. IMO, it would be highly desirable to have some guidelines for these tools to determine when they *should* quote attribute values.

There is one further rub. Code that initially has been made by authoring tools has a tendency to wind up in some front-end developer's lap, to be amended and/or fixed manually at a later stage. That is even more of a reason for a strong recommendation about quotes. Furthermore, I doubt that most people on this list read the blog post I included as a URL when starting this discussion.[1]

I can't speak for others, but I did read your post. And still I am convinced that a hard requirement to quote all values is not the best solution. There are some values that MUST be quoted, some that SHOULD be quoted, and even some that SHOULD NOT be quoted. Those that must be quoted are already covered by the spec, and validators will yield the relevant error message when encountering such values unquoted. For those values that *should* be quoted (those that improve in readability when quoted, or those that could lead to errors when later changed if unquoted), a warning from the validator should be enough. Finally, there are some values that are better unquoted, such as attributes that can only take a number (there is no risk of errors, and the quotes would normally hurt readability more than they help it).

Actually, boolean attributes are allowed to take the empty string as a value, and it's not allowed to leave the empty string unquoted. 
However, boolean attributes aren't subject to the template/backend mismatch, because you have to omit the attribute altogether for the false value, so the backend developer has to be responsible for writing out the whole attribute. Therefore, it should be safe to leave boolean attributes unquoted. It should be safe to leave any attribute in the minimized form, because it also isn't subject to the template/backend mismatch.

Non-boolean attributes that are safe to leave unquoted:

link hreflang
meta http-equiv
meta charset
script charset
article pubdate
ol start
li value
a hreflang
time datetime
progress value max
meter value min low high max optimum
bdo dir
ins datetime
del datetime
img width height
iframe width height
embed width height
object width height
video width height
canvas width height
area shape hreflang
colgroup span
col span
td colspan rowspan
th colspan rowspan scope
form autocomplete enctype method
input autocomplete formenctype formmethod height max maxlength min size step type width
button formenctype formmethod
select size
textarea cols maxlength rows wrap
keygen keytype
command type
bb type

Global non-boolean attributes that are safe to leave unquoted:

dir
tabindex

All other attributes can take one of the special characters [\s'] in the value, or the empty string as the value.

-- Simon Pieters, Opera Software
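Simon's "safe to leave unquoted" test can be approximated mechanically: per HTML5's syntax rules, an unquoted attribute value must be non-empty and must not contain whitespace, quote marks, `=`, `<`, `>`, or a backtick. A minimal sketch (the function name is mine, not from any spec or tool):

```javascript
// Returns true if `value` may legally appear unquoted per the
// HTML5 unquoted-attribute-value syntax: non-empty, and free of
// whitespace, quotes, "=", "<", ">", and "`".
function safeUnquoted(value) {
  return value.length > 0 && !/[\s"'=<>`]/.test(value);
}

// Numeric and keyword values such as colspan=2 or dir=ltr pass...
console.log(safeUnquoted("2"));         // true
console.log(safeUnquoted("ltr"));       // true
// ...while values containing spaces, or the empty string used by
// boolean attributes, must be quoted.
console.log(safeUnquoted("John 4:24")); // false
console.log(safeUnquoted(""));          // false
```

An authoring tool could use such a check as the trigger for the "should quote" warning discussed above, quoting everything else unconditionally.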
Re: [whatwg] Make quoted attributes a conformance criterion
On Mon, Jul 27, 2009 at 2:53 AM, Jonas Sicking jo...@sicking.cc wrote: The more I think about it, the more I'm intrigued by Rob Sayre's idea of completely removing the definition of what is conforming. Let the spec define UA (or HTML consumer) behavior, and let lint tools fight out best practices for authoring.

Besides the point Maciej already made, there is another aspect in favor of good conformance definitions: web evolution. Some of the issues, like attribute quoting, may be stylistic, but there are many where there is a clear boundary between what's right and what's wrong. For example, font is clearly wrong; but there are too many legacy webpages that use it, so browsers need to support it to render all that content. If we leave conformance out of the spec, and only define what browsers are supposed to do, we'd be bringing font back to the web, even for new websites, and this would be clearly wrong (we are not speaking of assistive technologies only; many pages that rely on font end up unreadable even in common browsers). Someone could argue that this is just a matter of best practice or style, and hence could be handled by lint tools; but conformance criteria in the specification carry a lot more weight than any lint tool. While it may be OK to leave more arguable aspects to these tools, things that are obviously wrong should be clearly defined as non-conformant by the spec. Just my two cents. Regards, Eduard Pascual
[whatwg] Fwd: Make quoted attributes a conformance criterion
Maciej wrote: So, in conclusion, having a baseline for correct syntax may actually make it easier to develop an ecosystem of style-checking tools. However, this makes it important to keep the core set of syntax errors relatively minimal. I'm not sure HTML5 as currently drafted entirely hits that balance, but mandating optional tags or requiring double quotes on attributes would be a move in the wrong direction. I concur. And I say that as someone who likes the XHTML-like syntax (always closing tags, always quoting attributes, etc.). I don't think my personal preference for writing markup should be enforced in the spec; it should be enforced in the lint tools. The comparison with JavaScript is a good one, I think: JavaScript, C and C++ are examples of languages where conforming syntax is strictly defined, yet tools are available that do additional static analysis for both style and correctness. Jeremy -- Jeremy Keith a d a c t i o http://adactio.com/
Re: [whatwg] A Selector-based metadata proposal (was: Annotating structured data that HTML has no semantics for)
I have put a new version of the CRDF document up [1]. Here is a summary of the most significant changes:

* Location: with the migration from Google Pages to Google Sites, the PDF document can no longer be hosted at its former location. I wanted to keep this proposal independent from my own website, but needing a reliable location for the document, I have made room for it on my site's host. To avoid having to keep track of two online copies of the document, or leaving an outdated version online, I have removed the document from the old location. My apologies for any inconvenience this might cause.

* Inline content: full sheets are now accepted inside the inline crdf attribute, whatever it gets called; so something like <div crdf='@namespace foo "http://example.com/foo#"; '></div> should be doable, mimicking the ability of RDFa in XML-based languages to declare namespaces inline with code like <div xmlns:foo="http://example.com/foo#"></div>. In addition, a pseudo-algorithm has been defined that determines whether the content of the attribute is a full sheet or just a set of declarations.

* Inline vs. linked metadata: this brief new section attempts to explain when each approach is more suitable, and why both need to be supported by CRDF.

* Conformance requirements: this new section describes what a document must do to be conformant, and what tools would have to do to be conformant. It should be taken as an informative summary rather than as a normative definition (especially the part about tools), and is mostly intended to give a glimpse of what should be expected from a hypothetical CRDF-aware browser.

* Microformats compatibility: after some research and lots of trial and error, it has been found that it is not possible to match the microformats concept of singular properties with CSS3 Selectors. The document now suggests an extension (just a pseudo-class named :singular) to handle this. 
This is a very new addition and feedback on it would be highly valuable. [1] http://crdf.dragon-tech.org/crdf.pdf Regards, Eduard Pascual
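The announcement mentions a pseudo-algorithm for telling a full sheet apart from a bare set of declarations but does not reproduce it. One plausible heuristic (my own sketch, not necessarily what the CRDF document specifies) is to treat the attribute content as a full sheet whenever it opens with an at-rule or contains a selector block, and as a declaration set scoped to the carrying element otherwise:

```javascript
// Heuristic sketch: the attribute content is a full CRDF sheet
// if it begins with an at-rule (e.g. "@namespace") or contains a
// "{" opening a selector block; otherwise it is a bare set of
// declarations applying to the element that carries the attribute.
function isFullSheet(content) {
  const trimmed = content.trim();
  return trimmed.startsWith("@") || trimmed.includes("{");
}

console.log(isFullSheet("@namespace foo 'http://example.com/foo#';")); // true
console.log(isFullSheet("div { foo: 'bar'; }"));                       // true
console.log(isFullSheet("foo: 'bar';"));                               // false
```

This mirrors how CSS itself distinguishes a stylesheet from the declaration-list grammar used inside a style attribute.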
[whatwg] New HTML5 spec GIT collaboration repository
There is now a public GIT repository containing the build process for producing the HTML5 specification. The build process and source for generating the HTML5+RDFa specification are also included. More details here: http://github.com/html5/spec/

For instructions on how one might use the repository to contribute changes, see the README: http://github.com/html5/spec/blob/e84bd4bd252ba7ec69cd9ef877eee78d3e90e2e4/README

A couple of quick bullet-points:

* Any member of WHAT WG or HTML WG who has agreed to the W3C Patent and Licensing Policy may have commit rights to the repository.
* If you would like to collaborate on tools, test cases, examples or specification text, get a github account (free), join the HTML WG (free) and contact me.
* There are 3 suggestions on etiquette for contributors; please read them. In short - don't stomp on anyone's work without their express permission.

The tools to split and re-assemble the specification (as outlined in the Restructuring HTML5 document[1]) are not yet available. I'll be writing and placing those tools in the HTML5 git repository in the coming month. Here is the current process that is used to build the specification:

1. Copy Ian's latest spec from WHAT WG's SVN repository.
2. Apply changes via a Python script to the copy of Ian's spec (such as inserting the RDFa spec text).
3. Run the Anolis post-processor on the newly modified spec.

If you would just like to take the repository for a test spin, do the following:

git clone git://github.com/html5/spec.git

Let me know if there are any bugs, questions or concerns with the current setup. It will hopefully become more usable as the weeks progress.

-- manu

[1] http://html5.digitalbazaar.com/a-new-way-forward/

-- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
Re: [whatwg] the cite element
On Sun, Jul 19, 2009 at 4:58 AM, Ian Hickson i...@hixie.ch wrote: If cite is exclusively for titles, it shouldn't be called cite. Sure, but we're about 15 years too late for that.

Well, no: as far as I have been able to determine, no HTML specification before HTML5 limited this element to titles.

In practice, people haven't been confused between these two as far as we can tell. People who use cite seem to use it for titles, and people who use cite= seem to use it for URLs. (The latter is rare.)

See http://www.four24.com/; note near the top of the source: <blockquote id="verse" cite="John 4:24">...

A new element wouldn't work in legacy UAs, so it wouldn't be as compelling a solution. Also, cite is already being used for this purpose.

My preference would be for cite to retain the flexibility it has in pre-HTML5 specifications, which would include referencing titles. If backwards compatibility is that big a concern, why does HTML5 use legend outside of fieldset elements? See: http://twitter.com/rem/status/2869618614 And if the definition of new elements is such a concern, why introduce *any* new elements? (Please forgive the snark.)

What is the pressing need for an element for citations, which would require that we overload cite with two uses?

A title can be a citation, but not all citations are titles. What's the pressing need for limiting cite only to titles? I understand HTML5's attempts to provide semantic value to such elements as i, b, and small. To remove semantic value at the same time is completely asinine.

If cite's original meaning has value, that is true; what is its value?

I would assume that this would be obvious. cite both denotes and connotes citation.

Note that HTML5 now has a more detailed way of marking up citations, using the BibTeX vocabulary. I think this removes the need for using the cite element in the manner you describe.

Since this is supposed to be the case, why shouldn't HTML5 just ditch cite altogether? 
(Aside from backward compatibility, which is beside the point of the question.)

Backwards compatibility (with legacy documents, which use it to mean title of work) is the main reason.

I'd beg to differ, regarding legacy documents. See, for example, the automated citation generation at Wikipedia: http://en.wikipedia.org/wiki/Wikipedia:Citation_templates In addition, the comments at zeldman.com use cite to reference authors of comments. While that specific example is younger than HTML5, this is merely an example of a relatively common use-case for cite that does not use it to signify title of work. There is no reason at all why it can't be defined as citing whom.

The main reason would be that there doesn't appear to be a useful purpose to doing that.

The above references suggest otherwise. There are plenty of instances where one would want to cite people rather than just a title of work; blog commenters are only the most obvious example.

On Wed, 1 Jul 2009, Erik Vorhes wrote: On Wed, Jul 1, 2009 at 11:49 AM, Kristof Zelechovski giecr...@stegny.2a.pl wrote: I can imagine two reasons the CITE element cannot be defined as citing whom: 1. Existing tools may assume it contains a title. Existing tools (which I would assume follow the HTML 4.01 spec)

It appears this assumption is mistaken.

Really? Please provide evidence.

Existing tools that treat cite exclusively as title of work do so against every HTML specification out there (i.e., HTML 4.01 and earlier). While the HTML 4.01 specification is hardly perfect, I don't see the value in limiting the semantic potential of the cite element in HTML5.

As far as I can tell, increasing it from citations to titles of works is actually increasing its semantic potential, not limiting it.

Well, no. It's making it more exclusive. Defining cite as title of work increases its specificity, but limits its semantic potential. As I noted before, all titles are citations, but not all citations are titles. 
By defining cite as an element that identifies a citation, you allow for title of work while not excluding other justifiable uses of this element, e.g., cited person.

Indeed, there is a lot of misuse of the element -- as alternatives for q, i, em, and HTML5's meaning of cite, in particular. Expanding it to cover the meanings of q, i, and em doesn't seem as useful as expanding it just to cover works.

I believe you mean limiting it just to cover works here. By requesting cite to retain a definition of this is a citation, I am not advocating that it be allowed to overlap q, i, or em. (I realize you were responding to someone else's message, here. What I've suggested allows cite to retain its semantic value.)

I think it's clear that people want to use cite for things other than citations, and in fact do use it that way widely. If we're increasing it past just citations, then there seems to be clear value to using it to mark up
Re: [whatwg] the cite element
On Mon, Jul 27, 2009 at 3:20 PM, Erik Vorhes e...@textivism.com wrote: On Sun, Jul 19, 2009 at 4:58 AM, Ian Hickson i...@hixie.ch wrote: In practice, people haven't been confused between these two as far as we can tell. People who use cite seem to use it for titles, and people who use cite= seem to use it for URLs. (The latter is rare.) See http://www.four24.com/; note near the top of the source: <blockquote id="verse" cite="John 4:24">...

See http://philip.html5.org/data/cite-attribute-values.txt for some data. (Looks like non-URI values are quite rare.) Also maybe relevant: see http://philip.html5.org/data/cite.txt for some older data about cite. (Looks like non-title uses are very common.)

-- Philip Taylor exc...@gmail.com
Re: [whatwg] New HTML5 spec GIT collaboration repository
Manu Sporny wrote: 3. Run the Anolis post-processor on the newly modified spec.

Is there any reason you use --allow-duplicate-dfns? Likewise, you probably don't want --w3c-compat (the name is slightly misleading; it provides compatibility with the CSS WG's CSS3 Module Postprocessor, not with any W3C pubrules). On the whole I'd recommend running it with:

--w3c-compat-xref-a-placement --parser=lxml.html --output-encoding=us-ascii

The latter two options require Anolis 1.1, which is just as stable as 1.0. I believe those options are identical to how Hixie runs it through PMS.

-- Geoffrey Sneddon — Opera Software http://gsnedders.com/ http://www.opera.com/
Re: [whatwg] the cite element
On Mon, Jul 27, 2009 at 10:17 AM, Kristof Zelechovski giecr...@stegny.2a.pl wrote:

1. If you cite a person, the person you cite does not become a citation because of that. Putting the person inside the CITE element distorts the meaning.

If you are citing a person (either as someone worth quoting or as, say, the photographer of an image), how does using cite to identify the citation distort the meaning?

2. The example <CITE>Chaucer and the <CITE>Canterbury Tales</CITE></CITE> is invalid because the Canterbury Tales are not being cited, at least not in the title page.

Why not? It seems clear to me that one title is citing the other.

3. The semantic potential does not decrease uniformly with specificity. Rather, there is an optimal value somewhere in the middle of specificity. Arguably, that optimum is attained with CITE reserved for titles.

Arguably, the optimum is attained with cite reserved for citations.

4. Of course titles are not always styled the same way. However, there is a requirement that the presentation make sense in most cases when CSS is not supported. The cases where styling all titles in the same way makes the information hard to understand are scarce.

This doesn't explain why cite needs to be used exclusively for titles. (And I didn't realize that HTML was really just for use as styling hooks. There's no audible difference between <cite style="font-style:normal">MLA Handbook for Writers of Research Papers</cite> and <cite>MLA Handbook for Writers of Research Papers</cite>.)

5. Random markup errors in a few pages do not constitute an obstacle here, nor do errors in template code (they are ubiquitous once deployed, but they are easy to fix, at least at Wikipedia).

Except that Wikipedia is not erroneous in its usage of cite. It is declaring conformance to XHTML 1.0 Transitional, which is based on the HTML 4.01 specification, which defines cite as a citation or a reference to other sources. 
To the issue of cite in HTML5, using cite as title of work provides for no distinction between editions or translations of works.

6. It does not mean anything to say this is a citation; this definition is too ambiguous to be useful.

I obviously disagree. "cite identifies a title" is too narrow a definition to be useful.

Erik Vorhes
Re: [whatwg] A New Way Forward for HTML5 (revised)
John Foliot wrote:

Peter Kasting wrote: It seems like the only thing you could ask for beyond this is the ability to directly insert your own changes into the spec without prior editorial oversight. I think that might be what you're asking for. This seems very unwise.

Really? This appears to be exactly the single, special-status privilege currently reserved for Ian Hickson.

False.

It is, in fact a serious complaint that many are trying to correct, including Manu with his offer to assist in setting up a more egalitarian solution.

In fact, Manu is an existence proof that the previous statement you made is false. Ian is free to produce Working Drafts that are published by this working group. The status of such drafts is, and I quote[1]: Consensus is not a prerequisite for approval to publish; the Working Group MAY request publication of a Working Draft even if it is unstable and does not meet all Working Group requirements.

Both you and Manu have exactly the same ability as Ian does in this respect. Ian has asked the group for permission to publish, and that was granted. Manu has produced a document but has yet to request permission to publish as a Working Draft. You are welcome to do likewise[2].

JF

- Sam Ruby

[1] http://www.w3.org/2005/10/Process-20051014/tr.html#first-wd
[2] http://lists.w3.org/Archives/Public/public-html/2009Jul/0627.html
[whatwg] Installed Apps
Hello folks - I'm an engineer on the Gmail team. We've been working on a prototype with the Chrome team to make the Gmail experience better. We thought we'd throw out our ideas to the list to get some feedback.

THE PROBLEM

We would like to enable rich internet applications to achieve feature parity with desktop applications. I will use Gmail and Outlook as examples for stating the problems we hope to solve.

-- Slow startup: When a user navigates to mail.google.com, multiple server requests are required to render the page. The JavaScript is cacheable, but personal data (e.g. the list of emails to show) is not. New releases of Gmail that require JS downloads are even slower to load.

-- Native apps like Outlook can (and do) run background processes on the user's machine to make sure that data is always up-to-date.

-- Notifications: Likewise, Outlook can notify users (via a background process) when new mail comes in even if it's not running.

A SOLUTION

Our proposed solution has two parts. The first, which should be generally useful, is the ability to have a hidden HTML/JS page running in the background that can access the DOM of visible windows. This page should be accessible from windows that the user navigates to. We call this background JavaScript window a shared context or a background page. This will enable multiple instances of a web app (e.g. tearoff windows in Gmail) to cleanly access the same user state no matter which windows are open.

Additionally, we'd like this background page to continue to run after the user has navigated away from the site, and preferably after the user has closed the browser. This will enable us to keep client-side data up-to-date on the user's machine. It will also enable us to download JS in advance. When the user navigates to a web app, all the background page has to do is draw the DOM in the visible window. This should significantly speed up app startup. 
Additionally, when something happens that requires notification, the background page can launch a visible page with a notification (or use other rich APIs for showing notifications).

WHY NOT SHARED WORKERS

Shared workers and persistent workers are designed to solve similar problems, but don't meet our needs. The key difference between what we're proposing and earlier proposals for persistent workers is that background pages would be able to launch visible windows and have full DOM access. This is different from the model of workers, where all interaction with the DOM has to be done through asynchronous message passing. We would like background pages to be able to drive UI in a visible window using the techniques (DOM manipulation, innerHTML) that are common today. We believe that more apps would be able to take advantage of a background page if they didn't require rewriting the app in the asynchronous, message-passing style required by workers. Allowing the background page to drive the UI by doing direct DOM manipulation is a more common programming style. For apps that don't need the benefits of multiple threads provided by shared workers, this will give the benefits of fast startup and the benefits of running in the background (like showing notifications) without the downside of the worker programming model. The concepts here are similar to permanent workers, but with a different programming model.

IMPLEMENTATION AVENUES

For now, we have a simple API in Chrome. This is meant as a prototype of the concepts, not as a final API.

-- installApp(uri, name): Fetches the HTML page at uri, and runs it as a hidden window. Currently this window is loaded when the machine starts. This should eventually involve permissioning UI, but this is not implemented. name is a name that can be used to get access to the hidden window.

-- getInstalledApp(name): Returns a reference to the background page, or null if the app is not installed. 
-- removeInstalledApp(name): The moral equivalent of window.close() for a background page.

We might migrate to a model where webapps can be installed as Chrome extensions instead of using a JavaScript call to install the app. Another alternative we've discussed is allowing authors to specify in their AppCache manifest that a given page should be an always-loaded background page. This seems like a natural fit, since the AppCache manifest is where authors describe the attributes of various parts of their app.

KNOWN ISSUES

As mentioned in earlier discussions about persistent workers, permissioning UI is a major issue.

FEEDBACK

We would like to know if others would find this functionality useful. Does anyone have an idea for a better API?

Michael
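The install-then-lookup flow implied by the three calls above can be sketched as a small helper. This is only an illustration of the prototype's described behavior: `env` stands in for `window` so the flow can be exercised outside a browser, and the URI and app name are hypothetical.

```javascript
// Ensure a background page named "mail" is installed and return a
// reference to it, using the prototype API described above
// (installApp / getInstalledApp). Returns null where the prototype
// is unavailable.
function ensureBackgroundApp(env) {
  if (!env.getInstalledApp) return null;        // prototype not present
  let app = env.getInstalledApp("mail");
  if (!app) {
    // Fetches the page at the URI and runs it as a hidden window.
    env.installApp("https://mail.example.com/background.html", "mail");
    app = env.getInstalledApp("mail");
  }
  return app;
}
```

In a real page one would pass `window`, after which the background page (which has full DOM access to visible windows) could draw the app's UI directly into the visible document, giving the fast-startup behavior the proposal aims for.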
Re: [whatwg] Issues with Web Sockets API
On 06.07.2009, at 21:30, Ian Hickson wrote:

postMessage() may want another exception condition... 'too much data pending exception'... consider calling postMessage in a while(true) loop... at some point the system is going to have to give up queuing the data if it's not actually making its way out on the wire.

The spec doesn't specify how UAs are to handle hitting hardware limitations or system limitations, because it's often difficult to truly control how those cases are handled.

I agree with Michael that send() should not silently drop data that could not be sent. It is very easy to fill send buffers, and if bytes get silently dropped, implementing app-level acks becomes quite difficult. With TCP, the basic guarantee is that bytes are not lost until the connection is lost, so app-level acks only require confirming the last processed command, and losing this guarantee would be quite unfortunate. Most (all?) system TCP implementations certainly have ways to deal with flow control. However, I do not think that raising an exception is an appropriate answer. Often, the TCP implementation takes a part of the data given to it, and asks to resubmit the rest later. So, just returning an integer result from send() would be best in my opinion.

The thread has such a nice title that I'm going to throw some additional issues in :)

1) Web Sockets is specified to send whatever authentication credentials the client has for the resource. However, there is no challenge-response sequence specified, which seems to prevent using common auth schemes. HTTP Basic needs to know an authentication realm for the credentials, and other schemes need a cryptographic challenge (e.g. a nonce for Digest auth).

2) It is not specified what the server does when credentials are incorrect, so I assume that the intended behavior is to close the connection. Unlike an HTTP 401 response, this doesn't give the client a chance to ask the user again. 
Also, if the server is on a different host, especially one that's not shared with an HTTP server, there isn't a way to obtain credentials in the first place. I'm not sure how to best handle this, other than to copy more HTTP behaviors.

3) A Web Sockets server cannot respond with a redirect to another URL. I'm not sure if the intention is to leave this to implementations, or to add it in Web Sockets v2, but it definitely looks like an important feature to me, maybe something that needs to be in v1.

4) "If the user agent already has a Web Socket connection to the remote host identified by /host/ (even if known by another name), wait until that connection has been established or for that connection to have failed." It doesn't look like "host identified by /host/" is defined anywhere. Does this requirement say that IP addresses should be compared, instead of host names? I'm not sure if this is significant for preventing DoS attacks, and anyway, the IP address may not be known before a request is sent. This puts an unusual burden on the implementation.

5) We probably need to specify a keep-alive feature to avoid proxy connection timeouts. I do not have factual data on whether common proxies implement connection timeouts, but I'd expect that they often do.

6) The spec should probably explicitly permit blocking some ports from use with Web Sockets at the UA's discretion. In practice, the list would likely be the same as for HTTP; see e.g. http://www.mozilla.org/projects/netlib/PortBanning.html .

7) "use a SOCKS proxy for WebSocket connections, if available, or failing that, to prefer an HTTPS proxy over an HTTP proxy" It is not clear what definition of proxy types is used here. To me, an HTTPS proxy is one that supports CONNECT to port 443, and an HTTP proxy (if we're making a distinction from HTTPS) is one that intercepts and forwards GET requests. 
However, this understanding contradicts an example in paragraph 3.1.3, and also, it's not clear how a GET proxy could be used for Web Sockets.

8) Many HTTPS proxies only allow connecting to port 443. Do you have data on whether relying on existing proxies to establish connections to arbitrary ports is practical?

9) "There is no limit to the number of established Web Socket connections a user agent can have with a single remote host." Does this mean that Web Socket connections are exempt from the normal 4-connection (or so) limit? Why is it OK?

10) The Web Socket handshake uses CRLF line endings strictly. Does this add much to security? It prevents using telnet/netcat for debugging, which is something I personally use often when working on networking issues. If there is no practical reason for this, I'd suggest relaxing this aspect of parsing.

11) There is no way for the client to know that the connection has been closed. For example: - socket.close() is called from JavaScript; - onclose handler is invoked; - more data arrives from the
Re: [whatwg] A New Way Forward for HTML5 (revised)
On Mon, Jul 27, 2009 at 12:06 PM, John Foliot jfol...@stanford.edu wrote: That said, the barrier to equal entry remains high: http://burningbird.net/node/28 I don't understand. That page says We're told that to propose changes to the document for consideration, we need to ... and then a long list of things. But that seems untrue. To propose changes, all you need to do is write them, anywhere you want (in an email, on a webpage, whatever) and notify people. If we had to do everything that page lists in order to propose changes, I'd be upset too. But we don't. A simple email works. I'm beginning to suspect that this whole line of conversation is specific to RDFa, which is a discussion I never took part in. PK
Re: [whatwg] Issues with Web Sockets API
On 27.07.2009, at 12:35, Maciej Stachowiak wrote:

However, I do not think that raising an exception is an appropriate answer. Often, the TCP implementation takes a part of the data given to it, and asks to resubmit the rest later. So, just returning an integer result from send() would be best in my opinion.

With WebSocket, another possibility is for the implementation to buffer pending data that could not yet be sent to the TCP layer, so that the client of WebSocket doesn't have to be exposed to system limitations. At that point, an exception is only needed if the implementation runs out of memory for buffering. With a system TCP implementation, the buffering would be in kernel space, which is a scarce resource, but user space memory inside the implementation is no more scarce than user space memory held by the Web application waiting to send to the WebSocket.

I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections.

- WBR, Alexey Proskuryakov
Re: [whatwg] A New Way Forward for HTML5 (revised)
John Foliot wrote:

Sam Ruby wrote: Really? This appears to be exactly the single, special status privilege currently reserved for Ian Hickson. False.

...and yes, I stand corrected. Although the *impression* that this is the current status remains fairly pervasive; however, I will endeavor to dispel that myth as well. That said, the barrier to equal entry remains high: http://burningbird.net/node/28 (however, I will also state that Sam has offered on numerous occasions to extend help to any that require it - balanced commentary)

My goal is to ensure that there are no excuses not to participate. I've said that a person can simply go into notepad[3], make the changes, and I will take care of the rest. Manu has documented the process for those who prefer to do it themselves[4]. Ian has offered to make the changes if somebody can explain the use cases[5]. If people have suggestions on how to be even *more* inclusive, I welcome any and all suggestions. Meanwhile, your offer to help dispel that myth is very much appreciated.

Both you and Manu have exactly the same ability as Ian does in this respect. Ian has asked the group for permission to publish, and that was granted. Manu has produced a document but has yet to request permission to publish as a Working Draft. You are welcome to do likewise[2].

While I have personal reservations that this may introduce an even wider fork of opinion, making consensus down the road even harder to achieve, this is the die that has been cast. I will offer what contributions I can to both Manu and Shelly in their respective initiatives, to the best of my ability, and will leave the WHAT WG to continue propagating what I see as their mistakes and false assumptions as they see fit - they have clearly signaled that not all contributions are welcome. 
It may very well end up that the sole difference between the WHATWG document and the W3C document is that the WHATWG document states that the summary attribute is conformant but obsolete, and the W3C document states that the summary attribute is conformant but not (yet) obsolete. But the only way that will happen is if somebody goes into notepad, or follows Manu's process, or explains the use case, or finds some other means to cause a working draft to appear with these changes. JF [1] http://www.w3.org/2005/10/Process-20051014/tr.html#first-wd [2] http://lists.w3.org/Archives/Public/public-html/2009Jul/0627.html - Sam Ruby [3] http://lists.w3.org/Archives/Public/public-html/2009Jul/0633.html [4] http://lists.w3.org/Archives/Public/public-html/2009Jul/0785.html [5] http://lists.w3.org/Archives/Public/public-html/2009Jul/0745.html
Re: [whatwg] Issues with Web Sockets API
On Mon, Jul 27, 2009 at 1:14 PM, Alexey Proskuryakov a...@webkit.org wrote: On 27.07.2009, at 12:35, Maciej Stachowiak wrote: However, I do not think that raising an exception is an appropriate answer. Often, the TCP implementation takes a part of data given to it, and asks to resubmit the rest later. So, just returning an integer result from send() would be best in my opinion. With WebSocket, another possibility is for the implementation to buffer pending data that could not yet be sent to the TCP layer, so that the client of WebSocket doesn't have to be exposed to system limitations. At that point, an exception is only needed if the implementation runs out of memory for buffering. With a system TCP implementation, the buffering would be in kernel space, which is a scarce resource, but user space memory inside the implementation is no more scarce than user space memory held by the Web application waiting to send to the WebSocket. I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections. Maybe the right behavior is to buffer in user-space (like Maciej explained) up until a limit (left up to the UA) and then anything beyond that results in an exception. This seems like it'd handle bursty communication and would keep the failure model simple.
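Jeremy's proposal - buffer in user space up to a UA-chosen limit, then throw - can be sketched as a plain JavaScript simulation. Everything here (`BufferedSender`, `drainToNetwork`, the 1MB default) is an illustrative assumption, not the actual WebSocket interface:

```javascript
// Sketch (not the real WebSocket API): buffer outgoing data in user space up
// to a limit chosen by the UA, and throw once a send() would exceed it.
// BufferedSender, drainToNetwork, and MAX_BUFFERED_BYTES are invented names.
const MAX_BUFFERED_BYTES = 1024 * 1024; // hypothetical UA-chosen limit

class BufferedSender {
  constructor(limit = MAX_BUFFERED_BYTES) {
    this.limit = limit;
    this.buffered = 0; // bytes queued but not yet handed to TCP
  }
  send(data) {
    if (this.buffered + data.length > this.limit) {
      // Corresponds to send() raising an exception in the proposal.
      throw new Error('send buffer full');
    }
    this.buffered += data.length;
  }
  drainToNetwork(bytes) {
    // Called as the network layer accepts queued bytes.
    this.buffered = Math.max(0, this.buffered - bytes);
  }
}
```

An application that sends in bursts stays under the limit and never sees the exception; one that persistently outpaces the network hits it and learns it is misbehaving, which is the feedback Alexey asks for.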
Re: [whatwg] Issues with Web Sockets API
On Mon, Jul 27, 2009 at 1:14 PM, Alexey Proskuryakov a...@webkit.org wrote: On 27.07.2009, at 12:35, Maciej Stachowiak wrote: However, I do not think that raising an exception is an appropriate answer. Often, the TCP implementation takes a part of data given to it, and asks to resubmit the rest later. So, just returning an integer result from send() would be best in my opinion. With WebSocket, another possibility is for the implementation to buffer pending data that could not yet be sent to the TCP layer, so that the client of WebSocket doesn't have to be exposed to system limitations. At that point, an exception is only needed if the implementation runs out of memory for buffering. With a system TCP implementation, the buffering would be in kernel space, which is a scarce resource, but user space memory inside the implementation is no more scarce than user space memory held by the Web application waiting to send to the WebSocket. I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. I would suggest that the solution to this situation is an appropriate application-level protocol (i.e. acks) to allow the application to have no more than (say) 1MB of data outstanding. I'm just afraid that we're burdening the API to handle degenerative cases that the vast majority of users won't encounter. Specifying in the API that any arbitrary send() invocation could throw some kind of retry exception or return some kind of error code is really really cumbersome. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections. - WBR, Alexey Proskuryakov
Re: [whatwg] Issues with Web Sockets API
On 27.07.2009, at 13:20, Jeremy Orlow wrote: I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections. Maybe the right behavior is to buffer in user-space (like Maciej explained) up until a limit (left up to the UA) and then anything beyond that results in an exception. This seems like it'd handle bursty communication and would keep the failure model simple. This sounds like the best approach to me. On 27.07.2009, at 13:27, Drew Wilson wrote: I would suggest that the solution to this situation is an appropriate application-level protocol (i.e. acks) to allow the application to have no more than (say) 1MB of data outstanding. I'm just afraid that we're burdening the API to handle degenerative cases that the vast majority of users won't encounter. Specifying in the API that any arbitrary send() invocation could throw some kind of retry exception or return some kind of error code is really really cumbersome. Having a send() that doesn't return anything and doesn't raise exceptions would be a clear signal that send() just blocks until it's possible to send data to me, and I'm sure to many others, as well. There is no reason to silently drop data sent over a TCP connection - after all, we could as well base the protocol on UDP if we did, and lose nothing. - WBR, Alexey Proskuryakov
Re: [whatwg] Issues with Web Sockets API
On Mon, Jul 27, 2009 at 1:36 PM, Alexey Proskuryakov a...@webkit.org wrote: On 27.07.2009, at 13:20, Jeremy Orlow wrote: I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections. Maybe the right behavior is to buffer in user-space (like Maciej explained) up until a limit (left up to the UA) and then anything beyond that results in an exception. This seems like it'd handle bursty communication and would keep the failure model simple. This sounds like the best approach to me. On 27.07.2009, at 13:27, Drew Wilson wrote: I would suggest that the solution to this situation is an appropriate application-level protocol (i.e. acks) to allow the application to have no more than (say) 1MB of data outstanding. I'm just afraid that we're burdening the API to handle degenerative cases that the vast majority of users won't encounter. Specifying in the API that any arbitrary send() invocation could throw some kind of retry exception or return some kind of error code is really really cumbersome. Having a send() that doesn't return anything and doesn't raise exceptions would be a clear signal that send() just blocks until it's possible to send data to me, and I'm sure to many others, as well. There is no reason to silently drop data sent over a TCP connection - after all, we could as well base the protocol on UDP if we did, and lose nothing. There's another option besides blocking, raising an exception, and dropping data: unlimited buffering in user space. 
So I'm saying we should not put any limits on the amount of user-space buffering we're willing to do, any more than we put any limits on the amount of other types of user-space memory allocation a page can perform. - WBR, Alexey Proskuryakov
Re: [whatwg] Issues with Web Sockets API
On Mon, Jul 27, 2009 at 1:44 PM, Drew Wilson atwil...@google.com wrote: On Mon, Jul 27, 2009 at 1:36 PM, Alexey Proskuryakov a...@webkit.org wrote: On 27.07.2009, at 13:20, Jeremy Orlow wrote: I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections. Maybe the right behavior is to buffer in user-space (like Maciej explained) up until a limit (left up to the UA) and then anything beyond that results in an exception. This seems like it'd handle bursty communication and would keep the failure model simple. This sounds like the best approach to me. On 27.07.2009, at 13:27, Drew Wilson wrote: I would suggest that the solution to this situation is an appropriate application-level protocol (i.e. acks) to allow the application to have no more than (say) 1MB of data outstanding. I'm just afraid that we're burdening the API to handle degenerative cases that the vast majority of users won't encounter. Specifying in the API that any arbitrary send() invocation could throw some kind of retry exception or return some kind of error code is really really cumbersome. Having a send() that doesn't return anything and doesn't raise exceptions would be a clear signal that send() just blocks until it's possible to send data to me, and I'm sure to many others, as well. There is no reason to silently drop data sent over a TCP connection - after all, we could as well base the protocol on UDP if we did, and lose nothing. There's another option besides blocking, raising an exception, and dropping data: unlimited buffering in user space. 
So I'm saying we should not put any limits on the amount of user-space buffering we're willing to do, any more than we put any limits on the amount of other types of user-space memory allocation a page can perform. I agree with Alexey that applications need feedback when they're consistently exceeding what the network connection can handle. I think an application getting an exception rather than filling up its buffer until it OOMs is a much better experience for the user and the web developer. If you have application level ACKs (which you probably should--especially in high-throughput uses), you really shouldn't even hit the buffer limits that a UA might have in place. I don't really think that having a limit on the buffer size is a problem; if anything, it'll promote better application-level flow control. J
Re: [whatwg] Issues with Web Sockets API
On Mon, Jul 27, 2009 at 2:02 PM, Jeremy Orlow jor...@chromium.org wrote: On Mon, Jul 27, 2009 at 1:44 PM, Drew Wilson atwil...@google.com wrote: On Mon, Jul 27, 2009 at 1:36 PM, Alexey Proskuryakov a...@webkit.org wrote: On 27.07.2009, at 13:20, Jeremy Orlow wrote: I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections. Maybe the right behavior is to buffer in user-space (like Maciej explained) up until a limit (left up to the UA) and then anything beyond that results in an exception. This seems like it'd handle bursty communication and would keep the failure model simple. This sounds like the best approach to me. On 27.07.2009, at 13:27, Drew Wilson wrote: I would suggest that the solution to this situation is an appropriate application-level protocol (i.e. acks) to allow the application to have no more than (say) 1MB of data outstanding. I'm just afraid that we're burdening the API to handle degenerative cases that the vast majority of users won't encounter. Specifying in the API that any arbitrary send() invocation could throw some kind of retry exception or return some kind of error code is really really cumbersome. Having a send() that doesn't return anything and doesn't raise exceptions would be a clear signal that send() just blocks until it's possible to send data to me, and I'm sure to many others, as well. There is no reason to silently drop data sent over a TCP connection - after all, we could as well base the protocol on UDP if we did, and lose nothing. 
There's another option besides blocking, raising an exception, and dropping data: unlimited buffering in user space. So I'm saying we should not put any limits on the amount of user-space buffering we're willing to do, any more than we put any limits on the amount of other types of user-space memory allocation a page can perform. I agree with Alexey that applications need feedback when they're consistently exceeding what the network connection can handle. I think an application getting an exception rather than filling up its buffer until it OOMs is a much better experience for the user and the web developer. I'm assuming that no actual limits would be specified in the specification, so it would be entirely up to a given user agent to decide how much buffering it is willing to provide. Doesn't that imply that a well-behaved web application would be forced to check for exceptions from all send() invocations, since there's no way to know a priori whether limits imposed by an application via its app-level protocol would be sufficient to stay under a given user-agent's internal limits? Even worse, to be broadly deployable the app-level protocol would have to enforce the lowest-common-denominator buffering limit, which would inhibit throughput on platforms that support higher buffers. In practice, I suspect most implementations would adopt a "just blast out as much data as possible until the system throws an exception, then set a timer to retry the send in 100ms" approach. But perhaps that's your intention? If so, then I'd suggest changing the API to just have a canWrite notification like other async socket APIs provide (or something similar) to avoid the clunky catch-and-retry idiom. Personally, I think that's overkill for the vast majority of use cases which would be more than happy with a simple send(), and I'm not sure why we're obsessing over limiting memory usage in this case when we allow pages to use arbitrary amounts of memory elsewhere. 
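The catch-and-retry idiom described above can be sketched as follows. This assumes a send() that throws when the user agent's internal buffer is full; `trySend` and `RETRY_MS` are invented names, and the scheduler is injected so the sketch does not depend on a browser's setTimeout:

```javascript
// Sketch of the "blast out data until send() throws, then retry on a timer"
// idiom. trySend and RETRY_MS are hypothetical; in a real page, schedule
// would simply be setTimeout.
const RETRY_MS = 100;

function trySend(socketLike, data, schedule) {
  try {
    socketLike.send(data); // may throw if the UA's buffer is full
    return true;
  } catch (e) {
    // Buffer full: retry the same payload later.
    schedule(() => trySend(socketLike, data, schedule), RETRY_MS);
    return false;
  }
}
```

A canWrite-style notification, as suggested in the message above, would replace this polling loop with a single callback.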
If you have application level ACKs (which you probably should--especially in high-throughput uses), you really shouldn't even hit the buffer limits that a UA might have in place. I don't really think that having a limit on the buffer size is a problem and that, if anything, it'll promote better application level flow control. J
Re: [whatwg] Serializing HTML fragments (9.4)
On Thu, 9 Jul 2009, Kartikaya Gupta wrote: According to this section 9.4, any descendant text node of a style element should be outputted literally, rather than being escaped. However, this doesn't seem to match what Opera/Chrome/FF do. Test case: <html> <body> <style id="test"></style> <script type="text/javascript"> var test = document.getElementById("test"); var c1 = document.createElement( 'c1' ); c1.appendChild( document.createTextNode( 'some>stuff' ) ); test.appendChild( c1 ); test.appendChild( document.createTextNode( 'morestuff' ) ); var html = test.innerHTML; alert(html); </script> </body> </html> Opera and Chrome will alert <c1>some&gt;stuff</c1>morestuff (escaping the angle bracket inside the child element) and Firefox just outputs morestuff (presumably a bug). I tried a couple of the other special elements (script and xmp) and they worked the same way. I think for compatibility the spec should say "If the parent of the current node is a" instead of "If one of the ancestors of current node is a" for the Text/CDATASection handling. On Thu, 9 Jul 2009, Boris Zbarsky wrote: It's actually rather purposeful, at least in terms of the code. It'd be pretty easy to change to returning the textContent instead (so walking into kids). See https://bugzilla.mozilla.org/show_bug.cgi?id=125746 for the history here (the code has just been carried along since). On Mon, 13 Jul 2009, Simon Pieters wrote: I think the spec currently matches what IE does. On Mon, 13 Jul 2009, Boris Zbarsky wrote: Does IE even support adding a child element to a script? It appears not. I've changed the spec to say "parent" rather than "ancestor" (matching the descriptions of Opera and Chrome above). One problem with what the spec currently says, if I read it correctly, is that it doesn't round-trip scripts correctly, at least as far as I can see. 
What Gecko serializes as the innerHTML of the script is something that, if you set the script's innerHTML to that value, will give a script that is equivalent to the original one if it's executed. That doesn't seem to be the case for the spec's current behavior... If the script node contains elements, then indeed, you'll get weird behaviour when you use innerHTML. I'm not sure that's a big problem, though... if you want innerHTML to work, then don't nest elements in script blocks. You have to go to some lengths to do that anyway, and it's non-conforming. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
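The difference between the "ancestor" and "parent" wording can be illustrated with a toy text-node serializer. This is a deliberately simplified sketch, not the spec's algorithm; `serializeText`, `escapeText`, and the abbreviated `SPECIAL` set are invented for illustration:

```javascript
// Toy illustration of the rule Ian settled on: a text node is emitted
// literally only when its direct *parent* is one of the special elements;
// otherwise '&', '<', and '>' are escaped. Names here are invented.
const SPECIAL = new Set(['style', 'script', 'xmp', 'iframe',
                         'noembed', 'noframes', 'plaintext']);

function escapeText(s) {
  // Escape '&' first so the output is not double-escaped.
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function serializeText(text, parentTag) {
  return SPECIAL.has(parentTag) ? text : escapeText(text);
}
```

Under the "parent" rule, text inside a c1 element nested within a style element is escaped (matching the Opera/Chrome behavior in the test case), whereas an "ancestor" rule would have emitted it literally.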
Re: [whatwg] Canvas context.drawImage clarification
On Thu, 9 Jul 2009, Gregg Tavares wrote: The specific ambiguity I'd like to bring up has to do with the several versions of a function, context.drawImage. They take width and height values. The spec does not make it clear what is supposed to happen with negative values. My personal interpretation and preference is that negative values should (a) be legal and (b) draw backward, flipping the image. The specification currently says: The source rectangle is the rectangle whose corners are the four points (sx, sy), (sx+sw, sy), (sx+sw, sy+sh), (sx, sy+sh). ... The destination rectangle is the rectangle whose corners are the four points (dx, dy), (dx+dw, dy), (dx+dw, dy+dh), (dx, dy+dh). Well, simple math would suggest that if sx = 10, and sw = -5 then it still defines a valid rectangle. Correct. Why is this ambiguous? The rectangle is well-defined, it just happens that its points are given in a different order than normally. I'd like to make a passionate plea that the spec say implementations must support negative widths and negative heights and draw the image backward effectively flipping the result. If you want to flip the image, use a transform. Also, I'd like to suggest that a widths and heights of 0 for source should be valid as well as rectangles outside of the source also be valid and that this part of the spec. If the source rectangle is not entirely within the source image, or if one of the sw or sh arguments is zero, the implementation must raise an INDEX_SIZE_ERR exception. be changed to reflect that. If height or width is zero, how do you scale the bitmap up to a non-zero size? We could use transparent black for the pixels outside the image, but this is already interoperably implemented, so I don't want to change it. Coming from a graphics background I see no reason why if I let my user size an image in a canvas I should have to special case a width or height of zero. Just draw nothing if the width or height is zero. 
Similarly, if I was to provide a UI to let a user choose part of the source to copy to the dest and I let them define a rectangle on the source and drag it such that all or part of it is off the source I see no reason why I should have to do extra math in my application to make that work when simple clipping of values in drawImage would make all that extra work required by each app disappear. I agree that this may have made sense when the API was being designed a few years ago. The next issue related to drawImage is that the spec does not specify how to filter an image when scaling it. Should it use bi-linear interpolation? Nearest Neighbor? Maybe that should stay implementation dependent? On top of that the spec does not say what happens at the edges and the different browsers are doing different things. To give you an example, if you take a 2x2 pixel image and scale it to 256x256 using drawImage. All the major browsers that currently support the canvas tag will give you an image where the center of each pixel is around center of each 128x128 corner of the 256x256 result. The area inside the area defined by those 4 points is rendered very similar on all 4 browsers. The area outside though, the edge, is rendered very differently. On Safari, Chrome and Opera the colors of the original pixels continue to be blended all the way to the edge of the 256x256 area. On Firefox though, the blending happens as though the source image was actually 4x4 pixels instead of 2x2 where the edge pixels are all set to an RGBA value of 0, 0, 0, 0. It then draws that scaled image as though the source rectangle was sx = 1, sy = 1, sw = 2, sh = 2 so that you get a progressively more and more translucent color towards the edge of the rectangle. I don't know which is right but with low resolution source images the 2 give vastly different results. Here's a webpage showing the issue. 
http://greggman.com/downloads/examples/canvas-test/test-01/canvas-test-01-results.html It's not clear to me why what Firefox does is actually wrong. They use different assumptions, but why is it wrong? There's no transparency in the original, sure, but there's also no pixelation in the original, and no purple between the two pixels on the left, yet you aren't complaining about the introduction of pixelation or purple, both of which are done by one or another of the browsers. On Thu, 9 Jul 2009, Gregg Tavares wrote: [...] Or making it consistent when the DOCTYPE is set to something. We're not adding any more quirks modes, four is already far too many. We want consistency across all modes. When I scale a rectangular opaque image I expect rectangular opaque results. The Firefox implementation does not do this. Let them know. This seems like a quality of implementation issue. I don't expect a 2x2 bitmap with four distinct colours to turn into the washes the other UAs do either. If
Re: [whatwg] New HTML5 spec GIT collaboration repository
Manu Sporny: 3. Running the Anolis post-processor on the newly modified spec. Geoffrey Sneddon: Is there any reason you use --allow-duplicate-dfns? I think it’s because the source file includes the source for multiple specs (HTML 5, Web Sockets, etc.) which, when taken all together, have duplicate definitions. Manu’s Makefile will need to split out the HTML 5-specific parts (between the <!--START html5--> and <!--END html5--> markers). The ‘source-html5 : source’ rule in http://dev.w3.org/html5/spec-template/Makefile will handle that. -- Cameron McCormack ≝ http://mcc.id.au/
Re: [whatwg] [EventSource] Garbage collection rules
On Fri, 10 Jul 2009, Stewart Brodie wrote: The GC rules in section 9 seem overly permissive - if there is a listener for message events but the script forgets to call close() when the user navigates away, then the resources it is consuming cannot be reclaimed. Fixed. There is a small chance that it may be reclaimed if the server terminates the connection and a GC occurs before the UA is able to re-establish the connection (i.e. during the reconnection delay or the reconnection) [...] Fixed. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Serving up Theora video in the real world
On Fri, 10 Jul 2009, Aryeh Gregor wrote: On Fri, Jul 10, 2009 at 4:57 AM, Robert O'Callahan rob...@ocallahan.org wrote: The way we've implemented in Firefox, we'll return yes if you specify a codecs parameter and we support every codec in your list. So v.canPlayType(video/ogg; codecs=vorbis,theora) returns "probably" in Firefox 3.5. I think this is reasonable because I believe that, modulo bugs in our implementation, we support the full Theora and Vorbis specs. On the other hand, we will return "maybe" for v.canPlayType(video/ogg). I think this distinction will be useful. In what use-case would an author want to make use of the distinction? In either case, your only course of action is to try playing the video. Maybe you'd try testing all the video types you support, and if one is "maybe" while another is "probably" you'd go with "probably"? That seems like a pretty marginal use case to help for the sake of such a confusing API. Programmers expect binary logic, not ternary (look at the complaints about SQL's NULL). The main use case is ordering. If you have ten variants, you might want to try the "probably"s before the "maybe"s, especially if there are a lot of weird codecs involved, such that the "maybe"s might be able to play somewhat, just not as well as the "probably"s. On Fri, 10 Jul 2009, Philip Jagenstedt wrote: I agree that the current interface is ugly and quite fail to see what the use for it is. With a boolean return value, canPlayType(application/ogg) would return true if one can demux Ogg streams. canPlayType(application/ogg; codecs=vorbis,dirac) would return true if one can demux Ogg and decode vorbis + dirac. What would canPlayType(video/ogg; codecs=vorbis) return? There's not enough information there to say whether or not you can play a stream labeled that way. Unless there's some compelling use case that can't be handled with the above I'd support canPlayType returning a boolean. 
The only issue I can see is that canPlayType(foo)==true might be interpreted as a strong promise of playability which can't be given. In that case just rename the function to wouldTryTypeInResourceSelection (no, not really). You can use the method as it is now as a boolean method, in practice. However, I think there is value in being explicit that a true return value isn't really necessarily confident. On Fri, 10 Jul 2009, Philip Jagenstedt wrote: Before someone conjures up an example where this doesn't exactly match the current behavior, the point is simply that calling canPlayType without a codecs list or with specific codecs, you can learn exactly what is supported and not out of the container formats and codecs you are interested in, without the need for the strange probably/maybe/"" API. On Sat, 11 Jul 2009, Robert O'Callahan wrote: I think it would be somewhat counterintuitive for canPlayType(video/ogg) to return true, but canPlayType(video/ogg; codecs=dirac) to return false. On Sat, 11 Jul 2009, Philip Jägenstedt wrote: Well I disagree of course, because having canPlayType(video/ogg) mean anything else than can I demux Ogg streams is pointless. On Sat, 11 Jul 2009, Robert O'Callahan wrote: So you want canPlayType to mean one thing when provided a type without codecs, and another thing when provided a type with codecs. I don't think that's a good idea. Anyway, it's too late. If you care passionately about this you should have reopened this discussion months ago, not now that two browsers have just shipped support for the API in the spec. On Sun, 12 Jul 2009, Robert O'Callahan wrote: IIRC some browsers using system media frameworks don't know what codecs they support, so they still need to be able to answer "maybe" when codecs are provided; you still need a three-valued result. I still think it would confuse authors if you return true for canPlayType(T) and false for canPlayType(U) where U is a subset of T. I'm with Robert on this. 
The idea is that you can take the actual MIME type of a file, and find out what the odds are that the file will play ok. In practice, the odds are lower with video/ogg than a type that explicitly lists a supported codec. On Sun, 12 Jul 2009, Philip Jägenstedt wrote: Not that I expect this discussion to go anywhere, but out of curiosity I checked how Firefox/Safari/Chrome actually implement canPlayType: http://wiki.whatwg.org/wiki/Video_type_parameters#Browser_Support Firefox is conservative and honest (except maybe for audio/wav; codecs=0, what could you do with the RIFF DATA chunk?) Safari gets "maybe"/"probably" backwards compared to what the spec suggests. Chrome seems to ignore the codecs parameter, claiming "probably" even for bogus codecs. Authors obviously can't trust the distinction between "maybe" and "probably" to any extent. That certainly is unfortunate. On Sat, 11 Jul 2009, Maciej Stachowiak wrote: If I were designing the API from scratch, I
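The ordering use case Ian describes - try the "probably" sources before the "maybe" ones - could look like the following sketch. `rankSources` is an invented helper, and the canPlayType function is passed in so the sketch is independent of a real media element:

```javascript
// Sketch: order candidate sources so "probably" answers are tried before
// "maybe", and "" (cannot play) is skipped entirely. rankSources is a
// hypothetical helper, not part of the spec.
function rankSources(sources, canPlayType) {
  const score = { probably: 2, maybe: 1, '': 0 };
  return sources
    .map(src => ({ src, answer: canPlayType(src.type) }))
    .filter(entry => entry.answer !== '')
    .sort((a, b) => score[b.answer] - score[a.answer])
    .map(entry => entry.src);
}
```

This mirrors the resource-selection intuition in the thread: a type that explicitly lists supported codecs is a better first bet than a bare container type.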
Re: [whatwg] Canvas context.drawImage clarification
On Mon, Jul 27, 2009 at 3:12 PM, Ian Hickson i...@hixie.ch wrote: On Thu, 9 Jul 2009, Gregg Tavares wrote: The specific ambiguity I'd like to bring up has to do with the several versions of a function, context.drawImage. They take width and height values. The spec does not make it clear what is supposed to happen with negative values. My personal interpretation and preference is that negative values should (a) be legal and (b) draw backward, flipping the image. The specification currently says: The source rectangle is the rectangle whose corners are the four points (sx, sy), (sx+sw, sy), (sx+sw, sy+sh), (sx, sy+sh). ... The destination rectangle is the rectangle whose corners are the four points (dx, dy), (dx+dw, dy), (dx+dw, dy+dh), (dx, dy+dh). Well, simple math would suggest that if sx = 10, and sw = -5 then it still defines a valid rectangle. Correct. Why is this ambiguous? The rectangle is well-defined, it just happens that its points are given in a different order than normally. The diagram in the docs http://www.whatwg.org/specs/web-apps/current-work/multipage/the-canvas-element.html#images clearly shows SX maps to DX, SY maps to DY. But that is not the interpretation that is implemented. The interpretation that is implemented is Source Top/Left maps to Dest Top/Left regardless of whether SX/SY define top left or SX + WIDTH, SY + HEIGHT define top left. That seems pretty ambiguous to me. I'd argue that based on the spec as currently written, all current canvas implementations are wrong. Hence the suggestion to make it unambiguous or get the implementation to match the spec. I'd like to make a passionate plea that the spec say implementations must support negative widths and negative heights and draw the image backward effectively flipping the result. If you want to flip the image, use a transform. 
Also, I'd like to suggest that a widths and heights of 0 for source should be valid as well as rectangles outside of the source also be valid and that this part of the spec. If the source rectangle is not entirely within the source image, or if one of the sw or sh arguments is zero, the implementation must raise an INDEX_SIZE_ERR exception. be changed to reflect that. If height or width is zero, how do you scale the bitmap up to a non-zero size? We could use transparent black for the pixels outside the image, but this is already interoperably implemented, so I don't want to change it. Coming from a graphics background I see no reason why if I let my user size an image in a canvas I should have to special case a width or height of zero. Just draw nothing if the width or height is zero. Similarly, if I was to provide a UI to let a user choose part of the source to copy to the dest and I let them define a rectangle on the source and drag it such that all or part of it is off the source I see no reason why I should have to do extra math in my application to make that work when simple clipping of values in drawImage would make all that extra work required by each app disappear. I agree that this may have made sense when the API was being designed a few years ago. The next issue related to drawImage is that the spec does not specify how to filter an image when scaling it. Should it use bi-linear interpolation? Nearest Neighbor? Maybe that should stay implementation dependent? On top of that the spec does not say what happens at the edges and the different browsers are doing different things. To give you an example, if you take a 2x2 pixel image and scale it to 256x256 using drawImage. All the major browsers that currently support the canvas tag will give you an image where the center of each pixel is around center of each 128x128 corner of the 256x256 result. The area inside the area defined by those 4 points is rendered very similar on all 4 browsers. 
The area outside though, the edge, is rendered very differently. On Safari, Chrome and Opera the colors of the original pixels continue to be blended all the way to the edge of the 256x256 area. On Firefox though, the blending happens as though the source image was actually 4x4 pixels instead of 2x2, where the edge pixels are all set to an RGBA value of 0, 0, 0, 0. It then draws that scaled image as though the source rectangle was sx = 1, sy = 1, sw = 2, sh = 2, so that you get a progressively more and more translucent color towards the edge of the rectangle. I don't know which is right, but with low resolution source images the two give vastly different results. Here's a webpage showing the issue. http://greggman.com/downloads/examples/canvas-test/test-01/canvas-test-01-results.html It's not clear to me why what Firefox does is actually wrong. They use different assumptions, but why is it wrong? There's no transparency in the original, sure, but there's also no pixelation in the
Re: [whatwg] Canvas context.drawImage clarification
On Mon, 27 Jul 2009, Gregg Tavares wrote: The diagram in the docs http://www.whatwg.org/specs/web-apps/current-work/multipage/the-canvas-element.html#images clearly shows SX maps to DX, SY maps to DY. But that is not the interpretation that is implemented. The interpretation that is implemented is Source Top/Left maps to Dest Top/Left regardless of whether SX/SY define top left or SX + WIDTH, SY + HEIGHT define top left. That seems pretty ambiguous to me. Ignore the diagram. It's not normative. The text is the only thing that matters. I've moved the diagram up to the intro section to make this clearer. I'd argue that based on the spec as currently written, all current canvas implementations are wrong. Hence the suggestion to make it unambiguous or get the implementation to match the spec. Could you explain what other interpretations of the following you think are reasonable?: # The source rectangle is the rectangle whose corners are the four points # (sx, sy), (sx+sw, sy), (sx+sw, sy+sh), (sx, sy+sh). # [...] # The destination rectangle is the rectangle whose corners are the four # points (dx, dy), (dx+dw, dy), (dx+dw, dy+dh), (dx, dy+dh). # # When drawImage() is invoked, the region of the image specified by the # source rectangle must be painted on the region of the canvas specified # by the destination rectangle [...] It seems pretty unambiguous to me. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
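The corner definition Ian quotes can be checked with simple arithmetic: a negative sw or sh only changes the order in which the four corners are listed, not the rectangle itself. A minimal sketch (`normalizeRect` is a hypothetical helper, not part of the canvas API):

```javascript
// Sketch: the four corners (sx,sy), (sx+sw,sy), (sx+sw,sy+sh), (sx,sy+sh)
// define the same axis-aligned rectangle whether sw/sh are positive or
// negative; normalizing just picks out the top-left corner and the
// absolute extents.
function normalizeRect(sx, sy, sw, sh) {
  return {
    x: Math.min(sx, sx + sw),
    y: Math.min(sy, sy + sh),
    w: Math.abs(sw),
    h: Math.abs(sh),
  };
}
```

So (sx=10, sw=-5) and (sx=5, sw=5) describe the same source rectangle, which is why the text itself is unambiguous even though implementations disagreed about flipping.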
Re: [whatwg] getImageData/putImageData comments
On Fri, 10 Jul 2009, Boris Zbarsky wrote: Ian Hickson wrote: I don't see why the imagedata API isn't suitable for that. It's not like if you're painting that on the canvas you'll want to leave the last row or column unaffected. You'll want to clear it or some such, in practice. I believe in this case the page actually wants to create a canvas for each intermediate state, so needs to create canvases sized 1 imagedata pixel smaller than a given canvas. Just painting to a canvas of the old size isn't right, since it'll then take too much space in the layout. That seems like an unnecessarily complicated way of doing things. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Installed Apps
This sounds really powerful, and seems like a natural evolution of some of the stuff we've discussed previously for persistent workers. A few comments/notes: 1) It sounds like this background page would act like any other web page with respect to its processing model (i.e. like other pages, script running in this page would be limited as to how long it can run, as opposed to workers which can run for any arbitrary length of time). This seems reasonable, especially since this page could assumedly still create workers if it needs to do true background processing. It's really more of a hidden page than a background page? 2) For multi-process browsers like Chrome, there seem to be limitations as to what can actually be accessed between processes (direct DOM access across process boundaries seems problematic for example). Do you have ideas about how to address this, since assumedly the page calling getInstalledApp() could be running under some arbitrary process? 3) This approach has another advantage over something like workers in that a hidden page can do cross-domain access/sharing via iframes, whereas workers don't really have any facility for cross-domain access. 4) I had a quick question/clarification about the motivation behind this - aside from the advantages described above, it sounds like the specific problem you are solving by a hidden page is a) you don't have to load javascript in a new page (which I'm assuming must be slow), and b) you don't have to load client state in the new page. For a) - Having some way to load large amounts of cached javascript quickly in a new page seems like an issue that would be nice to address in general, not just for pages that install hidden pages. Are there other approaches worth trying here? For b) - How much client state are we talking about? If you were to pursue this approach using workers to maintain client state, how much data would you expect to be transferred to the client app on startup?
We're seeing fairly low latency for client-worker communication, so in theory it shouldn't be a huge source of slowdown. I agree that the programming model of the hidden page is much cleaner/more familiar than rewriting applications to use asynchronous messaging, so that may be sufficient motivation for this. -atw On Mon, Jul 27, 2009 at 11:50 AM, Michael Davidson m...@google.com wrote: Hello folks - I'm an engineer on the Gmail team. We've been working on a prototype with the Chrome team to make the Gmail experience better. We thought we'd throw out our ideas to the list to get some feedback. THE PROBLEM We would like to enable rich internet applications to achieve feature parity with desktop applications. I will use Gmail and Outlook as examples for stating the problems we hope to solve. -- Slow startup: When a user navigates to mail.google.com, multiple server requests are required to render the page. The Javascript is cacheable, but personal data (e.g. the list of emails to show) is not. New releases of Gmail that require JS downloads are even slower to load. -- Native apps like Outlook can (and do) run background processes on the user's machine to make sure that data is always up-to-date. -- Notifications: Likewise, Outlook can notify users (via a background process) when new mail comes in even if it's not running. A SOLUTION Our proposed solution has two parts. The first, which should be generally useful, is the ability to have a hidden HTML/JS page running in the background that can access the DOM of visible windows. This page should be accessible from windows that the user navigates to. We call this background Javascript window a shared context or a background page. This will enable multiple instances of a web app (e.g. tearoff windows in Gmail) to cleanly access the same user state no matter which windows are open. 
Additionally, we'd like this background page to continue to run after the user has navigated away from the site, and preferably after the user has closed the browser. This will enable us to keep client-side data up-to-date on the user's machine. It will also enable us to download JS in advance. When the user navigates to a web app, all the background page has to do is draw the DOM in the visible window. This should significantly speed up app startup. Additionally, when something happens that requires notification, the background page can launch a visible page with a notification (or use other rich APIs for showing notifications). WHY NOT SHARED WORKERS Shared workers and persistent workers are designed to solve similar problems, but don't meet our needs. The key difference between what we're proposing and earlier proposals for persistent workers is that background pages would be able to launch visible windows and have full DOM access. This is different from the model of workers where all interaction with the DOM has to be done through asynchronous message passing. We would like background pages
Re: [whatwg] CanvasRenderingContext2D.lineTo compatibility problem
On Sat, 11 Jul 2009, Oliver Hunt wrote: While investigating a compatibility issue with http://www.blahbleh.com/clock.php I found that the spec behaviour on CanvasRenderingContext2D.lineTo conflicts with what Gecko implements. The current spec language is The lineTo(x, y) method must do nothing if the context has no subpaths. Otherwise, it must connect the last point in the subpath to the given point (x, y) using a straight line, and must then add the given point (x, y) to the subpath. Gecko appears to treat the empty path case as moveTo(x,y). I'm going to do a bit more investigation into the behaviour of this and the other path manipulation functions to see whether lineTo is special or this logic affects every function (of course any Gecko devs may be able to answer more quickly than I can manually verify). On the *assumption* that my initial analysis is correct I propose that the language be updated to something akin to: The lineTo(x, y) method is equivalent to moveTo(x, y) if the context has no subpaths. Otherwise, it must connect the last point in the subpath to the given point (x, y) using a straight line, and must then add the given point (x, y) to the subpath. On Sat, 11 Jul 2009, Oliver Hunt wrote: Okay the behaviour for lineTo, quadraticCurveTo and bezierCurveTo without an existing subpath (unsure about arcTo; any sane response to arcTo with an empty path results in one of those edge cases where WebKit, Gecko, and Presto all disagree) should probably be changed to (worded better of course :D ): * lineTo(x, y) is equivalent to moveTo(x, y) if the context has no subpaths.
* The quadraticCurveTo(cpx, cpy, x, y) method is equivalent to moveTo(cpx, cpy); quadraticCurveTo(cpx, cpy, x, y); if the context has no subpaths * The bezierCurveTo(cp1x, cp1y, cp2x, cp2y, x, y) method is equivalent to moveTo(cp1x, cp1y); bezierCurveTo(cp1x, cp1y, cp2x, cp2y, x, y); if the context has no subpaths My rationale for this change is that it is a relaxation of existing API -- in the specified API these cases would become no-ops that feed into subsequent calls, eg. lineTo(..);lineTo(..);lineTo(..) will draw nothing as the path never becomes non-empty so none of the calls can ever have an effect, whereas this re-specification would result in subsequent operations drawing something. Fixed the spec as proposed (and also for arcTo(), though as you say, the behaviour there isn't very interoperable). -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
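Oliver's rationale can be illustrated with a toy path model (not the real canvas API): under the old wording a chain of lineTo() calls on an empty path draws nothing at all, while under the proposed wording the first call seeds the subpath so later calls add segments.

```javascript
// Toy model of a single subpath, illustrating the proposed relaxation:
// lineTo() on an empty path acts like moveTo() instead of doing nothing.
// ToyPath is a hypothetical illustration, not CanvasRenderingContext2D.
class ToyPath {
  constructor() {
    this.current = null;   // last point of the subpath, or null if empty
    this.segments = [];    // line segments added so far
  }
  moveTo(x, y) { this.current = [x, y]; }
  lineTo(x, y) {
    if (this.current === null) {
      // Proposed behaviour: treat as moveTo(x, y). Under the old spec
      // text this call would "do nothing", so lineTo();lineTo();lineTo()
      // could never draw anything -- the path never became non-empty.
      this.moveTo(x, y);
      return;
    }
    this.segments.push([this.current, [x, y]]);
    this.current = [x, y];
  }
}
```

With this model, two lineTo() calls on a fresh path yield one drawable segment, which matches what Gecko was already doing.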
Re: [whatwg] bezierCurveTo summary is incorrect
On Sat, 11 Jul 2009, Oliver Hunt wrote: Just noticed while looking at the path modification functions that the summary of bezierCurveTo (at the beginning of Section 4.8.11.1.8) gives bezierCurveTo the signature bezierCurveTo(cpx, cpy, x, y) which is the signature for quadraticCurveTo. The signature should be bezierCurveTo(cp1x, cp1y, cp2x, cp2y, x, y) Fixed. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] input type=url allow URLs without http:// prefix
On Sun, 12 Jul 2009, Bruce Lawson wrote: The eleventy squillion WordPress sites out there that allow comments ask for your web page address as well as name and email. The method of entering a URL does not require the http:// prefix; just beginning the URL with www is accepted. As it's very common for people to drop the http:// prefix on advertising, business cards etc (and who amongst us reads out the prefix when reading a URL on the phone?) I'd like to suggest that input type=url allows the http:// prefix to be optional on input and, if omitted, be assumed when parsing. Assuming you mean user input, it already is allowed to be optional; the spec doesn't prevent the user agent from doing whatever they want in terms of fixups. On Mon, 13 Jul 2009, Ian Pouncey wrote: On Sun, Jul 12, 2009 at 3:48 PM, Kornel Lesinski kor...@geekhood.net wrote: On Sun, 12 Jul 2009 09:46:19 +0100, Bruce Lawson bru...@opera.com wrote: As it's very common for people to drop the http:// prefix on advertising, business cards etc (and who amongst us reads out the prefix when reading a URL on the phone?) I'd like to suggest that input type=url allows the http:// prefix to be optional on input and, if omitted, be assumed when parsing. The spec explicitly allows that the actual value seen and edited by the user in the interface is different from the DOM value of the input, so browsers are free to prepend http:// automatically (and IMHO should; DSK-253195). To make this less ambiguous I would prefer that we talk about making it optional to specify a protocol or scheme name (personal preference for protocol) rather than http:// specifically. While http will be the most common protocol by far it is not the only possibility. The scheme is not optional in the submission format. I have no problems with the idea though, I just think there needs to be a mechanism for highlighting the change to the user rather than this being hidden in the DOM. That's a UI issue, which is more or less out of scope of the spec.
On Mon, 13 Jul 2009, Bruce Lawson wrote: Excellent. And, while I don't doubt you at all, I'm abashed that I missed that nuance, especially as it's explicitly allowed? Where would I find that in the spec? On Mon, 13 Jul 2009, Kornel wrote: The URL state section says that the value in the DOM may be different from the value in the user interface: http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#url-state The example difference given in the spec is URL-escaping, but in my understanding, it should allow prepending of the protocol as well (I admit that last bit is not stated explicitly). Right. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
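A user-agent fixup of the kind Kornel describes might look like the following sketch. `fixupUrlValue` is a hypothetical helper (the spec doesn't mandate any particular fixup); the scheme test follows the RFC 3986 scheme syntax.

```javascript
// Sketch of a UA-style fixup (hypothetical, not from the spec):
// prepend "http://" when the typed value has no scheme, so the
// submitted/DOM value is a full URL while the visible value can stay
// as the user typed it.
function fixupUrlValue(typed) {
  // RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":"
  const hasScheme = /^[a-zA-Z][a-zA-Z0-9+.-]*:/.test(typed);
  return hasScheme ? typed : 'http://' + typed;
}
```

Note this naive test would also treat input like "localhost:8080" as already having a scheme, which is exactly the kind of ambiguity a real UA heuristic has to handle.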
Re: [whatwg] Accessibility for Drag and Drop
On Sun, 12 Jul 2009, Aron Spohr wrote: 1. Every element with the draggable attribute set to true should be focusable by default. I know that tabindex=n would make almost any element focusable but I think making draggable elements focusable is something that shouldn't be left to best practices for webmasters and should be defaulted to by spec. An element that can have focus usually indicates to the user that it supports some kind of interaction. That is definitely the case for elements with the draggable attribute set to true. If they can be focused a user could use the Tab key to focus a draggable element and then press the context-menu key in order to get a copy function offered by the user agent. Maybe elements with the draggable attribute set to true can be added to the list in section 7.5.1 (Sequential focus navigation) so that they get focused by default. Whether draggable elements can be focused or not is up to the user agent. It's much like links -- in some browsers, whether they are focusable or not depends on the operating system accessibility preferences. I think this should only apply to elements which have the draggable attribute set to true explicitly (and not to auto). That ensures that the pre-existing default behaviour of user agents doesn't change when working with ‘a’ and ‘img’ elements. i.e. ‘a’ elements are already focusable and allow this type of behaviour already, img wouldn’t unless the draggable attribute has been set to true explicitly. Again, that would depend on the user agent's behaviour. 2. On the opposite side it’s actually pretty undefined how the user can “find” an area where something can be dropped. It’s obviously very much up to the webmaster if the content on the page indicates that something can be dropped somewhere. And that’s probably ok to some degree. However for accessibility it’s suboptimal.
Similar to the draggable attribute, which indicates that something can be dragged away I believe it would be very useful to have the opposite, which does indicate that an element can accept drops. The user agent can actually find where drops can occur by just acting as if the user had tried to drop everywhere. I'm not really sure how we could make an attribute work for this, since the model allows any element to be a drop target already. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
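For context on Ian's point that any element can already be a drop target: in the drag-and-drop model an element opts in by cancelling the dragover event, which is why a static attribute adds little. A minimal sketch (`markDropTarget` is a hypothetical helper; element and event objects are assumed to follow the DOM event API):

```javascript
// Sketch: an element becomes a drop target by cancelling dragover;
// markDropTarget is a hypothetical helper wiring that up.
function markDropTarget(el, onDrop) {
  // Cancelling dragover tells the UA this element accepts drops.
  el.addEventListener('dragover', e => e.preventDefault());
  el.addEventListener('drop', e => {
    e.preventDefault();
    onDrop(e.dataTransfer); // hand the dragged data to the app
  });
}
```

Because this opt-in happens dynamically in script, a UA that wants to enumerate drop targets would indeed have to simulate drops, as Ian describes.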
Re: [whatwg] Typos in spec
On Sun, 12 Jul 2009, Aryeh Gregor wrote: In 2.1.2: A content attribute is said to change value only if its value new value is different than its previous value; setting an attribute to a value it already has does not change it. should be A content attribute is said to change value only if its new value is different than its previous value; setting an attribute to a value it already has does not change it. Fixed. In 2.1.5: This includes such encodings as Shift_JIS and variants of ISO-2022, even though it is possible in this encodings for bytes like 0x70 to be part of longer sequences that are unrelated to their interpretation as ASCII. should be This includes such encodings as Shift_JIS and variants of ISO-2022, even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences that are unrelated to their interpretation as ASCII. Fixed. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] nostyle consideration
On Sun, 12 Jul 2009, Joshua Brickner wrote: A tag like nostyle would be useful to 'reset' the styles for the specified part of the document. Essentially it would clear any inherited styles from being applied to its contents. This idea is not quite the same as the one proposed but I've sometimes had a need for something like this. A better name might be styleless or nostyles. This seems like a CSS feature. I would recommend proposing this use case to the CSS working group. I don't think HTML is the appropriate place to address this problem. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Installed Apps
2) For multi-process browsers like Chrome, there seem to be limitations as to what can actually be accessed between processes (direct DOM access across process boundaries seems problematic for example). Do you have ideas about how to address this, since assumedly the page calling getInstalledApp() could be running under some arbitrary process? Even with single-process browsers you would have to handle threading issues in a way that is just not done in all browsers yet. For a) - Having some way to load large amounts of cached javascript quickly in a new page seems like an issue that would be nice to address in general, not just for pages that install hidden pages. Are there other approaches worth trying here? Loading of cached JavaScript isn't really that slow. I think the real issue here is client state. It's often a good idea to have a copy of running scripts in each process for stability anyway. However, cached parsing/pre-compilation (if available) of scripts might be a generally good idea. Perhaps some kind of process cloning like *nix forks? IMHO, this could be solved in a cleaner and more memory-efficient way with some form of persistent workers rather than a hidden page. But I might be missing something. On Tue, Jul 28, 2009 at 1:30 AM, Drew Wilson atwil...@google.com wrote: This sounds really powerful, and seems like a natural evolution of some of the stuff we've discussed previously for persistent workers. A few comments/notes: 1) It sounds like this background page would act like any other web page with respect to its processing model (i.e. like other pages, script running in this page would be limited as to how long it can run, as opposed to workers which can run for any arbitrary length of time). This seems reasonable, especially since this page could assumedly still create workers if it needs to do true background processing. It's really more of a hidden page than a background page?
2) For multi-process browsers like Chrome, there seem to be limitations as to what can actually be accessed between processes (direct DOM access across process boundaries seems problematic for example). Do you have ideas about how to address this, since assumedly the page calling getInstalledApp() could be running under some arbitrary process? 3) This approach has another advantage over something like workers in that a hidden page can do cross-domain access/sharing via iframes, whereas workers don't really have any facility for cross-domain access. 4) I had a quick question/clarification about the motivation behind this - aside from the advantages described above, it sounds like the specific problem you are solving by a hidden page is a) you don't have to load javascript in a new page (which I'm assuming must be slow), and b) you don't have to load client state in the new page. For a) - Having some way to load large amounts of cached javascript quickly in a new page seems like an issue that would be nice to address in general, not just for pages that install hidden pages. Are there other approaches worth trying here? For b) - How much client state are we talking about? If you were to pursue this approach using workers to maintain client state, how much data would you expect to be transferred to the client app on startup? We're seeing fairly low latency for client-worker communication, so in theory it shouldn't be a huge source of slowdown. I agree that the programming model of the hidden page is much cleaner/more familiar than rewriting applications to use asynchronous messaging, so that may be sufficient motivation for this. -atw On Mon, Jul 27, 2009 at 11:50 AM, Michael Davidson m...@google.com wrote: Hello folks - I'm an engineer on the Gmail team. We've been working on a prototype with the Chrome team to make the Gmail experience better. We thought we'd throw out our ideas to the list to get some feedback. 
THE PROBLEM We would like to enable rich internet applications to achieve feature parity with desktop applications. I will use Gmail and Outlook as examples for stating the problems we hope to solve. -- Slow startup: When a user navigates to mail.google.com, multiple server requests are required to render the page. The Javascript is cacheable, but personal data (e.g. the list of emails to show) is not. New releases of Gmail that require JS downloads are even slower to load. -- Native apps like Outlook can (and do) run background processes on the user's machine to make sure that data is always up-to-date. -- Notifications: Likewise, Outlook can notify users (via a background process) when new mail comes in even if it's not running. A SOLUTION Our proposed solution has two parts. The first, which should be generally useful, is the ability to have a hidden HTML/JS page running in the background that can access the DOM of visible windows. This page should be accessible from windows that the user navigates to. We call this
Re: [whatwg] DOMTokenList is unordered but yet requires sorting
On Sun, 12 Jul 2009, Jonas Sicking wrote: Oh, I have foreseen that. Is it really necessary to remove duplicates? I imagine DOMTokenList to be similar to what can be achieved with a String.split(), but then it would be just more duplicate functionality. If we don't remove duplicates, then things like the .toggle() method could have some quite weird effects. Such as? Such as .length changing by more than 1 after a call to .toggle(). I definitely think it'd be worth avoiding the code complexity and perf hit of having the implementation remove duplicates if they appear in the class attribute given how extremely rare duplicates are. Fair enough. I've made DOMTokenList not remove duplicates. On Mon, 13 Jul 2009, Sylvain wrote: This is a bit unrelated, but when looking at the DOMTokenList implementation, I had an idea about an alternative algorithm that could be easier to implement and could also be described more simply in the spec. The disadvantage is that the DOMTokenList methods mutating the underlying string wouldn't preserve existing whitespace (which the current algorithms try hard to do). The idea is that any DOMTokenList method that mutates the underlying string would do: - split the attribute into unique tokens (preserving order). - add or remove the token according to the method called. - rebuild the attribute string by concatenating tokens together (with a single space). At first, this may look inefficient (if implemented naively). But I guess that implementations will usually keep both the attribute string and a list of tokens in memory, so they wouldn't have to tokenize the string on every mutation. There is a small performance hit during attribute tokenization: the list of tokens would need to keep only unique tokens. But after that, the DOMTokenList methods are very simple: length/item() don't need to take care of duplicates, add/remove/toggle are simple list manipulation (the attribute string could be lazily generated from the token list when needed).
To summarize: pros: simpler spec algorithms, simpler implementation; cons: less whitespace preservation, small perf hit during tokenization. I don't know if I'm missing something. Does this sound reasonable? It ends up being not much simpler since you still have to deal with direct changes to the underlying string, as far as I can tell. On Mon, 13 Jul 2009, Jonas Sicking wrote: I do agree that the spec seems to go extraordinarily far to not touch whitespace. Normalizing whitespace when parsing is a bad idea, but once the user modifies the DOMTokenList, I don't see a lot of value in maintaining whitespace exactly as it was. Ian: What is the reason for the fairly complicated code to deal with removals? At least in Gecko it would be much simpler to just regenerate the string completely. That way generating the string-value could just be dropped on modifications, and regenerated lazily when requested. In general, I try to be as conservative as possible in making changes to the DOM. Are the algorithms really as complicated as you're making out? They seem pretty trivial to me. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
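Jonas's point about .toggle() can be made concrete with a toy token list (a hypothetical illustration, not a real DOMTokenList): if duplicates survive parsing but remove() strips every occurrence, a single toggle() can change .length by more than 1.

```javascript
// Toy token list: duplicates are kept at parse time (as the spec now
// allows), but removal strips every occurrence, so toggle() can shrink
// the list by more than one entry.
class ToyTokenList {
  constructor(attr) {
    this.tokens = attr.split(/\s+/).filter(Boolean);
  }
  get length() { return this.tokens.length; }
  contains(t) { return this.tokens.includes(t); }
  toggle(t) {
    if (this.contains(t)) {
      this.tokens = this.tokens.filter(x => x !== t); // removes all copies
      return false;
    }
    this.tokens.push(t);
    return true;
  }
}
```

For the attribute value "a b a", one toggle('a') drops .length from 3 to 1, which is the "quite weird effect" being discussed.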
Re: [whatwg] Issues with Web Sockets API
On Jul 27, 2009, at 2:14 PM, Alexey Proskuryakov wrote: On 27.07.2009, at 12:35, Maciej Stachowiak wrote: However, I do not think that raising an exception is an appropriate answer. Often, the TCP implementation takes a part of the data given to it, and asks to resubmit the rest later. So, just returning an integer result from send() would be best in my opinion. With WebSocket, another possibility is for the implementation to buffer pending data that could not yet be sent to the TCP layer, so that the client of WebSocket doesn't have to be exposed to system limitations. At that point, an exception is only needed if the implementation runs out of memory for buffering. With a system TCP implementation, the buffering would be in kernel space, which is a scarce resource, but user space memory inside the implementation is no more scarce than user space memory held by the Web application waiting to send to the WebSocket. I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections. If an application could usefully choose to do something other than buffer in memory (as applies to both of your examples), then yes, it would be useful to tell it when to back off on the send rate. But this could also be combined with buffering inside the implementation but outside the kernel, so the client of WebSocket never has to resend whole or partial packets, it can just note that it should back off on the send rate, and delay future packets. Regards, Maciej
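One way for an application to back off on its send rate, as Maciej suggests, is to poll the WebSocket bufferedAmount attribute (a real attribute of the API) before handing over more data. A sketch, with an illustrative threshold and retry interval (`sendPaced` is a hypothetical helper):

```javascript
// Sketch: pace sends by checking bufferedAmount so the app backs off
// instead of buffering without bound. highWater and intervalMs are
// illustrative values, not from any spec.
function sendPaced(ws, chunks, highWater = 1 << 20, intervalMs = 50) {
  const queue = chunks.slice();
  const pump = () => {
    // Send while the implementation's outgoing buffer is below the mark.
    while (queue.length && ws.bufferedAmount < highWater) {
      ws.send(queue.shift());
    }
    // Buffer is full: back off and retry shortly.
    if (queue.length) setTimeout(pump, intervalMs);
  };
  pump();
}
```

This keeps the application's own memory use bounded to the un-sent queue and lets it adapt (e.g. drop video quality) when the queue keeps growing.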
Re: [whatwg] Issues with Web Sockets API
On Jul 27, 2009, at 2:44 PM, Drew Wilson wrote: There's another option besides blocking, raising an exception, and dropping data: unlimited buffering in user space. So I'm saying we should not put any limits on the amount of user-space buffering we're willing to do, any more than we put any limits on the amount of other types of user-space memory allocation a page can perform. I think even unlimited buffering needs to be combined with at least a hint to the WebSocket client to back off the send rate, because it's possible to send so much data that it exceeds the available address space, for example when uploading a very large file piece by piece, or when sending a live media stream that requires more bandwidth than the connection can deliver. In the first case, it is possible, though highly undesirable, to spool the data to be sent to disk; in the latter case, doing that would just inevitably fill the disk. Obviously we need more web platform capabilities to make such use cases a reality, but they are foreseeable and we should deal with them in some reasonable way. Regards, Maciej
Re: [whatwg] Serving up Theora video in the real world
On Mon, Jul 27, 2009 at 3:48 PM, Ian Hickson i...@hixie.ch wrote: On Fri, 10 Jul 2009, Aryeh Gregor wrote: On Fri, Jul 10, 2009 at 4:57 AM, Robert O'Callahanrob...@ocallahan.org wrote: The way we've implemented in Firefox, we'll return yes if you specify a codecs parameter and we support every codec in your list. So v.canPlayType(video/ogg; codecs=vorbis,theora) returns probably in Firefox 3.5. I think this is reasonable because I believe that, modulo bugs in our implementation, we support the full Theora and Vorbis specs. On the other hand, we will return maybe for v.canPlayType(video/ogg). I think this distinction will be useful. In what use-case would an author want to make use of the distinction? In either case, your only course of action is to try playing the video. Maybe you'd try testing all the video types you support, and if one is maybe while another is probably you'd go with probably? That seems like a pretty marginal use case to help for the sake of such a confusing API. Programmers expect binary logic, not ternary (look at the complaints about SQL's NULL). The main use case is ordering. If you have ten variants, you might want to try the probablys before the maybes, especially if there are a lot of weird codecs involved, such that the maybes might be able to play somewhat, just not as well as the probablys. On Fri, 10 Jul 2009, Philip Jagenstedt wrote: I agree that the current interface is ugly and quite fail to see what the use for it is. With a boolean return value, canPlayType(application/ogg) would return true if one can demux Ogg streams. canPlayType(application/ogg; codecs=vorbis,dirac) would return true if one can demux Ogg and decode vorbis + dirac. What would canPlayType(video/ogg; codecs=vorbis) return? There's not enough information there to say whether or not you can play a stream labeled that way. Unless there's some compelling use case that can't be handled with the above I'd support canPlayType returning a boolean. 
The only issue I can see is that canPlayType(foo)==true might be interpreted as a strong promise of playability which can't be given. In that case just rename the function to wouldTryTypeInResourceSelection (no, not really). You can use the method as it is now as a boolean method, in practice. However, I think there is value in being explicit that a true return value isn't necessarily a confident answer. On Fri, 10 Jul 2009, Philip Jagenstedt wrote: Before someone conjures up an example where this doesn't exactly match the current behavior, the point is simply that by calling canPlayType without a codecs list or with specific codecs, you can learn exactly what is supported and not out of the container formats and codecs you are interested in, without the need for the strange probably/maybe API. On Sat, 11 Jul 2009, Robert O'Callahan wrote: I think it would be somewhat counterintuitive for canPlayType(video/ogg) to return true, but canPlayType(video/ogg; codecs=dirac) to return false. On Sat, 11 Jul 2009, Philip Jägenstedt wrote: Well I disagree of course, because having canPlayType(video/ogg) mean anything else than can I demux Ogg streams is pointless. On Sat, 11 Jul 2009, Robert O'Callahan wrote: So you want canPlayType to mean one thing when provided a type without codecs, and another thing when provided a type with codecs. I don't think that's a good idea. Anyway, it's too late. If you care passionately about this you should have reopened this discussion months ago, not now that two browsers have just shipped support for the API in the spec. On Sun, 12 Jul 2009, Robert O'Callahan wrote: IIRC some browsers using system media frameworks don't know what codecs they support, so they still need to be able to answer maybe when codecs are provided; you still need a three-valued result. I still think it would confuse authors if you return true for canPlayType(T) and false for canPlayType(U) where U is a subset of T. I'm with Robert on this.
The idea is that you can take the actual MIME type of a file, and find out what the odds are that the file will play ok. In practice, the odds are lower with video/ogg than a type that explicitly lists a supported codec. On Sun, 12 Jul 2009, Philip Jägenstedt wrote: Not that I expect this discussion to go anywhere, but out of curiosity I checked how Firefox/Safari/Chrome actually implement canPlayType: http://wiki.whatwg.org/wiki/Video_type_parameters#Browser_Support Firefox is conservative and honest (except maybe for audio/wav; codecs=0, what could you do with the RIFF DATA chunk?) Safari gets maybe/probably backwards compared to what the spec suggests. Chrome seems to ignore the codecs parameter, claiming probably even for bogus codecs. Authors obviously can't trust the distinction between maybe and probably to any extent. That certainly is
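Hixie's ordering use case above ("try the probablys before the maybes") can be sketched as a small helper. This is illustrative only: rankSources and the mock element are not part of any spec, and the mock's answers assume Firefox-like behavior (container-plus-codecs gets "probably", container alone "maybe").

```javascript
// Rank candidate sources by canPlayType's three-valued answer:
// "probably" before "maybe", and drop "" (definitely unsupported).
function rankSources(video, sources) {
  const score = { probably: 2, maybe: 1 };
  return sources
    .map(src => ({ src, answer: video.canPlayType(src.type) }))
    .filter(s => s.answer !== '')                      // can't play: skip entirely
    .sort((a, b) => score[b.answer] - score[a.answer]) // probably first
    .map(s => s.src);
}

// Mock standing in for a real <video> element (assumed answers, see lead-in).
const mockVideo = {
  canPlayType(type) {
    if (type === 'video/ogg; codecs="theora, vorbis"') return 'probably';
    if (type === 'video/ogg') return 'maybe';
    return '';
  }
};

const ordered = rankSources(mockVideo, [
  { url: 'a.ogv', type: 'video/ogg' },
  { url: 'b.ogv', type: 'video/ogg; codecs="theora, vorbis"' },
  { url: 'c.mp4', type: 'video/mp4' }
]);
// ordered: b.ogv ("probably") first, then a.ogv ("maybe"); c.mp4 dropped
```

With a boolean API this ordering would be impossible: a.ogv and b.ogv would be indistinguishable.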
Re: [whatwg] Installed Apps
On Mon, Jul 27, 2009 at 11:50 AM, Michael Davidson m...@google.com wrote: THE PROBLEM snip feature parity with desktop applications. snip A SOLUTION snip hidden HTML/JS page running in the background that can access the DOM of visible windows. snip KNOWN ISSUES As mentioned in earlier discussions about persistent workers, permissioning UI is a major issue. The thing that rubs me the wrong way about this is not that there would be a permissioning UI, it would be the permission decision I would have to convey through said UI. How do I know that I want to let some site have the same permissions I give Gmail? Do you really, really need this? Gmail is already always running on my desktop.
Re: [whatwg] Installed Apps
On Tue, Jul 28, 2009 at 6:50 AM, Michael Davidson m...@google.com wrote: As mentioned in earlier discussions about persistent workers, permissioning UI is a major issue. Indeed, the most difficult issue here is security and the permissions UI, which you haven't addressed at all. Currently, when you close a browser tab, the application in that tab is gone. This is a very important user expectation that we can't just break. Maybe you could have a browser window containing regular tabs, but presented differently, with just icons and titles in some sort of tray, so users can see which applications are running in the background? Rob -- He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all. [Isaiah 53:5-6]
Re: [whatwg] Issues with Web Sockets API
Why not just allow unlimited buffering, but also provide an API to query how much data is currently buffered (approximate only, so it would be OK to just return the size of data buffered in user space)? Then applications that care and can adapt can do so. But most applications will not need to. The problem of partial writes being incorrectly handled is pernicious and I definitely think partial writes should not be exposed to applications. Rob -- He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all. [Isaiah 53:5-6]
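The "query how much is buffered, then adapt" pattern proposed above can be sketched as follows. The helper, the mock socket, and the HIGH_WATER threshold are all illustrative assumptions; the shipped WebSocket API later exposed exactly this kind of counter as bufferedAmount.

```javascript
const HIGH_WATER = 64 * 1024; // arbitrary cap: back off once this much is queued

// Queue chunks until the buffer query reports too much outstanding data,
// then stop; the caller retries the remainder later (e.g. on a timer).
function sendChunks(socket, chunks) {
  let sent = 0;
  for (const chunk of chunks) {
    if (socket.bufferedAmount > HIGH_WATER) break; // adapt instead of growing forever
    socket.send(chunk);
    sent++;
  }
  return sent;
}

// Mock socket that never drains, to show the back-off kicking in.
const mock = {
  bufferedAmount: 0,
  send(data) { this.bufferedAmount += data.length; }
};
const n = sendChunks(mock, Array(10).fill('x'.repeat(32 * 1024)));
// Only the first three 32 KiB chunks are queued before the cap is exceeded.
```

Applications that don't care simply never read bufferedAmount and get unlimited buffering, which is the "most applications will not need to" case above.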
Re: [whatwg] Issues with Web Sockets API
Obviously we need more web platform capabilities to make such use cases a reality, but they are foreseeable and we should deal with them in some reasonable way. Couldn't agree more. The proposed websocket interface is too dumbed down. The caller doesn't know what the impl is doing, and the impl doesn't know what the caller is trying to do. As a consequence, there is no reasonable action that either can take when buffers start overflowing. Typically, the network layer provides sufficient status info to its caller, allowing the higher-level code to do something reasonable in light of how the network layer is performing. That kind of status info is simply missing from the websocket interface. I think it's possible to add to the interface features that would facilitate more demanding use cases without complicating the simple use cases. I think that would be an excellent goal for this API. On Mon, Jul 27, 2009 at 5:30 PM, Maciej Stachowiak m...@apple.com wrote: On Jul 27, 2009, at 2:44 PM, Drew Wilson wrote: There's another option besides blocking, raising an exception, and dropping data: unlimited buffering in user space. So I'm saying we should not put any limits on the amount of user-space buffering we're willing to do, any more than we put any limits on the amount of other types of user-space memory allocation a page can perform. I think even unlimited buffering needs to be combined with at least a hint to the WebSocket client to back off the send rate, because it's possible to send so much data that it exceeds the available address space, for example when uploading a very large file piece by piece, or when sending a live media stream that requires more bandwidth than the connection can deliver. In the first case, it is possible, though highly undesirable, to spool the data to be sent to disk; in the latter case, doing that would just inevitably fill the disk.
Obviously we need more web platform capabilities to make such use cases a reality, but they are foreseeable and we should deal with them in some reasonable way. Regards, Maciej
Re: [whatwg] HTML 5 video tag questions
There were some proposals to change video's fallback model to allow markup to be used as an alternative to a video stream, so that once the various alternative videos had been tried, the user agent would make video act as a regular display:block element and no longer act as a video viewport. I haven't changed video to do this, for the following reasons: * We tried this with object and we had years of pain. I'm not at all convinced that we wouldn't have the same problems here again. * We won't need fallback as soon as we resolve the common codec issue, which is still being worked on. I don't think we should redesign the model to address a transient problem in the platform. * This model would either have to make video not dynamic -- once you fail to load a video, you can no longer use it, because it's showing fallback -- or would require us to introduce yet more state to keep track of whether we should be displaying fallback or not. * If it's not dynamic, we'd have to define what happens when you use the API when the fallback is being shown. Especially for cases like off- page audio, this would lead to a quite confusing authoring experience. * We'd need to have an explicit way of triggering the behaviour for source, given the way the media loading algorithm works. * It's a change, and I really would like to put the brakes on the number of non-critical changes we make. When the element was designed, we explicitly didn't do this -- that's why we have source, and is why the fallback is explicitly intended for legacy user agents, and not user agents that don't support the given videos. Which is to say, we didn't accidentally stumble into this design. All in all, I'm very skeptical that this is a good idea. What's more, any solution would almost certainly involve scripting anyway, and once you're doing things with script, it's really trivial to do the fallback manually, especially with all the hooks video provides, so the use cases don't seem that compelling. 
-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
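The manual, script-driven fallback described above as trivial can be sketched like this: when the last source candidate signals failure, resource selection has exhausted every option, and the author swaps in replacement markup. The tiny event mock below stands in for real DOM elements so the sketch is self-contained; all names are illustrative.

```javascript
// Attach a handler to the last <source>-like candidate; in a browser, the
// 'error' event fires on it once no candidate video could be used.
function installFallback(lastSource, onExhausted) {
  lastSource.addEventListener('error', onExhausted);
}

// --- mock with just enough of the DOM event API for this sketch ---
function makeMockElement() {
  const listeners = {};
  return {
    addEventListener(type, fn) {
      listeners[type] = listeners[type] || [];
      listeners[type].push(fn);
    },
    dispatchEvent(type) { (listeners[type] || []).forEach(fn => fn()); }
  };
}

let fellBack = false;
const lastSource = makeMockElement();
installFallback(lastSource, () => { fellBack = true; });
lastSource.dispatchEvent('error'); // simulate: every candidate failed
// fellBack is now true; a real page would replace the video with markup here
```

This keeps the element's own fallback content reserved for legacy user agents, as the design above intends, while still letting authors handle the no-supported-codec case.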
Re: [whatwg] Installed Apps
On Jul 27, 2009, at 6:42 PM, Robert O'Callahan wrote: On Tue, Jul 28, 2009 at 6:50 AM, Michael Davidson m...@google.com wrote: As mentioned in earlier discussions about persistent workers, permissioning UI is a major issue. Indeed, the most difficult issue here is security and the permissions UI, which you haven't addressed at all. Currently, when you close a browser tab, the application in that tab is gone. This is a very important user expectation that we can't just break. I share this concern. Violating this expectation seems like it could be a vector for malware, in a way that a permissions dialog would not meaningfully address. Regards, Maciej
Re: [whatwg] Installed Apps
On Mon, Jul 27, 2009 at 8:42 PM, Robert O'Callahanrob...@ocallahan.org wrote: Indeed, the most difficult issue here is security and the permissions UI, which you haven't addressed at all. One obvious solution would be to have installation UI like extensions, but somewhat less scary (no signing requirements, countdowns, this will explode your computer warnings, etc.). These would effectively be sandboxed browser extensions. So you could view and disable any background windows from the Add-Ons menu or browser equivalent. Even browsers that don't support extensions have some UI already for plugins that could be partially reused. I'm not clear how the UI requirements here are different from persistent workers, though. Those also persist after the user navigates away, right? Maybe you could have a browser window containing regular tabs, but presented differently, with just icons and titles in some sort of tray, so users can see which applications are running in the background? So the browser becomes still more of a mini-OS. Tabs already usurped some of the window manager's functionality, and now we're talking about having system trays too. What next, a clock in the corner? :) I'm not saying this is bad, necessarily, but it's something to keep in mind. We have an operating system independent from the browser that provides general-purpose process management for a reason. (Unless maybe you use Chrome OS!)
Re: [whatwg] Installed Apps
On Jul 27, 2009, at 7:13 PM, Aryeh Gregor wrote: I'm not clear how the UI requirements here are different from persistent workers, though. Those also persist after the user navigates away, right? Persistent workers are even more of a security risk, since they are supposed to persist even after the browser has been restarted, or after the system has been rebooted. Persistent workers should be renamed to BotNet Construction Kit. Regards, Maciej
Re: [whatwg] Installed Apps
On Mon, Jul 27, 2009 at 9:39 PM, Maciej Stachowiak m...@apple.com wrote: Persistent workers are even more of a security risk, since they are supposed to persist even after the browser has been restarted, or after the system has been rebooted. Persistent workers should be renamed to BotNet Construction Kit. Surely this proposal also would have the pages run even after the browser has been restarted, or the system rebooted? It was suggested that ideally they'd continue running even when the browser has been shut down!
Re: [whatwg] DOMTokenList is unordered but yet requires sorting
On Tue, Jul 28, 2009 at 2:17 AM, Ian Hickson i...@hixie.ch wrote: On Sun, 12 Jul 2009, Jonas Sicking wrote: Oh, I have foreseen that. Is it really necessary to remove duplicates? I imagine DOMTokenList to be similar to what can be achieved with a String.split(), but then it would be just more duplicate functionality. If we don't remove duplicates, then things like the .toggle() method could have some quite weird effects. Such as? Such as .length changing by more than 1 after a call to .toggle(). I guess that couldn't have happened, because .length counted only the unique tokens. I definitely think it'd be worth avoiding the code complexity and perf hit of having the implementation remove duplicates if they appear in the class attribute given how extremely rare duplicates are. Fair enough. I've made DOMTokenList not remove duplicates. ok, I realize now that this is about the duplicates in .length and item(). By the way, preserving duplicates shouldn't be much code complexity if I'm not mistaken. The only required code change would be to use a hashset when parsing the attribute in order to only insert unique tokens in the token vector. Then DOMTokenList.length would return the token vector length and .item() get the token by index. I don't think anything actually depends on keeping duplicate tokens in the token vector. Then there would be a small perf hit when parsing attributes with more than one token. To summarize: pros: simpler spec algorithms, simpler implementation cons: less whitespace preservation, small perf hit during tokenization I don't know if I'm missing something. Does this sound reasonable? It ends up being not much simpler since you still have to deal with direct changes to the underlying string, as far as I can tell. I don't think changing the underlying string is related to that algorithm (from an implementation point of view). On setting, the tokens would be deleted and the attribute parsed again.
On Mon, 13 Jul 2009, Jonas Sicking wrote: I do agree that the spec seems to go extraordinarily far to not touch whitespace. Normalizing whitespace when parsing is a bad idea, but once the user modifies the DOMTokenList, I don't see a lot of value in maintaining whitespace exactly as it was. Ian: What is the reason for the fairly complicated code to deal with removals? At least in Gecko it would be much simpler to just regenerate the string completely. That way generating the string-value could just be dropped on modifications, and regenerated lazily when requested. In general, I try to be as conservative as possible in making changes to the DOM. Are the algorithms really as complicated as you're making out? They seem pretty trivial to me. The remove() algorithm is about 50 lines with whitespace and comments. Still, that's not a big cost, and I guess that preserving whitespace may be closer to what DOMTokenList API consumers would expect. Sylvain
Re: [whatwg] New HTML5 spec GIT collaboration repository
Geoffrey Sneddon wrote: Manu Sporny wrote: 3. Running the Anolis post-processor on the newly modified spec. Is there any reason you use --allow-duplicate-dfns? Legacy cruft. There was a time that I had duplicate dfns while attempting to figure something else out. The latest commit to the master branch has it removed - thanks :) Likewise, you probably don't want --w3c-compat (the name is slightly misleading, it provides compatibility with the CSS WG's CSS3 Module Postprocessor, not with any W3C pubrules). Ah, I thought it was required to generate some W3C-specific HTML. Removed as well, thanks for the pointer. On the whole I'd recommend running it with: --w3c-compat-xref-a-placement --parser=lxml.html --output-encoding=us-ascii Done, those are the default flags that the HTML5 git repo uses now to build all of the specifications. The latter two options require Anolis 1.1, which is just as stable as 1.0. I believe those options are identical to how Hixie runs it through PMS. Seeing as how building Python eggs and using Mercurial is scary for some people, would it be okay if I included the Anolis app into the HTML5 git repository? Your license allows this, but I thought I'd ask first in case you wanted to collaborate on it in a particular way. I can either track updates from the mercurial anolis source repo, or give you commit access to the HTML5 git repo so that you can continue to modify Anolis there. Let me know which you would prefer... -- manu -- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
Re: [whatwg] DOMTokenList is unordered but yet requires sorting
On Mon, 13 Jul 2009, Sylvain wrote: This is a bit unrelated, but when looking at the DOMTokenList implementation, I had an idea about an alternative algorithm that could be easier to implement and could also be described more simply in the spec. The disadvantage is that the DOMTokenList methods mutating the underlying string wouldn't preserve existing whitespace (which the current algorithms try hard to do). The idea is that any DOMTokenList method that mutates the underlying string would do: - split the attribute in unique tokens (preserving order). - add or remove the token according to the method called. - rebuild the attribute string by concatenating tokens together (with a single space). At first, this may look inefficient (if implemented naively). But I guess that implementations will usually keep both the attribute string and a list of tokens in memory, so they wouldn't have to tokenize the string on every mutation. There is a small performance hit during attribute tokenization: the list of tokens would need to keep only unique tokens. But after that, the DOMTokenList methods are very simple: length/item() don't need to take care of duplicates, add/remove/toggle are simple list manipulation (the attribute string could be lazily generated from the token list when needed). To summarize: pros: simpler spec algorithms, simpler implementation cons: less whitespace preservation, small perf hit during tokenization I don't know if I'm missing something. Does this sound reasonable? It ends up being not much simpler since you still have to deal with direct changes to the underlying string, as far as I can tell. On changes to the underlying string (using .setAttribute) you always have to reparse from scratch anyway, so doesn't seem like that matters here? On Mon, 13 Jul 2009, Jonas Sicking wrote: I do agree that the spec seems to go extraordinarily far to not touch whitespace.
Normalizing whitespace when parsing is a bad idea, but once the user modifies the DOMTokenList, I don't see a lot of value in maintaining whitespace exactly as it was. Ian: What is the reason for the fairly complicated code to deal with removals? At least in Gecko it would be much simpler to just regenerate the string completely. That way generating the string-value could just be dropped on modifications, and regenerated lazily when requested. In general, I try to be as conservative as possible in making changes to the DOM. Are the algorithms really as complicated as you're making out? They seem pretty trivial to me. At least in the Gecko implementation it's significantly more complex than not normalizing whitespace. The way that the implementation works in Gecko is: When a class attribute is set (during parsing or using setAttribute), we parse the classlist into a list of tokens. We additionally keep around the original string in order to preserve a correct DOM (actually, as an optimization, we only do this if needed). This token-list is then used during Selector matching and during getElementsByClassName. So far I would expect most implementations to match this. It would be very nice if DOMTokenList could be implemented as simply exposing this internal token list. With the recent change to not remove duplicates, reading operations like .length and .item(n) map directly to reading from this token list. All very nice. However writing operations such as .add and .remove require operating on the string rather than the internal token-list. The current spec requires .remove to duplicate the tokenization process (granted, a pretty simple task) and modify the string while tokenizing. It would be significantly simpler if you could just modify the token-list as needed and then regenerate the string from the token-list. / Jonas
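The simpler remove() Jonas argues for can be sketched in a few lines: tokenize, drop the token from the list, and regenerate the attribute by joining with single spaces. This is a sketch of the proposed approach, not the spec's whitespace-preserving algorithm; the function name is illustrative.

```javascript
// Remove every occurrence of `token` from a space-separated attribute value,
// regenerating the string from the token list (whitespace is normalized).
function removeToken(attrValue, token) {
  const tokens = attrValue.split(/\s+/).filter(t => t !== '');
  const kept = tokens.filter(t => t !== token); // also drops duplicates of it
  return kept.join(' ');
}

removeToken('  a   b  a c ', 'a'); // "b c" — original spacing is not preserved
```

The cost is exactly the con listed above: '  a   b  a c ' comes back as 'b c', so the attribute's original whitespace is lost on the first mutation.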
Re: [whatwg] Installed Apps
On Mon, Jul 27, 2009 at 11:50 AM, Michael Davidson m...@google.com wrote: THE PROBLEM snip feature parity with desktop applications. snip A SOLUTION snip hidden HTML/JS page running in the background that can access the DOM of visible windows. snip KNOWN ISSUES As mentioned in earlier discussions about persistent workers, permissioning UI is a major issue. Isn't this what Google Gears was created to handle? I run Google Reader from my Quick Launch, like an application, and sometimes have it open all day. It notifies me (though silently) of new items. The only improvement I could see would be to make it minimize to the tray and make popup notifications, as well as improve the offline features. I'm sure web workers are as limited as they are for security concerns.
Re: [whatwg] DOMTokenList is unordered but yet requires sorting
On Mon, Jul 27, 2009 at 8:24 PM, Sylvain Pasche sylvain.pas...@gmail.com wrote: On Tue, Jul 28, 2009 at 2:17 AM, Ian Hickson i...@hixie.ch wrote: On Sun, 12 Jul 2009, Jonas Sicking wrote: Oh, I have foreseen that. Is it really necessary to remove duplicates? I imagine DOMTokenList to be similar to what can be achieved with a String.split(), but then it would be just more duplicate functionality. If we don't remove duplicates, then things like the .toggle() method could have some quite weird effects. Such as? Such as .length changing by more than 1 after a call to .toggle(). I guess that couldn't have happened, because .length counted only the unique tokens. I definitely think it'd be worth avoiding the code complexity and perf hit of having the implementation remove duplicates if they appear in the class attribute given how extremely rare duplicates are. Fair enough. I've made DOMTokenList not remove duplicates. ok, I realize now that this is about the duplicates in .length and item(). By the way, preserving duplicates shouldn't be much code complexity if I'm not mistaken. I take it you mean *removing* duplicates here, right? The only required code change would be to use a hashset when parsing the attribute in order to only insert unique tokens in the token vector. Then DOMTokenList.length would return the token vector length and .item() get the token by index. I don't think anything actually depends on keeping duplicate tokens in the token vector. Then there would be a small perf hit when parsing attributes with more than one token. It's certainly doable to do this at the time when the token-list is parsed. However given how extremely rare duplicated classnames are (I can't recall ever seeing it in a real page), I think any code spent on dealing with it is a waste. On Mon, 13 Jul 2009, Jonas Sicking wrote: I do agree that the spec seems to go extraordinarily far to not touch whitespace.
Normalizing whitespace when parsing is a bad idea, but once the user modifies the DOMTokenList, I don't see a lot of value in maintaining whitespace exactly as it was. Ian: What is the reason for the fairly complicated code to deal with removals? At least in Gecko it would be much simpler to just regenerate the string completely. That way generating the string-value could just be dropped on modifications, and regenerated lazily when requested. In general, I try to be as conservative as possible in making changes to the DOM. Are the algorithms really as complicated as you're making out? They seem pretty trivial to me. The remove() algorithm is about 50 lines with whitespace and comments. After all, that's not a big cost and I guess that preserving whitespace may be closer to what DOMTokenList API consumers would expect. The code would be 7 lines if we didn't need to preserve whitespace:

nsAttrValue newAttr(aAttr);
newAttr->ResetMiscAtomOrString();
nsCOMPtr<nsIAtom> atom = do_GetAtom(aToken);
while (newAttr->GetAtomArrayValue().RemoveElement(atom));
nsAutoString newValue;
newAttr.ToString(newValue);
mElement->SetAttr(...);

If you spent a few more lines of code you could even avoid serializing the token-list and call SetAttrAndNotify instead of SetAttr. / Jonas
Re: [whatwg] A New Way Forward for HTML5 (revised)
Peter Kasting wrote: On Mon, Jul 27, 2009 at 12:06 PM, John Foliot jfol...@stanford.edu mailto:jfol...@stanford.edu wrote: That said, the barrier to equal entry remains high: http://burningbird.net/node/28 I don't necessarily agree with most of Shelley's take on the situation. I do agree with the point that we need to make contributing to HTML5 easier for those without the technical skills required for source control. So, this response has nothing to do with the post that John linked to or Shelley's take on the situation (just making those points clear). I'm beginning to suspect that this whole line of conversation is specific to RDFa, which is a discussion I never took part in. No, it is not specific to RDFa. If it were specific to RDFa, I would have said that it was specific to RDFa and wouldn't have gone to the trouble of writing the Restructuring HTML5 document. The RDFa discussion triggered my current thinking on how this spec is being put together, the XHTML2 work being halted added to the concern, others (both inside and outside WHAT WG) helped to focus the issues. They are all aspects of the document, but are not end-goals. Here's why I'm not that concerned about RDFa at this point in time: Even if it isn't in the HTML5 specification: RDFa can be embedded, as-is, in XHTML5. There exists an HTML5+RDFa spec, and it will probably be published as a WD. If this conversation was specific to RDFa, why would we go to the trouble of creating tools to edit the specification when the end-product (HTML5+RDFa) already exists? As for the discussion on HTML5+RDFa - it's still going on, if you'd like to provide constructive criticism or feedback of any kind. -- manu -- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
Re: [whatwg] New HTML5 spec GIT collaboration repository
Cameron McCormack wrote: Manu Sporny: 3. Running the Anolis post-processor on the newly modified spec. Geoffrey Sneddon: Is there any reason you use --allow-duplicate-dfns? I think it’s because the source file includes the source for multiple specs (HTML 5, Web Sockets, etc.) which, when taken all together, have duplicate definitions. Manu’s Makefile will need to split out the HTML 5 specific parts (between the <!--START html5--> and <!--END html5--> markers). The ‘source-html5 : source’ rule in http://dev.w3.org/html5/spec-template/Makefile will handle that. What a great answer, Cameron! I wish I had thought of that :) Yes, that will become an issue in time, and I was going to have a chat with Geoffrey about how to modify Anolis to handle that, as well as handling what happens when there are no definitions when building the cross-references (perhaps having a formatter warnings section in the file?). I also spoke too soon, Geoffrey; --allow-duplicate-dfns is needed because of this error when compiling Ian's spec: The term dom-sharedworkerglobalscope-applicationcache is defined more than once I'm probably doing something wrong... haven't had a chance to look at Cameron's Makefile pointer yet, so --allow-duplicate-dfns is in there for now. Here's the latest: http://github.com/html5/spec/commit/16514d4ec9175fdf6a408789628817d81c44e3a9 -- manu -- Manu Sporny (skype: msporny, twitter: manusporny) President/CEO - Digital Bazaar, Inc. blog: Bitmunk 3.1 Released - Browser-based P2P Commerce http://blog.digitalbazaar.com/2009/06/29/browser-based-p2p-commerce/
Re: [whatwg] scripts, defer, document.write and DOMContentLoaded
On Mon, Jul 20, 2009 at 7:25 PM, Ian Hickson i...@hixie.ch wrote: On Tue, 7 Jul 2009, Jonas Sicking wrote: What's tricky is that a document.write inside a deferred script in IE will in some circumstances clear the current document, in other cases append to it. Specifically it seems that if the page has ever set .innerHTML on any node then document.write in the deferred script will append to the current page. If .innerHTML has never been set, then it will replace the current page. Actually what's going on is more subtle than that. When you set innerHTML, it's actually triggering the deferred scripts right there, if it has them loaded (e.g. inline scripts or cached scripts). If it doesn't have them loaded yet, it drops them on the floor and doesn't ever run them. I've specced this, except that the spec requires that not-yet-loaded scripts be loaded then run, rather than dropped, before innerHTML continues, so there are no race conditions. Erm, I would really not like to do this. innerHTML is a very popular method used by web developers, so adding unexpected (to say the least) behavior like this I think will be a great source of developer bugs. And as others have pointed out, making it wait on network traffic synchronously is simply not acceptable. * Don't fire DOMContentLoaded until all deferred scripts have executed. * Possibly hold off firing DOMContentLoaded until any outstanding scripts have finished loading and executing. I've done this. In fact, I've made it wait until all scripts that were pending when the parsing finished have been run. (You mentioned .readyState; I haven't changed that, since it makes the 'interactive' state far less useful if it only gets set once all the scripts have run but before the images have loaded. Being able to detect when or whether parsing has finished while running deferred scripts does seem useful. Hopefully it's not too much rope.) Sounds ok.
As long as .readyState is changed before any deferred scripts are executed (in order to avoid race conditions). * Always execute elements in the order they are inserted into the Document, with exception of async and deferred scripts. I haven't done this, because people use document.appendChild() of external scripts specifically to have scripts run ASAP and not necessarily in order. I'm always wary of adding race conditions, and I think that's exactly what we're adding here. The use case of running scripts ASAP is already supported by using the async attribute. In general, I am very wary of changing this part of the spec, as it was written with extreme care based on all the tests I could run, primarily on IE, but also on other browsers. The innerHTML thing, though... I'll admit I really didn't even remotely think that there could be something to test! My experience implementing deferred scripts and shipping support in firefox 3.5 is that IE behavior is not needed here. I'd like to see data showing otherwise before making innerHTML significantly more confusing to developers (or significantly less performant if we do the sync network thing) given how commonly used it is. We immediately noticed when document.write in deferred scripts blew away the full document, even before shipping a beta. Once we made it append to the end of the document we have not had a single problem reported. This several weeks after final shipping of firefox 3.5. / Jonas
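The ordering the spec change above pins down is: parsing ends, then each deferred script runs in document order, then DOMContentLoaded fires. A toy simulation of that contract (this is not a browser; the scheduler and log strings are illustrative):

```javascript
// Simulate the load phases in the order the spec requires.
function simulatePageLoad(deferredScripts) {
  const log = [];
  log.push('parse end');                 // parser reaches the end of the document
  for (const run of deferredScripts) {   // each defer script runs, in order
    run(log);
  }
  log.push('DOMContentLoaded');          // only now does the event fire
  return log;
}

const log = simulatePageLoad([
  l => l.push('defer script 1'),
  l => l.push('defer script 2')
]);
// log: ['parse end', 'defer script 1', 'defer script 2', 'DOMContentLoaded']
```

Jonas's caveat maps onto this directly: any .readyState change to 'interactive' must happen at the 'parse end' step, before the deferred scripts run, so those scripts can't race against it.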
Re: [whatwg] Installed Apps
It sounds like most of the concerns are about the second part of this
proposal: allowing a background page to continue running after the
visible page has been closed. However, the first part sounds like it
alone would be useful to web applications like Gmail:

> The first, which should be generally useful, is the ability to have a
> hidden HTML/JS page running in the background that can access the DOM
> of visible windows. This page should be accessible from windows that
> the user navigates to. We call this background JavaScript window a
> "shared context" or a "background page". This will enable multiple
> instances of a web app (e.g. tearoff windows in Gmail) to cleanly
> access the same user state no matter which windows are open.

...plus restricting things to the same security origin. It sounds
similar in concept to a shared worker, except that it runs in the main
thread and is more concerned with DOM manipulation/state, while workers
have typically been thought of as allowing background processing. It
seems that the lifetime of this could be scoped so that it dies when it
isn't referenced (in a similar way to how shared worker lifetime is
scoped).

Dave

On Mon, Jul 27, 2009 at 6:39 PM, Maciej Stachowiak m...@apple.com wrote:
> On Jul 27, 2009, at 7:13 PM, Aryeh Gregor wrote:
>> I'm not clear how the UI requirements here are different from
>> persistent workers, though. Those also persist after the user
>> navigates away, right?
>
> Persistent workers are even more of a security risk, since they are
> supposed to persist even after the browser has been restarted, or
> after the system has been rebooted. Persistent workers should be
> renamed to "BotNet Construction Kit".
>
> Regards,
> Maciej
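The reference-scoped lifetime Dave suggests — the shared context stays alive while at least one window is connected to it, and is torn down when the last connection goes away, as with shared workers — can be sketched with a simple reference count. All names here (`SharedContext`, `connect`, `disconnect`) are invented for illustration; this is not a proposed API.

```javascript
// Sketch of a reference-scoped lifetime for a shared background
// context: alive while at least one window holds a connection,
// torn down when the last connection is dropped.
class SharedContext {
  constructor(origin) {
    this.origin = origin;        // shared contexts are same-origin only
    this.connections = new Set();
    this.alive = true;
  }
  connect(windowId) {
    if (!this.alive) throw new Error('context already torn down');
    this.connections.add(windowId);
  }
  disconnect(windowId) {
    this.connections.delete(windowId);
    if (this.connections.size === 0) {
      this.alive = false;        // last reference gone: tear down
    }
  }
}
```

Under this scoping, closing every Gmail window would tear the background page down, which is exactly what distinguishes it from the persistent workers Maciej objects to below.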
Re: [whatwg] Canvas context.drawImage clarification
On Mon, Jul 27, 2009 at 4:14 PM, Ian Hickson i...@hixie.ch wrote:
> On Mon, 27 Jul 2009, Gregg Tavares wrote:
>> The diagram in the docs
>> http://www.whatwg.org/specs/web-apps/current-work/multipage/the-canvas-element.html#images
>> clearly shows SX maps to DX and SY maps to DY. But that is not the
>> interpretation that is implemented. The interpretation that is
>> implemented is "source top/left maps to dest top/left", regardless of
>> whether SX/SY define the top left or SX + WIDTH, SY + HEIGHT define
>> the top left. That seems pretty ambiguous to me.
>
> Ignore the diagram. It's not normative. The text is the only thing
> that matters. I've moved the diagram up to the intro section to make
> this clearer.
>
>> I'd argue that based on the spec as currently written, all current
>> canvas implementations are wrong. Hence the suggestion to make it
>> unambiguous or get the implementations to match the spec.
>
> Could you explain what other interpretations of the following you
> think are reasonable?:
>
> # The source rectangle is the rectangle whose corners are the four
> # points (sx, sy), (sx+sw, sy), (sx+sw, sy+sh), (sx, sy+sh).
> # [...]
> # The destination rectangle is the rectangle whose corners are the
> # four points (dx, dy), (dx+dw, dy), (dx+dw, dy+dh), (dx, dy+dh).
> #
> # When drawImage() is invoked, the region of the image specified by
> # the source rectangle must be painted on the region of the canvas
> # specified by the destination rectangle [...]

It's ambiguous because images have a direction. An image that starts at
10 with a width of -5 is not the same as an image that starts at 6 with
a width of +5, any more than starting in SF and driving 5 miles south
is the same as starting in Brisbane and driving 5 miles north. The spec
doesn't say which interpretation is correct: the one where SrcX maps to
DstX and from there the width can be positive or negative, OR the one
as currently implemented in 2 of the 4 browsers, which is that source
left maps to dest left regardless of the starting values.
> Without the diagram, both of those interpretations match the text.
> With the diagram, only one interpretation matches; it just happens to
> be the one no one has implemented.

It seems pretty unambiguous to me.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
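The two readings Gregg contrasts can be written out explicitly. This is an illustrative sketch, not code from any browser; `sourceRect` and its `interpretation` parameter are invented for the example.

```javascript
// The two readings of a drawImage() source rectangle with a negative
// width. A rect at sx=10 with sw=-5 has corners at x=10 and x=5, so it
// covers the same span as a rect at x=5 with w=+5 -- but the two
// interpretations disagree about mirroring.
function sourceRect(sx, sy, sw, sh, interpretation) {
  if (interpretation === 'corner-to-corner') {
    // What shipping browsers do: normalize to the top-left corner and
    // discard the sign, so source left always maps to dest left.
    return { x: Math.min(sx, sx + sw), y: Math.min(sy, sy + sh),
             w: Math.abs(sw), h: Math.abs(sh),
             flipX: false, flipY: false };
  }
  // The "SX maps to DX" reading suggested by the diagram: (sx, sy)
  // maps to (dx, dy), so a negative width mirrors the image
  // horizontally (the "driving the other direction" case).
  return { x: sx, y: sy, w: sw, h: sh, flipX: sw < 0, flipY: sh < 0 };
}
```

Both readings select the same pixels; they differ only in whether the negative width is taken as a direction (a flip) or silently normalized away, which is exactly the choice the spec text leaves open.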