Re: [whatwg] Comparison of XForms-Tiny and WF2

2007-01-27 Thread Geoffrey Sneddon


On 27 Jan 2007, at 02:17, Elliotte Harold wrote:


Matthew Raymond wrote:


   This specification is in no way aimed at replacing XForms 1.0
[XForms], nor is it a subset of XForms 1.0.


I agree that it's not a subset of XForms 1.0, but the first claim  
is pure FUD. Web Forms 2.0 happened precisely because some people  
didn't like XForms 1.0 and wanted to replace it with something they  
liked better. I'm not saying they're wrong, or that their spec is  
worse, but don't kid yourself about what's going on here.


It's not replacing it, as XForms 1.0 MUST be in an XML document,
whereas WF2 can be put in an HTML document. The two, IMO, have very
different use cases.


- Geoffrey Sneddon




Re: [whatwg] The m element

2007-02-08 Thread Geoffrey Sneddon


On 8 Feb 2007, at 15:23, Leons Petrazickis wrote:


In the Western world, the standard for highlighting is a neon yellow
background. I submit that a much better name for <m> is <hi>
(hilite, highlite, highlight). People don't necessarily mark
text much -- if anything, mark implies underlining, circling, and
drawing arrows -- but they do highlight. In university, I often saw
students perched with their notes and a highlighter, marking important
sections. The semantic meaning is to draw attention for later review.


In my eyes such an element is presentational; a more generic
element, but one with semantic meaning, like <m>, is far more relevant
(although it may well be a good idea to suggest it be rendered as
highlighted).



- Geoffrey Sneddon




[whatwg] Expected behaviour when a base is within an innerHTML fragment

2007-02-11 Thread Geoffrey Sneddon

To take this from a discussion last month on atom-syntax:

What is meant to happen if you set innerHTML of a <div> where the set
value has both a <base> and an <a>?


- Geoffrey Sneddon




Re: [whatwg] Expected behaviour when a base is within an innerHTML fragment

2007-02-11 Thread Geoffrey Sneddon

On 11 Feb 2007, at 11:37, Jorgen Horstink wrote:

On Feb 11, 2007, at 12:01 PM, Geoffrey Sneddon wrote:


To take this from a discussion last month on atom-syntax:

What is meant to happen if you set innerHTML of a <div> where the
set value has both a <base> and an <a>?




first of all the base element can only be inserted in HTML  
documents.


That's perfectly fine… If you have control over the content being  
inserted.


The spec states that there can only be one base element. The
base element must be used before any elements that use relative
URIs.


Sure, there MUST only be one, and in head, but as the parsing  
section dictates, if there is one in body it gets moved into  
head. It also, as stands, leaves it possible for the parser to  
place multiple base elements in head.




If the insertion mode is in body handle the token as follows:
  A start tag token whose tag name is one of: base, link,  
meta, title
  Parse error. Process the token as if the insertion mode had  
been in head. [1]


So inserting a base element in the body results in a parse error.

[1] http://www.whatwg.org/specs/web-apps/current-work/#how-to0


As Anne has already said, the spec says how to deal with parse errors  
(they aren't fatal errors, as parsing continues as normal). Also, as  
what you quote says, the element gets inserted in head.


The point is whether it:
	a) Gets inserted into the head, and changes all the links in the
	   document.
	b) Appears in some magic place, and changes the links in the HTML
	   fragment.
	c) Gets ignored.

I'm personally in favour of b), as using the normal parsing rules  
(placing it in head) may well end up changing more than what is  
wanted. I'll do some testing of current implementations later.


- Geoffrey Sneddon

(I accidentally sent this to just Jorgen! Sorry!)



Re: [whatwg] Expected behaviour when a base is within an innerHTML fragment

2007-02-11 Thread Geoffrey Sneddon


On 11 Feb 2007, at 15:11, Geoffrey Sneddon wrote:


The point is whether it:
	a) Gets inserted into the head, and changes all the links in the
	   document.
	b) Appears in some magic place, and changes the links in the HTML
	   fragment.
	c) Gets ignored.

I'm personally in favour of b), as using the normal parsing rules  
(placing it in head) may well end up changing more than what is  
wanted. I'll do some testing of current implementations later.


So… the testing:

For reference, I'll note the behaviour as such:
1: Changes all links in the document.
2: Changes links in HTML fragment.
3: Changes nothing.

Test 1:
http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C%21DOCTYPE%20html%3E%0A%3Cscript%20type%3D%22text/javascript%22%3E%0Afunction%20insert_base%28%29%0A%7B%0A%09document.getElementById%28%22insert%22%29.innerHTML%3D%22%3Cbase%20href%3D%27http%3A//example.org/%27%3E%3Ca%20href%3D%27test%27%3Etest2%3C/a%3E%22%3B%0A%7D%0A%3C/script%3E%0A%3Cbase%20href%3D%22http%3A//example.com/%22%3E%0A%3Cp%3E%3Ca%20href%3D%22test%22%3ETest%3C/a%3E%3C/p%3E%0A%3Cp%20id%3D%22insert%22%3E%3Ca%20href%3D%22javascript%3Ainsert_base%28%29%22%3Einsert%3C/a%3E%3C/p%3E%0A%3Cp%3E%3Ca%20href%3D%22test%22%3ETest%3C/a%3E%3C/p%3E


Safari 2.0.4/419.3: (1) Inserted in DOM (in the innerHTML location).
Firefox 2.0.0.1: (3) Inserted in DOM (in the innerHTML location).
IE/Mac 5.2.3: (2) (any way to view the DOM tree?)
Opera 9.10: (1) DOM Snapshot for some reason isn't working.
IE6/Win: (2) The new base never appears in DOM, but the full  
absolute URLs are in the DOM.
IE7/Win: (3) The new base never appears in DOM, but the full  
absolute URLs are in the DOM.


Test 2 (this uses onclick to avoid escaping it as a URI):
http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C%21DOCTYPE%20html%3E%0A%3Cbase%20href%3D%22http%3A//example.com/%22%3E%0A%3Cp%3E%3Ca%20href%3D%22test%22%3ETest%3C/a%3E%3C/p%3E%0A%3Cp%20id%3D%22insert%22%3E%3Ca%20onclick%3D%22document.getElementById%28%26quot%3Binsert%26quot%3B%29.innerHTML%3D%26quot%3B%3Cbase%20href%3D%27http%3A//example.org/%27%3E%3Ca%20href%3D%27test%27%3Etest2%3C/a%3E%26quot%3B%22%3Einsert%3C/a%3E%3C/p%3E%0A%3Cp%3E%3Ca%20href%3D%22test%22%3ETest%3C/a%3E%3C/p%3E


Results the same as above.

In conclusion, Safari and Opera change all the links, IE5/Mac and IE6/ 
Win both change links within the fragment, and Firefox and IE7/Win  
don't change any links.
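
The three behaviours can be sketched outside a browser. This is only a rough model, assuming the relative href "test" and the two base URLs from the test cases; the function name resolve_links is made up for illustration and is not taken from any browser or spec.

```python
from urllib.parse import urljoin

DOCUMENT_BASE = "http://example.com/"   # base already present in the document
FRAGMENT_BASE = "http://example.org/"   # base inside the innerHTML value


def resolve_links(behaviour):
    """Return (document link, fragment link) for the relative href 'test'.

    Hypothetical helper modelling the numbered behaviours 1, 2 and 3.
    """
    if behaviour == 1:      # new base changes every link in the document
        doc_base = frag_base = FRAGMENT_BASE
    elif behaviour == 2:    # new base changes links in the fragment only
        doc_base, frag_base = DOCUMENT_BASE, FRAGMENT_BASE
    else:                   # behaviour 3: new base changes nothing
        doc_base = frag_base = DOCUMENT_BASE
    return urljoin(doc_base, "test"), urljoin(frag_base, "test")


for behaviour in (1, 2, 3):
    print(behaviour, resolve_links(behaviour))
```

Under behaviour 2 the existing links keep pointing at example.com while the inserted link points at example.org, which is the outcome option b) asks for.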



- Geoffrey Sneddon




Re: [whatwg] base versus xml:base

2007-03-02 Thread Geoffrey Sneddon


On 2 Mar 2007, at 19:25, Keryx Web wrote:


Anne van Kesteren wrote:
I think base should also be allowed in XML documents. It  
simplifies the language, it already needs to be supported and  
base is able to set Document.baseURI where xml:base can at most  
set Document.documentElement.baseURI. (Document.baseURI influences  
how XMLHttpRequest works for instance.)
The base element section should probably also talk about what  
happens when you modify the .href attribute.


And today the base element already works in at least FFox and Opera  
also when content is sent as true XHTML 1.0, so this would not  
really change anything but the spec.


XHTML 1.0/1.1 doesn't allow xml:base, though, so base is the only  
way to set a base URL within the document.



- Geoffrey Sneddon




Re: [whatwg] video element proposal

2007-03-04 Thread Geoffrey Sneddon


On 4 Mar 2007, at 14:08, Maik Merten wrote:


- MPEG4: This is most common in forms of DivX and XviD. Predecessor of
H.264. As usual there's patent pool licensing involved. This means that
although XviD is open sourced it's not really free due to patent
licensing issues.


That's wrong – H.264 is MPEG4 Part 11 – it's part of the MPEG4 spec.

I think we need to look at why the MPEG standards see near universal
support and use: as you say, parts of MPEG4 are highly efficient
(such as H.264 and AAC), whereas alternatives such as Theora
aren't anywhere near as efficient. Also note that patents haven't
stopped the web in the past (see: GIF).


I really believe that this is too political, as history has shown
people will use whatever formats can be created easily and are well
supported. It is perfectly possible that anyone wanting to
implement the spec would be put off by needing to support a single
format that (almost) nobody uses.



- Geoffrey Sneddon




Re: [whatwg] video element proposal

2007-03-04 Thread Geoffrey Sneddon


On 4 Mar 2007, at 14:31, Geoffrey Sneddon wrote:


On 4 Mar 2007, at 14:08, Maik Merten wrote:

- MPEG4: This is most common in forms of DivX and XviD. Predecessor of
H.264. As usual there's patent pool licensing involved. This means that
although XviD is open sourced it's not really free due to patent
licensing issues.


That's wrong – H.264 is MPEG4 Part 11 – it's part of the MPEG4 spec.


Slight correction of myself, that should read: H.264 is MPEG4 Part 10.


- Geoffrey Sneddon




Re: [whatwg] base versus xml:base

2007-03-05 Thread Geoffrey Sneddon


On 5 Mar 2007, at 21:07, Keryx Web wrote:


Geoffrey Sneddon wrote:
 XHTML 1.0/1.1 doesn't allow xml:base, though, so base is the  
only   way to set a base URL within the document.


In what way would the XHTML 1.0/1.1 spec **disallow** the use of  
this element from the xml namespace? It's not *part of* the spec,  
but that's a different matter, right?



xml:lang and xml:base are the actual attribute names – the XML  
namespace exists so they work within namespace aware parsers (as XML- 
Names is a separate spec that extends XML) – therefore, it must be  
explicitly allowed within the DTD (like xml:lang is).
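
To illustrate the namespace-aware side of this, here is a small sketch using Python's standard xml.etree.ElementTree. The sample element and its attribute values are assumptions made up for the example. It only shows how a namespace-aware parser reports xml:lang and xml:base; it says nothing about DTD validity.

```python
import xml.etree.ElementTree as ET

# The namespace that the reserved "xml" prefix is bound to
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical sample element, not taken from any XHTML document
doc = '<p xml:lang="en" xml:base="http://example.org/">text</p>'
element = ET.fromstring(doc)

# A namespace-aware parser exposes the attributes as (namespace, local name)
lang = element.get("{%s}lang" % XML_NS)
base = element.get("{%s}base" % XML_NS)
print(lang, base)
```

A parser that is not namespace-aware would instead see plain attributes literally named xml:lang and xml:base, which is why a DTD has to list them by those names.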



- Geoffrey Sneddon




Re: [whatwg] Configure Apache to send the right MIME type for XHTML

2007-03-07 Thread Geoffrey Sneddon


On 7 Mar 2007, at 17:07, Anne van Kesteren wrote:

If you're after the fact that browsers don't sniff for XML in text/ 
html that's because the old HTML WG said so (there's a pointer  
somewhere out there) and changing that now is impossible given how  
many authors got XML as text/html completely wrong.


http://lists.w3.org/Archives/Public/www-html/2000Sep/0024.html –  
that's the post Anne is referring to (I know of no other time that  
the HTML WG have said anything on this issue).



- Geoffrey Sneddon




Re: [whatwg] Using the HTML5 DOCTYPE as a new quirksmode switch

2007-03-10 Thread Geoffrey Sneddon


On 10 Mar 2007, at 13:43, Elliotte Harold wrote:


Alexey Feldgendler wrote:


The tutorials will just say Use !DOCTYPE html.


What are those of us who wish to use XML tools on our documents  
supposed to use? We will need a real DTD at some point, to declare  
the entities if nothing else. We will not be able to use !DOCTYPE  
html.


Then you're still relying on the UA reading the DTD, which it doesn't  
have to. What use is a DTD if it doesn't need to be read and has no  
nominative value?



- Geoffrey Sneddon




[whatwg] The input stream issues

2007-03-11 Thread Geoffrey Sneddon
From implementing parts of the input stream (section 8.2.2 as of  
writing) yesterday, I found several issues (some of which will show  
the asshole[1] within me):


	- Within step one of the "get an attribute" sub-algorithm it says
"start over". Does this mean start over the sub-algorithm or the whole
algorithm?
	- Again in step one, why do we need to skip whitespace in both the
sub-algorithm and at step one of the inner steps for meta tags?
	- In step 11, when we have anything apart from a double/single quote
or less/greater than sign, we add it to the value, but don't move the
position forward, so when we move onto step 12 we add it again.
	- In step 3 of the innermost set of steps for a content attribute
in a meta tag, is charset case-sensitive?
	- Again there, shouldn't we be given Unicode code points for that (as
it'll be a Unicode string)?
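
The step 11/12 point can be shown with a toy scanner. This is a deliberately simplified model and not the spec's algorithm: the function name, the flag and the set of terminating characters are all assumptions made for illustration.

```python
def read_unquoted_value(data, position, advance_after_first_append):
    """Toy model of reading an unquoted attribute value.

    Hypothetical simplification: the first character is appended in one
    step (standing in for step 11) and the rest in a loop (standing in
    for step 12).
    """
    value = data[position]          # "step 11": append the current character
    if advance_after_first_append:
        position += 1               # the advance that appears to be missing
    while position < len(data) and data[position] not in " \t\n\r>":
        value += data[position]     # "step 12": append, then move forward
        position += 1
    return value


print(read_unquoted_value("utf-8>", 0, advance_after_first_append=False))
print(read_unquoted_value("utf-8>", 0, advance_after_first_append=True))
```

Without the advance the first character is appended twice ("uutf-8"); with it the value comes out as "utf-8".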


- Geoffrey Sneddon

[1]: http://diveintomark.org/archives/2004/08/16/specs


Re: [whatwg] Versioning (was: Re: Using the HTML5 DOCTYPE as a new quirksmode switch)

2007-03-14 Thread Geoffrey Sneddon


On 14 Mar 2007, at 15:16, liorean wrote:


This is a switch out of backwards-compatibility-hell for a single
specific browser they are asking for, not something any other browser
vendor should have to worry about.


Other browsers introduced quirks mode to match buggy behaviour of  
others – what's to say that won't happen here, so that other browsers  
have an IE/Win DOM mode, which would therefore require a switch?



- Geoffrey Sneddon




Re: [whatwg] Video proposals

2007-03-17 Thread Geoffrey Sneddon


On 16 Mar 2007, at 23:58, Håkon Wium Lie wrote:


Thus spoke Robert Brodrecht:

I'd rather make <video> and <audio> optional so that those who cannot
support these Ogg on these elements (for whatever reason) can still
comply with the spec. They can also support proprietary codecs through
<object>.


Do you mean make the elements themselves optional to support?


Yes. If a vendor, for some reason, is unable to support the Ogg
codecs, I think it's better that they (a) do not support video, than
(b) they support video with proprietary codecs only.

Interoperability has more value than conformance.


I think forcing browsers to support a codec when it is outdated is  
wrong. I don't want WA 1.0 to end up like RSS 2.0, having multiple  
versions incompatible with one another (in WA1.0's case different  
versions requiring different codecs).



- Geoffrey Sneddon




[whatwg] IE/Win treats backslashes in path as forward slashes

2007-04-11 Thread Geoffrey Sneddon
Looking through the spec again, there is nothing about backslashes in
a URI's path being treated as forward slashes, behaviour needed for
compatibility with quite a few websites.
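
As a rough sketch of the behaviour described, assuming the rewrite applies to the path component only (the message does not say what happens in the query or fragment), and using a made-up helper name:

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_backslashes(url):
    """Replace backslashes in the path component with forward slashes.

    Hypothetical helper: leaves scheme, host, query and fragment alone.
    """
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=parts.path.replace("\\", "/")))


print(normalise_backslashes("http://example.com/images\\logo.png?q=a\\b"))
```

The query string keeps its backslash in this sketch; whether IE/Win does the same there would need testing.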


- Geoffrey Sneddon




Re: [whatwg] additional empty elements

2007-05-02 Thread Geoffrey Sneddon


On 1 May 2007, at 20:21, Brenton Strine wrote:


However, if I then wanted to add additional special
styling to the first and third div, (e.g.. a border and
background color) it is less graceful. I could add style
attributes, but that would be wasteful if I want to do
this on a large scale. Multiple classes would be
confusing.

A nice solution would be the addition of a few div tags.
(e.g. div2, div3, div4 and div5.) Then you could
do something like this:

<style>
div1 {text-indent:0px;}
div2 {text-indent:10px;}
div3 {text-indent:20px;}
</style>


Why not:

<!DOCTYPE html>
<style>
.first {
	color: red;
}
.first + div {
	text-indent: 10px;
}
.first + div + div {
	text-indent: 20px;
	color: blue;
}
</style>
<div class="first">Indent 0</div>
<div>Indent 1</div>
<div>Indent 2</div>
<div class="first">Indent 0</div>
<div>Indent 1</div>
<div>Indent 2</div>
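
For readers less used to the adjacent sibling combinator, the matching done by the selectors above can be emulated with a short sketch. It assumes every sibling is a div, and the function name text_indent is made up for the example.

```python
def text_indent(classes):
    """Return the text-indent (in px, or None if no rule matches) per div.

    `classes` lists the class of each consecutive sibling div ("" for none).
    Only the two text-indent rules are modelled here.
    """
    indents = []
    for i in range(len(classes)):
        indent = None
        if i >= 1 and classes[i - 1] == "first":
            indent = 10     # .first + div
        if i >= 2 and classes[i - 2] == "first":
            indent = 20     # .first + div + div (more specific, so it wins)
        indents.append(indent)
    return indents


print(text_indent(["first", "", "", "first", "", ""]))
# [None, 10, 20, None, 10, 20]
```

Each div takes its indent from its distance to the preceding .first sibling, so no extra element names or classes are needed.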




Re: [whatwg] The issue of interoperability of the video element

2007-06-25 Thread Geoffrey Sneddon


On 25 Jun 2007, at 13:21, Ivo Emanuel Gonçalves wrote:


According to Wikipedia,

AT&T is trying to sue companies such as Apple Inc. over alleged
MPEG-4 patent infringement.[1][2][3]

I would be fascinated to see a statement from Apple, Inc. regarding  
this.


Seeing as they are already at risk from what they already support,
what advantage does Apple get by supporting more codecs, and thereby
opening itself up to further risks?



It's also quite interesting that different portions of MPEG-4,
including different sections of video and audio are licensed
separately, so what this means is that any vendor willing to support
MPEG-4 for video and audio has to locate every patent holder and
pay them.


No, they don't, it all goes through MPEG-LA.


Oh, and will you look at this, Apple, Inc. holds one of the patents!  US
6,134,243 [4].  So Apple gets money for every single license sold.
How nice.  They are attempting to lock vendors into MPEG-4 and get
money from licenses in the process.  Apple, Inc. is no better than
Microsoft.


So a company which owns a patent on a standard that can be bought and
read freely is just as bad as a company which owns a patent on a
standard that has absolutely no public documentation? Also, a large
part of this topic has been around H.264, and Apple holds no known
patents affecting H.264.



- Geoffrey Sneddon




Re: [whatwg] The issue of interoperability of the video element

2007-06-26 Thread Geoffrey Sneddon


On 26 Jun 2007, at 00:57, Silvia Pfeiffer wrote:


So a company which owns a patent on a standard that can be bought and
read freely is just as bad as a company which owns a patent on a
standard that has absolutely no public documentation?


If you're talking about Ogg Theora, then you've got your facts wrong.
First of all, Ogg Theora is not owned by a company.


So a company [Apple] which owns a patent on a standard that can be
bought and read freely [MPEG4] is just as bad as a company
[Microsoft] which owns a patent on a standard that has absolutely no
public documentation [WMA/WMV]?


- Geoffrey Sneddon




Re: [whatwg] The issue of interoperability of the video element

2007-06-26 Thread Geoffrey Sneddon


On 26 Jun 2007, at 17:46, Maik Merten wrote:

* The spec can be practical about implementing the <video> tag and
  specify H.263 or MPEG4 as a baseline. Existing multimedia toolkits
  can be reused in implementation and thus all browsers can support
  the standard. Users will use the format thanks to ubiquitous
  support. The tax will be a non-issue in most cases despite
  leaving a bad taste in the standard committee's mouth. Up and
  coming browsers can choose not to implement that part of the
  standard if they so choose or piggyback on an existing media
  player's licensing.


Free Software like Mozilla cannot implement MPEG4 or H.263 and still
stay free. The tax *is* an issue because you can't buy a community
license that is valid for all uses.

Plus even if you implement H.263 or MPEG4 video - what audio codec
should be used with that? Creating valid MPEG streams would mean using
an MPEG audio codec - that'd be e.g. MP3 or AAC. Additional licensing
costs and additional un-freeness.

Don't get me wrong: MPEG technology is nice and well performing - but
the licensing makes implementations in free software impossible (or at
least prevents distribution in e.g. Europe or North America).


Under the current spec it is merely a SHOULD — you can have an  
implementation of the spec that omits it. MPEG4 and WMV are the  
current de-facto standards. We should really just pave the cowpaths  
here, meaning those are the real two options. WMV has absolutely no  
publicly available documentation, so it makes no sense to reference  
that. MPEG4 has publicly available documentation, but is patent- 
encumbered. MPEG4 looks better on grounds that it is at least  
implementable by people outside of MS without reverse engineering it  
themselves.



- Geoffrey Sneddon




[whatwg] Implementation + Test Cases Available For Numbers Subsection of Common Microsyntaxes

2007-07-12 Thread Geoffrey Sneddon
Now that my review of this subsection is complete, it is worthwhile
publicising my 1:1 PHP implementation of the HTML 5 algorithms,
including the numbers subsection at
http://geoffers.no-ip.com/svn/php-html-5-direct/src/trunk/numbers.php.
There are also test cases (that follow the spec even when there are
issues with it) at
http://geoffers.no-ip.com/svn/php-html-5-direct/tests/numbersTest.
Results for currently shipping UAs (esp. browsers) would be greatly
welcomed.


- Geoffrey Sneddon




[whatwg] RFC 2732 reference unneeded

2007-08-11 Thread Geoffrey Sneddon

#terminology:
For readability, the term URI is used to refer to both ASCII URIs  
and Unicode IRIs, as those terms are defined by RFC 3986 and RFC  
3987 respectively, and as modified by RFC 2732.




RFC 2732 is irrelevant, as URIs as of RFC 3986 and IRIs as of RFC  
3987 define how to deal with IPv6 addresses. RFC 2732 is noted as  
obsoleted by RFC 3986.
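
As a quick illustration of the bracketed IPv6 literal syntax that RFC 3986 itself covers, here is a sketch using Python's standard urllib.parse. The address, port and path are example values chosen for the sketch, not anything from the spec text.

```python
from urllib.parse import urlsplit

# IPv6 literal in square brackets, as in the RFC 3986 host syntax
parts = urlsplit("http://[2001:db8::1]:8080/index.html")

# hostname drops the brackets; the port is parsed separately
print(parts.hostname, parts.port, parts.path)
```

The brackets are what keep the colons inside the address from being read as the port separator.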



- Geoffrey Sneddon




Re: [whatwg] several messages about a way to disable referer headers for links

2007-11-06 Thread Geoffrey Sneddon


On 4 Nov 2007, at 12:40, Anne van Kesteren wrote:

On Sat, 03 Nov 2007 18:27:50 +0100, Krzysztof Żelechowski wrote:

On 03-11-2007 (Sat) at 08:42 +, Ian Hickson wrote:

Ok, I've added a rel value similar to nofollow called noreferer that
does this.


While we are unable to correct the spelling of referer, we certainly
need not duplicate it for noreferrer.  There must be some end to this
self-humiliation.


I think it's way better to stay consistent. Especially as the  
feature affects the Referer (sic) header.


I too think Anne is right here — there are enough things that are  
inconsistent in the web already. Don't add another thing that requires  
me to think. I'll just make mistakes. A markup language should not  
require me to think — it should reflect logical structure.  
Importantly, outwith the structure, logic dictates contextual  
consistency (even if that goes against being consistent with other  
contexts).



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Removal off Ogg technology

2007-12-11 Thread Geoffrey Sneddon


On 11 Dec 2007, at 15:33, Wilson Michaels wrote:


In reference to:
http://html5.org/tools/web-apps-tracker?from=1142&to=1143

I am a retired software developer who is outraged that Ogg
technology has been removed from HTML5. It must be
reinstated as a should option so that the world is not
held hostage to proprietary implementations of media
technologies. Proprietary technologies eventually are used
to limit innovation and prevent entry of other technologies
that threaten the proprietary company in some way. We don't
need another MP3 fiasco.


What difference is there between a SHOULD that few, if any, major
companies implement, and one that doesn't exist? The spec will never
recommend any format that cannot be freely (as in beer) and safely
implemented by developers (i.e., without risking being sued). Also, MP3
is not a proprietary standard: you can go out and buy a copy of the spec
if you wish, and pay any patent charges due. You still, as with
anything invented within the last 20 years (including
Ogg/Vorbis/Theora), run the risk of submarine patents.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Removal of Ogg is *preposterous*

2007-12-11 Thread Geoffrey Sneddon


On 11 Dec 2007, at 18:09, Manuel Amador (Rudd-O) wrote:


Fact: Vorbis is the *only* codec whose patent status has been widely
researched, nearly to exhaustion.  Repeating the same FUD over and over
again (which you just did) may lead the world to believe this to be
false, but it's TRUE.  You should at least have talked to Monty @ Xiph
before jumping to rash conclusions.


So undisclosed patents have been looked at? How?


--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] OGG in HTML5

2007-12-11 Thread Geoffrey Sneddon


On 11 Dec 2007, at 16:20, alex wrote:

I am a webdeveloper and a fierce supporter of opensource. I was  
under the impression the standards were being designed in the same  
opensource spirit, but I may have been wrong.


Standards are developed in line with the policies of the organisations
they are developed by. http://www.w3.org/Consortium/Process/  
describes the W3C process document. The issue here is that the chairs  
think the reasons given for not publishing a working draft are strong  
enough (i.e., it is the strength of the arguments, not the number in  
favour of the arguments that is important).


Setting OGG as the de facto standard is the best idea i've heard in  
a long time,


How can you set a de-facto standard? By the very meaning of de-facto,  
you cannot. We can set a de-jure standard, but not a de-facto one.


and now it's all coming down because a few companies (some of which  
are known for their vendor lock-in tactics) want to keep their empire.


No, it is coming down because a few companies don't want to take the  
risk of being sued for submarine patents which might exist for Ogg/ 
Vorbis/Theora. Do you want to pick up the bill for patent  
infringement? MS has to pay 1.52 billion USD for (submarine) patent  
infringement covering MP3. Unsurprisingly, major companies don't want  
to take such a risk on a codec that has few advantages over current  
standards such as MPEG-4.


But why, then, are they happy to support MPEG standards? They already
do: it had/has clear technical advantages over prior de-facto formats
(the same cannot be said for Theora, which is less efficient than  
MPEG-4). They have already taken the risk to support it, and people  
have already had the chance to sue them, and that has not yet  
happened. In the case of MS and Apple, they already support video  
formats at the OS level, and don't re-implement them within the  
browser (and have already therefore paid patent charges). Finally, the  
risk of supporting both is greater than supporting just one. There are  
already widespread de-facto standards, so that is what they will  
choose to support, not a container/codec combination that has  
(comparatively) very little content.


I am not saying that ogg should be enforced onto anyone, if nokia  
wishes to keep using a different format, no problem, but by making  
it a standard, we at least know that ogg will be supported by all  
(standards-compatible) browsers, and as such it can be deployed by  
those who are opposed to vendor lock-in or monopoly positions.


It won't be supported by all (currently) standards-compatible  
browsers. Apple, a major browser vendor, has said they don't intend to  
implement Ogg/Vorbis/Theora just because the spec requires it (i.e.,  
if you can get a critical mass of web content using it, you may well  
be able to get them to support it).


OGG is the choice of freedom, enabling that freedom for all
webdevelopers is a must in my opinion, although in the same spirit,
it can not be enforced upon anyone, therefore the original text
stating "should" instead of "must" is probably the best way to
go.


If it is a MUST, then the spec is irrelevant: it will be ignored by  
major companies. We must settle at a compromise between the two POVs  
to get the spec implemented at all; we otherwise run the risk of major
companies not implementing any part of the spec whatsoever, leaving us
far worse off than we would be otherwise.


Also, if it is a MUST, everyone in the WG would be issuing an RF license
covering any patents they hold covering Ogg/Vorbis/Theora to everyone
else in the WG (as per
http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential),
which companies such as MS and Nokia have said they are unwilling
to do.


As far as compromises go, there are several viable solutions,  
including MJPEG and H.261 (the latter is only slightly worse than  
Theora, and is so old (as of next year, even the revision to it will  
be 20 years old) that any and all patents have either expired or are  
invalid). This still leaves questions open regarding container format  
and audio (which I know less about, and won't comment so much on).


If you truly do want to make no compromises yourself, you may be able to
get the major browser manufacturers that are currently unwilling to  
implement Ogg/Vorbis/Theora to implement them by getting a critical  
mass of content out there. Bear in mind, though, that MS still does  
not support MPEG-4 out of the box (except for Zune), despite the huge  
amount of MPEG-4 content already out there.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Removal of Ogg is *preposterous*

2007-12-11 Thread Geoffrey Sneddon


On 11 Dec 2007, at 20:12, Manuel Amador (Rudd-O) wrote:


It was intended as meaning recognized in the sense of browsers
recognising them. No currently shipping browser recognises either Ogg
Vorbis or FLAC.


If I use EMBED on Konqueror pointing to an Ogg Vorbis file, I get a nice
player with streaming and everything.  Konqueror's shipping, isn't it?
There is at least *one* browser that already supports, through
GStreamer, Ogg in video tags.  I'd give you the link but it apparently
fell off the end of Planet GNOME so I can't find it...  Now hold on,
it's not shipping, but that doesn't mean it won't be shipping tomorrow.

What you actually wanted to say (but couldn't/didn't/were unwilling  
to) is:


No currently shipping browser by any of the major proprietary software
vendors support Ogg Vorbis or FLAC.


Nor any of the minor ones, nor most open source ones.

Also, I assume that, as Konqueror relies on GStreamer, it doesn't
support Ogg itself (or through a required dependency, which would be
needed to actually conform to the clause that existed). WebKit
trunk also supports Ogg in video if you have the needed QT component
(which is supporting it as much as Konqueror supports it). Opera 9.5
beta has built-in support for Ogg/etc. and supports nothing else.


There are still large questions about when Fx will support (which I  
assume from your later post is what you were referring to) video  
natively, though it may well be in Fx 3.0 in early '08.



It's just dollars.


Apple does not license Apple Lossless to anyone else AFAIK,


OK.  So they sell fewer iPods because iPods don't play Ogg Vorbis
without Rockbox.  Same outcome.


Oh, look, they are already losing custom through not supporting WMA.  
It doesn't look like they particularly care about that, does it?



and the only standards that MPEG-LA collects money for that Apple
receives any share of whatsoever are MPEG-4 Systems and IEEE 1394
(Firewire). Neither of these have anything to do with audio/video
codecs. Saying that Apple has a financial interest in wanting MPEG
codecs mandated in HTML 5 is totally untrue.


I didn't say Apple wanted MPEG codecs mandated in HTML 5, so don't put
words in my mouth or attempt to smoke-and-mirrors us with straw men.
This is either a fumble on your part or an attempt to derail the
discussion into wreckland.


No, it is me trying to understand what you're meaning.

I said Apple doesn't want Ogg Vorbis because they don't control the
tech, and because they would very much rather have consumers prefer (in
the sense of being screwed with no choice) DRM-encumbered AAC (note it's
not the codec, but the controlling of the consumer that matters here).


AAC doesn't support DRM natively. It's a proprietary extension. iTunes  
has always ripped CDs by default into non-DRM-encumbered AAC (i.e., an  
open standard, and compatible with numerous players). Apple has never,  
anywhere where it has a choice, favoured DRM-encumbered standards.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Video codec requirements changed

2007-12-12 Thread Geoffrey Sneddon


On 12 Dec 2007, at 01:41, Maciej Stachowiak wrote:


1) maybe (I've heard game vendors cited, not sure which ones)


I know someone already posted a list, but it is used within all Unreal
Engine 2.5 (i.e., UT 2004) and Unreal Engine 3 (i.e., UT 3) games
(I'm sure you can find a long list of games that use them on
Wikipedia or elsewhere).



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Ogg content on the Web

2007-12-12 Thread Geoffrey Sneddon


On 12 Dec 2007, at 14:23, David Gerard wrote:


FWIW, Wikipedia and Wikimedia Commons only allow unencumbered formats
on the site. Video MUST be Ogg Theora. Compressed audio better be Ogg.


Why must video be just one of many unencumbered formats?

So far we have had zero patent trolls come calling. I wonder why  
that is.


Do you have enough money to pay a fine of a similar size to the one MS
got last year? If you don't have enough money, they won't sue you. It
isn't worth their time.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Ogg content on the Web

2007-12-12 Thread Geoffrey Sneddon


On 12 Dec 2007, at 17:44, David Gerard wrote:


On 12/12/2007, Geoffrey Sneddon wrote:

On 12 Dec 2007, at 14:23, David Gerard wrote:


FWIW, Wikipedia and Wikimedia Commons only allow unencumbered formats
on the site. Video MUST be Ogg Theora. Compressed audio better be Ogg.



Why must video be just one of many unencumbered formats?



Er, what are the others?


Technically speaking, Theora is actually unencumbered (it just has an  
RF license covering the patents from On2). Dirac is in a similar  
situation.


Apart from those two, the others I can think of are those that are in  
excess of twenty years old (and therefore their patents have expired),  
such as H.260.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Ogg content on the Web

2007-12-12 Thread Geoffrey Sneddon


On 12 Dec 2007, at 19:30, Maik Merten wrote:


Geoffrey Sneddon wrote:

Apart from those two, the others I can think of are those that are in
excess of twenty years old (and therefore their patents have expired),
such as H.260.


I couldn't find anything insightful about H.260. Sure you don't mean
H.120, which is a 1982 video codec I couldn't find a current
implementation of?


Yeah. I always miscall it H.260 (as it is the precursor to H.261).


H.261, OTOH, is a 1990 standard and thus still a bit away from getting
absolutely free.


Though, by the time we reach LC, it may not be.


--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] The truth about Nokias claims

2007-12-14 Thread Geoffrey Sneddon


On 14 Dec 2007, at 07:15, Shannon wrote:



Ian, as editor, was asked to do this.  It was a reasonable request  
to reflect work in progress.  He did not take unilateral action.

Ok, not unilateral. How about 'behind closed doors'? Why no open  
discussion BEFORE the change?


Please look back on the mailing list archives. There's been plenty of  
discussion about this before, and it's always ended up in the same  
loop: A group of people wanting nothing but Ogg/Theora/Vorbis, and  
another wanting one standard that all major implementers will support.



--
Geoffrey Sneddon
http://gsnedders.com/



[whatwg] +/- in SGML DOCTYPE (was: Re: The truth about Nokias claims)

2007-12-15 Thread Geoffrey Sneddon


On 15 Dec 2007, at 12:52, Benjamin Hawkes-Lewis wrote:


Krzysztof Żelechowski wrote:

On 14-12-2007, Fri at 19:47 +0100, Maik Merten wrote:

Krzysztof Żelechowski wrote:

Remember the - in DOCTYPE HTML?

Feel free to be more specific.

That prefix means that HTML DOCTYPE is not issued by an officially
recognised standards body.  If W3C were such an organisation, we would
have a + there instead.


I haven't bought the SGML specification to double-check, so feel  
free to quote from it if it says otherwise.


But from everything else I've read it simply means W3C has not  
registered a Public Text Owner Identifier with ISO. See also:


http://msdn2.microsoft.com/en-us/library/ms535242.aspx

http://www.is-thought.co.uk/book/sgml-6.htm#FPI

http://www.freebsd.org/doc/en_US.ISO8859-1/books/fdp-primer/sgml-primer-doctype-declaration.html

http://xml.coverpages.org/gca-pubidrls.html

http://xml.coverpages.org/fpiResolverFlynn.html

Any old organization can register as a Public Text Owner, not just  
officially recognized standards bodies.


The - has nothing to do with W3C being (or not being)  
recognized as a standards body.



ISO 8879:1989 states that SGML public text owner identifier  
registration (i.e., those that start with a + instead of the  
unregistered -) is defined in ISO 9070, which I don't have a copy of.  
I can, however, quote the summary from ISO 8879:1989: "These  
[registered owner identifiers] include standards body identifiers for  
national or industry standards organisations (similar to the ISO owner  
identifier), and unique codes that may have been assigned to  
organisations by other standards."
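
To make the +/- distinction concrete, here is a minimal sketch that classifies the owner identifier at the start of a formal public identifier. The function name and the "+//IDN example.org" identifier are made up for illustration; only the meaning of the leading "+", "-" and "ISO" comes from the discussion above.

```python
def fpi_owner_kind(fpi: str) -> str:
    """Classify the owner identifier of an SGML formal public identifier.

    Only the text before the first "//" is inspected; the rest of the
    identifier (public text class, description, language) is ignored.
    """
    owner = fpi.split("//", 1)[0]
    if owner == "+":
        return "registered"      # owner registered as per ISO 9070
    if owner == "-":
        return "unregistered"    # e.g. the W3C HTML DOCTYPEs
    if owner.startswith("ISO"):
        return "iso"             # ISO owner identifier
    return "unknown"


print(fpi_owner_kind("-//W3C//DTD HTML 4.01//EN"))
print(fpi_owner_kind("+//IDN example.org//DTD Example//EN"))  # hypothetical FPI
print(fpi_owner_kind("ISO 8879:1986//ENTITIES Added Latin 1//EN"))
```

Note that nothing here says anything about whether the owner is a standards body; the prefix only records whether the owner identifier is registered.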


--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] HTML5 and URI Templates

2007-12-16 Thread Geoffrey Sneddon


On 16 Dec 2007, at 14:12, Julian Reschke wrote:


Henri Sivonen wrote:

On Dec 16, 2007, at 05:28, James M Snell wrote:

The gist of the idea (which I believe may have been brought up before
but I'm not certain) is to allow the use of a URI Template in place of
the form element action attribute, and to use form elements to provide
the replacement values, e.g.

<form template="http://example.org{-prefix|/|foo}?bar={bar}"
method="POST">
Foo: <input name="foo" type="input">
Bar: <input name="bar" type="input">
</form>

What's the backward-compatibility story of this feature? (Both  
behavior of URI templates in legacy browsers and ensuring that  
existing content doesn't use braces.)


Braces are not allowed in URIs (in case somebody forgot :-). That's  
exactly why URI Templates can use them.


There are sites that rely on braces in URIs. You can't just go and  
change their meaning, breaking the sites, specs be damned. If RFC 3986  
defined what to do with non-conformant URIs, we wouldn't have this  
issue.
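
A minimal sketch of the compatibility concern, assuming a hypothetical legacy URL that already contains literal braces: a consumer that starts treating braces as template markers would change what such a URL means, whereas escaping them keeps the old meaning. The function names and the example URL are illustrative only.

```python
def has_raw_braces(url: str) -> bool:
    """True if the URL contains literal braces, which RFC 3986 does not allow."""
    return "{" in url or "}" in url


def escape_braces(url: str) -> str:
    """Percent-encode literal braces so they cannot be mistaken for a template."""
    return url.replace("{", "%7B").replace("}", "%7D")


legacy = "http://example.org/search?q={term}"   # hypothetical existing content
print(has_raw_braces(legacy))
print(escape_braces(legacy))
```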



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] XHTML subtitle (was: [html5] r1156 - /)

2008-01-14 Thread Geoffrey Sneddon


On 14 Jan 2008, at 05:45, ianh wrote:

Add a subtitle to clarify the scope of the document for people who  
don't read the spec. (W3C version only.)


Is there any reason for this not to be in the WHATWG version as well?


--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Some video questions

2008-01-29 Thread Geoffrey Sneddon


On 28 Jan 2008, at 23:32, Charles wrote:


The video element offers an interface to the native media
playback capabilities of the platform.


The browser platform (e.g. WebKit), the multimedia platform (e.g.
QuickTime) or the OS platform (e.g. Mac OS X)?


Whatever the browser chooses to use. In WebKit's case, this is the OS  
(so QT on OS X, DirectShow on Windows, and GStreamer on GTK). Presto  
(in Opera) provides its own decoder (for Ogg/Vorbis/Theora), as does  
Gecko.



It is not a plug-in mechanism and it is not suitable for embedding
things like Flash or Silverlight.


So for Safari on both Macintosh and Windows, is Apple's intent that
video will only work for formats supported by QuickTime?


Apple's intent, as far as I'm aware, is to use the native multimedia  
support of a given environment (as WebKit isn't for multimedia). Also,  
as Henri has already said, QuickTime supports plugins itself.


And given that little internet content targets QuickTime, who exactly
will be using the video tag?


There is a _huge_ amount of content on the web that uses MPEG-4, which  
QuickTime supports (note that on Windows DirectShow doesn't support  
MPEG-4 out of the box, and AFAIK only supports MPEG-1 and WMV (for  
video)). There's also still a large amount of content that relies on  
the QuickTime container format (.mov), even if the content is MPEG-4  
(whose own container is based on the QT one).



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Some video questions

2008-01-31 Thread Geoffrey Sneddon


On 31 Jan 2008, at 17:50, Charles wrote:

If it's that the SWF references a FLV, QuickTime Movies have been able
to reference media pretty much forever, and when you embed an ASX with
references with Windows Media content, you're still embedding video
even though the metafile happens to be a text file.


Whereas it is possible to get the video from a QuickTime container, it  
is not possible to get a FLV from a SWF, making it impossible to  
directly control the video. The video element exists to contain  
container formats (of which Flash is not one, though FLV is), and  
nothing else. Inserting a Flash file into a video element is similar  
to inserting an HTML file that happens to have a link to video: sure,  
it links to a video, but it does a billion other things too — it isn't  
in itself the video.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Clarification on hashed id reference

2008-02-20 Thread Geoffrey Sneddon


On 20 Feb 2008, at 19:47, Adele Peterson wrote:

I was looking at the definition of a valid hashed id reference, and  
I noticed some inconsistency.  The first sentence says the string  
must match the id attribute, but then the last parsing rule says  
that the string can match the id or name attributes of the element.   
If the parsing rule is correct, then should there be some rule for  
determining which attribute should get checked first?


It already says "[r]eturn the first element", so which attribute gets  
checked first is irrelevant. If you search by attribute, I guess you  
need to carry out both searches, then combine the results, order by  
tree order, and return the first.
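
A minimal sketch of that lookup, assuming well-formed markup so that Python's XML parser can stand in for an HTML parser. Walking the tree once in document order and testing both attributes gives the same result as two searches merged by tree order; the function name is illustrative.

```python
import xml.etree.ElementTree as ET


def find_by_id_or_name(root, key):
    """Return the first element, in tree order, whose id or name equals key."""
    for element in root.iter():          # iter() walks in document order
        if element.get("id") == key or element.get("name") == key:
            return element
    return None


doc = ET.fromstring('<body><a name="x"/><p id="x"/></body>')
print(find_by_id_or_name(doc, "x").tag)  # the a element comes first in tree order
```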


And if the parsing rule is correct, maybe the initial description  
should mention the name attribute too.


It means exactly what it says: conformant documents cannot use @name,  
but parsers must look in @name. They serve identical purposes, so  
there's no reason to allow both in a document, but parsers must  
support both for compatibility.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] several messages about content sniffing in HTML

2008-02-29 Thread Geoffrey Sneddon


On 29 Feb 2008, at 16:33, Julian Reschke wrote:


Geoffrey Sneddon wrote:
It seems like the HTTP spec should define how to handle that, but  
the HTTP working group has indicated a desire to not specify  
error handling behaviour, so I guess it's up to us.
IE and Safari use the first one, Firefox and Opera use the last  
one. I guess we'll use the first one.


Isn't the fact that FF and IE disagree here an indication that  
this doesn't need to be specified?

Things aren't specified well enough until I can write an HTTP UA  
that can work in the real world (which, as someone dealing with  
feeds, I can tell you without question needs support for  
content-type sniffing) from reading specifications without having to  
reverse-engineer anything.

...


Doesn't seem to apply to this case.

A duplicate Content-Type header response indicates that the response  
is invalid.


And guess what? Users don't like error messages. I want to know how to  
deal with it without having to look elsewhere (from the spec).


Apparently, most browsers accept the response anyway, some of them  
picking the first value, others the second. Both behaviors seem to  
be acceptable to users.


So there's nothing you *need* to reverse engineer in this case.


A page (http://www.toledoblade.com/apps/pbcs.dll/section?Category=RSS01mime=XML)
that I came across recently had:


Content-Type: XML
Content-Type: text/XML

Using the first would break badly. I guess it seems to work because of  
content-type sniffing on an unknown (and invalid) header (or, as many  
feed readers do, totally ignoring it, with the exception of any  
charset parameter). Without content-type sniffing, which HTML 5 now  
allows, you need the last.


But as James says: how do I know that which behaviour I choose doesn't  
matter until I reverse engineer browsers to discover that?
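
A minimal sketch of the two behaviours, assuming the response headers are available as a list of (name, value) pairs; the function name and the `strategy` parameter are illustrative and not taken from any browser or library.

```python
def pick_content_type(headers, strategy="last"):
    """Pick one value when a response carries several Content-Type headers.

    headers  -- list of (name, value) tuples in the order received
    strategy -- "first" or "last"
    """
    values = [value for name, value in headers if name.lower() == "content-type"]
    if not values:
        return None
    return values[0] if strategy == "first" else values[-1]


response = [("Content-Type", "XML"), ("Content-Type", "text/XML")]
print(pick_content_type(response, "first"))  # the invalid value
print(pick_content_type(response, "last"))   # the usable value
```

For the response quoted above, only the "last" strategy yields a value that can be used without sniffing.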



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Proposal for a link attribute to replace a href

2008-02-29 Thread Geoffrey Sneddon


On 29 Feb 2008, at 01:29, Shannon wrote:


Geoffrey Sneddon wrote:
 While yes, you could rely on something like that, it totally  
breaks in any user agent without scripting support. Nothing else, to  
my knowledge, in HTML 5 leads to total loss of functionality without  
JavaScript whatsoever.


By total loss of functionality I meant something that is functionality  
provided by HTML itself (and not through CSS or some DOM API) which  
leads to the page being totally unusable.



Well nothing except global/session/database storage,


You already have the fallback for people without ECMAScript, so that  
works fine.



the irrelevant attribute,


So you can edit something which you otherwise couldn't. Oh well.  
Nothing breaks.



contenteditable,


Oh come on. Even IE supports this. This most certainly is backwards  
compatible.



contextmenu,


Again, this is a DOM API and can be recreated in ECMAScript (which, if  
you're trying to use it at all, you know is enabled).



draggable,


Both IE and Safari have partial support for this already.


the video and audio elements, canvas


All three of these have fallback content, which is needed sometimes  
when a browser does support HTML 5 anyway.



and the connection interface.


Again, you know you have ECMAScript enabled already to be able to use  
this at all. Something similar could be done using XMLHttpRequest, if  
I am not mistaken.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] several messages about handling encodings in HTML

2008-02-29 Thread Geoffrey Sneddon


On 29 Feb 2008, at 01:21, Ian Hickson wrote:


- Again there, shouldn't we be given unicode codepoints for that (as
it'll be a unicode string)?


Not sure what you mean.


This is just me being incredibly dumb. Ignore it.


On Sat, 26 May 2007, Henri Sivonen wrote:


The draft says:
A leading U+FEFF BYTE ORDER MARK (BOM) must be dropped if present.

That's reasonable for UTF-8 when the encoding has been established by
other means.

However, when the encoding is UTF-16LE or UTF-16BE (i.e. supposed to be
signatureless), do we really want to drop the BOM silently? Shouldn't
it count as a character that is in error?


Do the UTF-16LE and UTF-16BE specs make a leading BOM an error?

If yes, then we don't have to say anything, it's already an error.

If not, what's the advantage of complaining about the BOM in this case?


I don't see anything making a BOM illegal in UTF-16LE/UTF-16BE; in  
fact, the only mention I find of it with regards to either in Unicode  
5.0 is "In UTF-16(BE|LE), an initial byte sequence (FE FF|FF FE) is  
interpreted as U+FEFF zero width no-break space."


I suppose the rationale given for removing it is the section that  
follows D101 (e.g., "When converting between different encoding  
schemes…UTF-8 byte sequences is not recommended by the Unicode  
Standard.").
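
As an illustration of that reading, Python's codecs behave the same way: the endian-specific decoder keeps an initial FF FE as U+FEFF, while the generic "utf-16" decoder consumes it as a signature. The helper below is only a sketch of the "drop a leading BOM" step the draft describes; its name is made up.

```python
data = b"\xff\xfeA\x00"                 # FF FE followed by "A" in UTF-16LE

as_le = data.decode("utf-16-le")        # BOM kept as a U+FEFF character
as_generic = data.decode("utf-16")      # BOM used as a signature and removed


def drop_leading_bom(text: str) -> str:
    """Remove a single leading U+FEFF, if present."""
    return text[1:] if text.startswith("\ufeff") else text


print(repr(as_le))
print(repr(as_generic))
print(repr(drop_leading_bom(as_le)))
```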



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Video

2008-04-02 Thread Geoffrey Sneddon


On 2 Apr 2008, at 16:55, Robert J Crisler wrote:

It will be very, very difficult to develop critical mass for content  
encoded in Theora (or Dirac), much less ubiquity. I'm not saying  
there's no point in trying. I applaud the effort, though I have  
misgivings about the W3C setting itself up as a video/audio  
standards organization when we already have the Motion Picture  
Experts Group.


I don't think anyone whatsoever is suggesting creating a new codec;  
we'd gain nothing by doing so.


But ... why not recommend that web developers encode in MPEG-4 AVC  
or Theora?


MPEG-4 has patent fees to be paid, making it impossible for Firefox or  
Konqueror (for example) to comply with that.


Theora has unknown patent status, and big companies are unwilling to  
implement it (as it has little pre-existing content, and it is no  
better than what they already have) lest they get sued due to some  
submarine patent.


At least that would give some direction out of the current morass.  
ISO/IEC standards, like AVC/h.264, are vastly preferable to  
single-vendor (non)standards from Adobe, MS and Real.


All the codecs that have publicly been looked at already have glaring  
issues with actually getting them interoperably used. We need  
something everyone is willing to implement. If people don't implement  
what we say, what we say is irrelevant.


Why should the W3C choose not to create a better situation than the  
current one (which is a mess for developers and a mess for users),  
while continuing to work on the ideal?


There's a reason why the status quo is the status quo: different  
people willing to implement different things. One standard cannot  
force people to implement something they don't want to. We cannot just  
create a better situation: people have to actually do what we say to  
be in any better situation than we already are. One group can't  
implement specifications with known patents, and the other is  
unwilling to implement specifications with no known patents, due to  
submarine patent risks.



--
Geoffrey Sneddon
http://gsnedders.com/



[whatwg] Creating An Outline oddity

2008-06-14 Thread Geoffrey Sneddon
Having implemented the creating an outline algorithm (see
http://pastebin.ca/1048202), I'm getting some odd results (the only
TODO won't affect HTML 4.01 documents such as the following issues).


Using `<h1>Foo<h2>Bar<h2>Lol`, and looking at the final current  
section (this is the root sectioning element, body), it seems I  
correctly get the heading of it (Foo), but I only get one  
subsection: Bar. As far as I can see, my implementation follows what  
the spec says, so it looks as if this is an issue with the spec.


With HTML 5, the current_outlinee at the end is a td element, when it  
should be the body element. That really is rather odd.
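
For reference, the result expected for that markup can be sketched with a much-simplified version of the heading-rank part of outlining. This is not the spec's algorithm (it ignores sectioning elements entirely) and every name in it is illustrative; it only shows that both h2 headings should end up as subsections of Foo.

```python
def outline(headings):
    """Nest (rank, text) headings into sections by rank.

    headings -- list of (rank, text) pairs in document order, rank 1..6
    Returns a list of dicts with "text", "rank" and "children" keys.
    """
    root = {"text": None, "rank": 0, "children": []}
    stack = [root]
    for rank, text in headings:
        # Close every open section whose heading is of equal or lower rank number.
        while stack[-1]["rank"] >= rank:
            stack.pop()
        node = {"text": text, "rank": rank, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


sections = outline([(1, "Foo"), (2, "Bar"), (2, "Lol")])
print(sections[0]["text"], [child["text"] for child in sections[0]["children"]])
```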



--
Geoffrey Sneddon
http://gsnedders.com/


Re: [whatwg] Creating An Outline oddity

2008-06-18 Thread Geoffrey Sneddon


On 15 Jun 2008, at 04:06, Ian Hickson wrote:


On Sun, 15 Jun 2008, Geoffrey Sneddon wrote:


Having implemented the creating an outline algorithm (see
http://pastebin.ca/1048202), I'm getting some odd results (the only
TODO won't affect HTML 4.01 documents such as the following issues).

Using `<h1>Foo<h2>Bar<h2>Lol`, and looking at the final current
section (this is the root sectioning element, body), it seems I
correctly get the heading of it (Foo), but I only get one subsection:
Bar. As far as I can see, my implementation follows what the spec
says, so it looks as if this is an issue with the spec.

With HTML 5, the current_outlinee at the end is a td element, when it
should be the body element. That really is rather odd.


I don't understand the markup you mean. Could you draw the DOM or
provide unambiguous markup for what you're describing? (I don't
understand how Foo is a heading but Bar is a section in your markup.)


The first issue is identical to
http://lists.w3.org/Archives/Public/public-html/2008Mar/0032.html,
which I bullied (sorry, asked) you into fixing yesterday and is  
now fixed. The second issue was an implementation bug.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Pre, code and semantics in HTML5: Wishful thinking?

2008-07-24 Thread Geoffrey Sneddon


On 22 Jun 2008, at 21:22, Edward Z. Yang wrote:


To represent a block of computer code, the pre element can be used
with a code element; to represent a block of computer output the pre
element can be used with a samp element. Similarly, the kbd element
can be used within a pre element to indicate text that the user is to
enter.


The implication is that document authors are recommended to use
<pre><code> to wrap all of their programming code instead of a lone
<pre>, if they wish to be fully semantic. This feels needlessly verbose
and abusive of <code>, which traditionally has been used to mark
single-liners.


Well, that tradition is wrong under HTML 4.01 (pre tells visual user  
agents that the enclosed text is 'preformatted', whereas code  
'designates a fragment of computer code').



It also makes it extremely difficult to style pre as a block for code,
as the only semantic indication that the contents of the pre block are
computer code is its child. You'd end up having to say
<pre class="code"><code> if you wanted to style pre as well.


There are lots of things that are semantically desirable in HTML that  
can't be fulfilled using pre-existing CSS selectors. Continuing to  
style pre is just as ambiguous and risky as it was under the  
traditional behaviour.


At the same time, I still think the semantics of whether or not a pre
tag indicates a plaintext file, or a piece of ASCII art, or computer
code, is somewhat important. However, I think this information would
be more appropriately given as an attribute.


Why go against what HTML 4.01 does? It seems needless to change.


--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] image element

2008-07-30 Thread Geoffrey Sneddon


On 30 Jul 2008, at 08:17, Nicholas Shanks wrote:

So again, I ask for an image element to replace img. Benefits include:
- As video would cater for video/* MIME types, image would cater for
image/*


I don't see how this is a benefit over img.


In order of importance to me:

1. It's spelt correctly.
2. It's not an empty element.
3. It's spelt correctly.



Re: 2) — it is, and it has to be for backwards compatibility (it is  
changed to an img element in the tokenizer, though). In terms of 1 and  
3, how about starting with something that is completely wrong, not  
just an abbreviation, such as the Referer header in HTTP? Not that  
that can actually be changed, because things rely upon it…



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] [rest-discuss] HTML5 and RESTful HTTP in browsers

2008-11-18 Thread Geoffrey Sneddon


On 18 Nov 2008, at 16:41, Joshua Cranmer wrote:

(and if you retort XMLHTTPRequest, let me point out that I  
personally would have objected to injecting HTTP specifics into that  
interface, had I been around during the design phases)


XMLHttpRequest doesn't need to be XML, it doesn't need to be HTTP (FTP  
should work fine too in browsers IIRC), so all it really is is a  
generic request object.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Multi-block dicta within DIALOG

2008-11-23 Thread Geoffrey Sneddon


On 23 Nov 2008, at 20:11, Benjamin Hawkes-Lewis wrote:


I'm wondering whether:

<dt>Jack White</dt>
<dd>foobar<p>baz</p><p>quux</p></dd>

is equivalent to or different to:

<dt>Jack White</dt>
<dd><p>foobar</p><p>baz</p><p>quux</p></dd>


Semantically equivalent, though different in the trees they produce  
(in the former foobar is a text node child of the dd element, in the  
latter it is a text node child of the first p element child of the dd  
element).



and whether

<dt>Jack White</dt>
<dd>foobar</dd>

is equivalent to or different to:

<dt>Jack White</dt>
<dd><p>foobar</p></dd>


Same — semantically equivalent, though different in the trees they  
produce (in the former foobar is a text node child of the dd  
element, in the latter it is a text node child of the p element child  
of the dd element).
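
The difference in the trees can be seen with a small sketch, using Python's XML parser as a stand-in for an HTML parser (safe here only because the fragments are well-formed):

```python
import xml.etree.ElementTree as ET

bare = ET.fromstring("<dd>foobar<p>baz</p><p>quux</p></dd>")
wrapped = ET.fromstring("<dd><p>foobar</p><p>baz</p><p>quux</p></dd>")

# In the first tree, "foobar" is text directly inside dd.
print(repr(bare.text), len(bare))

# In the second, dd has no leading text; "foobar" belongs to the first p child.
print(repr(wrapped.text), repr(wrapped[0].text), len(wrapped))
```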



Does DD have an implicit P, much as it has an implicit Q/BLOCKQUOTE?


No: a run of text nodes (and phrasing content elements) is a  
paragraph, much like an explicit one created by the p element.
http://www.whatwg.org/specs/web-apps/current-work/#paragraphs
details this in depth.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Question regarding accessibility for img

2008-11-30 Thread Geoffrey Sneddon


On 30 Nov 2008, at 16:40, Pentasis wrote:


I notice that it says in the spec under the img-section:

There has been some suggestion that the longdesc attribute from  
HTML4, or some other mechanism that is more powerful than alt=,  
should be included. This has not yet been considered.


May I ask why it has not been considered (yet)?


Because there's an issues list of several thousand issues, and as such  
not all issues have been considered. If we could do everything at once  
we'd have a spec instantly. :)



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Stability of tokenizing/dom algorithms

2008-12-14 Thread Geoffrey Sneddon


On 14 Dec 2008, at 21:55, Edward Z. Yang wrote:


Are there any specific differences that pose problems?


Not that I know of yet, since I haven't started on an implementation
yet. Which brings me back to my original question: how stable is
section 8? I would rather not be chasing a moving target.


It's not really a moving target — what it is is largely constrained by  
the requirement to parse pre-existing documents (which rely on almost  
every possible bit of behaviour).


If you do start work on a PHP implementation, please do seriously  
consider adding it to the html5lib project (which currently contains  
Python and Ruby implementations) as MIT licensed — there are also a  
fair number of test cases there.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Byte-wise tokenization algorithm

2008-12-21 Thread Geoffrey Sneddon


On 21 Dec 2008, at 05:41, Ian Hickson wrote:

1. Given an input stream that is known to be valid UTF-8, is it
possible to implement the tokenization algorithm with byte-wise
operations only? I think it's possible, since all of the character
matching parts of the algorithm map to characters in ASCII space.


Yes. (At least, that's the intent; if you find anything that
contradicts that, please let me know.)


Indeed it is possible (or at least it certainly was a year and a half  
ago, but I have seen nothing change that would stop it).



2. Would such an implementation be conforming?


Looking just at parsing, yes, probably... But an HTML5 implementation,
according to the spec, must at a minimum support the UTF-8 and
Windows-1252 encodings, so the overall implementation might not
depending on exactly how this is done.


That should be no problem: just convert Windows-1252 to UTF-8 using  
strtr() (as it is an SBCS this is simple enough; doing the inverse is  
not); see the attached file. Then all you need to do is normalize the  
character set name to match all aliases of Windows-1252 and UTF-8, as  
well as mapping ISO-8859-1 and US-ASCII (and all their aliases) to  
Windows-1252.
http://bugs.simplepie.org/repositories/entry/sp1/trunk/create.php
does that (the only dependency is for getting the file via HTTP,  
which can just be replaced with cURL if you wish to require only that).
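
The same two steps can be sketched in Python rather than PHP, purely as an illustration. The alias table below is a small hand-picked subset, not the full list the linked script generates, and the function names are made up. Python's cp1252 codec leaves bytes such as 0x81 undefined, so `errors="replace"` turns them into U+FFFD, matching the attached table.

```python
# Hypothetical, deliberately incomplete alias table for illustration only.
ALIASES = {
    "utf-8": "utf-8",
    "utf8": "utf-8",
    "windows-1252": "windows-1252",
    "cp1252": "windows-1252",
    "iso-8859-1": "windows-1252",   # treated as Windows-1252, as described above
    "latin1": "windows-1252",
    "us-ascii": "windows-1252",
    "ascii": "windows-1252",
}


def normalise_label(label: str):
    """Map an encoding label to "utf-8" or "windows-1252", or None if unknown."""
    return ALIASES.get(label.strip().lower())


def windows_1252_to_utf8(data: bytes) -> bytes:
    """Convert Windows-1252 bytes to UTF-8, using U+FFFD for undefined bytes."""
    return data.decode("cp1252", errors="replace").encode("utf-8")


print(normalise_label(" ISO-8859-1 "))
print(windows_1252_to_utf8(b"\x80\x81\xe9"))
```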



--
Geoffrey Sneddon
http://gsnedders.com/

<?php

/**
 * Converts a Windows-1252 encoded string to a UTF-8 encoded string
 *
 * @copyright 2008 Geoffrey Sneddon
 * @license http://www.opensource.org/licenses/bsd-license.php BSD License
 * @param string $string Windows-1252 encoded string
 * @return string UTF-8 encoded string
 */

function windows_1252_to_utf8($string)
{
    // Bytes below 0x80 are identical in both encodings, so only the upper
    // half needs mapping. Bytes that are undefined in Windows-1252 map to
    // U+FFFD (EF BF BD).
    static $convert_table = array(
        "\x80" => "\xE2\x82\xAC",
        "\x81" => "\xEF\xBF\xBD",
        "\x82" => "\xE2\x80\x9A",
        "\x83" => "\xC6\x92",
        "\x84" => "\xE2\x80\x9E",
        "\x85" => "\xE2\x80\xA6",
        "\x86" => "\xE2\x80\xA0",
        "\x87" => "\xE2\x80\xA1",
        "\x88" => "\xCB\x86",
        "\x89" => "\xE2\x80\xB0",
        "\x8A" => "\xC5\xA0",
        "\x8B" => "\xE2\x80\xB9",
        "\x8C" => "\xC5\x92",
        "\x8D" => "\xEF\xBF\xBD",
        "\x8E" => "\xC5\xBD",
        "\x8F" => "\xEF\xBF\xBD",
        "\x90" => "\xEF\xBF\xBD",
        "\x91" => "\xE2\x80\x98",
        "\x92" => "\xE2\x80\x99",
        "\x93" => "\xE2\x80\x9C",
        "\x94" => "\xE2\x80\x9D",
        "\x95" => "\xE2\x80\xA2",
        "\x96" => "\xE2\x80\x93",
        "\x97" => "\xE2\x80\x94",
        "\x98" => "\xCB\x9C",
        "\x99" => "\xE2\x84\xA2",
        "\x9A" => "\xC5\xA1",
        "\x9B" => "\xE2\x80\xBA",
        "\x9C" => "\xC5\x93",
        "\x9D" => "\xEF\xBF\xBD",
        "\x9E" => "\xC5\xBE",
        "\x9F" => "\xC5\xB8",
        "\xA0" => "\xC2\xA0",
        "\xA1" => "\xC2\xA1",
        "\xA2" => "\xC2\xA2",
        "\xA3" => "\xC2\xA3",
        "\xA4" => "\xC2\xA4",
        "\xA5" => "\xC2\xA5",
        "\xA6" => "\xC2\xA6",
        "\xA7" => "\xC2\xA7",
        "\xA8" => "\xC2\xA8",
        "\xA9" => "\xC2\xA9",
        "\xAA" => "\xC2\xAA",
        "\xAB" => "\xC2\xAB",
        "\xAC" => "\xC2\xAC",
        "\xAD" => "\xC2\xAD",
        "\xAE" => "\xC2\xAE",
        "\xAF" => "\xC2\xAF",
        "\xB0" => "\xC2\xB0",
        "\xB1" => "\xC2\xB1",
        "\xB2" => "\xC2\xB2",
        "\xB3" => "\xC2\xB3",
        "\xB4" => "\xC2\xB4",
        "\xB5" => "\xC2\xB5",
        "\xB6" => "\xC2\xB6",
        "\xB7" => "\xC2\xB7",
        "\xB8" => "\xC2\xB8",
        "\xB9" => "\xC2\xB9",
        "\xBA" => "\xC2\xBA",
        "\xBB" => "\xC2\xBB",
        "\xBC" => "\xC2\xBC",
        "\xBD" => "\xC2\xBD",
        "\xBE" => "\xC2\xBE",
        "\xBF" => "\xC2\xBF",
        "\xC0" => "\xC3\x80",
        "\xC1" => "\xC3\x81",
        "\xC2" => "\xC3\x82",
        "\xC3" => "\xC3\x83",
        "\xC4" => "\xC3\x84",
        "\xC5" => "\xC3\x85",
        "\xC6" => "\xC3\x86",
        "\xC7" => "\xC3\x87",
        "\xC8" => "\xC3\x88",
        "\xC9" => "\xC3\x89",
        "\xCA" => "\xC3\x8A",
        "\xCB" => "\xC3\x8B",
        "\xCC" => "\xC3\x8C",
        "\xCD" => "\xC3\x8D",
        "\xCE" => "\xC3\x8E",
        "\xCF" => "\xC3\x8F",
        "\xD0" => "\xC3\x90",
        "\xD1" => "\xC3\x91",
        "\xD2" => "\xC3\x92",
        "\xD3" => "\xC3\x93",
        "\xD4" => "\xC3\x94",
        "\xD5" => "\xC3\x95",
        "\xD6" => "\xC3\x96",
        "\xD7" => "\xC3\x97",
        "\xD8" => "\xC3\x98",
        "\xD9" => "\xC3\x99",
        "\xDA" => "\xC3\x9A",
        "\xDB" => "\xC3\x9B",
        "\xDC" => "\xC3\x9C",
        "\xDD" => "\xC3\x9D",
        "\xDE" => "\xC3\x9E",
        "\xDF" => "\xC3\x9F",
        "\xE0" => "\xC3\xA0",
        "\xE1" => "\xC3\xA1",
        "\xE2" => "\xC3\xA2",
        "\xE3" => "\xC3\xA3",
        "\xE4" => "\xC3\xA4",
        "\xE5" => "\xC3\xA5",
        "\xE6" => "\xC3\xA6",
        "\xE7" => "\xC3\xA7",
        "\xE8" => "\xC3\xA8",
        "\xE9" => "\xC3\xA9",
        "\xEA" => "\xC3\xAA",
        "\xEB" => "\xC3\xAB",
        "\xEC" => "\xC3\xAC",
        "\xED" => "\xC3\xAD",
        "\xEE" => "\xC3\xAE",
        "\xEF" => "\xC3\xAF",
        "\xF0" => "\xC3\xB0",
        "\xF1" => "\xC3\xB1",
        "\xF2" => "\xC3\xB2",
        "\xF3" => "\xC3\xB3",
        "\xF4" => "\xC3\xB4",
        "\xF5" => "\xC3\xB5",
        "\xF6" => "\xC3\xB6",
        "\xF7" => "\xC3\xB7",
        "\xF8" => "\xC3\xB8",
        "\xF9" => "\xC3\xB9",
        "\xFA" => "\xC3\xBA",
        // The archived attachment is cut off after \xFA. The remaining
        // entries and the strtr() call below are reconstructed from the
        // pattern above and from the accompanying message, which says the
        // conversion is done with strtr().
        "\xFB" => "\xC3\xBB",
        "\xFC" => "\xC3\xBC",
        "\xFD" => "\xC3\xBD",
        "\xFE" => "\xC3\xBE",
        "\xFF" => "\xC3\xBF"
    );

    return strtr($string, $convert_table);
}

Re: [whatwg] Byte-wise tokenization algorithm

2008-12-21 Thread Geoffrey Sneddon


On 21 Dec 2008, at 16:35, Edward Z. Yang wrote:


I suppose the big pivot point is "as if". A byte-wise implementation
would replace "character" globally with "byte", and any U+ designation
with the UTF-8 encoded byte version. HTML 5 dictates end behavior, not
the actual algorithm implementation, no?


It states that what is done must be wholly equivalent to the given  
algorithm.



But an HTML5 implementation,
according to the spec, must at a minimum support the UTF-8 and
Windows-1252 encodings, so the overall implementation might not
depending on exactly how this is done.


The plan is to convert Windows-1252 into UTF-8 before processing; with
a reasonably good iconv implementation, support for lots of encodings
is possible. The implementation might not be fully conforming if iconv
doesn't perform the proper (possibly context-sensitive; I haven't
checked) substitution when it doesn't recognize a character, but it
should be close.


I've never seen any way of getting iconv (at least via PHP) to do what  
HTML 5 requires (i.e., replacing invalid bytes with U+FFFD). It is,  
however, possible using mbstring (which also has the advantage of not  
being system dependent), as well as with PHP6's Unicode support.
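
To illustrate the required behaviour in a language-neutral way, here is a minimal Python sketch of decoding with replacement. It is not a PHP, iconv, or mbstring example, and the function name is made up. How many U+FFFD characters a longer malformed sequence produces can differ between decoders, so only the single-invalid-byte case is shown.

```python
def decode_utf8_lossy(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for bytes that cannot be decoded."""
    return data.decode("utf-8", errors="replace")


print(repr(decode_utf8_lossy(b"ab\xffcd")))   # 0xFF is never valid in UTF-8
```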



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] /html with omitted tags

2008-12-26 Thread Geoffrey Sneddon


On 26 Dec 2008, at 17:02, Calogero Alex Baldacchino wrote:


Philip Taylor wrote:

I can start with a simple document that's probably conforming and that
the validator doesn't complain about:

 <!DOCTYPE html><html><head><title></title></head><body></body></html>


Then I can read the Writing HTML document: Optional tags section,  
which says:


 A head element's end tag may be omitted if the head element is not
immediately followed by a space character or a comment.

 A body element's start tag may be omitted if the first thing inside
the body element is not a space character or a comment, except if the
first thing inside the body element is a script or style element.

 A body element's end tag may be omitted if the body element is not
immediately followed by a comment.

So I choose to omit the </head><body></body> because I think those
rules say I can do so. I get:

 <!DOCTYPE html><html><head><title></title></html>

But now I get a parse error, which I think is because the </html>
comes in the "in head" insertion mode and is "Any other end tag: Parse
error. Ignore the token.", so something seems wrong.




AIUI, omitting those closing tags is a parse error anyway, but in  
certain situations the parser can fix the code automatically because  
the state to enter/remain in is unambiguous. Thus a validator  
notifies a parse error, while a browser keeps the error internally  
and handles it when possible.


The writing HTML documents section is meant to give what is a  
conforming HTML document, and those documents are conforming according  
to that. However, conformance checkers are meant to follow the  
parser section (and throw the parse errors that it produces), which in  
these cases differs. Therefore, either the writing section is wrong or  
the parser is wrong to throw the parse errors.



--
Geoffrey Sneddon
http://gsnedders.com/



[whatwg] Resolving a URL

2008-12-28 Thread Geoffrey Sneddon

Hey,

Time to send some feedback on the resolve a URL dfn.

Step 3 is (currently) "If encoding is UTF-16, then change it to  
UTF-8.". Does this mean we literally change just encoding to UTF-8,  
and leave url verbatim, or are we meant to change url to UTF-8  
too? This is currently ambiguous. Not changing url will cause issues  
later in a UTF-16 document.
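
One possible reading of step 3, sketched under the assumption that url is already a string of characters and that encoding is only used later when non-ASCII characters are percent-encoded. The function names are made up, and treating the UTF-16LE/BE labels the same as UTF-16 is an assumption, not something the quoted step states.

```python
from urllib.parse import quote


def url_encoding_for(document_encoding: str) -> str:
    """Apply the step 3 substitution: UTF-16 documents escape URLs as UTF-8."""
    name = document_encoding.strip().lower()
    if name in ("utf-16", "utf-16le", "utf-16be"):   # LE/BE handling is an assumption
        return "utf-8"
    return name


def escape_query(query: str, document_encoding: str) -> str:
    """Percent-encode a query string using the (possibly substituted) encoding."""
    return quote(query, safe="=&",
                 encoding=url_encoding_for(document_encoding),
                 errors="replace")


print(escape_query("q=é", "UTF-16"))
print(escape_query("q=é", "windows-1252"))
```

Under this reading url itself never needs converting; only the encoding used for escaping changes.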


Step 12 replaces \ with /. IIRC WebKit does this for all URLs, not  
just those with a server-based naming authority (what's that anyway?).


Also, earlier in the Resolving URLs section, there should probably  
be a ref. to XMLBASE.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Geoffrey Sneddon


On 30 Dec 2008, at 11:38, Ian Hickson wrote:


In 2006 I proposed the following spec for a spellcheck= attribute,
based on requests from the Google engineers then working on Firefox:

  http://www.damowmow.com/playground/spellcheck.txt

The same engineers have since implemented this feature in Chrome also, and
Google does use this attribute on its sites. However, the attribute has
seen very little interest outside of Google, with just a handful of sites
using it, primarily in dynamic editor libraries.

I have therefore not added this feature to HTML5 for the time being. If
there is more interest in this feature, please speak up.


This seems stupid. If I want to have spell-checking, let me. Don't  
force it off. I don't see any reason to have it forced off, ever.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Minor error in content‐type sniffing table

2009-01-02 Thread Geoffrey Sneddon


On 2 Jan 2009, at 22:39, 111...@gmail.com wrote:


In section 2.7.4 of the specification, part of the table reads

FF FF 00 00   FE FF 00 00   text/plain   n/a   UTF-16BE BOM
FF FF 00 00   FF FF 00 00   text/plain   n/a   UTF-16LE BOM

in the 1 January draft.

Should this be

FF FF 00 00   FE FF 00 00   text/plain   n/a   UTF-16BE BOM
FF FF 00 00   FF FE 00 00   text/plain   n/a   UTF-16LE BOM

?


Yes.
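
To illustrate why the second row needs FF FE, here is a minimal sketch
of how such a row could be applied. It assumes the first byte group in
each row is a mask and the second is the pattern to compare against;
that reading of the columns, and the function and constant names, are
assumptions for illustration rather than anything taken from the spec
text quoted here.

```python
import codecs


def matches(data: bytes, mask: bytes, pattern: bytes) -> bool:
    """Return True if the leading bytes of data, ANDed with mask, equal pattern."""
    if len(data) < len(mask):
        return False
    return all((b & m) == p for b, m, p in zip(data, mask, pattern))


MASK = bytes.fromhex("FFFF0000")
UTF16_BE_PATTERN = bytes.fromhex("FEFF0000")
UTF16_LE_AS_PUBLISHED = bytes.fromhex("FFFF0000")  # the row as it appeared in the draft
UTF16_LE_CORRECTED = bytes.fromhex("FFFE0000")     # the row as suggested in this thread

# A little-endian UTF-16 document starting with its byte order mark.
le_document = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")

print(matches(le_document, MASK, UTF16_LE_AS_PUBLISHED))
print(matches(le_document, MASK, UTF16_LE_CORRECTED))
```

Under those assumptions the row as published never matches a real
UTF-16LE BOM, whereas the corrected row does.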


--
Geoffrey Sneddon



Re: [whatwg] Issues relating to the syntax of dates and times

2009-01-02 Thread Geoffrey Sneddon


On 2 Jan 2009, at 21:53, Asbjørn Ulsberg wrote:


On Wed, 26 Nov 2008 11:09:24 +0100, Ian Hickson i...@hixie.ch wrote:

The spec draws the line already -- it says that the date has to be in the
proleptic Gregorian calendar, and that the year has to be greater than
zero.


Reading the spec, I have to wonder: Does HTML5 need to specify as  
much as it does inline? Can't more of it be referenced to ISO 8601  
or even better; RFC 3339? I really fancy how Atom (RFC 4287) has  
defined date constructs:

http://www.atompub.org/rfc4287.html#date.constructs

Does RFC 3339 not define date and time in a satisfactory manner to  
use directly in HTML5? If there's prior discussion regarding this,  
I'd really appreciate a pointer. Thanks!


Without looking up prior discussion, the short answer is that content  
relies upon the parsing currently specified. Also, neither RFC3339 nor  
ISO8601 define parsing.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] HTML5 DOCTYPE suggestion

2009-02-21 Thread Geoffrey Sneddon


On 21 Feb 2009, at 12:37, mikemi...@verizon.net wrote:


If the doctype is <!DOCTYPE HTML5> instead


Then Gecko-based UAs would be in quirks mode.


--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] content models clarity

2009-03-06 Thread Geoffrey Sneddon


On 6 Mar 2009, at 11:53, Rikkert Koppes wrote:

The content models [1] section is pretty clear, however, whenever  
one wants to know which elements fall into a specific content model,  
one has to look at the categories definition of each element.


A (possibly non normative) overview would be helpful. I understand  
it is repeating information, hence possibly leading to conflicts  
when not paying attention while editing, but I think it would  
clarify a lot. Consider a similar overview to the one found in [2]  
listing the applicable attributes to various input types.


[1] 
http://www.whatwg.org/specs/web-apps/current-work/multipage/dom.html#content-models
[2] 
http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#the-input-element


It is intended that such a thing be part of the as-of-yet unwritten  
index.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] time

2009-03-12 Thread Geoffrey Sneddon


On 10 Mar 2009, at 17:03, David Singer wrote:


At 3:22  +0100 10/03/09, Charles McCathieNevile wrote:
That format has some serious limitations for heavy metadata users.  
In particular for those who are producing information about  
historical objects, from British Parliamentary records to histories  
of pre-communist Russia or China to museum collections, the fact  
that it doesn't handle Julian dates is a big problem - albeit one  
that could be solved relatively simply in a couple of different ways.


The trouble is, that opens a large can of worms.  Once we step out  
of the Gregorian calendar, we'll get questions about various other  
calendar systems (e.g. Roman ab urbe condita  
http://en.wikipedia.org/wiki/Ab_urbe_condita, Byzantine Indiction  
cycles http://en.wikipedia.org/wiki/Indiction, and any number of  
other calendar systems from history and in current use).  Then, of  
course, there are the systems with a  
different 'year' (e.g. lunar rather than solar).  And if we were to  
introduce a 'calendar system designator', we'd have to talk about  
how one converted/normalized.


Ultimately, why is the Gregorian calendar good enough for the ISO but  
not us? I'm sure plenty of arguments were made to the ISO before  
ISO8601 was published, yet that still supports only the Gregorian  
calendar, having been revised twice since its original publication in  
1988. Is there really any need to go beyond what ISO 8601 supports?



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Incorrect declaration of the default namespace in user agent CSS

2009-04-19 Thread Geoffrey Sneddon


On 19 Apr 2009, at 21:01, Sergey Ilinsky wrote:


In the 10.2 The CSS user agent style sheet and presentational hints

The declaration of the default namespace (to be applied to names  
that have no explicit namespace component) is incorrect:

@namespace url(http://www.w3.org/1999/xhtml);

Correct one should look like [1]:
@namespace "http://www.w3.org/1999/xhtml";

[1] http://www.w3.org/TR/css3-namespace/#declaration


According to that document both are correct.


--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Link rot is not dangerous

2009-05-16 Thread Geoffrey Sneddon


On 16 May 2009, at 07:08, Leif Halvard Silli wrote:


Geoffrey Sneddon Fri May 15 14:27:03 PDT 2009


On 15 May 2009, at 18:25, Shelley Powers wrote:

One of the very first uses of RDF, in RSS 1.0, for feeds, is still in
existence, still viable. You don't have to take my word, check it out
yourselves:

http://purl.org/rss/1.0/

Who actually treats RSS 1.0 as RDF? Every major feed reader just  
uses a generic XML parser for it (quite frequently a non-namespace  
aware one) and just totally ignores any RDF-ness of it.


What does it mean to treat as RDF? An RSS 1.0 feed is  
essentially a stream of items that has been lifted from the  
page(s) and placed in an RDF/XML feed. When I read e.g. http://www.w3.org/2000/08/w3c-synd/home.rss 
 in Safari, I can sort the news items according to date, source,  
title. Which means - I think - that Safari sees the feed as machine  
readable.  It is certainly possible to do more - I guess, and  
Safari does the same to non-RDF feeds, but still. And search engines  
should have the same opportunities w.r.t. creating indexes based on  
RSS 1.0 as on RDFa. (Though here perhaps comes in between the fact  
that search engines prefer to help us locate HTML pages rather than  
feeds.)


I mean using an RDF processor, and treating it as an RDF graph.  
Everything just creates from an XML stream (or object model) a bunch  
of items with a certain title, date, and description, and acts on that  
(and parses it out in a format specific manner, so it creates the same  
sort of item for, e.g., Atom) — it doesn't actually use an RDF graph  
for it. If you can find any widely used software that actually treats  
it as an RDF graph I'd be interested to know.



--
Geoffrey Sneddon
http://gsnedders.com/
http://simplepie.org/



[whatwg] Naming of Self-closing start tag state

2009-05-21 Thread Geoffrey Sneddon
I think this is a bit of a misnomer, as the current token can be an  
end tag token (although it will throw a parse error whatever happens  
once it reaches this state). I suggest renaming it to self-closing  
tag state.


--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] First or last Content-Type header?

2009-05-31 Thread Geoffrey Sneddon


On 30 May 2009, at 23:20, Adam Barth wrote:


In editing the content sniffing Internet Draft today, I noticed the
draft uses the *first* Content-Type header.  Internet Explorer uses
the first Content-Type header, but Firefox and Google Chrome use the
last Content-Type header.  (I don't recall off-hand which Safari or
Opera use.)  Because the sniffing algorithm is more similar to the
algorithms used by Firefox and Google Chrome, I've changed this aspect
to match them as well.


Firefox, Safari and Opera use the last header in all cases where there  
is a header that is only expected to appear once (i.e., doesn't take a  
#rule as a value), and have a list of all headers that they expect to  
appear only once. IE uses the first header in all cases where it  
doesn't expect the header to appear more than once (i.e., a header  
like X-Foobar appearing twice returns the value of the first one). I  
don't know about Chrome, because that only appeared after I last did  
any work on HTTP parsing (but it normally follows Firefox from the  
small amount of experimentation I've done with it since). I, on the  
whole, would be tempted to take the first header, and use a list of  
headers that you expect to only appear once (i.e., a mix of behaviours).



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] First or last Content-Type header?

2009-05-31 Thread Geoffrey Sneddon


On 31 May 2009, at 12:55, Geoffrey Sneddon wrote:

IE uses the first header in all cases where it doesn't expect the  
header to appear more than once (i.e., a header like X-Foobar  
appearing twice returns the value of the first one).


I don't think this is quite true, actually. It doesn't always use the  
first header, I don't think (from memory). Try:


Content-Type: jkfjkdsfjdsf
Content-Type: text/xml
Content-Type: text/plain

I think it'll use text/xml as the first valid value (and in the case  
of other browsers using the last header gives compat. with the  
majority of the content that relies upon this behaviour).


It's probably simplest just using the last header, actually, then.

I should probably try playing around with HTTP parsing again some more…
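
The three candidate behaviours discussed in this thread can be compared
with a minimal sketch. The function names and the deliberately crude
validity check are hypothetical, and this makes no claim about what any
browser actually does; it only shows how the strategies diverge on the
example headers above.

```python
import re

# Very rough "type/subtype" shape check; an assumption for illustration only.
_MEDIA_TYPE = re.compile(r"^[A-Za-z0-9!#$&^_.+-]+/[A-Za-z0-9!#$&^_.+-]+$")


def first_header(values):
    """Take the first Content-Type value, whatever it contains."""
    return values[0] if values else None


def last_header(values):
    """Take the last Content-Type value, whatever it contains."""
    return values[-1] if values else None


def first_valid_header(values):
    """Take the first value that at least looks like a media type."""
    for value in values:
        candidate = value.strip()
        if _MEDIA_TYPE.match(candidate):
            return candidate
    return None


content_types = ["jkfjkdsfjdsf", "text/xml", "text/plain"]

for strategy in (first_header, last_header, first_valid_header):
    print(strategy.__name__, "->", strategy(content_types))
```

Only the "last header" strategy avoids needing any notion of a valid
value, which is part of why it is the simplest to specify.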


--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Google's use of FFmpeg in Chromium and Chrome Was: Re: MPEG-1 subset proposal for HTML5 video codec

2009-06-02 Thread Geoffrey Sneddon


On 2 Jun 2009, at 02:58, Chris DiBona wrote:


One participant quoted one of the examples from the LGPL 2.1, which
says For example, if a patent license would not permit royalty-free
redistribution of the Library by all those who receive copies directly
or indirectly through you, then the only way you could satisfy both it
and this License would be to refrain entirely from distribution of the
Library.


I'm still unclear as to how this does not apply to Chrome's case. If I  
get a copy of Chrome, you are bound (by the LGPL) to provide me with a  
copy of the source ffmpeg, and I must be able to redistribute that in  
either binary or source form. I would, however, get in trouble for not  
having paid patent fees for doing so. Hence, as that example  
concludes, you cannot distribute ffmpeg whatsoever.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Codec mess with video and audio tags

2009-06-07 Thread Geoffrey Sneddon


On 7 Jun 2009, at 16:30, David Gerard wrote:


2009/6/7  jjcogliati-wha...@yahoo.com:


There are concerns or issues with all of these:
a) a number of large companies are concerned about the possible
unintended entanglements of the open-source codecs; a 'deep pockets'
company deploying them may be subject to risk here.  Google and  
other companies have announced plans to ship Ogg Vorbis and Theora  
or are shipping Ogg Vorbis and Theora, so this may not be  
considered a problem in the future.



Indeed. There are no *credible* claims of submarine patent problems
with the Ogg codecs that would not apply precisely as much to *any
other codec whatsoever*.

In fact, there are less, because the Ogg codecs have in fact been
thoroughly researched.

This claimed objection to Ogg is purest odious FUD, and should be
described as such at every mention of it. It is not credible, it is a
blatant and knowing lie.


How is it incredible? Who has looked at the submarine patents? They by  
definition are unpublished! Yes, certainly, published patents are well  
researched, but this is not the objection that anyone has made to it.



--
Geoffrey Sneddon



[whatwg] Charset override table should match case of IANA registry

2009-06-18 Thread Geoffrey Sneddon
Although charsets are case insensitive, it'd probably be best to be  
consistent with the IANA registry. The only change this makes is  
changing Windows-* to windows-*.


Re: [whatwg] input type=url allow URLs without http:// prefix

2009-07-12 Thread Geoffrey Sneddon


On 12 Jul 2009, at 10:46, Bruce Lawson wrote:

The eleventy squillion WordPress sites out there that allow comments  
ask for your web page address as well as name and email. The method  
of entering a URL does not require the http:// prefix; just  
beginning the URL with www is accepted.


As it's very common for people to drop the http:// prefix on  
advertising, business cards etc (and who amongst us reads out the  
prefix when reading a URL on the phone?) I'd like to suggest that  
input type=url allows the http:// prefix to be optional on input  
and, if omitted, be assumed when parsing.


How do we tell apart foo.html (a relative URL) and example.com (a  
host name)?
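
A quick way to see the ambiguity, sketched with Python's standard
urllib.parse (chosen only for illustration; it is not what any browser's
URL code does, and the base URL is a made-up example): without a scheme,
both strings parse as nothing more than a relative path.

```python
from urllib.parse import urljoin, urlsplit

candidates = ("foo.html", "example.com")

for candidate in candidates:
    parts = urlsplit(candidate)
    # Neither string has a scheme or an authority; the parser sees only a path.
    print(repr(candidate), "scheme:", repr(parts.scheme),
          "host:", repr(parts.netloc), "path:", repr(parts.path))

# Resolved against a page's address, both are treated as sibling documents.
base = "http://example.org/dir/page.html"
resolved = [urljoin(base, candidate) for candidate in candidates]
print(resolved)
```

Any rule that treats "example.com" as a host name therefore has to rely
on a heuristic (such as recognising top-level domains) rather than on
URL syntax alone.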



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] A New Way Forward for HTML5

2009-07-24 Thread Geoffrey Sneddon

Ian Hickson wrote:

On Thu, 23 Jul 2009, Tab Atkins Jr. wrote:
That being said, inline spec comments sound interesting.  Can you expand 
on this?  Are these meant to be private and only shown to Ian? Shown to 
everyone who views the spec (optionally, of course)?  Sent to the 
mailing list?


If anybody would like to follow-up on this particular idea, I'm very 
interested in setting something up that makes it even easier to submit 
comments without having to worry about subscribing to the lists or 
registering with the W3C's Bugzilla instance. I'm not quite sure what the 
UI would look like, but if anyone has any ideas, feel free to e-mail me 
directly and we can figure something out. (This would be exceedingly 
useful once we're in last call in a few months.)


I remember having some discussion about such a thing in IRC a few months 
ago. Indeed, the biggest problem seems to be what sort of UI we could 
use for it.


My proposal, on the whole, would be to have some box appearing upon 
selecting text. Then, in that box, give space for both an email address 
and a comment, and send that along with the selected text to the list.


--
Geoffrey Sneddon — Opera Software ASA
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] New HTML5 spec GIT collaboration repository

2009-07-27 Thread Geoffrey Sneddon

Manu Sporny wrote:

3. Running the Anolis post-processor on the newly modified spec.


Is there any reason you use --allow-duplicate-dfns? Likewise, you 
probably don't want --w3c-compat (the name is slightly misleading, it 
provides compatibility with the CSS WG's CSS3 Module Postprocessor, not 
with any W3C pubrules).


On the whole I'd recommend running it with:

--w3c-compat-xref-a-placement --parser=lxml.html --output-encoding=us-ascii

The latter two options require Anolis 1.1, which is just as stable as 
1.0. I believe those options are identical to how Hixie runs it through PMS.


--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] New HTML5 spec GIT collaboration repository

2009-07-28 Thread Geoffrey Sneddon

Manu Sporny wrote:

Cameron McCormack wrote:

Manu Sporny:

3. Running the Anolis post-processor on the newly modified spec.

Geoffrey Sneddon:

Is there any reason you use --allow-duplicate-dfns?

I think it’s because the source file includes the source for multiple
specs (HTML 5, Web Sockets, etc.) which, when taken all together, have
duplicate definitions.  Manu’s Makefile will need to split out the
HTML 5 specific parts (between the <!--START html5--> and <!--END
html5--> markers).  The ‘source-html5 : source’ rule in
http://dev.w3.org/html5/spec-template/Makefile will handle that.


What a great answer, Cameron! I wish I had thought of that :)


Ah, that's true. I was assuming he was working on the split-up spec.


Yes, that will become an issue in time, and I was going to have a chat with
Geoffrey about how to modify Anolis to handle that as well as handling
what happens when there are no definitions when building the
cross-references (perhaps having a formatter warnings section in the file?).


Handle that in what way? The correct way, as far as I can see, is to do 
what Ian does, which is to call Anolis on the already-split up spec. 
What do you mean about warnings? Just if there's an instance of a term 
which isn't defined? That can't be done, because it would mean that 
every abbr, code, i, span and var element would have to be an instance 
(whereas they can perfectly fine exist without being one).


It's probably worth throwing an error/warning when data-anolis-xref is 
set and it is unknown, though. (But that will probably change to data-xref.)



I also spoke too soon, Geoffrey, --allow-duplicate-dfns is needed
because of this error when compiling Ian's spec:

The term dom-sharedworkerglobalscope-applicationcache is defined more
than once

I'm probably doing something wrong... haven't had a chance to look at
Cameron's Makefile pointer yet, so --allow-duplicate-dfns is in there
for now.


I expect you are doing something wrong, because that doesn't exist in 
Ian's copy. :)


With regards to tracking Anolis, you're free to pull it in if you want, 
but you probably don't want to track it too closely (currently there 
haven't been any major changes from 1.0, though they are coming soon, so 
it may get a bit less stable). I tend to ping James (Graham) provided 
it's stable, so he can update pimpmyspec.net, and I can try and remember 
to ping you too.


--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] A tag for measurements / quantity?

2009-08-19 Thread Geoffrey Sneddon

Jeremy Keith wrote:
 Unit-measures differ from locale to locale (e.g. Fahrenheit vs. Celsius,
 pound versus Kilogram), making comparison and matching of offerings
 difficult.

There's more variation than that: (imperial) gallon v. (US) gallon. 
Cases like that really make it hard to deal with. Then you have varying 
names in different languages, disagreement about what kilobyte means, 
and so much more… Sounds like a whirlwind of fun.


--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] Text areas with pattern attributes?

2009-08-19 Thread Geoffrey Sneddon

Alex Vincent wrote:

I'm drifting into writing code for the pattern attribute on text
fields again, and I wondered:  if text inputs can have pattern
attribute for regular expression matching, why not text area elements?


What's the use-case for it? Textareas are almost always for such large 
amounts of input that they are almost always free-form text. Why allow 
the pattern attribute?


--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] More prohibited characters for unquoted attributes are needed

2009-09-07 Thread Geoffrey Sneddon


On 6 Sep 2009, at 12:35, Aryeh Gregor wrote:


See some research here:

http://code.google.com/p/html5lib/issues/detail?id=93

It seems like in addition to whitespace and '= , the characters
U+0000 through U+0020 should be banned from unquoted attribute values,
as well as U+0060 (backtick `), for the sake of compatibility.


Apparently Hixie had previously said he didn't want to change this as  
it will become a non-issue over time. I think it does matter due to  
the security issues it presents in existing UAs. Conforming markup  
(using elements/attributes allowed in HTML 4.01) should not cause JS  
to execute in one browser but not in another.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] More prohibited characters for unquoted attributes are needed

2009-10-13 Thread Geoffrey Sneddon

Ian Hickson wrote:

On Mon, 7 Sep 2009, Aryeh Gregor wrote:

On Mon, Sep 7, 2009 at 1:34 PM, Geoffrey Sneddon
foolistbar at googlemail.com wrote:

Apparently Hixie had previously said he didn't want to change this as it
will become a non-issue over time. I think it does matter due to the
security issues it presents in existing UAs. Conforming markup (using
elements/attributes allowed in HTML 4.01) should not cause JS to execute in
one browser but not in another.
I agree with you as an author.  I wrote an HTML output function in 
MediaWiki assuming that what the standard says is known to be 
interoperable, which is apparently wrong.  If I hadn't been keeping up 
with HTML 5, I would have introduced an XSS vulnerability because of 
some browsers' handling of `.


If the problem will go away with time, then perhaps a later version of 
the standard could make such unquoted attributes conforming, once 
there's no more problem with them.


As far as I can tell, this is an IE bug; treating ` as an attribute 
quoting character is non-conforming in any version of HTML so far, it 
seems. I'm certainly not going to make it non-conforming to stumble into 
any IE bug or difference in parsing between IE and previous specs or other 
browsers; we'd just end up with an asinine set of conformance 
requirements.


I agree that it's pointless to make it non-conforming to hit any parsing 
bug, but I would argue that we should make as many cases as it is 
sensible to do so non-conforming if they open up security holes in 
websites on legacy UAs, given that the website uses an HTML 5 
parser/sanitizer/serializer.



For example, should this be non-conforming?

   <!DOCTYPE html>
   <title>Test</title>
   <form>
    <label>Search: <input type=text></label>
    <input type=submit>
   </form>

This perfectly innocent piece of HTML content (HTML2-compliant except for 
the DOCTYPE) results in a non-tree DOM in IE8. Should we make it 
non-conforming?


No, it opens up no security hole if that is done.

Similarly, IE conditional comments make it trivial to trigger scripts in 
IE but not another UA; indeed people do this on purpose. Should we make 
those non-conforming also?


They are a harder issue, but I think it is probably fair enough to 
assume that most sanitizers drop comments for such reasons, hence making 
them fine to leave as conforming also.


As I understand it, the attack here is a site that allows the user to 
input text that is used verbatim in two attributes, such that the user can 
set the first attribute's value to:


   `

...and the second to:

   ` onload='...payload...' end=x

...with the assumption that the site is going to not quote the first one, 
and quote the second one with double quotes:


(This is the default behaviour of Python html5lib, FWIW: the first is 
not quoted as it does not contain any whitespace characters or U+003E 
(>), the latter is quoted for that reason.)



   <body title=` class="` onload='...payload...' end=x">

...which in IE, for some reason, gets treated as:

   <body title=' class="'
         onload='...payload...'
         end='x"'>


Indeed, this is the attack I (and others) am concerned about.

I've disallowed ` in unquoted attribute values for now, but I think we 
should revert this once IE has fixed this bug for a few years.


Right, once versions of IE with this bug have faded out of existence I 
think this will become a non-issue. I also expect that'll be a while 
yet, though, and I highly doubt that time will have come even by the 
time when HTML 5 goes to REC. Furthermore, if there are similar attacks 
to this, I think they should similarly be made non-conforming.
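
As a rough sketch of what a defensive serializer could do in the
meantime: quote any attribute value containing a character that some
parser might treat specially, the backtick included. The helper name
and the exact character set below are assumptions for illustration;
this is not html5lib's actual code.

```python
# Characters that force quoting in this sketch: whitespace, both quote
# characters, "=", "<", ">", and the backtick discussed in this thread.
_NEEDS_QUOTES = set(" \t\n\f\r\"'=<>`")


def serialize_attribute(name: str, value: str) -> str:
    """Serialize one attribute, omitting quotes only when that looks safe."""
    if value and not (_NEEDS_QUOTES & set(value)):
        return f"{name}={value}"
    escaped = value.replace("&", "&amp;").replace('"', "&quot;")
    return f'{name}="{escaped}"'


# The two user-controlled values from the attack described above.
attributes = [("title", "`"), ("class", "` onload='...payload...' end=x")]
tag = "<body " + " ".join(serialize_attribute(n, v) for n, v in attributes) + ">"
print(tag)
```

With the backtick in the set, the first value is emitted inside double
quotes, so the backtick can no longer be read as an opening delimiter
that swallows the following attribute.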


--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] Character casing for Appropriate End Tags and the temporary buffer

2009-10-29 Thread Geoffrey Sneddon

Matt Hall wrote:

Apologies for the repost -- here is the original e-mail in plain text:


Prior to r4177, the matching of tag names for exiting the RCDATA/RAWTEXT states 
was done as follows:

...and the next few characters do not match the tag name of the last start tag token 
emitted (compared in an ASCII case-insensitive manner)

However, the current revision doesn't include any comment on character casing in its discussion of 
Appropriate End Tags.  Similarly, certain tokenizer states require that you check the contents of 
the temporary buffer against the string script but there is no indication of whether 
or not to do this in a case-insensitive manner.

In both cases, should this comparison be done in an ASCII case-insensitive 
manner or not? It might be helpful to clarify the spec in both places in either 
case.


It is already case-insensitive as you lowercase the characters when 
creating the token name and when adding them to the buffer.
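
A minimal sketch of that point, with hypothetical class and method names
rather than the spec's own terminology: if every character is
ASCII-lowercased as it is appended, a plain equality check later is
already ASCII case-insensitive.

```python
import string

# Map only A-Z to a-z, leaving every other character untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(text: str) -> str:
    """Lowercase ASCII letters only (unlike str.lower, which is Unicode-aware)."""
    return text.translate(_ASCII_LOWER)


class EndTagCandidate:
    """Collects an end tag name, lowercasing each character on the way in."""

    def __init__(self, last_start_tag_name: str):
        # Start tag names are assumed to have been lowercased when created.
        self.last_start_tag_name = last_start_tag_name
        self.name = ""

    def append(self, character: str) -> None:
        self.name += ascii_lower(character)

    def is_appropriate(self) -> bool:
        # Plain comparison suffices because both sides are already lowercase.
        return self.name == self.last_start_tag_name


candidate = EndTagCandidate("script")
for character in "SCRipt":
    candidate.append(character)
print(candidate.is_appropriate())
```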



--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] Script Data tokenizer mode

2009-11-02 Thread Geoffrey Sneddon

Matt Hall wrote:

When the script data state was added to the tokenizer, the tree construction
algorithm was updated to switch the tokenizer into this state upon finding a
start tag named script while in the in head insertion mode (9.2.5.7). I see
that a corresponding change was not made to 9.5 about Parsing HTML Fragments
as it still says to switch into the RAWTEXT state upon finding a script tag.
Does anyone know if this difference is intentional, or did someone just forget
to update the fragment parsing case?


I think, due to the fact that no start tag has ever been emitted by the 
tokenizer, that RAWTEXT and the script data states should behave 
identically for the script element fragment case. (Once you take into 
account that there is no appropriate end tag token, all the careful 
casing for the comments effectively becomes nothing, and regardless of 
input everything will become character tokens. This is true of both the 
script data state and the RAWTEXT state: the latter is probably 
preferable due to its far lower complexity.)


--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] HTML5 doctypes incompatible with XHR if named entities present

2009-11-12 Thread Geoffrey Sneddon

Aryeh Gregor wrote:

On Thu, Nov 12, 2009 at 12:33 AM, Boris Zbarsky bzbar...@mit.edu wrote:

I assume you meant mostly as in most of the pages are well-formed, not
pages are mostly well-formed, since the latter is useless, right?

I did a brief survey of obvious sites fitting those descriptions that I had
in my browser history at the moment. . . .

So either you're looking at a totally different dataset or mostly is a bit
of a stretch


I admit I didn't look closely.  At a guess, maybe the default
WordPress skin(s) are valid XHTML, but custom skins are very popular
for WordPress and those mostly aren't valid XHTML?  MediaWiki is
unreasonably difficult to reskin, so that's not much of a problem for
us . . .


Even with the default skin it's easy to break (e.g., search for U+). 
That'll be output to the page and make it not well-formed.


--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] Adding ECMAScript 5 array extras to HTMLCollection

2010-04-27 Thread Geoffrey Sneddon

On 26/04/10 19:50, And Clover wrote:

David Flanagan wrote:


Rather than trying to make DOM collections feel like arrays, how about
just giving them a toArray() method?


I like that, as a practical and explicit (JavaScript-specific) binding.

In the longer term, what's the thinking on a more basic change:

- Require specific DOM interfaces like NodeList, HTMLCollection, Element
etc. to be available for prototype monkey-patching under their interface
names as properties of `window`?

Then we wouldn't have to worry about what Array-like methods need to be
provided on HTMLCollection, because application and framework authors
could choose whichever they liked to prototype in.

IE8/Moz/Op/Saf/Chr already do this to a significant extent, but there's
no standard that says they have to. It would allow DOM extension to be
put on a much less shaky footing than the messy hack Prototype 1.x uses.

Is this something that's a reasonable requirement for browsers in future?


HTML5 through WebIDL and its ECMAScript binding already does require this.

--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] Adding ECMAScript 5 array extras to HTMLCollection

2010-04-28 Thread Geoffrey Sneddon

On 27/04/10 20:23, David Bruant wrote:

Le 27/04/2010 03:54, Geoffrey Sneddon a écrit :

On 26/04/10 19:50, And Clover wrote:

David Flanagan wrote:


Rather than trying to make DOM collections feel like arrays,
how about just giving them a toArray() method?


I like that, as a practical and explicit (JavaScript-specific)
binding.

In the longer term, what's the thinking on a more basic change:

- Require specific DOM interfaces like NodeList, HTMLCollection,
Element etc. to be available for prototype monkey-patching under
their interface names as properties of `window`?

Then we wouldn't have to worry about what Array-like methods need
to be provided on HTMLCollection, because application and
framework authors could choose whichever they liked to prototype
in.

IE8/Moz/Op/Saf/Chr already do this to a significant extent, but
there's no standard that says they have to. It would allow DOM
extension to be put on a much less shaky footing than the messy
hack Prototype 1.x uses.

Is this something that's a reasonable requirement for browsers in
 future?


HTML5 through WebIDL and its ECMAScript binding already does
require this.


I can see where interfaces are expected to be exposed
([NamedConstructor]) in the global object, but I don't see where it
is said that the prototype of the constructor must be extensible. I
don't even see this in the section which is the relevant one in my
opinion (Interface prototype object) I have read this version of
WebIDL : http://dev.w3.org/2006/webapi/WebIDL/


Section 4.1.1 Interface object:


The interface object MUST also have a property named prototype with
attributes { DontDelete, ReadOnly } whose value is an object called
the interface prototype object. This object provides access to the
functions that correspond to the operations defined on the interface,
and is described in more detail in section 4.4.3 below.


--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] Adding ECMAScript 5 array extras to HTMLCollection (ATTN IE TEAM - TRAVIS LEITHEAD)

2010-04-29 Thread Geoffrey Sneddon

On 28/04/10 23:28, Garrett Smith wrote:

On Wed, Apr 28, 2010 at 2:12 AM, James Grahamjgra...@opera.com  wrote:

On 04/28/2010 10:27 AM, David Bruant wrote:


When I started this thread, my point was to define a normalized way
(through ECMAScript binding) to add array extras to array-like objects
in the scope of HTML5 (HTMLCollection and inheriting interfaces).
I don't see any reason yet to try to find a solution to problems that
are in current web browsers.
Of course, if/when a proposal emerges from this thread and some user
agent accept to implement it, a workaround (probably, feature detection)
will have to be found to use the feature in user agents that implement
it and doing something equivalent in web browsers that don't.


To be clear the proposals in this thread are pure syntactic sugar; they
don't allow you to do anything that you can't already do like:

Array.prototype.whatever.call(html_collection, arg1, arg2, ...)

where whatever is the array method you are interested in.



- and from that you can expect errors in Internet Explorer up to and
including version 8.


Adding a toArray operation (for example) won't work in IE up to and
including version 8 either. There's no point in adding a toArray
operation purely because they currently don't implement another part of
the spec (through the WebIDL references). toArray adds no extra
usefulness once they implement the other parts of the spec.





Of course there is nothing wrong with making the syntax more natural if it
can be done in a suitably web-compatible way. However it seems more sensible
to do this at a lower level e.g. as part of Web DOM Core. Sadly that spec is
in need of an editor.



The problem that has been well established is that Internet Explorer's
implementation of host object collections, or dhtml collection[1]
objects, is incompatible with the JScript implementation of Array generics.

Attempting to supply an Internet Explorer dhtml collection to an Array
generic method, e.g. slice, as the `this` value results in a JScript
runtime error: JScript object expected.

IE8:
[].slice.call(document.styleSheets);

Result:
Error: JScript object expected.
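
A script-level workaround of the kind mentioned earlier in the thread could be sketched roughly as follows. toArrayCompat is a hypothetical helper name, not anything proposed in this thread, and the Proxy below only simulates a host object that rejects Array generics so that the fallback path can be exercised outside IE; it is not how IE's collections are implemented.

```javascript
// Hypothetical helper: try the Array generic first, fall back to a loop.
function toArrayCompat(arrayLike) {
  try {
    return Array.prototype.slice.call(arrayLike);
  } catch (e) {
    // Fallback for objects that reject being used as `this`.
    var result = [];
    for (var i = 0; i < arrayLike.length; i++) {
      result.push(arrayLike[i]);
    }
    return result;
  }
}

// Simulation only: slice() probes each index with a HasProperty check,
// so a throwing `has` trap makes the generic call fail while plain
// indexed reads keep working.
var hostile = new Proxy({ 0: "a", 1: "b", length: 2 }, {
  has: function () {
    throw new TypeError("JScript object expected");
  }
});

var fromPlain = toArrayCompat({ 0: "x", 1: "y", length: 2 });
var fromHostile = toArrayCompat(hostile);
```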


In IE8 document.styleSheets.toArray().slice(0, 1); also throws an error. 
How does adding toArray help for IE8, which you're giving as the reason 
for adding it?



Travis Leithead and IE Team: Can you release Internet Explorer 9 with
all dhtml collections implemented as native EcmaScript objects?


As far as I am aware, none of them are on this list.

--
Geoffrey Sneddon — Opera Software
http://gsnedders.com/
http://www.opera.com/


Re: [whatwg] Parse errors for invalid characters

2013-09-07 Thread Geoffrey Sneddon

On 06/09/2013 04:05, Kang-Hao (Kenny) Lu wrote:

(2013/09/06 6:08), Geoffrey Sneddon wrote:

The phrasing content section states:


Text nodes and attribute values must consist of Unicode characters,
must not contain U+0000 characters, must not contain permanently
undefined Unicode characters (noncharacters), and must not contain
control characters other than space characters. This specification
includes extra constraints on the exact value of Text nodes and
attribute values depending on their precise context.


And the pre-processing the input-stream section states:


Any occurrences of any characters in the ranges U+0001 to U+0008,
U+000E to U+001F, U+007F to U+009F, U+FDD0 to U+FDEF, and characters
U+000B, U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, U+2FFFE, U+2FFFF, U+3FFFE,
U+3FFFF, U+4FFFE, U+4FFFF, U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF,
U+7FFFE, U+7FFFF, U+8FFFE, U+8FFFF, U+9FFFE, U+9FFFF, U+AFFFE,
U+AFFFF, U+BFFFE, U+BFFFF, U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF,
U+EFFFE, U+EFFFF, U+FFFFE, U+FFFFF, U+10FFFE, and U+10FFFF are parse
errors. These are all control characters or permanently undefined
Unicode characters (noncharacters).
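
For reference, the list quoted above can be expressed as a small predicate. This is only a sketch: isPreprocessingParseError is a hypothetical name, and it relies on the observation that the individually listed code points from U+FFFE to U+10FFFF are the last two code points of each plane. Note that U+0000 and the surrogates are deliberately not covered, matching the quoted text.

```javascript
// Hypothetical helper: does this code point trigger a parse error in the
// "preprocessing the input stream" step as quoted above?
function isPreprocessingParseError(cp) {
  if ((cp >= 0x0001 && cp <= 0x0008) ||
      (cp >= 0x000E && cp <= 0x001F) ||
      (cp >= 0x007F && cp <= 0x009F) ||
      (cp >= 0xFDD0 && cp <= 0xFDEF) ||
      cp === 0x000B) {
    return true;
  }
  // U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ... U+10FFFE, U+10FFFF:
  // the low 16 bits are FFFE or FFFF in every plane.
  return cp >= 0 && cp <= 0x10FFFF && (cp & 0xFFFE) === 0xFFFE;
}
```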


Note the first uses "Unicode characters", the second "characters"; the
former excludes surrogates as a conformance requirement.

Note that every disallowed non-surrogate character is a parse error.


Except U+0000 or am I missing something?


This is handled inline in the parser, as noted in the preprocessing
section. It sometimes gets passed through as U+0000, sometimes gets
changed to U+FFFD, sometimes gets ignored, but always creates a parse
error.



Therefore, it would make sense to make surrogates parse errors.

It should be noted that they can only occur in the input stream if they
come from script (as they cannot be decoded from the input byte stream
as the decoders will never emit a surrogate).
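
The difference between the two routes can be seen in a short sketch, assuming an environment that provides the standard TextDecoder global (browsers and recent Node.js). The byte sequence is what U+D800 would look like if UTF-8 permitted encoding surrogates; containsSurrogate is a hypothetical helper.

```javascript
// Hypothetical helper: true if the string contains an unpaired surrogate.
function containsSurrogate(str) {
  // Iterating by code point: paired surrogates show up as one astral
  // code point, so anything left in D800..DFFF is a lone surrogate.
  for (var ch of str) {
    var cp = ch.codePointAt(0);
    if (cp >= 0xD800 && cp <= 0xDFFF) {
      return true;
    }
  }
  return false;
}

// Bytes that would spell U+D800 if UTF-8 allowed it; the decoder
// replaces them rather than emitting a surrogate.
var bytes = new Uint8Array([0xED, 0xA0, 0x80]);
var decoded = new TextDecoder("utf-8").decode(bytes);

// A script, by contrast, can create a lone surrogate directly, and
// could then pass it to something like document.write().
var fromScript = "\uD800";

var decodedHasSurrogate = containsSurrogate(decoded);
var scriptHasSurrogate = containsSurrogate(fromScript);
```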


which means that this seems ... cumbersome ... to implement in a
conformance checker. Which reminds me, does

# Conformance checkers must report at least one parse error
# condition to the user if one or more parse error conditions exist
# in the document and must not report parse error conditions if none
# exist in the document. Conformance checkers may report more than
# one parse error condition if more than one parse error condition
# exists in the document.

mean validator.nu and Firefox view source are non-conforming because
they do nothing about document.write() ?

I think we should exempt conformance checkers from scripts instead.


They already are. From the Conformance classes section:


Conformance checkers must check that the input document conforms when parsed without a browsing 
context (meaning that no scripts are run, and that the parser's scripting flag is disabled), and 
should also check that the input document conforms when parsed with a browsing context in which 
scripts execute, and that the scripts never cause non-conforming states to occur other than 
transiently during script execution itself. (This is only a SHOULD and not a 
MUST requirement because it has been proven to be impossible. [COMPUTABLE])


(I feel like being pedantic and pointing out that this is untrue: it has
not been proven impossible to do, it has been proven impossible to do in
general. It wouldn't be that hard to design a conformance checker to check
<html><script>document.write("<p>")</script>.)


On the other hand, a JS console can reasonably report parse errors from 
script, so the parse errors are still worthwhile to have.


/Geoffrey.


[whatwg] Bogus comment state and CDATA section state do not stylistically fit in the tokenizer

2014-06-08 Thread Geoffrey Sneddon
It would aid programmatic conversion of the spec, and confuse me less when
reading it (thereby avoiding bugs like 25871), if these states matched the
model of the rest of the tokenizer.

Thus I propose the bogus comment state becomes:

 Consume the next input character:
 
 U+003E GREATER-THAN SIGN (>):
 
 Switch to the data state. Emit the comment token.
 
 U+0000 NULL:
 
 Append a U+FFFD REPLACEMENT CHARACTER character to the comment token's data.
 
 EOF:
 
 Switch to the data state. Emit the comment token. Reconsume the EOF character.
 
 Anything else:
 
 Append the current input character to the comment token's data.

This also necessitates creating a new comment token prior to entering
the bogus comment state.
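
To show how mechanically this form maps onto code, here is a rough sketch of the proposed bogus comment state as a single function. The function name, the returned object shape and the string-based input are all assumptions for illustration, not how any real tokenizer is structured; the comment token is created at the top to stand in for the caller creating it before entering the state.

```javascript
// Sketch of the proposed bogus comment state. Returns the emitted
// comment token, the position to continue from, and the next state.
function runBogusCommentState(input, start) {
  // Stands in for the token created prior to entering this state.
  var comment = { type: "comment", data: "" };
  var pos = start;
  for (;;) {
    if (pos >= input.length) {
      // EOF: switch to the data state, emit the comment token, and
      // leave EOF unconsumed so the data state reconsumes it.
      return { token: comment, nextPos: pos, nextState: "data" };
    }
    var c = input.charAt(pos);
    pos++; // consume the next input character
    if (c === ">") {
      // U+003E: switch to the data state and emit the comment token.
      return { token: comment, nextPos: pos, nextState: "data" };
    } else if (c === "\u0000") {
      // U+0000: append U+FFFD to the comment token's data.
      comment.data += "\uFFFD";
    } else {
      // Anything else: append the current input character.
      comment.data += c;
    }
  }
}

var bogusResult = runBogusCommentState("?php a\u0000b>rest", 0);
var bogusAtEof = runBogusCommentState("unterminated", 0);
```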

The CDATA section state should become:

 Consume the next input character:
 
 U+005D RIGHT SQUARE BRACKET (]):
 
 If the three characters starting from the current input character are U+005D
 RIGHT SQUARE BRACKET U+005D RIGHT SQUARE BRACKET U+003E GREATER-THAN SIGN
 (]]>), then consume those characters and switch to the data state. Otherwise,
 emit the current input character as a character token.
 
 EOF:
 
 Switch to the data state. Reconsume the EOF character.
 
 Anything else:
 
 Emit the current input character as a character token.
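
The CDATA section state can be sketched the same way. This assumes the "anything else" branch is meant to emit a character token, since there is no comment token in a CDATA section; the function name, token shape and string-based input are again hypothetical.

```javascript
// Sketch of the proposed CDATA section state. Returns the emitted
// character tokens, the position to continue from, and the next state.
function runCdataSectionState(input, start) {
  var tokens = [];
  var pos = start;
  while (pos < input.length) {
    // U+005D: if the three characters starting here are "]]>",
    // consume them and switch to the data state.
    if (input.charAt(pos) === "]" && input.substr(pos, 3) === "]]>") {
      return { tokens: tokens, nextPos: pos + 3, nextState: "data" };
    }
    // Otherwise (including a "]" that does not start "]]>"):
    // emit the current input character as a character token.
    tokens.push({ type: "character", data: input.charAt(pos) });
    pos++;
  }
  // EOF: switch to the data state; EOF is left for it to reconsume.
  return { tokens: tokens, nextPos: pos, nextState: "data" };
}

var cdataResult = runCdataSectionState("a]b]]>c", 0);
var cdataText = cdataResult.tokens.map(function (t) { return t.data; }).join("");
```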

No changes are needed elsewhere for this. (There is no consistent style
for lookahead — and most cases are ASCII case-insensitive words — so I
went with what seems sane here!)

/Geoffrey