[whatwg] Parsing: </br> and </p>
These closing tags also need to be guided through the head element phase and such to ensure documents such as <!doctype html></br> and <!doctype html><head></p> behave similarly to the browsers we try to imitate in English.

-- Anne van Kesteren http://annevankesteren.nl/ http://www.opera.com/
Re: [whatwg] The issue of interoperability of the video element
Nicholas Shanks schrieb:
> Browsers don't (and shouldn't) include their own av decoders anyway. Codec support is an operating system issue, and any browser installed on my computer supports exactly the same set of codecs, which are the ones made available via the OS (QuickTime APIs in my case, Windows Media APIs on Bill's platform, and from the sounds of it, libavcodec on the Penguin)

Browsers should ship with their own decoders (at least one set) because the choice of installed codecs varies greatly depending on the platform, and as a content producer you have no idea what the clients can decode in that scenario. If IE supports WMV, Safari supports MPEG4, and Opera and Mozilla support Ogg out of the box, you can at least be somewhat sure that if you provide content in those 3 formats your visitors will almost certainly be able to access the content (and that's a worst-case scenario where interoperability is pretty poor).

Browsers don't rely on the OS to decode JPEG or PNG or GIF either - I assume that's driven by similar reasons.

Hooking into the media frameworks of the various platforms may be a good idea despite this, albeit that may mean that on one platform e.g. Firefox can decode WMV while it can't on some other (and in this case content providers may choose not to provide content in alternative formats because Internet Explorer and Firefox on Windows cover 95% of potential customers and they all can do WMV - that could grow into an unfortunate situation where actually improving interoperability with one media system slams the door on Linux and MacOS users).

Maik Merten
Re: [whatwg] The issue of interoperability of the video element
On 27 Jun 2007, at 09:28, Maik Merten wrote:
> Browsers don't rely on the OS to decode JPEG or PNG or GIF either

In my experience that seems to be exactly what they do do: rely on the OS to provide image decoding (as with other AV media). I say this because changes that had occurred in the OS (such as adding JPEG-2000 support) are immediately picked up by my browsers.

> Firefox can decode WMV while it can't on some other (and in this case content providers may choose to not provide content in alternative formats because Internet Explorer and Firefox on Windows cover 95% of potential customers and they all can do WMV - that could grow to an unfortunate situation where actually improving interoperability with one media system slams the door for Linux and MacOS users).

WMV 9 is supported on Mac OS via a (legal) download, so only Linux would get screwed. Once the download is installed, every app that uses QuickTime (including apps that have their own codecs too, such as RealPlayer and VLC) immediately gains the ability to play WMV files. The same is true for the Theora codecs from xiph.org.

I assert that any codec written by a browser vendor and available only within that browser is user-hostile (due to lack of system ubiquity), likely to be slower and buggier than the free decoding component written by the codec vendor themselves, and detracts from the time available for implementing other browser changes.

- Nicholas.
Re: [whatwg] The issue of interoperability of the video element
On 6/27/07, Nicholas Shanks [EMAIL PROTECTED] wrote:
> On 27 Jun 2007, at 09:28, Maik Merten wrote:
>> Browsers don't rely on the OS to decode JPEG or PNG or GIF either
> In my experience that seems to be exactly what they do do: rely on the OS to provide image decoding (as with other AV media). I say this because changes that had occurred in the OS (such as adding JPEG-2000 support) are immediately picked up by my browsers.

You do not know what you are talking about. Firefox does not use OS image decoders.

> likely to be slower and buggier than the free decoding component written by the codec vendor themselves

We use official Ogg Theora libraries.

> and detracts from the time available for implementing other browser changes.

No-one's suggesting reimplementing codecs. We're talking about integrating existing codecs into the browser, and shipping them with the browser.

Rob
--
Two men owed money to a certain moneylender. One owed him five hundred denarii, and the other fifty. Neither of them had the money to pay him back, so he canceled the debts of both. Now which of them will love him more? Simon replied, I suppose the one who had the bigger debt canceled. You have judged correctly, Jesus said. [Luke 7:41-43]
Re: [whatwg] The issue of interoperability of the video element
On 27 Jun 2007, at 11:55, Robert O'Callahan wrote:
>> In my experience...
> You do not know what you are talking about. Firefox does not use OS image decoders.

And I don't use Firefox, so my point is still valid. Please don't inform me of what you think I know or do not know; it is impolite. For your future reference, Robert, the browsers I am familiar with and was referring to in my statement about image decoders are WebKit-based browsers, OmniWeb 4.5 (historically), Camino and iCab 3. I avoid Firefox and Opera due to their non-native interfaces and form controls. Given your statement, I may be incorrect about Camino, though.

> We use official Ogg Theora libraries.
> No-one's suggesting reimplementing codecs. We're talking about integrating existing codecs into the browser, and shipping them with the browser.

This is only possible if the codec is free. I thought we were talking about the problem of adding non-free codecs (namely WMV and MPEG4) to free software (possibly also involving reverse-engineering the codec).

- Nicholas.
Re: [whatwg] The issue of interoperability of the video element
Nicholas Shanks schrieb:
> This is only possible if the codec is free. I thought we were talking about the problem of adding non-free codecs (namely WMV and MPEG4) to free software (possibly also involving reverse-engineering the codec).

Reverse-engineering doesn't lead to usable implementations of non-free formats. You end up having *source code* with a free license attached to it, but you're not allowed to *distribute* actual binaries of that code because the codec is still covered by patents. Take for example libavcodec: it actually has WMV support and its source code is open. However, thanks to the MPEG and Microsoft codecs being patented (and because those patents are enforced), you cannot put it into Mozilla. Open source usually only covers copyright. Truly free codecs are open-sourced AND don't require patent licensing.

Maik Merten
[whatwg] Editorial: typo (spelling)
The verb `precede' does not follow the same pattern as `succeed' and `proceed'. s/precee/prece/g would correct the current misspellings.

-- Øistein E. Andersen
Re: [whatwg] The issue of interoperability of the video element
On 6/28/07, Nicholas Shanks [EMAIL PROTECTED] wrote:
> For your future reference, Robert, the browsers I am familiar with and was referring to in my statement about image decoders are WebKit-based browsers, OmniWeb 4.5 (historically), Camino and iCab 3. I avoid Firefox and Opera due to their non-native interfaces and form controls. Given your statement I may be incorrect about Camino though.

You are. If we're going to make sweeping statements about how browsers work, let's make sure we include IE, Firefox and Opera in our data.

>> We use official Ogg Theora libraries.
>> No-one's suggesting reimplementing codecs. We're talking about integrating existing codecs into the browser, and shipping them with the browser.
> This is only possible if the codec is free. I thought we were talking about the problem of adding non-free codecs (namely WMV and MPEG4) to free software (possibly also involving reverse-engineering the codec).

No-one's suggesting that. As Maik points out, reverse engineering is a dead end. Shipping a binary codec with, say, Firefox is a theoretical possibility, but for many reasons it's very unlikely to happen.

Rob
Re: [whatwg] Entity parsing [trema/diaeresis vs umlaut]
How does it influence the case fianc&eacute;e vs &oelig;uvre? The only difference is that the first one is used in English.

Chris

-----Original Message-----
From: Oistein E. Andersen [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 26, 2007 10:55 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [whatwg] Entity parsing [trema/diaeresis vs umlaut]

On 26 Jun 2007, at 7:49AM, Křištof Želechovski wrote:
> Internet Explorer apparently chose to support English natively while SGML preferred remaining language-agnostic.

To be fair, this is not how things developed. Microsoft first chose to make the semicolon optional not only when allowed by SGML rules (notably before whitespace and tags), but in any position, for all named entities /that existed at the time/, i.e., Latin-1. Unfortunately, this meant that new entities could not be added without changing the interpretation of already existing pages (e.g., if a page contained &lessless, adding the entity &le; to the list would result in its being interpreted as ≤ssless), although most of the entities have names that are rather unlikely to appear by chance, and the ampersand should be spelt &amp;. Microsoft did not dare to risk this, so entities beyond Latin-1 require a semicolon in IE, even in cases where it is optional according to SGML (and therefore will pass HTML 4.01 validation, I might add).

-- Oistein E. Andersen
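The backwards-compatibility hazard described above can be sketched in a few lines of Python. This is an illustrative toy only, not IE's actual algorithm: the entity tables here are made-up two- and three-entry samples, and real parsers are far more involved.

```python
import re

# Toy entity tables for illustration only; the real Latin-1 list is much larger.
OLD_ENTITIES = {"eacute": "\u00e9", "amp": "&"}
NEW_ENTITIES = dict(OLD_ENTITIES, le="\u2264")  # hypothetically adding &le;

def expand(text, entities):
    # IE-style expansion: a known name matches even without a terminating
    # semicolon; longer names are tried before shorter ones.
    names = sorted(entities, key=len, reverse=True)
    pattern = re.compile("&(" + "|".join(names) + ");?")
    return pattern.sub(lambda m: entities[m.group(1)], text)

print(expand("R&eacutesum&eacute", OLD_ENTITIES))  # unterminated names expand
print(expand("&lessless", OLD_ENTITIES))           # no known name: left alone
print(expand("&lessless", NEW_ENTITIES))           # adding &le; changes old pages
```

The last two calls show the trap: the same input is left alone under the old table, but once a name like `le` exists it silently changes the rendering of an already-published page.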
Re: [whatwg] Parsing: </br> and </p>
On Wed, 27 Jun 2007, Anne van Kesteren wrote:
> These closing tags also need to be guided through the head element phase and such to ensure documents such as <!doctype html></br> and <!doctype html><head></p> behave similarly to the browsers we try to imitate in English.

Done.

-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] void elements vs. content model = empty
Ian Hickson wrote:
> On Wed, 20 Jun 2007, Jonas Sicking wrote:
>> Simon Pieters wrote:
>>> On Wed, 20 Jun 2007 00:28:37 +0200, Ian Hickson [EMAIL PROTECTED] wrote:
>>>>> Also, if there's a difference between content=empty and 'void elements' it deserves an explanation.
>>>> One is just about the content model, the other is just about the syntax. They're not really related, though it happens to be the case that all elements that have an empty content model are void elements in HTML.
>>> FWIW, <script src> has an empty content model but still requires the end tag.
>> That is not true. The contents of a <script src> element are interpreted as script and executed if loading the resource pointed to by the src attribute fails. In other words, <script src="http://nonexistant.example.com/"> alert('hi'); </script> should bring up an alert.
> This doesn't seem to be the case as far as I can tell.

It indeed appears I am wrong. I consulted Brendan, and it appears that this might have been the case back in NS3; apparently I still remember it from back then. My brain works in mysterious ways...

/ Jonas
Re: [whatwg] Entity parsing [trema/diaeresis vs umlaut]
On 27 Jun 2007, at 8:45PM, Křištof Želechovski wrote:
> How does it influence the case fianc&eacute;e vs &oelig;uvre?

You might want to have a look at http://pl.wikipedia.org/wiki/ISO_8859-1 . Afterwards, consider the following:
1) Latin-1 does not contain all the characters that are required for typesetting of English.
2) It does include characters that are never used in English at all.
3) In IE, the entities that can be used without a terminating semicolon are the ones that can be found in this character set.
How does this make Microsoft Anglocentric?

> The only difference is that the first one is used in English.

They are both used in English, actually (and the spelling with a ligature should not be considered obsolete in words borrowed from French, unlike those of Latin origin).

-- Øistein E. Andersen
Re: [whatwg] Entity parsing
On 26 Jun 2007, at 4:35AM, Ian Hickson wrote:
> The informal research I did when updating the spec suggests that the current state of the spec is what is better.

(It is difficult to say anything sensible without knowing either the nature of the research undertaken or the options under consideration.)

> I don't really know how to do more research -- it's quite hard to programatically tell when an entity should be expanded and when it shouldn't.

True, but this is not completely insurmountable; or, rather, useful information can be extracted without necessarily making these decisions explicitly. I do not know what you have done already, but something like the following for each entity &ref; would be useful for the discussion:
- total number of &ref;
- number of &ref;;
- number of &ref followed by /[a-zA-Z0-9]/
- the N most frequent matches of /[a-zA-Z0-9]*&ref[a-zA-Z0-9]+/

Without any real data, arguing, e.g., that conforming HTML 4.01 documents that are currently handled correctly by Firefox and Safari must be handled differently in the future for the sake of backwards compatibility is not really persuasive. The only argument for following IE that I have been able to find in the archives is the following, in a post from Simon Pieters on 14th Aug 2006 in the thread "Parsing Entities":

> I guess that for compat with IE and the Web[1] we have to treat R&eacutesum&eacute as if it were R&eacute;sum&eacute;. [...]
> [1] http://www.google.com/search?q=R%26eacutesum%C3%A9

The implication seems to be that R&eacutesum&eacute can be found on the Web and therefore should be supported. But Google also tells us something else:
(1) r&eacutesumé: 572
(2) +résumé: 114,000,000
(3) r&eacute;sum&eacute -r&eacute;sum&eacute;s: 16,300
(4) +rÃ©sumÃ©: 1,000

Actually, (1) does not only cover R&eacutesum&eacute, but also code like r&amp;eacutesumé, so the number of occurrences that can be saved by parser quirks is lower than 572.

As could be expected, (1) is quite rare compared to (2), all the correctly encoded variants. Whether 0.0005% should be regarded as significant (supposing that résumé is representative) may be a contentious issue, but it is interesting to note that other errors (unwanted conversion of & to &amp; in (3) and a typical encoding problem in (4)) are actually significantly more common, and these cannot be corrected at all.

-- Øistein E. Andersen
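For what it's worth, the per-entity counts proposed above are straightforward to gather once a corpus is at hand. A rough Python sketch follows; the one-line corpus is a stand-in, and `entity_stats` is a name invented for this example.

```python
import re
from collections import Counter

def entity_stats(corpus, name, top_n=5):
    # The four per-entity counts proposed in the thread, for one entity name
    # (e.g. name="eacute" counts occurrences of &eacute in various contexts).
    return {
        "terminated": len(re.findall("&" + name + ";", corpus)),
        "double_semicolon": len(re.findall("&" + name + ";;", corpus)),
        "unterminated_before_alnum": len(
            re.findall("&" + name + "(?=[a-zA-Z0-9])", corpus)),
        "frequent_contexts": Counter(
            re.findall("[a-zA-Z0-9]*&" + name + "[a-zA-Z0-9]+", corpus)
        ).most_common(top_n),
    }

sample = "R&eacutesum&eacute and caf&eacute; and caf&eacute;; done"
print(entity_stats(sample, "eacute"))
```

Run over a real crawl, the "unterminated before alphanumeric" count and the frequent surrounding contexts are exactly the data needed to judge whether IE-style expansion rescues real pages or mostly misfires.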
Re: [whatwg] Entity parsing
On Thu, 28 Jun 2007, Øistein E. Andersen wrote:
>> I don't really know how to do more research -- it's quite hard to programatically tell when an entity should be expanded and when it shouldn't.
> True, but this is not completely insurmountable; or, rather, useful information can be extracted without necessarily making these decisions explicitly. I do not know what you have done already, but something like the following for each entity &ref; would be useful for the discussion:
> - total number of &ref;
> - number of &ref;;
> - number of &ref followed by /[a-zA-Z0-9]/
> - the N most frequent matches of /[a-zA-Z0-9]*&ref[a-zA-Z0-9]+/
> Without any real data, arguing, e.g., that conforming HTML 4.01 documents that are currently handled correctly by Firefox and Safari must be handled differently in the future for the sake of backwards compatibility is not really persuasive.

Sadly none of the arguments in any direction right now are particularly persuasive. I'm not really convinced that the data the above proposed survey might collect would actually help, since it doesn't tell us what was intended by the author. You'd be surprised at how often people use ampersands in text in ways that have nothing to do with entities but which could get interpreted as entities.

> The implication seems to be that R&eacutesum&eacute can be found on the Web and therefore should be supported. But Google also tells us something else:
> (1) r&eacutesumé: 572
> (2) +résumé: 114,000,000
> (3) r&eacute;sum&eacute -r&eacute;sum&eacute;s: 16,300
> (4) +rÃ©sumÃ©: 1,000
> Actually, (1) does not only cover R&eacutesum&eacute, but also code like r&amp;eacutesumé, so the number of occurrences that can be saved by parser quirks is lower than 572.

The number of occurrences of r&eacutesumé is at least two (the two hits I looked at both worked in IE and did not in Firefox). Am I correct in assuming that you would like the spec changed? What would you like the spec changed to, exactly?

-- Ian Hickson
[whatwg] Canvas - non-standard globalCompositeOperation
In addition to the standard values for globalCompositeOperation (and ignoring 'darker'), Gecko supports:

clear: The Porter-Duff 'clear' operator, which always sets the output to rgba(0, 0, 0, 0).
over: Synonym for 'source-over'. The code says not part of spec, kept here for compat. (It looks like FF1.5 had a broken 'source-over', and implemented 'over' like a correct 'source-over'. 'source-over' was fixed in FF2.0, and 'over' left unchanged.)
(See http://lxr.mozilla.org/mozilla/source/content/canvas/src/nsCanvasRenderingContext2D.cpp#1703.)

WebKit supports:

clear: Same as above.
highlight: Synonym for 'source-over'. (See http://developer.apple.com/documentation/Cocoa/Reference/ApplicationKit/Classes/NSImage_Class/Reference/Reference.html#//apple_ref/doc/c_ref/NSCompositeHighlight - NSCompositeHighlight: Deprecated. Mapped to NSCompositeSourceOver.)
(See http://trac.webkit.org/projects/webkit/browser/trunk/WebCore/platform/graphics/GraphicsTypes.cpp#L34.)

Opera is very nice and doesn't do anything wrong. The spec clearly defines the behaviour here: any attempts to set such values must be ignored.

'clear' is pretty useless, since it's exactly equivalent to doing globalAlpha = 0; globalCompositeOperation = 'copy', or (depending on the transform matrix) clearRect(0, 0, w, h). The spec already omits the Porter-Duff 'B' operator (which sets the output to be equal to the destination bitmap, i.e. is equivalent to not drawing anything at all), so it does not seem reasonable to argue for adding 'clear' just for completeness. I can't think of any other reason for it to be added to the spec, other than interoperability.

As far as I can imagine, for each non-standard value, the possible situations are:

* No content relies on that value.
=> Web browsers should remove support for it: it has no purpose, it may result in authors accidentally using that value and becoming confused when their code doesn't work in other browsers (which will be irritating for everyone), and it will evolve into the next situation:

* Web content relies on that value.
=> It should be added to the spec, because it's necessary for handling web content.

* Non-web, browser-specific content (extensions, widgets, etc.) relies on that value, and web content doesn't.
=> It should be disabled except when run in the extension/widget/etc. context, to avoid the problems of the first case. That may cause minor confusion to extension/widget/etc. authors about why their code (which is relying on undocumented features) works differently if they run it on the web instead, but that seems insignificant compared to having interoperability problems on the web.

* Nobody cares.
=> Nothing happens.

Am I missing any issues here? Would any browser developer think one of the first three situations applies, and be willing to make the necessary changes in that case?

-- Philip Taylor [EMAIL PROTECTED]
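To make the equivalence claim concrete, here is a minimal Python model of the Porter-Duff operators under discussion, on premultiplied-alpha RGBA tuples. It models the math only, not any browser's rendering code, and only the operators named in the thread.

```python
def porter_duff(src, dst, op):
    # src and dst are premultiplied-alpha (r, g, b, a) tuples in [0, 1].
    if op == "clear":
        # Output is always fully transparent, whatever the inputs.
        return (0.0, 0.0, 0.0, 0.0)
    if op == "copy":
        # Output is the source, ignoring the destination.
        return src
    if op == "source-over":
        # out = src + dst * (1 - src_alpha), componentwise.
        return tuple(s + d * (1.0 - src[3]) for s, d in zip(src, dst))
    raise ValueError("operator not modelled: " + op)

red = (1.0, 0.0, 0.0, 1.0)
blue = (0.0, 0.0, 1.0, 1.0)
transparent = (0.0, 0.0, 0.0, 0.0)

# 'clear' gives the same result as 'copy' with a fully transparent source
# (i.e. globalAlpha = 0), which is why the extra value is redundant.
print(porter_duff(red, blue, "clear"))
print(porter_duff(transparent, blue, "copy"))
```

Both calls produce (0.0, 0.0, 0.0, 0.0), which is the point of the "globalAlpha = 0 plus 'copy'" argument above.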
Re: [whatwg] Entity parsing
On 28 Jun 2007, at 12:43AM, Ian Hickson wrote:
> Sadly none of the arguments in any direction right now are particularly persuasive.

Indeed.

> I'm not really convinced that the data that the above proposed survey might collect would actually help, since it doesn't tell us what was intended by the author.

To a certain extent, this depends on the results. Some conclusions can be drawn without actually knowing the author's intent at all: if, for instance, &foo[^;] is exceedingly rare, then what the author meant does not really matter, since the construct does not need to be supported anyway. I also tend to think that entities that are part of existing words are highly likely to be supposed to be expanded. Of course, 100% accuracy cannot be achieved, but this is not really needed for the results to be useful.

> Am I correct in assuming that you would like the spec changed? What would you like the spec changed to, exactly?

I would really like an informed decision, and I currently get the impression that rules are changed to follow IE by default rather than to handle existing content, which may lead to unnecessarily complicated rules that do not actually handle existing documents optimally. More specifically, some of the points that probably should be addressed are the following:

1) Is it useful to handle unterminated entities followed by an alphanumerical character like IE does? The number of documents for which this actually helps might be small compared to the number of documents that contain other, incorrigible errors. The process also introduces errors, albeit not in conforming documents. Is the gain worth the added complexity? If so, then should this apply to all entities? (Probably not.) Would it be useful to add to/remove from the set supported by IE7? (This may seem insane, but we should try to avoid premature decisions.)

2) HTML 4.01 allows the semicolon to be omitted in certain cases. Does this cause problems? Firefox and Safari both support this, and it would seem meaningless to change the way conforming documents are parsed unless it can be shown that, e.g., &ndash actually is supposed to mean &amp;ndash more often than &ndash; . (Conformance is a separate issue.)

3) Will new entities ever be needed? If yes, can new entities adopt existing conformance criteria and parsing rules?

4) Similar considerations apply for entities in attribute values.

-- Øistein E. Andersen
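Point 2 can be illustrated with a toy parser in Python. The entity table and both rule sets are simplifications invented for this sketch: "strict" requires the semicolon, while "lenient" roughly approximates the SGML-style behaviour (a reference also terminates when the next character could not be part of an entity name).

```python
import re

ENTITIES = {"ndash": "\u2013", "amp": "&"}
NAME = "(" + "|".join(sorted(ENTITIES, key=len, reverse=True)) + ")"

def expand(text, require_semicolon):
    if require_semicolon:
        pattern = re.compile("&" + NAME + ";")
    else:
        # Semicolon optional when the next character cannot extend the name.
        pattern = re.compile("&" + NAME + "(?:;|(?![a-zA-Z0-9]))")
    return pattern.sub(lambda m: ENTITIES[m.group(1)], text)

text = "1&ndash;2, 1&ndash 2, 1&ndash2"
print(expand(text, True))   # only the terminated reference expands
print(expand(text, False))  # the one before a space expands as well
```

Under either rule set, &ndash2 stays literal, which is exactly the ambiguous case the thread is arguing about: IE would expand it, the SGML-style rule would not.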