Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote:
> That's a good approach and will reduce the need for breaking backwards-compatibility. In an xml-based format that need is 0, while with a text format where the structure is ad-hoc, that need can never be reduced to 0. That's what I am concerned about and that's why I think we need a version identifier. If we end up never using/changing the version identifier, so much the better. But I'd much rather we have it now and can identify what specification a file adheres to than not be able to do so later.

XML is also text-based. ;-)

But more seriously, if we ever need to make changes that would completely break backwards compatibility, we should just use a new format rather than fit it into an existing one. That is the approach we have for most formats (and APIs) on the web (CSS, HTML, XMLHttpRequest), and so far a need for a version identifier (or for a replacement) has not yet arisen.

Might be worth reading through some of: http://www.w3.org/2002/09/wbs/40318/issues-4-84-objection-poll/results

> On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt phil...@opera.com wrote:
>> That would make text/srt and text/websrt synonymous, which is kind of pointless.
> No, it's only pointless if you are a browser vendor. For everyone else it is a huge advantage to be able to choose between a guaranteed simple format and a complex format with all the bells and whistles.

But it is not complex at all, and everyone else supports most of the extensions the WebSRT format has.

-- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] HTML5 (including next generation additions still in development) - Mozilla Firefox (Not Responding)
On 8/10/10, Ian Hickson i...@hixie.ch wrote:
> On Wed, 7 Jul 2010, Garrett Smith wrote:
>> This is about the fourth time I've said it here. Can the person in charge of writing the slow and buggy JavaScript on the HTML5 spec please remove that? The problem is that that whatwg page causes freezes and crashes [...]
> That sounds like a bug in the browser. No page should cause such problems.

The halting problem is caused by the program running in the environment. While the environment is not something you get to control (that's my browser), you see what the code is doing.

> I don't see such problems with the browsers I use to view the spec.

I'm running Firefox 3.6.4 on Windows 7, on a 2 GHz Intel dual core with 2 GB of RAM. Nothing to brag about, but I've seen faster applications running on IE5 on Windows 98.

> On Wed, 7 Jul 2010, Boris Zbarsky wrote:
>> I'll just note that part of the reason it's a stress test, apart from the old Firefox issue, is that it tries to be clever and not hang the browser, which actually causes the browser to do a lot more work. On my machine here, if the spec's script were not trying for the clever thing, it would take about 1-2 seconds (with a completely hung browser during that time) to do what it currently takes anywhere from 8 to 25 seconds to do, during which time the browser is largely unresponsive anyway.

Even 1 second would still be too long.

> I've tried to tweak the scripts to not be quite as silly in the way they split up the work (in particular, now they won't split up the work if it's being done fast -- in browsers I tested, this reduced the problem to just the restyling being slow, in some cases taking a few seconds).

Well, it's still freezing my Firefox. Looping through the DOM on page load is something that is best avoided. Most (if not all) of this stuff seems best suited to server-side logic. I see navigational and state-management features that could be done on the server.
For example:

    // dfn.js
    // makes dfn elements link back to all uses of the term

However, the freezing seems to be coming from toc.js. Navigation should be done in HTML, not in JavaScript. The code itself has problems and shouldn't be expected to do anything more than throw errors. toc.js:

    while (li) {
      if (li.nodeType == Node.ELEMENT_NODE) {
        var id = li.firstChild.hash.substr(1);

Don't expect a nonstandard global Node property; there isn't any standard that says it should be there and it won't work cross-browser. You could use your own constants, but what you really want here are the list's items; since no such property exists, you can use list.getElementsByTagName('li'). Next, the code expects that li.firstChild is an object with a hash property (string). That could be an a element. What happens if whitespace or a comment appears before that a? Unless the script is generating that HTML, it would be safer to use li.getElementsByTagName('a'), or at least to perform an existence check:

    var hash = li.firstChild && typeof li.firstChild.hash == "string" && li.firstChild.hash;
    if (hash) {
      hash = hash.substring(1);
    }

String.prototype.substr is nonstandard; String.prototype.substring is standardized by ECMA-262. That use of substr there won't trigger the IE bugs, but why use a nonstandard method when a normatively specified method is available and known to work more consistently? The JavaScript navigation won't work if the script fails or throws errors. There is no reason to expect that Node is present, and if it is not present, the script will throw errors and that is the author's fault. Can I ask why you chose to use JavaScript to create navigation? Is it because you have to deal with disparity between the server environments on whatwg.org and w3.org? Can you do it another way? It is getting late now. I may try to take another look at toc.js tomorrow. I'd much rather see a server-side strategy used to manage the navigation, though.
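The defensive pattern described above can be sketched as follows. The helper name and the plain mock objects are illustrative only (they stand in for real DOM nodes so the sketch runs anywhere), and this is not the actual toc.js code:

```javascript
// Hypothetical helper showing the guarded property access described above:
// only read .hash when the first child exists and actually has a string hash.
function idFromListItem(li) {
  var anchor = li && li.firstChild;
  if (anchor && typeof anchor.hash === "string") {
    // substring() is the ECMA-262 standard method; substr() is not.
    return anchor.hash.substring(1); // drop the leading "#"
  }
  return null; // whitespace, comment, or missing child: no id
}

// Plain objects stand in for DOM nodes so this runs outside a browser.
idFromListItem({ firstChild: { hash: "#introduction" } }); // "introduction"
idFromListItem({ firstChild: { nodeType: 8 } });           // null (comment node first)
```

Unlike the original snippet, this returns null instead of throwing when the first child is a comment or text node.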
I would also like to quickly mention that the Submit Review Comment feature, as well as the status feature that jumps around on scrolling, both get in the way and are distracting. I'd like to see the status as a static, non-moving element on the page. I'd like the Submit Review Comment feature not to get in the way of my browser's find feature. When the browser's find is used, the text that is highlighted is hidden under that element. This forces the user to return focus to the document, scroll down a bit past that Submit Review element, read the highlighted text to see if it appears in the context of what he was looking for, and, if it doesn't, re-focus the find feature and repeat the process. Instead, I'd prefer that feature not get in the way.

Garrett
Re: [whatwg] HTML5 (including next generation additions still in development) - Mozilla Firefox (Not Responding)
On 8/11/10 3:49 AM, Garrett Smith wrote:
> I'm running Firefox 3.6.4 on windows 7

Which has a known performance bug with a particular, reasonably rare class of DOM mutations. The only way for the spec to avoid performing such mutations is to not add the annotation boxes (which is what it will do if you ask it not to load them) or to embed them in the HTML itself instead of adding them dynamically. Or to make them overflow: visible, of course...

>> I've tried to tweak the scripts to not be quite as silly in the way they split up the work (in particular, now they won't split up the work if it's being done fast -- in browsers I tested, this reduced the problem just to the restyling being slow, in some cases taking a few seconds).
> Well its still freezing my Firefox.

Ian's change doesn't affect the issue described above, which your Firefox still suffers from. Is this the part where I urge you to try Firefox 4 beta 3 when it comes out in a day or two? ;)

> Looping through the dom on page load is something that is best avoided.

That's not an issue here.

> Most (if not all) of this stuff seems best suited for server-side logic.

That's possible; I'm assuming Ian had a reason for setting it up the way he did.

> However the freezing seems to be coming from toc.js.

Yes.

-Boris
Re: [whatwg] Please consider adding a couple more datetime input types - type=year and type=month-day
Ryosuke Niwa:
> All popular calendar systems should be supported.

Browser widgets for the datetime types may support more than the proleptic Gregorian calendar, but the spec shouldn’t. ISO 8601, or a subset thereof, should be the interchange format; clients and servers, before and after, may handle it however they deem useful. It makes sense to be able to specify other calendrical elements than just a specific day (YYYY-MM-DD or YYYY-DDD – these differ only in leap years). ISO 8601 offers several other specific items (month: YYYY-MM, week: YYYY-Www, day: YYYY-Www-D, year: YYYY) and several unspecific items (-DDD, -MM-DD, --DD, -Www, -Www-D, --D). If I remember correctly, splitting the century component from the year was only in previous versions of ISO 8601. (Almost the same goes for time of day, of course.)

The thing is, those items hardly ever exactly match their equivalent types in other calendars: the months are very proprietary (hardly even compatible with Julian), the ISO week (Mon–Sun, no splits) is not the same as the Christian/Hebrew (Sun–Sat, no splits), US (Sun–Sat, splits), TV/Muslim (Sat–Fri, no splits) week or the work week (Mon–Fri), and sometimes even the day is not the same because it doesn’t run from 00:00Z through 24:00Z. The standard doesn’t define semesters, trimesters or quarters (e.g. 3 months or 13 weeks) of any kind, and neither does it say anything about seasons, zodiacs, phases of the moon or other days not bound to a single month-day or week-day. It features two types of year with different lengths (365+1 days and 52+1 weeks) sharing a common numeric designation. The 12-hour clock does not pose any problems (if supported in browser widgets), because it maps exactly to the 24-hour clock – unless you want to be able to select ‘AM’ or ‘PM’ without giving an exact time, but why stop at that dichotomy when ‘night’, ‘daylight’, ‘afternoon’, ‘evening’, ‘dawn’ etc. may be equally viable selections?
Tantek and others provided use cases already for several of the types available in ISO 8601:

YYYY: year of birth without revealing the birthday
-MM-DD: birthday without revealing age
YYYY-MM (should this be ‘month’?): credit card expiration, calendar page selection
--D: day to show in a weekly recurring pattern, e.g. a TV or school schedule
--DD: day to show in a monthly recurring pattern, seldom
-DDD: day to show in a yearly recurring pattern, rare
YYYY-Www (should this be ‘week’?): accounting, work schedule, may be mappable to TV weeks etc.
-Www-D: repeating week dates, perhaps including some holidays
YYYY-Www-D: can usually be substituted by YYYY-MM-DD or YYYY-DDD
YYYY-DDD: can usually be substituted by YYYY-MM-DD
±…: history
±Y*: astronomy, geology, geography, palæontology etc.

To support items other than the ISO year, month, week and day, it might be useful to somehow support any arbitrary period by way of the time intervals (including the P syntax) in ISO 8601. The current TV week was, for example, 2010-W31-6T03:00/P1W or 2010-W31-6/W32-5.

User Interface

Someone mentioned Nielsen’s research results that simple textual input is preferable for dates. That is true (and mostly irrelevant here), but only when the date is known exactly, as with your own birthday; it may already fail when you have to enter tomorrow or next Thursday (and assume that “tomorrow” and “next Thu” are not valid inputs, although they should be), since you may know today’s weekday and month but not the numeric position inside the month nor the week number.
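To make the week-date items above concrete, here is a small sketch (the function name is invented) that maps an ISO 8601 week date to a calendar date, using the rule that 4 January always falls in ISO week 1:

```javascript
// Sketch: ISO 8601 week date (year, week, ISO weekday 1=Mon..7=Sun) to a
// calendar date. 4 January is always in ISO week 1, so find that week's
// Monday and count forward from there. Illustrative only.
function isoWeekDateToDate(year, week, day) {
  var MS_PER_DAY = 86400000;
  var jan4 = Date.UTC(year, 0, 4);
  var jan4Weekday = new Date(jan4).getUTCDay() || 7; // Sunday (0) becomes 7
  var week1Monday = jan4 - (jan4Weekday - 1) * MS_PER_DAY;
  return new Date(week1Monday + ((week - 1) * 7 + (day - 1)) * MS_PER_DAY);
}

// The "current TV week" example 2010-W31-6 from the text:
isoWeekDateToDate(2010, 31, 6).toISOString().slice(0, 10); // "2010-08-07"
```

This shows why week dates are a distinct item from month dates: the mapping to YYYY-MM-DD requires a calculation, not a relabeling.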
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Wed, Aug 11, 2010 at 5:04 PM, Anne van Kesteren ann...@opera.com wrote:
> On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote:
>> That's a good approach and will reduce the need for breaking backwards-compatibility. In an xml-based format that need is 0, while with a text format where the structure is ad-hoc, that need can never be reduced to 0. That's what I am concerned about and that's why I think we need a version identifier. If we end up never using/changing the version identifier, so much the better. But I'd much rather we have it now and can identify what specification a file adheres to than not be able to do so later.
> XML is also text-based. ;-)

I mean unstructured text. ;-)

> But more seriously, if we ever need to make changes that would completely break backwards compatibility we should just use a new format rather than fit it into an existing one.

That's exactly the argument I am using for why WebSRT should be a new format and not take over the SRT space. They are different enough not to be just versions of each other. That's actually what I care about a lot more than a version field.

> That is the approach we have for most formats (and APIs) on the web (CSS, HTML, XMLHttpRequest) and so far a version identifier need (or need for a replacement) has not yet arisen.

There are Web formats with a version attribute, such as Atom and RSS, and even HTTP has a version number. Also, I can see that structured formats with a clear path for how extensions would be included may not need such a version attribute. WebSRT is not such a structured format, which is what makes all the difference. For example, you simply cannot put a new element outside the root element in XML, but you can easily put a new element anywhere in WebSRT - which might actually make a lot of sense if you think e.g. about adding SVG and CSS inline in future.
> Might be worth reading through some of: http://www.w3.org/2002/09/wbs/40318/issues-4-84-objection-poll/results

I guess you mostly wanted me to read http://berjon.com/blog/2009/12/xmlbp-naive-versioning.html . :-) It's a nice discussion with some good experiences. Interesting that we need quirks mode to deal with versioning issues. It doesn't take into account good practice in software development, though, where there is a minor version number and a major version number. A change of the minor version number is ignored by apps that need to display something - it just gives a hint that new features were introduced that shouldn't break anything. It's basically metadata to give a hint to applications where it really matters, e.g. if an application relies on new features being available. A change of the major version number, however, essentially means it's a new format and thus breaks existing stuff to allow the world to move forward within the same namespace and experience framework.

But let's get this resolved. I don't care enough about this to make a fuss. So ... if we do everything possible to make WebSRT flexible for future changes (which is what Philip proposed) and agree that if we cannot extend WebSRT in a backwards-compatible manner, we will create a new format, I can live without a version attribute. I am only a little wary of this, because already we are trying to make SRT and WebSRT the same format when there is no compatibility (see below).

>>> That would make text/srt and text/websrt synonymous, which is kind of pointless.
>> No, it's only pointless if you are a browser vendor. For everyone else it is a huge advantage to be able to choose between a guaranteed simple format and a complex format with all the bells and whistles.
> But it is not complex at all and everyone else supports most of the extensions the WebSRT format has.
All of the WebSRT extensions that do not exist in {basic SRT, b, i} are not supported by anyone yet. Existing SRT authoring tools, media players, transcoding tools, etc. do not support the cue settings (see http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-cue-settings), or parsing of random text in the cues, or the voice markers. So, I disagree with "everyone else supports most of the extensions of the WebSRT format".

Also, what I mean with the word "complex" is actually a good thing: a format that supports lots of requirements that go beyond the basic ones. Thus, it's actually a good thing to have a simple format (i.e. SRT) and a complex (maybe rather: rich? capable?) format (i.e. WebSRT).

Cheers, Silvia.
Re: [whatwg] select element should have a required attribute
2010-08-10 21:25 EEST: Tab Atkins Jr.:
> On Tue, Aug 10, 2010 at 11:12 AM, Mike Wilcox m...@mikewilcox.net wrote:
>> This seems like the ideal situation to use a placeholder attribute:
>>
>>   <select required="true" placeholder="Select an item...">
>>     <option value="Foo">Foo</option>
>>     <option value="Bar">Bar</option>
>>     <option value="">None</option>
>>   </select>
>
> Almost, but not quite. Yes, the value used in this situation is essentially a placeholder value - it performs the same function as input placeholder. However, an input type=text placeholder=foo required will fail validation if the user doesn't interact with it, while a similar select will not at the moment (it will just submit the first value). It could be possible to define the interaction of select, @placeholder, and @required in such a way that it works intelligently, so that the select with a placeholder fails validation if the user doesn't interact with it, but that may be too much magic.

I would prefer something like this:

  <select pattern="(?!myvalue).*">
    <option value="myvalue">Select an item...</option>
    <option ...></option>
    ...
  </select>

That is, the author should explicitly specify that the item with the special value will not be accepted. Stuff I don't want to see (combined with @required):

- first option is always special
- empty string as the value is special
- option without a value is special

If there needs to be some easier way to specify this than the pattern, how about @disallow=xyz?

-- Mikko
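The behaviour being debated could be pictured with a small script sketch: a select counts as invalid while its current value matches an explicitly disallowed placeholder value. The data-disallow attribute, the function name, and the mock object below are all invented for illustration; none of them are part of any spec or proposal text:

```javascript
// Sketch only: a select "satisfies required" unless its value equals an
// explicitly declared disallowed placeholder value (hypothetical attribute).
function selectSatisfiesRequired(select) {
  var disallowed = select.getAttribute("data-disallow");
  return disallowed === null || select.value !== disallowed;
}

// A mock object stands in for a real <select> so this runs without a DOM.
var mockSelect = {
  value: "myvalue",
  getAttribute: function (name) {
    return name === "data-disallow" ? "myvalue" : null;
  }
};
selectSatisfiesRequired(mockSelect); // false until the user picks a real option
```

This matches the "author explicitly specifies the rejected value" approach rather than making the first or empty option implicitly special.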
Re: [whatwg] video await a stable state in resource selection (Was: Race condition in media load algorithm)
CC Hixie, question below.

On Tue, 10 Aug 2010 18:39:04 +0200, Boris Zbarsky bzbar...@mit.edu wrote:
> On 8/10/10 4:40 AM, Philip Jägenstedt wrote:
>> Because the parser can't create a state which the algorithm doesn't handle. It always first inserts the video element, then the source elements in the order they should be evaluated. The algorithm is written in such a way that the overall result is the same regardless of whether it is invoked/continued on each inserted source element or after the video element is closed.
> Ah, the waiting state, etc?

Yes, in the case of the parser inserting source elements that fail one of the tests (no src, wrong type, wrong media) the algorithm will end up at step 6.21, waiting. It doesn't matter if all sources are available when the algorithm is first invoked or if they trickle in, be that from the parser or from scripts.

> Why does the algorithm not just reevaluate any sources after the newly-inserted source instead?

Because if a source failed after network access (404, wrong MIME type, etc.) then we'd have to perform that network access again and again for each modification. More on that below.

>> However, scripts can see the state at any point, which is why it needs to be the same in all browsers.
> I'm not sure which state you mean here.

For example, networkState can be NETWORK_NO_SOURCE, NETWORK_EMPTY or NETWORK_LOADING depending on which steps you've run. Silvia Pfeiffer found inconsistencies between browsers because of this; see http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-July/027284.html It's quite serious because NETWORK_EMPTY is used as a condition in many places of the spec, so this absolutely must be consistent between browsers.

> Because changes to the set of source elements do not restart the resource selection algorithm, right? Why don't they, exactly? That seems broken to me, from the POV of how the rest of the DOM generally works (except as required by backward compatibility considerations)...
The resource selection is only started once, typically when the src attribute is set (by parser or script) or when the first source element is inserted. If it ends up in step 21, waiting, inserting another source element may cause it to continue at step 22.

> Right, ok.

Restarting the algorithm on any modification of source elements would mean retrying sources that have previously failed due to network errors or incorrect MIME type again and again, wasting network resources. Instead, the algorithm just keeps its state and waits for more source elements to try.

> Well, the problem is that it introduces hysteresis into the DOM.

Why is this a smaller consideration than the other, in the edge case when someone inserts sources in reverse order and slowly (off the event loop)? The algorithm has been very stateful since I first implemented it and I always considered the sync/async split to be precisely for that reason, to be more tolerant of the order of DOM modification. I'll have to let Hixie answer why this specific trade-off was made. That is, why do we only consider sources inserted after the |pointer| instead of all newly inserted sources? Otherwise the pointer could potentially reach the same source element twice, with the aforementioned problems with failing after network access.

>> I'm not sure what you mean by hysteresis.
> http://en.wikipedia.org/wiki/Hysteresis Specifically, that the state of the page depends not only on the current state of the DOM but also on the path in state space that the page took to get there. Or in other words, that inserting two source elements does different things depending on whether you do appendChild(a); appendChild(b) or appendChild(b); insertBefore(a, b), even though the resulting DOM is exactly the same. Or in your case, the fact that the ordering of the setAttribute and insertChild calls matters, say.
> Such situations, which introduce order-dependency on DOM operations, are wonderful sources of frustration for web developers, especially if libraries that abstract away the DOM manipulation are involved (so the web developer can't even change the operation order).

OK, perhaps I should take this more seriously. Making the whole algorithm synchronous probably isn't a brilliant idea unless we can also do away with all of the state it keeps (i.e. the hysteresis). One way would be to introduce a magic flag on all source elements to indicate that they have already failed. This would be cleared whenever src, type or media is modified. Another is to cache 404 responses and the MIME types of rejected resources, but I think that's a bit overkill. Do you have any specific ideas?

> I have a really hard time believing that you trigger resource selection when the video is inserted into the document and don't retrigger it afterward, given that... do you?

To the best of my knowledge we do exactly what the spec says, apart from the
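One way to picture the "magic flag" idea mentioned in this exchange: if each source remembers that it has already failed, the whole selection could be re-run on every DOM mutation without refetching known-bad sources, removing the order-dependency. Everything below (the function name, the tryFetch callback, the plain-object sources) is an invented sketch, not the spec's algorithm:

```javascript
// Toy model of resource selection with a per-source "failed" flag: re-running
// the whole selection never refetches a source that already failed (404,
// wrong MIME type), so there is no need for a stateful pointer.
function selectResource(sources, tryFetch) {
  for (var i = 0; i < sources.length; i++) {
    var source = sources[i];
    if (source.failed) continue;       // would be cleared if src/type/media changes
    if (tryFetch(source)) return source;
    source.failed = true;              // remember the failure
  }
  return null;                         // no candidate yet: wait for more sources
}
```

Running this twice over the same list performs only one network attempt for the broken source, because the flag persists across runs.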
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Wed, 11 Aug 2010 10:30:23 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote:
> On Wed, Aug 11, 2010 at 5:04 PM, Anne van Kesteren ann...@opera.com wrote:
>> That is the approach we have for most formats (and APIs) on the web (CSS, HTML, XMLHttpRequest) and so far a version identifier need (or need for a replacement) has not yet arisen.
> There are Web formats with a version attribute, such as Atom and RSS, and even HTTP has a version number.

None of these have really executed a successful version strategy, though. Syndication in particular is quite bad; we should learn from that. See e.g. http://diveintomark.org/archives/2004/02/04/incompatible-rss

> Also, I can see that structured formats with a clear path for how extensions would be included may not need such a version attribute. WebSRT is not such a structured format, which is what makes all the difference. For example, you simply cannot put a new element outside the root element in XML, but you can easily put a new element anywhere in WebSRT - which might actually make a lot of sense if you think e.g. about adding SVG and CSS inline in future.

There are all kinds of ways we could address this. For instance, we could add a feature that makes a line ignored and use that in the future for new features. While players are transitioning to WebSRT they will ensure that they do not break with future versions of the format. There might be enough extensibility in the current WebSRT parsing rules for this; I have not checked.

> It doesn't take into account good practice in software development, though, where there is a minor version number and a major version number. A change of the minor version number is ignored by apps that need to display something - it just gives a hint that new features were introduced that shouldn't break anything. It's basically metadata to give a hint to applications where it really matters, e.g. if an application relies on new features being available.
> A change of major version number, however, essentially means it's a new format and thus breaks existing stuff to allow the world to move forward within the same namespace and experience framework.

What works for software products does not work for formats with universal deployment on which we want interoperability between various vendors. They are very distinct.

>> But it is not complex at all and everyone else supports most of the extensions the WebSRT format has.
> All of the WebSRT extensions that do not exist in {basic SRT, b, i} are not supported by anyone yet. Existing SRT authoring tools, media players, transcoding tools, etc. do not support the cue settings (see http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-cue-settings), or parsing of random text in the cues, or the voice markers. So, I disagree with "everyone else supports most of the extensions of the WebSRT format".

Do they throw an error or do they just ignore the settings? If the latter, it does not seem like a problem. If the former, authors will probably not use these features for a while until they are better supported.

> Also, what I mean with the word "complex" is actually a good thing: a format that supports lots of requirements that go beyond the basic ones. Thus, it's actually a good thing to have a simple format (i.e. SRT) and a complex (maybe rather: rich? capable?) format (i.e. WebSRT).

I don't think so. It just makes things more complex for authors (they must learn two formats and convert formats (i.e. change the MIME type) in order to use new features, which could be as simple as a ruby fragment for some Japanese track), more complex for implementors (who need two separate implementations so as not to encourage authors to use features of the more complex format in the less complex one), more complex for conformance checkers (more code), etc. Seems highly suboptimal to me.

-- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] select element should have a required attribute
On Wed, Aug 11, 2010 at 2:03 AM, Mikko Rantalainen mikko.rantalai...@peda.net wrote:
> Stuff I don't want to see (combined with @required):
> - first option is always special
> - empty string as the value is special
> - option without a value is special

Do you consider it a problem for input type=text required to treat the empty string as special? If yes, do you think @required should be removed completely? If no, why do you consider it a problem for select?

/ Jonas
Re: [whatwg] select element should have a required attribute
On Wed, Aug 11, 2010 at 9:31 PM, Jonas Sicking jo...@sicking.cc wrote:
> On Wed, Aug 11, 2010 at 2:03 AM, Mikko Rantalainen mikko.rantalai...@peda.net wrote:
>> Stuff I don't want to see (combined with @required):
>> - first option is always special
>> - empty string as the value is special
>> - option without a value is special
> Do you consider it a problem for input type=text required to treat empty string special? If yes, do you think @required should be removed completely? If no, why do you consider it a problem for select?

I've been following this thread for only a short time, but I too cannot see why one would not have @required on a select element. Having a blank-value option within a select is a mechanism used by various websites to prompt the user to select a value from the list. Personally, I see input and select elements as very much aligned, as they are both form elements; therefore I would have thought that if behaviour is being added to input then similar behaviour (where applicable) should be added to select elements.

Just my 2 cents...

Richard.
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Wed, 11 Aug 2010 13:35:30 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote:
> On Wed, Aug 11, 2010 at 7:31 PM, Anne van Kesteren ann...@opera.com wrote:
>> While players are transitioning to WebSRT they will ensure that they do not break with future versions of the format.
> That's impossible, since we do not know what future versions will look like and what features we may need.

If that is impossible, it would be impossible for HTML and CSS too. And clearly it is not.

> I'm pretty sure that several will break. We cannot just test a handful of available applications and, if they don't break, assume none will. In fact, all existing applications that get loaded with a WebSRT file with extended features will display text with stuff that is not expected - in particular if the metadata case is used. And wrong rendering is bad, e.g. if it's part of a production process, burnt onto the video, and shipped to hearing-impaired customers. Or stored in an archive.

Sure, that's why the tools should be updated to support the standard format rather than each having their own variant of SRT. (And if they really just take in text like that, they should at least run some kind of validation so not all kinds of garbage can get in.)

>> I don't think so. It just makes things more complex for authors (learn two formats,
> I see that as an advantage: I can learn the simple format and be off to a running start immediately. Then, when I find out that I need more features, I can build on top of already existing knowledge for the richer format and can convert my old files through a simple renaming of the resources.

Or you could learn the simple format from a tutorial that only teaches that, and when you see someone else using more complex features you can just copy and paste them and use them directly. This is pretty much how the web works.

>> have to convert formats (i.e. change mime) in order to use new features (which could be as simple as a ruby fragment for some Japanese track)
> If I know from the start that I need these features, I will immediately learn WebSRT.

But you don't.

>> , more complex for implementors (need two separate implementations so as not to encourage authors to use features of the more complex one in the less complex one), more complex for conformance checkers (need more code), etc. Seems highly suboptimal to me.
> That's already part of Ian's proposal: it already supports multiple different approaches to parsing cues. No extra complexity here.

Actually, that is not true. There is only one approach to parsing in Ian's proposal.

> My theory is: we only implement support for WebSRT in the browser - that it happens to also support SRT is a positive side effect. It works for the Web - and it works for the existing SRT communities and platforms. They know they have to move to WebSRT in the long run, but right now they can get away with simple SRT support and still deliver for the Web. And they have a growth path into a new file format that provides richer features.

This is the proposal. That they are the same format should not matter.

-- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote:
> On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt phil...@opera.com wrote:
>> On Tue, 10 Aug 2010 01:34:02 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote:
>>> On Tue, Aug 10, 2010 at 12:04 AM, Philip Jägenstedt phil...@opera.com wrote:
>>>> On Sat, 07 Aug 2010 09:57:39 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote:
>>>> I guess this is in support of Henri's proposal of parsing the cue using the HTML fragment parser (same as innerHTML)? That would be easy to implement, but how do we then mark up speakers? Using <span class="narrator"></span> around each cue is very verbose. HTML isn't very good for marking up dialog, which is quite a limitation when dealing with subtitles...
>>> I actually think that the span @class mechanism is much more flexible than what we have in WebSRT right now. If we want multiple speakers to be able to speak in the same subtitle, that's not possible in WebSRT. It's a little more verbose in HTML, but not massively. We might be able to add a special markup similar to the [timestamp] markup that Hixie introduced for karaoke. This is beyond the innerHTML parser and I am not sure if it breaks it. But if it doesn't, then maybe we can also introduce a [voice] marker to be used similarly?
>> An HTML parser parsing <1> or <00:01:30> will produce text nodes "1" and "00:01:30". Without having read the HTML parsing algorithm, I guess that elements need to begin with a letter or similar. So, it's not possible to (ab)use the HTML parser to handle inner timestamps or numerical voices; we'd have to replace those with something else, probably more verbose.
> I have checked the parsing spec and http://www.whatwg.org/specs/web-apps/current-work/#tag-open-state indeed implies that a tag starting with a number is a parse error. Both the timestamps and the voice markers thus seem to be problems when going with an innerHTML parser. Is there a way to resolve this?
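The tokenizing difference at issue can be illustrated with a toy matcher (this is not the WebSRT parser, and the cue text is made up): a purpose-built cue parser can accept a leading numeric marker like <1>, while the HTML tag-open state rejects it as a parse error and emits it as text.

```javascript
// Toy illustration only: extract a leading numeric voice marker such as "<1>"
// from a cue. An HTML fragment parser would instead treat "<1>" as plain
// text, which is the incompatibility discussed in this thread.
function parseVoice(cueText) {
  var match = /^<(\d+)>\s*/.exec(cueText);
  return match
    ? { voice: match[1], text: cueText.slice(match[0].length) }
    : { voice: null, text: cueText };
}

parseVoice("<1> Hello there!"); // { voice: "1", text: "Hello there!" }
```

Named voices such as <philip> would need only a change of the character class, which is part of why arbitrary voices are attractive for extensibility.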
I mean: I'd quite happily drop the voice markers for a span @class but I am not sure what to do about the timestamps. We could do what I did in WMML and introduce a t element with the timestamp as a @at attribute, but that is again more verbose. We could also introduce an @at attribute in span which would then at least end up in the DOM and can be dealt with specially. What should numerical voices be replaced with? Personally I'd much rather write philip and silvia to mark up a conversation between us two, as I think it'd be quite hard to keep track of the numbers if editing subtitles with many different speakers. However, going with that and using an HTML parser is quite a hack. Names like mark and li may already have special parsing rules or default CSS. Going with HTML in the cues, we either have to drop voices and inner timestamps or invent new markup, as HTML can't express either. I don't think either of those are really good solutions, so right now I'm not convinced that reusing the innerHTML parser is a good way forward. Think for example about the case where we had a requirement that a double newline starts a new cue, but now we want to introduce a means where the double newline is escaped and can be made part of a cue. Other formats keep track of their version, such as MS Word files. It is to be hoped that most new features can be introduced without breaking backwards compatibility and we can write the parsing requirements such that certain things will be ignored, but in and of itself, WebSRT doesn't provide for this extensibility. Right now, there is for example extensibility with the WebSRT settings parsing (that's the stuff behind the timestamps) where further setting:value settings can be introduced. 
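The setting:value extensibility Silvia mentions (the settings that trail the timestamps) can be sketched as a forgiving parser that silently skips anything it doesn't recognize, which is exactly what leaves room for future settings. This is an illustration only, not the spec's algorithm, and the set of known setting names here is made up for the example:

```javascript
// Sketch of forgiving parsing for "setting:value" pairs after the
// timestamps on a cue's timing line. Unknown settings are ignored,
// which is what makes the syntax forward-extensible. The "known"
// names below are hypothetical, not taken from the WebSRT draft.
function parseCueSettings(trailer) {
  const known = new Set(['line', 'position', 'size', 'align']);
  const settings = {};
  for (const token of trailer.trim().split(/\s+/)) {
    if (!token) continue;
    const i = token.indexOf(':');
    if (i === -1) continue; // not a setting; skip silently
    const name = token.slice(0, i);
    const value = token.slice(i + 1);
    if (known.has(name)) settings[name] = value;
    // unknown names fall through and are dropped without error
  }
  return settings;
}
```

A newer file using a setting an old parser doesn't know simply loses that setting rather than failing to parse.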
But for example the introduction of new cue identifiers (that's the marker at the start of a cue) would be difficult without a version string, since anything that doesn't match the given list will just be parsed as a cue-internal tag and thus end up as part of the cue text where plain text parsing is used. The bug I filed suggested allowing arbitrary voices, to simplify the parser and to make future extensions possible. For a web format I think this is a better approach than versioning. I haven't done a full review of the parser, but there are probably more places where it could be more forgiving so as to allow future tweaking. That's a good approach and will reduce the need for breaking backwards-compatibility. In an xml-based format that need is 0, while with a text format where the structure is ad-hoc, that need can never be reduced to 0. That's what I am concerned about and that's why I think we need a version identifier. If we end up never using/changing the version identifier, the better so. But I'd much rather we have it now and can identify what specification a file adheres to than not being able to do so later. Perhaps I'm too
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Wed, Aug 11, 2010 at 9:49 PM, Anne van Kesteren ann...@opera.com wrote: On Wed, 11 Aug 2010 13:35:30 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: On Wed, Aug 11, 2010 at 7:31 PM, Anne van Kesteren ann...@opera.com wrote: While players are transitioning to WebSRT they will ensure that they do not break with future versions of the format. That's impossible, since we do not know what future versions will look like and what features we may need. If that is impossible it would be impossible for HTML and CSS too. And clearly it is not. HTML and CSS have predefined structures within which their languages grow and are able to grow. WebSRT has newlines to structure the format, which is clearly not very useful for extensibility. No matter how we turn this, the xml background of HTML and the name-value background of CSS provide them with in-built extensibility, which WebSRT does not have. I'm pretty sure that several will break. We cannot just test a handful of available applications and if they don't break assume none will. In fact, all existing applications that get loaded with a WebSRT file with extended features will display text with stuff that is not expected - in particular if the metadata case is used. And wrong rendering is bad, e.g. if it's part of a production process, burnt onto the video, and shipped to hearing-impaired customers. Or stored in an archive. Sure, that's why the tools should be updated to support the standard format rather than each having their own variant of SRT. They don't have their own variant of SRT - they only have their own parsers. Some will tolerate crap at the end of the -- line. Others won't. That's no break of conformance to the basic spec as given in http://en.wikipedia.org/wiki/SubRip#SubRip_text_file_format . They all interoperate on the basic SRT format. But they don't interoperate on the WebSRT format. That's why WebSRT has to be a new format. 
(And if they really just take in text like that they should at least run some kind of validation so not all kinds of garbage can get in.) That's not a requirement of the spec. Its requirement is to render whatever characters are given in cues. That's why it is so simple. I don't think so. It just makes things more complex for authors (learn two formats, I see that as an advantage: I can learn the simple format and be off to a running start immediately. Then, when I find out that I need more features, I can build on top of already existing knowledge for the richer format and can convert my old files through a simple renaming of the resources. Or you could learn the simple format from a tutorial that only teaches that and when you see someone else using more complex features you can just copy and paste them and use them directly. This is pretty much how the web works. Sure. All I need to do is rename the file. Not much trouble at all. Better than believing I can just copy stuff from others since it's apparently the same format and then it breaks the SRT environment that I already have and that works. have to convert formats (i.e. change mime) in order to use new features (which could be as simple as a ruby fragment for some Japanese track) If I know from the start that I need these features, I will immediately learn WebSRT. But you don't. Why? If I write Japanese subtitles and my tutorial tells me they are not supported in SRT, but only in WebSRT, then I go for WebSRT. Done. , more complex for implementors (need two separate implementations as to not encourage authors to use features of the more complex one in the less complex one), more complex for conformance checkers (need more code), etc. Seems highly suboptimal to me. That's already part of Ian's proposal: it already supports multiple different approaches of parsing cues. No extra complexity here. Actually that is not true. There is only one approach to parsing in Ian's proposal. 
At the moment, cues can have one of two different types of content (see http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#syntax-0 ): 6. The cue payload: either WebSRT cue text (http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-cue-text) or WebSRT metadata text (http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-metadata-text). So that means in essence two different parsers. My theory is: we only implement support for WebSRT in the browser - that it happens to also support SRT is a positive side effect. It works for the Web - and it works for the existing SRT communities and platforms. They know they have to move to WebSRT in the long run, but right now they can get away with simple SRT support and still deliver for the Web. And they have a growth path into a new file format that provides richer features. This is the proposal. That they are the same format should not matter. It matters to other
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt phil...@opera.comwrote: On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt phil...@opera.com wrote: I have checked the parse spec and http://www.whatwg.org/specs/web-apps/current-work/#tag-open-state indeed implies that a tag starting with a number is a parse error. Both, the timestamps and the voice markers thus seem problems when going with an innerHTML parser. Is there a way to resolve this? I mean: I'd quite happily drop the voice markers for a span @class but I am not sure what to do about the timestamps. We could do what I did in WMML and introduce a t element with the timestamp as a @at attribute, but that is again more verbose. We could also introduce an @at attribute in span which would then at least end up in the DOM and can be dealt with specially. What should numerical voices be replaced with? Personally I'd much rather write philip and silvia to mark up a conversation between us two, as I think it'd be quite hard to keep track of the numbers if editing subtitles with many different speakers. However, going with that and using an HTML parser is quite a hack. Names like mark and li may already have special parsing rules or default CSS. In HTML it is span class=philip../span and span class=silvia.../span. I don't see anything wrong with that. And it's only marginally longer than philip ... /philip and silvia.../silvia. Going with HTML in the cues, we either have to drop voices and inner timestamps or invent new markup, as HTML can't express either. I don't think either of those are really good solutions, so right now I'm not convinced that reusing the innerHTML parser is a good way forward. I don't see a need for the voices - they already have markup in HTML, see above. But I do wonder about the timestamps. 
I'd much rather keep the innerHTML parser if we can, but I don't know enough about how the timestamps could be introduced in a non-breakable manner. Maybe with a data- attribute? Maybe span data-t=00:00:02.100.../span? Think for example about the case where we had a requirement that a double newline starts a new cue, but now we want to introduce a means where the double newline is escaped and can be made part of a cue. Other formats keep track of their version, such as MS Word files. It is to be hoped that most new features can be introduced without breaking backwards compatibility and we can write the parsing requirements such that certain things will be ignored, but in and of itself, WebSRT doesn't provide for this extensibility. Right now, there is for example extensibility with the WebSRT settings parsing (that's the stuff behind the timestamps) where further setting:value settings can be introduced. But for example the introduction of new cue identifiers (that's the marker at the start of a cue) would be difficult without a version string, since anything that doesn't match the given list will just be parsed as a cue-internal tag and thus end up as part of the cue text where plain text parsing is used. The bug I filed suggested allowing arbitrary voices, to simplify the parser and to make future extensions possible. For a web format I think this is a better approach than versioning. I haven't done a full review of the parser, but there are probably more places where it could be more forgiving so as to allow future tweaking. That's a good approach and will reduce the need for breaking backwards-compatibility. In an xml-based format that need is 0, while with a text format where the structure is ad-hoc, that need can never be reduced to 0. That's what I am concerned about and that's why I think we need a version identifier. If we end up never using/changing the version identifier, the better so. 
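The timestamps under discussion (the 00:00:02.100-style markers) reduce to simple time offsets once parsed, whatever markup ends up carrying them. A minimal sketch of the conversion, purely illustrative and far less forgiving than a spec-grade character-by-character parser:

```javascript
// Sketch: convert an SRT/WebSRT-style timestamp ("hh:mm:ss.mmm", hours
// optional, comma accepted as the SRT decimal separator) into seconds.
// Returns null for anything that doesn't match. Illustrative only.
function parseTimestamp(ts) {
  const m = /^(?:(\d+):)?(\d{1,2}):(\d{2})[.,](\d{3})$/.exec(ts.trim());
  if (m === null) return null;
  const hours = m[1] ? parseInt(m[1], 10) : 0;
  return hours * 3600 + parseInt(m[2], 10) * 60 +
         parseInt(m[3], 10) + parseInt(m[4], 10) / 1000;
}
```

Whether the offset arrives via a hypothetical data-t attribute, a t element, or inline cue markup, the value a renderer needs is just this number.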
But I'd much rather we have it now and can identify what specification a file adheres to than not being able to do so later. Perhaps I'm too influenced by HTML and its failed attempts at versioning, but I think that if you want to know which version of a spec a document is written against, you can run it through a parser for each version. This doesn't tell you the author intent, but I'm not sure that's very interesting to know. If the author thinks it's important, perhaps it can be put in a comment in the header. I was most concerned about non-backwards-compatible changes here, but let's not repeat the discussion I had with Anne. Let's rather focus on making sure we have some means of extending WebSRT in future, should the need arise. On the other hand, keeping the same extension and (unregistered) MIME type as SRT has plenty of benefits, such as immediately being able to use existing SRT files in browsers without changing their file extension or MIME type. There is no harm for browsers to accept both MIME types if they are sure they can parse
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Wed, 11 Aug 2010 15:09:34 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: HTML and CSS have predefined structures within which their languages grow and are able to grow. WebSRT has newlines to structure the format, which is clearly not very useful for extensibility. No matter how we turn this, the xml background of HTML and the name-value background of CSS provide them with in-built extensibility, which WebSRT does not have. The parser has the bad cue loop concept for ignoring supposedly bogus lines. Seems extensible to me. Sure, that's why the tools should be updated to support the standard format rather than each having their own variant of SRT. They don't have their own variant of SRT - they only have their own parsers. That comes down to the same thing in my opinion. This is like saying browsers did not all have their own variant of HTML4. Some will tolerate crap at the end of the -- line. Others won't. That's no break of conformance to the basic spec as given in http://en.wikipedia.org/wiki/SubRip#SubRip_text_file_format . They all interoperate on the basic SRT format. But they don't interoperate on the WebSRT format. That's why WebSRT has to be a new format. By that reasoning HTML5 would have had to be a new format too. And CSS 2.1 as opposed to CSS 2, etc. (And if they really just take in text like that they should at least run some kind of validation so not all kinds of garbage can get in.) That's not a requirement of the spec. Its requirement is to render whatever characters are given in cues. That's why it is so simple. But it is not so simple because various extensions are out there in the wild and are used so the concerns you have with respect to WebSRT already apply. Sure. All I need to do is rename the file. Not much trouble at all. Better than believing I can just copy stuff from others since it's apparently the same format and then it breaks the SRT environment that I already have and that works. 
At least with the copy approach you would still see something in your SRT environment. The ruby bits would just be ignored or some such. That's already part of Ian's proposal: it already supports multiple different approaches of parsing cues. No extra complexity here. Actually that is not true. There is only one approach to parsing in Ian's proposal. At the moment, cues can have one of two different types of content: (see http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#syntax-0 [...] So that means in essence two different parsers. Per the parser section there is only one. See the end of http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#parsing-0 -- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Wed, Aug 11, 2010 at 11:45 PM, Anne van Kesteren ann...@opera.com wrote: On Wed, 11 Aug 2010 15:09:34 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: HTML and CSS have predefined structures within which their languages grow and are able to grow. WebSRT has newlines to structure the format, which is clearly not very useful for extensibility. No matter how we turn this, the xml background of HTML and the name-value background of CSS provide them with in-built extensibility, which WebSRT does not have. The parser has the bad cue loop concept for ignoring supposedly bogus lines. Seems extensible to me. Hmm, that's for ignoring lines that don't match the -- pattern. It could work: ignore anything that's inside a WebSRT file and not a cue. I tend to think of caption files as composed of the following broad components:
* header-data that is information that applies to the complete file, which tends to be setup data (such as language, charset, stylesheet link etc) and metadata (name-value pairs)
* a list of cues, which have their own structure:
** start and end time
** per-cue header-type data such as more setup data, positioning, text size etc
** the cue text itself (in various structured formats, potentially with time markers for roll-on presentation)
* comments that can be made at any location
As long as we can make sure we're extensible within these broader areas, I *think* we should be ok. Sure, that's why the tools should be updated to support the standard format rather than each having their own variant of SRT. They don't have their own variant of SRT - they only have their own parsers. That comes down to the same thing in my opinion. This is like saying browsers did not all have their own variant of HTML4. From an author's point of view, they were not writing multiple different Web pages, but only trying to accommodate the quirks of each browser in one page. So, no, I wouldn't regard them as having different versions of HTML4. 
Some will tolerate crap at the end of the -- line. Others won't. That's no break of conformance to the basic spec as given in http://en.wikipedia.org/wiki/SubRip#SubRip_text_file_format . They all interoperate on the basic SRT format. But they don't interoperate on the WebSRT format. That's why WebSRT has to be a new format. By that reasoning HTML5 would have had to be a new format too. And CSS 2.1 as opposed to CSS 2, etc. They interoperate by their sheer structure. It has been made sure that old browsers will ignore the new additions because there is a structured means for them to grow. So, no, I believe they are different cases. (And if they really just take in text like that they should at least run some kind of validation so not all kinds of garbage can get in.) That's not a requirement of the spec. Its requirement is to render whatever characters are given in cues. That's why it is so simple. But it is not so simple because various extensions are out there in the wild and are used so the concerns you have with respect to WebSRT already apply. There are two versions out there: the plain ones without markup and the ones with i,b,u and font. Nothing else exists. Those could be called quirks of the same format. I would prefer if SRT meant only the stuff without any markup at all, which is supported by everyone who supports SRT. The thing is, WebSRT isn't even backwards compatible with the quirky SRT extension: it doesn't support u and font. So, it's neither backwards nor forwards compatible. Sure. All I need to do is rename the file. Not much trouble at all. Better than believing I can just copy stuff from others since it's apparently the same format and then it breaks the SRT environment that I already have and that works. At least with the copy approach you would still see something in your SRT environment. The ruby bits would just be ignored or some such. 
Preferably, I would be using a captioning application which will make me aware that I am just now adding features that the format that I used for saving doesn't support. So it gives me the choice of either losing those features or upgrading to the better format. It's what all text processors do, too, so people are used to it. And they know to stick to the more capable formats. That's already part of Ian's proposal: it already supports multiple different approaches of parsing cues. No extra complexity here. Actually that is not true. There is only one approach to parsing in Ian's proposal. At the moment, cues can have one of two different types of content: (see http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#syntax-0 [...] So that means in essence two different parsers. Per the parser section there is only one. See the end of http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#parsing-0 Yeah, I think there's something missing in the spec. Cheers, Silvia.
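Silvia's component breakdown earlier in the thread (optional header material, a list of cues, comments anywhere) can be sketched as a splitter that also shows the "bad cue" tolerance Anne points to: blocks without a timing line are simply skipped. A hypothetical helper, not the spec's parser:

```javascript
// Sketch of the broad SRT/WebSRT file structure under discussion:
// cues are separated by blank lines; each cue has an optional
// identifier line, a "-->" timing line, and payload text. Blocks with
// no timing line (headers, comments, garbage) are ignored, mirroring
// the parser's bad-cue tolerance. Not the spec algorithm.
function splitCues(text) {
  const blocks = text.replace(/\r\n/g, '\n').split(/\n{2,}/);
  const cues = [];
  for (const block of blocks) {
    const lines = block.split('\n').filter(l => l !== '');
    const t = lines.findIndex(l => l.includes('-->'));
    if (t === -1) continue; // no timing line: skip the whole block
    cues.push({
      id: lines.slice(0, t).join('\n') || null,
      timing: lines[t],
      payload: lines.slice(t + 1).join('\n'),
    });
  }
  return cues;
}
```

The extensibility argument in the thread amounts to: anything this loop skips today is room for new top-level constructs tomorrow.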
Re: [whatwg] HTML5 (including next generation additions still in development) - Mozilla Firefox (Not Responding)
On 8/11/10, Boris Zbarsky bzbar...@mit.edu wrote: On 8/11/10 3:49 AM, Garrett Smith wrote: I'm running Firefox 3.6.4 on windows 7 Which has a known performance bug with a particular reasonably rare class of DOM mutations. The only way for the spec to avoid performing such mutations is to not add the annotation boxes (which is what it will do if you ask it not to load them) or to embed them in the HTML itself instead of adding them dynamically. Or to make them overflow:visible, of course... I've tried to tweak the scripts to not be quite as silly in the way they split up the work (in particular, now they won't split up the work if it's being done fast -- in browsers I tested, this reduced the problem just to the restyling being slow, in some cases taking a few seconds). Well it's still freezing my Firefox. Ian's change doesn't affect the issue described above, which your Firefox still suffers from. Is this the part where I urge you to try Firefox 4 beta 3 when it comes out in a day or two? ;) It would have been more helpful to explain, if you can, the cause of the slowness in Firefox. I can run the debugger and step through it to see which part is taking time and then explain that. There is a slight chance that I'll actually do that, depending on time. Looping through the dom on page load is something that is best avoided. That's not an issue here. No. Actually it *is* an issue here; that issue is just massively dwarfed by the other issue behind door #2. Most (if not all) of this stuff seems best suited for server-side logic. That's possible; I'm assuming Ian had a reason for setting it up the way he did. Possible where? On w3c server or whatwg server? [...] Garrett
Re: [whatwg] Iframe dimensions
On 11.08.2010 00:24, Ian Hickson wrote: On Mon, 5 Jul 2010, Markus Ernst wrote: [...] Example: http://test.rapid.ch/de/haendler-schweiz/iseki.html (This is under construction.) As a workaround to the height problem, I applied a script that adjusts the iframe height to the available height in the browser window. But of course the user experience would be more consistent if the page could behave like a single page, with only one scrollbar at the right of the browser window. If you control both pages and can't use seamless, you can use postMessage() to negotiate a size. On the long term, I expect we'll make seamless work with CORS somehow. I'm waiting until we properly understand how CORS is used in the wild before adding it all over the place in HTML. A solution at authoring level for cases where the author controls both pages would be quite helpful. I think of a meta element in the embedded document that specifies one or more domains that are allowed to embed it seamlessly in an iframe, e.g.: meta name=allow-seamless-embedding content=domain.tld, otherdomain.tld I think that this would be ok from a security POV, and much easier than using CORS. On Tue, 6 Jul 2010, Markus Ernst wrote: My problem is this sentence in the spec for seamless: This will cause links to open in the parent browsing context. In an application like http://test.rapid.ch/de/haendler-schweiz/iseki.html, the external page should be able to re-call itself inside the iframe, for example if a sort link is clicked or a search form submitted. On Tue, 6 Jul 2010, Ashley Sheridan wrote: Could you explicitly call the _self target in links in the frame? I wasn't sure if the target attribute was going or not, but I'd expect target=_self to override the default seamless action. Good point. Fixed. You can now work around this by targeting the frame explicitly using base target=_self. (Or by using target=foo if the iframe has name=foo.) Great!
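The check Markus's proposed meta tag implies can be sketched as a small allowlist test: given the embedding page's origin and the comma-separated domain list from the (hypothetical, nonexistent-in-the-spec) allow-seamless-embedding meta element, decide whether seamless rendering is permitted. Everything here, including the subdomain rule, is an assumption for illustration:

```javascript
// Sketch of the decision behind Markus's proposed
// meta name=allow-seamless-embedding content="domain.tld, otherdomain.tld".
// Hypothetical helper: no such mechanism exists in HTML; the subdomain
// matching rule is an assumption, not part of the proposal.
function allowsSeamless(embedderOrigin, contentAttr) {
  const host = embedderOrigin
    .replace(/^https?:\/\//, '') // drop the scheme
    .replace(/:\d+$/, '');       // drop any port
  return contentAttr.split(',')
    .map(d => d.trim())
    .some(d => host === d || host.endsWith('.' + d));
}
```

As Anne notes later in the thread, whether scheme and port should matter is exactly the kind of question CORS already had to answer, which is the argument for reusing it.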
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Wed, 11 Aug 2010 15:38:32 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt phil...@opera.comwrote: On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt phil...@opera.com wrote: I have checked the parse spec and http://www.whatwg.org/specs/web-apps/current-work/#tag-open-state indeed implies that a tag starting with a number is a parse error. Both, the timestamps and the voice markers thus seem problems when going with an innerHTML parser. Is there a way to resolve this? I mean: I'd quite happily drop the voice markers for a span @class but I am not sure what to do about the timestamps. We could do what I did in WMML and introduce a t element with the timestamp as a @at attribute, but that is again more verbose. We could also introduce an @at attribute in span which would then at least end up in the DOM and can be dealt with specially. What should numerical voices be replaced with? Personally I'd much rather write philip and silvia to mark up a conversation between us two, as I think it'd be quite hard to keep track of the numbers if editing subtitles with many different speakers. However, going with that and using an HTML parser is quite a hack. Names like mark and li may already have special parsing rules or default CSS. In HTML it is span class=philip../span and span class=silvia.../span. I don't see anything wrong with that. And it's only marginally longer than philip ... /philip and silvia.../silvia. Going with HTML in the cues, we either have to drop voices and inner timestamps or invent new markup, as HTML can't express either. I don't think either of those are really good solutions, so right now I'm not convinced that reusing the innerHTML parser is a good way forward. I don't see a need for the voices - they already have markup in HTML, see above. But I do wonder about the timestamps. 
I'd much rather keep the innerHTML parser if we can, but I don't know enough about how the timestamps could be introduced in a non-breakable manner. Maybe with a data- attribute? Maybe span data-t=00:00:02.100.../span? data- attributes are reserved for use by scripts on the same page, but we *could* of course introduce new elements or attributes for this purpose. However, adding features to HTML only for use in WebSRT seems a bit odd. That would make text/srt and text/websrt synonymous, which is kind of pointless. No, it's only pointless if you are a browser vendor. For everyone else it is a huge advantage to be able to choose between a guaranteed simple format and a complex format with all the bells and whistles. The advantage of taking text/srt is that all existing software to create SRT can be used to create WebSRT That's not strictly true. If they load a WebSRT file that was created by some other software for further editing and that WebSRT file uses advanced WebSRT functionality, the authoring software will break. Right, especially settings appended after the timestamps are quite likely to be stripped when saving the file. Or may even break the software if it's badly implemented, or may end up inside the cue text - just like the other control instructions which will end up as plain text inside the cue. You won't believe how many people have pointed out to me that my SRT test parser exposed i tag markup in the cue text rather than interpreting it, when I was experimenting with applying SRT cues in an HTML div without touching the cue text content. Extraneous markup is really annoying. Indeed, but given the option of seeing no subtitles at all and seeing some markup from time to time, which do you prefer? For a long time I was using a media player that didn't handle HTML in SRT and wasn't very amused at seeing i and similar, but it was sure better than no subtitles at all. 
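The graceful-degradation problem both sides describe here (unhandled i tags showing up as literal text in the rendered cue) has an obvious middle ground: a player can strip tags it doesn't implement instead of displaying them. A hypothetical sketch, not anything the spec requires; a real renderer would interpret the supported subset rather than discard it:

```javascript
// Sketch: rather than exposing unhandled markup like <i> as literal
// text in a rendered cue, strip any tag whose name is not in the
// player's supported list, keeping the text content. Illustrative
// fallback only; the regex is a crude tag matcher, not a real parser.
function stripUnknownTags(cueText, supported = []) {
  return cueText.replace(/<\/?([a-zA-Z][^>\s]*)[^>]*>/g, (tag, name) =>
    supported.includes(name.toLowerCase()) ? tag : '');
}
```

With this, a legacy SRT player that only knows i, b, u and font would show a ruby-annotated WebSRT cue as plain text instead of as tag soup.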
I doubt it will take long for popular software to start ignoring things trailing the timestamp and things in square brackets, which is all you need for basic compatibility. Some of the tested software already does so. and servers that already send text/srt don't need to be updated. In either case I think we should support only one mime type. What's the harm in supporting two mime types but using the same parser to parse them? Most content will most likely be plain old SRT without voices, ruby or similar. People will create them using existing software with the .srt extension and serve them using the text/srt MIME type. When they later decide to add some ruby or similar, it will just work without changing the extension or MIME type. The net result is that text/srt and text/websrt mean exactly the same thing, making it a wasted effort. From a Web browser perspective, yes. But not from a caption authoring perspective. At first, I would author a SRT file.
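The two tolerances Philip says are "all you need for basic compatibility" are mechanical enough to sketch: accept a timing line with anything trailing the second timestamp, and drop bracketed voice/effect markers from cue text. Hypothetical helpers for illustration:

```javascript
// Sketch of the two tolerances mentioned above. parseTimingLine accepts
// a "-->" line and ignores whatever trails the end timestamp (e.g. cue
// settings); dropBracketed removes [voice]-style markers from cue text.
// Neither is the spec's algorithm.
const TIMING = /^\s*(\S+)\s*-->\s*(\S+)/;

function parseTimingLine(line) {
  const m = TIMING.exec(line);
  return m ? { start: m[1], end: m[2] } : null; // trailing junk ignored
}

function dropBracketed(cueText) {
  return cueText.replace(/\[[^\]]*\]\s*/g, '');
}
```

An old SRT tool applying just these two rules would render plain text correctly from a richer WebSRT file, which is the interop claim being made.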
Re: [whatwg] HTML5 (including next generation additions still in development) - Mozilla Firefox (Not Responding)
On 8/11/10 10:31 AM, Garrett Smith wrote: It would have been more helpful to explain, if you can, the cause of the slowness in Firefox.. Sure thing. https://bugzilla.mozilla.org/show_bug.cgi?id=481131#c12 (the paragraph starting The time) and https://bugzilla.mozilla.org/show_bug.cgi?id=481131#c17 Note that I was wrong in my previous mail; it's status.js that causes the mutations that trigger the Firefox bug, not to.js Looping through the dom on page load is something that is best avoided. That's not an issue here. No. Actually it *is* an issue here; that issue is just massively dwarfed by the other issue behind door #2. What makes you say that looping through the dom is a performance issue? In case you care, an actual loop through the DOM on the HTML5 single-page spec, like so (making sure to touch all the nodes, etc): javascript:var start = new Date(); function f(n) { for (var k = n.firstChild; k; k = n.nextSibling) f(k); } f(document); alert(new Date() - start) alerts numbers under 50ms for me in all the browsers I've tried that can load the spec to start with. Most (if not all) of this stuff seems best suited for server-side logic. That's possible; I'm assuming Ian had a reason for setting it up the way he did. Possible where? On w3c server or whatwg server? It's possible that the output the scripts generate is more suited to being generated server-side. I made no comment on the feasibility of doing so, since I have no idea what the server setup looks like. -Boris
Re: [whatwg] HTML5 (including next generation additions still in development) - Mozilla Firefox (Not Responding)
On 8/11/10 11:48 AM, Boris Zbarsky wrote: javascript:var start = new Date(); function f(n) { for (var k = n.firstChild; k; k = n.nextSibling) f(k); } f(document); alert(new Date() - start) Er, that had a typo. The correct script is: javascript:var start = new Date(); function f(n) { for (var k = n.firstChild; k; k = k.nextSibling) f(k); } f(document); alert(new Date() - start); Now the numbers are slightly larger; on the order of 230ms to 350ms. Barely above human lag-perception. This is on a laptop that is several years old. -Boris
Re: [whatwg] Iframe dimensions
On Wed, Aug 11, 2010 at 8:05 AM, Markus Ernst derer...@gmx.ch wrote: On 11.08.2010 00:24, Ian Hickson wrote: On Mon, 5 Jul 2010, Markus Ernst wrote: Example: http://test.rapid.ch/de/haendler-schweiz/iseki.html (This is under construction.) As a workaround to the height problem, I applied a script that adjusts the iframe height to the available height in the browser window. But of course the user experience would be more consistent if the page could behave like a single page, with only one scrollbar at the right of the browser window. If you control both pages and can't use seamless, you can use postMessage() to negotiate a size. On the long term, I expect we'll make seamless work with CORS somehow. I'm waiting until we properly understand how CORS is used in the wild before adding it all over the place in HTML. A solution at authoring level for cases where the author controls both pages would be quite helpful. I think of a meta element in the embedded document that specifies one or more domains that are allowed to embed it seamlessly in an iframe, e.g.: meta name=allow-seamless-embedding content=domain.tld, otherdomain.tld I think that this would be ok from a security POV, and much easier than using CORS. That feels like re-inventing CORS. Maybe we should make CORS easier to use instead? Adam
Re: [whatwg] Iframe dimensions
On Wed, 11 Aug 2010 19:03:28 +0200, Adam Barth w...@adambarth.com wrote: On Wed, Aug 11, 2010 at 8:05 AM, Markus Ernst derer...@gmx.ch wrote: A solution at authoring level for cases where the author controls both pages would be quite helpful. I think of a meta element in the embedded document that specifies one or more domains that are allowed to embed it seamlessly in an iframe, such as e.g.: <meta name="allow-seamless-embedding" content="domain.tld, otherdomain.tld"> I think that this would be ok from a security POV, and much easier than using CORS. That feels like re-inventing CORS. Maybe we should make CORS easier to use instead? What exactly is hard about it? (Though I should note we should carefully study whether using CORS here is safe and sound. For instance, you may want to allow seamless embedding, but not share content.) -- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] postMessage's target origin argument can be a full URL in some implementations
On 8/11/2010 06:58, Adam Barth wrote: On Tue, Aug 10, 2010 at 9:28 PM, Boris Zbarsky bzbar...@mit.edu wrote: On 8/10/10 9:11 PM, Ian Hickson wrote: Specifically, this means the magic / string is no longer supported Why? That seemed like a useful feature, and not something likely to break anyone out there In particular, it allows use cases that are not possible right now (e.g. reasonable postMessage from an about:blank page to a page that has the same origin as the about:blank page). Yeah, it seems like there's a big difference between breaking changes (e.g., rejecting previously valid inputs) and non-breaking changes (e.g., allowing some new, previously invalid inputs). Opera has reverted targetOrigin validation back to its previous behavior in current builds, but we also wouldn't mind keeping /. --sigbjorn; s...@opera.com
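For context, the magic '/' targetOrigin means "only deliver if the target document's origin equals the sender's own origin". A simplified sketch of that check (function and parameter names are illustrative; real implementations do full URL parsing and origin serialization):

```javascript
// Simplified postMessage targetOrigin matching:
//   '*' matches any target,
//   '/' matches only targets with the sender's own origin,
//   otherwise the targetOrigin must parse to the target's origin.
function targetOriginMatches(targetOrigin, senderOrigin, receiverOrigin) {
  if (targetOrigin === '*') return true;
  if (targetOrigin === '/') return senderOrigin === receiverOrigin;
  return new URL(targetOrigin).origin === receiverOrigin;
}

console.log(targetOriginMatches('/', 'https://a.example', 'https://a.example')); // true
console.log(targetOriginMatches('/', 'https://a.example', 'https://b.example')); // false
```

The about:blank case Boris mentions is exactly where '/' helps: the sender cannot easily name its own origin as a string, but '/' expresses it directly.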
Re: [whatwg] Javascript: URLs as element attributes
On 8/11/10 2:57 PM, Cris Neckar wrote: 6.1.5 So for example a javascript: URL for a src attribute of an img element would be evaluated in the context of an empty object as soon as the attribute is set; it would then be sniffed to determine the image type and decoded as an image. Right. Browsers currently deal with these in a fairly ad-hoc way. I used the following to test a few examples in various browsers. Your test is assuming an alert property on the scope chain, and that the value of the property is a function. The first assumption would be false in the situation described in 6.1.5, since an empty object would have no such property. Firefox 3.6.3: Allows object.data, applet.code, and embed.src. Blocks all others. Firefox 3.7.863: Allows object.data and embed.src. Blocks all others. Gecko's currently-intended behavior is to do what section 6.1.5 describes in all cases except: iframe src=javascript: object data=javascript: embed src=javascript: applet code=javascript: Has there been discussion on this in the past? If not we should work towards defining which of these we want to allow and which we should block. Agreed. For what it's worth, as I see it there are three possible behaviors for a javascript: URI (whether in an attribute value or elsewhere): 1) Don't run the script. 2) Run the script, but in a sandbox. 3) Run the script against some Window object (which one?) Defining which of these happens in which case would be good. Again, Gecko's behavior is #2 by default (in all sorts of situations; basically anywhere you can dereference a URI), with exceptions made to do #3 in some cases. -Boris
Re: [whatwg] video await a stable state in resource selection (Was: Race condition in media load algorithm)
On 8/11/10 5:13 AM, Philip Jägenstedt wrote: Yes, in the case of the parser inserting source elements that fail one of the tests (no src, wrong type, wrong media) the algorithm will end up at step 6.21 waiting. It doesn't matter if all sources are available when the algorithm is first invoked or if they trickle in, be that from the parser or from scripts. Right, ok. Thanks for bearing with me on this! It's quite serious because NETWORK_EMPTY is used as a condition in many places of the spec, so this absolutely must be consistent between browsers. OK, gotcha. Otherwise the pointer could potentially reach the same source element twice, with the aforementioned problems with failing after network access. But this would only happen in a rare case, right? Specifically, that of source elements being inserted out of order... And if the update to new source elements being inserted were fully async, it would only happen if you trickle the source elements in out of order. But yes, your idea of flagging a source when it fails would also address this just fine. OK, perhaps I should take this more seriously. Making the whole algorithm synchronous probably isn't a brilliant idea unless we can also do away with all of the state it keeps (i.e. hysteresis). State is not necessarily a problem if it's invalidated as needed. One way would be to introduce a magic flag on all source elements to indicate that they have already failed. This would be cleared whenever src, type or media is modified. This seems eminently doable. Yes, it sounds like it very much does, and would result in disasters like this:

<!doctype html>
<video src=video.webm></video>
<!-- network packet boundary or lag? -->
<script>alert(document.querySelector('video').networkState)</script>

The result will be 0 (NETWORK_EMPTY) or 2 (NETWORK_LOADING) depending on whether or not the parser happened to return to the event loop before the script.
The only way this would not be the case is if the event loop is spun before executing scripts, but I haven't found anything to that effect in the spec. I don't think there is anything, no... -Boris
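The failed-source flag being discussed can be sketched roughly as follows; the data shapes and names are assumptions for illustration, not the spec's actual resource selection algorithm:

```javascript
// Each candidate source carries a "failed" flag, which would be cleared
// whenever src, type, or media is modified, so the selection pointer
// never retries a source that already failed with unchanged attributes.
function pickSource(sources, canPlay) {
  for (const s of sources) {
    if (s.failed) continue;           // skip sources that already failed
    if (!s.src || !canPlay(s.type)) { // the "no src / wrong type" tests
      s.failed = true;
      continue;
    }
    return s;
  }
  return null; // no viable source yet: the algorithm waits for more
}

const sources = [
  { src: '', type: 'video/webm', failed: false },
  { src: 'a.ogv', type: 'video/x-unknown', failed: false },
  { src: 'b.webm', type: 'video/webm', failed: false },
];
const canPlay = t => t === 'video/webm';
console.log(pickSource(sources, canPlay).src); // "b.webm"
```

Because the flag is persistent until an attribute changes, re-running the selection (e.g. after a late-inserted source) cannot revisit an already-failed candidate, which addresses the out-of-order insertion case.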
[whatwg] Canvas feedback (various threads)
On Mon, 19 Jul 2010, David Flanagan wrote: The spec describes the transform() method as follows: The transform(m11, m12, m21, m22, dx, dy) method must multiply the current transformation matrix with the matrix described by:

m11 m21 dx
m12 m22 dy
0   0   1

The first number in these argument names is the column number and the second is the row number. This surprises me, and I want to check that it is not an inadvertent error: 1) Wikipedia says (http://en.wikipedia.org/wiki/Matrix_multiplication) that the convention is to list row numbers first 2) Java's java.awt.geom.AffineTransform class also lists the row index first, as in the following javadoc excerpt:

[ x' ]   [ m00 m01 m02 ] [ x ]   [ m00x + m01y + m02 ]
[ y' ] = [ m10 m11 m12 ] [ y ] = [ m10x + m11y + m12 ]
[ 1  ]   [ 0   0   1   ] [ 1 ]   [ 1 ]

It would be nice if this spec was not inconsistent with other usage. Even changing the argument names to neutral a,b,c,d,dx,dy would be better than what is there currently. Done. On Mon, 19 Jul 2010, Boris Zbarsky wrote: I do think the spec could benefit from an example akin to the one in the CoreGraphics documentation. I followed your references but I couldn't figure out which example you meant. What exactly do you think we should add? On Tue, 20 Jul 2010, Yp C wrote: But I think the numbers can indicate the position of the value in the matrix; if they are changed into a,b,c... like cairo, I think it will still confuse beginners. The a,b,c,... notation is at least as common as the m11,m12,... notation. On Mon, 19 Jul 2010, Brendan Kenny wrote: Looking at that last CoreGraphics link, it seems like the current names are an artifact of a row-vector matrix format (in which 'b' *is* m12) that is transposed for external exposure in the browser, but retains the same entry indexing. Yes. The row- vs column-vector dispute is an ancient one, but I can't think of anyone that refers to an entry of a matrix by [column, row]. It appears at least .NET uses the same notation and order.
On Mon, 19 Jul 2010, David Flanagan wrote: While I'm harping on the transform() method, I'd like to point out that the current spec text must multiply the current transformation matrix with the matrix described by... is ambiguous because matrix multiplication is not commutative. Perhaps an explicit formula that showed the order would be clearer. Furthermore, if the descriptions for translate(), scale() and rotate() were altered to describe them in terms of transform() that would tighten things up. Could you describe what interpretations of the current text would be valid but would not be compatible with the bulk of existing implementations? I'm not sure how to fix this exactly. (Graphics is not my area of expertise, unfortunately. I'm happy to apply any proposed text though!) On Tue, 20 Jul 2010, Andreas Kling wrote: Greetings! The current draft of HTML5 says about rendering radial gradients: This effectively creates a cone, touched by the two circles defined in the creation of the gradient, with the part of the cone before the start circle (0.0) using the color of the first offset, the part of the cone after the end circle (1.0) using the color of the last offset, and areas outside the cone untouched by the gradient (transparent black). I find this behavior of transparent spread rather strange and it doesn't match any of the SVG gradient's spreadMethod options. The sensible behavior here IMO is pad spread (SVG default, and what most browsers implementing canvas currently do) which means repeating the terminal color stops indefinitely. I'm pretty sure it's too late to change this. On Wed, 28 Jul 2010, David Flanagan wrote: Firefox and Chrome disagree about the implementation of the destination-atop, source-in, destination-in, and source-out compositing operators. [...]
I suspect, based on the reference to an infinite transparent black bitmap in 4.8.11.1.13 Drawing model that Firefox gets this right and Chrome gets it wrong, but it would be nice to have that confirmed. I suggest clarifying 4.8.11.1.3 Compositing to mention that the compositing operation takes place on all pixels within the clipping region, and that some compositing operators clear large portions of the canvas. On Wed, 28 Jul 2010, Tab Atkins Jr. wrote: The spec is completely clear on this matter - Firefox is right, Chrome/Safari are wrong. They do it wrongly because that's how CoreGraphics, their graphics library, does things natively. On Wed, 28 Jul 2010, Oliver Hunt wrote: This is the way the webkit canvas implementation has always worked, firefox implemented this incorrectly, and the spec was based off of that implementation. Actually the spec was based off the WebKit implementation, but this particular part had no documentation describing what it did, so I couldn't specify it. :-( On Fri, 30 Jul 2010,
Re: [whatwg] Canvas feedback (various threads)
On Wed, Aug 11, 2010 at 9:35 PM, Ian Hickson i...@hixie.ch wrote: On Thu, 29 Jul 2010, Gregg Tavares (wrk) wrote: source-over glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA); I tried searching the OpenGL specification for either glBlendFunc or GL_ONE_MINUS_SRC_ALPHA and couldn't find either. Could you be more specific regarding what exactly we would be referencing? I'm not really sure I understand you proposal. The OpenGL spec omits the gl/GL_ prefixes - search for BlendFunc instead. (In the GL 3.0 spec, tables 4.1 (the FUNC_ADD row) and 4.2 seem relevant for defining the blend equations.) -- Philip Taylor exc...@gmail.com
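For readers without the GL spec at hand, source-over with glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA) computes out = src + dst × (1 − src.alpha) on premultiplied-alpha pixels. A minimal sketch, assuming channel values in [0, 1]:

```javascript
// Porter-Duff source-over on premultiplied-alpha pixels, matching
// glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA) with FUNC_ADD:
//   out = src * 1 + dst * (1 - src.alpha)
function sourceOver(src, dst) {
  const f = 1 - src.a;
  return {
    r: src.r + dst.r * f,
    g: src.g + dst.g * f,
    b: src.b + dst.b * f,
    a: src.a + dst.a * f,
  };
}

// A fully opaque source replaces the destination entirely.
const out = sourceOver({ r: 1, g: 0, b: 0, a: 1 }, { r: 0, g: 1, b: 0, a: 1 });
console.log(out); // { r: 1, g: 0, b: 0, a: 1 }
```

Note the GL_ONE source factor only gives correct source-over results when the source color is premultiplied by its alpha; with straight alpha the source factor would be GL_SRC_ALPHA.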
Re: [whatwg] Canvas feedback (various threads)
Ian Hickson wrote: On Mon, 19 Jul 2010, David Flanagan wrote: Even changing the argument names to neutral a,b,c,d,dx,dy would be better than what is there currently. Done. Thanks On Mon, 19 Jul 2010, David Flanagan wrote: While I'm harping on the transform() method, I'd like to point out that the current spec text must multiply the current transformation matrix with the matrix described by... is ambiguous because matrix multiplication is not commutative. Perhaps an explicit formula that showed the order would be clearer. Furthermore, if the descriptions for translate(), scale() and rotate() were to altered to describe them in terms of transform() that would tighten things up. Could you describe what interpretations of the current text would be valid but would not be compatible with the bulk of existing implementations? I'm not sure how to fix this exactly. (Graphics is not my area of expertise, unfortunately. I'm happy to apply any proposed text though!) I think that the sentence The transformations must be performed in reverse order is sufficient to remove the ambiguity in multiplication order. So the spec is correct (but confusing) as it stands, except that it doesn't actually say that the CTM is to be replaced with the product of the CTM and the new matrix. It just says multiply them. I suggest changing the description of transform() from: must multiply the current transformation matrix with the matrix described by: To something like this: must set the current transformation matrix to the matrix obtained by postmultiplying the current transformation matrix with this matrix:

a c e
b d f
0 0 1

That is:

            a c e
CTM = CTM * b d f
            0 0 1

Changing translate(), scale() and rotate() to formally define them in terms of transform() would be simple, and the current prose descriptions of the methods could then be moved to the non-normative green box.
The current descriptions suffer from the use of the word add near the word matrix when in fact a matrix multiplication is to be performed, but I don't think they can be mis-interpreted as they stand. I'd be happy to write new method descriptions if you want to tighten things up in this way, however. David
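David's proposed definition, CTM = CTM * M with translate()/scale()/rotate() layered on transform(), can be sketched as follows (a simplified model for illustration, not normative spec text):

```javascript
// 2D affine matrix stored as [a, b, c, d, e, f], representing
//   [ a c e ]
//   [ b d f ]
//   [ 0 0 1 ]
// under the column-vector convention: x' = a*x + c*y + e, y' = b*x + d*y + f.
function multiply([a1, b1, c1, d1, e1, f1], [a2, b2, c2, d2, e2, f2]) {
  return [
    a1 * a2 + c1 * b2,      // a
    b1 * a2 + d1 * b2,      // b
    a1 * c2 + c1 * d2,      // c
    b1 * c2 + d1 * d2,      // d
    a1 * e2 + c1 * f2 + e1, // e
    b1 * e2 + d1 * f2 + f1, // f
  ];
}

class Ctx {
  constructor() { this.ctm = [1, 0, 0, 1, 0, 0]; } // identity
  transform(a, b, c, d, e, f) {                    // CTM = CTM * M (postmultiply)
    this.ctm = multiply(this.ctm, [a, b, c, d, e, f]);
  }
  // The other methods defined in terms of transform(), as David suggests:
  translate(x, y) { this.transform(1, 0, 0, 1, x, y); }
  scale(x, y)     { this.transform(x, 0, 0, y, 0, 0); }
  rotate(t)       { this.transform(Math.cos(t), Math.sin(t), -Math.sin(t), Math.cos(t), 0, 0); }
}

const ctx = new Ctx();
ctx.translate(10, 20);
ctx.scale(2, 3);
console.log(ctx.ctm); // [2, 0, 0, 3, 10, 20]
```

Postmultiplication makes the "reverse order" remark concrete: the transform issued last is the one applied first to each point, so the point (1, 1) maps to (2·1 + 10, 3·1 + 20) = (12, 23).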
[whatwg] Constraint validation feedback (various threads)
On Tue, 20 Jul 2010, Mounir Lamouri wrote: At the moment, three form elements are barred from constraint validation: object, fieldset and output. I can understand why object and fieldset are barred from constraint validation but I think output could use the constraint validation. The user can't edit the value of output, so the only time this would be helpful is when you have script, with the script setting the validity using setCustomValidity(). But then the script can just as easily set it on the control that's actually making the output be invalid. That would also be better UI -- having the user agent point out the output control is invalid would likely just infuriate users who couldn't tell what they had to do to change the value. On Tue, 20 Jul 2010, Mounir Lamouri wrote: I'm wondering why there are no categories for elements candidate for constraint validation. In the current state of the specs, all listed elements are candidate for constraint validation except when they are barred from constraint validation. Barring an element from constraint validation when it is in a certain state seems good but having elements always barred from constraint validation seems a bad idea. It makes them have the entire constraint validation API for nothing. Not quite nothing -- they have it so that you can iterate through form.elements and use the constraint validation API on all of them. On Tue, 20 Jul 2010, Simon Pieters wrote: I believe some elements have the API but are barred because it makes it easier to loop through form.elements and do the validation stuff without checking if the validation stuff is available on the element. (Same reason textarea has .type.) Right. On Fri, 23 Jul 2010, Mounir Lamouri wrote: But keygen, object, fieldset and output are barred from constraint validation and textarea, button, input and select are not [1]. Half of the elements have a useless API, that sounds too much for me.
I think it's not so complicated to loop through the form elements and check if each implements a part of the constraint validation API, or check the tag name. I don't understand. Why does it matter? On Fri, 23 Jul 2010, Jonas Sicking wrote: It probably results in less code if a handful of implementations have to add a few stubbed functions, than if millions of pages all will have to check if constraint validation is there before using it. Not to mention that I trust you (an implementor) to get this right a lot more than I trust thousands of webauthors to get this right. Indeed. On Thu, 22 Jul 2010, Aryeh Gregor wrote: maxlength predates all the other form validation attributes by many years. Historically, browsers would prohibit users from entering text beyond the maxlength of an input or textarea, but would not prohibit form submission. HTML5 changes this: It only changes this in theory. In practice the result is the same as far as I can tell. Constraint validation: If an element has a maximum allowed value length, and its dirty value flag is true, and the code-point length of the element's value is greater than the element's maximum allowed value length, then the element is suffering from being too long. http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#limiting-user-input-length If I read it correctly, this means that pages that previously worked no longer will, if a script sets the value of the input to something longer than the maxlength. The script setting the value doesn't set the dirty flag. The only way this could be a problem is if the user edits the control and _then_ the script sets the value to an overlong value.
These two test cases show that Opera maintains the legacy behavior (not compatible with the spec) and submits the forms regardless of maxlength violations, while WebKit (Chromium) blocks submission as required by the spec:

data:text/html,<!doctype html><body onload="document.getElementById('a').value='foo'"><form><input id=a maxlength=2><input type=submit></form>Try to submit the form

data:text/html,<!doctype html><form><input id=a><input type=submit></form><a href="" onclick="document.getElementById('a').maxLength = 2; return false">Enter foo into the input, click here, then try to submit</a>

Should the spec (and WebKit) be changed here, or should Opera change? WebKit is wrong here. Opera matches the spec as far as I can tell. The submission should only be blocked if the user edits the value before onload in these examples. On Wed, 28 Jul 2010, Mounir Lamouri wrote: At the moment, to suffer from a pattern mismatch an element needs a non-empty value and a specified pattern (and the pattern should not match the entire value) [1]. So, if, for any reason, an author write input pattern='', the element would be always suffering from a pattern mismatch except when value=''. Indeed. The same occurs if the author writes input
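The dirty-value-flag interaction Ian describes can be modeled in a few lines; the class and method names here are illustrative, and the spec's code-point length is simplified to string length:

```javascript
// A minimal model of an input's value, dirty value flag, and the
// "suffering from being too long" condition from the spec text quoted above.
class Input {
  constructor(maxLength) {
    this.maxLength = maxLength;
    this.value = '';
    this.dirty = false;
  }
  setValueByScript(v) { this.value = v; }           // does NOT set the dirty flag
  setValueByUser(v)   { this.value = v; this.dirty = true; }
  get tooLong() {
    return this.dirty && this.maxLength >= 0 && this.value.length > this.maxLength;
  }
}

const scripted = new Input(2);
scripted.setValueByScript('foo');
console.log(scripted.tooLong); // false - script-set values never block submission

const edited = new Input(2);
edited.setValueByUser('foo');
console.log(edited.tooLong); // true
```

This is why the first data: URL test case should submit under the spec: the script sets the overlong value, the dirty flag stays false, and the too-long check never fires.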
Re: [whatwg] Scrollable Tables and HTML5
On Wed, 21 Jul 2010, Schalk Neethling wrote: I have been working on getting scrollable tables working across all browsers. While there exist jQuery plugins that do the job for the most part, I have yet to find one that works 100%, or that works at all in Chrome. The reason I am putting this to the HTML5 list is because I am wondering whether there is something in the HTML5 spec that is going to aid in this regard. In the current HTML 4 spec (http://www.w3.org/TR/REC-html40/struct/tables.html#h-11.2.3) it is stated that: Table rows may be grouped into a table head, table foot, and one or more table body sections, using the THEAD http://www.w3.org/TR/REC-html40/struct/tables.html#edef-THEAD , TFOOT http://www.w3.org/TR/REC-html40/struct/tables.html#edef-TFOOT and TBODY http://www.w3.org/TR/REC-html40/struct/tables.html#edef-TBODY elements, respectively. This division enables user agents to support scrolling of table bodies independently of the table head and foot. The only browser, including the IE9 previews, that has implemented this behavior is Firefox. For the rest it is quite a terrible hack with JavaScript to get similar behavior. Is there any work being done to get UAs to implement this as a standard? Is there work being done by other working groups, maybe ARIA, in this regard? Looking forward to your feedback. On Wed, 21 Jul 2010, Boris Zbarsky wrote: Note that the Firefox implementation was removed, because it violated the CSS2.1 spec, caused compatibility problems for other browsers, and was buggy to the point that it wasn't worth the effort needed to maintain it (esp. given the other strikes against it). This seems like a CSS issue -- nothing in HTML prevents browsers from implementing this in theory, and I don't think there's much we could do to help make it implementable in practice. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] iframes with potential for independent navigation controls
On Thu, 22 Jul 2010, Brett Zamir wrote: I would like to see attributes be added to allow iframes to have independent navigation controls, or rather, to allow a parent document to have ongoing access to the navigation history of its iframes (say to be informed of changes to their histories via an event) so that it could create such controls. I would think that the average user might suspect that their clicks would not necessarily be private if they were already in the context of another site, but if privacy would be of concern here (or security, though GET requests alone shouldn't be able to give access to sensitive data), maybe the user could be asked for permission, as with Geolocation. This really has, I think, wonderful potential. Especially but not exclusively for larger screens, one can envision, for example, a site which displays content in a table, with paragraphs being in one column, and commentary in another. If the commentary column uses say a wiki (or a comment feed/discussion thread), to keep track of resources and cross-references, insights, errata, etc. pertaining to a given paragraph or verse (e.g., for books, but also potentially for blog articles, etc.--anywhere people may wish to dissect in context), it would be desirable for one to be able to say edit content in one fixed iframe pane, follow links in another one (including to other domains), etc., all while keeping the original context of the table of content and iframes on screen, and allowing the user to go backward and forward in any pane independently (or possibly type in a URL bar for each iframe). One can even imagine ongoing chat discussions taking place within such a mosaic. It's an interesting idea (the ability to ask for the browser to embed controls, that is; we wouldn't be able to grant cross-domain access to pages, since that would open all kinds of security problems).
The best way to proceed is to follow the steps described in the wiki: http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_a_specification.3F In particular, I think getting implementation experience would be critical for such a feature. Do users get confused? Does just providing a set of UI controls actually address the use cases, or do authors still need more? That is the kind of questions we'd need answers to. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Canvas feedback (various threads)
On Tue, 10 Aug 2010, Charles Pritchard wrote: I recommend not using canvas for text editing. I've worked on a substantial amount of code dealing with text editing. At present, the descent of the current font has been the only deficiency. Well, there's also the way it doesn't interact with the OS text selection, copy-and-paste, drag-and-drop, accessibility APIs, the browsers' undo logic, the OS spell-checker and grammar-checker, the OS text tools like Search in Spotlight, and the i18n features like bidi handling. And that's just for starters. :-) Drag-and-drop works just fine, it's covered by event.dataTransfer. Accessibility has been addressed through drawFocusRing amongst other techniques of including relevant HTML within the canvas tag. Undo logic is covered by the history state objects. Text selection visuals work using measureText and textBaseline bottom. OS tools are fairly out of spec, though I do understand the value of extending a dataTransfer API to assist with UA context menu hooks, such as spellcheck/autocomplete/grammar suggestions. That said, there's nothing in the way of implementing those features within an HTML app. Even offline, both can be accomplished via WebSQL. Perhaps there should be a discussion about dataTransfer and the context menu. I feel that using Canvas to implement HTML5/CSS provides a quality proof of the completeness of the 2D API. The 2D API isn't complete by a long shot, there's no difficulty in proving that. It's not trying to be complete. Perhaps completeness is a poor choice. I'm concerned about obstructions. At present, there are very few obstacles. CSS and Canvas have allowed me to create implementations of HTML Forms, CSS line boxes and SVG. Baseline positioning is an obstacle. I realize that SVG has an even-odd fill mode. This is something that can be converted, and is rarely used. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. 
`._.-(,_..'--(,_..'`-.;.'
Through CSS computed styles, or a canvas interface, I will get accessible baseline offsets. :-)
Re: [whatwg] Web Workers
On Thu, 22 Jul 2010, Ryan Heise wrote: [...] For all of the reasons above, I would like to see something like threads in Javascript. Yes, threads give rise to race conditions and deadlocks, but this seems to be in line with Javascript's apparent philosophy of doing very little static error checking, and letting things just happen at runtime (e.g. nonexistent static type system). In other words, this may be simply a case of: yes, javascript allows runtime errors to happen. Is not allowing deadlocks important enough that we should make it impossible for a certain class of algorithms to exploit multi-core CPUs? Generally speaking, the thinking on these topics is basically that we should try to avoid allowing authors to do anything that is hard to debug. Threads in particular are _incredibly_ complicated to work with even for very experienced programmers. Having said that, this is the kind of thing that begins with experimental implementations, and then migrates to the standardisation process once it is proven. This is indeed how Workers started (first Google experimented with this area in Gears, and then the experience from that work informed the standardisation process). I recommend working with browser vendors to experiment in this area. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Canvas feedback (various threads)
On 8/11/10 4:35 PM, Ian Hickson wrote: On Mon, 19 Jul 2010, Boris Zbarsky wrote: I do think the spec could benefit from an example akin to the one in the CoreGraphics documentation. I followed your references but I couldn't figure out which example you meant. What exactly do you think we should add? Effectively the part starting with the second paragraph under Discussion at http://developer.apple.com/mac/library/documentation/GraphicsImaging/Reference/CGAffineTransform/Reference/reference.html#//apple_ref/doc/c_ref/CGAffineTransform and going through the two linear equations defining x' and y'. Plus a bit that says how the linear list of arguments passed to transform() maps to the 2-dimensional array of numbers in the transformation matrix. -Boris
Re: [whatwg] postMessage's target origin argument can be a full URL in some implementations
2010/8/11 Boris Zbarsky bzbar...@mit.edu: On 8/10/10 9:11 PM, Ian Hickson wrote: Specifically, this means the magic / string is no longer supported Why? That seemed like a useful feature, and not something likely to break anyone out there In particular, it allows use cases that are not possible right now (e.g. reasonable postMessage from an about:blank page to a page that has the same origin as the about:blank page). I stated somewhere that reverting the path validation stuff should include removing the magic '/' feature; given this use case, it is probably better to keep '/'. -- Hallvord R. M. Steen
[whatwg] Javascript: URLs as element attributes
Resending from the correct address -- Forwarded message -- From: Cris Neckar c...@google.com Date: Wed, Aug 11, 2010 at 11:57 AM Subject: Javascript: URLs as element attributes To: wha...@whatwg.org Cc: bzbar...@mit.edu The HTML5 Spec is somewhat ambiguous on the handling of javascript: URLs when supplied as attributes to different elements. It does not specifically prohibit handling them in most cases but I was wondering if this has been discussed and whether there is consensus on correct behavior. There are several areas of the spec that specifically reference the use of javascript: URLs as the src attribute for img nodes but this is not universal. For example see http://dev.w3.org/html5/spec/Overview.html#introduction-3 6.1.1 Processing of inline javascript: URLs (e.g. the src attribute of img elements, or an @import rule in a CSS style element block). And http://dev.w3.org/html5/spec/Overview.html#javascript-protocol 6.1.5 So for example a javascript: URL for a src attribute of an img element would be evaluated in the context of an empty object as soon as the attribute is set; it would then be sniffed to determine the image type and decoded as an image. Browsers currently deal with these in a fairly ad-hoc way. I used the following to test a few examples in various browsers. 
<embed src="javascript:alert('embed-src');"></embed>
<embed src="http://none" pluginurl="javascript:alert('embed-pluginurl');"></embed>
<object classid="javascript:alert('object-classid');"></object>
<object archive="javascript:alert('object-archive');"></object>
<object data="javascript:alert('object-data');"></object>
<img src="javascript:alert('img-src');">
<script src="javascript:alert('script-src');"></script>
<applet code="javascript:alert('applet-code');"></applet>
<applet code="http://none" archive="javascript:alert('applet-archive');"></applet>
<applet code="http://none" codebase="javascript:alert('applet-codebase');"></applet>
<link rel="stylesheet" type="text/css" href="javascript:alert('link-href');" />

IE 8: Blocks all tests. Chrome 5.0.375: Allows object.data and embed.src. Blocks all others. Firefox 3.6.3: Allows object.data, applet.code, and embed.src. Blocks all others. Firefox 3.7.863: Allows object.data and embed.src. Blocks all others. Opera 10.54: Allows script.src and object.data. Blocks all others. Has there been discussion on this in the past? If not we should work towards defining which of these we want to allow and which we should block. Thank you, -cris
Re: [whatwg] Constraint validation feedback (various threads)
On Tue, 10 Aug 2010, Jesse McCarthy wrote: I consider it highly desirable to have some way to differentiate between SELECT values explicitly selected by the user and values that were selected by default and unchanged by the user. I have a note in the spec to add a feature at some point to track what controls have been changed and what haven't, but that doesn't seem to have the urgency that Jonas describes required= as having, so I still think we should keep delaying that one until browsers have caught up. Allow me to clarify. What I'm referring to is having @required for SELECT and some way to include a label, so that the user must deliberately select something in order for the form to be submitted. My comment was a response to comments you made (see below) that suggested that @required is not important for SELECTs without @size or @multiple, and that an initial label option, e.g. <option value="">Choose One</option>, is invalid. Not having @required for SELECT and simply omitting an initial label OPTION would make the first OPTION (whatever it is) selected by default, which would make it impossible to differentiate between the user deliberately selecting that OPTION and simply leaving the default. Having @required for SELECT and some way to specify a label (as you've just described), so that the user must deliberately select something in order for the SELECT to not suffer from being missing, satisfies the need I described. Regarding how to implement @required and a label for SELECTs, some people have resisted the idea of implementing it so that an empty string would make it suffer from being missing. Tab Atkins Jr. said this in response: Yes? And when that's valid, you should just *not use @required*, same as when an empty string is a valid value for an <input type=text>. I've been wondering if someone can present a use case where that logic is not sufficient. I haven't thought of any yet, but I'm not sure there aren't any.
Since these specifications have a huge impact on developers for years to come, it's probably better to be on the safe side and just provide some explicit way of marking up a label, such as the means you described. Is the logic that an OPTION is special if it's first and has value="" preferable to using an attribute to indicate that? E.g. <option dummyLabel></option> (not suggesting that as an actual attribute name, just using it for illustration). Or perhaps @placeholder could be a boolean attribute on OPTION? Jesse On Mon, 9 Aug 2010 Ian Hickson wrote: It's impossible to submit a select element (without a size= attribute or multiple= attribute) without it having a value -- essentially, required= is already implied. On Thu, 22 Jul 2010, Mounir Lamouri wrote: 1. A typical use case of select is to have <option value=''>Choose an option</option> as a default value. Having @required would save authors from writing any js check when they are using select like that. That seems like an invalid use of option to me. It would be better as: <label> Choose an option: <select> ... </select> </label> Currently you can do this just by not providing empty values and not using multiple= or size=. - Original Message - From: Ian Hickson i...@hixie.ch To: wha...@whatwg.org Sent: Wednesday, August 11, 2010 6:03 PM Subject: Constraint validation feedback (various threads) On Tue, 20 Jul 2010, Mounir Lamouri wrote: At the moment, three form elements are barred from constraint validation: object, fieldset and output. I can understand why object and fieldset are barred from constraint validation but I think output could use constraint validation. The user can't edit the value of output, so the only time this would be helpful is when you have script, with the script setting the validity using setCustomValidity(). But then the script can just as easily set it on the control that's actually making the output be invalid.
That would also be better UI -- having the user agent point out that the output control is invalid would likely just infuriate users who couldn't tell what they had to do to change the value. On Tue, 20 Jul 2010, Mounir Lamouri wrote: I'm wondering why there are no categories for elements that are candidates for constraint validation. In the current state of the specs, all listed elements are candidates for constraint validation except when they are barred from constraint validation. Barring an element from constraint validation when it is in a certain state seems good, but having elements always barred from constraint validation seems a bad idea. It gives them the entire constraint validation API for nothing. Not quite nothing -- they have it so that you can iterate through form.elements and use the constraint validation API on all of them. On Tue, 20 Jul 2010, Simon Pieters wrote: I believe some elements have the API but are barred because it makes it easier to loop through form.elements and do the validation stuff
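The "suffering from being missing" behaviour for SELECT that the thread is circling around can be sketched as a predicate. This is a hypothetical helper that approximates (rather than transcribes) the idea under discussion: a required single-select whose first option is an empty-valued placeholder label counts as missing while that placeholder is still selected.

```javascript
// Sketch, not the spec algorithm. Assumes: for a single select, `value` is
// a string; for a multiple/size select, `value` is an array of selected
// values. `firstOptionValue` is the value of the first <option>.
function selectIsMissing({ required, multiple, size, value, firstOptionValue }) {
  if (!required) return false;
  if (multiple || (size && size > 1)) {
    // List-style selects can genuinely have nothing selected.
    return value.length === 0;
  }
  // Single select: only a placeholder label option (first, empty value)
  // can yield an empty submitted value, which counts as missing.
  return value === "" && firstOptionValue === "";
}
```

Under this model, omitting the placeholder option makes the predicate always false for a single select, which is exactly Ian's point that required= is effectively already implied there.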
Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements
On Thu, Aug 12, 2010 at 1:26 AM, Philip Jägenstedt phil...@opera.com wrote: On Wed, 11 Aug 2010 15:38:32 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt phil...@opera.com wrote: On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: Going with HTML in the cues, we either have to drop voices and inner timestamps or invent new markup, as HTML can't express either. I don't think either of those are really good solutions, so right now I'm not convinced that reusing the innerHTML parser is a good way forward. I don't see a need for the voices - they already have markup in HTML, see above. But I do wonder about the timestamps. I'd much rather keep the innerHTML parser if we can, but I don't know enough about how the timestamps could be introduced in a non-breaking manner. Maybe with a data- attribute? Maybe <span data-t="00:00:02.100">...</span>? data- attributes are reserved for use by scripts on the same page, but we *could* of course introduce new elements or attributes for this purpose. However, adding features to HTML only for use in WebSRT seems a bit odd. I'd rather avoid adding features to HTML only for WebSRT. Ian turned the timestamps into ProcessingInstructions, see http://www.whatwg.org/specs/web-apps/current-work/websrt.html#websrt-cue-text-dom-construction-rules. Could we introduce something like <?t at=00:00:02.100?> without breaking the innerHTML parser? That would make text/srt and text/websrt synonymous, which is kind of pointless. No, it's only pointless if you are a browser vendor. For everyone else it is a huge advantage to be able to choose between a guaranteed simple format and a complex format with all the bells and whistles. The advantage of taking text/srt is that all existing software to create SRT can be used to create WebSRT That's not strictly true.
If they load a WebSRT file that was created by some other software for further editing and that WebSRT file uses advanced WebSRT functionality, the authoring software will break. Right, especially settings appended after the timestamps are quite likely to be stripped when saving the file. Or they may even break the software if it's badly implemented, or may end up inside the cue text - just like the other control instructions, which will end up as plain text inside the cue. You wouldn't believe how many people have pointed out to me that my SRT test parser exposed <i> tag markup in the cue text rather than interpreting it, back when I was experimenting with applying SRT cues to an HTML div without touching the cue text content. Extraneous markup is really annoying. Indeed, but given the option of seeing no subtitles at all and seeing some markup from time to time, which do you prefer? For a long time I was using a media player that didn't handle HTML in SRT and wasn't very amused at seeing <i> and similar, but it was sure better than no subtitles at all. I doubt it will take long for popular software to start ignoring things trailing the timestamp and things in square brackets, which is all you need for basic compatibility. Some of the tested software already does so. Hmm... not sure if I'd prefer to see the crap or rather be forced to run it through a stripping tool first. I think what would happen is that I'd start watching the movie, then notice the crap, get annoyed, stop it, run a stripping tool, and restart the movie. I'd probably prefer noticing that before I start the movie, which would happen if the file was a different format. But it does take a bit of expert knowledge to know that WebSRT can be easily converted to SRT and to have such a stripping tool installed, I give you that.
OTOH, if you say that it will take only a short time for popular software to start ignoring the extra WebSRT stuff, well, in this case they have implemented WebSRT support in its most basic form and then there is no problem any more anyway. They will then accept the new files and their extensions and MIME types, and there is explicit support rather than the dodgy question of whether these SRT files will provide crap or not. During a transition period, we will make all software that currently supports SRT become unstable and unreliable. I don't think that's the right way to deal with an existing ecosystem. Coming in as the big brother, claiming their underspecified format, throwing in incompatible features, and saying: just deal with it. It's just a cavalier thing to do. and servers that already send text/srt don't need to be updated. In either case I think we should support only one MIME type. What's the harm in supporting two MIME types but using the same parser to parse them? Most content will most likely be plain old SRT without voices, ruby or similar. People will create it using existing software with the .srt extension and serve it using the text/srt MIME type. When
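The "stripping tool" Silvia mentions could be quite small. Here is a hypothetical sketch, with two stated assumptions: WebSRT's extensions over SRT are cue settings trailing the timestamp line, plus angle-bracketed inline markup (voices, ruby, inner timestamps). Note it deliberately also strips legitimate <i>/<b> styling, for players that can't render it:

```javascript
// Downgrade WebSRT-flavored cues to plain SRT (hypothetical sketch).
function stripWebSrtExtensions(text) {
  return text
    .split("\n")
    .map(line =>
      /-->/.test(line)
        // Timing line: keep only "start --> end", drop trailing cue settings.
        ? line.replace(/^(\s*\S+\s*-->\s*\S+).*$/, "$1")
        // Cue text line: remove inline tags, including inner
        // timestamps like <00:00:02.100>.
        : line.replace(/<[^>]*>/g, "")
    )
    .join("\n");
}
```

For example, a cue line "00:00:01.000 --> 00:00:04.000 A:start L:50%" comes back as just the two timestamps, and voice/markup tags in the text are dropped. This is the expert-knowledge step Silvia objects to having to run at all.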
[whatwg] Minor editorial fix: Section 4.6.9 The time element - first example needs update
The first markup example in section 4.6.9 needs updating: http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html#the-time-element

Current example and text:

=== snip ===
<div class="vevent">
 <a class="url" href="http://www.web2con.com/">http://www.web2con.com/</a>
 <span class="summary">Web 2.0 Conference</span>:
 <time class="dtstart" datetime="2007-10-05">October 5</time> -
 <time class="dtend" datetime="2007-10-20">19</time>,
 at the <span class="location">Argent Hotel, San Francisco, CA</span>
</div>

(The end date is encoded as one day after the last date of the event because in the iCalendar format, end dates are exclusive, not inclusive.)
=== snip ===

Suggested update:

=== snip ===
<div class="vevent">
 <a class="url" href="http://www.web2con.com/">http://www.web2con.com/</a>
 <span class="summary">Web 2.0 Conference</span>:
 <time class="dtstart" datetime="2005-10-05">October 5</time>-<time class="dtend" datetime="2005-10-07">7</time>,
 at the <span class="location">Argent Hotel, San Francisco, CA</span>
</div>
=== snip ===

Note: the parenthetical paragraph in the previous version about end date inconsistency has been removed since hCalendar 1.0 has resolved that issue (see the dtend issue for details). More details (if needed) on the wiki: http://wiki.whatwg.org/wiki/Time_element#Update_hCalendar_example

Thanks, Tantek -- http://tantek.com/ - I made an HTML5 tutorial! http://tantek.com/html5
Re: [whatwg] HTML5 (including next generation additions still in development) - Mozilla Firefox (Not Responding)
On 8/11/10, Boris Zbarsky bzbar...@mit.edu wrote: On 8/11/10 11:48 AM, Boris Zbarsky wrote: javascript:var start = new Date(); function f(n) { for (var k = n.firstChild; k; k = n.nextSibling) f(k); } f(document); alert(new Date() - start) Er, that had a typo. The correct script is: javascript:var start = new Date(); function f(n) { for (var k = n.firstChild; k; k = k.nextSibling) f(k); } f(document); alert(new Date() - start); My result is 1012, and that is significant and noticeable. It's also a highly contrived example. When you start doing any DOM manipulation, particularly appending or removing nodes, you're going to notice much larger times. Now the numbers are slightly larger; on the order of 230ms to 350ms. Barely above human lag-perception. This is on a several-years-old laptop. How do you figure that's barely above human lag perception? Garrett
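For the record, the typo Boris corrected was not cosmetic: advancing the loop with `n.nextSibling` (a constant within the loop) instead of `k.nextSibling` either truncates the walk or never terminates, depending on the tree shape. A sketch using synthetic nodes standing in for DOM nodes (my own illustration, assuming only the firstChild/nextSibling shape):

```javascript
// Build a synthetic node with the given children linked as siblings.
function node(...children) {
  const n = { firstChild: children[0] || null, nextSibling: null };
  for (let i = 0; i < children.length - 1; i++)
    children[i].nextSibling = children[i + 1];
  return n;
}

// Walk the tree with a pluggable "advance" step; cap guards against
// the non-terminating variant.
function countVisited(root, advance, cap = 100) {
  let count = 0;
  (function f(n) {
    if (++count > cap) throw new Error("walk did not terminate");
    for (let k = n.firstChild; k; k = advance(n, k)) f(k);
  })(root);
  return count;
}

const tree = node(node(node(), node()), node()); // 5 nodes total
countVisited(tree, (n, k) => k.nextSibling); // 5: correct walk visits every node
// countVisited(tree, (n, k) => n.nextSibling); // typo'd walk: throws, it
// keeps re-visiting the same sibling because n.nextSibling never changes
```

So the timing numbers being compared in this thread only make sense for the corrected script; the typo'd one doesn't measure a full DOM traversal at all.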
Re: [whatwg] typeMismatch for type=number (Re: Input color state: type mismatch)
On Fri, 23 Jul 2010, TAMURA, Kent wrote: On Sat, Apr 3, 2010 at 06:37, Ian Hickson i...@hixie.ch wrote: On Sat, 3 Apr 2010, TAMURA, Kent wrote: I found type=number also had no typeMismatch. If a user wants to type a negative value, he types '-' first. This state should make typeMismatch true because '-' is not a valid floating point number. The user agent shouldn't update the value until the input is a valid number. (User agents must not allow the user to set the value to a string that is not a valid floating point number.) I don't accept this behavior. Suppose that a user types '-' into an empty <input type=number>, then presses ENTER to submit the form. As per the current specification, the UA should send an empty value for the number control even though the number control has a visible string. The user doesn't expect that a value different from the visible value is sent. This is very confusing. In such a case, the UA should prevent the form submission and show a validation message for typeMismatch. I would expect hitting enter in such a scenario to do something like change the display so that the '-' is replaced by a zero. This is similar to, e.g., how when you type in '-' as the port in the Mac OS X Network configuration panel's Proxy tab and then hit OK, it just gets dropped without comment, or how when you put in a bogus IP address as a DNS entry and hit OK, it beeps and removes the entry. Firefox's Offline Storage numeric entry edit box and the edit boxes for proxy port numbers are similar -- they won't let you enter an invalid value. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
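The phrase "valid floating point number" above is a defined HTML microsyntax, and a lone '-' falls outside it, which is the crux of the disagreement. A rough regex transcription of that grammar (mine, not normative; check the spec for the authoritative definition): optional minus sign, one or more digits, optional fraction, optional exponent.

```javascript
// Rough transcription of HTML's "valid floating-point number" grammar.
// Notably: no leading '+', no bare '.', and a lone '-' does not match,
// so '-' can never become the control's value.
const FLOAT_RE = /^-?\d+(\.\d+)?([eE][-+]?\d+)?$/;

function isValidFloatingPointNumber(s) {
  return FLOAT_RE.test(s);
}

isValidFloatingPointNumber("-");     // false - the state Kent is worried about
isValidFloatingPointNumber("-5");    // true
isValidFloatingPointNumber("1.5e3"); // true
```

Since '-' fails this grammar, the spec's model leaves the control's submitted value empty even though a character is visible in the box, which is exactly the visible-versus-submitted mismatch being debated.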
Re: [whatwg] HTML5 (including next generation additions still in development) - Mozilla Firefox (Not Responding)
On 8/11/10 9:17 PM, Garrett Smith wrote: On 8/11/10, Boris Zbarsky bzbar...@mit.edu wrote: On 8/11/10 11:48 AM, Boris Zbarsky wrote: javascript:var start = new Date(); function f(n) { for (var k = n.firstChild; k; k = n.nextSibling) f(k); } f(document); alert(new Date() - start) Er, that had a typo. The correct script is: javascript:var start = new Date(); function f(n) { for (var k = n.firstChild; k; k = k.nextSibling) f(k); } f(document); alert(new Date() - start); My result is 1012 In what browser? Firefox 3.6? (And presumably on reasonably slow hardware, if so.) If so, really do try 4.0 beta. It's a good bit faster. It's also a highly contrived example. When you start doing any DOM manipulation, particularly appending or removing nodes, you're going to notice much larger times. Well, sure, but you also won't be walking the entire DOM in JS like this. The HTML5 spec scripts sure don't, last I checked. Now the numbers are slightly larger; on the order of 230ms to 350ms. Barely above human lag-perception. This is on a several-years-old laptop. How do you figure that's barely above human lag perception? The commonly accepted figure for when things start to feel laggy in UI terms is 200ms. If someone clicks and nothing happens for more than 200ms, they perceive the response as slow. Otherwise they generally perceive it as pretty much instant. -Boris
Re: [whatwg] Please disallow javascript: URLs in browser address bars
On Thu, 22 Jul 2010, Luke Hutchison wrote: There has been a spate of Facebook viruses in the last few months that have exploited social engineering and the ability to paste arbitrary JavaScript into the address bar of all major browsers to propagate themselves. Typically these show up as Facebook fan pages with an eye-catching title that ask you to copy/paste a piece of JavaScript into the address bar to show whatever the title is talking about. However doing so scrapes your Facebook friends list, and the virus mails itself to all your fb friends. [...] There is no legitimate reason that non-developers would need to paste javascript: URLs into the address bar, and the ability to do so should be disabled by default in all browsers. (Of course this would not affect the ability of browsers to handle clicks on javascript: links.) This seems like a UI issue, so I haven't changed the spec (it doesn't really talk about the location bar -- indeed it doesn't even require that one be visible at all). However, should anyone want to discuss this further, e.g. to organise browser vendor plans, you are welcome to do so. On Thu, 22 Jul 2010, Boris Zbarsky wrote: On 7/22/10 5:03 PM, Mike Shaver wrote: What should the URL bar say when the user clicks a javascript: link which produces content? <a href="javascript:5">five!</a> This part the spec actually covers, I think; the URL bar is supposed to say the URL of the page that link was on, iirc. Which is what I think everyone but Gecko does already; we actually show the javascript: URL in the URL bar in this case. Well, the requirement (search for override URL to see what we're talking about here) isn't on the location bar per se -- it's just on what the document's address is, which is used in some of the APIs. You don't have to show that; indeed you could show both, or something else, or nothing. -- Ian Hickson http://ln.hixie.ch/
Re: [whatwg] Please disallow javascript: URLs in browser address bars
On Thu, Jul 22, 2010 at 1:46 PM, Adam Barth w...@adambarth.com wrote: On Thu, Jul 22, 2010 at 1:41 PM, Aryeh Gregor simetrical+...@gmail.com wrote: On Thu, Jul 22, 2010 at 4:32 PM, Luke Hutchison luke.hu...@mit.edu wrote: There is no legitimate reason that non-developers would need to paste javascript: URLs into the address bar, and the ability to do so should be disabled by default in all browsers. Sure there is: bookmarklets, basically. javascript: URLs can do lots of fun and useful things. Also fun but not-so-useful things, like: javascript:document.body.style.MozTransform=document.body.style.WebkitTransform=document.body.style.OTransform="rotate(180deg)";void(0); (Credit to johnath for that one. Repeat with 0 instead of 180deg to undo.) You can do all sorts of interesting things to the page by pasting javascript: URLs into the URL bar. Of course, there are obviously security problems here too, but no legitimate reason is much too strong. We could allow bookmarklets without allowing direct pasting into the URL bar. That would make the social engineering more complex at least. Adam Would a pop-up warning be sufficient, rather than disallowing it? For example, if I write the following URL into Firefox... http://char...@49research.com/ ... Firefox will pop up a modal dialog box with the following message... You are about to log in to the site 49research.com with the username charles, but the website does not require authentication. This may be an attempt to trick you. Is 49research.com the site you want to visit? [yes] [no] Perhaps a modal dialog box could pop up for copy-and-pasted javascript: URLs too (after the user presses Enter). -- Charles Iliya Krempeaux, B.Sc.