Re: [whatwg] Document with a single input[type=radio]?
Mikko Rantalainen writes: > The spec says in 4.10.5.1.16 Radio Button state (type=radio) > https://html.spec.whatwg.org/multipage/forms.html#radio-button-state-%28type=radio%29 > > "A document must not contain an input element whose radio button group > contains only that element." > > What is this supposed to mean in practice? That if you include a radio button group with only one radio button in it, your document isn't valid HTML. > Could this sentence be dropped because this does not match real world > browser behavior? No, because it isn't a requirement on browsers, so browser behaviour is irrelevant. It's a requirement on authors. (It could be dropped for other reasons, of course. I'm not making any claim either way on whether it's a good requirement on authors; merely that browser behaviour can't be the reason for dropping it.) > (A streaming browser may hit this case after parsing the first > radio button element in the document. Then what?) Then browsers will follow the parsing algorithms and treat it accordingly. For backwards compatibility and interoperability, the spec covers how to handle all sorts of invalid input that authors might send them. Smylers -- http://twitter.com/Smylers2
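To make the authoring requirement discussed above concrete, here is a minimal sketch of how a conformance checker might flag radio buttons whose group contains only that one element. It works on a simplified model, which is an assumption of this sketch: each radio is a plain object with hypothetical `id`, `form` and `name` fields, radios are grouped by form owner and name, and a radio without a name is treated as a group of one.

```javascript
// Simplified model (an assumption, not the spec's full algorithm):
// each radio is { id, form, name }, where `form` is the id of its form
// owner (or null) and `name` may be empty or missing.
function findLoneRadios(radios) {
  const groups = new Map();
  const lone = [];
  for (const radio of radios) {
    if (!radio.name) {
      // A radio without a name cannot share a group with any other radio.
      lone.push(radio.id);
      continue;
    }
    // Group by (form owner, name).
    const key = JSON.stringify([radio.form ?? null, radio.name]);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(radio.id);
  }
  for (const ids of groups.values()) {
    if (ids.length === 1) lone.push(ids[0]);
  }
  return lone;
}

// Hypothetical sample data for illustration only.
const sampleRadios = [
  { id: "a", form: null, name: "colour" },
  { id: "b", form: null, name: "colour" },
  { id: "c", form: "f1", name: "colour" },
  { id: "d", form: null, name: "" },
];
console.log(findLoneRadios(sampleRadios));
```

Note that this is a check on the finished document, which is the author's concern. It says nothing about what a browser does while the page is still being parsed.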
Re: [whatwg] Icon mask and theme color
Maciej Stachowiak writes: On Jun 16, 2015, at 4:37 AM, Nils Dagsson Moskopp n...@dieweltistgarnichtso.net wrote: (5) Use the shape of the path in the SVG icon as a mask and retain the theme color meta value. Why isn't this done? One could have a properly colored icon for one purpose and use the outline of the same icon for the flat design stuff. We could change to considering only the alpha channel of the mask icon instead of both mask and luminance. ... Note though, that even if we went alpha-only, it might not be possible to use the same file for a mask icon and a full-color icon and get good results, for certain effects. Sure, for the best results a site may want separate icons. But the recent threads have been largely prompted by sites inadvertently serving suboptimal icons, so we also need to consider the behaviour when they make a mistake, not just the ideal case. And even for the ideal case, a single icon may suffice for some sites. Twitter, for example, with a solid blue bird shape as the colour icon, which could also work as a mask. That _some_ sites would require two icons doesn't seem like a reason to impose that burden on _all_ sites. obvious example is Facebook’s normal favicon, which is a white lowercase f on a blue rounded rectangle. It’s important in the color version for the white to be white, not transparent, but if both the white and blue are solid, the mask version is just a roundrect. Yep, the ideal colour version wouldn't work as a mask. But t'other way round, the mask could work as an acceptable (albeit not ideal) colour icon. Currently when the mask is inadvertently used as the ‘colour’ icon, it has to be all black. But with Nils's suggested change above, the mask could use Facebook blue instead of black; the masking effect would be the same, but if the mask ends up being interpreted as a colour icon, it then at least has some colour in it.
On Jun 15, 2015, at 12:53 PM, Kornel Lesiński kor...@geekhood.net wrote: (4) Don't require the mask icon to be 100% black and read the color from the icon itself. The mask flag would indicate that shape of the icon is distinctive enough, i.e. alpha channel of the icon can be used without the color channels, but wouldn't forbid use of color channels. If in Safari you'd like to enforce use of only a single solid theme color for the icon, then you can compute the theme color by averaging colors of all non-transparent pixels of the mask icon, and use that as the icon's theme color. We do have a requirement to have the mask icons render with a single color. I don’t think the approach suggested here is very good. Color averaging would not be very predictable in its results and could be unstable to changes in the icon if it’s actually multi-color. No, but colour-averaging would only be a fallback to get _some_ colour in the situation where the developer failed to follow guidelines and put multiple colours in their mask image. Again, consider Twitter: if they have an icon which already is a solid shape of the correct colour (so it can be used as a colour icon, too), why should they have to specify that colour a second time in their HTML? You already know what the colour is, from the icon itself. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] A mask= advisory flag for link rel=icon
Kornel Lesiński writes: - Change link rel=icon mask to link rel=mask-icon, but keep using the theme-color meta for the color Please don't use meta theme-color. Financial Times' theme color is salmon pink (#fff1e0), but FT's logo must use black letters. That's another advantage of specifying the mask icon should be a single colour (with transparency), and using that colour as the basis for displaying it: The Pink Un can use black letters and have them actually be black, and Twitter can use a blue bird and have it actually be blue, with nobody having to add or change any existing theme-color. It's also much easier to teach ‘if you want a red house, draw a solid house in the particular shade of red you want’ than ‘if you want a red house, draw it in solid black, then specify the shade of red separately in multiple files that you don't necessarily have full control over’. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] alternate ids for elements
Nils Dagsson Moskopp writes: Julian Reschke julian.resc...@gmx.de writes: On 2014-12-03 15:02, Jukka K. Korpela wrote: 2014-12-03, 15:49, Julian Reschke wrote: I have a use case where a certain location in a document can have two anchors ... Can you elaborate on that? Why cannot you use the same id attribute value in all references to an element? 1.) An author-supplied anchor may change, but you want to preserve existing deep links from other documents. The solution seems simple to me: Do not change the anchor id, ever. But what if the original ID used had a typo in it? Or a product name has to change for legal reasons? It's entirely reasonable for anchors to be ‘meaningful-to-human’ IDs that are indicative of the section they are labelling, and for section names to change over time. For instance, Wikipedia pages have an ID for each section which is based on the section name. Every time somebody edits a section title, the anchor changes ... and any external links specifically to that section break. There are far too many broken links on the web of this form, where the link goes to the correct page but includes a non-existent anchor. Quite often the intended section is still on the page, just with a new anchor. Letting sites keep old anchors working would be a benefit for users following external links to them. Alternatively, wrap the element in a div with a new id. Or have some JavaScript which checks for known alternative anchors and replaces them with their canonical spelling. Smylers -- http://twitter.com/Smylers2
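The last suggestion in the message above, some JavaScript which checks for known alternative anchors and replaces them with their canonical spelling, could look roughly like the following sketch. The alias table and its entries are hypothetical; a real site would generate it from its own history of renamed sections.

```javascript
// Hypothetical alias table: old anchor id -> current anchor id.
const anchorAliases = new Map([
  ["instalation", "installation"], // original id had a typo
  ["acme-widget", "widget"],       // product name had to change
]);

// Resolve a URL fragment (with or without the leading "#") to the
// canonical anchor id, following chains of renames.
function canonicalAnchor(hash, aliases) {
  let current = hash.startsWith("#") ? hash.slice(1) : hash;
  const seen = new Set(); // guards against accidental cycles in the table
  while (aliases.has(current) && !seen.has(current)) {
    seen.add(current);
    current = aliases.get(current);
  }
  return current;
}

// Browser-only wiring: rewrite the fragment when an old anchor is used.
if (typeof window !== "undefined") {
  const target = canonicalAnchor(window.location.hash, anchorAliases);
  if (target && "#" + target !== window.location.hash) {
    window.location.replace("#" + target);
  }
}
```

This keeps old deep links working only for visitors with scripting enabled, which is one reason a declarative way of keeping old anchors might still be preferable.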
Re: [whatwg] Dirty Property (Was: Markup-related feedback)
Ian Hickson writes: On Thu, 10 Jul 2014, Garrett Smith wrote: 1. Form `dirty` property. Set to false, initially. Set to true when the user has interacted with any of the form's controls to make them dirty. What's the use case? Situations where I could have used something like this: • A content management system where the user is previewing their content and can either publish it as it is or make further changes and preview it again. The ‘Publish’ button needs to check that no changes have been made (so that all content gets previewed before publishing). • An application which involves editing records (such as an address book), where closing the page without saving should throw a warning that the changes will be lost, unless there haven't been any changes. Can you work around this by just catching 'input' events on the form? Yes. Though on a form with many fields, a mixture of different types of controls, it can be tedious to get it right, and I'm not sure if it's possible to follow the behaviour of some native apps (such as Vim) where making a change and then pressing ‘undo’ unsets the ‘dirty’ flag. (Note I'm not arguing that the feature is, on balance, worth including, merely answering the questions you asked.) Smylers -- http://twitter.com/Smylers2
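One possible way to approximate the workaround discussed above, including the behaviour where undoing a change clears the flag, is to compare the current field values with a snapshot taken at load time instead of merely recording that an 'input' event fired. The following is a sketch under that assumption; `makeDirtyTracker` and `snapshot` are hypothetical helper names, and fields are modelled as a plain object of name/value pairs.

```javascript
// Serialise field values in a stable order so that two states with the
// same values always produce the same string.
function snapshot(fields) {
  const entries = Object.entries(fields).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0
  );
  return JSON.stringify(entries);
}

// Remember the initial state; "dirty" means "differs from that state",
// so reverting every change makes the form clean again.
function makeDirtyTracker(initialFields) {
  const baseline = snapshot(initialFields);
  return {
    isDirty(currentFields) {
      return snapshot(currentFields) !== baseline;
    },
  };
}

// Browser wiring (not run here), assuming `form` is an HTMLFormElement
// whose controls all have distinct names:
//   const read = () => Object.fromEntries(new FormData(form));
//   const tracker = makeDirtyTracker(read());
//   form.addEventListener("input", () => {
//     publishButton.disabled = tracker.isDirty(read());
//   });
```

Getting the comparison right for every kind of control (file inputs, repeated names, unchecked checkboxes) is where the tedium mentioned above comes in; this sketch does not attempt to cover those cases.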
Re: [whatwg] Preloading and deferred loading of scripts and other resources
Ian Hickson writes: Here's how [the proposal] would handle the use cases listed above. [Use-case G:] A website knows there's a piece of Javascript code that the user might need if they click on a part of the page. The developer would like to have the user download it, but not at the expense of other resources. script src=button-reaction.js id=reaction load-policy=when-needed precache low-priority // button-reaction.js defines react() /script button type=button onclick=document.scripts.reaction.load().then( function() { react(); }) Part of the Page /button What does low-priority add in case G? How does that differ from case H, where when-needed precache is sufficient to avoid delaying other things from loading? [Use-case H:] A website is prefetching photos in a photo album and would like to make sure these images are lower priority than images the user is actually viewing. img src=photo1.jpg alt=... load-policy=when-needed precache img src=photo2.jpg alt=... load-policy=when-needed precache img src=photo3.jpg alt=... load-policy=when-needed precache img src=photo4.jpg alt=... load-policy=when-needed precache img src=photo5.jpg alt=... load-policy=when-needed precache As they come into view, they'll become needed automatically. When they are not needed, they get precached if that wouldn't get in the way of other things getting loaded. Thanks. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] input type=number for year input
Ryosuke Niwa writes: On Mar 7, 2014, at 3:54 AM, Smylers smyl...@stripey.com wrote: An international website wanting a [year] ... could internally store all years using one particular system (say the Gregorian one), but allow input in other systems. This could be with a free-form text box with interpretation, validation, and error-handling on the server side, but that would be a substandard user interface. Better would be to use browser-side JavaScript either to perform the validation or to provide a year picker which only allows selecting valid years; regardless of the interface on this picker — for instance, listing Japanese emperors — it could set the value submitted with the form to be the equivalent Gregorian year. This is why type=year would be useful so that UAs could present it in accordance to the user preference. It's only useful if there are actually sites which want to do this. Are there many websites currently catering [for] Japanese years by offering such an interface? If so, it would make sense to create input type=year such that browsers can offer this consistently, freeing authors from having to develop these for each site. SMBC, the second largest bank in Japan, has an online account form which asks the date of birth using Japanese calendar system. They don't provide an option to type that in using the Gregorian calendar at least in the Japanese version of their website. Sony bank (moneykit.net) asks the date of birth using Gregorian calendar but provides a conversion table from Japanese calendar system: http://moneykit.net/visitor/account/account14.html I'll note, however, that both of these use cases are better addressed via type=date. Yes, so they aren't actually demonstrating that input type=year would be useful. Also, both of those seem to be sites intended only for Japanese users. 
As such, a Japanese-specific year selection is sufficient for them; they can use select for the entire year, or possibly select for the era then input type=number for the year within that era. Such sites wouldn't be trying to use input type=number for a year in the first place, so the unsuitability of it for Japanese years doesn't matter. The need for a widget which offers either Japanese or Gregorian interfaces for selecting a year, depending on the user's preference, then always submits it in a single defined way to the server, only really crops up on an international site which can expect users most familiar with each of those calendar systems. However, that still wouldn't solve the problem of input type=number putting commas in 4-digit page numbers. Right. But if we had type=year, UAs could localize it appropriately for this use case. Indeed. But if input type=year isn't otherwise useful, there may be a more generic way of addressing the ‘no commas in 4-digit years’ issue which also addresses 4-digit page numbers (and the like). Cheers Smylers -- http://twitter.com/Smylers2
Re: [whatwg] input type=number for year input
Ryosuke Niwa writes: On Mar 11, 2014, at 2:28 AM, Smylers smyl...@stripey.com wrote: Ryosuke Niwa writes: On Mar 7, 2014, at 3:54 AM, Smylers smyl...@stripey.com wrote: Are there many websites currently catering [for] Japanese years by offering such an interface? If so, it would make sense to create input type=year such that browsers can offer this consistently, freeing authors from having to develop these for each site. SMBC, the second largest bank in Japan, has an online account form which asks the date of birth using Japanese calendar system. They don't provide an option to type that in using the Gregorian calendar at least in the Japanese version of their website. Sony bank (moneykit.net) asks the date of birth using Gregorian calendar but provides a conversion table from Japanese calendar system: http://moneykit.net/visitor/account/account14.html I'll note, however, that both of these use cases are better addressed via type=date. Yes, so they aren't actually demonstrating that input type=year would be useful. Also, both of those seem to be sites intended only for Japanese users. As such, a Japanese-specific year selection is sufficient for them; they can use select for the entire year, or possibly select for the era then input type=number for the year within that era. Such sites wouldn't be trying to use input type=number for a year in the first place, so the unsuitability of it for Japanese years doesn't matter. younger generations tend to be more comfortable with the Gregorian calendar while older generations tend to be more comfortable with Japanese era system. Ah, OK. input type=date with an interface offering both ways of specifying a date would indeed be useful, but I don't think that requires any changes to the HTML spec. It follows that any site which wants a year for any purpose and has a Japanese audience would therefore benefit from a year widget which also offers both Gregorian and Japanese ways of specifying a year. 
I could find such sites in English, and hypothesize that equivalent Japanese sites would also exist. But in terms of providing requirements to the HTML spec, it'd be better to have examples of actual sites, not mere hypothetical ones. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] input type=number for year input
Ryosuke Niwa writes [re-ordered]: On Feb 19, 2014, at 7:36 AM, Jukka K. Korpela jkorp...@cs.tut.fi wrote: 2014-02-19 11:10, Smylers wrote: Jukka K. Korpela writes: The point is that year numbers aren't really numbers in a normal sense, any more than car plate numbers, credit card numbers, product numbers, or social security numbers are. Surely they can be regarded as numbers, but so can car plate numbers and the others. Except that years do actually form a sequence, and it's possible to perform maths on them; for instance, subtracting one year from another yields a duration Mathematically, you are right, but input types aren't based on general properties of quantities but on practical classification of input data. All the examples I gave, including year numbers, are normally input by typing the digits - in contrast with, say, using a color picker, a date picker, or a slider. And year numbers differ, as mentioned, from normal numbers as regards to conventional formats (e.g., 2014 vs. 2,014 or 2'014 or 2 014 or...). So in the input process, a year number is not treated like a number. It typically appears when asking for year of birth or some other event (marriage, employment, etc.). The input check is normally against any non-digit data, the kind of thing we can do with pattern=... Logically, one might say that since asking for a year is very often an alternative to asking for more specific data such as month or day, it should be treated as date and time input rather than text input with restrictions. But I don't see how this would be practically relevant. What else could input type=year be other than reading some digits? There is the possibility of allowing two-digit numbers, with an implied century, but if that is desirable, authors can use input type=text pattern=\d{4}|\d{2} and deal with the implied century in their own code. Let me point out that not every calendar uses simple 2-4 digit numbers to denote years.
The Japanese era name calendar system, for example, requires an era name such as Showa and Heisei associated with each year. For example, I was born in Gregorian year 1986 but any Japanese government document would say I was born in Showa 61. My brother was born in 1989 but, again, he must write Heisei 1 instead on any government form. There are also quite a few banks and other organizations in Japan that use the era name system for various forms and documents. Yes, so for a Japanese organization using the era system, input type=number would clearly be inappropriate for years. An international website wanting a year could use input type=text and let users specify the year any way they want: Japanese eras, 2-digit years, Roman numerals, whatever. This could only realistically be stored as a string, and the only thing the website could do with it is display it again; it would be hard to sort by it, or perform restrictions based on the year, for instance. In this scenario, there's nothing special about the year so far as HTML is concerned. Or it could force all users to use Gregorian years, and anybody using a different system needs to convert their year themselves. At which point input type=number works just fine. Or the website could internally store all years using one particular system (say the Gregorian one), but allow input in other systems. This could be with a free-form text box with interpretation, validation, and error-handling on the server side, but that would be a substandard user interface. Better would be to use browser-side JavaScript either to perform the validation or to provide a year picker which only allows selecting valid years; regardless of the interface on this picker (for instance, listing Japanese emperors), it could set the value submitted with the form to be the equivalent Gregorian year. Are there many websites currently catering for Japanese years by offering such an interface?
If so, it would make sense to create input type=year such that browsers can offer this consistently, freeing authors from having to develop these for each site. However, that still wouldn't solve the problem of input type=number putting commas in 4-digit page numbers. Cheers Smylers -- http://twitter.com/Smylers2
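The third option in the message above, accepting a year in another calendar system but always submitting the equivalent Gregorian year, can be sketched as follows. Only the two eras mentioned in the thread are included, using the correspondences given there (Showa 61 is 1986, Heisei 1 is 1989); the function name and table are hypothetical, and the sketch does not check that a year falls within an era's actual length.

```javascript
// Gregorian year in which each era's year 1 falls. Only the eras named
// in the thread are listed; a real picker would need the full table.
const ERA_FIRST_YEAR = new Map([
  ["Showa", 1926],
  ["Heisei", 1989],
]);

// Convert (era, year within era) to a Gregorian year, or null when the
// era is unknown or the year is not a positive integer.
function eraToGregorian(era, yearInEra) {
  const first = ERA_FIRST_YEAR.get(era);
  if (first === undefined) return null;
  if (!Number.isInteger(yearInEra) || yearInEra < 1) return null;
  return first + yearInEra - 1;
}

console.log(eraToGregorian("Showa", 61)); // 1986
console.log(eraToGregorian("Heisei", 1)); // 1989
```

A picker built this way could write the result into a hidden field, so that the server only ever sees Gregorian years whatever interface the user chose.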
Re: [whatwg] input type=number for year input
Nils Dagsson Moskopp writes: Jonathan Watt jw...@jwatt.org writes: is it wrong to use input type=number for year input. I am certainly not an expert on the topic, but I believe the conceptual problem can be reduced to using an input designed for a group (in the mathematical sense) to represent a value that is a torsor. Quote http://ro-che.info/articles/2013-01-08-torsors.html: That seems to be begging the question. What is it about input type=number that leads you to believe it makes sense to add two such values together? Would you also rule out input type=number for page numbers? If I've reached page 122 of a book and you've read up to page 169, adding those together to get page 291 doesn't make any sense, especially if the book only has 255 pages. Or for temperatures in a degree scale, such as Celsius? Here in Leeds it was 10°C yesterday and is 13°C today, but adding those together to make 23° is absurd. Or what about for people's ages? It seems nonsense to me when the media write things like The Rolling Stones having “a combined age of 273”, yet they do: http://www.standard.co.uk/goingout/music/the-rolling-stones-o2-arena--review-8351840.html While adding two dates is not possible, it is possible to add a time interval to a date («five days from today»). This suggests that we should not confound dates and time intervals — they are different types of values. Therefore asking for a duration using input type=number is fine – asking for a calendar year, however, is obviously a type error. http://math.ucr.edu/home/baez/torsors.html That doesn't seem obvious to me at all. The only mathematical operations I've seen available on input type=number are increment and decrement, which clearly is possible on (Gregorian) years. There are many situations where a web form could wish to collect a non-self-addable value such as a temperature. What would be the advantage in telling web authors they can't use input type=number for these?
We could add input type=torsor (except I suspect most web authors would be unfamiliar with the term; I can't find it in the ‘Oxford English Dictionary’), but if its appearance, interface, and behaviour are identical to that of input type=number, what is the point of distinguishing the two? Cheers Smylers -- http://twitter.com/Smylers2
Re: [whatwg] input type=number for year input
Jukka K. Korpela writes: The point is that year numbers aren't really numbers in a normal sense, any more than car plate numbers, credit card numbers, product numbers, or social security numbers are. Surely they can be regarded as numbers, but so can car plate numbers and the others. Except that years do actually form a sequence, and it's possible to perform maths on them; for instance, subtracting one year from another yields a duration[*1], which is a meaningful quantity, whereas subtracting a couple of credit card numbers is completely useless. [*1] Yes, there are exceptions. But there are still many situations where this is useful, because of the context, such as the range of possible years and the location. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] input type=number for year input
Jukka K. Korpela writes: 2014-02-19 11:10, Smylers wrote: Jukka K. Korpela writes: The point is that year numbers aren't really numbers in a normal sense, any more than car plate numbers, credit card numbers, product numbers, or social security numbers are. Surely they can be regarded as numbers, but so can car plate numbers and the others. Except that years do actually form a sequence, and it's possible to perform maths on them; for instance, subtracting one year from another yields a duration Mathematically, you are right, but input types aren't based on general properties of quantities but on practical classification of input data. That's a reasonable way of doing it. All the examples I gave, including year numbers, are normally input by typing the digits Many other numbers (actual, no-doubt-about-it, definitely 100% genuine numbers) are also typically typed in. - in contrast with, say, using a color picker, a date picker, or a slider. There are situations where up/down arrows make sense on years. For instance, a chart of various baby names could have a box for the year currently being displayed, and it's handy to be able to nudge that along by a year at a time to see it change, without having to manually retype the year. Or when displaying one year's tax return, with the ability to display other years' returns, with adjacent years being likely options. Obviously not every year actually gets treated as a number, but there are many situations where they are, and where a number input control makes sense for them. Contrast this with credit card numbers or telephone numbers, which never actually get treated as numbers (unless you want a form with the ability to easily cycle through the final digit of a credit card number until it passes the mod 10 check!). And year numbers differ, as mentioned, from normal numbers as regards to conventional formats (e.g., 2014 vs. 2,014 or 2'014 or 2 014 or...).
Many people, at least in the UK, don't bother with a thousands separator in 4-digit numbers anyway, but probably would put them in a 5-digit year. The style guides Mike quoted (which in general did use commas in 4-digit numbers) also had other categories apart from years which don't use commas, including page numbers. Page numbers are undoubtedly numbers, and it definitely makes sense to provide up/down arrows for them. So if we wish to be able to follow those style guides, we still need to be able to provide comma-less input type=number controls for page numbers, regardless of whether you consider a year to be a number. If we don't care about following those style guides, we could simply go with Hixie's suggestion of never putting thousands separators in 4-digit numbers. In neither case does decreeing that years aren't numbers actually help. Cheers Smylers -- http://twitter.com/Smylers2
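For illustration, the rule attributed to Hixie above (never put thousands separators in 4-digit numbers) can be written as a small formatting function. This is only a sketch of the display rule in script, using the standard `Intl.NumberFormat` API with a fixed `en-US` locale chosen for the example; it is not a description of how any browser renders input type=number.

```javascript
// Group digits only when the number has five or more integer digits,
// so years and page numbers such as 2014 stay comma-less.
function formatWithoutFourDigitGrouping(n) {
  const useGrouping = Math.abs(n) >= 10000;
  return new Intl.NumberFormat("en-US", { useGrouping }).format(n);
}

console.log(formatWithoutFourDigitGrouping(2014));  // "2014"
console.log(formatWithoutFourDigitGrouping(12014)); // "12,014"
```

The same function serves years and page numbers alike, which is the point made above: the rule needs no notion of whether a year "is" a number.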
Re: [whatwg] Stroking algorithm in Canvas 2d
Rik Cabanier emailed: On Thu, Oct 10, 2013 at 2:48 PM, Ian Hickson i...@hixie.ch wrote: On Thu, 10 Oct 2013, Rik Cabanier wrote: On Thu, Oct 10, 2013 at 1:28 PM, Ian Hickson i...@hixie.ch wrote: On Thu, 10 Oct 2013, Rik Cabanier wrote: If you draw a rect with dashes today, the dashing will be applied normally. Justin wants to change this behavior so we will need something to trigger that. Otherwise, existing applications that use dashed rectangles will start looking different. Do we really have enough deployed content using this API that we are already constrained? What applications are these? Not sure I follow. Are you asking who would use dashed rectangles in canvas? You mentioned existing applications. I'm just curious which these are? Websites using canvas? Do you have URLs I could look at? Rik, I see you didn't answer that question. While obviously it's possible that there are websites which are using dashes in canvas and are already depending on the precise behaviour in browsers that already implement it, the argument ‘we need to keep it this way for backwards compatibility’ only works if we can see examples of actual sites where that is the case, and where changing the dashes would be detrimental to them. Jasper St. Pierre writes: On Thu, Oct 10, 2013 at 6:57 PM, Rik Cabanier caban...@gmail.com wrote: On Thu, Oct 10, 2013 at 3:36 PM, Ian Hickson i...@hixie.ch wrote: Just so we're clear, I really don't have a strong opinion on this issue. I just want to make sure we apply the same rigour to deciding what the model should be as we do to everything else, and that means not just doing things because they've always been done that way, but instead either figuring out why they've always been done that way, or starting from first principles or data and deriving the right behaviour. I think that's totally reasonable. That's how we've always done it often no longer applies.
Consistency with every other drawing model out there is probably more important than you first imagine. Documentation, testing, interoperability between browsers, and developer learning are all big motivations to keep things the same. Conversely, I suspect that overall it's less important than you imagine: the web is continuing to grow at a massive rate, so there are likely to be far more web developers in the future than there have been developers so far. In other words, most developers using this API won't have existing experience of pre-canvas systems' dash-drawing APIs, and that will become more so as time goes on. It's likely that early adopters of the technology will be those who are already working in that field using other systems, but I suggest it would be a mistake to design for the preferences of this small group. If a new web developer asks “Why isn't there a straightforward way of doing X?”, I'd rather that the answer isn't “To retain compatibility with a non-web technology which was invented before you were born.” Cheers Smylers -- Stop drug companies hiding negative research results. Sign the AllTrials petition to get all clinical research results published. Read more: http://www.alltrials.net/blog/the-alltrials-campaign/
Re: [whatwg] use cases for figure without figcaption?
Steve Faulkner writes: What are the use cases for a figure without a figcaption ? If a work has only one figure (or graph, map, code listing, whatever) in it, then the surrounding text could say something like see the graph and it'd be obvious what it's referring to, without the need for any further label. Or the content of a figure may intrinsically have a title embedded in it already, such that an additional caption would be superfluous. Smylers -- Stop drug companies hiding negative research results. Sign the AllTrials petition to get all clinical research results published. Read more: http://www.alltrials.net/blog/the-alltrials-campaign/
Re: [whatwg] Proposal: Change HTML spec to allow any arbitrary value for the meta name attribute
Michael[tm] Smith writes: Speaking from my perspective as a contributor to development of a conformance checker: In practice, we receive a lot of comments and bug reports from confused/frustrated users who are trying to use values for meta@name that are not registered. Hi Mike. Thanks for sharing your wisdom on this. Could you give some examples of the kinds of meta names people are using? And do you have an idea which user-agents are acting on those names? I'm interested in whether the meta tags people are using are meaningless cruft or have useful effects in niche fields. And as far as the strategy of trying to use the spec and Wiki page as a means to educate them about trying to taking the time to register meta@name values and only use registered values and standard values (those listed in the spec), well, that strategy is not working well. They just want the validator to shut up. How about a validator interface along the lines of: I see you've used a meta name=kapow tag. kapow isn't a meta name that I know about. Are you sure that it's a correct name, and that it's doing something useful in your document? If so, please tell me its purpose here, then I'll know what it's for and I won't complain about it again: __ Then the validator could add a wiki entry for it. Cheers Smylers -- Stop drug companies hiding negative research results. Sign the AllTrials petition to get all clinical research results published. Read more: http://www.alltrials.net/blog/the-alltrials-campaign/
Re: [whatwg] Proposal: Change HTML spec to allow any arbitrary value for the meta name attribute
Robin Berjon writes: On 04/06/2013 11:08 , Smylers wrote: Michael[tm] Smith writes: we receive a lot of comments and bug reports from confused/ frustrated users who are trying to use values for meta@name that are not registered. Could you give some examples of the kinds of meta names people are using? I've seen quite a few. One recent example is bug-assist.js — a script that makes it easy for readers of a document to file bugs about it — that looks for all metadata names that start with bug. and uses the remainder of the name as parameters to a Bugzilla bug entry. Thanks. That's really helpful for understanding the issue. The point is often that the person seeing the validity error is not the same person who defined the metadata name. That seems to be an instance of the general scenario where a page includes some components provided by third parties (including where the main content is inserted into an outer template provided by a third party), and where a diligent local author wishes to check for errors in her content but not be nagged over problems in parts of the page out of her control. If the third parties care about conformance (or at least care about not losing customers who care about conformance), then they will be amenable to fixing bugs, such as the one Simon reported for bug-assist. And indeed in this case the validator error does something useful. If the third parties _don't_ care about conformance then there could be any sorts of errors in code they provide, not just those relating to meta name=... -- in which case it doesn't sound like it's going to be possible to quell all the error messages that third parties could make while still notifying authors about problems with their part. Maybe instead a validator could let an author select which portion of a page she has jurisdiction over? Or perhaps it could allow uploading both a 'known bad' empty template and a complete page, and only complain about errors in the second that aren't also in the first? 
(That would also help in a similar situation when editing an old, non-conforming site, and wishing to check that you haven't introduced any new errors, but aren't in a position right now to fix all the existing ones. You could upload the current error-strewn page and your proposed change, and only be told about errors you have introduced.) It [registering meta names] doesn't seem to buy anyone much, either. That seems a more interesting assertion. Is any harm actually being caused by rogue meta names? If somebody changes bug-assist to use data-* attributes instead, does that make the world a better place -- or at least enough of a better place to be worth doing? The benefits would seem to be in avoiding naming clashes: * To bug-assist's developers (and users) it prevents somebody else in the future minting clashing meta names which have a different meaning, and then erroneously interpreting data intended for bug-assist. * To other minters of meta names it reduces either complaints about clashing or time spent checking for clashes (without a central list). Having a canonical list of allowed names and a validator that complains about names not on the list means that in the event of a clash, there's a place where a party using an unregistered name can be alerted to the possibility in advance. Whereas with any value allowed they don't get that. Note, I'm not saying those benefits do constitute sufficient reason to keep the check -- just that we should consider them, so that if we then abandon the check we have decided that they aren't worth bothering with. Cheers Smylers -- Stop drug companies hiding negative research results. Sign the AllTrials petition to get all clinical research results published. Read more: http://www.alltrials.net/blog/the-alltrials-campaign/
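To make the 'known bad template versus complete page' idea a little more concrete, here is a minimal sketch in Python. It assumes a validator can hand back its messages as simple records; the ValidationError shape, the error codes and the new_errors helper are all hypothetical, invented for illustration, and not part of any real validator.

```python
from collections import Counter
from typing import List, NamedTuple


class ValidationError(NamedTuple):
    """Hypothetical shape of one validator message (an assumption)."""
    code: str    # e.g. "bad-meta-name"
    detail: str  # e.g. the offending name or a snippet of markup


def new_errors(baseline: List[ValidationError],
               candidate: List[ValidationError]) -> List[ValidationError]:
    """Return the errors in candidate that the baseline does not account for.

    This is a multiset difference: if the baseline contains an error once
    and the candidate contains it twice, one instance is still reported.
    """
    remaining = Counter(baseline)
    fresh = []
    for err in candidate:
        if remaining[err] > 0:
            remaining[err] -= 1  # already present in the template
        else:
            fresh.append(err)    # introduced by the author's own content
    return fresh


# Errors from the empty third-party template (made-up data).
template_errors = [ValidationError("bad-meta-name", "bug.product")]
# Errors from the complete page, template plus local content (made-up data).
page_errors = [
    ValidationError("bad-meta-name", "bug.product"),
    ValidationError("missing-alt", "img#logo"),
]
result = new_errors(template_errors, page_errors)
```

Comparing on code and detail rather than on line numbers is a deliberate choice here, since inserting content into a template shifts every line number after it. A real tool would presumably need something smarter for matching.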
Re: [whatwg] Features for responsive Web design
Fred Andrews writes: From: m...@apple.com img style=width: 10em src=image-320x200.jpg set=image-320x200.jpg 320 200 10k, image-640x400.jpg 640 400 40k, image-1280x800.jpg 1280 800 150k The layout size of that img element is not computable until all external stylesheets have loaded, as you have written it. Actually, the image width is '10em' in this example, without having to load any style sheets! And how big is 10em? 1em is dependent on the font-size of the parent element of the img, which may be set by an external style-sheet. The browser can immediately determine the image to use and load it in this particular case. You see how easy it would be for authors to get this wrong, even if they knew they had to put image sizes in-line in order to have good performance and tried to do that. That you, the promoter of the feature, can't even get it right suggests that it would also be hard for authors to do so. Cheers Smylers -- New series of TV puzzle show 'Only Connect' (some questions by me) Mondays at 20:30 on BBC4, or iPlayer: http://www.bbc.co.uk/onlyconnect
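A small sketch of why '10em' alone doesn't settle which file to fetch. The candidate list is taken from the example markup quoted above; the selection rule (smallest candidate at least as wide as needed) and both function names are assumptions made purely for illustration, not anything the proposal specifies.

```python
from typing import List, Tuple

# (file name, pixel width) pairs from the quoted example.
CANDIDATES = [
    ("image-320x200.jpg", 320),
    ("image-640x400.jpg", 640),
    ("image-1280x800.jpg", 1280),
]


def em_to_px(em: float, parent_font_size_px: float) -> float:
    """An em length only becomes pixels once the font size is known."""
    return em * parent_font_size_px


def pick_candidate(candidates: List[Tuple[str, int]], needed_px: float) -> str:
    """Assumed rule: smallest candidate that is wide enough, else the widest."""
    ordered = sorted(candidates, key=lambda c: c[1])
    for name, width in ordered:
        if width >= needed_px:
            return name
    return ordered[-1][0]


# Same markup, different font sizes inherited from an external style sheet.
with_16px_font = pick_candidate(CANDIDATES, em_to_px(10, 16))
with_40px_font = pick_candidate(CANDIDATES, em_to_px(10, 40))
```

With a 16px font the 10em box is 160px wide, and with a 40px font it is 400px, so under this assumed rule the two cases pick different files. The browser can't know which until the style sheet that sets the font size has arrived.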
Re: [whatwg] alt= and the meta name=generator exception
Jukka K. Korpela writes: On 5.8.2012 15:52, Henri Sivonen wrote: Alice anticipates Bob's reaction and preemptively makes her generator output alt= So? Whose problem is this? It hurts users browsing without images of pages generated by that generator. If the validator can do something different which wouldn't nudge developers into writing software which produces such mark-up, end-users benefit. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] A mechanism to improve form autofill
Aryeh Gregor writes: ... in Israel, and I assume some other countries, there are national ID numbers that are considered public info. In the UK library user ID 'numbers' are useful on multiple sites. As well as the local library's own website, such a number grants access to many reference sites that the library subscribes to, such as those of 'The OED' and 'Encyclopædia Britannica'. Similarly there are other organizations that one can be a member of whose ID numbers need to be provided to multiple websites. For example with some loyalty cards or frequent flyer programmes points can be collected from multiple retailers. It would be really useful if the auto-fill mechanism could cope with such ID numbers: they are often long strings which few people know from memory. So I'm wondering if there could be a 'membership' or 'ID number' field-type, followed by an identifier for which organization this is, such as: membership-uk-library membership-israel-id membership-flypoints or: idnum-uk-library idnum-israel idnum-flypoints This would be different from the other autocomplete field types Hixie has proposed, because the organization suffix is open-ended, rather than from a fixed set. I think that's inevitable: the HTML standard can hardly spec every organization that somebody could be a member of. It would be up to each organization that issues membership numbers to decree the suffix that's used for it. Other websites that have forms requiring membership numbers for that organization presumably already have a relationship with it, so can easily ask them which suffix to use; there's no particular need for a central list where one can look up an arbitrary organization's suffix. Clashes are possible, of course, but I suspect in practice if organizations chose their own name (including a geographic part if the organization is specific to a particular country or region) this wouldn't be a big problem. 
'Number' above is in scare quotes, since some of these types of ID numbers contain letters as well. OpenID URLs could be viewed as one example of cross-site membership, so could possibly be covered by a system such as this. But since OpenID is an open standard which anybody can use, and isn't tied into a particular organization, an autocomplete type specifically for OpenID URLs may be worthwhile. Cheers Smylers -- http://twitter.com/Smylers2
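For what it's worth, an open-ended suffix is easy enough for an auto-fill implementation to handle. The sketch below splits a token into a fixed prefix and a free-form organization suffix. The prefixes are the two floated in this message; neither they nor the parse_membership_token helper exist in any spec, so treat all of it as hypothetical.

```python
from typing import Optional, Tuple

# The two candidate prefixes suggested above; both are only proposals.
PREFIXES = ("membership", "idnum")


def parse_membership_token(token: str) -> Optional[Tuple[str, str]]:
    """Split 'membership-uk-library' into ('membership', 'uk-library').

    Returns None for tokens that aren't membership-style, or that have
    no organization suffix after the prefix.
    """
    token = token.strip().lower()
    for prefix in PREFIXES:
        head = prefix + "-"
        if token.startswith(head) and len(token) > len(head):
            return prefix, token[len(head):]
    return None


library = parse_membership_token("membership-uk-library")
israel = parse_membership_token("idnum-israel")
card = parse_membership_token("cc-num")
```

The suffix is deliberately not checked against any list, which matches the point that there need be no central registry of organizations. A browser would simply store and recall values keyed on whatever suffix it encounters.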
Re: [whatwg] A mechanism to improve form autofill
Maciej Stachowiak writes: On Jul 25, 2012, at 11:21 PM, Aryeh Gregor a...@aryeh.name wrote: I would also like to point out that this feature seems to overlap with not only type= (as has been pointed out), but inputmode= as well, and for that matter pattern=. I think it would be quite unfortunate if authors found themselves writing things like input inputmode=numeric pattern=\d{16} autocompletetype=cc-num because that's logically pretty redundant. The specific combo of features you list is highly foreseeable. Perhaps specifying certain autocomplete types could set defaults for pattern and inputmode? So for this example autocomplete=cc-num would, if pattern isn't specified, imply pattern=\d{16}, and equivalently for inputmode? You may be right that there will be harder to predict scenarios. By having the highly foreseeable cases merely be defaults for pattern and inputmode, it allows anybody doing something less predictable to still set those attributes explicitly. The complicated cases would be possible, but wouldn't force redundancy on the common cases. Cheers Smylers -- http://twitter.com/Smylers2
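As a rough illustration of 'defaults that explicit attributes override', here is a sketch in Python. The table of defaults contains only the cc-num example from this thread, the \d{16} pattern is the one quoted above rather than a verified rule for card numbers, and effective_attributes is a made-up helper, not anything a browser exposes.

```python
from typing import Dict

# Assumed defaults, keyed by autocomplete type. Only the example from
# this thread is listed; a real table would need properly researched rules.
AUTOCOMPLETE_DEFAULTS: Dict[str, Dict[str, str]] = {
    "cc-num": {"pattern": r"\d{16}", "inputmode": "numeric"},
}


def effective_attributes(attrs: Dict[str, str]) -> Dict[str, str]:
    """Fill in pattern/inputmode from the autocomplete type when absent.

    Attributes the author set explicitly always win over the defaults.
    """
    resolved = dict(attrs)
    defaults = AUTOCOMPLETE_DEFAULTS.get(attrs.get("autocomplete", ""), {})
    for name, value in defaults.items():
        resolved.setdefault(name, value)
    return resolved


common_case = effective_attributes({"autocomplete": "cc-num"})
unusual_case = effective_attributes(
    {"autocomplete": "cc-num", "pattern": r"\d{16,19}"}
)
```

In the common case the author writes one attribute and gets all three; in the unusual case the author's own pattern is kept and only inputmode is defaulted.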
Re: [whatwg] A mechanism to improve form autofill
Aryeh Gregor writes: On Thu, Jul 26, 2012 at 11:52 AM, Smylers smyl...@stripey.com wrote: Perhaps specifying certain autocomplete types could set defaults for pattern and inputmode? So for this example autocomplete=cc-num would, if pattern isn't specified, imply pattern=\d{16}, and equivalently for inputmode? That would be surprising, because autocomplete is just a hint, That's a matter of definition. If you squint autocomplete could be seen as the 'purpose of this field' attribute (which just happens to be called autocomplete because we already have an attribute of that name to build on). pattern doesn't allow form submission if it's not met. Possibly only surprising to people who bother to think about it in those terms. I suspect many web developers would simply find it convenient. Also, I couldn't swear to you that all credit card numbers are actually 16 digits, True. But whatever the actual pattern is, it isn't useful to the owner of a debit card with an 18-digit number if the pattern varies between sites and some only allow 16 digits to be submitted. I'd rather trust Hixie to find out what the rules are and bake them into the spec than for every separate webmaster to try to get this right, because some inevitably won't, especially if there are rules which apparently work for many common cases but actually exclude a minority. or that they will forever be 16 digits, so I'm hesitant to make that connection canonical. If the format for credit card numbers changes significantly enough to break patterns that have been working for years, we're in trouble wherever those patterns have been specified. Cheers Smylers -- http://twitter.com/Smylers2
Re: [whatwg] Allow author's data either in header and footer
Aurelio De Rosa writes: ... where to have the author's data inside an article. According to specification, the header The header element represents a group of introductory or navigational aids. Yep. And for some articles the byline could be seen to be introductory. while the footer A footer typically contains information about its section such as who wrote it, links to related documents, copyright data, and the like. So the specs assert that the author's data have to stay into the footer. No, the spec gives examples of what kind of things a footer typically contains, to aid understanding the purpose of the element. It is clearly not a definitive list (it says 'such as' and 'and the like'), and does not say you have to do anything. If we consider a tipical structure of an article (header-article-footer), it is already an habit to see the [author] data in both locations (header and footer). Indeed. the question will became a simple matter of taste and habit. It already is, with the spec as it is. Smylers -- http://twitter.com/Smylers2
[whatwg] Typo in web+ Description
All web+ schemes should use UTF-8 encodings were relevant. -- http://www.whatwg.org/specs/web-apps/current-work/multipage/iana.html#web+-scheme-prefix I think that should be where. Smylers -- http://twitter.com/Smylers2
[whatwg] 'Applicable Specifications' Relevance to Authors
The definition of an 'applicable specification' is marked as only having relevance for implementations: http://www.whatwg.org/html#other-applicable-specifications I think that the paragraph which defines an applicable specification and the note that follows it are relevant to authors, too -- in particular authors wondering if they can extend HTML. Currently 'HTML5 for Web Developers' has a link 'other applicable specifications' which doesn't go anywhere: http://developers.whatwg.org/elements.html Cheers Smylers
Re: [whatwg] The blockquote element spec vs common quoting practices
Ian Hickson writes: On Thu, 14 Jul 2011, Kevin Marks wrote: There is another common pattern, seen in blogging a lot, of putting the citation at the top eg <p>As <cite class="vcard"><a href="http://www.gyford.com/phil/" class="url" rel="acquaintance met colleague"><abbr title="Phil Gyford" class="fn">Phil</abbr></a></cite> wrote about the <a href="http://www.gyford.com/phil/writing/2009/04/28/geocities.php">ugly and neglected fragments</a> of Geocities:</p> <blockquote> <p>GeoCities is an awful, ugly, decrepit mess. And this is why it will be sorely missed. It’s not only a fine example of the amateur web vernacular but much of it is an increasingly rare example of a <em>period</em> web vernacular. GeoCities sites show what normal, non-designer, people will create if given the tools available around the turn of the millennium.</p> </blockquote> ... the current markup handles it fine already (as you demonstrate above). Using cite like that isn't conforming, surely? Smylers
Re: [whatwg] Spec with Implementation Details Highlighted?
Ian Hickson writes: On Sat, 7 Jan 2012, Smylers wrote: If it's something you'd find useful even in its incomplete state, I can add an alternative style sheet Yes, please. Roger. I've added an alternative style sheet set that has a rule for .impl sections for you. HTH. Thank you -- very impressive service. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] Spec with Implementation Details Highlighted?
Ian Hickson writes: On Thu, 29 Dec 2011, Smylers wrote: Hi there. Is there still a version of the HTML5 spec with implementation-only parts highlighted? The annotations in the spec were incomplete -- they only covered the parts of the spec that are used to generate the developers.whatwg.org version, As it happens, that's precisely what I'm interested in. If it's something you'd find useful even in its incomplete state, I can add an alternative style sheet Yes, please. I found this helped with reviewing/proofreading the developer-only view. Having the 'missing' parts of the spec available to consult makes checking some things easier, and enables spotting where too much or too little text has inadvertently been marked for the developer-only edition. Thanks. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] Handling of collapsed whitespace in contenteditable
Aryeh Gregor writes: The behavior we really want here is to output regular spaces, and use white-space: pre-wrap. Does anyone have any suggestions on how best to handle this? It seems like no matter what we do, the best advice to authors would be to set white-space: pre-wrap on the editable region and the resulting editable content. Can you detect when an author has set white-space: pre-wrap, and specify that browsers have the sane behaviour in that case? You'd still have to specify the awkward, fragile behaviour for when it isn't set, but it at least means that going forwards authors could opt in to the sane behaviour. And any author who complains about the &nbsp;-s not being quite as they wanted could be pointed at the pre-wrap alternative. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] [html5] r5562 - [] (0) Change how vendor extensions are marked up. Fixing http://www.w3.org/Bugs [...]
wha...@whatwg.org writes: New Revision: 5562 [] (0) Change how vendor extensions are marked up. Fixing http://www.w3.org/Bugs/Public/show_bug.cgi?id=9590 + attributes of the form code title=x0var + title=vendor/var-var title=feature/var/code, where Using x0vendor-feature is perhaps a slightly bigger change than you intended to introduce. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] time element feedback
Aryeh Gregor writes: On Tue, Aug 31, 2010 at 3:53 PM, Ashley Sheridan a...@ashleysheridan.co.uk wrote: I think localisation does have a valid use though. Consider a page written in English with the date 01/12/2010. Is that date the 1st December, or the 12th January? The only clue might be the spelling of certain words in the document, but even then, the most popular office software in use at the moment defaults to American spelling for its spell-check feature, even if bought in England, which leads to words being spelt wrong and giving the reader no good clue as to what the date might be. Localisation in this case would mean that I could read the document and easily figure out what the date was. What do you expect the browser to do in this case? Flip it to 12/01/2010 if appropriate, ... would make things much worse, because now rather than having to guess whether the *page* is using American or British convention (usually not too hard), you have to guess what convention your *browser* thinks is right (and it might be someone else's computer, a public computer, . . .). Even so, that still doesn't help. You _also_ have to know whether the author just wrote the date in text or used the time element, in order to know whether your browser has already localized the date for you. Which, in general, an author will have no way of knowing. Smylers -- http://twitter.com/Smylers2
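The ambiguity of the example date is easy to demonstrate. This sketch tries both conventions on the same text using Python's standard datetime.strptime; the interpretations helper and its 'day-first' and 'month-first' labels are made up for illustration.

```python
from datetime import datetime
from typing import Dict


def interpretations(text: str) -> Dict[str, str]:
    """Return every reading of a slash-separated date that parses.

    Keys name the convention tried; values are unambiguous ISO 8601 dates.
    """
    results = {}
    for label, fmt in (("day-first", "%d/%m/%Y"), ("month-first", "%m/%d/%Y")):
        try:
            results[label] = datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            pass  # not a valid date under this convention
    return results


ambiguous = interpretations("01/12/2010")
# → {'day-first': '2010-12-01', 'month-first': '2010-01-12'}
unambiguous = interpretations("25/12/2010")
# → {'day-first': '2010-12-25'}
```

Only dates whose day is greater than 12 disambiguate themselves. For anything else the reader needs to know which convention produced the text they are looking at, which is exactly what a silently localizing browser would obscure.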
[whatwg] 'Main Part of the Content' Idiom
The HTML5 spec should define how to mark up the main content on a page (even if the answer is by omission). This is something that many authors ask about, the latest example being today's thread on the help mailing list: http://lists.whatwg.org/htdig.cgi/help-whatwg.org/2010-June/000561.html Please could this be added to the 'idioms' section, perhaps giving examples of when article or section might be appropriate as well as one in which the main content is simply that which isn't in header, aside, etc. Thanks. Smylers -- http://twitter.com/Smylers2
Re: [whatwg] RFC: input type=username
Schalk Neethling writes: if your username field will be in the form of an email address, then simply use type=username with a pattern to facilitate email validation. Surely a major reason for having standard validation types is so web developers don't need to come up with patterns for these common things? It also avoids lots of different authors coming up with something different, and not getting it right. The validation needed to accurately match a valid e-mail address is surprisingly convoluted -- see for example the regular expression on this page: http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html Not sure if that is really even needed at that point anyway because you are not really concerned over a well formed email address. If that was a problem, it would have been detected during registration. Sure, you aren't concerned that a user's correct username might not be a valid e-mail address. But if a user tries to submit something that isn't a syntactically correct e-mail address, then he must have mis-typed his username. Using type=email allows the browser to alert him to this, so he can fix it. Without that, he has to wait for server-side validation. Smylers
Re: [whatwg] api for fullscreen()
Brian Campbell writes: I'm a bit concerned about when the fullscreen events and styles apply, though. If the page can tell whether or not the user has actually allowed it to enter fullscreen mode, it can refuse to display content until the user gives it permission to enter fullscreen mode. Why is that a problem? Or even if it's not refusing to display content, it may simply not scale the content up to the full window if the user neglects to give permission for full screen. If the user wants the content to be large, why would he withhold permission? As I understand it, the risk with full-screen view is that a malicious site may spoof browser chrome, such as the URL bar, thereby tricking a user who isn't aware the site is full-screen. So these scenarios seem relevant: 1 A malicious site wishes to switch to full-screen view and spoof chrome. The user hadn't asked for full-screen, so withholds permission. The site may at this point refuse to display content as you put it, but since that content's only purpose is to trick the user, its non-display is a good thing. 2 A user wishes to display some content full-screen, so grants permission and views it. 3 A user doesn't wish to display some content full-screen, so ignores any attempt by the site to become full-screen, and continues to view it normal size. I'm struggling to come up with a scenario in which your concerns apply. Please could you elaborate. Thanks. Smylers -- Watch fiendish TV quiz 'Only Connect' (some questions by me) Mondays at 20:30 on BBC4, or iPlayer: http://www.bbc.co.uk/programmes/b00lskhg
Re: [whatwg] api for fullscreen()
Brian Campbell writes: As I understand it, the risk with full-screen view is that a malicious site may spoof browser chrome, such as the URL bar, thereby tricking a user who isn't aware the site is full-screen. This is addressing a different scenario; not malicious sites per-se, but sites that insist on being displayed full screen. OK. That's 'merely' an annoyance, not a security threat. There are lots of ways authors can be obnoxious if they choose; I'm not sure it's desirable, or even possible, to outlaw them. My boss was very insistent about content being displayed full screen, to make the experience more immersive and reduce distractions ... Please press the button to enter full-screen mode and start the program, and the program would not start until full-screen mode was entered. I could imagine games, and other content doing the same as well. I think that this behavior is fairly user hostile, however. In general user-agents are allowed to display content in any way that a user has configured them to do, regardless of what the spec gives as the normal behaviour. If a user wishes to view content scaled up to fill the window, without the distractions of navigational links, comments, descriptions, and so on, they don't usually have a way to do this. If it were possible to use the full-screen button, but deny permission to actually go full screen, and have that simply display the content in the full window exactly as if it were full screen, it would give the users more control over how they view the content. I've seen Firefox options (possibly in an extension) which allow users to tweak which toolbars and the like are still displayed when in full-screen view. If a browser (or an extension) wished to implement full-screen view as still having borders, the title bar, status bar, and so on then it could. And there's nothing an author could do about it. 
Content authors should not be able to force fullscreen mode on users, however, so I think it would be best if the spec allows UAs to send the fullscreen event and set the fullscreen pseudoclass even if the content is not actually filling the entire screen. To say that slightly differently: authors can dictate that certain output is only displayed when in full-screen view; but they have no control over how full-screen view looks for a particular user and user-agent. All the spec would have to say to cover all of the possible implementations is that the fullscreen events may be sent even if the content isn't actually filling an entire screen, Allowing that behaviour is entirely reasonable. Though I think it should be covered by a more general statement that user-agents may display things however they want if so-configured, rather than just stating it for this particular narrow case. Smylers -- Watch fiendish TV quiz 'Only Connect' (some questions by me) Mondays at 20:30 on BBC4, or iPlayer: http://www.bbc.co.uk/programmes/b00lskhg
Re: [whatwg] the cite element
Jim Jewett writes: In http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-September/023005.html, Ian quoted Erik Vorhes as writing: Put another way, if you had no prior knowledge of the current HTML5 definition of cite (and perhaps any other specification's definition of the element), what would seem to be logical and appropriate uses of the element? Ian: You mean based on just the element name? I wouldn't use it without reading the spec first. Most people seem to think it means italics, though, for what that's worth. I think that gets at the root of the problem with cite. Most people don't read the spec, or even know where to find it. cite isn't common enough to just copy by example, and it turns out to be ambiguous as the name of an element or attribute. But why would somebody be in the situation where they encounter cite, want to use it, but aren't sure where? Surely that's backwards? Why would authors be trying to use elements for the sake of them? I'd expect the more usual sequence to be an author typing some text, blissfully unaware of cite, then coming to the title of a book and wanting it to be styled differently so as to convey that to users, and looking for the element to use. I don't think anybody's claiming that cite is a great name. But it's what we have. Do you wrap the actual excerpt (the precise thing you're citing), or the name of the source? The name of the work is the part that readers typically have distinguished to them. If you wrap the name/title of the source, is there a way to show the scope of what you're attributing? Not in HTML5 (but possibly with a microformat specifically for that). In what way do you envisage this being conveyed to or of use to users? My own interpretation of (a fraction of) http://philip.html5.org/data/cite.txt did not support narrowing the definition only to titles. But in the cases where cite is being used for things other than titles of works, what does it achieve? How do users benefit? 
If authors are spending time on using an element which has no effect on users (and Hixie's pointed out that in many cases where cite is used other than for titles of works authors use CSS to remove the default italics, to ensure that users don't actually have the presence of the cite conveyed to them) then there's no reason for HTML5 to continue to support it. Rather it does authors a favour; they'll no longer have to spend time doing something of no benefit. These do seem useful; if you wanted more information, it might well be How do I contact this photographer or that model to get something similar? How does the use of cite make that any easier for users than if, say, span (or i or div or whatever) had been used instead? Smylers
Re: [whatwg] the cite element
Erik Vorhes writes: A use-case for person's name in the context of cite: In reference to many Classical texts one will often refer to the author in lieu of the title (or in some cases that author's corpus). That isn't an argument for people's names _in general_ being marked up; it's an argument for marking them up in the specific case where they are used as (nicknames of) titles of works. E.g.: <p>You should read <cite>Herodotus</cite>.</p> That's using Herodotus as the title of a work. In many fields it's common to refer to well-known works by nicknames, such as 'Smith & Thomas', 'The Dragon Book', 'The Red Book', or 'The White Album'. So cite should be used for them. But it doesn't follow that cite should be used for any other occurrences of those terms -- the people Smith and Thomas, or a book which just happens to be red. Smylers
Re: [whatwg] [html5] r3886 - [e] (0) highlight relevant part of example for consistency
wha...@whatwg.org writes: Log: [e] (0) highlight relevant part of example for consistency These parts are highlighted with strong: + tdpre class=exampleDisk space remaining: stronglt;meter75%lt;meter/strong/pre Isn't mark the appropriate element here -- highlighting the relevance of those parts in the current context, rather than denoting them as being important? Smylers
Re: [whatwg] [html5] r3859 - [acgiowt] (2) Parser changes: dc, ds, dialog are now treated differently. [...]
wha...@whatwg.org writes: + pThe span class=implfirst/span codea href=#the-dt-elementdt/a/code element child + of the element, if any, represents the caption of the + codea href=#the-figure-elementfigure/a/code element's contents. If there is no child + codea href=#the-dt-elementdt/a/code element, then there is no caption./p + pThe span class=implfirst/span codea href=#the-dd-elementdd/a/code element child + of the elementspan class=impl, if any,/span represents the + element's contents. span class=implIf there is no child + codea href=#the-dd-elementdd/a/code element, then there is no caption./span/p I think that last caption is supposed to be content. Also, the If there is no for dd is class=impl but the equivalent one for dt isn't. While proofreading this change I also spotted an inconsistency in the related example under the dd element: http://www.whatwg.org/html5#the-dd-element I think the first class=part-of-speech should be on the i rather than the dd (matching the other instances). Smylers
Re: [whatwg] the cite element
of that are using CSS to remove the italics) are wrong. It's unfortunate if HTML4 misled them. But that's no reason to make the situation any worse by encouraging others to make the same mistake. Are you retroactively finding fault because you have redefined cite in the HTML5 specification? Doing the above never made sense, notwithstanding any interpretations of HTML4 which suggest otherwise. And as Jeremy Keith and others have pointed out, there's nothing wrong with overriding default presentational styles. I'm not sure why it should be such a cause for concern with cite. Overriding is very different from trying to remove the effects of. I believe I understand why you have chosen to define cite as it appears in the current draft of the HTML5 specification; I just happen to believe that the current definition is not as useful as it could be and (more importantly) invalidates current reasonable uses of the element. Why is that important? Automated validators generally won't catch it, so it won't make previous valid pages suddenly spew dozens of errors (a concern with other changes from HTML4). And if authors of such pages, on discovering that non-title uses of cite aren't valid, then remove them, that's a win for users of non-CSS browsers. Smylers
Re: [whatwg] Fakepath revisited
Alex Henrie writes: A better solution exists: drop the fakepath requirement. Browsers that desire extra compatibility can add fakepath to their compatibility modes as they choose. Browsers having 'extra' compatibility is one of the things which currently causes the _most_ grief for many web developers: writing something to the spec is nowhere near sufficient to have confidence of it working as intended in all browsers. If one major browser implements non-standard behaviour for compatibility with existing content, it would have an advantage with users over other browsers -- those other browsers would likely want to implement it, to avoid losing market share. But browsers unilaterally implementing 'extra compatibility' means other browsers wanting to be similarly compatible have to reverse engineer the first browser -- a time-consuming and brittle process, which in practice often leads to some edge cases where the behaviour is not the same. Also, it makes it hard for a new browser developer to enter the market, since being compatible with real-world content involves implementing this undocumented behaviour. Like other compatibility mode behavior, implementation would be voluntary and not governed by the W3C. What other compatibility mode behavior? The bottom line is that no web developer wants to have a confusing, unintuitive, and very permanent standard. There is much evidence to suggest that web developers are not happy either with the previous situation of lots of browsers picking their behaviour independently, leading to differences between browsers. Don't punish all web developers for the poor past designs of the few. Unfortunately that's pretty much the modus operandi of HTML 5: standardizing previous stupidities so that we can all share in them. (We've already tried the alternative, and it's worse.) Smylers
Re: [whatwg] Comments on the definition of a valid e-mail address
Aryeh Gregor writes: Historically, MediaWiki has mostly just required that an @ symbol be present in the address. Originally we used a simplistic regex, It's relatively well known that a simple regex can't be used to match e-mail addresses (and not match things that aren't!); Jeffrey Friedl's 'Mastering Regular Expressions' (O'Reilly) included a pattern for this over a decade ago, but it is exceedingly long: http://groups.google.co.uk/group/comp.lang.perl.misc/msg/603ba6fc642a3124 http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html ... but when users complained, we looked into the RFCs and decided it was too complicated to bother with validation beyond checking for an @ sign. It's too complicated for most developers to roll their own validation, but there are standard libraries available which get it right. ... I decided to do some research on how many users' addresses would be invalidated [by HTML 5's validation] ... 1) Addresses in the form foo b...@baz.example, or similar. These mostly match RFC 5322's name-addr production instead of addr-spec Forms on websites capturing users' e-mail addresses typically want just the address part, prompting for the human-readable name in a separate box, so I think HTML 5's input type=email not allowing the above is helpful. 2) Addresses with dots in incorrect places, in either the local part or the domain name part. For instance, multiple consecutive dots, or leading/trailing dots. These don't match RFC 5322 at all AFAICT, but I asked one of the users with an invalid address of the form f...@example.com, and he said it worked fine for him. GNU mail gave a syntax error when I tried to send mail to that address, but Gmail sent it without complaint, and the user received it successfully. There may actually be several categories of oddly placed dots. While the address in the form you give above works it may be, say, that those with repeated dots in the hostname part don't work. On the specific case of a . 
immediately before the @, I've seen that before: this Perl library module extends an RFC-compliant module to allow just that; its author admits .@ breaks the RFCs but claims such breakage is useful in the real world, specifically when dealing with e-mail addresses for Japanese mobile phones: http://search.cpan.org/perldoc?Email::Valid::Loose That somebody has found this to be a sufficiently widespread problem with standard Perl e-mail address validation to write and upload a module which 'fixes' this (and just that; it makes no other changes) suggests that people will find HTML 5's input type=email to be problematic in precisely the same way. There were other types of addresses that didn't meet HTML 5's specification after whitespace was stripped, but none with more than a single-digit number of addresses occurring in the sample of three million or so that I looked at. So it may actually be that there isn't a general problem here of lots of real-world e-mail addresses which work but don't comply with the RFCs; it may simply be the one case of .@? There aren't a plethora of Email::Valid extensions which relax various different criteria; just the one which allows .@. Alternatively, you could just loosen the restrictions even further, and only ban input that doesn't contain an @ sign. (Or that doesn't match ^...@]+@[...@]+\.[^@]+$, or whatever.) Or just don't ban anything at all, like with type=tel. type=email differs from most of the other types with validity constraints (like month, number, etc.) in that the difference between valid and invalid values is a purely pragmatic question (what will actually work?) that the user can often answer better than the application. It doesn't seem like a good idea for the standard to tell users that the e-mail addresses they've actually been using are invalid. Users often mis-type e-mail addresses. It seems useful to be able to trap as many typos as possible. 
Many authors obviously believe this, given how many employ JavaScript validators. If HTML 5 were overly permissive about input type=email then it's likely such authors would continue to use homegrown JavaScript solutions, which slightly defeats the purpose of HTML 5 introducing input type=email. Smylers
Re: [whatwg] Comments on the definition of a valid e-mail address
Aryeh Gregor writes: On Mon, Aug 24, 2009 at 4:36 AM, Smylerssmyl...@stripey.com wrote: It's too complicated for most developers to roll their own validation, but there are standard libraries available which get it right. Standard libraries available for all major languages? I'd be surprised if they weren't. As far as I can tell from a quick search, the PHP standard library contains no e-mail validation routines before 5.2.0 Sorry, I meant there is a library (meaning additional to the core language) available in a standard place (wherever that language's libraries are typically found); I wasn't intending to claim that the standard library of functionality which is part of a language's core distribution would include it. For PHP I Googled email validation Pear and found the following as the top hit. I haven't tried it, but it claims to comply to RFC822, and I'd have more faith in it than the average home-rolled attempt: http://pear.php.net/package/Validate/ Forms on websites capturing users' e-mail addresses typically want just the address part, prompting for the human-readable name in a separate box, so I think HTML 5's input type=email not allowing the above is helpful. It might be more helpful if they stripped the part outside the angle brackets, but I agree that it's reasonable to just reject these. Good point. And that's largely a UI matter: either way the web server doesn't receive a value with the outside clutter in it. The breakdown of the 202 is as follows. Thanks for providing this. * Single trailing dot in domain part: 100 (prohibited by RFC but plausibly deliverable) Yup. If it is deliverable then surely it's an alias to the same address without the trailing dot, in which case a browser could choose to remove it. * Single trailing dot in local part: 40 (prohibited by RFC but plausibly deliverable) Discussed previously. This seems to be the problematic category. 
* Valid address in angle brackets (with other junk around it): 21 (permitted by RFC, kind of, and plausibly deliverable) Discussed above. * Multiple consecutive dots: 20 (prohibited by RFC but plausibly deliverable) If you mean the ..s are in the local part then yes, it sounds likely that would get delivered, and a quick non-exhaustive trial seemed to show this can work. (If they're in the hostname then I'd be amazed if it's deliverable, but surely it'd be to the same address that's reached by replacing sequences of dots to a single dot.) * No @: 9 (unlikely to be deliverable) Indeed. * Comment: 3 (permitted by RFC and plausibly deliverable) Equivalent to the angle bracket case above -- the address without the comment could be extracted. * Miscellaneous: 9 (one containing [...@[spam], two with trailing , one in quotes, one with single leading dot in local part, two with single leading comma in local part, one with leading : , one with leading \) They don't sound deliverable, or if they are would also be with superfluous punctuation stripped. And I'm not sure single cases are worth fretting about. If HTML 5 validation rejected one of the above it seems very likely the user would be able to provide an alternative address (or alternatively punctuated address) which is valid. So it may actually be that there isn't a general problem here of lots of real-world e-mail addresses which work but don't comply with the RFCs; it may simply be the one case of .@? No, that was just the example I chose because I knew that person personally, and so was able to confirm that the address actually worked. There are two categories of input which could be a working e-mail address yet violate the RFCs: 1 A valid e-mail address with extra 'stuff' in it or surrounding it (spaces, comments, trailing punctuation characters, etc). As you suggested, browsers can clean up the user's input, so what servers receive is a valid e-mail address. 
2 A working e-mail address which contains something the RFCs say it shouldn't but needs that in order to function; attempting to clean it up would transform it to a different e-mail address, which possibly delivers somewhere differently from the original. Analysis of your detailed breakdown suggests the only addresses in category 2 are those with dots in odd places in the local part. So it may be the only change required to allow all working real-world e-mail addresses is a willful violation that permits dots anywhere in the local part (even immediately after another . or before the @). That change would appear to cover the cases in your data, but others may have data which shows there are additional cases. Smylers
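The two categories above can be sketched in code. This is only an illustration of the distinction, not anything from the spec or from MediaWiki: the function names clean_address and is_acceptable are made up for this example, the clean-up rules are one reading of category 1, and the validity check is a simplified stand-in that permits dots anywhere in the local part, as suggested for category 2.

```python
import re

# Hypothetical helpers illustrating the two categories; not from any spec.

def clean_address(raw: str) -> str:
    """Category 1: strip surrounding clutter without changing the address."""
    text = raw.strip()
    # Prefer the part inside angle brackets, as in 'Some Name <user@host>'.
    match = re.search(r"<([^<>]*)>", text)
    if match:
        text = match.group(1).strip()
    # Drop parenthesised comments.
    text = re.sub(r"\([^()]*\)", "", text).strip()
    # Drop trailing punctuation, including a trailing dot after the domain.
    return text.rstrip(".,;:")


def is_acceptable(address: str) -> bool:
    """Category 2: allow dots anywhere in the local part; keep the domain strict."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or "@" in local:
        return False
    # Assumed character set for the local part; dots may appear anywhere.
    if not re.fullmatch(r"[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+", local):
        return False
    # Every domain label must be non-empty, so repeated dots are rejected here.
    label = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
    return all(re.fullmatch(label, part) for part in domain.split("."))
```

With these assumptions, an address with a dot immediately before the @ is accepted unchanged, while a trailing dot after the domain is cleaned away rather than rejected.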
Re: [whatwg] Text areas with pattern attributes?
Aryeh Gregor writes: On Wed, Aug 19, 2009 at 11:32 AM, Geoffrey Sneddongsned...@opera.com wrote: What's the use-case for it? Textareas are almost always for such large amounts of input that they are almost always free-form text. Why allow the pattern attribute? You could impose a minimum character length for posts -- that's common on forums. Or ban certain words or phrases. Are there currently sites using JavaScript to perform the above checks pre-submission? If so, would the checks be easier to write using a pattern attribute than they currently are? (If they aren't then it seems unlikely authors would bother with it.) Max Romantschuk writes: Mike Shaver wrote: It's also pretty common to enter multiple email addresses or tracking numbers or URLs one-per-line for batch operations on sites, and they would benefit from having client-side validation of such patterns. I also believe that it would be beneficial to have an option to regex-validate a text area in cases like this. It's not far fetched to imagine copy-pasting a bunch of data from a spreadsheet column into a textarea, in which case it would make sense to be able to have client side validation for a given pattern repeated n times with newlines in between. Are there any such sites already? If there aren't, it seems unlikely that the lack of textarea pattern=... is what's holding them back. I really don't see a case for not allowing pattern for a textarea. The point is to have cases specifically _for_ it -- not adding everything for which there isn't a reason against. If textarea pattern=... wouldn't in practice be used by authors then there's no point in adding it. If it would be used then it should be trivial to show some places where it would be used. Smylers
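For the one-per-line batch case mentioned above, the check an author would otherwise write by hand is small. The sketch below is an assumption about what such validation might look like, written in Python rather than browser script; the function name invalid_lines and the tracking-number pattern in the usage note are invented for illustration.

```python
import re


def invalid_lines(text: str, pattern: str) -> list[int]:
    """Return the 1-based numbers of non-blank lines that do not fully match pattern."""
    compiled = re.compile(pattern)
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines are ignored, so a trailing newline from a paste is harmless.
        if stripped and not compiled.fullmatch(stripped):
            bad.append(number)
    return bad
```

For example, checking pasted text against a made-up tracking-number pattern such as [A-Z]{2}\d{3} reports only the lines that need correcting, which is roughly what a pattern repeated n times with newlines in between would have to express.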
Re: [whatwg] small element should allow nested elements
Remy Sharp writes: On 14 Aug 2009, at 10:09, Ian Hickson wrote: I wouldn't bother wrapping any of the above as small print. If you're structuring this enough that you have numbered lists and paragraphs and everything, then it's either not small print, or it shouldn't be. For example, the BBC's ... default pages are shown in 1.2em. The exception being their terms pages, which overrides the font size to 1em in a terms.css style sheet. BBC Terms: http://www.bbc.co.uk/terms/ That's an entire page of legalese. The legalese is the point of the page. It doesn't need marking up in some way from the rest of the text on the page because there isn't any such text. This is because they want the text to appear as small print. CSS seems an entirely reasonable way of doing that. If a CSS-less user doesn't get the text delivered smaller no meaning is lost (since the user reading that page is aware that it's all small print). Ditto for a listener whose speaking browser uses the normal voice for it. Where small might be useful is another page which has a competition on it (in regular sized text) followed by: <p> <small>Terms and conditions apply. For full details see the <a href="http://www.bbc.co.uk/terms/">standard BBC T&amp;Cs</a>.</small> </p> In that case the short amount of 'small print' is distinguished from the surrounding text. Visual users can see it as such; a speaking browser could read it out faster. Smylers
Re: [whatwg] the cite element
Erik Vorhes writes: So the definition of cite in HTML5 should currently be title of work or something that could be construed as a title where one doesn't exist in the explicit sense of 'title.' But not people's names, even if they're the citation, because using cite for citations is silly. Hi Erik. Rather than start with the cite element and think how you can use it, I find it easier to understand t'other way round: When writing text you sometimes want some words to be presented differently (typically in italics), to convey some information to readers. If the semantic you wish the italicized text to convey is that it's the title of a published work, then cite is the appropriate HTML element to use for this. (When word processing many folks simply use italics, meaning titles are marked up the same as, say, emphasis. This precludes later changing the house style in a way which distinguishes them, and from having voice output use different voice variants for each.) If you wish the graphical presentation of such titles to be something other than italic (underlined perhaps, or in a different colour, or in normal text but surrounded by quote marks) then you can achieve that with CSS. But the semantic is still there in the document, so can still be conveyed to all readers and listeners, regardless of their environment and user-agents. For words that you wish to have no distinct presentation from the surrounding text -- words that readers don't need calling out to them as being in any way 'special' -- simply don't mark them up. As Ian has pointed out, the above is technically non-conforming with what the HTML 4 spec claims. But it's how I've been using cite for years, since it makes sense and has a use. 
Other proposed definitions of cite may more closely correspond to the English word cite, but the set of phrases they would denote do not seem to be a useful set of things to lump together; they do not match any set of things which are typically conveyed to readers in a particular way (for example by typographical conventions). While HTML 5's definition of cite is a useful thing to have an element for, the name 'cite' is not a great choice to label that. However the element already exists; its previous definition has overlap with the useful definition; and its default display in existing browsers is the common typographic style for the useful definition (which gives weight to the idea that the HTML 5 definition is actually what at least some people intended in the first place, or have already been using it as). So tweaking the definition to be more useful seems better than inventing a new element with a better name. Smylers
Re: [whatwg] the cite element
Erik Vorhes writes: On Thu, Aug 13, 2009 at 4:59 AM, Smylerssmyl...@stripey.com wrote: As Ian has pointed out, the above is technically non-conforming with what the HTML 4 spec claims. But it's how I've been using cite for years, since it makes sense and has a use. I defy you to show me in the HTML 4.01 specification where something like the following is nonconforming: By 'the above' I was referring to my previous paragraphs -- in which I'd just described my use of cite. I am admitting that _my_ definition isn't permitted by HTML 4. I am _not_ claiming that your definition isn't allowed in HTML 4; I'm claiming that the HTML 4 definition, including things like marking up names, isn't particularly useful. For this example: <p>I like to read nonfiction, such as <cite>John Adams</cite>, but I had more time for that when I was a professional academic.</p> How do you want that to be rendered? The conventional presentation would be for John Adams simply to be rendered in exactly the same way as the surrounding text, with the reader being given no information at all that those words are in some way special. Smylers
Re: [whatwg] Spec comments, sections 3.1-4.7
Tab Atkins Jr. writes: On Wed, Aug 12, 2009 at 7:43 PM, Aryeh Gregorsimetrical+...@gmail.com wrote: I haven't noticed many progress bars on the web You see them a lot more in the indeterminate form, as a 'spinner' image or the like. ... I suspect, though, that there are a lot of places you currently don't see a progress bar solely because it's a bit of a pain to do. Many shopping sites have indicators of how far you are through the buying process (Step 2 of 4), and online surveys often have progress bars indicating the position in the survey. Typically these are static to the page (as in, making progress and seeing the indicator move involves submitting a form and displaying the next page in the sequence), but so far as I can see from the spec progress can be used in these situations; it isn't restricted to use on a single page where it is updated dynamically. Smylers
Re: [whatwg] the cite element
Erik Vorhes writes: On Thu, Aug 13, 2009 at 4:59 AM, Smylerssmyl...@stripey.com wrote: For words that you wish to have no distinct presentation from the surrounding text -- words that readers don't need calling out to them as being in any way 'special' -- simply don't mark them up. Interesting point. Should the HTML5 specification explicitly admonish against using microformats, microdata, RDFa, and the like? Possibly I stated the above too strongly. In general invisible metadata doesn't have a great history; the most successful systems involving machine-parsed web pages seem to involve machines parsing the human-visible parts of pages rather than things like meta keywords. But I didn't actually mean to go so far as to say these should never be used. If somebody can do something useful with names marked up as metadata then that's a reason for marking it up in some way. But HTML 5 doesn't need a specific element for that; there's the generic microdata syntax. If marking up people's names when citing them becomes really common then a future version of the spec could mint an element for that (like happened with time, a common metadata pattern). But there still wouldn't be a call for an element which sometimes indicates its contents should be displayed to the reader in a way which indicates they are the title of a work and sometimes indicates it's a person's name. Smylers
Re: [whatwg] Reading spec without boxes
Ian Hickson writes: On Wed, 5 Aug 2009, Elliotte Rusty Harold wrote: the little status boxes in the left margins on the draft spec? They seem to cover some of the text I'd like to read. If they cover up any of the text, that is a bug. I experienced this recently with a minimum font size set (in Firefox). I tracked it down to something like this (sorry, that was on another computer, so this is from memory):
* The main content's left margin, in which the boxes have to fit, is specified relative to the main content's text size.
* The boxes' font size is specified as a proportion of the main content's font size.
* The boxes' width is specified relative to the boxes' font size. And because of the previous two points this is relative to the main content's left margin, so is always less than that margin regardless of the main font size.
* But with a minimum font size set in the UI, the actual box font size can end up larger than that computed above. The boxes' widths are then correspondingly bigger, and may now be wider than the main content's left margin.
I prototyped a fix for this, which went something like: Instead of setting the smaller font on the boxes, set it on all their children (.box * -- or whatever the class name is). This still makes the text smaller. But that leaves the width of the box being specified relative to the main content font -- the same as the margin it needs to fit in. As such it's trivial to pick a size that always fits. I hadn't yet submitted this because I first planned to try it in more browsers. In particular I'm concerned that the child selector isn't supported in some IE versions. Hope that helps. Smylers
Re: [whatwg] [html5] r3429 - [e] (0) Add a section on establishing a connection.
wha...@whatwg.org writes: + pThe Web Socket protocol is an independent TCP-based + protocol. It's only relationship to HTTP ... An apostrofly has crept in there. Smylers
Re: [whatwg] Week Strings
Ian Hickson writes: On Fri, 19 Jun 2009, Smylers wrote: For input type=week elements the spec requires: The value attribute, if specified, must have a value that is a valid week string. -- http://www.whatwg.org/html5#week-state But the spec's HTML source contains this comment immediately afterwards: !-- ok to set out-of-range value, we never know when we might have to represent bogus input -- it [...] means that there's no requirement that the value= be within the range given by min= and max=. Thanks for clarifying that. input type=week value=2010-W53 input type=week value=2010-W54 If out-of-range week values are to be permitted in input elements then a validator should permit both of them. Conversely if they aren't permitted then it should accept neither of them. Please report such bugs straight to Henri. :-) Of course -- but I wanted to check what the correct behaviour was first. Smylers
Re: [whatwg] Localised numbers
Křištof Želechovski writes: The rules for parsing floating point number values [2.4.4.5] apply to the input element in number state as well. Do those rules actually limit UI? They say how a user agent must parse a value attribute provided in the source. And they constrain what a UA may send back to the server. But surely a UA can pick any UI they want for this -- including a text box where the user types a comma, and a decimal point displays as a comma? Smylers
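The separation being suggested, a localised UI on top of a fixed wire format, might be sketched like this. The names to_wire and to_display are invented for the example, and the regular expression is a simplified approximation of a valid floating-point number, not the spec's parsing algorithm; grouping separators are deliberately not handled.

```python
import re

# Simplified approximation of the wire format: optional minus, digits,
# optional fraction, optional exponent. An assumption, not the spec's text.
WIRE_NUMBER = re.compile(r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")


def to_wire(typed: str, decimal_sep: str = ",") -> str:
    """Convert what the user typed into the dot-decimal form sent to the server."""
    text = typed.strip().replace(decimal_sep, ".")
    if not WIRE_NUMBER.fullmatch(text):
        raise ValueError(f"not a number: {typed!r}")
    return text


def to_display(wire: str, decimal_sep: str = ",") -> str:
    """Render a wire-format value using the user's decimal separator."""
    return wire.replace(".", decimal_sep)
```

The server only ever sees the output of to_wire, so the choice of separator stays a UI matter.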
Re: [whatwg] Offline Conformance Checkers
I wrote: Conformance checkers must use the information given on the WHATWG Wiki MetaExtensions page to establish if a value not explicitly defined in this specification is allowed or not. -- http://www.whatwg.org/html5#other-metadata-names I think we should allow conformance checkers which can be run on local files without an internet connection, but the above appears to deem them non-conforming. That applies to this bit too: Conformance checkers must use the information given on the WHATWG Wiki PragmaExtensions page to establish if a value not explicitly defined in this specification is allowed or not. -- http://www.whatwg.org/html5#other-pragma-directives Also, the above sentence isn't marked up as an implementation requirement. Smylers
Re: [whatwg] [html5] r3316 - [e] (0) A quick introduction to HTML.
The new quick introduction includes: Attributes are placed inside the start tag, and consist of a name and a value, separated by an = character. The attribute value can be left unquoted if it is a keyword, but generally will be quoted. -- http://www.whatwg.org/html5#a-quick-introduction-to-html Generally will be seems to be a prediction that the spec doesn't need to make, and could be seen as a recommendation for authors to quote attributes even when they're unnecessary. Could we simply omit that (finishing the sentence at keyword)? Smylers
Re: [whatwg] [html5] r3323 - [] (0) Add rules for improving compat with XSLT 1.0. (bug 6776)
wha...@whatwg.org writes: + /dlh3 id=dom-based-xslt-1.0-processorsspan class=secno3.7 /spanDOM-based XSLT 1.0 processors/h3 + pXSLT 1.0 processors outputting to a DOM when the output method is + html (either explicitly or via the defaulting rule in XSLT 1.0) + are affected as follows:/p + + pIf the transformation program outputs an element in no namespace, + the processor must ... snip Should this text be marked up as an implementation requirement? Smylers
[whatwg] Offline Conformance Checkers
Conformance checkers must use the information given on the WHATWG Wiki MetaExtensions page to establish if a value not explicitly defined in this specification is allowed or not. -- http://www.whatwg.org/html5#other-metadata-names I think we should allow conformance checkers which can be run on local files without an internet connection, but the above appears to deem them non-conforming. The general conformance requirements include this get-out clause, but it isn't broad enough to cover this case: User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations. -- http://www.whatwg.org/html5#conformance-requirements Smylers
[whatwg] hasFeature() When Only 1 Syntax is Supported
The current text suggests that a user-agent may choose to support only the HTML syntax (not XHTML) but should still return true for hasFeature(XHTML, 5.0). If that isn't intended then the requirements for hasFeature() should be changed to depend on the syntaxes chosen to be implemented. If it _is_ intended (and given various things browsers have to do for web compatibility, it wouldn't surprise me) then perhaps it would be better to spell this out explicitly, since it's counter-intuitive. hasFeature() currently has the implementation requirements: User agents should respond with a true value when the hasFeature method is queried with these values. -- http://www.whatwg.org/html5#dom-feature-strings: Where these values are (HTML, 5.0) and (XHTML, 5.0). However while supporting both HTML and XHTML is encouraged, user-agents may choose to support only one of them: http://www.whatwg.org/html5#conformance-requirements Smylers
[whatwg] Week Strings
For input type=week elements the spec requires: The value attribute, if specified, must have a value that is a valid week string. -- http://www.whatwg.org/html5#week-state But the spec's HTML source contains this comment immediately afterwards: !-- ok to set out-of-range value, we never know when we might have to represent bogus input -- Does that comment mean that the above requirement will be changed to something along the lines of must have a value that is a syntactically valid week string but may represent a week that doesn't actually exist? That is, the author can seed a browser's week-picker control to a value which the browser must not submit back to the server? In general determining that something is a valid week string requires knowing which day of the week the year in question begins on. For example 2009-W53 is a valid week string (because 2009 began on a Thursday) but 2010-W53 isn't (because 2010 will begin on a Friday). Browsers will need to do this to know whether they can submit a week value. The spec doesn't appear to provide an algorithm for determining which day of the week a year begins on (however I am not a browser developer; possibly this is sufficiently straightforward that those who are don't need it spelling out). Currently Validator.nu accepts this: input type=week value=2010-W53 but not this: input type=week value=2010-W54 If out-of-range week values are to be permitted in input elements then a validator should permit both of them. Conversely if they aren't permitted then it should accept neither of them (and therefore have to implement a 'which day is January 1st' algorithm, which I'm guessing it currently doesn't). Smylers
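For what it's worth, where a date library is available the 'does this week exist' check is short. The sketch below is an assumption about how a validator might do it, not how Validator.nu works; the function names are invented, and it relies on the ISO 8601 fact that 28 December always falls in the last week of its week-year.

```python
import re
from datetime import date


def weeks_in_week_year(year: int) -> int:
    """Number of ISO weeks (52 or 53) in the given week-year."""
    # 28 December is always in the last ISO week of its year.
    return date(year, 12, 28).isocalendar()[1]


def is_valid_week_string(value: str) -> bool:
    """Check syntax (YYYY-Www) and that the week exists in that week-year."""
    match = re.fullmatch(r"(\d{4,})-W(\d{2})", value)
    if not match:
        return False
    year, week = int(match.group(1)), int(match.group(2))
    # Limitation of this sketch: datetime.date only covers years 1 to 9999.
    if not date.min.year <= year <= date.max.year:
        return False
    return 1 <= week <= weeks_in_week_year(year)
```

Under this check 2009-W53 passes while 2010-W53 and 2010-W54 both fail, which is the consistent behaviour asked for above if out-of-range weeks are not permitted.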
Re: [whatwg] Week Strings
Anne van Kesteren writes: On Fri, 19 Jun 2009 11:48:17 +0200, Smylers smyl...@stripey.com wrote: The spec doesn't appear to provide an algorithm for determining which day of the week a year begins on (however I am not a browser developer; possibly this is sufficiently straightforward that those who are don't need it spelling out). It does actually, but it is not clearly linked from valid week and such: http://www.whatwg.org/specs/web-apps/current-work/#weeks That says that 1969 December 29th was a Monday, but doesn't appear to give an algorithm for answering the question Will the year 2010 begin on a Thursday?. Smylers
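Starting from the single anchor the spec gives, the missing step can be sketched as plain day counting. This is an illustration only, with invented function names: since 1969-12-29 was a Monday, 1970-01-01 was a Thursday, and each later 1 January follows by adding 365 or 366 days. The 53-week rule used at the end is the usual ISO 8601 one (the year starts on a Thursday, or on a Wednesday in a leap year).

```python
def is_leap(year: int) -> bool:
    """Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def weekday_of_january_first(year: int) -> int:
    """Weekday of 1 January, with 0 = Monday ... 6 = Sunday, for years >= 1970."""
    if year < 1970:
        raise ValueError("this sketch only counts forward from 1970")
    # 1969-12-29 was a Monday, so 1970-01-01, three days later, was a Thursday (3).
    days = sum(366 if is_leap(y) else 365 for y in range(1970, year))
    return (3 + days) % 7


def has_53_weeks(year: int) -> bool:
    """True if the week-year has 53 weeks under the ISO 8601 rule."""
    first = weekday_of_january_first(year)
    return first == 3 or (first == 2 and is_leap(year))
```

This gives Thursday for 2009 and Friday for 2010, so only 2009 has a week 53.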
Re: [whatwg] Week Strings
Křištof Želechovski writes: An algorithm for calculating the weekday of Jan. 1st given a year would be outside the scope of the HTML specification. That's begging the question. Similarly, the HTML specification does not describe how you increment a number by 1. No, but it does explain how to interpret a sequence of digits as an integer (multiplying the value by 10 for each digit encountered). And defines that week is a period of 7 days. And defines how many days there are in each month of a year. And even states the blindingly obvious: The 'week number of the last day' of a week-year with 53 weeks is 53; the 'week number of the last day' of a week-year with 52 weeks is 52. Those things all seem much more obvious to me than working out which day January 1st of a given year is. But as I said, I'm not a browser developer so perhaps it's fine. Smylers
Re: [whatwg] Dom as Audience Prereq
Kristof Zelechovski writes: Unlike in previous versions, the DOM is the skeleton and the underlying model of the specification. Yup. But I don't think any more Dom knowledge is needed to read this version. Even if there are sections that do not reference the DOM explicitly, a reader that tries to apply them to anything will not probably be able to draw the right conclusions without a basic knowledge of the DOM. I think that an author wanting to know how to use certain elements to mark up a static document (no scripting) could understand enough of the relevant sections -- for example the rules on when to use which of strong, mark, b or providing alt text for images. Smylers
Re: [whatwg] Week Strings
timeless writes: On Fri, Jun 19, 2009 at 2:32 PM, Smylerssmyl...@stripey.com wrote: The 'week number of the last day' of a week-year with 53 weeks is 53; the 'week number of the last day' of a week-year with 52 weeks is 52. well... there are people who might think you could count from week 0 Except that the spec has already defined that a year starts on week 1 before the above sentence. Smylers
Re: [whatwg] b Lede Example
Kristof Zelechovski writes: A lede is a summary or an invitation to read the whole article. It is semantically relevant; the reader may ask, e.g., Give me the ledes and I shall choose what I would like to read. For a user-agent to reliably provide that functionality would require a specific lede element, not merely allowing one of several uses of b be for denoting ledes. Using b for ledes really only enables ledes to be styled differently, not for semantic interpretation. Asking for the first paragraph of each article is not that practical, as the article need not contain a lede there Are there sites which have variable-length semantic ledes, use an element to mark that up, and where a reader who doesn't have the lede styled differently (for example if span class=lede is used and the reader doesn't have CSS) is missing something? In practice it seems that sites which style ledes also have a journalistic house style which requires journalists to consistently have the lede be the first paragraph (or whatever). Smylers
[whatwg] Using em for Meta-Content
HTML 5 currently defines em as being for stress emphasis of its contents, noting that: The placement of emphasis changes the meaning of the sentence. The element thus forms an integral part of the content. -- http://www.whatwg.org/html5#the-em-element I'm not sure this definition is wide enough to encompass the use that HTML 5 itself puts em to, using it for the 'This section is non-normative' bits at the start of sections, such as: http://www.whatwg.org/html5#introduction The italics there don't seem to be indicating stress (and the sentence doesn't warrant an exclamation mark at the end), more that it's meta-content -- information about the section. Of current HTML 5 definitions that seems closest to one of the purposes of i: an alternate voice or mood, or otherwise offset from the normal prose: http://www.whatwg.org/html5#the-i-element I suggest that either the definition of em is broadened to include this sense, or these normativity designators are instead marked up with something like i class=normativity or i class=other. This meta-content use seems similar to an article by a guest author being prefaced by an italicized paragraph from a regular author introducing the guest. Or editorial comments inserted into somebody else's work, which are often in square brackets and italics as well as having '- Ed' at the end. Mainly it's just indicating some kind of separation from the main text. (strong isn't quite right for these uses either: while the sentence is important, it's hardly the key information in that section. If reading the spec out loud to somebody, 'This section is non-normative' is the kind of thing I'd say very quickly, as boilerplate to be got out of the way of the interesting content to follow (almost like legalese on radio adverts). That suggests the small element, but that isn't quite right either: whether a section is normative is materially relevant to the content, not just a legal technicality.) Smylers
[whatwg] b Lede Example
One of the examples of b is marking up a lede paragraph: http://www.whatwg.org/html5#the-b-element Is a lede semantically relevant to the document such that it needs to be in the mark-up? Emboldening the first paragraph of an article seems like a matter of style to me, similar to using a drop cap for the first letter. For example, if an article were syndicated to multiple news sites it's conceivable that some would style the lede differently from other paragraphs and some wouldn't -- and doing so or not wouldn't affect the meaning of the article or its interpretation by readers. So using CSS (I think article h2 + p would do it) would seem more appropriate than any mark-up here. Especially since b is labelled as a last resort. Highlighting specifically just the first sentence (even if the first paragraph has multiple sentences) is more awkward, in that I don't think it's currently possible with CSS. But that it exists as a plausible choice for presenting an article demonstrates how much a matter of styling, rather than content, this area is. And a limit of CSS should be fixed in CSS, not HTML. (span class=lede can always be used as a work-around.) Smylers
[whatwg] Tool Implementor Audience
One of the audiences for HTML is stated as implementors of tools that are intended to conform to this specification: http://www.whatwg.org/html5#audience That seems circular, verging on tautologous: a tool author wondering whether this spec is relevant to her (and therefore whether her tool should aim to conform with it) isn't any better informed having read the above. And conversely, an author of a lousy tool (which attempts to parse webpages but does so in a way not compatible with how browsers do) might have this spec pointed out to him. But he can claim it doesn't apply to him, since he's never intended his tool to conform to it. Could we make it something like implementors of tools that emit HTML or parse Web content? Smylers
[whatwg] Dom as Audience Prereq
The audience section states familiarity with DOM Core and DOM Events as prereqs for reading the HTML 5 spec: http://www.whatwg.org/html5#audience As somebody without this DOM background there are certainly many parts of the spec which I've found both understandable and useful (to a web author). There may be parts which do require DOM knowledge, but as written it sounds like a prereq for understanding any part of the spec, and as such may unnecessarily put people off. Smylers
[whatwg] HTML 5 and HTML4
The 'History' section starts: Work on HTML 5 originally started in late 2003, as a proof of concept to show that it was possible to extend HTML4's forms ... -- http://www.whatwg.org/html5#history-0 Having HTML 5 (with a space) and HTML4 (no space) seems oddly inconsistent. Could we have them matching? (I haven't searched to see if this also occurs elsewhere in the document.) Smylers
[whatwg] workers Highlighted
In the 'Design Notes' section the word workers has a thick pale green underline. It isn't apparent what this is signifying. Moving the mouse over it reveals it isn't a link, but a tool-tip appears -- with the text Worker. That didn't really elucidate matters: http://www.whatwg.org/html5#serializability-of-script-execution What's this about? Smylers
[whatwg] Plus Signs in Signed Integers
The algorithm for parsing signed integers does not allow an optional plus sign before positive integers; that is, parsing "+4" will return an error at step 8 of this algorithm: http://www.whatwg.org/html5#rules-for-parsing-integers That is inconsistent with the algorithm for non-negative integers, which tolerates (and ignores) a leading plus sign (step 6): http://www.whatwg.org/html5#rules-for-parsing-non-negative-integers It also doesn't seem to match browser behaviour: the ol element's start attribute is an integer, so I tried this out in various browsers: <ol start=+4> <li>Plus four </ol> All the ones I had to hand (Firefox, Opera, Konqueror, Dillo, Lynx, Links, and W3M) numbered the element with 4. I've no idea if any web content is relying on this, but there doesn't seem to be any harm in being consistent with both current browser behaviour and non-negative integers. To check that it is specifically the plus sign they are ignoring and not any non-digit character I also tried: <ol start=H2SO4> <li>Acid test </ol> That should cause parsing an integer to abort and so the default of start=1 to be used. Opera, Links, and W3M get that right. Konqueror, Dillo, and Lynx all also seem to manage the aborting, but use a default of zero instead. Firefox parses the 2 out of H2SO4, seemingly using the first integer it can find in the attribute, so possibly isn't special-casing +. Smylers
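The difference described above can be sketched in a few lines of Python. This is only an illustration of the behaviour discussed in the message, not the spec's algorithm step for step: the function name parse_integer and the allow_plus flag are made up for the example, and the whitespace set is an assumption.

```python
def parse_integer(value, allow_plus=False):
    """Sketch of a signed-integer parser; returns None on error.

    allow_plus is a hypothetical switch: False mimics the behaviour
    described for signed integers (a leading "+" is an error), True
    mimics the leniency described for non-negative integers.
    """
    s = value.lstrip(" \t\n\f\r")  # skip leading whitespace (assumed set)
    sign = 1
    if s[:1] == "-":
        sign, s = -1, s[1:]
    elif allow_plus and s[:1] == "+":
        s = s[1:]  # tolerate and ignore the plus sign
    digits = ""
    for ch in s:
        if ch not in "0123456789":
            break  # stop collecting at the first non-digit
        digits += ch
    if not digits:
        return None  # no digits where one was required: error
    return sign * int(digits)
```

With allow_plus left off, "+4" is rejected just as "H2SO4" is; with it on, "+4" yields 4 while "H2SO4" is still an error, which is the distinction the browser tests above were probing.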
Re: [whatwg] Removing the need for separate feeds
Adrian Sutton writes: On 22/05/2009 08:21, Ian Hickson i...@hixie.ch wrote: As far as I can tell, things get better if the feed format and the default output format are the same, yes. Generally, redundant information has tended to lead to problems. Can you point to examples of this in relation to the use of feeds in particular? I can't find examples right now, but I have encountered various problems along these lines in the past, including: * The feed suddenly becomes empty. * A new blog has a 'feed' link, but it never works. * A blog's feed URL changes, but doesn't redirect. * A feed is misformatted in a way which causes it to be ignored. * The content of a feed is misformatted, such that in a feed reader its display is mangled, such as HTML tags and entities showing, or spaces having been squeezed out from around tags such that linked words don't have spaces around them. * The content of a feed has certain critical information, such as an image, stripped from it, such that it makes no sense, or has a different meaning from the full post. * The content of a feed has certain critical mark-up stripped from it, such as sup around exponents in a mathematical expression rendering 36 where 3 to the power of 6 was intended. In all cases the HTML version of the blog had correctly displaying and updating content; only the feed was affected by the issues. This usually left the author unaware of the problem, as they don't subscribe to their own blog. Eduard Pascual writes: sites using feeds tend to be almost always dynamic: both the HTML pages and the feeds are generated via server scripts from the *same set of source data*, I believe that to be true for at least most of the above cases I encountered. However that obviously wasn't sufficient to avoid the problems. For manually authored pages and feeds things would be different; but are there really a significant amount of such cases out there? Not many. But that's quite possibly because of the effort involved in doing so. 
The algorithm in the HTML 5 spec would allow some categories of handcrafted pages to gain feeds for free. I've often encountered webpages which I wished had feeds but don't. It's possible that an algorithm such as this would encourage more pages to do so. Smylers
Re: [whatwg] [Fwd: Re: Helping people seaching for content filtered by license]
Nils Dagsson Moskopp writes: On Friday, 08.05.2009, 19:57 +, Ian Hickson wrote: * Tara runs a video sharing web site for people who want licensing information to be included with their videos. When Paul wants to blog about a video, he can paste a fragment of HTML provided by Tara directly into his blog. The video is then available inline in his blog, along with any licensing information about the video. This can be done with HTML5 today. For example, here is the markup you could include to allow someone to embed a video on their site while including the copyright or license information: <figure> <video src="http://example.com/videodata/sJf-ulirNRk" controls> <a href="http://video.example.com/watch?v=sJf-ulirNRk">Watch</a> </video> <legend> Pillar post surgery, starting to heal. <small>&copy; copyright 2008 Pillar. All Rights Reserved.</small> </legend> </figure> Seriously, I don't get it. Is there really so much entrenched (widely deployed, a mess, IE-style) software out there relying on @rel=license meaning license of a single main content blob Merely using rel=license in the above example would not cause the copyright message to be displayed to users. that an unambiguous (read: machine-readable) writeup of part licenses is impossible? Why does the license information need to be machine-readable in this case? (It may need to be for a different scenario, but that would be dealt with separately.) The example above shows this for a movie, but it works as well for a photo: <figure> <img src="http://nearimpossible.com/DSCF0070-1-tm.jpg" alt=""> <legend> Picture by Bob. <small><a href="http://creativecommons.org/licenses/by-nc-sa/2.5/legalcode">Creative Commons Attribution-Noncommercial-Share Alike 2.5 Generic License</a></small> </legend> </figure> Can I infer from this that an a in a small inside a legend is some kind of microformat for licensing information? No. But if a human sees a string that mentions © copyright or license then she's likely to realize it's licensing information. 
And if it's placed next to a picture it's conventional to interpret that as applying to a picture. It's also conventional for such information to be small, because it's usually not the main content the user is interested in when choosing to view the page. Magazines and the like have been using this convention for years, without any need to explicitly define what indicates licensing information, seemingly without any ambiguity or confusion. Smylers
Re: [whatwg] [Fwd: Re: Helping people seaching for content filtered by license]
Eduard Pascual writes: On Fri, May 15, 2009 at 8:40 AM, Smylers smyl...@stripey.com wrote: On Friday, 08.05.2009, 19:57 +, Ian Hickson wrote: * Tara runs a video sharing web site for people who want licensing information to be included with their videos. When Paul wants to blog about a video, he can paste a fragment of HTML provided by Tara directly into his blog. The video is then available inline in his blog, along with any licensing information about the video. Why does the license information need to be machine-readable in this case? (It may need to be for a different scenario, but that would be dealt with separately.) It would need to be machine-readable for tools like http://search.creativecommons.org/ to do their job: check the license against the engine's built-in knowledge of some licenses, and figure out if it is suitable for the usages the user has requested (like "search for content I can build upon" or "search for content I can use commercially"). Ideally, a search engine should have enough with finding the video on either Tara's site *or* Paul's blog for it to be available for users. Yeah, that sounds plausible. However that's what I meant by a different scenario -- adding criteria to the above, specifically about searching. Hixie attempted to address this case too: Admittedly, if this scenario is taken in the context of the first scenario, meaning that Bob wants this image to be discoverable through search, but doesn't want to include it on a page of its own, then extra syntax to mark this particular image up would be useful. However, in my research I found very few such cases. In every case where I found multiple media items on a single page with no dedicated page, either every item was licensed identically and was the main content of the page, or each item had its own separate page, or the items were licensed under the same license as the page. In all three of these cases, rel=license already solves the problem today. 
To which Nils responded: Relying on linked pages just to get licensing information would be, well, massive overhead. Still, you are right - most blogs using many pictures have dedicated pages. It's perfectly valid to disagree with this being sufficient (I personally have no view either way on the matter). I was just clarifying that the legend mark-up example wasn't attempting to address this case, and wasn't proposing legendsmall (or whatever) as a machine-readable microformat. Smylers
Re: [whatwg] Question on (new) header and hgroup
jgra...@opera.com writes: Quoting Smylers smyl...@stripey.com : James Graham writes: hgroup affects the document structure, header does not. That explains _how_ they are different (as does the spec), but not _why_ it is like that. More specifically: * Are there significant cases where header needs _not_ to imply hgroup ? Consider wrapping an hgroup inside every header ; how many places has that broken the semantics? I could believe that most of the cases where a page header appropriately contains multiple headings they are subtitles rather than subsections. The semantic that authors seem to want from an element named header is All the top matter of my page before the main content. That could include headers, subheaders, navigation, asides and almost anything else. It could. But most of the above have no effect on the outline algorithm. In practice, how often do current div class=header sections contain headers of multiple sections, without those nested sections being separately wrapped in their own div-s (or similar, which could become section or whatever's appropriate in HTML 5)? Since the header can contain multiple distinct logical sections of the document, each with their own headers, it makes no sense to implicitly wrap its contents in hgroup. You're right. What I was really thinking of is something closer to: inside header if any hx elements are encountered before any nested sectioning elements then treat all the hx elements as being a single heading. So header could still contain section-s, with their own headings. And a header with no hx elements wouldn't create an empty entry in the outline. * Given the newness and nuance of header and hgroup and the distinction between them, it's likely that some authors will confuse them. Given that hgroup doesn't appear to do anything on the page (it's similar to invisible meta-data), it's likely that some authors will omit it[*1] when it's needed to convey the semantics they intend. Yes, that is possible. 
The thinking behind the change (or, at least, part of my reason for proposing it) was that it is less harmful if authors omit something where it would be useful than that they use it incorrectly in such a way that tools which follow the spec would be broken from the point of view of end users. That's a good point. In particular the old formulation of header would have caused the h2 element to be omitted from the outline in cases like <header> <h1>My Blog</h1> <nav> <h2>Navigation</h2> </nav> </header>, which would be confusing for users. Indeed. What I intended to raise for consideration (and hopefully now have done) is that header would not merge the above, because nav starts a new section inside header. Consider a similar example: <header> <h1>My Blog</h1> <h2>Ramblings of an internet nobody</h2> <nav><h2>Navigation</h2> ... </nav> </header> The spec currently has both the h2-s as subsections. The alternative I was thinking of would treat the h1 and first h2 as being a single heading (of the entire document), but keep the second h2 (as the heading of the navigation). On the other hand in the current formulation of the spec, the most likely error (omitting hgroup ) only has the effect that the outline hierarchy is slightly wrong, with the subheader appearing as an actual header; it does not lead to data loss. This seems like a much better failure mode. That's true. But if the number of failures can be minimized, it matters less what the failure mode is. My concern is that with hgroup being so esoteric, combined with its effect being largely invisible, it will hardly be used and therefore possibly not worth adding to HTML 5. Authors don't have a good track record on accurately adding invisible metadata. If we can algorithmically get it right in most cases, while leaving a way for careful authors to explicitly override it if necessary, that may be better overall. * Are there significant cases where hgroup will be useful outside of header ? hgroup exists to allow for subtitles and the like. 
It's fairly common for documents to have these -- where it's likely there's use for a header element anyway. It's much less common for a mere section of a document to warrant a multi-part title; is that a case which is worth solving? If it is, would it be problematic to force authors to use header there? It seems highly odd to have header perform a dual role where sometimes it means section header and sometimes it means group of heading/subheading elements. Much more confusing than one element per role. I think the two concepts are sufficiently overlapping that it isn't really a dual role. header could mean 'section (or document) header' -- it would be used when a section's header consists of more than just a single hx element. Whether those elements are because of multi-part titles or search boxes or whatever is a distinction that authors would
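The alternative rule floated in this message (inside header, treat any hx elements that come before the first nested sectioning element as a single heading) can be sketched roughly as follows. This is a guess at one reading of the proposal, not anything from the spec: the function name header_outline, the (tag, content) tuple representation, and the choice to ignore hx elements that appear after a nested sectioning element are all assumptions made for the example.

```python
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
SECTIONING = {"section", "article", "nav", "aside"}

def header_outline(children):
    """Sketch of the proposed grouping for the children of one header.

    children is a list of (tag, content) pairs; content is the text of a
    heading, or a nested list of (tag, content) pairs for a sectioning
    element. Returns (merged_heading_parts, nested_sections).
    """
    merged = []           # hx elements seen before any sectioning element
    nested = []           # (tag, first heading text) per nested section
    seen_sectioning = False
    for tag, content in children:
        if tag in SECTIONING:
            seen_sectioning = True
            first = next((text for t, text in content if t in HEADINGS), None)
            nested.append((tag, first))
        elif tag in HEADINGS and not seen_sectioning:
            merged.append(content)
        # Anything else (and, by assumption, later hx elements) is ignored.
    return merged, nested

example = [
    ("h1", "My Blog"),
    ("h2", "Ramblings of an internet nobody"),
    ("nav", [("h2", "Navigation")]),
]
```

For the "My Blog" example, the h1 and first h2 come back as one merged heading and the nav keeps its own "Navigation" heading; a header with no hx elements yields an empty merged list, so no outline entry would be created for it.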
Re: [whatwg] code attributes
ddailey writes: I found myself rather taxed by the limitations of code (disclaimer: I may well have been working on incorrect assumptions) 1. Having to type <pre> <code> &lt;tagname </code> </pre> seemed a little bit silly to me: is there a use case for *not* wanting pre when doing code ? Yes: designating something as code in running text, such as: <p>Type <code>ls</code> for a listing of the current directory.</p> Using pre around code would split the above into three separate lines. 2. having to escape < as &lt; in the middle of code seems like work for the author that could just as easily be handled by the browser. It could. But doing so would prevent being able to use other elements inside code, such as: <p>Type <code>ls <var>dir</var></code> to see what's in the directory <var>dir</var>.</p> There are plenty of other elements that can sensibly be used inside code, including: * em to emphasize a particular part that's explained in the surrounding text * mark to indicate a part that's relevant to the user * a href=... to link a term to its documentation * span to add classes to different parts, as hooks for CSS syntax highlighting 3. trying to style a code so that it would have an indented margin, a border, a default font-style (monospaced), preserve within-line indentation, and work consistently across browsers seemed to defy my humble abilities with CSS. (see http://srufaculty.sru.edu/david.dailey/cs427/StateOfArt-Dailey.html#test_file as an example of the very clumsy solution I ultimately opted for) pre should already have a monospaced typeface and preserve white-space. I'd expect you could either apply the indent and border to the pre or (if you have other pre-s in your document which aren't <pre><code>, so need to style only the latter) turn code-s inside pre-s into blocks and set the indent and border on them: pre code { display: block; /* set margins and border here */ } Are either of those what you tried? 
If so, please would you share the details of in which way they failed, and with which browsers. Thanks. 5. Some ... good folks ... let me know that <code> <p> happy</p> <p> sad</p> </code> was bad form, and that I should use <pre> <code> instead. It never would have dawned on me that the first was bad form, It's incorrect in the same way that <em><p>pow <p>boom!</em> is incorrect for two emphasized paragraphs. Phrasing elements like em and code can only contain phrasing content; they can't contain any flow elements. nor that the second would be good form. The HTML 5 spec has examples of doing exactly that at the definitions of each of the code and pre elements. pre is defined as being for block[s] of preformatted text, which seems to precisely describe what you have in this instance. Second the introduction of p within code was actually generated by a robot that converted a bunch of MS Word to html so someone other than me must have thought it was a good idea to do it that way. Or simply that the robot was separately converting pairs of line-breaks into p tags and use of a monospaced typeface into code spans, and the two happened to coincide -- possibly the robot's author never even considered it. Smylers
Re: [whatwg] time
Tom Duhamel writes: It seems that pretty much everyone agrees on this: Hi Tom. I'd like you to clarify an aspect of your proposal: <time>2009-03-16</time> Printing directly on the page, no tool tip: March 16, 2009 Because the author wrote a date in ISO 8601 format, a browser should rewrite it in the user's local date format, such that it is indistinguishable from if the author had typed it that way in the first place? (Obviously pre-HTML-5 browsers will still display the raw 2009-03-16.) Suppose I'm a UK user who happens to have configured my computer's date format to DD/MM/ (which is common over here) and I see an American conference's website give its date as 04/07/2009. I know that the USA date order is different from the UK's, so I'm used to having to remember to read that as April 7th. You're suggesting that there should be two possibilities I have to take into account: * The author literally wrote 04/07/2009, and the conference is on April 7th. * The author literally wrote <time>2009-07-04</time>, my browser converted that to my local format and displayed it as 04/07/2009, and the conference is on July 4th. And that as a reader I can't tell which of these it is, without viewing the document's source? (And even to spot that there is an ambiguity I've got to be aware that my browser 'sometimes' changes dates, that it depends on my computer's configuration, and what config I picked.) Does the same apply to times? Would they also be converted to the user's local timezone? <time>16 mars 2009</time> The user agent could, but is not required to, make an effort to interpret the date and do whatever it likes with it. However, if the date provided cannot be interpreted as ISO 8601 it could simply print the content as is without any change. 
In this example, if the user agent is able to understand this French date, the tool tip could be March 16, 2009 If a browser interprets a date in a different format, the localized version goes in the tooltip but the user sees exactly what the author typed? That is, which version (author-written or localized) the browser shows in the page depends on which format the author used? Smylers
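The ambiguity raised in this message is easy to reproduce. The following Python sketch only illustrates the scenario described above; the variable names are invented for the example, and the day-first and month-first format strings are assumptions standing in for a UK user's configuration and a US author's convention.

```python
from datetime import date, datetime

# Case 1: the author marked up an ISO 8601 date and the browser localizes it.
authored_iso = date.fromisoformat("2009-07-04")   # July 4th
localized = authored_iso.strftime("%d/%m/%Y")     # day-first, as a UK user might configure

# Case 2: an American author literally typed the date, month first.
literal = "04/07/2009"
as_us_reader = datetime.strptime(literal, "%m/%d/%Y").date()  # April 7th

# Both cases put the same characters on the page...
same_on_screen = (localized == literal)
# ...but they denote different days.
same_day = (authored_iso == as_us_reader)
```

Here same_on_screen comes out true and same_day false: the rendered text alone cannot tell the reader which of the two dates was meant.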
Re: [whatwg] time
Andy Mabbett writes: In message 20090314083450.ga30...@stripey.com, Smylers smyl...@stripey.com writes This thread appears to be proving that dates are very complicated and that to get them right for the general case involves lots of subtleties, All true. which would be a reason for punting -- only doing the simplest possible thing for now, acknowledging that that doesn't meet all desirable scenarios, and leaving everything else for HTML 6. I'm not clear on what basis you reach that conclusion from the undisputed facts above. Even attempts to produce a small list of changes that we have consensus on yields others disputing them, showing that we don't have consensus. ...yet. Indeed. Spending more time on this may well yield something useful and acceptable to at least most of those with opinions. When that has happened the outcome can be incorporated in HTML 6. So my suggestion for a spec change is to replace zero with 1582. That further reduces the set of dates that time can represent, but avoids the complexity of pre-Gregorian dates, and avoids inadvertently giving a meaning to them that hampers the efforts of a future version of HTML to do all of this right. What advantage does deferring this problem give us, other than side-stepping ... Side-stepping _is_ the advantage. HTML 5 has many improvements over current specs, most of them not in the slightest bit related to dates, eras, timezones, lunar cycles, or Popes. ... something which needs to be addressed? It doesn't need to be addressed in order for people to benefit from the other features, and improved interoperability, in HTML 5. An HTML standard which doesn't provide a completely general time element is still a useful HTML standard. I'm proposing that we don't hold up that standard while trying to solve a hard problem. (I'm still in favour of people working on it to solve it. And if there happened to be a consensus for a (partial) solution now I wouldn't be against including it. But that isn't where we are.) 
Smylers
Re: [whatwg] time
Tom Duhamel writes: My opinion is that all the following dates are precise: 2009 2009-03 2009-03-09 The latter is more precise, but the three are all precise in my opinion. Being precise means having a small granularity. Obviously that's subjective, but in many cases granularity of a year would be deemed quite large. There are numerous reasons to use dates which are not very precise, but are still precise nevertheless. I'm going to release the new version of my current project in <time datetime=2009-04>April</time> but I cannot tell as of now the exact day of the release. Indeed, that's a reason to use an imprecise date in that paragraph of text. But it isn't apparent why that date needs to be marked up as such; what consumers of the above HTML would do something useful with it? On the other hand, those are NOT precise dates: Last year About a month ago That's considerably more precise than 2009, in that it bounds a much smaller period of time than a whole year. But I don't see how any of this is relevant; the date support that HTML 5 needs is that which is generally useful, not something that happens to meet either your or my definitions of particular terms. From my understanding of the current draft, the earliest date that can be used is 1970-01-01. I think you're mistaken. The time definition requires a valid date or time: http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html#the-time-element The date component of which must be a valid date string: http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#valid-date-or-time-string http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#valid-date-string Which must start with a valid month string: http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#valid-month-string Which has the constraint year > 0. Smylers
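To make the point about the lower bound concrete, here is a rough Python sketch of a month-string check. It is an approximation written for this illustration, not the spec's wording: the function name is_valid_month_string is invented, and the shape assumed here (four or more year digits, a hyphen, two month digits, year greater than zero) should be checked against the linked definition.

```python
import re

# Assumed shape: YYYY-MM with at least four year digits.
_MONTH_STRING = re.compile(r"([0-9]{4,})-([0-9]{2})")

def is_valid_month_string(s):
    """Return True if s looks like a valid month string under the assumptions above."""
    m = _MONTH_STRING.fullmatch(s)
    if m is None:
        return False
    year, month = int(m.group(1)), int(m.group(2))
    # The only lower bound on the year is that it is greater than zero;
    # nothing here singles out 1970.
    return year > 0 and 1 <= month <= 12
```

Under these assumptions "1969-12" and "0001-01" both pass, while "0000-01" fails, so dates well before 1970 are accepted.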
Re: [whatwg] [html5] Rendering of interactive content
Aryeh Gregor writes: On Tue, Feb 10, 2009 at 10:30 PM, Ian Hickson i...@hixie.ch wrote: If the UA suddenly displays hyperlinks in green and I decided that my background is green, the user will complain with me, not with the UA (and will probably switch to a different website) Authors should never set the background colour without setting the foreground colour. So that would be the author's fault. I don't think that's relevant. It is relevant, because as you then say: If I, as an author, use the rule body { color: white; background: green; } and the UA uses the rule :link { color: green; } then links would be invisible despite my background color. That is precisely an instance of an author setting a background colour without a foreground colour -- specifically the author set the background colour used on links without setting the foreground colour for links. If an author sets a background colour then she needs to set the foreground colour for all text that appears on it, anywhere. Although authors are encouraged to always set colors and backgrounds together, UAs conventionally do *not* do this for links, for fairly good reason. Browsers do of course (typically) set both together -- in that they provide a default background colour, plus foreground colours for visited links, unvisited links, and non-link text. If an author overrides one of those four then he generally needs to override all of them (except in circumstances where he knows the area in question won't have any links, or any non-links, or any text). You could say that not only should authors never set the background color without setting the foreground color, they should also never set the background color without setting the *link* color. Yup. But this still doesn't help if the UA (or a user stylesheet) uses span { color: green; } for some strange reason Indeed. That would be strange. 
So strange that I really don't see why HTML 5 should be concerned with it, and I certainly don't think there's anything it can do about it: * A user-agent that does something as arbitrary as the above is unlikely to achieve much market share. * A user style-sheet that does the above has been presumably set by a user who wants it. Let her. (not much stranger than green links by default!), It's significantly stranger, for a couple of reasons: * Links are a distinct type of element with a specific purpose. span-s convey no semantics of their own, and are used for multiple purposes. * Links are conventionally a different colour from other text; span-s aren't. Preferring links to be green rather than blue is a minor change, possibly one of aesthetics or to assist somebody who has trouble distinguishing certain colours. Whereas preferring span-s to be green rather than indistinguishable from the surrounding text is adding in a distinction not normally seen. in which case everything is still messed up. Yes. Users can choose to implement custom styles which mess things up. Users are free to do that if they want. They may even have good reasons for doing so; for example, when attempting to debug a website it can be useful to make various elements different colours to show which is rendered where. It's possible that at some point I want to debug your website, or examine how you achieved a particular effect, and that making your span-s green will assist me in doing that. I don't think there's any way around this. We don't want a way round it. It's a feature that users are in ultimate control. Smylers
Re: [whatwg] [html5] Rendering of interactive content
Aryeh Gregor writes: On Wed, Feb 11, 2009 at 8:36 AM, Smylers smyl...@stripey.com wrote: That is precisely an instance of an author setting a background colour without a foreground colour -- specifically the author set the background colour used on links without setting the foreground colour for links. . . . Browsers do of course (typically) set both together -- in that they provide a default background colour, plus foreground colours for visited links, unvisited links, and non-link text. I interpret authors should always set background and foreground colors together as no single CSS rule should set 'background' without setting 'color' or vice versa. I can't figure out exactly how you're interpreting it. Ah, I see. Thanks for explaining that. I'm interpreting it as for each bit of text that you cause the background colour to be set for, also specify its foreground colour (and _vice versa_). So it's reasonable for a browser to set a background colour on body and know that other elements take their background from that. The point is that to ensure that the foreground and background don't conflict, you must always set foreground and background colors *in the same CSS rule*. As long as all stylesheets observe this principle, there will never be a mismatch. True, but that introduces other awkwardnesses. Given that many pages (or regions of pages) have a background colour shared across several elements, it's tedious to have to specify it for each one -- and also harder to change. If the default browser style-sheet explicitly specified the background colour for links (rather than letting the body background show through) then authors would have to override all of them, instead of just the body colour. I expect that would break many existing websites. Either the color and background are overridden both at once, or not at all, since the cascade works on a rule-by-rule basis, not property-by-property. 
On that basis transparent backgrounds are useless, since you can never be sure that somebody else won't change the background colour of the parent element without ensuring none of its content's text colours clash. HTML elements nest. People expect that setting a background on an element makes it appear behind all nested elements it contains. Let's work with that, not against it. The browser is the one that's failing to observe the principle in this case, and the author is observing it. The author is *not* setting the background color for :link, nor the foreground color. However she's including link elements inside a body for which she has set a background colour. Therefore, by my understanding, she's specified a background colour without a foreground colour. Your interpretation and my interpretation both 'work'. However, mine's less hassle to deal with (avoids proliferation of repetitive properties) and compatible with the extant web. If an author overrides one of those four then he generally needs to override all of them (except in circumstances where he knows the area in question won't have any links, or any non-links, or any text). So you're saying links are a special case for this principle? Why? Yes. Because links typically have a different colour -- as defined in the expected styling information given in HTML 5, and therefore authors should expect it. UAs might just as reasonably change the colors of other elements as well. It's only convention that says they only change the color of links. Yes. Conventions are good. But that exact same convention says that the color of links is blue. Fair point. But browsers have for a long time provided UI for users to specify other link colours in a way that they haven't for, for example, the colour of headings or emphasized text. I'm sure some users take advantage of such UI (and wouldn't consider them to be user style-sheets; somebody clicking on such UI possibly isn't even aware of the concept of user style-sheets). 
There are plausible reasons why a user would want to set non-default colours, perhaps to make things easier to read. And if he sets a different text and background colour, he's likely to want to set different link colours too, so that they still stand out. Smylers
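As an illustration of the reading defended above (a background set once on a container, with the text and link colours that sit on top of it given alongside), here is a minimal sketch. The colour values and the .note class are made up for the example and are not taken from the thread or from the HTML 5 draft.

```html
<!DOCTYPE html>
<title>Foreground and background set together</title>
<style>
  /* Background and foreground are given together on the container.
     Nested elements inherit 'color' and keep transparent backgrounds. */
  body { background: #fff; color: #222; }

  /* Typical UA style-sheets give links their own colours, so an author
     who sets a background sets the link colours too. */
  :link    { color: #00c; }
  :visited { color: #639; }

  /* A region with its own background restates all of its foregrounds. */
  .note          { background: #024; color: #fff; }
  .note :link    { color: #9cf; }
  .note :visited { color: #c9f; }
</style>
<p>Body text with <a href="http://example.com/">a link</a>.</p>
<p class="note">Inverted region with <a href="http://example.com/">a link</a>.</p>
```

Only the elements that introduce a background restate colours here; everything nested inside them relies on inheritance and transparency, which is the behaviour the message argues authors already expect.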
Re: [whatwg] [html5] Rendering of interactive content
Aryeh Gregor writes: On Wed, Feb 11, 2009 at 10:13 AM, Smylers smyl...@stripey.com wrote: Ah, I see. Thanks for explaining that. I'm interpreting it as for each bit of text that you cause the background colour to be set for, also specify its foreground colour (and _vice versa_). But that's not *possible* in CSS. Not within reason, anyway. You can't be expected to set color for all descendants. You're right. It is possible if you presume foreground colours generally inherit (except for links) and backgrounds are generally transparent. Which apparently I was presuming, but omitted to state. Sorry for not thinking it through properly. True, but that introduces other awkwardnesses. Given that many pages (or regions of pages) have a background colour shared across several elements, it's tedious to have to specify it for each one -- and also harder to change. It's the only way to be *sure* that things won't break (at least, you can be sure as long as everyone does it). We can be sure that there are live websites out there currently not doing it! For a start, there are those which don't use CSS at all but on body set the bgcolor, text, link, and vlink attributes (another reason why links are special, since there aren't similar attributes for setting any other elements' colours). I would say that what you're suggesting is an entirely different principle: stylesheet authors should manually set :link, :visited, and :hover foreground colors if they set any background color on any element that might contain a link, because they can't guarantee UA behavior otherwise. This is a much more specific point -- it doesn't cover interaction with author or user stylesheets and requires at most three rules per set of stylesheets. Yes. And also the reverse (that if you set a link colour then set the background colour on either links or an ancestor element). I'll agree that authors should always specify link colors to override the UA's stylesheet, if they specify backgrounds. 
Hurrah -- in that case I think we're in agreement, and no changes to the spec are necessary. Smylers
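The legacy body attributes mentioned in this message, and a style-sheet that follows the rule the two correspondents ended up agreeing on, might look like the two fragments below. This is a sketch only; the colour values are arbitrary.

```html
<!-- Fragment 1: legacy page. All four colours are given together on body
     using presentational attributes (obsolete, shown for comparison). -->
<body bgcolor="#ffffff" text="#222222" link="#0000cc" vlink="#663399">
  <p>Text with <a href="http://example.com/">a link</a>.</p>
</body>

<!-- Fragment 2: style-sheet version. A background is set, so the text
     colour and the link colours are set with it. -->
<style>
  body      { background-color: #ffffff; color: #222222; }
  a:link    { color: #0000cc; }
  a:visited { color: #663399; }
  a:hover   { color: #cc0000; }
</style>
```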
Re: [whatwg] Hyphenation
Markus Ernst writes: Ian Hickson schrieb: I don't think this is a big enough problem to deserve solutions more complicated than the soft hyphen at this time. Jukka Korpela stated that the intention of the soft hyphen is not actually a hyphenation hint: http://www.cs.tut.fi/~jkorpela/shy.html He claims that there are multiple standards that contradict each other. So whatever is implemented is bound to contravene at least one of them. However he mentions that: * HTML 4 defines it as a hyphenation hint. * Unicode defines it as a hyphenation hint. * Recent browsers are now treating it as a hyphenation hint. * The contradictory standard (ISO-8859) only defines a soft hyphen when used at the end of a line, namely that it should be rendered like a hyphen. Since that standard doesn't envisage the character being used elsewhere, it is silent on how it should be rendered. It seems to me that choosing to render invisibly a soft hyphen which isn't at the end of a line doesn't contradict the text of ISO-8859 (though it could be argued to contradict its spirit). (Anyway I don't really understand the difference between a normal hyphen and a soft hyphen then...) Suppose you are reflowing some text (perhaps because you are quoting it); words which were broken over lines in the original may want rejoining into a single word in your version (that is, the soft hyphen disappears); but hyphens (non-soft) between two words need to remain. Smylers
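The difference between the two characters can be tried out with a fragment like the following. It is only a sketch: the narrow width is an arbitrary value chosen to force line breaks, and whether a break actually occurs at the hint depends on the browser and font.

```html
<!-- &shy; (U+00AD) marks a permitted break inside one word and is shown
     only if the line breaks there; "-" (U+002D) is a real hyphen between
     words and is always shown. -->
<p style="width: 8em">
  A hyphen&shy;ation hint stays invisible unless the line breaks at it,
  but the hyphen in well-known is rendered wherever it falls.
</p>
```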
Re: [whatwg] [html5] Rendering of interactive content
Giovanni Campagna writes:

> So the whole rendering section is just for implementors and authors should act as if no default style sheet is present

No; the section is also for authors, in that it advises them of how content is expected to be rendered in mainstream graphical browsers.

> or worse, as if it could be anything, like an inline-block div or blue table,

Indeed it _could_ be anything, but there's no reason for authors to be worried about that; if a (non-malicious) user-agent does something different from that expected by HTML 5 then it presumably has done so intentionally and it wouldn't necessarily benefit users for authors to attempt to combat it.

> so that the author should set all supported properties to initial or the HTML5 expected value?

Not necessarily.

> That is: I, author, want consistent rendering on all platforms and browsers:

I think that's where you're going wrong. Platforms themselves aren't consistent -- in things like screen sizes, resolutions, distance between screen and audience, whether they are interactive, what input devices are available, pagination. If an esoteric platform chooses to divert from the expected HTML 5 rendering then it's likely because that platform knows more about what rendering is appropriate for that platform than you do.

> I import the HTML5 style sheet inside author ones.

That's a very parochial view. In mainstream graphical browsers such importing would be redundant, because they'll have the expected behaviour anyway. In esoteric user-agents you cannot know that the HTML 5 defaults provide a better user experience than those chosen by the user-agents' developers.

> I, implementor, want to provide backward-compatible rendering for those authors that didn't follow rule 1), I import HTML5 style sheet inside UA defaults. In both cases, a downloadable stylesheet would be much appreciated.

I think a downloadable style-sheet is inevitable!

Smylers
Re: [whatwg] [html5] Rendering of interactive content
Giovanni Campagna writes:

> If <input type=submit> in some UA is rendered with all properties set to initial, not only does it not express the semantics of a button, but it may be difficult for a user to actually recognize it as a button and eventually click it. In that case I, as the author, may need to manually set { appearance:push-button; content:attr(value,string,Send); } in order to have my form submitted. Try this example (in Firefox or Safari):
>
> data:text/html,<style>label { position:fixed; top:-1em; border:1px solid black; } label input { -moz-appearance:none; -webkit-appearance:none; border:none; width:auto; } input[type=submit] { -moz-appearance:none; -webkit-appearance:none; background-color:transparent; border:none; }</style><form action="http://www.google.com/search" method="get"><label>Search: <input type="text" value="" name="q"></label><input type="submit" value="Go">
>
> Imagine that was the UA default stylesheet instead of an author stylesheet and you may see what interoperability means with web application look and feel.

Indeed it would be a problem if a major web browser shipped with such a default style-sheet. But ... I'm really having trouble imagining any of them doing so. It isn't in mainstream browsers' interests to produce products which purposefully confuse their users. Surely a browser which disguises buttons as plain text is going to lose market share of its own accord, regardless of what HTML 5 says?

Or, to look at it from the opposite direction, supposing a browser producer really wanted to make buttons look like plain text, would whether HTML 5 condemns this practice really affect what they do? Would not being able to market their browser as HTML-5-compliant be enough that they'd begrudgingly forget their desire? Would users dissatisfied with the behaviour only complain because it breaches HTML 5, rather than because, say, it's really stupid?
I can't see how a requirement such as you propose would make any practical difference on avoiding the outcomes you wish to avoid. But it might unnecessarily curtail innovation in directions that we haven't yet envisaged -- perhaps somebody developing a specialist user-agent for mobile devices (or digital TV, or for print-based output, or large-screen non-interactive displays, or ...) comes up with a different way of displaying certain elements which she considers superior for her particular target audience; why should the spec attempt to dissuade her from doing so?

Smylers
Re: [whatwg] [html5] Rendering of interactive content
Giovanni Campagna writes:

> 2009/2/8 Smylers smyl...@stripey.com
>
> > Giovanni Campagna writes:
> >
> > > data:text/html,<style>label { position:fixed; top:-1em; border:1px solid black; } label input { -moz-appearance:none; -webkit-appearance:none; border:none; width:auto; } input[type=submit] { -moz-appearance:none; -webkit-appearance:none; background-color:transparent; border:none; }</style><form action="http://www.google.com/search" method="get"><label>Search: <input type="text" value="" name="q"></label><input type="submit" value="Go">
> > >
> > > Imagine that was the UA default stylesheet instead of an author stylesheet and you may see what interoperability means with web application look and feel.
> >
> > Indeed it would be a problem if a major web browser shipped with such a default style-sheet. But ... I'm really having trouble imagining any of them doing so. It isn't in mainstream browsers' interests to produce products which purposefully confuse their users. Surely a browser which disguises buttons as plain text is going to lose market share of its own accord, regardless of what HTML 5 says?
> >
> > Or, to look at it from the opposite direction, supposing a browser producer really wanted to make buttons look like plain text, would whether HTML 5 condemns this practice really affect what they do? Would not being able to market their browser as HTML-5-compliant be enough that they'd begrudgingly forget their desire? Would users dissatisfied with the behaviour only complain because it breaches HTML 5, rather than because, say, it's really stupid?
> >
> > I can't see how a requirement such as you propose would make any practical difference on avoiding the outcomes you wish to avoid. But it might unnecessarily curtail innovation in directions that we haven't yet envisaged -- perhaps somebody developing a specialist user-agent for mobile devices (or digital TV, or for print-based output, or large-screen non-interactive displays, or ...) comes up with a different way of displaying certain elements which she considers superior for her particular target audience; why should the spec attempt to dissuade her from doing so?
> >
> > Smylers
>
> I agree with you that we must not constrain the innovation, but in that case, what is the whole purpose of the Rendering section? I think that section is for
>
> - implementors of new UAs, that don't need to reverse engineer the competitor products in order to find the defaults

Yup -- it means that somebody creating a conventional graphical web browser can straightforwardly produce the output of other such web browsers, and that authors expect.

> - authors, that in this way know what to expect from the various UA

Yes. But note that expectations are merely that; expectations are not always met! And there may well be instances in which a user-agent for a particular market decides that different rendering would better suit that market; users of such a device are likely to appreciate it -- and whether the author expected it (or even became aware of it) doesn't necessarily matter.

> Having somewhere written that hyperlinks should be blue, allows you to style the background-color to anything but blue. If the UA suddenly displays hyperlinks in green and I decided that my background is green, the user will complain with me,

Indeed. That's precisely why they should only be labelled as expectations and not things an author can rely on. The only sane options for an author are to set no colours at all or to specify both background and text colours. It is not safe only to set one of them, and it would be irresponsible for HTML 5 to give the impression that it is. (Remember that even if all browsers implement the provided defaults, users can still set a preference of green links or whatever.)

> The solutions are two: 1) either provide a default style sheet only for authors and say: you want the usual rendering everywhere? import this.
> This means that the whole Rendering section should be moved to an Appendix and a separate downloadable CSS file should be provided.

That's very close to what we have got. Except that it isn't in an appendix; it's in a non-normative section. But the important point is that it isn't normative. And that a browser producer is permitted to choose to implement them by default rather than expecting users to have to import them.

Consider if a future release of Firefox didn't contain the CSS for expected web rendering, but users could import it should they want to. It turns out that nearly all users of this new Firefox release desire this rendering, and they find it tedious to have to specify it separately (if indeed they are technical enough to have discovered how to do so; many simply consider that Firefox release to be broken). So somebody takes the Firefox source code, adds in the default user styles, and releases it as Snowchicken. Most users prefer Snowchicken over Firefox, so Snowchicken gains market share
Re: [whatwg] Deprecating <small>, <b>
Pentasis writes:

> > [Asbjørn Ulsberg writes:] However, as you write and as HTML5 defines it, there is nothing wrong with <small> per se, and I agree that as an element indicating smallprint, it works just fine. Since my initial reply might have been a bit too colored by the HTML4 definition of the element and its current usage on the web, I hereby withdraw my comment and conclude that I mostly agree with you. :-)

Yay, consensus! Thanks, Asbjørn.

> But isn't this just the reason why it should be dis-used? The HTML4 spec defined it as a styling tag, and that is how it is *mostly* used and understood by the majority of the users/authors.

That may be true (though authors who want smaller text just because they think the default looks too large could also use <font size=2> or CSS), but authors who wanted to diminish the emphasis of certain content to users are likely to pick <small> because there isn't much else available.

Just because an element is currently widely used for a purpose we deem inappropriate doesn't mean that its appropriate uses aren't important. Tables are widely used for layout; <br>-s are widely misused. Both of those clearly have other valid uses, so are still in HTML.

> Just because HTML5 redefines the element does not mean that the element will suddenly be semantic. Even if people start using it purely semantically from now on (and what is the chance of that?), the existing websites still carry <small>-tags that are not compliant with the new definition.

Yes. But the suggested alternative was to deprecate <small> entirely and invent a new element to convey the semantic of 'small print'. That would of course make _all_ current uses of <small> non-conforming. Presentational <small>-s are going to be non-conforming either way; allowing semantic <small>-s to conform doesn't change that.

> By redefining it the (existing) web breaks; albeit purely in the semantic area.

That's intentional.
If anybody checks legacy content against the new standard they will discover that what they did is no longer recommended. However, browsers will 100% support it and continue to render it as it always has been, so the 'breakage' is in no way visible; if the author chooses not to care about it then no harm is done.

Smylers

PS: Pentasis, please could you send mails that do at least one of attributing who you're quoting or including In-Reply-To: headers so that they continue the existing thread rather than starting a new one. Without either it's rather tedious to have to look up who said the text you quote. Thanks.
Re: [whatwg] Deprecating <small>, <b>?
Asbjørn Ulsberg writes:

> On Mon, 17 Nov 2008 15:26:22 +0100, Smylers [EMAIL PROTECTED] wrote:
>
> > In printed material users are typically given no out-of-band information about the semantics of the typesetting. However, smaller things are less noticeable, and it's generally accepted that the author of the document wishes the reader to pay less attention to them than more prominent things. That works fine with <small>.
>
> No, it doesn't, and you explain why yourself here:
>
> > User-agents which can't literally render smaller fonts can choose alternative mechanisms for denoting lower importance to users.

I don't see how that explains why <small> is an inappropriate tag to use for things which an author wishes to be less noticeable.

> If the point isn't to literally render smaller fonts, you shouldn't indicate that you want the fonts rendered smaller either.

Indeed. <font size=-1> would be bad to use for this.

> What you want is to semantically indicate that the text wrapped inside the element is of less significance than the surrounding text, e.g. a negative 'strong' or 'em'.

Yes. And I reckon that <small> works for that. English has the idiom of 'small print', roughly meaning text written by the legal department rather than the marketing department. But 'small print' doesn't literally have to be typeset with a smaller font; it's a figure of speech.

> 'small' isn't equal to what we're trying to express here. What we need is a new element that can capture this semantic.

If we were starting from scratch then indeed <small> may not be the best name to choose for this element. But, unfortunately, we aren't. <small> has existed for some time, and people are already using it. If one currently wants to denote lesser importance <small> is the best element to pick.

Further, existing browsers know what to do with <small>; if we introduce a new element then content that uses it will have a sub-optimal rendering in current browsers, whereas <small> already does something appropriate.
So I still think <small> works for denoting that something is of smaller importance.

> > Indeed you can't. And nor can you if you were reading printed text with some words in bold.
>
> Why does printed text set the standard for what we are able to express with a markup language?

It doesn't set the standard. But it's useful in some comparisons. And most of the time humans cope perfectly well with inferring typographic conventions without having them spelt out.

> > However, you would appreciate that the author had wished for some particular words to stand out from the surrounding text.
>
> That's a job for the style sheet, whether it's provided by the author or by the user agent.

The style-sheet can only pick out particular words if those words have been marked-up as special in the document, so it doesn't solve the problem of how to mark them up.

Further, this isn't using <b> because the house style is to have all text in a bold weight (that can be done by style-sheets, and if the style-sheet is missing all the content is still there); it's using <b> to convey _some_ semantics: namely that those particular words are in some way special.

So if the mark-up is <span class=brand_name> or similar and the distinguishing presentation added with CSS then users without style-sheets are completely unaware that the author identified those words as being special. Whereas with <b>, everybody gets to know.

> > However, you can only notice this if the words have been distinguished in some way. With <b>, all user-agents can choose to convey to users that those words are special.
>
> They are only special for sighted users, browsing the page with a rather advanced user agent. They are not special to blind users or to users of text-based user agents like Lynx.

Not true. Any user-agent can choose to convey that words marked in <b> are somehow different from the surrounding words. Lynx does this.

> If you want to express semantics, then use a semantic element.

That's begging the question.
If we define <b> to be semantic, then it is!

> Expressing semantics through presentation only is done in print because of the limitations in the printing system.

Well, yes.

> If the print was for a blind person, printed with braille, one could imagine (had it been supported) that letters with a higher weight could be physically warmer than others, or with a more jagged edge so they could stand out.

Yup -- and an HTML-to-braille converter could choose to do that with words marked in <b>, whereas it couldn't with <span class=BrandName>.

> Such effects would have been impossible if the document was only tagged with presentational markup.

To some extent, yes: not knowing whether a letter is where it is on the page because it's a start of a paragraph or a heading, or just because the previous line is full, hampers doing that. And similarly for typefaces. The same applies to other mediums than print -- you need to know the underlying reason of why
Re: [whatwg] Deprecating <small>, <b>?
Felix Miata writes:

> On 2008/11/24 16:19 (GMT) Smylers composed:
>
> > So I still think <small> works for denoting that something is of smaller importance.
>
> I do too, but I don't believe less importance can be the only inference. One could simply want smaller text, without expecting that inference.

If you just want something to be smaller stylistically and there's nothing special about that portion of the text then I think using <small> for it would be as bad as using <h1> just to make text bigger; CSS is a better choice.

> e.g., just because fine print legalese is called what it is doesn't necessarily make it unimportant or less important.

It's less important in the sense that it isn't the point of what the author wants users to have conveyed to them; it's less important to the message. (Of course, to users any caveats in the small print may be very important indeed!)

Smylers
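The distinction drawn in this exchange can be sketched in markup as follows. The wording of the paragraphs and the class name compact are invented for the example.

```html
<!-- Small print: a caveat that is not the main point of the message.
     The element itself carries that meaning; no CSS is needed. -->
<p>Unlimited calls for a fixed monthly price.
   <small>Fair-use policy applies; see terms and conditions.</small></p>

<!-- Purely stylistic size change: nothing is special about this text,
     so CSS is used rather than <small>. -->
<style>
  .compact { font-size: 0.85em; }
</style>
<p class="compact">This paragraph is merely set in a smaller size.</p>
```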
Re: [whatwg] Absent rev?
Martin McEvoy writes:

> Ian Hickson wrote:
>
> > On Tue, 18 Nov 2008, Martin McEvoy wrote:
> >
> > > (I am not criticizing just trying to understand it) surely all it needed was to define some rev values (the same as rel) and people will start using rev correctly?
> >
> > That's backwards -- looking for a problem to fit the solution, not looking for a solution to fit the problem
>
> No not really because if you look at the analysis (link above) made in 2005, rev=made (9th) is used more than rel start, search, help, top, up, author and a whole lot of other link relationships that have made their way into HTML5. It doesn't make any sense?

There's a difference between adding an attribute and adding to the set of values defined for an attribute; given rel's existence, the cost of adding start, up, etc is quite possibly less than of adding rev.

There's also the misuse to consider. If, say, rel=up is barely used but when it is used it's generally used correctly then it's benign, and not causing any harm. Significant rev misuse has been identified; its existence is confusing people into writing something they don't mean.

Smylers
Re: [whatwg] Absent rev?
Martin McEvoy writes:

> Ian Hickson wrote:
>
> > given the way that the rel attribute and the related keywords are defined, rel=author does in fact convey the semantics that rev=made did.
>
> No It doesn't

Yes it does; it's specified that they are equivalent:

  "For historical reasons, user agents must also treat link, a, and area elements that have a rev attribute with the value made as having the author keyword specified as a link relationship."
  -- http://www.whatwg.org/specs/web-apps/current-work/multipage/structured-client-side-storage.html#link-type-author

> Reverse and Inverse properties are key factors of any Semantics

OK.

> without both @rev and @rel there is hardly any semantics at all just a one way stream of information,

That simply isn't true, since it's always possible to define rel=foo and rel=bar where bar conveys the same semantics as foo but in the opposite direction; no rev needed.

> which most of the time you have to guess what the Authors intentions were.

Not at all, since part of the definition of the rel values says in which direction they are to be interpreted. For example, rel=author indicates that the referenced document provides further information about the author of the section that the element defining the hyperlink applies to.

> rel=author on the whole only relates to published documents, rel=made relates to Documents, Music, Photos, Videos, Sunday Lunch! Literally anything that can be *made*

So the problem is that if I wanted to be able to create a link from my Sunday lunch to its cook, annotating it as such, I wouldn't be able to do so because rel=author isn't appropriate terminology to use in meals?

That's true, but given that my Sunday lunch isn't written in HTML anyway, I don't see how it could be trying to use rel=author (or indeed rev=made) in the first place! Ditto for all your other examples. By definition the thing which is making the rel=author link has to be written in HTML 5, and therefore has an author of some sort.
> > Furthermore, since the definition of rel in HTML5 allows relationships in either direction to be defined, there is no need anymore for a separate rev= attribute.
>
> So essentially @rel in html5 is breaking the semantics of @rel just because it can't deal with @rev?

No, the semantics of rel aren't changed from existing use; HTML 5 takes care not to break existing widespread use.

> > > the misuse of stylesheet is trivial and only a matter of informing authors of their error
> >
> > Well, who's going to be doing the informing?
>
> The publishers of HTML5

Why should they bother to do that when they can more easily define the problem no longer to exist?

> > Authors who today use rev=made could equally well use rel=author without loss of generality IMHO.
>
> OK then example: I am the author of numerous websites and I decide (like many people do) to place some links on my homepage, a portfolio if you like. My Homepage is at: http://groovydeveloper.com/ Here is my link
>
> <a rel="author" href="http://somegroovysite.com/">Groovy Site</a>
>
> Above Statement (In HTML4) says http://somegroovysite.com/ Authored http://groovydeveloper.com/ Which Is rubbish its the other way round

Indeed.

> The Same statement in HTML5 will say (because @rel is a reverse and inverse link type)

I don't understand what you mean by the part in parentheses. Please could you expand on it, or provide a reference.

> http://somegroovysite.com/ Authored http://groovydeveloper.com/ and http://groovydeveloper.com/ Authored http://somegroovysite.com/

Of course not. See my quote from the rel=author part of the spec above; it clearly defines in which way that relationship applies. Among the set of relationships that rel allows there are relationships in each direction (both from and towards the current document), but a given relationship is always unambiguously defined to be in a particular direction.

> > If there are redundant features that are only used 0.2% of the time, we should probably remove them, yes. Are there any?
> A lot considering that the average website only uses 19 elements[1]

That simply doesn't follow. There are many ways in which hundreds of different elements could be distributed throughout a population such that each of them is used on more than 0.2% of pages yet the mean elements per page is 19.

Smylers
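One such distribution can be worked through with made-up numbers. Assume, purely for illustration, N = 200 distinct elements and P pages, every page using exactly 19 of them, with usage spread evenly so that each element appears on the same fraction f of pages. Counting (page, element) uses in two ways gives:

```latex
\[
  19\,P \;=\; \sum_{e=1}^{N} f\,P \;=\; N f P
  \quad\Longrightarrow\quad
  f \;=\; \frac{19}{N} \;=\; \frac{19}{200} \;=\; 9.5\% \;>\; 0.2\%
\]
```

Under the same even-spread assumption, f stays above 0.2% for any N below 9500, since 19/N > 0.002 exactly when N < 9500. So a mean of 19 elements per page says nothing by itself about how many elements fall under the 0.2% threshold.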
Re: [whatwg] Absent rev?
Martin McEvoy writes:

> > To be precise, the most commonly used value was rev=made, which is equivalent to rel=author and thus was not a convincing use case.
>
> !! rel-author doesn't mean the same as rev-made eg:

In which cases doesn't it? If A is the author of B then B was made by A, surely?

> I have just finished this new <a rel="author" href="http://coolsite.co.uk/">Cool website</a> check it out
>
> that would mean http://coolsite.co.uk/ is the author of the referring page which is nonsense.

Indeed, but nobody is suggesting that would be appropriate.

> rev=author is clearly better semantics in the above case?

Yes, if using rev. Without rev it could be written as rel=made, because made is the opposite of author.

> > The second most common value was rev=stylesheet, which is meaningless and obviously meant to be rel=stylesheet.
>
> And that was the basis of the whatwg decision to drop rev? (I am not criticizing just trying to understand it)

Data of what people have actually done, with the existence of current browsers and standards, informs many decisions.

> surely all it needed was to define some rev values (the same as rel) and people will start using rev correctly?

What semantics do you think authors who wrote rev=stylesheet were meaning to convey? Presumably not that the webpage containing it is the style-sheet for the CSS file that it linked to -- so it's definitely a mistake by the author.

If what the author meant to write was rel=stylesheet then HTML 5 is surely an improvement, by dropping the confusing rev=stylesheet? Or do you think something else is commonly meant by rev=stylesheet?

> > We therefore determined that authors would benefit more from the validator complaining about this attribute instead of supporting it. Anything that could be done with rev= can be done with rel= with an opposite keyword, so this omission should be easy to handle.
>
> There are some cases where that is just not possible.

Which?

Smylers
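The two rev usages discussed in this message, next to their rel counterparts, might look like the sketch below. The example.com URL and the file name site.css are placeholders, not taken from the thread.

```html
<!-- On the page being described: the linked document gives information
     about this page's author. Per the spec text quoted earlier in the
     thread, rev=made is to be treated as rel=author. -->
<a rel="author" href="http://example.com/about-me">About the author</a>
<a rev="made"   href="http://example.com/about-me">About the author</a>

<!-- Almost certainly a mistake: read literally, rev says that this page
     is the style-sheet of the CSS file. -->
<link rev="stylesheet" href="site.css">

<!-- What the author presumably meant. -->
<link rel="stylesheet" href="site.css">
```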
Re: [whatwg] [rest-discuss] HTML5 and RESTful HTTP in browsers
[EMAIL PROTECTED] writes:

> as an example:
>
> <a href="http://example.com/report">html report</a>
> <a href="http://example.com/report" Accept="application/pdf">pdf report</a>
> <a href="http://example.com/report" Accept="application/rss+xml">xml report</a>
>
> So I can send a colleague a message; 'you can get the report at http://example.com/report', and they can use that URL in any user agent that is appropriate.

Except that in practice on receiving a URL like the above, nearly all users will try it in a web browser; they are unlikely to put it into their PDF viewer, in the hope that a PDF version of the report will happen to be available.

> A browser is a special case in which many different content-types are dealt with.

It's also the most common case. Supposing I opened the above URL in a browser, and it gave me the HTML version; how would I even know that the PDF version exists?

Suppose my browser has a PDF plug-in so can render either the HTML or PDF versions, it's harder to bookmark a particular version because the URL is no longer sufficient to identify precisely what I was viewing. Browsers could update the way bookmarks work to deal with this, but any external (such as web-based) bookmarking tools would also need to change.

Or suppose the HTML version links to the PDF version. I wish to download the PDF on a remote server, and happen to have an SSH session open to it. So I right-click on the link in the HTML version I'm looking at, choose 'Copy Link Location' from the menu, and in the remote shell type wget then paste in the copied link. If the link explicitly has ?type=PDF in the URL, I get what I want; if the format is specified out of the URL then I've just downloaded the wrong thing.

Smylers
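For comparison with the proposed Accept attribute, here is a sketch of the format-in-URL style that the reply prefers. Only ?type=PDF appears in the message; the other query values are assumed by analogy, and the type attribute is just an advisory hint to the user agent.

```html
<!-- Each representation has its own URL, so a copied, pasted or
     bookmarked link identifies exactly one format. -->
<ul>
  <li><a href="http://example.com/report?type=HTML" type="text/html">HTML report</a></li>
  <li><a href="http://example.com/report?type=PDF" type="application/pdf">PDF report</a></li>
  <li><a href="http://example.com/report?type=RSS" type="application/rss+xml">RSS report</a></li>
</ul>
```

With links like these, the 'Copy Link Location' then wget scenario described above fetches the PDF, because nothing outside the URL is needed to select the format.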
Re: [whatwg] Deprecating <small>, <b>?
Pentasis writes:

> 2) When using <small> on different text-nodes throughout the document, one would expect all these text-nodes to be semantically the same. But they are not (unless all of them are copyright notices).

In printed material users are typically given no out-of-band information about the semantics of the typesetting. However, smaller things are less noticeable, and it's generally accepted that the author of the document wishes the reader to pay less attention to them than more prominent things. That works fine with <small>.

User-agents which can't literally render smaller fonts can choose alternative mechanisms for denoting lower importance to users. There's no chance of doing this with <span class=legalese> or similar, since user-agents are unaware of the semantic they should be conveying.

> 3) <small> is a styling element, it has zero semantic meaning, so it does not belong inside HTML.

Denoting particular text as being of lesser importance is quite different from choosing the overall base font size (or indeed typeface) for the page, or the colour of links or headings -- that's merely expressing a preference for how graphical user-agents should render particular semantics, but the semantics themselves are conveyed to _all_ user-agents (<a>, <h3>, etc).

> 4) <b>Siemens</b> also does not tell me anything about the semantics. Is it used as a name, a brand, a foreign word? etc. I cannot get that information from looking at the <b> element.

Indeed you can't. And nor can you if you were reading printed text with some words in bold. However, you would appreciate that the author had wished for some particular words to stand out from the surrounding text. Perhaps you then notice it's being done for all brand names? Or that the emboldened words spell out a secret message?

However, you can only notice this if the words have been distinguished in some way. With <b>, all user-agents can choose to convey to users that those words are special.

Smylers
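The contrast drawn in point 4 can be sketched as two alternative ways of marking up the same sentence. The sentence is invented; the class name brand_name follows the one used elsewhere in this thread.

```html
<!-- With <b>, no style-sheet is needed: any user-agent can tell that the
     word has been set apart from its neighbours, though not why. -->
<p>The kettle is made by <b>Siemens</b>.</p>

<!-- With a bare class, the distinction exists only where the author's
     CSS is applied; without it the word looks like any other. -->
<style>
  .brand_name { font-weight: bold; }
</style>
<p>The kettle is made by <span class="brand_name">Siemens</span>.</p>
```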
Re: [whatwg] [rest-discuss] HTML5 and RESTful HTTP in browsers
[EMAIL PROTECTED] writes: It's also the most common case. Supposing I opened the above URL in a browser, and it gave me the HTML version; how would I even know that the PDF version exists? Hypertext. OK. Except that in practice on receiving a URL like the above, nearly all users will try it in a web browser; they are unlikely to put it into their PDF viewer, in the hope that a PDF version of the report will happen to be available. I've addressed this subsequently: 'here's the URL: example.com/report you can open this with adobe, excel, powerpoint, word' Would the sender of that link necessarily know all the formats the URL provides? Anyway, that's an unrealistic amount of typing -- typically round here people just copy and paste a URL into an instant message and send it without any surrounding text. Whereas without any other information, people will generally open URLs in a web browser. So it'd be faster just to send the URL of the page which contains hypertext links to all the formats; at which point we no longer care whether those links specify the format in the URL or elsewhere. Suppose my browser has a PDF plug-in so can render either the HTML or PDF versions; it's harder to bookmark a particular version because the URL is no longer sufficient to identify precisely what I was viewing. Browsers could update the way bookmarks work to deal with this, but any external (such as web-based) bookmarking tools would also need to change. I've also already addressed this in the original post; I was quite clear that if browsers don't store the application state when you make a bookmark (headers, URI, HTTP method), then this is an argument for continuing to use URI conneg *as well* as HTTP conneg; rather than instead. What is the point of doing it in HTTP if it's being done in HTML anyway? Until the browsers fix this. ;) Not just browsers, as I pointed out. Also many databases which have tables with URL fields would need extra fields adding. 
Browsers should really be bookmarking the whole request/state; the only reason they don't do this is because that's not the way it's done now. The reason for that is lack of incentive due to inadequate tooling, it's not a fair justification to say 'no one does it at the moment because it's not necessary', that's disingenuous. True. But if the current way of doing it is good enough, there's no incentive to change. There's little point in making browsers implement extra functionality and inventing new mark-up and evangelizing it, only to end up with the same functionality we started with; there has to be more. And the greater the effort involved, the greater the benefit has to be to make it worthwhile. Or suppose the HTML version links to the PDF version. I wish to download the PDF on a remote server, and happen to have an SSH session open to it. So I right-click on the link in the HTML version I'm looking at, choose 'Copy Link Location' from the menu, and in the remote shell type wget then paste in the copied link. If the link explicitly has ?type=PDF in the URL, I get what I want; if the format is specified out of the URL then I've just downloaded the wrong thing. Here you go: wget example.com/report --header='Accept: application/pdf' Typing that would require my knowing that the URL of the PDF also serves other formats. But, moreover, it requires typing. Currently the URL can be pasted in, the text that the browser copied to the clipboard. There's no way that my browser's 'Copy Link URL' function is going to put on the clipboard the exact syntax of wget command-line options. Having to type that lot in massively increases the effort in this task -- even if I can type the relevant media type in from the top of my head, without needing to look it up. Or what about if I wanted to mail somebody pointing out a discrepancy between two versions of the report, and wished to link to both of them. That's tricky if they have the same URL. 
Possibly I could do it like you have with the wget command-line above, but that requires me knowing which browsers my audience use and the precise syntax for them. Smylers
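As a rough illustration of the bookmarking point debated above, the sketch below models a bookmark that stores the whole request state (URI, HTTP method, headers) and compares it with what a URL-only store would keep. The Bookmark class and the url_only function are hypothetical names invented for this sketch; no browser or bookmarking tool is claimed to work this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bookmark:
    """Hypothetical record of the full request state behind a bookmark."""
    url: str
    method: str = "GET"
    headers: tuple = ()  # (name, value) pairs; a tuple keeps the record hashable


def url_only(bookmark: Bookmark) -> str:
    """What a URL-only store (a database URL column, a pasted link) would keep."""
    return bookmark.url


html = Bookmark("http://example.com/report", headers=(("Accept", "text/html"),))
pdf = Bookmark("http://example.com/report", headers=(("Accept", "application/pdf"),))

full_state_entries = {html, pdf}                     # two distinct bookmarks
url_only_entries = {url_only(html), url_only(pdf)}   # collapses to one entry
```

The two versions stay distinct only while every tool in the chain stores the extra fields; as soon as one of them keeps just the URL, the HTML and PDF bookmarks become indistinguishable.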
Re: [whatwg] Deprecating <small>, <b>?
Pentasis writes: In printed material users are typically given no out-of-band information about the semantics of the typesetting. However, smaller things are less noticeable, and it's generally accepted that the author of the document wishes the reader to pay less attention to them than more prominent things. That works fine with <small>. User-agents which can't literally render smaller fonts can choose alternative mechanisms for denoting lower importance to users. There's no chance of doing this with <span class="legalese"> or similar, since user-agents are unaware of the semantic they should be conveying. 3) <small> is a styling element, it has zero semantic meaning, so it does not belong inside HTML. Denoting particular text as being of lesser importance is quite different from choosing the overall base font size (or indeed typeface) for the page, or the colour of links or headings -- that's merely expressing a preference for how graphical user-agents should render particular semantics, but the semantics themselves are conveyed to _all_ user-agents (<a>, <h3>, etc). 4) <b>Siemens</b> also does not tell me anything about the semantics. Is it used as a name, a brand, a foreign word? etc. I cannot get that information from looking at the <b> element. Indeed you can't. And nor can you if you were reading printed text with some words in bold. However, you would appreciate that the author had wished for some particular words to stand out from the surrounding text. Perhaps you then notice it's being done for all brand names? Or that the emboldened words spell out a secret message? However, you can only notice this if the words have been distinguished in some way. With <b>, all user-agents can choose to convey to users that those words are special. You cannot make a 100% comparison between printed and web-published styling and semantics. Apart from the obvious visual difference, we are talking about the ability here to convey semantics other than just visual. Indeed. 
For example to aid machine-readability but far more importantly, Assistive Technologies. If markup in web-publishing was meant to be just for visual feedback, we would only need 1 block and one inline element as we can do anything with just classes and CSS in that respect. But that would be using a styling technology (and an optional one at that) for conveying meaning. Anybody without the CSS -- or with a non-graphical user-agent, which can't render the CSS rules to the user -- will be missing out. Such users wouldn't be able to distinguish <span class="legalese"> or even <span class="secret_message"> from the surrounding text. Whereas if <small> or <b> are used, all user-agents can do _something_ with them. So I completely agree with what you say. Smylers
Re: [whatwg] [rest-discuss] HTML5 and RESTful HTTP in browsers
want a shell in that directory anyway to work with the downloaded file; and of course I can enter this without having to type the path in full) then type a 4-letter command name and paste the URL -- very efficient, and for me involving far less typing or clicking than navigating a graphical 'Save As' dialogue box. So wget currently _is_ convenient for me. And the defence of a change which will make this significantly less convenient is that it was my fault for finding it so convenient in the first place?! Or what about if I wanted to mail somebody pointing out a discrepancy between two versions of the report, and wished to link to both of them. That's tricky if they have the same URL. Possibly I could do it like you have with the wget command-line above, but that requires me knowing which browsers my audience use and the precise syntax for them. - separate versions are separate resources, not separate content types. That has nothing to do with conneg. I was meaning a difference between the HTML version and the PDF version of the same content (or at least what is supposed to be the same content) -- how would I link to them? Smylers
Re: [whatwg] Dealing with UI redress vulnerabilities inherent to the current web
Elliotte Harold writes: Smylers wrote: That's a sometimes convenient feature for site developers, but there's nothing you can do with content loaded from two sites you can't do with content loaded from one. Here's some I can think of: * Many sites are funded by displaying adverts from a third-party service which picks appropriate ads for the current user-page combination. Serve ads from the host site. That's fine if the host site can do it. But often the point of subcontracting something to a third party is through lack of ability to do it oneself. A website on a particular topic may be financially viable by running third-party-provided ads; that requires merely a standard template in each page, a one-off cost which can be forgotten about. Whereas running ads themselves would take ongoing effort by the site (uploading them, making the pages link to them). That may reduce the time those running the site have available to work on the site's content. Or they may have to pay somebody else to set up the ads. Either of which could make the site no longer financially viable. Further, I don't see how users can be tracked across multiple sites. This is useful to serve users a variety of different ads, rather than the same one lots of times, even as they read multiple sites which all use the same third party ad service. That's a feature, not a bug. Or another way: users shouldn't be able to be tracked across sites. That they are is a bug, not a feature. If I'm going to see ads, I'd rather get different ads than repeatedly encounter the same ones. * Third party traffic analysis services, ranging from simple image hit-counters to something like Google Analytics, require being part of a page's loading. Not all such services do require this though. That's true. I don't have time to respond in detail to each of the valid points you raise. I may later. No, I understand your general point, and I'm sure for each of them I could come up with an alternative technical implementation. 
It's just that I think such alternatives require sites to make other changes -- changes to their business models, such as switching to a more advanced hosting package, or having to perform in-house tasks which were previously outsourced. And of course they break the business models of some of the outsourcers, whose products could no longer be used. If we break these things such that third party content is no longer the simplest solution that could possibly work, then developers and sites will move on to the next simplest solution. Or they won't bother, and sites that currently exist will give up. Addressing the root cause will cause pain because a lot of systems you mention will have to be rewritten to work in the new world. Indeed; that's my major concern here. HTML in general is sub-optimal. XHTML 2 tried the approach of accepting transition pain in order to move to a saner place, but that doesn't seem to have gained traction. I'm simply not persuaded that those who would have to pay the cost of that pain will accept it. The first time a browser is released that implements this idea, many existing sites will fail. Users will blame the browser (we know that because Mozilla-based browsers got blamed for things they did correctly but differently from IE, such as alt text not being in a tooltip). At that point, all other browsers (and older releases of that browser) will still work. Why would users bother switching? And if users don't switch, why would sites bother updating themselves? Nor do I see that it is within the remit of HTML 5 to outlaw certain business models that were permitted under HTML 4. We're saying that such sites aren't welcome on the web? What if such sites are served with valid HTML 4; should they still have HTML 5's new rules applied to them? So be it. For any suggestion on this mailing list its proponents could dismiss its costs with 'so be it'. That doesn't in any way justify why those costs are worthwhile. 
You appear to be arguing here that the costs are worth it no matter how high they are. Simon
Re: [whatwg] Dealing with UI redress vulnerabilities inherent to the current web
Elliotte Harold writes: Large content providers already move their content closer to the end user. They do this by physically locating boxes with the same host name and fancy DNS and router tricks. Yup. But those are _large_ content providers. We shouldn't design HTML 5 such that smaller players are at a disadvantage (because they aren't big enough to warrant doing such things themselves, and they can't outsource things to a third party because we've blocked such services from working). Smylers
Re: [whatwg] Dealing with UI redress vulnerabilities inherent to the current web
in HTML messages, but that doesn't always work well, and it can significantly increase the size of the messages (which can in turn thwart the sender's ability to mail all of its subscribers in a timely fashion). * A Google Cache view of a webpage can be useful. It links to images and style-sheets on the live website, hence on a different host. Clearly Google could also cache all the media, but why should they have to? The service is useful as it is. * Many large sites serve images from a different domain from other content, or from multiple different domains, to bypass browser limits on multiple simultaneous connections to a single host. Forcing all the media to a single host would make such sites take longer to load. Getting rid of the browser throttling risks browsers overpowering smaller sites which can't cope with so many connections. * Pages that just happen to link to images on other sites, for no particularly good reason, would break. Such sites could be re-implemented not to do this (without suffering from any of the above problems). The only problem is that they would have to be re-implemented. Many webpages have been abandoned years ago, yet they still have utility. Given that images are a basic feature which people have been relying on for so long, this would break many, many sites. It isn't reasonable to expect them all to undergo active maintenance. I challenge anyone to demonstrate a single multisite web page that cannot be reproduced as a single-site page. Do not confuse details of implementation with necessity. Just because we sometimes put images, ads, video, tracking scripts, and such on different sites doesn't mean we have to. The web would be far more secure if we locked this down, and simply made multisite pages impossible. I agree it would be more secure. But I don't see how we get there from here, even over multiple years. 
The first browser to implement such a restriction would break so many sites that its users would all switch to a browser that kept the web working as it has till now. And, knowing that, why would website authors bother to make the first move? Smylers
Re: [whatwg] RDFa Features (was: RDFa Problem Statement)
Manu Sporny writes: Ian Hickson wrote: there are a number of technical merits that speak in favor of RDFa over Microformats (fully qualified vocabulary terms Why is this better? Emulated-namespace/Pseudo-namespace (EN/PN) vocabulary terms have been mentioned on this list during the various RDFa discussions over the past week. ... This approach has been rejected by the Microformats community because it is believed that namespaces are more difficult for webpage authors to learn and are thus best if avoided due to the limited scope of Microformats. Hi Manu. Do you disagree with the Microformats community's belief about namespaces being more difficult, or do you think they are more difficult but that this doesn't matter? The approach was rejected by the RDFa community because it doesn't follow core W3C TAG practices and instead invents a new method of specifying resources on the web that are not dereference-able. In short - it re-invents the URI wheel unnecessarily. It would be unnecessary if those kinds of unique prefixes have no advantage over URIs. However you later say: why we have this URL short-hand (aka: CURIEs) in the first place. It is a feature that helps web authors and others that are writing this stuff by hand to refer to long URLs in an easy way So that is one disadvantage of URIs: they are long. In fact they are so long that people have gone to the bother of inventing additional syntax to avoid having to write them out. The other advantage of unique prefixes over URIs is the one you mention: they are not dereferenceable. As has been mentioned on this list, that means nobody (human or system) will attempt to dereference them, either by mistake or in the hope of finding something there. This avoids confusing learners (who on seeing a URI like those you use in examples may think that content it links to is relevant) and avoids unnecessary server load. (And if somebody wants information about a uniquely prefixed name, using a web search engine will find it.) 
So unique prefixes have 2 advantages over URIs; therefore they cannot be dismissed as unnecessary merely because URIs exist. Of course those advantages don't necessarily apply to all users in all situations; there may be users who don't find the above advantageous, and prefer URIs for other reasons. That's OK, because such users can still choose to use a URI as their unique prefix. (And there can be a rule which says you are only allowed to have something which is syntactically a URL as a unique prefix if you own that URL.) That suggests that giving users the freedom to use either URIs or any other prefixes of their choice is superior to forcing them to use URIs, surely? Smylers
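For readers who haven't met the URL short-hand mentioned above, here is a minimal sketch of how a CURIE-style prefixed name expands to a full URI. The function name and the prefix table are assumptions made for this illustration; a real RDFa processor takes its prefix mappings from the document and handles more cases than this.

```python
from typing import Mapping

# Illustrative prefix table; in RDFa the mappings are declared in the document.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}


def expand_curie(curie: str, prefixes: Mapping[str, str]) -> str:
    """Hypothetical helper: expand 'prefix:reference' into a full URI."""
    prefix, sep, reference = curie.partition(":")
    # Reject names with no colon or with a prefix that was never declared.
    if not sep or prefix not in prefixes:
        raise ValueError(f"no mapping declared for {curie!r}")
    return prefixes[prefix] + reference


full_uri = expand_curie("foaf:name", PREFIXES)
```

The short form is what an author types by hand; the expansion is the long URI it stands for, which is the length trade-off discussed in the message above.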