Re: [whatwg] Spellchecking mark III
Apologies in advance if this covers old ground, it appears I missed some e-mails in the last round of e-mails about this topic.

On Tue, 30 Dec 2008, Anne van Kesteren wrote:
> Opera wants to support this feature as well in due course, so I don't think we would mind it being added to HTML5. Does it being in Chrome mean it is also WebKit? If so, together with Firefox support, seems like a compelling reason to add the feature.

On Tue, 30 Dec 2008, Maciej Stachowiak wrote:
> The Google Chrome team has not submitted patches for such a feature to WebKit. I am not sure if they plan to eventually submit it to mainline WebKit. In fact, this is the first I've heard about Chrome having such an extension. It's not clear to me whether the feature is useful without seeing some motivating examples. WebKit by default spellchecks (and grammar checks) all editable parts of the document, and it is not obvious to me why one would want to force it off for particular form controls or editable HTML areas.

On Tue, 30 Dec 2008, Tab Atkins Jr. wrote:
> Agreed. This feature lives purely in user-space. It can be convenient for a user to be able to turn off spellchecking globally, or perhaps even locally (FF exposes this currently through a right-click option on editable areas), but I cannot see any reason for an author to have control over this. If I want to spellcheck an area, I want to spellcheck it. If I don't, I don't.

On Tue, 30 Dec 2008, Kornel Lesiński wrote:
> It's useful for fields that contain non-textual content, e.g. product ID, license plate number, CAPTCHA answer, etc. Browser would mark these as misspelt, which might be confusing or at least distracting.

[snip more discussion back and forth about whether it's a good idea or not, or whether we could come up with some heuristics for it instead]

Based on the interest (not uniform interest, but interest nonetheless) on this topic, I've left the feature in the spec.
I don't think that heuristics would work -- in practice, little distinguishes the subject line from the To: line in GMail, for instance, but one wants spell checking and the other does not.

On Wed, 31 Dec 2008, Maciej Stachowiak wrote:
> The proposal Hixie linked seems way overengineered for this purpose.

Yeah, it's certainly not the simplest thing that could have been invented, I'll give you that.

> First, it allows spellchecking to be explicitly turned on, potentially overriding normal defaults, but that seems wrong; an input type=email should never spellcheck regardless of what the page author says.

The user agent is allowed to override the author here, if desired. The applicability to input type=email fields is mostly just a side-effect of the attribute applying to everything, which is because we want it to apply to contentEditable. The true state is so that subparts of contentEditable fields can have checking enabled when outer parts have it disabled.

> I can't see any valid use case for the author turning spellchecking on regardless of UA defaults or user preferences. Second, it allows spellchecking to be controlled at a finer granularity than editability, for which again I think there is no valid use case. Both of these aspects make the feature more complicated to implement and harder to understand, compared to just having a way to only disable spellchecking at the same granularity as editing.

In contentEditable, it's easy to imagine that some parts shouldn't be spellchecked when others should, e.g. the editor might introduce a URL and not want that checked.

On Wed, 31 Dec 2008, Kornel Lesiński wrote:
> I don't like current proposal either, because true/false value is inconsistent with other boolean attributes in HTML.

It's consistent with contentEditable, which it's intended to be used with.

> IMHO it should be nospellcheck=nospellcheck (which also solves problem of forcing spellchecking where it doesn't make sense).

That's a pretty ugly attribute name, though.
On Thu, 1 Jan 2009, Robert O'Callahan wrote:
> A use case is editable program code, where spellchecking is disabled, but where spellchecking is enabled inside comments. Maybe that sounds a little far-fetched for today's Web applications, but some IDEs (e.g. Eclipse) support this so it seems like something we'd want in the future.

BeSpin, for instance, might want this, if they ever switch from canvas to contentEditable.

On Wed, 31 Dec 2008, Maciej Stachowiak wrote:
> So I don't think this makes for a very compelling use case. It's like arguing for a page layout feature based on something only WordPerfect does.

I agree that it seems a bit overpowerful. Experience from Gecko suggests it's not all that bad though.

On Sat, 14 Feb 2009, Kristof Zelechovski wrote:
> The following sentences are *commands* that refer to browser actions: Let automatic completion be turned _on_. (command) Let spell checking be turned
Re: [whatwg] Spellchecking mark III
On Thu, 12 Feb 2009, Kristof Zelechovski wrote:
> Regarding http://html5.org/tools/web-apps-tracker?from=2800to=2801, my requests:
> 1. Change the literals true/false to on/off, leaving the DOM values Boolean.

There are three of these attributes so far:

   autocomplete = on/off
   contenteditable = true/false
   draggable = true/false

I used true/false for spellcheck since it had slightly more other attributes doing the same thing. Also, it's been implemented twice now, so using other keywords is a problem.

> 2. Check the spelling of the passage (asits!) :0)

Fixed.

> 3. Say that the default behavior for BODY is on and the default behavior for INPUT[type=text] is off.

The default behavior is user-agent-dependent. This is intentional since different users may have different needs.

> 4. (I understand that it is implicit that this SHOULD indicate does not make tiny clients that do not have the resources non-compliant?)

Correct.

> Stretching it a bit, a user's language always matches the site's, otherwise the user would not be able to submit to the site anything that makes sense, except when the site is a gateway for submissions to an uninvolved third party, in which case said submissions should be tagged with the language of submission anyway (IMHO).

On Thu, 12 Feb 2009, Bil Corry wrote:
> Let me give you an example where this isn't true. I'm in the United States and I do contract work for a company in Germany. At the German company, they have an internal bug tracker for their intranet applications. Usually the bug descriptions are written in German, except mine, which are in English. So they will submit bugs in both German and in English, depending on who is taking care of the issue. How do you envision the UA will determine which language the user is writing in? And what happens when the user submits both German AND English, for two audiences?
On Thu, 12 Feb 2009, Kristof Zelechovski wrote: The server has two ways of knowing the user's preferred language: the user's preferences and the browser settings, in that order. Submitting in two languages usually needs two controls, one for English and one for German, with appropriate markup. The server must be prepared to handle this use case. On Thu, 12 Feb 2009, Aryeh Gregor wrote: Both of which are often wrong. Users may be multilingual, and multiple users may use the same computer. On the forum I administer, I post almost exclusively in English. However, sometimes I find occasion to write a post partly or wholly in Hebrew. How is the site supposed to know when I'll decide to do that before I even start typing the post? How can the site ever be sure what language the user will type until he actually starts typing? The server might be able to make an educated guess as to what language will be entered, but so can the browser. And the browser is in a *much* better position to check that guess, because it has access in real time to the actual text the user is typing, plus the user interface language, and -- of course -- any lang= or xml:lang= attributes specified in the HTML. Ergo, the logic should be left up to the browser. On Thu, 12 Feb 2009, Kristof Zelechovski wrote: The language attribute can be changed at run time if needed. It requires an additional event that can be called langmismatch. Of course, a more traditional selector is also a solution. If the site is primary English, with Hebrew fragments here and there, it is not much harm that the fragments are considered spelling errors (although, in the case of English/Hebrew bilingualism, it is unlikely because the character set is different). In short, the user agent is allowed to use whatever AI it is equipped with. 
> Markup for German AND English submissions at the same time, as per your request:
> <LABEL LANG=de>Inhalt: <TEXTAREA NAME=INHALT></TEXTAREA></LABEL>
> <LABEL LANG=de>Contents: <TEXTAREA NAME=CONTENTS></TEXTAREA></LABEL>

On Thu, 12 Feb 2009, Bil Corry wrote:
> In my case, we have a single field, bug description that may contain both English and German. And in some cases, even a pure German bug report may reference the English form fields, such as: Legen Sie City vor Postal Code In that case, there is no way for a UA or Server to auto-determine the language, even if you're aware the user speaks both German and English. My suggestion is to leave the lang attribute out of the spec, and let the UA handle it as it wants.

On Thu, 12 Feb 2009, Křištof Želechovski wrote:
> Having interjected words marked as spelling errors is not a failure. The same phenomenon occurs with proper names and you cannot help that. The UI you described is inconsistent and it should be fixed. The control for German should be labeled Fehlerbeſchreibung or whatever.

On Thu, 12 Feb 2009, Kristof Zelechovski wrote:
> I do not know much about UI standards but the rule that the answer
Re: [whatwg] Spellchecking mark III
The discussion on spellcheck= focused on two ideas; using spellcheck= mostly as specced here: http://damowmow.com/playground/spellcheck.txt ...and doing something with lang=. The idea of using lang= had problems that were pointed out by several people, most notably, the issue that the user's language doesn't always match the site's. I think this makes it inappropriate for this use. I have added spellcheck= to the spec. If there's anything in the feedback that I missed, please let me know. I read every e-mail but there didn't seem to be anything specific that I should comment on. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
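As an editorial illustration of the behaviour discussed in this thread (a true state so that subparts of a contentEditable region can have checking enabled when outer parts have it disabled, a user-agent-dependent default, and the user agent being allowed to override the author), here is a minimal Python sketch. The `Node` class, the `effective_spellcheck` function and its parameters are hypothetical names invented for this example; they are not a DOM API and this is not the algorithm from the spec, only one plausible reading of it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for an element; not a real DOM interface."""
    name: str
    spellcheck: Optional[bool] = None  # None means the attribute is absent
    parent: Optional["Node"] = None


def effective_spellcheck(node: Node, ua_default: bool,
                         user_override: Optional[bool] = None) -> bool:
    """Resolve whether a node should be checked (assumed resolution order)."""
    # Assumption: an explicit user preference beats anything the author says.
    if user_override is not None:
        return user_override
    # Otherwise the nearest element carrying an explicit state wins.
    current: Optional[Node] = node
    while current is not None:
        if current.spellcheck is not None:
            return current.spellcheck
        current = current.parent
    # Nothing specified anywhere: fall back to the UA-dependent default.
    return ua_default


# The code-editor use case: checking off for code, on inside comments.
editor = Node("div", spellcheck=False)
code_line = Node("span", parent=editor)
comment = Node("span", spellcheck=True, parent=editor)
```

With these nodes, `effective_spellcheck(code_line, ua_default=True)` resolves to the editor's false state, while the comment span resolves to its own true state unless a user override says otherwise.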
Re: [whatwg] Spellchecking mark III
Regarding http://html5.org/tools/web-apps-tracker?from=2800to=2801, my requests: 1. Change the literals true/false to on/off, leaving the DOM values Boolean. 2. Check the spelling of the passage (asits!) :0) 3. Say that the default behavior for BODY is on and the default behavior for INPUT[type=text] is off. 4. (I understand that it is implicit that this SHOULD indicate does not make tiny clients that do not have the resources non-compliant?) Stretching it a bit, a user's language always matches the site's, otherwise the user would not be able to submit to the site anything that makes sense, except when the site is a gateway for submissions to an uninvolved third party, in which case said submissions should be tagged with the language of submission anyway (IMHO). Best regards, Chris
Re: [whatwg] Spellchecking mark III
Kristof Zelechovski wrote on 2/12/2009 6:24 AM: Stretching it a bit, a user's language always matches the site's, otherwise the user would not be able to submit to the site anything that makes sense, except when the site is a gateway for submissions to an uninvolved third party in which case said submissions should be tagged with the language of submission anyway (IMHO). Let me give you an example where this isn't true. I'm in the United States and I do contract work for a company in Germany. At the German company, they have an internal bug tracker for their intranet applications. Usually the bug descriptions are written in German, except mine, which are in English. So they will submit bugs in both German and in English, depending on who is taking care of the issue. How do you envision the UA will determine which language the user is writing in? And what happens when the user submits both German AND English, for two audiences? - Bil
Re: [whatwg] Spellchecking mark III
The server has two ways of knowing the user's preferred language: the user's preferences and the browser settings, in that order. Submitting in two languages usually needs two controls, one for English and one for German, with appropriate markup. The server must be prepared to handle this use case. HTH, Chris
Re: [whatwg] Spellchecking mark III
On Thu, Feb 12, 2009 at 8:57 AM, Kristof Zelechovski giecr...@stegny.2a.pl wrote: The server has two ways of knowing the user's preferred language: the user's preferences and the browser settings, in that order. Both of which are often wrong. Users may be multilingual, and multiple users may use the same computer. On the forum I administer, I post almost exclusively in English. However, sometimes I find occasion to write a post partly or wholly in Hebrew. How is the site supposed to know when I'll decide to do that before I even start typing the post? How can the site ever be sure what language the user will type until he actually starts typing? The server might be able to make an educated guess as to what language will be entered, but so can the browser. And the browser is in a *much* better position to check that guess, because it has access in real time to the actual text the user is typing, plus the user interface language, and -- of course -- any lang= or xml:lang= attributes specified in the HTML. Ergo, the logic should be left up to the browser. Submitting in two languages usually needs two controls, one for English and one for German, with appropriate markup. The server must be prepared to handle this use case. I don't understand what you mean here.
Re: [whatwg] Spellchecking mark III
The language attribute can be changed at run time if needed. It requires an additional event that can be called langmismatch. Of course, a more traditional selector is also a solution. If the site is primarily English, with Hebrew fragments here and there, it is not much harm that the fragments are considered spelling errors (although, in the case of English/Hebrew bilingualism, it is unlikely because the character set is different). In short, the user agent is allowed to use whatever AI it is equipped with.

Markup for German AND English submissions at the same time, as per your request:
<LABEL LANG=de>Inhalt: <TEXTAREA NAME=INHALT></TEXTAREA></LABEL>
<LABEL LANG=de>Contents: <TEXTAREA NAME=CONTENTS></TEXTAREA></LABEL>

HTH, Chris
Re: [whatwg] Spellchecking mark III
Kristof Zelechovski wrote on 2/12/2009 9:05 AM:
> Markup for German AND English submissions at the same time, as per your request:
> <LABEL LANG=de>Inhalt: <TEXTAREA NAME=INHALT></TEXTAREA></LABEL>
> <LABEL LANG=de>Contents: <TEXTAREA NAME=CONTENTS></TEXTAREA></LABEL>

In my case, we have a single field, bug description that may contain both English and German. And in some cases, even a pure German bug report may reference the English form fields, such as: Legen Sie City vor Postal Code In that case, there is no way for a UA or Server to auto-determine the language, even if you're aware the user speaks both German and English. My suggestion is to leave the lang attribute out of the spec, and let the UA handle it as it wants.

- Bil
Re: [whatwg] Spellchecking mark III
Having interjected words marked as spelling errors is not a failure. The same phenomenon occurs with proper names and you cannot help that. The UI you described is inconsistent and it should be fixed. The control for German should be labeled Fehlerbeſchreibung or whatever. Best regards, Chris
Re: [whatwg] Spellchecking mark III
Křištof Želechovski wrote on 2/12/2009 10:15 AM: The UI you described is inconsistent and it should be fixed. Inconsistent with which UI standard? - Bil
Re: [whatwg] Spellchecking mark III
I do not know much about UI standards but the rule that the answer should be formulated in the language of the question is rather straightforward. It is just common sense. Exceptions are questions like How is that in German?. Chris
Re: [whatwg] Spellchecking mark III
Kristof Zelechovski wrote on 2/12/2009 11:06 AM: I do not know much about UI standards but the rule that the answer should be formulated in the language of the question is rather straightforward. It is just common sense. Exceptions are questions like How is that in German?. No one can control the language a user will choose to use in a textarea, regardless of the label used to describe it. Providing a localized textarea for every language might increase the odds of the user using the language the server prefers, but there is no guarantee. And I'm unclear what problem that would ultimately solve. - Bil
Re: [whatwg] Spellchecking mark III
The majority of users will answer the question in the language of the question, this is the normal reaction. Of course there is no guarantee but the odds of getting the expected result are high. Assuming that the user's input will actually be read by somebody, providing proper markup will help the readers to get something they are able to read. Chris
Re: [whatwg] Spellchecking mark III
On Wed, Jan 28, 2009 at 2:35 AM, Křištof Želechovski k...@mimuw.edu.plwrote: *No, the _original_ use was to turn it on on fields where it would otherwise have been on. * I do not understand. If spell checking would be on, why turn it on explicitly? I mistyped. The last word should have been off. If the control is not expected to contain a private language, it should be subject to spell checking. This thread has already had multiple examples of cases where this is untrue. Spelling quizzes, address fields, etc. And even if it were true, it's not the way browsers behave today (e.g. Firefox does not spellcheck single-line fields, precisely to avoid a lot of cases like this), and changing those defaults to be something non-annoying, using complex heuristics, is significantly harder (in terms of your time/money cost below) compared to supporting the attribute. Avoiding an additional attribute is a gain, Why? Because adding an additional attribute costs time and money. To whom? What tradeoff are you making? Keying spellchecking off language support costs engineering time too, for the UA. And for a web author. All changes have costs. The point here seems like a vague principle rather than a specific application. Which no one will ever use, because users aren't going to take the trouble to declare such a thing when human recipients can just _read the text_. After all, WE have built-in language detectors in our heads. We disagree here but further discussion is void unless you have the resources necessary to perform an investigation of the subject. If you need data to prove that people will not make the effort to explicitly tell recipients what languages their messages are in, I offer you the entire history of written communication, where people don't say By the way this is in English! at the top of each letter. Users entering text in a foreign language cause trouble to the forum moderators who have to discipline them. 
Thus, the software could accommodate to the needs of the moderator, so that the poster gets warned before posting, not admonished afterwards. This is more convenient and less work for everyone. Providing an indication what language is recommended by forum users is good, because most users would take that into account (for fear of getting plonked, if not for good manners). How is this relevant to a discussion about spellchecking? If you want UA-based language detection facilities that are, say, accessible from JS, that may be a reasonable request, but like much of this discussion, it seems tangential. PK
Re: [whatwg] Spellchecking mark III
On Wed, Jan 28, 2009 at 10:27 AM, Křištof Želechovski k...@mimuw.edu.plwrote: Spelling quizzes are an artificial example; they are not interesting once spell checking is commonly available because the user can cheat by temporarily using another control that is being checked. They can cheat today by pasting something into Microsoft Word. Or looking it up in a dictionary. That doesn't mean there's no value in this. There are many internet quizzes where you can cheat by looking answers up with a search engine, but they're still fun and wildly popular. Your argument that because people could cheat no author would, or should, want to write such a thing does not seem supported by evidence. Address fields contain data in a technical language, not in a natural language. Of course, the browser can support technical languages by checking the syntax and validity of data as well (e.g. matching the zip code against the place using an external database). This seems rather far afield from spellchecking. There's a whole section of the spec (forms) that deals with validation of various kinds of form input. It's separate from spellchecking for a reason: the algorithms are completely different (and a potential rabbit-hole). From my perspective as a UA author that actually writes the code to do this stuff, you're conflating too many kinds of input validation here. People do not say this is English but machines do (Content-Language MIME header). And that header content is not generally set by explicit user action. In fact, it's often not set at all, or set incorrectly. Hoping that this will change seems naive. I want incorrect input, including input in an unexpected language, to be marked as such by the spell checker. I already said that having a general-purpose, JS-accessible language detector might be a good thing. It would certainly be a necessary thing for this request. 
Once one had it, the request would be better addressable without touching the behavior of the browser's spellchecker at all, because the author could use the output of the language detector to display any message or take any action he desired, rather than simply having the UA draw a line under every word and thus look completely broken. What you want is better accomplished by means other than what you propose, and what you propose does not address the use cases for the spellcheck attribute. I'm not sure we can reach further agreement, so I leave this subthread in Hixie's hands. PK
Re: [whatwg] Spellchecking mark III
2009/1/27 Křištof Želechovski k...@mimuw.edu.pl
> The original use of the spellcheck attribute was to switch spell checking off

No, the _original_ use was to turn it on on fields where it would otherwise have been on. (I think we both believe it should generally be on).

> Using a private language for the control would do the trick equally well, without introducing a new attribute.

It wouldn't do it equally well, since semantically, it would mean "this is of language private", which will be strictly inaccurate.

> Avoiding an additional attribute is a gain,

Why?

> If the language detection libraries are as good as you claim, why is Firefox unable to use them in a way that is not annoying?

Because no one has had the time or energy to devote to this? I have worked full-time on browsers for a number of years now and have never seen any team with the time to fix all the things that could or should be fixed.

> As I have already mentioned, GMail should provide an option for the sender to inform the recipient about the language used in the message, not for the client-side spell checker, but for the recipient.

Which no one will ever use, because users aren't going to take the trouble to declare such a thing when human recipients can just _read the text_. After all, WE have built-in language detectors in our heads.

> We can drop the suggestion language=auto if you wish, but it would be an explicit way of informing the user that he is allowed to enter text in any language he pleases.

As if users aren't going to just enter whatever language they please into any field they wish? We design software that has to accommodate people, not the other way around :)

I have no idea whether there are better things web apps and UAs can do w.r.t. communicating what languages are used where. All I know is that both in the abstract and practically, whether I want a field spellchecked by default is a distinct concern from which language(s) would be used to spellcheck it.
Therefore I continue to see the spellcheck attribute as distinct from (though possibly complementary to) language. PK
Re: [whatwg] Spellchecking mark III
2009/1/26 Křištof Želechovski k...@mimuw.edu.pl
> Q: Should the localization influence the spell checking mechanism?
> A: Definitely, since the user is likely to write most messages in his preferred UI language.

Which is why this is a perfectly valid input for the heuristic the UA uses to determine the checking language.

> Q: Is GMail a use case for having spell check without specifying a language to check against?
> A: No, it is not.

You don't provide any reason why not. "The user is likely to write most messages in his preferred UI language" (which is not true of all users, but leaving that aside) does not imply "the user will write all messages exclusively in his preferred UI language". Therefore gmail cannot (correctly) specify the spellchecking language of editable fields. Therefore the UA must decide. Unless the probable input language of a particular field differs from that of the rest of the page, there's no reason for gmail to specify the probable input language of that field. There is no benefit to conflating this concept with "should this field be spellchecked".

> Q: In case when the user decides to use another language, is the user agent free to detect it?
> A: Yes, it is, unless the language specified is private, which means the field was not intended for checking.

Again, this is needless conflation. You gain nothing, and lose both clarity and flexibility, by mapping "don't spellcheck" to "specify the language as private" in this way. In terms of the semantics of the page, this is extremely confusing, since whether a field should be spellchecked and what language it's in are nearly orthogonal concepts.

> Q: When the language recognition technology advances to an acceptable state, will it be possible to extend the language attribute to explicitly request automatic identification of the language?
> A: Yes, it is. Just specify lang=auto or whatever is agreed upon.

There is no benefit to forcing authors to say lang=auto. What have you gained? What if they _don't_ say this?
(The HTML5 spec must still say what the UA behavior is.) Language detection libraries today are already extremely good, far more reliable than anything explicitly set ahead of time by authors _or_ users. Unless I am completely misunderstanding you, I think your suggestions fail to solve the original use cases for the spellcheck attribute, add needless burden on web authors, and would be completely ignored by UAs who wished to provide a good user experience. PK
Re: [whatwg] Spellchecking mark III
On Sun, Jan 25, 2009 at 10:52 AM, Křištof Želechovski kri...@wp.pl wrote:
> Gmail can use
> 1. the localisation preferences chosen by the user in GMail configuration,
> 2. the localisation preferences chosen by the user in the browser configuration
> to determine what language the user is likely to use in the subject field. (Generally, it should be the same language as the Subject label is in.) If the user incidentally sends a message in another language, the Web browser can recognize the language after the subject is typed, as described before.

But your original claim was that the web author, not the UA, should have the ability to force a particular language for spellchecking -- and that the spellcheck attribute was worthless outside this, as what authors needed was a way to force the spellcheck _language_, not simply its presence. Now you seem to be reversing your comments and indicating that perhaps the UA may end up knowing better what language to use (e.g. because the user types in another language), which is what I was saying all along.

And none of this gives any support for the idea that spellcheck as an attribute is not useful for gmail! Why should gmail have to try and guess what language the user will be typing emails in? Isn't it instead desirable to tell the UA "if you can figure out the right language here, then go ahead and spellcheck this field" and leave everything else in the hands of the UA?

PK
Re: [whatwg] Spellchecking mark III
On Wed, Jan 21, 2009 at 4:51 PM, Aryeh Gregor simetrical+...@gmail.com wrote: In practice, I think the only way to avoid this problem is for browsers to implement content-sniffing techniques of some kind to figure out the language, at least per field but ideally on a word-by-word basis. If the browser is set to spellcheck in English but you start putting in lots of non-Latin characters and every word is therefore misspelled, the browser should be clever enough to try switching the spellcheck language, or at least disabling spellcheck for words that can't possibly be from the language it's checking against. More refined heuristics could detect even subtle differences, like between British and American English, and remember for next time which one the user usually types in. this is approximately what I'm hoping to see implemented for Firefox. I haven't worked on the spell checking code recently, but it's what I feel is necessary having worked in an organization where the default language and the used language don't match. The result is everyone either ignores or turns off spell checking. I'm hoping to either find someone to implement this, or implement it myself. Either way, with this implemented, my employer would eventually update my coworkers' browsers to such a Firefox, and then I can hope they will get more useful feedback and actually pay attention to their typing. -Yes, I'm aware that this is a pipe dream. I need this dream. None of this needs, or even could effectively use, author intervention: 1) The author cannot know what languages users will want to enter in all cases. I've sometimes found myself writing posts in Hebrew on English-only sites, for instance. 2) The author certainly won't be able to determine the dialect or variant of the language the user will want to use, which is necessary for spellcheck. 3) Authors should not have to add extra markup if it's not really necessary, because in practice, most won't. 
To be as useful as possible, spellcheck should Just Work without explicit author intervention.
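As an editorial illustration of the word-by-word idea above (disabling spellcheck for words that can't possibly be from the language being checked against), here is a minimal Python sketch. It is a deliberately crude heuristic, not what any browser is known to implement: it takes the first word of each character's Unicode name as a script label, and the function names and the `dictionary_script` parameter are assumptions made up for this example.

```python
import unicodedata


def word_script(word: str) -> str:
    """Return a coarse script label for a word, e.g. 'LATIN' or 'HEBREW'.

    The label is the first word of each letter's Unicode character name,
    which is a rough approximation of the script, not a real script lookup.
    """
    scripts = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    if not scripts:
        return "NONE"  # digits or punctuation only
    if len(scripts) == 1:
        return scripts.pop()
    return "MIXED"


def words_to_check(text: str, dictionary_script: str = "LATIN") -> list:
    """Keep only the words that could belong to the active dictionary."""
    return [w for w in text.split() if word_script(w) == dictionary_script]


# A mostly English post with one Hebrew word (written with escapes).
post = "hello \u05e9\u05dc\u05d5\u05dd world"
checked = words_to_check(post)
```

Here the Hebrew word is skipped instead of being underlined as an English misspelling. Telling apart languages that share a script (the British versus American English case mentioned above) needs real language detection and is out of scope for this sketch.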
Re: [whatwg] Spellchecking mark III
Peter Kasting ha scritto: On Wed, Jan 21, 2009 at 7:38 PM, Calogero Alex Baldacchino alex.baldacch...@email.it mailto:alex.baldacch...@email.it wrote: Why not to let the user choose the language, as it happens in word processors? A UA can't choose accurately whether, for instance, color is a correct American English, a wrong British English, or even a correct (truncated) Italian word, while a human can do it better, thus a UA could provide an interface to change the language for a selection spellchecking, or even for each mispelled word, starting from a hint language, which could be the value of an element lang attribute (beside a default value and a user-preference forced one - the latter bypassing any authored value). Also, using the lang attribute value as the start language to check (if not in contrast with a user preference) would allow an interactive interface with a script changing that value according to a user's choice (UAs could also expose a list of supported languages). I'm not sure I fully grasped everything here, but what I did grasp sounds very much like a cross between what Chromium is doing today and what we want to do in the future (I imagine similar things are true for other browser vendors). User specification and page hints are both useful tools for a UA. But I still claim that all of those aspects are outside the scope of the spellcheck attribute, and fall into the realm of things that should not be in the HTML5 spec as they're very much UA-specific behavior. PK Probably. 
However, establishing that the lang attribute is the first-choice language to check (which wouldn't prevent the UA from providing other choices, or just ignoring such behaviour due to a user preference, or using other dictionaries too -- and that might be suggested in a note on usability, I guess) would allow a webapp to emulate that functionality to some extent, just by setting a different value for the lang attribute of a contenteditable box and some of its subregions through a script at the user's whim (that is, let's do it through script until UAs provide a better solution, which could be informed by scripting hacks based on the lang and spellcheck attributes working together at the same granularity). I think that control over the language to check can improve spellchecking at the same granularity as the spellcheck attribute, whereas it can't harm end users more than a wrong assumption about spellchecking. A user would notice wrong checking that doesn't match the language he's using, and could disable it or do whatever else a UA allows him to do (annoying though that is); on the other hand, a user might not notice that spellchecking is disabled in a certain area, and so might not be aware of his errors, unless the UA informed him somehow that spellchecking had been turned off. Therefore, special care by UAs is needed in both cases, yet both features can improve webapps providing a rich and/or specialized editor (such as a code editor, where disabling spell checking except for comments may make sense), so why not consider both of them, since they're related? Also, implementation and usage experience could suggest whether it is worth exposing UAs' supported languages through DOM APIs (e.g. to allow a webapp to create a dynamic list of languages available for checking, to avoid static lists being either incomplete, or too long and possibly including unsupported languages), and this would affect either the Window or the Navigator interface (or something else in HTML5 scope). Everything, IMHO. 
WBR, Alex
Re: [whatwg] Spellchecking mark III
Probably. However, establishing that the lang attribute is the first-choice language to check (which wouldn't prevent the UA from providing other choices, or just ignoring such behaviour due to a user preference, or using other dictionaries too -- and that might be suggested in a note on usability, I guess) would allow a webapp to emulate that functionality to some extent, just by setting a different value for the lang attribute of a contenteditable box and some of its subregions through a script at the user's whim (that is, let's do it through script until UAs provide a better solution, which could be informed by scripting hacks based on the lang and spellcheck attributes working together at the same granularity). I don't think that applications need the ability to precisely control the spell-checking language. The browser knows best which dictionaries are available, and can auto-detect the language based on the user's preferences, the page's language and the text itself. You can expect that browsers will have much more sophisticated and reliable language detection than web apps (that's an area where browsers can freely compete). Many of your suggestions are just implementation details, which HTML shouldn't specify precisely (it could force browsers to use a method that is suboptimal). HTML just needs to offer a reasonable way to implement good heuristics, and I think the existing lang attribute, input types and spellcheck attribute are sufficient. -- regards, Kornel
Re: [whatwg] Spellchecking mark III
Kornel Lesiński wrote: Probably. However, establishing that the lang attribute is the first-choice language to check (which wouldn't prevent the UA from providing other choices, or just ignoring such behaviour due to a user preference, or using other dictionaries too -- and that might be suggested in a note on usability, I guess) would allow a webapp to emulate that functionality to some extent, just by setting a different value for the lang attribute of a contenteditable box and some of its subregions through a script at the user's whim (that is, let's do it through script until UAs provide a better solution, which could be informed by scripting hacks based on the lang and spellcheck attributes working together at the same granularity). I don't think that applications need the ability to precisely control the spell-checking language. The browser knows best which dictionaries are available, and can auto-detect the language based on the user's preferences, the page's language and the text itself. You can expect that browsers will have much more sophisticated and reliable language detection than web apps (that's an area where browsers can freely compete). Browsers can't do better than word processors, which are the state of the art in... word processing. At most, browsers can do as well, and, to some extent, word processors don't use heuristics while you're typing, because no heuristic can guess whether you're *purposely* switching between dialects (such as British and American English) or have just misspelled a word (personally, I dislike even the automatic correction of common mistakes in word processors). Word processors make a choice when you start writing (or before, based on your installation language, for instance), and let you change it for the whole document or for each single word. I don't think any heuristic auto-detection can do better; on the contrary, no language detection at all (with the user's explicit choice) is more reliable than any sophisticated heuristic. 
Turning spell checking on or off makes sense if one can guess how the user agent will behave AND if the user agent can recover from misuse; thus I believe that spellcheck is strictly related to the way a spellchecking language is detected, and is half of the problem of controlling spellchecking. Otherwise, if it's thought that everything should be under the control of the UA, let's state that spellchecking must always be on and be done with it, simply because being annoyed by wrong checking (e.g. because the heuristics fail, but it would be the same for a wrong lang value) is less harmful than believing one is writing correct text while unaware that checking has been disabled by the author without asking one's permission. Yet both the lang and spellcheck attributes can be useful for the purpose of controlling spellchecking and improving a web-based word processor, and in both cases UAs can recover from misuse somehow (e.g. by allowing the user to bypass authors' choices). Moreover, I think that interactive and script-aware UAs should act as a framework for web-based applications, providing as much of the functionality of a client-only application as possible; thus browsers should include new features when possible and reasonable (while trying not to become oversized). I agree that spellchecking is a good feature to support in a browser; I don't see why a web-based rich text editor should be prevented from controlling it on users' behalf, as happens in word processors, given that it's a matter of supporting an existing attribute (lang, which could be specified to trigger UA heuristics by default when unspecified for editable elements) and a new one (spellcheck) in conjunction for this purpose (a list of supported dictionaries would also be useful). 
I also think that features which are not core functionalities for a UA should be provided in a basic version (for general use in web pages) and as building blocks for web applications, not in a complete version under a UA exclusive control (for instance, a UA could allow the user to change the language for some selected text through a context menu option, but the right place for an option allowing a (starting) choice valid for a whole editable element, in a rich text editor, should be the editor interface, which shouldn't be provided by a UA, as a whole or in part, or, if the UA provides it, it should be exposed to any webapp to be customized and enhanced). That's because a specific application can focus on a specific task usability better than its underlying, general purpose framework (like a browser is or should be for a web application). Furthermore, if you agree that a page's language should be used to improve auto-detection, why not to use an element language attribute too? With the benefit that it can be changed dynamically to please the user. Many of your suggestions are just implementation details, which HTML shouldn't specify precisely (it could force browsers to use
Re: [whatwg] Spellchecking mark III
Peter Kasting wrote: 2009/1/20 Mikko Rantalainen mikko.rantalai...@peda.net I agree. I think that specifying the spellcheck attribute would be a mistake. It allows only forcing the automatic spell checking on or off but it doesn't help a bit to allow mixing different languages on a single page. I don't see how the second sentence is an argument for the first. If the browser does not know the language of the content, how on earth is it supposed to *correctly* spellcheck it? I'm daily hitting a situation where the browser is trying to spellcheck content with the incorrect language. I've toggled such automatic spellcheckers off and they will stay off until the correct language is detected. My second sentence was trying to argue that the page author has no business forcing spellchecking on if the page author cannot force the spellchecking language! Especially for a case where the page contains a mix of multiple languages. Just specify that spell checking must follow the content language. How many pages specify the content language? AFAIK the farthest most authors get is to specify the encoding, and even that is frequently done wrong, and browsers have all kinds of crazy heuristics to try and second-guess authors. This seems like it would make spellchecking function very poorly on the web at large, whereas adding the spellcheck attribute at worst would not harm anyone. I'm aware that many web pages do not specify content language. There aren't many web pages forcing the spellchecking on or off, either. Forcing spellchecking on with an incorrect language would harm the user! It really does not make any sense to ever force spellchecking if the language that the spellchecker uses is the incorrect one. The current spellcheck attribute does not define any language and it seems that the page author has no way to know if the spell checking should really be disabled or not. My point is that if the page does not specify the language then the behavior should be explicitly undefined. 
This should not be changed. On the other hand, if the content language is explicitly defined, then the user agent has the required knowledge to decide if the spellchecking should be enabled or disabled. There's no need for the spellcheck attribute. Make specifying the language the *only* accepted method for triggering the spell checking. Specify that any unknown language must not be spellchecked automatically. Then you automatically have a method for forcing the automatic spell checking off and in addition to that you have some incentive to define correct language for the page. If we can persuade content authors to specify the correct content language, I believe that in the future there will be *other* benefits, too. For example, automatic hyphenation would improve typographic quality of web pages but automatic hyphenation is impossible unless you know the language of the content. -- Mikko
Re: [whatwg] Spellchecking mark III
Mikko Rantalainen wrote: My second sentence was trying to argue that the page author has no business forcing spellchecking on if the page author cannot force the spellchecking language! Especially for a case where the page contains a mix of multiple languages. Not really. Consider e.g. flickr in which photos may be given titles, descriptions and comments in the language of the user's choice but the site UI is not localised. If flickr decided to do input type=text lang=en to get spellchecking to turn on for photo titles then that would be much worse for the large number of non-native English speakers than input type=text spellcheck=on which would likely use the user's preferred dictionary (although this would be UA-dependent of course). For another example, consider the case where I post on a Swedish forum in English, knowing that the general level of English in Sweden is excellent and in any case better than the level of my Swedish. It doesn't seem reasonable to expect sites to always be localised or for sites accepting multilingual user generated content to not exist. Therefore it seems totally counterproductive from the point of view of people communicating in less dominant languages to require spellchecking to be tied to language.
Re: [whatwg] Spellchecking mark III
James Graham wrote: Mikko Rantalainen wrote: My second sentence was trying to argue that the page author has no business forcing spellchecking on if the page author cannot force the spellchecking language! Especially for a case where the page contains a mix of multiple languages. Not really. Consider e.g. flickr in which photos may be given titles, descriptions and comments in the language of the user's choice but the site UI is not localised. If flickr decided to do input type=text lang=en to get spellchecking to turn on for photo titles then that would be much worse for the large number of non-native English speakers than input type=text spellcheck=on which would likely use the user's preferred dictionary (although this would be UA-dependent of course). How about input type=text lang=mul if the content author does not want to specify a language? That would hint to the UA that this field expects human language but that the input may be in any language. The current behavior (heuristics) could be requested with input type=text lang=und, which explicitly marks this input as containing text in an undefined language. For another example, consider the case where I post on a Swedish forum in English, knowing that the general level of English in Sweden is excellent and in any case better than the level of my Swedish. I agree. However, if the forum maintainer would rather have no text at all instead of text in the wrong language, then the forum maintainer should use input type=text lang=se and the UA would correctly flag any non-Swedish word as incorrect. It doesn't seem reasonable to expect sites to always be localised or for sites accepting multilingual user generated content to not exist. Therefore it seems totally counterproductive from the point of view of people communicating in less dominant languages to require spellchecking to be tied to language. I'm not suggesting that spellchecking should require only a single language. 
I'm requesting that if the page wants automatic spell checking it must explicitly define the language that the spellchecking should check for. For the multiple-language case, RFC 3066 defines the MUL language code, and for the undefined case the UND code has been defined. Currently the lang attribute accepts exactly one language code. For the case where acceptable input for a forum message would be Swedish or English it would be nice to be able to write input type=text lang=se,en or perhaps even lang=se,en;q=0.1. -- Mikko
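To make the lang=se,en;q=0.1 idea above concrete, here is a minimal sketch of how such a weighted list might be parsed. Everything in it is an assumption for illustration: the lang attribute does not accept lists, the q= syntax is simply borrowed from the HTTP Accept-Language style the example resembles, and the function name parse_lang_list is hypothetical.

```python
def parse_lang_list(value):
    """Parse a hypothetical weighted language list such as 'se,en;q=0.1'.

    Returns (tag, weight) pairs, most preferred first. An entry without
    an explicit q= weight is assumed to have weight 1.0.
    """
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        tag, _, params = item.partition(";")
        weight = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # malformed weight: treat as least preferred
        entries.append((tag.strip().lower(), weight))
    # sorted() is stable, so entries with equal weight keep author order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


print(parse_lang_list("se,en;q=0.1"))
```

A UA following this sketch would try the dictionary for the first tag and fall back down the list; whether a word accepted by any listed dictionary should count as correct is left open here, as it is in the message above.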
Re: [whatwg] Spellchecking mark III
On Wed, Jan 21, 2009 at 4:15 AM, Mikko Rantalainen mikko.rantalai...@peda.net wrote: If the browser does not know the language of the content, how on earth is it supposed to *correctly* spellcheck it? I'm daily hitting a situation where browser is trying to spellcheck content with incorrect language. I've toggled such automatic spellchecker off and those will stay off until correct language is detected. In practice, I think the only way to avoid this problem is for browsers to implement content-sniffing techniques of some kind to figure out the language, at least per field but ideally on a word-by-word basis. If the browser is set to spellcheck in English but you start putting in lots of non-Latin characters and every word is therefore misspelled, the browser should be clever enough to try switching the spellcheck language, or at least disabling spellcheck for words that can't possibly be from the language it's checking against. More refined heuristics could detect even subtle differences, like between British and American English, and remember for next time which one the user usually types in. None of this needs, or even could effectively use, author intervention: 1) The author cannot know what languages users will want to enter in all cases. I've sometimes found myself writing posts in Hebrew on English-only sites, for instance. 2) The author certainly won't be able to determine the dialect or variant of the language the user will want to use, which is necessary for spellcheck. 3) Authors should not have to add extra markup if it's not really necessary, because in practice, most won't. To be as useful as possible, spellcheck should Just Work without explicit author intervention.
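The coarsest form of the content sniffing described above, skipping words that cannot belong to the dictionary's language because they are written in a different script, can be sketched as follows. This is only an illustration under stated assumptions, not what any browser does: the script of a character is guessed from the first word of its Unicode character name, and the function names word_script and words_to_check are hypothetical.

```python
import unicodedata


def word_script(word):
    """Return a coarse script label for a word.

    Assumption: the first word of a character's Unicode name
    ("LATIN", "HEBREW", "CYRILLIC", ...) is a good enough proxy
    for its script. Non-letters are ignored.
    """
    scripts = set()
    for ch in word:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    if not scripts:
        return "NONE"   # digits or punctuation only
    if len(scripts) > 1:
        return "MIXED"  # letters from more than one script
    return scripts.pop()


def words_to_check(text, dictionary_script="LATIN"):
    """Keep only the words the current dictionary could possibly contain."""
    return [w for w in text.split() if word_script(w) == dictionary_script]


print(words_to_check("hello שלום world"))
```

Telling British from American English, as the message goes on to suggest, needs dictionary lookups and per-user history rather than script detection, and is outside the scope of this sketch.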
Re: [whatwg] Spellchecking mark III
Mikko Rantalainen wrote on 1/21/2009 5:03 AM: For another example, consider the case where I post on a Swedish forum in English, knowing that the general level of English in Sweden is excellent and in any case better than the level of my Swedish. I agree. However, if the forum maintainer would rather have no text at all instead of text in wrong language, then the forum maintainer should use input type=text lang=se and the UA would correctly flag any non-swedish word as incorrect. I see value in being able to provide a hint to the UA that it should or should not spell check certain content, but the ultimate control should reside with the user. I hate the idea of a web site dictating which dictionary must be used to spell check the user's content. Spell checking is for the benefit of the user, not the web site, and forcing a dictionary in a language that the user doesn't speak is completely useless and would only serve to annoy (i.e. it wouldn't prevent the user from submitting content in any language of their choosing). Beyond that, it has other problems. Say I visit a site in the UK and it forces the UK dictionary; as an American speaker, I'll be confused as to why my UA is flagging color as misspelled and will simply turn off spell checking entirely since it's broken. Additionally, not all UAs ship with dictionaries for every single language (do any?), so the UA wouldn't be able to spell check when a dictionary isn't available for that user. I guarantee that if my UA shipped with all of them, I'd remove them all except the languages I converse in to prevent the web site from forcing a particular dictionary. 
Then there are some languages that do not have a dictionary available at all, such as Tamil in Firefox: https://addons.mozilla.org/en-US/firefox/browse/type:3 I don't see any benefit to the user in forcing them to use a particular dictionary and the only benefit to the site is it might annoy someone into using a particular language (assuming they even have the dictionary for that language). - Bil
Re: [whatwg] Spellchecking mark III
On Wed, Jan 21, 2009 at 1:15 AM, Mikko Rantalainen mikko.rantalai...@peda.net wrote: If the browser does not know the language of the content, how on earth is it supposed to *correctly* spellcheck it? As others have noted, the user's preferences are generally a better indicator of how something should be spellchecked, for a number of reasons. (Bill Corry's email was on-point here.) I'm daily hitting a situation where browser is trying to spellcheck content with incorrect language. I've toggled such automatic spellchecker off and those will stay off until correct language is detected. As I said, this seems a separate problem to me. Dynamic language switching or multi-language spellchecking based on various heuristics seems like the solution here. This applies to any spellchecked field anywhere and is separate from the issue of whether an author wants to tell the UA that a field is even appropriate for spellchecking or not. My second sentence was trying to argue that the page author has no business forcing spellchecking on if the page author cannot force the spellchecking language! I disagree completely. Consider one of the original use cases for this: Gmail instructing UAs to spellcheck the optional Subject field of a mail. There's no way Gmail can know what language(s) the user may type in this field, but it's still appropriate to tell the UA that the field is appropriate for spellchecking. At this point it's up to the UA to determine what language to use. I also take issue with the word force, which is imprecise. The spellcheck attribute spec was carefully written to ensure that the user and UA have ultimate control over whether spellchecking actually occurs, regardless of what the author specifies; the attribute is a hint to the UA, not a mandate. Forcing a spellchecking on with incorrect language would harm the user! A good reason why the UA's spellchecking language should not be determined by the author (and thus why your proposal leaves me cold). 
On the other hand, if the content language is explicitly defined, then the user agent has the required knowledge to decide if the spellchecking should be enabled or disabled. There's no need for the spellcheck attribute. The UA does not know which fields actually contain language and which simply contain strings of characters. Enumerating input types (e.g. this field contains email addresses) can address this, but suffers from two problems: * There are an unbounded number of input types, potentially * Types should perhaps not always be treated equally. For example, if an author wrote a spelling quiz, then input boxes for a user to type in would contain words and thus be of a spellcheckable type, but the author would clearly prefer the UA not spellcheck them :) If we can persuade content authors to specify the correct content language, Proposals that sound like if we could just get authors to write valid, semantic content with no errors... have always seemed naive to me. PK
Re: [whatwg] Spellchecking mark III
Aryeh Gregor wrote: On Wed, Jan 21, 2009 at 4:15 AM, Mikko Rantalainen mikko.rantalai...@peda.net wrote: If the browser does not know the language of the content, how on earth is it supposed to *correctly* spellcheck it? I'm daily hitting a situation where browser is trying to spellcheck content with incorrect language. I've toggled such automatic spellchecker off and those will stay off until correct language is detected. In practice, I think the only way to avoid this problem is for browsers to implement content-sniffing techniques of some kind to figure out the language, at least per field but ideally on a word-by-word basis. If the browser is set to spellcheck in English but you start putting in lots of non-Latin characters and every word is therefore misspelled, the browser should be clever enough to try switching the spellcheck language, or at least disabling spellcheck for words that can't possibly be from the language it's checking against. More refined heuristics could detect even subtle differences, like between British and American English, and remember for next time which one the user usually types in. Why not let the user choose the language, as happens in word processors? A UA can't choose accurately whether, for instance, color is correct American English, wrong British English, or even a correct (truncated) Italian word, while a human can do it better; thus a UA could provide an interface to change the spellchecking language for a selection, or even for each misspelled word, starting from a hint language, which could be the value of an element's lang attribute (besides a default value and a user-preference forced one - the latter bypassing any authored value). Also, using the lang attribute value as the starting language to check (if not in conflict with a user preference) would allow an interactive interface with a script changing that value according to a user's choice (UAs could also expose a list of supported languages). 
A declaration such as lang='und' sounds like telling the user agent to do whatever it computes to be a good choice, which is different from telling it "don't even try to work out what the language is here, because I know you can't guess it"; declaring a value known to be unsupported (such as an invented one) to turn off spellchecking sounds like a hack needed because we lack a more appropriate feature. Everything IMHO. WBR, Alex
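The order of precedence suggested in this message (a forced user preference first, then the element's lang value as a hint, then a default) might look like the sketch below. It is an illustration only; the function name starting_language, the flat set of available dictionary tags, and the exact-match comparison are all assumptions, and a real UA would presumably also let the user switch language afterwards.

```python
def starting_language(forced_user_pref, lang_attr, default_lang, available):
    """Pick the language spellchecking starts from.

    Candidates are tried in order: a user preference that overrides
    authored values, the element's lang attribute, then the UA default.
    A candidate is skipped when no dictionary for it is available.
    Returns None when nothing matches, meaning "do not check".
    """
    dictionaries = {tag.lower() for tag in available}
    for candidate in (forced_user_pref, lang_attr, default_lang):
        if candidate and candidate.lower() in dictionaries:
            return candidate.lower()
    return None


# The page says Italian, the user has not forced anything: start in Italian.
print(starting_language(None, "it", "en-US", {"en-US", "en-GB", "it"}))
# The user forces British English: the authored lang value is bypassed.
print(starting_language("en-GB", "it", "en-US", {"en-US", "en-GB", "it"}))
```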
Re: [whatwg] Spellchecking mark III
On Wed, Jan 21, 2009 at 7:38 PM, Calogero Alex Baldacchino alex.baldacch...@email.it wrote: Why not let the user choose the language, as happens in word processors? A UA can't choose accurately whether, for instance, color is correct American English, wrong British English, or even a correct (truncated) Italian word, while a human can do it better; thus a UA could provide an interface to change the spellchecking language for a selection, or even for each misspelled word, starting from a hint language, which could be the value of an element's lang attribute (besides a default value and a user-preference forced one - the latter bypassing any authored value). Also, using the lang attribute value as the starting language to check (if not in conflict with a user preference) would allow an interactive interface with a script changing that value according to a user's choice (UAs could also expose a list of supported languages). I'm not sure I fully grasped everything here, but what I did grasp sounds very much like a cross between what Chromium is doing today and what we want to do in the future (I imagine similar things are true for other browser vendors). User specification and page hints are both useful tools for a UA. But I still claim that all of those aspects are outside the scope of the spellcheck attribute, and fall into the realm of things that should not be in the HTML5 spec as they're very much UA-specific behavior. PK
Re: [whatwg] Spellchecking mark III
Křištof Želechovski wrote: Spell checking of regions of text should be governed by the lang attribute, if any, and browser preferences; it would be switched off for language tags the spell-checking engine does not support, including custom ones. It is extremely annoying how Safari, although (supposedly) localized to Polish, wants all input to be in English. I agree. I think that specifying the spellcheck attribute would be a mistake. It allows only forcing the automatic spell checking on or off but it doesn't help a bit to allow mixing different languages on a single page. Robert O'Callahan wrote: The browser can't know ahead of time that a text field is not supposed to contain natural-language text. Yes it can, the lang attribute contains the required information. If the page lies about its language, then there's obviously nothing the browser can do to fix it. Just specify that spell checking must follow the content language. This way any already existing page that correctly specifies the content language would turn automatic spell checking on/off as required. There's no point trying to automatically spell check an unknown language so there's no need to explicitly turn off the spellchecking (assuming that the content language is correctly specified). As the lang attribute can be used in inner elements, too, it allows mixing different languages on a single page and it allows the UA to apply different spell checkers to different parts. At least the following use cases have been discussed in this thread:
- email subject field (e.g. lang=en according to UI language perhaps?)
- email address field (lang=x-email-to, can be spellchecked against the user's address book)
- web site address (lang=x-url, perhaps also follow type=url?)
- product id (lang=x-proprietary)
- license plate (lang=x-proprietary)
- captcha (lang=x-proprietary)
- program code (lang=x-program-c++ perhaps?)
If the page does not specify any language, allow the UA to decide the best method for its spell checking (leave the behavior explicitly undefined). I used x-proprietary for a custom/special language above. RFC 3066 specifies the UND (undefined) language code, which could be used instead. However, I think that the WHATWG should specify that the UND language code is used to turn on the undefined (UA-dependent heuristics) behavior for selected inner elements. -- Mikko
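The rule proposed in this message can be summarised as a small decision function: no lang value or und leaves the choice to UA heuristics, a private-use x- tag or any tag without a dictionary switches checking off, and a supported tag selects its dictionary. The sketch below is one reading of that proposal and not specified behaviour; the name spellcheck_action and the fallback from a full tag to its primary subtag are assumptions.

```python
def spellcheck_action(lang, supported_dictionaries):
    """Decide what to do with an editable region from its lang value alone.

    Returns a pair (action, dictionary_tag) where action is one of
    "heuristics", "check" or "off".
    """
    if lang is None or lang.strip().lower() == "und":
        # Language not specified, or explicitly undefined: UA decides.
        return ("heuristics", None)
    tag = lang.strip().lower()
    if tag.startswith("x-"):
        # Private-use tags such as x-url or x-proprietary: never check.
        return ("off", None)
    supported = {d.lower() for d in supported_dictionaries}
    if tag in supported:
        return ("check", tag)
    primary = tag.split("-")[0]
    if primary in supported:
        # Assumption: en-GB may fall back to a generic "en" dictionary.
        return ("check", primary)
    return ("off", None)


for value in ("en", "x-url", "und", None, "ta"):
    print(value, spellcheck_action(value, {"en", "pl"}))
```

Note that under this reading a field in a language the UA has no dictionary for is silently left unchecked, which is the case other messages in the thread object to.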
Re: [whatwg] Spellchecking mark III
2009/1/20 Mikko Rantalainen mikko.rantalai...@peda.net I agree. I think that specifying the spellcheck attribute would be a mistake. It allows only forcing the automatic spell checking on or off but it doesn't help a bit to allow mixing different languages on a single page. I don't see how the second sentence is an argument for the first. Just specify that spell checking must follow the content language. How many pages specify the content language? AFAIK the farthest most authors get is to specify the encoding, and even that is frequently done wrong, and browsers have all kinds of crazy heuristics to try and second-guess authors. This seems like it would make spellchecking function very poorly on the web at large, whereas adding the spellcheck attribute at worst would not harm anyone. As the lang attribute can be used in inner elements, too, it allows mixing different languages on a single page and it allows UA to apply different spell checkers to different parts. Again, this seems somewhat orthogonal to the spellcheck attribute discussion. Pages which mix languages are of interest to the Chromium development team, too, and we have ideas on how to make life better for those users -- but none of those ideas intersect the spellcheck attribute in any way. I think your post is tangential. PK
Re: [whatwg] Spellchecking mark III
Spell checking of regions of text should be governed by the lang attribute, if any, and browser preferences; it would be switched off for language tags the spell-checking engine does not support, including custom ones. It is extremely annoying how Safari, although (supposedly) localized to Polish, wants all input to be in English. IMHO, Chris
Re: [whatwg] Spellchecking mark III
On Tue, Dec 30, 2008 at 3:38 AM, Ian Hickson i...@hixie.ch wrote: The same engineers have since implemented this feature in Chrome also, Incorrect. One engineer implemented a crude hack in a small portion of the Chromium glue code that implements a fraction of the spec -- enough to make Gmail work a little more nicely, and that's about it. On Wed, Dec 31, 2008 at 7:15 AM, Maciej Stachowiak m...@apple.com wrote: 2) The proposal Hixie linked seems way overengineered for this purpose. First, it allows spellchecking to be explicitly turned on, potentially overriding normal defaults, but that seems wrong; an input type=email should never spellcheck regardless of what the page author says. I can't see any valid use case for the author turning spellchecking on regardless of UA defaults or user preferences. Email subject line boxes. In Firefox (where I implemented support for this attribute matching Hixie's spec), the default is to spellcheck multiline boxes and not single-line boxes, which meant that webmail subject line fields would not be spellchecked by default. Second, it allows spellchecking to be controlled at a finer granularity than editability, for which again I think there is no valid use case. Besides the above example in the positive direction, the negative direction is, again, editable fields which you don't want spellchecked, e.g. email recipient list fields (which may be multiline and contain whitespace). I agree with Roc that it is not practical for UAs to detect (via heuristics) which fields should and should not be checked in all cases, and spellchecking desirability seems finer grained than editability to me (not completely orthogonal, as I don't think non-editable fields should ever be spellchecked). I also agree with Roc that this is not complicated, in practice, to implement. It was a tricky patch for me in Firefox since I was not familiar with any of the associated code, but the actual logic of the spec was not hard at all. 
I support adding Hixie's spec, as-is, to HTML5. It's implemented in Firefox, it's desired in Opera, and there's a bug on file to add support for it to WebKit (which I would like to do someday). PK
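To make the two cases in the message above concrete (a single-line subject field that should be checked, a multiline recipient field that should not), here is a rough Python sketch. It is not Firefox's actual code: the function names and the `author_hint` parameter are made up for illustration, and the only behaviour taken from the message is the default of checking multiline boxes but not single-line ones.

```python
from typing import Optional


def default_for_field(multiline: bool) -> bool:
    """Default described in the message: check multiline boxes only."""
    return multiline


def should_spellcheck(multiline: bool, author_hint: Optional[bool] = None) -> bool:
    """Hypothetical resolution: an explicit author hint wins over the default.

    author_hint is None when the page says nothing about spellchecking.
    """
    if author_hint is not None:
        return author_hint
    return default_for_field(multiline)


# Single-line subject box with no hint: not checked under this default.
subject_without_hint = should_spellcheck(multiline=False)
# Same box with the author asking for checking: checked.
subject_with_hint = should_spellcheck(multiline=False, author_hint=True)
# Multiline recipient list with the author opting out: not checked.
recipients_opted_out = should_spellcheck(multiline=True, author_hint=False)
```

The sketch leaves out user preferences entirely; it only shows why a default keyed on single-line versus multiline cannot get both fields right without some hint from the page.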
Re: [whatwg] Spellchecking mark III
On Mon, Jan 19, 2009 at 4:53 PM, Robert O'Callahan rob...@ocallahan.org wrote:
> Actually I was just poking around and noticed that we don't actually support variation of spellcheck values within different parts of an editable element. So I won't make any claims about how hard that is to support.

Doesn't the spec only define things on a per-element level of granularity? I wasn't really paying attention to this side-conversation of yours so I didn't think to confirm/refute it. But I don't think the spec in fact covers doing such a thing.

PK
Re: [whatwg] Spellchecking mark III
On Dec 30, 2008, at 7:20 AM, Kornel Lesiński wrote:
> On 30.12.2008, at 13:45, Geoffrey Sneddon wrote:
>>> I have therefore not added this feature to HTML5 for the time being. If there is more interest in this feature, please speak up.
>> This seems stupid. If I want to have spell-checking, let me. Don't force it off. I don't see any reason to have it forced off, ever.
> It's useful for fields that contain non-textual content, e.g. product ID, license plate number, CAPTCHA answer, etc. Browser would mark these as misspelt, which might be confusing or at least distracting.

It does make sense I guess, that certain fields should not be subject to automatic spellchecking. However, three counterpoints:

1) At least Safari's spellchecking won't mark a word misspelled until you hit a space; fields that contain data which would be flagged by the spellchecker but which are also likely to contain internal whitespace are rare.

2) The proposal Hixie linked seems way overengineered for this purpose. First, it allows spellchecking to be explicitly turned on, potentially overriding normal defaults, but that seems wrong; an input type=email should never spellcheck regardless of what the page author says. I can't see any valid use case for the author turning spellchecking on regardless of UA defaults or user preferences. Second, it allows spellchecking to be controlled at a finer granularity than editability, for which again I think there is no valid use case. Both of these aspects make the feature more complicated to implement and harder to understand, compared to just having a way to only disable spellchecking at the same granularity as editing.

In general it would be helpful if some of the Google folks who requested this feature and some of the Chrome folks who (apparently) implemented it could explain the actual use cases they had in mind.

Regards, Maciej
Re: [whatwg] Spellchecking mark III
On 31.12.2008, at 15:15, Maciej Stachowiak wrote:
> It does make sense I guess, that certain fields should not be subject to automatic spellchecking. However, three counterpoints: 1) At least Safari's spellchecking won't mark a word misspelled until you hit a space; fields that contain data which would be flagged by the spellchecker but which are also likely to contain internal whitespace are rare.

In WebKit spellchecking is also done when a field loses focus, so even single-word fields would be flagged.

> 2) The proposal Hixie linked seems way overengineered for this purpose. First, it allows spellchecking to be explicitly turned on, potentially overriding normal defaults, but that seems wrong; an input type=email should never spellcheck regardless of what the page author says. I can't see any valid use case for the author turning spellchecking on regardless of UA defaults or user preferences. Second, it allows spellchecking to be controlled at a finer granularity than editability, for which again I think there is no valid use case. Both of these aspects make the feature more complicated to implement and harder to understand, compared to just having a way to only disable spellchecking at the same granularity as editing.

I don't like the current proposal either, because a true/false value is inconsistent with other boolean attributes in HTML. IMHO it should be nospellcheck=nospellcheck (which also solves the problem of forcing spellchecking where it doesn't make sense).

-- 
regards, Kornel
Re: [whatwg] Spellchecking mark III
On Thu, Jan 1, 2009 at 4:15 AM, Maciej Stachowiak m...@apple.com wrote:
> 2) The proposal Hixie linked seems way overengineered for this purpose. First, it allows spellchecking to be explicitly turned on, potentially overriding normal defaults, but that seems wrong; an input type=email should never spellcheck regardless of what the page author says. I can't see any valid use case for the author turning spellchecking on regardless of UA defaults or user preferences.

It allows you to have a region of text where spellchecking is disabled via the spellcheck attribute, but containing subregions where spellchecking is enabled.

> Second, it allows spellchecking to be controlled at a finer granularity than editability, for which again I think there is no valid use case. Both of these aspects make the feature more complicated to implement and harder to understand, compared to just having a way to only disable spellchecking at the same granularity as editing.

A use case is editable program code, where spellchecking is disabled, but where spellchecking is enabled inside comments. Maybe that sounds a little far-fetched for today's Web applications, but some IDEs (e.g. Eclipse) support this so it seems like something we'd want in the future.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all. [Isaiah 53:5-6]
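The "disabled region containing enabled subregions" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class and its field names are hypothetical, and the rule that an element without its own setting takes the value of its nearest ancestor is an assumption made here, not something quoted from the spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for an element inside an editable region."""
    name: str
    spellcheck: Optional[bool] = None   # None means no setting on this node
    parent: Optional["Node"] = None


def effective_spellcheck(node: Node, ua_default: bool = True) -> bool:
    """Walk up from the node; the nearest explicit setting wins (assumed rule)."""
    current: Optional[Node] = node
    while current is not None:
        if current.spellcheck is not None:
            return current.spellcheck
        current = current.parent
    return ua_default


# An editable code view: checking off overall, on again inside a comment.
editor = Node("code-editor", spellcheck=False)
statement = Node("statement", parent=editor)
comment = Node("comment", spellcheck=True, parent=editor)
comment_text = Node("comment-text", parent=comment)
```

Here `statement` resolves to off and `comment_text` to on. The sketch says nothing about keeping those boundaries intact while the user edits, which is a separate problem.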
Re: [whatwg] Spellchecking mark III
On Wed, Dec 31, 2008 at 3:22 AM, Robert O'Callahan rob...@ocallahan.org wrote:
> That handles some cases, but not others --- e.g. text boxes that contain program code.

I run spell checkers on code blocks. Given the number of misspellings that could have been avoided by using them, they're actually useful there. And for Slashdot's really lame CAPTCHA, they help too.
Re: [whatwg] Spellchecking mark III
2008/12/30 Giovanni Campagna scampa.giova...@gmail.com:
> maybe we could just say that spellchecking is disabled when type is not text (for email, uri and number you have validation) and when a pattern attribute is specified

Personally, if I were to write Gionvanni Campagna into a multiline text field, I'd like it to be matched against what I wrote into the email field (it turns out that I've managed to misspell your name, I'm sorry, but that's the point). So ideally the system which I use to spell check would be able to share information with my contacts, and would also let me teach it spelling based on the email address fields.
Re: [whatwg] Spellchecking mark III
On Dec 31, 2008, at 12:26 PM, Robert O'Callahan wrote:
> On Thu, Jan 1, 2009 at 4:15 AM, Maciej Stachowiak m...@apple.com wrote:
>> 2) The proposal Hixie linked seems way overengineered for this purpose. First, it allows spellchecking to be explicitly turned on, potentially overriding normal defaults, but that seems wrong; an input type=email should never spellcheck regardless of what the page author says. I can't see any valid use case for the author turning spellchecking on regardless of UA defaults or user preferences.
> It allows you to have a region of text where spellchecking is disabled via the spellcheck attribute, but containing subregions where spellchecking is enabled.

It seems to me you would have to have a lot of custom code to maintain the boundaries between such regions during editing operations for this to ever work right. Normal text editing would easily lead to text moving across the boundaries. There would have to be strong motivating examples to justify such a hard-to-use feature.

>> Second, it allows spellchecking to be controlled at a finer granularity than editability, for which again I think there is no valid use case. Both of these aspects make the feature more complicated to implement and harder to understand, compared to just having a way to only disable spellchecking at the same granularity as editing.
> A use case is editable program code, where spellchecking is disabled, but where spellchecking is enabled inside comments. Maybe that sounds a little far-fetched for today's Web applications, but some IDEs (e.g. Eclipse) support this so it seems like something we'd want in the future.

This sounds like a pretty ill-conceived feature. It is very common for comments to include code, or fragments of code (such as variable names) mixed with natural language. (I was unable to find any evidence of spellchecking comments in the copy of Eclipse I downloaded, so I can't comment on the details.)

Furthermore, other IDEs generally don't attempt to do this, and I can't think of other application categories that would do something similar. So I don't think this makes for a very compelling use case. It's like arguing for a page layout feature based on something only WordPerfect does.

Regards, Maciej
Re: [whatwg] Spellchecking mark III
On Thu, Jan 1, 2009 at 2:04 PM, Maciej Stachowiak m...@apple.com wrote: On Dec 31, 2008, at 12:26 PM, Robert O'Callahan wrote: A use case is editable program code, where spellchecking is disabled, but where spellchecking is enabled inside comments. Maybe that sounds a little far-fetched for today's Web applications, but some IDEs (e.g. Eclipse) support this so it seems like something we'd want in the future. This sounds like a pretty ill-conceived feature. It is very common for comments to include code, or fragments of code (such as variable names) mixed with natural language. (I was unable to find any evidence of spellchecking comments in the copy of Eclipse I downloaded, so I can't comment on the details.) OK. It's there, though. Furthermore, other IDEs generally don't attempt to do this, and I can't think of other application categories that would do something similar. Seems to me that an HTML source view with spellchecking of the non-markup text would be useful. For what it's worth, it seemed easy to implement the general spellcheck behaviour in Gecko, once we'd decided to allow any author spellcheck control at all (you seem to have agreed that spellcheck=no is useful). But I really don't feel strongly one way or the other. Peter Kasting or Brett Wilson should speak up. Rob -- He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all. [Isaiah 53:5-6]
Re: [whatwg] Spellchecking mark III
In 2006 I proposed the following spec for a spellcheck= attribute, based on requests from the Google engineers then working on Firefox:

   http://www.damowmow.com/playground/spellcheck.txt

The same engineers have since implemented this feature in Chrome also, and Google does use this attribute on its sites. However, the attribute has seen very little interest outside of Google, with just a handful of sites using it, primarily in dynamic editor libraries.

I have therefore not added this feature to HTML5 for the time being. If there is more interest in this feature, please speak up.

-- 
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] Spellchecking mark III
On Tue, 30 Dec 2008 12:38:42 +0100, Ian Hickson i...@hixie.ch wrote:
> In 2006 I proposed the following spec for a spellcheck= attribute, based on requests from the Google engineers then working on Firefox: http://www.damowmow.com/playground/spellcheck.txt The same engineers have since implemented this feature in Chrome also, and Google does use this attribute on its sites. However, the attribute has seen very little interest outside of Google, with just a handful of sites using it, primarily in dynamic editor libraries. I have therefore not added this feature to HTML5 for the time being. If there is more interest in this feature, please speak up.

Opera wants to support this feature as well in due course, so I don't think we would mind it being added to HTML5. Does it being in Chrome mean it is also in WebKit? If so, together with Firefox support, that seems like a compelling reason to add the feature.

-- 
Anne van Kesteren
http://annevankesteren.nl/
http://www.opera.com/
Re: [whatwg] Spellchecking mark III
On 30 Dec 2008, at 11:38, Ian Hickson wrote:
> In 2006 I proposed the following spec for a spellcheck= attribute, based on requests from the Google engineers then working on Firefox: http://www.damowmow.com/playground/spellcheck.txt The same engineers have since implemented this feature in Chrome also, and Google does use this attribute on its sites. However, the attribute has seen very little interest outside of Google, with just a handful of sites using it, primarily in dynamic editor libraries. I have therefore not added this feature to HTML5 for the time being. If there is more interest in this feature, please speak up.

This seems stupid. If I want to have spell-checking, let me. Don't force it off. I don't see any reason to have it forced off, ever.

-- 
Geoffrey Sneddon
http://gsnedders.com/
Re: [whatwg] Spellchecking mark III
On Dec 30, 2008, at 4:55 AM, Anne van Kesteren wrote:
> On Tue, 30 Dec 2008 12:38:42 +0100, Ian Hickson i...@hixie.ch wrote:
>> In 2006 I proposed the following spec for a spellcheck= attribute, based on requests from the Google engineers then working on Firefox: http://www.damowmow.com/playground/spellcheck.txt The same engineers have since implemented this feature in Chrome also, and Google does use this attribute on its sites. However, the attribute has seen very little interest outside of Google, with just a handful of sites using it, primarily in dynamic editor libraries. I have therefore not added this feature to HTML5 for the time being. If there is more interest in this feature, please speak up.
> Opera wants to support this feature as well in due course, so I don't think we would mind it being added to HTML5. Does it being in Chrome mean it is also WebKit? If so, together with Firefox support, seems like a compelling reason to add the feature.

The Google Chrome team has not submitted patches for such a feature to WebKit. I am not sure if they plan to eventually submit it to mainline WebKit. In fact, this is the first I've heard about Chrome having such an extension.

It's not clear to me whether the feature is useful without seeing some motivating examples. WebKit by default spellchecks (and grammar checks) all editable parts of the document, and it is not obvious to me why one would want to force it off for particular form controls or editable HTML areas.

Regards, Maciej
Re: [whatwg] Spellchecking mark III
On Tue, Dec 30, 2008 at 8:50 AM, Maciej Stachowiak m...@apple.com wrote:
> On Dec 30, 2008, at 4:55 AM, Anne van Kesteren wrote:
>> On Tue, 30 Dec 2008 12:38:42 +0100, Ian Hickson i...@hixie.ch wrote:
>>> In 2006 I proposed the following spec for a spellcheck= attribute, based on requests from the Google engineers then working on Firefox: http://www.damowmow.com/playground/spellcheck.txt The same engineers have since implemented this feature in Chrome also, and Google does use this attribute on its sites. However, the attribute has seen very little interest outside of Google, with just a handful of sites using it, primarily in dynamic editor libraries. I have therefore not added this feature to HTML5 for the time being. If there is more interest in this feature, please speak up.
>> Opera wants to support this feature as well in due course, so I don't think we would mind it being added to HTML5. Does it being in Chrome mean it is also WebKit? If so, together with Firefox support, seems like a compelling reason to add the feature.
> The Google Chrome team has not submitted patches for such a feature to WebKit. I am not sure if they plan to eventually submit it to mainline WebKit. In fact, this is the first I've heard about Chrome having such an extension. It's not clear to me whether the feature is useful without seeing some motivating examples. WebKit by default spellchecks (and grammar checks) all editable parts of the document, and it is not obvious to me why one would want to force it off for particular form controls or editable HTML areas.

Agreed. This feature lives purely in user-space. It can be convenient for a user to be able to turn off spellchecking globally, or perhaps even locally (FF exposes this currently through a right-click option on editable areas), but I cannot see any reason for an author to have control over this. If I want to spellcheck an area, I want to spellcheck it. If I don't, I don't.

~TJ
Re: [whatwg] Spellchecking mark III
On 30.12.2008, at 13:45, Geoffrey Sneddon wrote: I have therefore not added this feature to HTML5 for the time being. If there is more interest in this feature, please speak up. This seems stupid. If I want to have spell-checking, let me. Don't force it off. I don't see any reason to have it forced off, ever. It's useful for fields that contain non-textual content, e.g. product ID, license plate number, CAPTCHA answer, etc. Browser would mark these as misspelt, which might be confusing or at least distracting. -- regards, Kornel
Re: [whatwg] Spellchecking mark III
On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński kor...@geekhood.net wrote: It's useful for fields that contain non-textual content, e.g. product ID, license plate number, CAPTCHA answer, etc. Browser would mark these as misspelt, which might be confusing or at least distracting. this sounds like something browser vendors need to worry about on their own and is not a reason to let web pages do anything about it.
Re: [whatwg] Spellchecking mark III
2008/12/30 timeless timel...@gmail.com On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński kor...@geekhood.net wrote: It's useful for fields that contain non-textual content, e.g. product ID, license plate number, CAPTCHA answer, etc. Browser would mark these as misspelt, which might be confusing or at least distracting. this sounds like something browser vendors need to worry about on their own and is not a reason to let web pages do anything about it. maybe we could just say that spellchecking is disabled when type is not text (for email, uri and number you have validation) and when a pattern attribute is specified Giovanni
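The heuristic suggested in the message above is simple enough to write down. The following Python sketch is only an illustration of that suggestion as worded in the message; the function name and parameters are made up, and no browser is claimed to work this way.

```python
from typing import Optional


def heuristic_spellcheck(input_type: str = "text", pattern: Optional[str] = None) -> bool:
    """Suggested rule: check only plain text fields that have no pattern."""
    if input_type != "text":
        # email, uri, number and so on already get validation instead
        return False
    if pattern is not None:
        # a pattern implies structured, non-prose input
        return False
    return True


plain_field = heuristic_spellcheck("text")
email_field = heuristic_spellcheck("email")
# The pattern string below is an arbitrary example value.
plate_field = heuristic_spellcheck("text", pattern="[A-Z]{3}-[0-9]{3}")
```

Note that the rule has no way to tell two plain type=text fields apart, so any field holding non-prose content without a pattern is still checked.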
Re: [whatwg] Spellchecking mark III
2008/12/31 timeless timel...@gmail.com On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński kor...@geekhood.net wrote: It's useful for fields that contain non-textual content, e.g. product ID, license plate number, CAPTCHA answer, etc. Browser would mark these as misspelt, which might be confusing or at least distracting. this sounds like something browser vendors need to worry about on their own and is not a reason to let web pages do anything about it. The browser can't know ahead of time that a text field is not supposed to contain natural-language text. Rob -- He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all. [Isaiah 53:5-6]
Re: [whatwg] Spellchecking mark III
2008/12/31 Giovanni Campagna scampa.giova...@gmail.com 2008/12/30 timeless timel...@gmail.com On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński kor...@geekhood.net wrote: It's useful for fields that contain non-textual content, e.g. product ID, license plate number, CAPTCHA answer, etc. Browser would mark these as misspelt, which might be confusing or at least distracting. this sounds like something browser vendors need to worry about on their own and is not a reason to let web pages do anything about it. maybe we could just say that spellchecking is disabled when type is not text (for email, uri and number you have validation) and when a pattern attribute is specified That handles some cases, but not others --- e.g. text boxes that contain program code. Rob -- He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all. [Isaiah 53:5-6]
Re: [whatwg] Spellchecking mark III
Robert O'Callahan wrote:
> 2008/12/31 Giovanni Campagna scampa.giova...@gmail.com
>> 2008/12/30 timeless timel...@gmail.com
>>> On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński kor...@geekhood.net wrote:
>>>> It's useful for fields that contain non-textual content, e.g. product ID, license plate number, CAPTCHA answer, etc. Browser would mark these as misspelt, which might be confusing or at least distracting.
>>> this sounds like something browser vendors need to worry about on their own and is not a reason to let web pages do anything about it.
>> maybe we could just say that spellchecking is disabled when type is not text (for email, uri and number you have validation) and when a pattern attribute is specified
> That handles some cases, but not others --- e.g. text boxes that contain program code.

Indeed, that's a valid use case. Anyway, I don't think such a spec should, or _would_, prevent UAs from giving users a chance to bypass the 'spellcheck=' attribute (e.g., the attribute may override a UA default value, as spec'ed, but the user may be notified of it, and a UA context menu option may allow a different setting, as a last resort in case of misuse or errors, such as a 'spellcheck=false' applied to a box containing some code).

The language to check might be chosen from several sources, such as the 'lang' attribute of the contenteditable element itself, if different from the document language. For instance, a blog editor's interface document might not be translated into a certain language, while still allowing content creation in that language and giving the author a chance to set the proper language for the spell checker by changing (through script) the editor box language.

A possible evolution, if required over time, might involve a further attribute referencing an external dictionary file, perhaps in a standard format, or in a format a UA can recognize (thus, indicating alternatives), with the 'spellcheck' attribute used when no appropriate language/dictionary can be specified, or to say that just the specified dictionary/dictionaries must be used.

Best Regards, Alex
Re: [whatwg] Spellchecking mark III
Calogero Alex Baldacchino wrote:
> The language to check might be chosen from several sources, such as the 'lang' attribute of the contenteditable element itself, if different from the document language. For instance, a blog editor's interface document might not be translated into a certain language, while still allowing content creation in that language and giving the author a chance to set the proper language for the spell checker by changing (through script) the editor box language.

Or, perhaps, the editor interface might be negotiated based on the author's language settings, but he/she might be interested in writing content in a foreign language, and so want spellchecking in that language (if the UA's capabilities allow it).

Best Regards, Alex
Re: [whatwg] Spellchecking mark III
Ian Hickson wrote:
> 3. Otherwise, if the user has disabled the checking for this text, then the checking is disabled.
> 4. Otherwise, if the user has forced the checking for this text to always be enabled, then the checking is enabled.
> 5. Otherwise, if the element with which the text is associated has a spellcheck content attribute, then: if that attribute has the literal value on, then checking is enabled; otherwise, if that attribute has the literal value off, then checking is disabled; otherwise, move on to the next step.

How does this get away from the

  Check Spelling:
  ( ) No
  ( ) Yes (i.e. when the page says)
  ( ) Really, really, yes (i.e. always, whatever the page says)

preference problem?

Gerv
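For reference, the three quoted steps read roughly as follows when written out in Python. This is a sketch of steps 3 to 5 only, exactly as quoted; the function and parameter names are invented here, and returning None stands in for "move on to the next step", since the later steps are not quoted in this message.

```python
from typing import Optional


def steps_3_to_5(user_disabled: bool, user_forced: bool,
                 spellcheck_attr: Optional[str]) -> Optional[bool]:
    """Return True/False once a step decides, or None to fall through."""
    if user_disabled:             # step 3: the user switched checking off
        return False
    if user_forced:               # step 4: the user forced checking on
        return True
    if spellcheck_attr == "on":   # step 5: the author's attribute
        return True
    if spellcheck_attr == "off":
        return False
    return None                   # unquoted later steps would decide


# The three user-side states map onto the three radio buttons in the question.
user_said_no = steps_3_to_5(True, False, "on")
page_decides = steps_3_to_5(False, False, "off")
user_said_always = steps_3_to_5(False, True, "off")
```

Written this way, the user side has three distinguishable states: off, forced on, and neither (in which case the page's attribute decides).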
Re: [whatwg] Spellchecking mark III
The more I think about this, the more I believe that the correct choice would be to describe the expected content more accurately. The UA may then proceed to accurately turn spellchecking on or off. The problem is that the lang attribute allows only values defined in RFC 3066, which seems to support only ISO 639 defined language tags. That is, the expressible languages are limited to *spoken* languages.

Ian Hickson wrote:
> On Sun, 11 Jun 2006, Alexey Feldgendler wrote:
>> Information like this input field should have autoindent is presentational.
> Yeah, but you'd have to say auto-indent this like C++, which isn't. IMHO.

Perhaps instead of using the |spellcheck| attribute as a toggle, allow a white space separated list of expected input languages. If the user is expected to enter C++ code with English comments, then the author should use markup such as

  <textarea lang="zzz" spellcheck="c++ en">

for no linguistic content with spell checking for C++ and English.

Another option would be to expand the lang attribute to allow languages outside human languages. This has the added bonus that the lang attribute could also describe other content more accurately. RFC 3066 reserves language codes starting with x- for private use, and that could be used to aid spellchecking, too. Unfortunately only A-Z,0-9 are allowed, so perhaps something like

  <textarea lang="x-cpp-en">

for the private language cpp-en, or C++ with English comments. Or if the lang attribute is extended to allow multiple languages to be listed, then one could write

  <textarea lang="en x-cpp">

for English text mixed with C++ code (which is less accurate than the x-cpp-en above).

The GMail To: input field could be expressed as

  <textarea lang="x-mail-to">

and UAs that don't recognize the language x-mail-to should turn off the spellchecking. A typical blog input field could be encoded as

  <textarea lang="x-html-fragment-en">

Here one sees more need for multiple language tags inside the lang attribute. It would make more sense to use lang="x-html-fragment en", or there would be a need for *very* many private languages starting with x-html-fragment-, including x-html-fragment-sv-fi.

> On Fri, 23 Jun 2006, Sander Tekelenburg wrote:
>>> [AUTHOR REQUIREMENTS] Authors should set the document's language information, to enable user agents to accurately determine which dictionary to use when checking the spelling or grammar of user input.
>> IMO this should be a must.
> What about if the author doesn't know the language?

ISO 639 Part 2 includes und for undetermined language. A sane default for the UA is to disable the spell checking, or to use some unknown heuristic to determine the language itself.

> On Sat, 24 Jun 2006, Alexey Feldgendler wrote:
>> Even worse: when entering text in textarea, the user actually has a choice which language to write in. I think the user agent should provide, besides just the control to turn spellchecking on and off, a choice of languages.
> Agreed.

If a form expects some English text to be entered, it would be wise to mark text written in any other language as incorrectly spelled. If the author expects any language then he should specify lang=mul for multiple languages (again, defined by ISO 639 part 2). Again, a list of acceptable languages would be nice here.

-- 
Mikko
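A rough Python sketch of how a UA might act on such a language list follows. It is only an illustration of the proposal in this message: the set of installed dictionaries is made up, matching on the primary subtag is an assumption, and checking the recognized languages while ignoring unrecognized private-use tags in a mixed list is one possible reading, not something the message specifies.

```python
from typing import FrozenSet, List

# Hypothetical dictionaries this imaginary UA has installed.
INSTALLED: FrozenSet[str] = frozenset({"en", "fi", "sv"})


def dictionaries_for(lang_attr: str, installed: FrozenSet[str] = INSTALLED) -> List[str]:
    """Pick dictionaries for a whitespace-separated list of language tags."""
    # Assumption: only the primary subtag matters ("en-US" is treated as "en").
    primaries = [tag.split("-")[0] for tag in lang_attr.lower().split()]
    if "mul" in primaries:
        # "mul" (multiple languages): any installed dictionary is acceptable.
        return sorted(installed)
    chosen: List[str] = []
    for primary in primaries:
        # "und" and private-use "x-..." tags match nothing and are skipped.
        if primary in installed and primary not in chosen:
            chosen.append(primary)
    return chosen


def should_spellcheck(lang_attr: str) -> bool:
    """Checking is off when no usable dictionary was found."""
    return bool(dictionaries_for(lang_attr))
```

With these assumptions, lang="x-mail-to" and lang="und" both switch checking off, lang="en x-cpp" checks English only, and lang="mul" allows every installed dictionary.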
[whatwg] Spellchecking mark III
I believe this answers all outstanding e-mails on the subject, please let me know if I missed one. I include a new proposal (still with a spellcheck= attribute, based mostly on implementation feedback from the Mozilla guys) at the end of the e-mail.

On Sun, 11 Jun 2006, Robert Gr�sdal wrote:
> How about something like cascading behaviour sheets?
>
>   <style type="text/cbs">
>   #first-name {
>     inputmode: user startUpper;
>     spellcheck: enabled;
>   }
>   .cplusplus {
>     spellcheck: disabled;
>     autoindent: C++;
>     highlight: C++;
>     auto-evaluate: disabled;
>   }
>   .math {
>     spellcheck: math;
>     inputmode: math;
>     highlight: math;
>     auto-evaluate: enabled;
>   }
>   </style>
>
> I'd hate to have to specify all those attributes to every single input field for sure.

Well, it wouldn't be stylistic per se, so I don't think it would belong in the stylesheet. Even if you disable the styles, the spellcheck settings still apply.

> By the way - Hello everyone, this is my first post to the list.

Welcome!

On Sun, 11 Jun 2006, Matthew Raymond wrote:
> If, however, we're really just talking about adding words to the UA dictionary temporarily and for a specific site, couldn't we just do that with <meta> using the same format as we do with keywords?
>
> | <meta name="vocabulary" lang="en-us"
> |  content="HTML5, WHATWG, WF2, WA1, WD1, CSS3-UI, TARDIS, ZPM, DHD">
>
> Are there actually situations where different controls would need different vocabulary?!?

That's an interesting idea, but I'd shy away from doing this for now. Let's start small and build up...

On Sun, 11 Jun 2006, Alexey Feldgendler wrote:
> Maybe features like spellchecking, syntax highlighting and so on should be controlled via CSS? That way, they can be fine-tuned to any degree of precision without complicating the HTML schema. This will also reduce verbosity of input elements because they would otherwise have the same repeated attributes.
While I certainly sympathise with the concern of heavy input elements (that's why I was against the spellcheck= attribute in the first place), I don't think this is stylistic. Maybe we need another kind of macro capability, such as XBL, for this problem. Information like this input field should have autoindent is presentational. Yeah, but you'd have to say auto-indent this like C++, which isn't. IMHO. You should raise this in the www-style list, though, if you do think it is appropriate. On Sun, 11 Jun 2006, Lachlan Hunt wrote: No, spell checking is a user agent feature that should be controlled by the UA and the user. Authors should have no explicit control over it. Besides, spell checking *is not* presentation, it is UA functionality and so it does not belong in the presentation layer. I agree that it isn't presentation, but I disagree that the author shouldn't be able to suggest whether or not to enable it. More on this below. On Sun, 11 Jun 2006, Alexey Feldgendler wrote: Besides, spell checking *is not* presentation, it is UA functionality and so it does not belong in the presentation layer. Visual elements = Presentation Interactive elements = Behavior I think these are similar relationships. BTW, isn't the cursor CSS property about behavior? The behaviour vs presentation debate is a rat hole. The key thing is should this continue to work if you disable the author stylesheet and should this continue to work if you stop using a screen. It is clear, IMHO, that spellchecking being enabled or not is independent of both whether the author's stylesheet is enabled or not and whether the content is being displayed on a screen or not. On Sun, 11 Jun 2006, Alexey Feldgendler wrote: One can also say that authors should not have explicit control over whether hyperlinks are underlined or not. The difference is that underlining is presentation, spell checking is not. The functionality of a link cannot be changed with CSS, likewise spell checking shouldn't either. 
Enabling or disabling spell checking doesn't change the functionality of an input.

Sure it does. It changes whether or not the user's typos will be flagged to the user. That seems like quite a big difference.

It can still be used to submit arbitrary text to the server.

By that argument, type=text vs type=password is a presentational aspect. Or even more, type=text vs type=checkbox.

But misspelled words in an input with spellchecking enabled are underlined with a wavy red line (and the underlining style could even be changed by CSS), and that's presentation.

I agree that the line itself should be stylable in CSS (if at all), but I disagree that the presence or absence of the line is a stylistic matter.

On Mon, 12 Jun 2006, Lachlan Hunt wrote:
While the core functionality of allowing the user to enter text isn't changed, I'd consider spell checking to be part of the control's functionality, and so disabling it
Re: [whatwg] Spellchecking mark III
At 23:56 + UTC, on 2006-06-29, Ian Hickson wrote:

[...]

On Mon, 12 Jun 2006, Alexey Feldgendler wrote:
There's nothing really bad in allowing CSS to control behavior to some extent.

CSS is the part of the document that can be disabled/replaced. If disabling the author styles changes the functionality of the page, then that's bad.

Agreed.

[...]

On Thu, 15 Jun 2006, Sander Tekelenburg wrote:
[...] Just like authors cannot know what font size is best for a user, they cannot know whether a spellchecker is useful or a nuisance.

But they can suggest what font-size might be most appropriate.

I don't see how allowing authors to abuse one thing is an argument to give them more things to abuse :) I'm well aware that *everything* can be abused, but when something is useful enough then its potential for abuse is a downside you choose to live with. When something is not useful enough, the abusability argument should win.

[...]

On Fri, 23 Jun 2006, Sander Tekelenburg wrote:
[AUTHOR REQUIREMENTS] Authors should set the document's language information, to enable user agents to accurately determine which dictionary to use when checking the spelling or grammar of user input.

IMO this should be a must.

What about if the author doesn't know the language?

Of the user input you mean? Good point. But then what if a page is in English but accepts input in any language? The author should still indicate the content's language, thereby triggering the wrong spellchecker for those who wish to input text in another language. In turn, the result may well be that authors set no lang attribute at all for the page. Surely a spec shouldn't push authors in that direction? A solution might be to make this a *must* and allow lang=*, so the author can state lang=en on the body, and lang=* on the input field. I still seriously doubt authors will use this as intended though...

IMPLEMENTATION REQUIREMENTS All elements can have spellchecking enabled or disabled.
UAs may allow the user to set this flag

Why may? Why not must? [...]

Because you can't require a particular UI. For example, the UA could be a kiosk-style system, where the user is not to have any ability to do anything but enter his text and hit submit.

Good point. But if I'm not mistaken HTML 4.01 already says that some things do not apply to UAs that can't implement them. I think you should at least change this to a "should", and add a note to the spec that explains that this means user-agents that can should, and only those that can't are excused. (To be clear: my argument is that the spec should do its best to avoid giving user-agents an excuse to not bother giving the user (easy) control.)

[...]

On Fri, 23 Jun 2006, Lachlan Hunt wrote:
I don't particularly like giving the authors any control over spell checking. For the majority of cases, I think browsers should become smart enough to know whether or not to enable/disable spell checking without any explicit author input, based on various heuristics (as I've written about before [1]). In other words, for most cases, authors should not need to use this attribute.

The request for this attribute came from a UA in the first place. This would seem to suggest they can't find a way to be smart enough, and would like author input.

Not meaning to be disrespectful, but "can't" suggests it's simply too difficult technically, while it could just as well be that they prefer to take the easier way out, for instance because they can't afford spending the necessary resources on this. That can be a perfectly valid argument for an individual browser vendor, but it's hardly a solid basis for an HTML spec. Especially when the request comes from a single browser vendor and everybody else seems to see more problems than benefits.

[...]

Ok, so how can we ensure that spell checking is disabled for GMail's To: line but enabled for its Subject line?
Ordinarily, input type=email would handle no spell checking for email addresses, but given that Gmail uses a textarea that contains both people's names and email addresses, that may be one case where heuristics may not give optimal results.

Indeed. So how should we do it, if not using an attribute to hint to the UA whether it should be enabled or not?

I can't follow this. If a site uses 2 types of content within the same field and wants one of those types to be spellchecked, and the other not, how is a |spellcheck| attribute going to help? They'll need to split those 2 types of content into 2 different fields. (That aside, I don't see how users would have a problem with spellcheckers indicating spell errors on email addresses or names. Surely they're already used to ignoring that.)

[...] since the entire discussion here was spawned by one such browser vendor saying "we need a way for authors to control this!", I would suggest that browser vendors have determined that they need a way for authors to control this. :-)
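
For illustration only, the To:/Subject case discussed above might be marked up roughly as in the sketch below. It assumes the spellcheck= attribute with true/false values, as in the proposal under discussion. The form action, field names and ids are hypothetical and are not taken from GMail.

```html
<!-- Hypothetical compose form; action, names and ids are illustrative only. -->
<form action="/send" method="post">
  <!-- Recipients: names and addresses would mostly be flagged as misspelt,
       so the author hints that checking should be off. -->
  <label for="to">To:</label>
  <textarea id="to" name="to" spellcheck="false"></textarea>

  <!-- Subject: ordinary prose, so the author hints that checking is wanted. -->
  <label for="subject">Subject:</label>
  <input id="subject" name="subject" type="text" spellcheck="true">

  <button type="submit">Send</button>
</form>
```

In this sketch the attribute is only a hint from the author; whether the UA or the user can override it is exactly what the thread is arguing about.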