Re: [whatwg] Spellchecking mark III

2009-02-26 Thread Ian Hickson

Apologies in advance if this covers old ground, it appears I missed some 
e-mails in the last round of e-mails about this topic.

On Tue, 30 Dec 2008, Anne van Kesteren wrote:
 
 Opera wants to support this feature as well in due course, so I don't 
 think we would mind it being added to HTML5. Does it being in Chrome 
 mean it is also WebKit? If so, together with Firefox support, seems like 
 a compelling reason to add the feature.

On Tue, 30 Dec 2008, Maciej Stachowiak wrote:
 
 The Google Chrome team has not submitted patches for such a feature to 
 WebKit. I am not sure if they plan to eventually submit it to mainline 
 WebKit. In fact, this is the first I've heard about Chrome having such 
 an extension.
 
 It's not clear to me whether the feature is useful without seeing some 
 motivating examples. WebKit by default spellchecks (and grammar checks) 
 all editable parts of the document, and it is not obvious to me why one 
 would want to force it off for particular form controls or editable HTML 
 areas.

On Tue, 30 Dec 2008, Tab Atkins Jr. wrote:
 
 Agreed.  This feature lives purely in user-space.  It can be convenient 
 for a user to be able to turn off spellchecking globally, or perhaps 
 even locally (FF exposes this currently through a right-click option on 
 editable areas), but I cannot see any reason for an author to have 
 control over this.  If I want to spellcheck an area, I want to 
 spellcheck it.  If I don't, I don't.

On Tue, 30 Dec 2008, Kornel Lesiński wrote:
 
 It's useful for fields that contain non-textual content, e.g. product 
 ID, license plate number, CAPTCHA answer, etc. Browser would mark 
 these as misspelt, which might be confusing or at least distracting.

[snip more discussion back and forth about whether it's a good idea or 
not, or whether we could come up with some heuristics for it instead]

Based on the interest (not uniform interest, but interest nonetheless) on 
this topic, I've left the feature in the spec.

I don't think that heuristics would work -- in practice, little 
distinguishes the subject line from the To: line in GMail, for instance, 
but one wants spell checking and the other does not.


On Wed, 31 Dec 2008, Maciej Stachowiak wrote:
 
 The proposal Hixie linked seems way overengineered for this purpose. 

Yeah, it's certainly not the simplest thing that could have been invented, 
I'll give you that.


 First, it allows spellchecking to be explicitly turned on, potentially 
 overriding normal defaults, but that seems wrong; an <input 
 type=email> should never spellcheck regardless of what the page author 
 says.

The user agent is allowed to override the author here, if desired.

The applicability to <input type=email> fields is mostly just a 
side-effect of the attribute applying to everything, which is because we 
want it to apply to contentEditable. The true state is so that subparts 
of contentEditable fields can have checking enabled when outer parts have 
it disabled.


 I can't see any valid use case for the author turning spellchecking on 
 regardless of UA defaults or user preferences. Second, it allows 
 spellchecking to be controlled at a finer granularity than editability, 
 for which again I think there is no valid use case. Both of these 
 aspects make the feature more complicated to implement and harder to 
 understand, compared to just having a way to only disable spellchecking 
 at the same granularity as editing.

In contentEditable, it's easy to imagine that some parts shouldn't be 
spellchecked when others should, e.g. the editor might introduce a URL and 
not want that checked.


On Wed, 31 Dec 2008, Kornel Lesiński wrote:
 
 I don't like current proposal either, because true/false value is 
 inconsistent with other boolean attributes in HTML.

It's consistent with contentEditable, which it's intended to be used with.


 IMHO it should be nospellcheck=nospellcheck (which also solves problem 
 of forcing spellchecking where it doesn't make sense).

That's a pretty ugly attribute name, though.


On Thu, 1 Jan 2009, Robert O'Callahan wrote:
 
 A use case is editable program code, where spellchecking is disabled, 
 but where spellchecking is enabled inside comments. Maybe that sounds a 
 little far-fetched for today's Web applications, but some IDEs (e.g. 
 Eclipse) support this so it seems like something we'd want in the 
 future.

BeSpin, for instance, might want this, if they ever switch from canvas 
to contentEditable.
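
As a rough illustration of the nesting behaviour discussed above (an outer editable area with checking off, and a sub-part with checking on), here is a minimal sketch. The EditNode type and the effectiveSpellcheck function are hypothetical names invented for this example, not part of any spec or browser API, and the sketch ignores the user agent's right to override the author.

```typescript
// Hypothetical minimal model of an element; not a real DOM type.
interface EditNode {
  spellcheck?: "true" | "false"; // attribute keyword, absent when unset
  parent?: EditNode;
}

// Walk up from the node until an explicit keyword is found; fall back to the
// user agent's default when no ancestor sets one.
function effectiveSpellcheck(node: EditNode, uaDefault: boolean): boolean {
  let current: EditNode | undefined = node;
  while (current !== undefined) {
    if (current.spellcheck === "true") return true;
    if (current.spellcheck === "false") return false;
    current = current.parent;
  }
  return uaDefault;
}

// The code-editor case: checking off for code, on inside comments.
const editor: EditNode = { spellcheck: "false" };
const codeLine: EditNode = { parent: editor };
const comment: EditNode = { spellcheck: "true", parent: editor };
```

Here effectiveSpellcheck(codeLine, true) inherits the editor's "false", while effectiveSpellcheck(comment, false) is enabled by its own "true" keyword regardless of the default passed in.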


On Wed, 31 Dec 2008, Maciej Stachowiak wrote:
 
 So I don't think this makes for a very compelling use case. It's like 
 arguing for a page layout feature based on something only WordPerfect 
 does.

I agree that it seems a bit overpowerful. Experience from Gecko suggests 
it's not all that bad though.


On Sat, 14 Feb 2009, Kristof Zelechovski wrote:

 The following sentences are *commands* that refer to browser actions:
 
   Let automatic completion be turned _on_. (command)
   Let spell checking be turned 

Re: [whatwg] Spellchecking mark III

2009-02-13 Thread Ian Hickson
On Thu, 12 Feb 2009, Kristof Zelechovski wrote:

 Regarding http://html5.org/tools/web-apps-tracker?from=2800&to=2801, my 
 requests:
 
 1. Change the literals true/false to on/off, leaving the DOM values
 Boolean.

There are three of these attributes so far:

  autocomplete = on/off
  contenteditable = true/false
  draggable = true/false

I used true/false for spellcheck since it had slightly more other 
attributes doing the same thing.

Also, it's been implemented twice now, so using other keywords is a 
problem.
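
To make the keyword question concrete, here is a simplified sketch of mapping the content-attribute keywords listed above onto a Boolean DOM value. The KEYWORDS table and the parseKeyword function are hypothetical names for illustration only; the sketch assumes case-insensitive keyword matching and deliberately ignores details such as empty-string handling.

```typescript
// Keyword pairs as listed in this message: [enabled keyword, disabled keyword].
const KEYWORDS: { [attr: string]: [string, string] } = {
  autocomplete: ["on", "off"],
  contenteditable: ["true", "false"],
  draggable: ["true", "false"],
  spellcheck: ["true", "false"],
};

// Returns true/false for a recognised keyword, or null when the attribute is
// unknown, missing, or has an unrecognised value, which leaves the decision
// to the user agent's default.
function parseKeyword(attr: string, value: string | null): boolean | null {
  if (!Object.prototype.hasOwnProperty.call(KEYWORDS, attr)) return null;
  if (value === null) return null;
  const pair = KEYWORDS[attr];
  const v = value.toLowerCase();
  if (v === pair[0]) return true;
  if (v === pair[1]) return false;
  return null;
}
```

With this table, parseKeyword("spellcheck", "off") yields null rather than false, which is the kind of mismatch an author would hit if the keyword sets differed between attributes.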


 2. Check the spelling of the passage (asits!) :0)

Fixed.


 3. Say that the default behavior for BODY is on and the default behavior
 for INPUT[type=text] is off.

The default behavior is user-agent-dependent. This is intentional since 
different users may have different needs.


 4. (I understand that it is implicit that this SHOULD indicate does 
 not make tiny clients that do not have the resources non-compliant?)

Correct.


 Stretching it a bit, a user's language always matches the site's, 
 otherwise the user would not be able to submit to the site anything that 
 makes sense, except when the site is a gateway for submissions to an 
 uninvolved third party, in which case said submissions should be tagged 
 with the language of submission anyway (IMHO).

On Thu, 12 Feb 2009, Bil Corry wrote:

 Let me give you an example where this isn't true.  I'm in the United 
 States and I do contract work for a company in Germany.  At the German 
 company, they have an internal bug tracker for their intranet 
 applications.  Usually the bug descriptions are written in German, 
 except mine, which are in English.  So they will submit bugs in both 
 German and in English, depending on who is taking care of the issue.
 
 How do you envision the UA will determine which language the user is 
 writing in?  And what happens when the user submits both German AND 
 English, for two audiences?

On Thu, 12 Feb 2009, Kristof Zelechovski wrote:

 The server has two ways of knowing the user's preferred language: the 
 user's preferences and the browser settings, in that order.

 Submitting in two languages usually needs two controls, one for English 
 and one for German, with appropriate markup.  The server must be 
 prepared to handle this use case.

On Thu, 12 Feb 2009, Aryeh Gregor wrote:
 
 Both of which are often wrong.  Users may be multilingual, and multiple 
 users may use the same computer.  On the forum I administer, I post 
 almost exclusively in English.  However, sometimes I find occasion to 
 write a post partly or wholly in Hebrew.  How is the site supposed to 
 know when I'll decide to do that before I even start typing the post?  
 How can the site ever be sure what language the user will type until he 
 actually starts typing?
 
 The server might be able to make an educated guess as to what language 
 will be entered, but so can the browser.  And the browser is in a *much* 
 better position to check that guess, because it has access in real time 
 to the actual text the user is typing, plus the user interface language, 
 and -- of course -- any lang= or xml:lang= attributes specified in the 
 HTML.  Ergo, the logic should be left up to the browser.

On Thu, 12 Feb 2009, Kristof Zelechovski wrote:

 The language attribute can be changed at run time if needed.  It 
 requires an additional event that can be called langmismatch.  Of 
 course, a more traditional selector is also a solution.  If the site is 
 primary English, with Hebrew fragments here and there, it is not much 
 harm that the fragments are considered spelling errors (although, in the 
 case of English/Hebrew bilingualism, it is unlikely because the 
 character set is different). In short, the user agent is allowed to use 
 whatever AI it is equipped with.
 
 Markup for German AND English submissions at the same time, as per your 
 request:

 <LABEL LANG=de>Inhalt: <TEXTAREA NAME=INHALT></TEXTAREA></LABEL>
 <LABEL LANG=en>Contents: <TEXTAREA NAME=CONTENTS></TEXTAREA></LABEL>

On Thu, 12 Feb 2009, Bil Corry wrote:
 
 In my case, we have a single field, "bug description", that may contain 
 both English and German.  And in some cases, even a pure German bug 
 report may reference the English form fields, such as:
 
   Legen Sie "City" vor "Postal Code"
 
 In that case, there is no way for a UA or Server to auto-determine the 
 language, even if you're aware the user speaks both German and English.
 
 My suggestion is to leave the lang attribute out of the spec, and let 
 the UA handle it as it wants.

On Thu, 12 Feb 2009, Křištof Želechovski wrote:

 Having interjected words marked as spelling errors is not a failure.  
 The same phenomenon occurs with proper names and you cannot help that. 
 The UI you described is inconsistent and it should be fixed.  The 
 control for German should be labeled Fehlerbeſchreibung or whatever.

On Thu, 12 Feb 2009, Kristof Zelechovski wrote:

 I do not know much about UI standards but the rule that the answer 
 

Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Ian Hickson

The discussion on spellcheck= focused on two ideas; using spellcheck= 
mostly as specced here:

   http://damowmow.com/playground/spellcheck.txt

...and doing something with lang=. The idea of using lang= had 
problems that were pointed out by several people, most notably, the issue 
that the user's language doesn't always match the site's. I think this 
makes it inappropriate for this use.

I have added spellcheck= to the spec.

If there's anything in the feedback that I missed, please let me know. I 
read every e-mail but there didn't seem to be anything specific that I 
should comment on.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Kristof Zelechovski
Regarding http://html5.org/tools/web-apps-tracker?from=2800&to=2801, my
requests:

1. Change the literals true/false to on/off, leaving the DOM values
Boolean.
2. Check the spelling of the passage (asits!) :0)
3. Say that the default behavior for BODY is on and the default behavior
for INPUT[type=text] is off.
4. (I understand that it is implicit that this SHOULD indicate does not
make tiny clients that do not have the resources non-compliant?)

Stretching it a bit, a user's language always matches the site's, otherwise
the user would not be able to submit to the site anything that makes sense,
except when the site is a gateway for submissions to an uninvolved third
party, in which case said submissions should be tagged with the language of
submission anyway (IMHO).

Best regards,
Chris







Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Bil Corry
Kristof Zelechovski wrote on 2/12/2009 6:24 AM: 
 Stretching it a bit, a user's language always matches the site's, 
 otherwise the user would not be able to submit to the site anything 
 that makes sense, except when the site is a gateway for submissions 
 to an uninvolved third party in which case said submissions should be
 tagged with the language of submission anyway (IMHO).

Let me give you an example where this isn't true.  I'm in the United States and 
I do contract work for a company in Germany.  At the German company, they have 
an internal bug tracker for their intranet applications.  Usually the bug 
descriptions are written in German, except mine, which are in English.  So they 
will submit bugs in both German and in English, depending on who is taking care 
of the issue.

How do you envision the UA will determine which language the user is writing 
in?  And what happens when the user submits both German AND English, for two 
audiences?


- Bil



Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Kristof Zelechovski
The server has two ways of knowing the user's preferred language: the user's
preferences and the browser settings, in that order.
Submitting in two languages usually needs two controls, one for English and
one for German, with appropriate markup.  The server must be prepared to
handle this use case.
HTH,
Chris






Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Aryeh Gregor
On Thu, Feb 12, 2009 at 8:57 AM, Kristof Zelechovski
giecr...@stegny.2a.pl wrote:
 The server has two ways of knowing the user's preferred language: the user's
 preferences and the browser settings, in that order.

Both of which are often wrong.  Users may be multilingual, and
multiple users may use the same computer.  On the forum I administer,
I post almost exclusively in English.  However, sometimes I find
occasion to write a post partly or wholly in Hebrew.  How is the site
supposed to know when I'll decide to do that before I even start
typing the post?  How can the site ever be sure what language the user
will type until he actually starts typing?

The server might be able to make an educated guess as to what language
will be entered, but so can the browser.  And the browser is in a
*much* better position to check that guess, because it has access in
real time to the actual text the user is typing, plus the user
interface language, and -- of course -- any lang= or xml:lang=
attributes specified in the HTML.  Ergo, the logic should be left up
to the browser.

 Submitting in two languages usually needs two controls, one for English and
 one for German, with appropriate markup.  The server must be prepared to
 handle this use case.

I don't understand what you mean here.


Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Kristof Zelechovski
The language attribute can be changed at run time if needed.  It requires an
additional event that can be called langmismatch.  Of course, a more
traditional selector is also a solution.  If the site is primary English,
with Hebrew fragments here and there, it is not much harm that the fragments
are considered spelling errors (although, in the case of English/Hebrew
bilingualism, it is unlikely because the character set is different).
In short, the user agent is allowed to use whatever AI it is equipped with.

Markup for German AND English submissions at the same time, as per your
request:
<LABEL LANG=de>Inhalt: <TEXTAREA NAME=INHALT></TEXTAREA></LABEL>
<LABEL LANG=en>Contents: <TEXTAREA NAME=CONTENTS></TEXTAREA></LABEL>

HTH,
Chris





Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Bil Corry
Kristof Zelechovski wrote on 2/12/2009 9:05 AM: 
 Markup for German AND English submissions at the same time, as per your
 request:
 <LABEL LANG=de>Inhalt: <TEXTAREA NAME=INHALT></TEXTAREA></LABEL>
 <LABEL LANG=en>Contents: <TEXTAREA NAME=CONTENTS></TEXTAREA></LABEL>

In my case, we have a single field, "bug description", that may contain both 
English and German.  And in some cases, even a pure German bug report may 
reference the English form fields, such as:

Legen Sie "City" vor "Postal Code"

In that case, there is no way for a UA or Server to auto-determine the 
language, even if you're aware the user speaks both German and English.

My suggestion is to leave the lang attribute out of the spec, and let the UA 
handle it as it wants.


- Bil



Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Křištof Želechovski
Having interjected words marked as spelling errors is not a failure.  The same 
phenomenon occurs with proper names and you cannot help that.
The UI you described is inconsistent and it should be fixed.  The control for 
German should be labeled Fehlerbeſchreibung or whatever.
Best regards,
Chris

-Original Message-
From: whatwg-boun...@lists.whatwg.org [mailto:whatwg-boun...@lists.whatwg.org] 
On Behalf Of Bil Corry
Sent: Thursday, February 12, 2009 5:05 PM
To: wha...@whatwg.org
Subject: Re: [whatwg] Spellchecking mark III

Kristof Zelechovski wrote on 2/12/2009 9:05 AM: 
 Markup for German AND English submissions at the same time, as per your
 request:
 <LABEL LANG=de>Inhalt: <TEXTAREA NAME=INHALT></TEXTAREA></LABEL>
 <LABEL LANG=en>Contents: <TEXTAREA NAME=CONTENTS></TEXTAREA></LABEL>

In my case, we have a single field, "bug description", that may contain both 
English and German.  And in some cases, even a pure German bug report may 
reference the English form fields, such as:

Legen Sie "City" vor "Postal Code"

In that case, there is no way for a UA or Server to auto-determine the 
language, even if you're aware the user speaks both German and English.

My suggestion is to leave the lang attribute out of the spec, and let the UA 
handle it as it wants.


- Bil




Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Bil Corry
Křištof Želechovski wrote on 2/12/2009 10:15 AM: 
 The UI you described is inconsistent and it should be fixed.

Inconsistent with which UI standard?


- Bil



Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Kristof Zelechovski
I do not know much about UI standards but the rule that the answer should be
formulated in the language of the question is rather straightforward.  It is
just common sense.  Exceptions are questions like "How is that in German?".
Chris






Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Bil Corry
Kristof Zelechovski wrote on 2/12/2009 11:06 AM: 
 I do not know much about UI standards but the rule that the answer should be
 formulated in the language of the question is rather straightforward.  It is
 just common sense.  Exceptions are questions like "How is that in German?".

No one can control the language a user will choose to use in a textarea, 
regardless of the label used to describe it.

Providing a localized textarea for every language might increase the odds of 
the user using the language the server prefers, but there is no guarantee.  And 
I'm unclear what problem that would ultimately solve.


- Bil



Re: [whatwg] Spellchecking mark III

2009-02-12 Thread Kristof Zelechovski
The majority of users will answer the question in the language of the
question, this is the normal reaction.  Of course there is no guarantee but
the odds of getting the expected result are high.  Assuming that the user's
input will actually be read by somebody, providing proper markup will help
the readers to get something they are able to read.
Chris





Re: [whatwg] Spellchecking mark III

2009-01-28 Thread Peter Kasting
On Wed, Jan 28, 2009 at 2:35 AM, Křištof Želechovski k...@mimuw.edu.pl wrote:

 No, the _original_ use was to turn it on on fields where it would
 otherwise have been on.



 I do not understand.  If spell checking would be on, why turn it on
 explicitly?

I mistyped.  The last word should have been off.

 If the control is not expected to contain a private language, it should be
 subject to spell checking.

This thread has already had multiple examples of cases where this is untrue.
 Spelling quizzes, address fields, etc.  And even if it were true, it's not
the way browsers behave today (e.g. Firefox does not spellcheck single-line
fields, precisely to avoid a lot of cases like this), and changing those
defaults to be something non-annoying, using complex heuristics, is
significantly harder (in terms of your time/money cost below) compared to
supporting the attribute.

 Avoiding an additional attribute is a gain,

  Why?

 Because adding an additional attribute costs time and money.

To whom?  What tradeoff are you making?  Keying spellchecking off language
support costs engineering time too, for the UA.  And for a web author.  All
changes have costs.  The point here seems like a vague principle rather than
a specific application.

   Which no one will ever use, because users aren't going to take the
 trouble to declare such a thing when human recipients can just _read the
 text_.  After all, WE have built-in language detectors in our heads.

 We disagree here but further discussion is void unless you have the
 resources necessary to perform an investigation of the subject.

If you need data to prove that people will not make the effort to explicitly
tell recipients what languages their messages are in, I offer you the entire
history of written communication, where people don't say "By the way this is
in English!" at the top of each letter.

 Users entering text in a foreign language cause trouble to the forum
 moderators who have to discipline them.  Thus, the software could
 accommodate to the needs of the moderator, so that the poster gets warned
 before posting, not admonished afterwards.  This is more convenient and less
 work for everyone.  Providing an indication what language is recommended by
 forum users is good, because most users would take that into account (for
 fear of getting plonked, if not for good manners).

How is this relevant to a discussion about spellchecking?  If you want
UA-based language detection facilities that are, say, accessible from JS,
that may be a reasonable request, but like much of this discussion, it seems
tangential.

PK


Re: [whatwg] Spellchecking mark III

2009-01-28 Thread Peter Kasting
On Wed, Jan 28, 2009 at 10:27 AM, Křištof Želechovski k...@mimuw.edu.pl wrote:

  Spelling quizzes are an artificial example; they are not interesting once
 spell checking is commonly available because the user can cheat by
 temporarily using another control that is being checked.

They can cheat today by pasting something into Microsoft Word.  Or looking
it up in a dictionary.  That doesn't mean there's no value in this.  There
are many internet quizzes where you can cheat by looking answers up with a
search engine, but they're still fun and wildly popular.  Your argument that
because people could cheat no author would, or should, want to write such a
thing does not seem supported by evidence.

  Address fields contain data in a technical language, not in a natural
 language.  Of course, the browser can support technical languages by
 checking the syntax and validity of data as well (e.g. matching the zip code
 against the place using an external database).

This seems rather far afield from spellchecking.  There's a whole section of
the spec (forms) that deals with validation of various kinds of form input.
 It's separate from spellchecking for a reason: the algorithms are
completely different (and a potential rabbit-hole).  From my perspective as
a UA author that actually writes the code to do this stuff, you're
conflating too many kinds of input validation here.

 People do not say "this is English" but machines do (Content-Language MIME
 header).

And that header content is not generally set by explicit user action.  In
fact, it's often not set at all, or set incorrectly.  Hoping that this will
change seems naive.

 I want incorrect input, including input in an unexpected language, to be
 marked as such by the spell checker.

I already said that having a general-purpose, JS-accessible language
detector might be a good thing.  It would certainly be a necessary thing for
this request.  Once one had it, the request would be better addressable
without touching the behavior of the browser's spellchecker at all, because
the author could use the output of the language detector to display any
message or take any action he desired, rather than simply having the UA draw
a line under every word and thus look completely broken.
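
To illustrate how an author might use such a detector, here is a small sketch. The DetectLanguage signature, the languageWarning function, and the toyDetector stand-in are all hypothetical and exist only for this example; the sketch assumes some detector is supplied by the caller and does not describe any existing browser API.

```typescript
// Hypothetical detector signature: returns a language tag such as "en" or
// "de", or null when it cannot decide. This is an assumption for the sketch.
type DetectLanguage = (text: string) => string | null;

// Builds an author-controlled message instead of relying on the spellchecker
// to underline every word; returns null when there is nothing to report.
function languageWarning(
  text: string,
  expected: string,
  detect: DetectLanguage
): string | null {
  const detected = detect(text);
  if (detected === null || detected === expected) return null;
  return (
    "This forum expects posts in '" + expected +
    "', but this text looks like '" + detected + "'."
  );
}

// Stand-in detector for illustration only: treats any text containing
// "der", "die" or "das" as a whole word as German, everything else as English.
const toyDetector: DetectLanguage = (text) =>
  /\b(der|die|das)\b/i.test(text) ? "de" : "en";
```

The point of the design is that the page decides what to do with the result (show a hint, ask for confirmation, or nothing at all), while the spellchecker itself is left untouched.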

What you want is better accomplished by means other than what you propose,
and what you propose does not address the use cases for the spellcheck
attribute.  I'm not sure we can reach further agreement, so I leave this
subthread in Hixie's hands.

PK


Re: [whatwg] Spellchecking mark III

2009-01-27 Thread Peter Kasting
2009/1/27 Křištof Želechovski k...@mimuw.edu.pl

  The original use of the spellcheck attribute was to switch spell checking
 off

No, the _original_ use was to turn it on on fields where it would otherwise
have been on.

 (I think we both believe it should generally be on).  Using a private
 language for the control would do the trick equally well, without
 introducing a new attribute.

It wouldn't do it equally well, since semantically, it would mean "this is
of language 'private'", which will be strictly inaccurate.

   Avoiding an additional attribute is a gain,

Why?

 If the language detection libraries are as good as you claim, why is
 Firefox unable to use them in a way that is not annoying?

Because no one has had the time or energy to devote to this?  I have worked
full-time on browsers for a number of years now and have never seen any team
with the time to fix all the things that could or should be fixed.

 As I have already mentioned, GMail should provide an option for the sender
 to inform the recipient about the language used in the message, not for the
 client-side spell checker, but for the recipient.

Which no one will ever use, because users aren't going to take the trouble
to declare such a thing when human recipients can just _read the text_.
 After all, WE have built-in language detectors in our heads.

 We can drop the suggestion language=auto if you wish, but it would be an
 explicit way of informing the user that he is allowed to enter text in any
 language he pleases.

As if users aren't going to just enter whatever language they please into
any field they wish?  We design software that has to accommodate people, not
the other way around :)

I have no idea whether there are better things web apps and UAs can do
w.r.t. communicating what languages are used where.  All I know is that both
in the abstract and practically, whether I want a field spellchecked by
default is a distinct concern from which language(s) would be used to
spellcheck it.  Therefore I continue to see the spellcheck attribute as
distinct from (though possibly complementary to) language.

PK


Re: [whatwg] Spellchecking mark III

2009-01-26 Thread Peter Kasting
2009/1/26 Křištof Želechovski k...@mimuw.edu.pl

  Q: Should the localization influence the spell checking mechanism?

 A: Definitely, since the user is likely to write most messages in his
 preferred UI language.

Which is why this is a perfectly valid input for the heuristic the UA uses
to determine the checking language.

 Q: Is GMail a use case for having spell check without specifying a language
 to check against?

 A: No, it is not.

You don't provide any reason why not.  "The user is likely to write most
messages in his preferred UI language" (which is not true of all users, but
leaving that aside) does not imply "the user will write all messages
exclusively in his preferred UI language".  Therefore gmail cannot
(correctly) specify the spellchecking language of editable fields.
Therefore the UA must decide.  Unless the probable input language of a
particular field differs from that of the rest of the page, there's no
reason for gmail to specify the probable input language of that field.
There is no benefit to conflating this concept with "should this field be
spellchecked".

 Q: In case when the user decides to use another language, is the user agent
 free to detect it?

 A: Yes, it is, unless the language specified is private, which means the
 field was not intended for checking.

Again, this is needless conflation.  You gain nothing, and lose both clarity
and flexibility, by mapping "don't spellcheck" to "specify the language as
private" in this way.  In terms of the semantics of the page, this is
extremely confusing, since whether a field should be spellchecked and what
language it's in are nearly orthogonal concepts.

 Q: When the language recognition technology advances to an acceptable
 state, will it be possible to extend the language attribute to explicitly
 request automatic identification of the language?

 A: Yes, it is.  Just specify lang=auto or whatever is agreed upon.

There is no benefit to forcing authors to say lang=auto.  What have you
gained?  What if they _don't_ say this?  (The HTML5 spec must still say what
the UA behavior is.)

Language detection libraries today are already extremely good, far more
reliable than anything explicitly set ahead of time by authors _or_ users.
 Unless I am completely misunderstanding you, I think your suggestions fail
to solve the original use cases for the spellcheck attribute, add needless
burden on web authors, and would be completely ignored by UAs who wished to
provide a good user experience.

PK


Re: [whatwg] Spellchecking mark III

2009-01-25 Thread Peter Kasting
On Sun, Jan 25, 2009 at 10:52 AM, Křištof Želechovski kri...@wp.pl wrote:

 Gmail can use
 1. the localisation preferences chosen by the user in GMail configuration,
 2. the localisation preferences chosen by the user in the browser
 configuration
 to determine what language the user is likely to use in the subject
 field.
 (Generally, it should be the same language as the Subject label is in.)
 If the user incidentally sends a message in another language, the Web
 browser can recognize the language after the subject is typed, as described
 before.


But your original claim was that the web author, not the UA, should have the
ability to force a particular language for spellchecking -- and that the
"spellcheck" attribute was worthless outside this, as what authors needed
was a way to force the spellcheck _language_, not simply its presence.  Now
you seem to be reversing your comments and indicating that perhaps the UA
may end up knowing better what language to use (e.g. because the user types
in another language), which is what I was saying all along.  And none of
this gives any support for the idea that "spellcheck" as an attribute is not
useful for gmail!  Why should gmail have to try and guess what language the
user will be typing emails in?  Isn't it instead desirable to tell the UA
"if you can figure out the right language here, then go ahead and spellcheck
this field" and leave everything else in the hands of the UA?

PK


Re: [whatwg] Spellchecking mark III

2009-01-22 Thread timeless
On Wed, Jan 21, 2009 at 4:51 PM, Aryeh Gregor simetrical+...@gmail.com wrote:
 In practice, I think the only way to avoid this problem is for
 browsers to implement content-sniffing techniques of some kind to
 figure out the language, at least per field but ideally on a
 word-by-word basis.  If the browser is set to spellcheck in English
 but you start putting in lots of non-Latin characters and every word
 is therefore misspelled, the browser should be clever enough to try
 switching the spellcheck language, or at least disabling spellcheck
 for words that can't possibly be from the language it's checking
 against.  More refined heuristics could detect even subtle
 differences, like between British and American English, and remember
 for next time which one the user usually types in.

this is approximately what I'm hoping to see implemented for Firefox.
I haven't worked on the spell checking code recently, but it's what I
feel is necessary having worked in an organization where the default
language and the used language don't match. The result is everyone
either ignores or turns off spell checking. I'm hoping to either find
someone to implement this, or implement it myself. Either way, with
this implemented, my employer would eventually update my coworkers'
browsers to such a Firefox, and then I can hope they will get more
useful feedback and actually pay attention to their typing.

-Yes, I'm aware that this is a pipe dream. I need this dream.

 None of this needs, or even could effectively use, author intervention:

 1) The author cannot know what languages users will want to enter in
 all cases.  I've sometimes found myself writing posts in Hebrew on
 English-only sites, for instance.

 2) The author certainly won't be able to determine the dialect or
 variant of the language the user will want to use, which is necessary
 for spellcheck.

 3) Authors should not have to add extra markup if it's not really
 necessary, because in practice, most won't.  To be as useful as
 possible, spellcheck should Just Work without explicit author
 intervention.


Re: [whatwg] Spellchecking mark III

2009-01-22 Thread Calogero Alex Baldacchino

Peter Kasting wrote:
On Wed, Jan 21, 2009 at 7:38 PM, Calogero Alex Baldacchino 
alex.baldacch...@email.it wrote:


Why not let the user choose the language, as happens in word
processors? A UA can't choose accurately whether, for instance,
"color" is correct American English, wrong British English, or
even a correct (truncated) Italian word, while a human can do it
better, thus a UA could provide an interface to change the
language for a selection spellchecking, or even for each misspelled
word, starting from a hint language, which could be the value of
an element lang attribute (beside a default value and a
user-preference forced one - the latter bypassing any authored
value). Also, using the lang attribute value as the start
language to check (if not in contrast with a user preference)
would allow an interactive interface with a script changing that
value according to a user's choice (UAs could also expose a list
of supported languages).


I'm not sure I fully grasped everything here, but what I did grasp 
sounds very much like a cross between what Chromium is doing today and 
what we want to do in the future (I imagine similar things are true 
for other browser vendors).  User specification and page hints are 
both useful tools for a UA.


But I still claim that all of those aspects are outside the scope of 
the spellcheck attribute, and fall into the realm of things that 
should not be in the HTML5 spec as they're very much UA-specific 
behavior.


PK


Probably. However, establishing that the lang attribute is the 
first-choice language to check (which wouldn't prevent the UA from 
providing other choices, or just ignoring such behaviour due to a user 
preference, or using other dictionaries too -- and that might be 
suggested in a note on usability, I guess), I mean, would allow a webapp 
to emulate those functionalities to some extent, just setting a 
different value for the lang attribute of a contenteditable box and some 
of its subregions through a script at the user whim (that is, let's do 
it through script until UAs provided a better solution, which could be 
hinted by scripting hacks based on the lang and spellcheck 
attributes working together at the same granularity).


I think that control over the language to check can improve 
spellchecking at the same granularity as the spellcheck attribute, whereas it 
can't harm end users more than a wrong assumption about spellchecking. A 
user would notice wrong checking that does not match the language he's using, 
and could disable it or do whatever else a UA allows him to do (though 
it would be annoying); on the other hand, a user might not notice that 
spellchecking is disabled in a certain area, and might not be aware of his 
errors, unless the UA informed him somehow (about spellchecking being 
turned off). Therefore, special care by UAs is needed in both cases, 
yet both features can improve webapps providing a rich and/or 
specialized editor (such as a code editor, where disabling spell 
checking but for comments may make sense), so why not consider both of 
them, since they're related?


Also, implementation and usage experience could suggest whether it is 
worth exposing UAs' supported languages through DOM APIs (e.g. to allow 
a webapp to create a dynamic list of checking-available languages, to 
avoid static lists being either incomplete, or too long and possibly 
including unsupported languages), and this would affect either the 
Window or the Navigator interface (or something else in HTML5 scope).


Everything, IMHO.

WBR, Alex




Re: [whatwg] Spellchecking mark III

2009-01-22 Thread Kornel Lesiński
Probably. However, establishing that the lang attribute is the  
first-choice language to check (which wouldn't prevent the UA from  
providing other choices, or just ignoring such behaviour due to a  
user preference, or using other dictionaries too -- and that might  
be suggested in a note on usability, I guess), I mean, would allow a  
webapp to emulate those functionalities to some extent, just setting  
a different value for the lang attribute of a contenteditable box  
and some of its subregions through a script at the user whim (that  
is, let's do it through script until UAs provided a better solution,  
which could be hinted by scripting hacks based on the lang and  
spellcheck attributes working together at the same granularity).


I don't think that applications need ability to precisely control  
spell-checking language. Browser knows best which dictionaries are  
available, and can auto-detect language based on user's preferences,  
page's language and text itself. You can expect that browsers will  
have much more sophisticated and reliable language detection than web  
apps (that's an area where browsers can freely compete).


Many of your suggestions are just implementation details, which HTML  
shouldn't specify precisely (it could force browsers to use method  
that is suboptimal). HTML just needs to offer reasonable way to  
implement good heuristics, and I think existing lang, input types and  
spellchecking attribute are sufficient.


--
regards, Kornel





Re: [whatwg] Spellchecking mark III

2009-01-22 Thread Calogero Alex Baldacchino

Kornel Lesiński wrote:
Probably. However, establishing that the lang attribute is the 
first-choice language to check (which wouldn't prevent the UA from 
providing other choices, or just ignoring such behaviour due to a 
user preference, or using other dictionaries too -- and that might be 
suggested in a note on usability, I guess), I mean, would allow a 
webapp to emulate those functionalities to some extent, just setting 
a different value for the lang attribute of a contenteditable box and 
some of its subregions through a script at the user whim (that is, 
let's do it through script until UAs provided a better solution, 
which could be hinted by scripting hacks based on the lang and 
spellcheck attributes working together at the same granularity).


I don't think that applications need ability to precisely control 
spell-checking language. Browser knows best which dictionaries are 
available, and can auto-detect language based on user's preferences, 
page's language and text itself. You can expect that browsers will 
have much more sophisticated and reliable language detection than web 
apps (that's an area where browsers can freely compete).




Browsers can't do better than word processors, which are the state of 
the art in... word processing. At most, browsers can do as well, and, 
to some extent, word processors don't use heuristics while you're 
typing, because no heuristics can guess whether you're *purposely* 
switching between dialects (such as British and American English), or if 
you just misspelled a word (personally, I dislike even the automatic 
correction of common mistakes in word processors). Word processors make a 
choice when you start writing (or before, based on your installation 
language, for instance), and let you change it for the whole document or 
for each single word. I don't think any heuristic auto-detection can be 
better; instead, no language detection (and users' explicit choice) is more 
reliable than any sophisticated heuristics.


Turning spelling checking on or off makes sense if one can guess how the 
user agent would behave AND if the user agent can recover misuses, thus 
I believe that spellcheck is strictly related to the way a 
spellchecking language is detected and is half of the problem of 
controlling spellchecking. Otherwise, if it's thought that everything 
should be under the control of a UA, let's state that spellchecking must 
always be on and leave it at that, because being annoyed by wrong checking 
(e.g. because the heuristics fail, but it would be the same for a wrong 
lang value) is less harmful than thinking one is writing correct text 
while unaware that checking has been disabled by the author 
without asking one's permission. Yet, both lang and spellcheck 
attributes can be useful for the purpose of controlling spellchecking 
and improving a web-based word processor, and in both cases UAs can 
recover from misuses, somehow (e.g. allowing the user to bypass authors' 
choices).


Moreover, I think that interactive and script-aware UAs should act as a 
framework for web-based applications providing as much of a client-only 
application functionalities as possible, thus browsers should include 
new features when possible and reasonable (while trying not to become 
oversized). I agree that spellchecking is a good feature to support in a 
browser; I don't see why a web-based rich text editor should be 
prevented from controlling it on users' behalf, as it happens in word 
processors, given that it's about supporting an existing attribute (lang, 
which could be stated to be triggering UAs heuristics by default when 
unspecified for editable elements) and a new one (spellcheck) in 
conjunction for this purpose (also a list of supported dictionaries 
would be useful).


I also think that features which are not core functionalities for a UA 
should be provided in a basic version (for general use in web pages) and 
as building blocks for web applications, not in a complete version under 
a UA exclusive control (for instance, a UA could allow the user to 
change the language for some selected text through a context menu 
option, but the right place for an option allowing a (starting) choice 
valid for a whole editable element, in a rich text editor, should be the 
editor interface, which shouldn't be provided by a UA, as a whole or in 
part, or, if the UA provides it, it should be exposed to any webapp to 
be customized and enhanced). That's because a specific application can 
focus on a specific task usability better than its underlying, general 
purpose framework (like a browser is or should be for a web application).


Furthermore, if you agree that a page's language should be used to 
improve auto-detection, why not use an element's lang attribute 
too? With the benefit that it can be changed dynamically to please the user.


Many of your suggestions are just implementation details, which HTML 
shouldn't specify precisely (it could force browsers to use 

Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Mikko Rantalainen
Peter Kasting wrote:
 2009/1/20 Mikko Rantalainen mikko.rantalai...@peda.net
 
 I agree. I think that specifying the spellcheck attribute would be a
 mistake. It allows only forcing the automatic spell checking on or off
 but it doesn't help a bit to allow mixing different languages on a
 single page.
 
 I don't see how the second sentence is an argument for the first.

If the browser does not know the language of the content, how on earth
is it supposed to *correctly* spellcheck it? I'm daily hitting a
situation where browser is trying to spellcheck content with incorrect
language. I've toggled such automatic spellchecker off and those will
stay off until correct language is detected.

My second sentence was trying to argue that the page author has no
business forcing the spellchecking on if the page author cannot force
the spellchecking language! Especially for a case where the page
contains a mix of multiple languages.

 Just specify that spell checking must follow the content language.
 
 How many pages specify the content language?  AFAIK the farthest most
 authors get is to specify the encoding, and even that is frequently done
 wrong, and browsers have all kinds of crazy heuristics to try and
 second-guess authors.
 
 This seems like it would make spellchecking function very poorly on the web
 at large, whereas adding the spellcheck attribute at worst would not harm
 anyone.

I'm aware that many web pages do not specify content language. There
aren't many web pages forcing the spellchecking on or off, either.
Forcing a spellchecking on with incorrect language would harm the user!

It really does not make any sense to ever force spellchecking if the
language that the spellchecker uses is the incorrect one. The current
spellcheck attribute does not define any language and it seems that
the page author has no way to know if the spell checking should really
be disabled or not.

My point is that if the page does not specify the language then the
behavior should be explicitly undefined. This should not be changed. On
the other hand, if the content language is explicitly defined, then the
user agent has the required knowledge to decide if the spellchecking
should be enabled or disabled. There's no need for the spellcheck
attribute.

Make specifying the language the *only* accepted method for triggering
the spell checking. Specify that any unknown language must not be
spellchecked automatically. Then you automatically have a method for
forcing the automatic spell checking off and in addition to that you
have some incentive to define correct language for the page.

If we can persuade content authors to specify the correct content
language, I believe that in the future there will be *other* benefits,
too. For example, automatic hyphenation would improve typographic
quality of web pages but automatic hyphenation is impossible unless you
know the language of the content.

-- 
Mikko






Re: [whatwg] Spellchecking mark III

2009-01-21 Thread James Graham

Mikko Rantalainen wrote:

My second sentence was trying to argue that the page author has no
business forcing the spellchecking on if the page author cannot force
the spellchecking language! Especially for a case where the page
contains a mix of multiple languages.


Not really. Consider e.g. flickr in which photos may be given titles, 
descriptions and comments in the language of the user's choice but the 
site UI is not localised. If flickr decided to do <input type=text 
lang=en> to get spellchecking to turn on for photo titles then that would 
be much worse for the large number of non-native English speakers than 
<input type=text spellcheck=on> which would likely use the user's 
preferred dictionary (although this would be UA-dependent of course).


For another example, consider the case where I post on a Swedish forum 
in English, knowing that the general level of English in Sweden is 
excellent and in any case better than the level of my Swedish.


It doesn't seem reasonable to expect sites to always be localised or for 
 sites accepting multilingual user generated content to not exist. 
Therefore it seems totally counterproductive from the point of view of 
people communicating in less dominant languages to require spellchecking 
to be tied to language.




Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Mikko Rantalainen
James Graham wrote:
 Mikko Rantalainen wrote:
 My second sentence was trying to argue that the page author has no
 business forcing the spellchecking on if the page author cannot force
 the spellchecking language! Especially for a case where the page
 contains a mix of multiple languages.
 
 Not really. Consider e.g. flickr in which photos may be given titles, 
 descriptions and comments in the language of the user's choice but the 
 site UI is not localised. If flickr decided to do <input type=text 
 lang=en> to get spellchecking to turn on for photo titles then that would 
 be much worse for the large number of non-native English speakers than 
 <input type=text spellcheck=on> which would likely use the user's 
 preferred dictionary (although this would be UA-dependent of course).

How about <input type=text lang=mul> if the content author does not
want to specify a language? That would hint to the UA that this field
assumes human language but the input may be in any language.

The current behavior (heuristics) could be requested with <input type=text
lang=und>, which explicitly marks this input as containing text in an
undefined language.

 For another example, consider the case where I post on a Swedish forum 
 in English, knowing that the general level of English in Sweden is 
 excellent and in any case better than the level of my Swedish.

I agree. However, if the forum maintainer would rather have no text at
all instead of text in the wrong language, then the forum maintainer
should use <input type=text lang=sv> and the UA would correctly flag
any non-Swedish word as incorrect.

 It doesn't seem reasonable to expect sites to always be localised or for 
   sites accepting multilingual user generated content to not exist. 
 Therefore it seems totally counterproductive from the point of view of 
 people communicating in less dominant languages to require spellchecking 
 to be tied to language.

I'm not suggesting spellchecking to require only a single language. I'm
requesting that if the page wants automatic spell checking it must
explicitly define the language that the spellchecking should check for.
For multiple languages case, the RFC 3066 defines the MUL language code
and for the undefined case, the UND code has been defined.

Currently the lang attribute accepts exactly one language code. For the
case where acceptable input for a forum message would be Swedish or
English, it would be nice to be able to write <input type=text
lang="sv,en"> or perhaps even lang="sv,en;q=0.1".
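
As a rough illustration of the weighted list suggested above, here is a minimal Python sketch of how a UA might read such a value. The comma-separated syntax with ;q= weights is hypothetical (the lang attribute takes a single tag); it is modelled here on the HTTP Accept-Language header, and parse_lang_list is an illustrative name, not an existing API. The example uses sv, the ISO 639-1 code for Swedish.

```python
def parse_lang_list(value):
    """Parse 'sv,en;q=0.1' into [(tag, weight)] sorted by descending weight.

    Hypothetical syntax: the real lang attribute accepts only one tag.
    """
    result = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        weight = 1.0  # default weight when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # treat an unparsable weight as least preferred
        result.append((tag.strip().lower(), weight))
    # sorted() is stable, so equally weighted tags keep their written order
    return sorted(result, key=lambda pair: pair[1], reverse=True)
```

A UA could then try dictionaries in the returned order and fall back to its own heuristics if none is installed.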

-- 
Mikko






Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Aryeh Gregor
On Wed, Jan 21, 2009 at 4:15 AM, Mikko Rantalainen
mikko.rantalai...@peda.net wrote:
 If the browser does not know the language of the content, how on earth
 is it supposed to *correctly* spellcheck it? I'm daily hitting a
 situation where browser is trying to spellcheck content with incorrect
 language. I've toggled such automatic spellchecker off and those will
 stay off until correct language is detected.

In practice, I think the only way to avoid this problem is for
browsers to implement content-sniffing techniques of some kind to
figure out the language, at least per field but ideally on a
word-by-word basis.  If the browser is set to spellcheck in English
but you start putting in lots of non-Latin characters and every word
is therefore misspelled, the browser should be clever enough to try
switching the spellcheck language, or at least disabling spellcheck
for words that can't possibly be from the language it's checking
against.  More refined heuristics could detect even subtle
differences, like between British and American English, and remember
for next time which one the user usually types in.

None of this needs, or even could effectively use, author intervention:

1) The author cannot know what languages users will want to enter in
all cases.  I've sometimes found myself writing posts in Hebrew on
English-only sites, for instance.

2) The author certainly won't be able to determine the dialect or
variant of the language the user will want to use, which is necessary
for spellcheck.

3) Authors should not have to add extra markup if it's not really
necessary, because in practice, most won't.  To be as useful as
possible, spellcheck should Just Work without explicit author
intervention.
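
The simplest form of the word-by-word idea above (skip words that cannot possibly belong to the language being checked) might look like the following Python sketch. It is only an assumption about how a UA could approach it: the script of each word is guessed from the first word of each character's Unicode name, and word_script and words_to_check are illustrative names.

```python
import unicodedata

def word_script(word):
    """Return the dominant script of a word, e.g. 'LATIN' or 'HEBREW'.

    The script is approximated by the first word of each letter's
    Unicode character name; returns None if the word has no letters.
    """
    counts = {}
    for ch in word:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        script = name.split(" ")[0] if name else "UNKNOWN"
        counts[script] = counts.get(script, 0) + 1
    if not counts:
        return None
    return max(counts, key=counts.get)

def words_to_check(text, dictionary_script="LATIN"):
    """Keep only the words written in the script the dictionary covers."""
    return [w for w in text.split() if word_script(w) == dictionary_script]
```

Words in another script are simply left unchecked here; telling British from American English apart would need the more refined heuristics mentioned above, which this sketch does not attempt.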


Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Bil Corry
Mikko Rantalainen wrote on 1/21/2009 5:03 AM: 
 For another example, consider the case where I post on a Swedish forum 
 in English, knowing that the general level of English in Sweden is 
 excellent and in any case better than the level of my Swedish.
 
 I agree. However, if the forum maintainer would rather have no text at
 all instead of text in wrong language, then the forum maintainer
 should use <input type=text lang=sv> and the UA would correctly flag
 any non-Swedish word as incorrect.

I see value in being able to provide a hint to the UA that it should or should 
not spell check certain content, but the ultimate control should reside with 
the user.

I hate the idea of a web site dictating which dictionary must be used to spell 
check the user's content.  Spell checking is for the benefit of the user, not 
the web site, and forcing a dictionary in a language that the user doesn't 
speak is completely useless and would only serve to annoy (i.e. it wouldn't 
prevent the user from submitting content in any language of their choosing).

Beyond that, it has other problems.  Say I visit a site in the UK and it forces 
the UK dictionary; as an American speaker, I'll be confused as to why my UA is 
flagging color as misspelled and will simply turn off spell checking entirely 
since it's broken. 
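
The "color" case above can be reproduced with a toy checker. This Python sketch uses two tiny made-up word lists standing in for real en-US and en-GB dictionaries; DICTIONARIES and flagged_words are illustrative names only.

```python
# Toy word lists for illustration only; real dictionaries are far larger.
DICTIONARIES = {
    "en-US": {"the", "color", "of", "my", "favorite", "car"},
    "en-GB": {"the", "colour", "of", "my", "favourite", "car"},
}

def flagged_words(text, dictionary_tag):
    """Return the words in text that the chosen dictionary does not contain."""
    known = DICTIONARIES[dictionary_tag]
    return [w for w in text.lower().split() if w not in known]
```

With the site-chosen en-GB list, an American user's "color" and "favorite" are both flagged, while the same sentence passes cleanly against en-US.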

Additionally, not all UAs ship with dictionaries for every single language (do 
any?), so the UA wouldn't be able to spell check when a dictionary isn't 
available for that user.  I guarantee that if my UA shipped with all of them, 
I'd remove them all except the languages I converse in to prevent the web site 
from forcing a particular dictionary.

Then there are some languages that do not have a dictionary available at all, 
such as Tamil in Firefox:

https://addons.mozilla.org/en-US/firefox/browse/type:3

I don't see any benefit to the user in forcing them to use a particular 
dictionary and the only benefit to the site is it might annoy someone into 
using a particular language (assuming they even have the dictionary for that 
language).


- Bil



Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Peter Kasting
On Wed, Jan 21, 2009 at 1:15 AM, Mikko Rantalainen 
mikko.rantalai...@peda.net wrote:

 If the browser does not know the language of the content, how on earth
 is it supposed to *correctly* spellcheck it?


As others have noted, the user's preferences are generally a better
indicator of how something should be spellchecked, for a number of reasons.
 (Bil Corry's email was on-point here.)


 I'm daily hitting a
 situation where browser is trying to spellcheck content with incorrect
 language. I've toggled such automatic spellchecker off and those will
 stay off until correct language is detected.


As I said, this seems a separate problem to me.  Dynamic language switching
or multi-language spellchecking based on various heuristics seems like the
solution here.  This applies to any spellchecked field anywhere and is
separate from the issue of whether an author wants to tell the UA that a
field is even appropriate for spellchecking or not.

My second sentence was trying to argue that the page author has no
 business forcing the spellchecking on if the page author cannot force
 the spellchecking language!


I disagree completely.  Consider one of the original use cases for this:
Gmail instructing UAs to spellcheck the optional Subject field of a mail.
 There's no way Gmail can know what language(s) the user may type in this
field, but it's still appropriate to tell the UA that the field is
appropriate for spellchecking.  At this point it's up to the UA to determine
what language to use.

I also take issue with the word "force", which is imprecise.  The spellcheck
attribute spec was carefully written to ensure that the user and UA have
ultimate control over whether spellchecking actually occurs, regardless of
what the author specifies; the attribute is a hint to the UA, not force.
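
To make the "hint, not force" point concrete, here is a minimal Python sketch of one possible precedence order. The order itself (user setting, then author hint, then UA default) is an assumption for illustration; the thread only says that the user and UA keep ultimate control. should_spellcheck is a hypothetical name.

```python
def should_spellcheck(author_hint=None, user_setting=None, ua_default=True):
    """Decide whether to spellcheck a field.

    author_hint:  True/False from the page's spellcheck attribute, or None.
    user_setting: True/False if the user chose explicitly, or None.
    ua_default:   what the UA would do with no other information.
    """
    if user_setting is not None:
        return user_setting   # the user always wins
    if author_hint is not None:
        return author_hint    # otherwise honour the author's hint
    return ua_default         # no information: fall back to the UA default
```

Under this ordering an author can suggest checking the Subject field, but a user who has switched spellchecking off never sees it come back.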


 Forcing a spellchecking on with incorrect language would harm the user!


A good reason why the UA's spellchecking language should not be determined
by the author (and thus why your proposal leaves me cold).

On
 the other hand, if the content language is explicitly defined, then the
 user agent has the required knowledge to decide if the spellchecking
 should be enabled or disabled. There's no need for the spellcheck
 attribute.


The UA does not know which fields actually contain language and which
simply contain strings of characters.  Enumerating input types (e.g. this
field contains email addresses) can address this, but suffers from two
problems:
* There are an unbounded number of input types, potentially
* Types should perhaps not always be treated equally.  For example, if an
author wrote a spelling quiz, then input boxes for a user to type in would
contain words and thus be of a spellcheckable type, but the author would
clearly prefer the UA not spellcheck them :)

If we can persuade content authors to specify the correct content
 language,


Proposals that sound like "if we could just get authors to write valid,
semantic content with no errors..." have always seemed naive to me.

PK


Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Calogero Alex Baldacchino

Aryeh Gregor wrote:

On Wed, Jan 21, 2009 at 4:15 AM, Mikko Rantalainen
mikko.rantalai...@peda.net wrote:
  

If the browser does not know the language of the content, how on earth
is it supposed to *correctly* spellcheck it? I'm daily hitting a
situation where browser is trying to spellcheck content with incorrect
language. I've toggled such automatic spellchecker off and those will
stay off until correct language is detected.



In practice, I think the only way to avoid this problem is for
browsers to implement content-sniffing techniques of some kind to
figure out the language, at least per field but ideally on a
word-by-word basis.  If the browser is set to spellcheck in English
but you start putting in lots of non-Latin characters and every word
is therefore misspelled, the browser should be clever enough to try
switching the spellcheck language, or at least disabling spellcheck
for words that can't possibly be from the language it's checking
against.  More refined heuristics could detect even subtle
differences, like between British and American English, and remember
for next time which one the user usually types in.

  


Why not let the user choose the language, as happens in word 
processors? A UA can't choose accurately whether, for instance, "color" 
is correct American English, wrong British English, or even a 
correct (truncated) Italian word, while a human can do it better, thus a 
UA could provide an interface to change the language for a selection 
spellchecking, or even for each misspelled word, starting from a hint 
language, which could be the value of an element lang attribute 
(beside a default value and a user-preference forced one - the latter 
bypassing any authored value). Also, using the lang attribute value as 
the start language to check (if not in contrast with a user preference) 
would allow an interactive interface with a script changing that value 
according to a user's choice (UAs could also expose a list of supported 
languages).
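
The selection order described above (a user-forced language first, then the element's lang value as a hint, then a default) could be sketched in Python as follows. The fallback from a regional tag to its shorter prefix and the name pick_dictionary are assumptions made for illustration, not something specified in this thread.

```python
def pick_dictionary(available, user_forced=None, element_lang=None, ua_default="en"):
    """Return the first candidate tag with an installed dictionary, else None.

    available: set of lowercase language tags the UA has dictionaries for.
    """
    for candidate in (user_forced, element_lang, ua_default):
        if not candidate:
            continue
        tag = candidate.lower()
        # Try the full tag first, then shorter prefixes: en-gb -> en
        while tag:
            if tag in available:
                return tag
            tag = tag.rpartition("-")[0]
    return None
```

A script changing the element's lang value at the user's request would simply change the element_lang argument on the next check.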


A declaration such as lang='und' sounds like telling the user agent to 
do whatever is computed as being a good choice, which is different from 
telling it "don't even try to understand what the language is here, because 
I know you can't guess it"; declaring a value known to be unsupported 
(such as an invented one) to turn off spellchecking sounds like a hack 
needed because we lack a more appropriate feature.


Everything IMHO.

WBR, Alex




Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Peter Kasting
On Wed, Jan 21, 2009 at 7:38 PM, Calogero Alex Baldacchino 
alex.baldacch...@email.it wrote:

 Why not let the user choose the language, as happens in word
 processors? A UA can't choose accurately whether, for instance, "color" is
 correct American English, wrong British English, or even a correct
 (truncated) Italian word, while a human can do it better, thus a UA could
 provide an interface to change the language for a selection spellchecking,
 or even for each misspelled word, starting from a hint language, which could
 be the value of an element lang attribute (beside a default value and a
 user-preference forced one - the latter bypassing any authored value).
 Also, using the lang attribute value as the start language to check (if
 not in contrast with a user preference) would allow an interactive interface
 with a script changing that value according to a user's choice (UAs could
 also expose a list of supported languages).


I'm not sure I fully grasped everything here, but what I did grasp sounds
very much like a cross between what Chromium is doing today and what we want
to do in the future (I imagine similar things are true for other browser
vendors).  User specification and page hints are both useful tools for a UA.

But I still claim that all of those aspects are outside the scope of the
spellcheck attribute, and fall into the realm of things that should not
be in the HTML5 spec as they're very much UA-specific behavior.

PK


Re: [whatwg] Spellchecking mark III

2009-01-20 Thread Mikko Rantalainen
Křištof Želechovski wrote:
 Spell checking of regions of text should be governed by the lang attribute,
 if any, and browser preferences; it would be switched off for language tags
 the spell-checking engine does not support, including custom ones.
 It is extremely annoying how Safari, although (supposedly) localized to
 Polish, wants all input to be in English.

I agree. I think that specifying the spellcheck attribute would be a
mistake. It allows only forcing the automatic spell checking on or off
but it doesn't help a bit to allow mixing different languages on a
single page.

Robert O'Callahan wrote:
 The browser can't know ahead of time that a text field is not supposed to
 contain natural-language text.

Yes it can, the lang attribute contains the required information. If the
page lies about its language, then there's obviously nothing the browser
can do to fix it.

Just specify that spell checking must follow the content language.

This way any already existing page that correctly specifies the content
language would turn automatic spell checking on/off as required. There's
no point trying to automatically spell check an unknown language so
there's no need to explicitly turn off the spellchecking (assuming that
the content language is correctly specified).

As the lang attribute can be used in inner elements, too, it allows
mixing different languages on a single page and it allows UA to apply
different spell checkers to different parts.
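
How a lang value on an outer element carries over to inner elements can be shown with a small Python sketch. The Element class is a minimal stand-in for a DOM node, not a real DOM API, and effective_lang is an illustrative name.

```python
class Element:
    """Minimal stand-in for a DOM element; not a real DOM API."""

    def __init__(self, tag, lang=None, parent=None):
        self.tag = tag
        self.lang = lang
        self.parent = parent

def effective_lang(element):
    """Walk up the ancestors and return the nearest lang value, or None."""
    node = element
    while node is not None:
        if node.lang is not None:
            return node.lang
        node = node.parent
    return None

# Example: an English page with one Finnish fragment
html = Element("html", lang="en")
div = Element("div", parent=html)
span = Element("span", lang="fi", parent=div)
```

Here the div inherits "en" from the root while the span overrides it with "fi", which is what would let a UA apply different spell checkers to different parts of one page.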

At least the following use cases have been discussed in this thread:

- email subject field (e.g. lang=en according to UI language perhaps?)
- email address field (lang=x-email-to, can be spellchecked against the
user's address book)
- web site address (lang=x-url, perhaps also follow type=url?)
- product id (lang=x-proprietary)
- license plate (lang=x-proprietary)
- captcha (lang=x-proprietary)
- program code (lang=x-program-c++ perhaps?)

If the page does not specify any language, allow the UA to decide the
best method for its spell checking (leave the behavior explicitly
undefined).

I used x-proprietary for a custom/special language above. RFC 3066
specifies the UND (undetermined) language code, which could be used instead.
However, I think that the whatwg should specify that the UND language code
is used to turn on the undefined (UA-dependent heuristics) behavior for
selected inner elements.
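
The lang-driven behaviour proposed above can be sketched roughly as follows. This is only an illustration in Python, not how any browser implements it: SUPPORTED_DICTIONARIES, effective_lang and spellcheck_decision are hypothetical names, and the inheritance model (the nearest declared lang wins) is an assumption.

```python
from typing import Optional, Sequence

# Hypothetical: languages for which this user agent has a dictionary.
SUPPORTED_DICTIONARIES = {"en", "fi", "pl"}


def effective_lang(lang_chain: Sequence[Optional[str]]) -> Optional[str]:
    """Return the nearest declared lang value, innermost element first."""
    for lang in lang_chain:
        if lang:
            return lang.lower()
    return None


def spellcheck_decision(lang_chain: Sequence[Optional[str]]) -> str:
    """Return 'check:<language>', 'off', or 'ua-heuristics'."""
    lang = effective_lang(lang_chain)
    # No language declared, or explicitly undetermined: leave it to the UA.
    if lang is None or lang == "und":
        return "ua-heuristics"
    primary = lang.split("-")[0]
    # Tags such as x-proprietary have no dictionary, so checking is off.
    if primary in SUPPORTED_DICTIONARIES:
        return "check:" + primary
    return "off"
```

With these assumptions a captcha field marked lang=x-proprietary inside an English page resolves to 'off', while an unmarked field inside lang=en-GB resolves to 'check:en'.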

-- 
Mikko






Re: [whatwg] Spellchecking mark III

2009-01-20 Thread Peter Kasting
2009/1/20 Mikko Rantalainen mikko.rantalai...@peda.net

 I agree. I think that specifying the spellcheck attribute would be a
 mistake. It allows only forcing the automatic spell checking on or off
 but it doesn't help a bit to allow mixing different languages on a
 single page.


I don't see how the second sentence is an argument for the first.

Just specify that spell checking must follow the content language.


How many pages specify the content language?  AFAIK the farthest most
authors get is to specify the encoding, and even that is frequently done
wrong, and browsers have all kinds of crazy heuristics to try and
second-guess authors.

This seems like it would make spellchecking function very poorly on the web
at large, whereas adding the spellcheck attribute at worst would not harm
anyone.

As the lang attribute can be used in inner elements, too, it allows
 mixing different languages on a single page and it allows UA to apply
 different spell checkers to different parts.


Again, this seems somewhat orthogonal to the spellcheck attribute
discussion.  Pages which mix languages are of interest to the Chromium
development team, too, and we have ideas on how to make life better for
those users -- but none of those ideas intersect the spellcheck attribute in
any way.

I think your post is tangential.

PK


Re: [whatwg] Spellchecking mark III

2009-01-19 Thread Křištof Želechovski
Spell checking of regions of text should be governed by the lang attribute,
if any, and browser preferences; it would be switched off for language tags
the spell-checking engine does not support, including custom ones.
It is extremely annoying how Safari, although (supposedly) localized to
Polish, wants all input to be in English.
IMHO,
Chris





Re: [whatwg] Spellchecking mark III

2009-01-19 Thread Peter Kasting
On Tue, Dec 30, 2008 at 3:38 AM, Ian Hickson i...@hixie.ch wrote:

 The same engineers have since implemented this feature in Chrome also,


Incorrect.  One engineer implemented a crude hack in a small portion of the
Chromium glue code that implements a fraction of the spec -- enough to make
Gmail work a little more nicely, and that's about it.

On Wed, Dec 31, 2008 at 7:15 AM, Maciej Stachowiak m...@apple.com wrote:

 2) The proposal Hixie linked seems way overengineered for this purpose.
 First, it allows spellchecking to be explicitly turned on, potentially
 overriding normal defaults, but that seems wrong; an input type=email
 should never spellcheck regardless of what the page author says. I can't see any
 valid use case for the author turning spellchecking on regardless of UA
 defaults or user preferences.


Email subject line boxes.  In Firefox (where I implemented support for this
attribute matching Hixie's spec), the default is to spellcheck multiline
boxes and not single-line boxes, which meant that webmail subject line
fields would not be spellchecked by default.


 Second, it allows spellchecking to be controlled at a finer granularity
 than editability, for which again I think there is no valid use case.


Besides the above example in the positive direction, the negative direction
is, again, editable fields which you don't want spellchecked, e.g. email
recipient list fields (which may be multiline and contain whitespace).  I
agree with Roc that it is not practical for UAs to detect (via heuristics)
which fields should and should not be checked in all cases, and
spellchecking desirability seems finer grained than editability to me (not
completely orthogonal, as I don't think non-editable fields should ever be
spellchecked).

I also agree with Roc that this is not complicated, in practice, to
implement.  It was a tricky patch for me in Firefox since I was not familiar
with any of the associated code, but the actual logic of the spec was not
hard at all.

I support adding Hixie's spec, as-is, to HTML5.  It's implemented in
Firefox, it's desired in Opera, and there's a bug on file to add support for
it to WebKit (which I would like to do someday).

PK


Re: [whatwg] Spellchecking mark III

2009-01-19 Thread Peter Kasting
On Mon, Jan 19, 2009 at 4:53 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 Actually I was just poking around and noticed that we don't actually
 support variation of spellcheck values within different parts of an editable
 element. So I won't make any claims about how hard that is to support.


Doesn't the spec only define things on a per-element level of granularity?
 I wasn't really paying attention to this side-conversation of yours so I
didn't think to confirm/refute it.  But I don't think the spec in fact
covers doing such a thing.

PK


Re: [whatwg] Spellchecking mark III

2008-12-31 Thread Maciej Stachowiak


On Dec 30, 2008, at 7:20 AM, Kornel Lesiński wrote:



On 30.12.2008, at 13:45, Geoffrey Sneddon wrote:


I have therefore not added this feature to HTML5 for the time being. If
there is more interest in this feature, please speak up.


This seems stupid. If I want to have spell-checking, let me. Don't  
force it off. I don't see any reason to have it forced off, ever.



It's useful for fields that contain non-textual content, e.g.  
product ID, license plate number, CAPTCHA answer, etc.
Browser would mark these as misspelt, which might be confusing or at  
least distracting.


It does make sense I guess, that certain fields should not be subject  
to automatic spellchecking. However, three counterpoints:


1) At least Safari's spellchecking won't mark a word misspelled until  
you hit a space; fields that contain data which would be flagged by  
the spellchecker but which are also likely to contain internal  
whitespace are rare.


2) The proposal Hixie linked seems way overengineered for this  
purpose. First, it allows spellchecking to be explicitly turned on,  
potentially overriding normal defaults, but that seems wrong; an  
input type=email should never spellcheck regardless of what the page
author says. I can't see any valid use case for the author turning
spellchecking on regardless of UA defaults or user preferences.  
Second, it allows spellchecking to be controlled at a finer  
granularity than editability, for which again I think there is no  
valid use case. Both of these aspects make the feature more  
complicated to implement and harder to understand, compared to just  
having a way to only disable spellchecking at the same granularity as  
editing.


In general it would be helpful if some of the Google folks who  
requested this feature and some of the Chrome folks who (apparently)
implemented it could explain the actual use cases they had in mind.


Regards,
Maciej



Re: [whatwg] Spellchecking mark III

2008-12-31 Thread Kornel Lesiński

On 31.12.2008, at 15:15, Maciej Stachowiak wrote:

It does make sense I guess, that certain fields should not be  
subject to automatic spellchecking. However, three counterpoints:


1) At least Safari's spellchecking won't mark a word misspelled  
until you hit a space; fields that contain data which would be  
flagged by the spellchecker but which are also likely to contain  
internal whitespace are rare.


In WebKit spellchecking is also done when a field loses focus, so even
single-word fields would be flagged.


2) The proposal Hixie linked seems way overengineered for this  
purpose. First, it allows spellchecking to be explicitly turned on,  
potentially overriding normal defaults, but that seems wrong; an  
input type=email should never spellcheck regardless of what the page
author says. I can't see any valid use case for the author turning
spellchecking on regardless of UA defaults or user preferences.  
Second, it allows spellchecking to be controlled at a finer  
granularity than editability, for which again I think there is no  
valid use case. Both of these aspects make the feature more  
complicated to implement and harder to understand, compared to just  
having a way to only disable spellchecking at the same granularity  
as editing.


I don't like the current proposal either, because a true/false value is
inconsistent with other boolean attributes in HTML. IMHO it should be
nospellcheck=nospellcheck (which also solves the problem of forcing
spellchecking where it doesn't make sense).


--
regards, Kornel





Re: [whatwg] Spellchecking mark III

2008-12-31 Thread Robert O'Callahan
On Thu, Jan 1, 2009 at 4:15 AM, Maciej Stachowiak m...@apple.com wrote:

 2) The proposal Hixie linked seems way overengineered for this purpose.
 First, it allows spellchecking to be explicitly turned on, potentially
 overriding normal defaults, but that seems wrong; an input type=email
 should never spellcheck regardless of what the page author says. I can't see any
 valid use case for the author turning spellchecking on regardless of UA
 defaults or user preferences.


It allows you to have a region of text where spellchecking is disabled via
the spellcheck attribute, but containing subregions where spellchecking is
enabled.

Second, it allows spellchecking to be controlled at a finer granularity than
 editability, for which again I think there is no valid use case. Both of
 these aspects make the feature more complicated to implement and harder to
 understand, compared to just having a way to only disable spellchecking at
 the same granularity as editing.


A use case is editable program code, where spellchecking is disabled, but
where spellchecking is enabled inside comments. Maybe that sounds a little
far-fetched for today's Web applications, but some IDEs (e.g. Eclipse)
support this so it seems like something we'd want in the future.
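
As a rough illustration of nested regions like this, one could resolve the setting for a piece of text by walking outwards from it until some element states an explicit preference. The sketch below is Python with hypothetical names (effective_spellcheck, chain); it assumes the nearest explicit value wins, which is one reading of the proposal rather than a quotation from it.

```python
from typing import Optional, Sequence


def effective_spellcheck(chain: Sequence[Optional[bool]], ua_default: bool) -> bool:
    """Resolve spellchecking for a piece of text.

    chain lists the explicit spellcheck settings from the innermost
    element outwards; None means the element does not set the attribute.
    """
    for setting in chain:
        if setting is not None:
            # Assumption: the nearest explicit value overrides outer ones.
            return setting
    # Nothing stated anywhere: fall back to the UA default.
    return ua_default


# A comment span (checking on) inside a code editor (checking off).
comment_text = effective_spellcheck([True, False], ua_default=True)
# An ordinary code token inside the same editor.
code_token = effective_spellcheck([None, False], ua_default=True)
```

Under this reading the comment text is checked and the surrounding code is not, without the UA needing to know anything about the programming language.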

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Spellchecking mark III

2008-12-31 Thread timeless
On Wed, Dec 31, 2008 at 3:22 AM, Robert O'Callahan rob...@ocallahan.org wrote:
 That handles some cases, but not others --- e.g. text boxes that contain
 program code.

I run spell checkers on code blocks.

the number of misspellings that could have been avoided by using them 

they're actually useful for spellcheckers.

and for slashdot's really lame captcha they help there too


Re: [whatwg] Spellchecking mark III

2008-12-31 Thread timeless
2008/12/30 Giovanni Campagna scampa.giova...@gmail.com:
 maybe we could just say that spellchecking is disabled when type is not text
 (for email, uri and number you have validation) and when a pattern attribute
 is specified

Personally, if I were to write Gionvanni Campagna into a multiline
text field. I'd like it to match the thing that i wrote into the email
field (it turns out that I've managed to misspell your name, I'm
sorry, but that's the point). So ideally the system which i use to
spell check would be able to share information with my contacts and
would also enable me to teach it spelling based on the email address
fields.


Re: [whatwg] Spellchecking mark III

2008-12-31 Thread Maciej Stachowiak


On Dec 31, 2008, at 12:26 PM, Robert O'Callahan wrote:

On Thu, Jan 1, 2009 at 4:15 AM, Maciej Stachowiak m...@apple.com  
wrote:
2) The proposal Hixie linked seems way overengineered for this  
purpose. First, it allows spellchecking to be explicitly turned on,  
potentially overriding normal defaults, but that seems wrong; an  
input type=email should never spellcheck regardless of what the page
author says. I can't see any valid use case for the author turning
spellchecking on regardless of UA defaults or user preferences.


It allows you to have a region of text where spellchecking is  
disabled via the spellcheck attribute, but containing subregions  
where spellchecking is enabled.


It seems to me you would have to have a lot of custom code to maintain  
the boundaries between such regions during editing operations for this  
to ever work right. Normal text editing would easily lead to text  
moving across the boundaries. There would have to be strong motivating  
examples to justify such a hard-to-use feature.




Second, it allows spellchecking to be controlled at a finer  
granularity than editability, for which again I think there is no  
valid use case. Both of these aspects make the feature more  
complicated to implement and harder to understand, compared to just  
having a way to only disable spellchecking at the same granularity  
as editing.


A use case is editable program code, where spellchecking is  
disabled, but where spellchecking is enabled inside comments. Maybe  
that sounds a little far-fetched for today's Web applications, but  
some IDEs (e.g. Eclipse) support this so it seems like something  
we'd want in the future.



This sounds like a pretty ill-conceived feature. It is very common for  
comments to include code, or fragments of code (such as variable  
names) mixed with natural language. (I was unable to find any evidence  
of spellchecking comments in the copy of Eclipse I downloaded, so I  
can't comment on the details.)


Furthermore, other IDEs generally don't attempt to do this, and I  
can't think of other application categories that would do something  
similar.


So I don't think this makes for a very compelling use case. It's like  
arguing for a page layout feature based on something only WordPerfect  
does.


Regards,
Maciej



Re: [whatwg] Spellchecking mark III

2008-12-31 Thread Robert O'Callahan
On Thu, Jan 1, 2009 at 2:04 PM, Maciej Stachowiak m...@apple.com wrote:

 On Dec 31, 2008, at 12:26 PM, Robert O'Callahan wrote:

 A use case is editable program code, where spellchecking is disabled, but
 where spellchecking is enabled inside comments. Maybe that sounds a little
 far-fetched for today's Web applications, but some IDEs (e.g. Eclipse)
 support this so it seems like something we'd want in the future.

 This sounds like a pretty ill-conceived feature. It is very common for
 comments to include code, or fragments of code (such as variable names)
 mixed with natural language. (I was unable to find any evidence of
 spellchecking comments in the copy of Eclipse I downloaded, so I can't
 comment on the details.)


OK. It's there, though.

Furthermore, other IDEs generally don't attempt to do this, and I can't
 think of other application categories that would do something similar.


Seems to me that an HTML source view with spellchecking of the non-markup
text would be useful.

For what it's worth, it seemed easy to implement the general spellcheck
behaviour in Gecko, once we'd decided to allow any author spellcheck control
at all (you seem to have agreed that spellcheck=no is useful). But I
really don't feel strongly one way or the other. Peter Kasting or Brett
Wilson should speak up.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Ian Hickson

In 2006 I proposed the following spec for a spellcheck= attribute, 
based on requests from the Google engineers then working on Firefox:

   http://www.damowmow.com/playground/spellcheck.txt

The same engineers have since implemented this feature in Chrome also, and 
Google does use this attribute on its sites. However, the attribute has 
seen very little interest outside of Google, with just a handful of sites 
using it, primarily in dynamic editor libraries.

I have therefore not added this feature to HTML5 for the time being. If 
there is more interest in this feature, please speak up.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Anne van Kesteren

On Tue, 30 Dec 2008 12:38:42 +0100, Ian Hickson i...@hixie.ch wrote:

In 2006 I proposed the following spec for a spellcheck= attribute,
based on requests from the Google engineers then working on Firefox:

   http://www.damowmow.com/playground/spellcheck.txt

The same engineers have since implemented this feature in Chrome also, and
Google does use this attribute on its sites. However, the attribute has
seen very little interest outside of Google, with just a handful of sites
using it, primarily in dynamic editor libraries.

I have therefore not added this feature to HTML5 for the time being. If
there is more interest in this feature, please speak up.


Opera wants to support this feature as well in due course, so I don't  
think we would mind it being added to HTML5. Does it being in Chrome mean  
it is also WebKit? If so, together with Firefox support, seems like a  
compelling reason to add the feature.



--
Anne van Kesteren
http://annevankesteren.nl/
http://www.opera.com/


Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Geoffrey Sneddon


On 30 Dec 2008, at 11:38, Ian Hickson wrote:


In 2006 I proposed the following spec for a spellcheck= attribute,
based on requests from the Google engineers then working on Firefox:

  http://www.damowmow.com/playground/spellcheck.txt

The same engineers have since implemented this feature in Chrome also, and
Google does use this attribute on its sites. However, the attribute has
seen very little interest outside of Google, with just a handful of sites
using it, primarily in dynamic editor libraries.

I have therefore not added this feature to HTML5 for the time being. If
there is more interest in this feature, please speak up.


This seems stupid. If I want to have spell-checking, let me. Don't  
force it off. I don't see any reason to have it forced off, ever.



--
Geoffrey Sneddon
http://gsnedders.com/



Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Maciej Stachowiak


On Dec 30, 2008, at 4:55 AM, Anne van Kesteren wrote:


On Tue, 30 Dec 2008 12:38:42 +0100, Ian Hickson i...@hixie.ch wrote:

In 2006 I proposed the following spec for a spellcheck= attribute,
based on requests from the Google engineers then working on Firefox:

  http://www.damowmow.com/playground/spellcheck.txt

The same engineers have since implemented this feature in Chrome also, and
Google does use this attribute on its sites. However, the attribute has
seen very little interest outside of Google, with just a handful of sites
using it, primarily in dynamic editor libraries.

I have therefore not added this feature to HTML5 for the time being. If
there is more interest in this feature, please speak up.


Opera wants to support this feature as well in due course, so I  
don't think we would mind it being added to HTML5. Does it being in  
Chrome mean it is also WebKit? If so, together with Firefox support,  
seems like a compelling reason to add the feature.


The Google Chrome team has not submitted patches for such a feature to  
WebKit. I am not sure if they plan to eventually submit it to mainline  
WebKit. In fact, this is the first I've heard about Chrome having such  
an extension.


It's not clear to me whether the feature is useful without seeing some  
motivating examples. WebKit by default spellchecks (and grammar  
checks) all editable parts of the document, and it is not obvious to  
me why one would want to force it off for particular form controls or  
editable HTML areas.


Regards,
Maciej



Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Tab Atkins Jr.
On Tue, Dec 30, 2008 at 8:50 AM, Maciej Stachowiak m...@apple.com wrote:

 On Dec 30, 2008, at 4:55 AM, Anne van Kesteren wrote:

 On Tue, 30 Dec 2008 12:38:42 +0100, Ian Hickson i...@hixie.ch wrote:

 In 2006 I proposed the following spec for a spellcheck= attribute,
 based on requests from the Google engineers then working on Firefox:

  http://www.damowmow.com/playground/spellcheck.txt

 The same engineers have since implemented this feature in Chrome also,
 and
 Google does use this attribute on its sites. However, the attribute has
 seen very little interest outside of Google, with just a handful of sites
 using it, primarily in dynamic editor libraries.

 I have therefore not added this feature to HTML5 for the time being. If
 there is more interest in this feature, please speak up.

 Opera wants to support this feature as well in due course, so I don't
 think we would mind it being added to HTML5. Does it being in Chrome mean it
 is also WebKit? If so, together with Firefox support, seems like a
 compelling reason to add the feature.

 The Google Chrome team has not submitted patches for such a feature to
 WebKit. I am not sure if they plan to eventually submit it to mainline
 WebKit. In fact, this is the first I've heard about Chrome having such an
 extension.

 It's not clear to me whether the feature is useful without seeing some
 motivating examples. WebKit by default spellchecks (and grammar checks) all
 editable parts of the document, and it is not obvious to me why one would
 want to force it off for particular form controls or editable HTML areas.

Agreed.  This feature lives purely in user-space.  It can be
convenient for a user to be able to turn off spellchecking globally,
or perhaps even locally (FF exposes this currently through a
right-click option on editable areas), but I cannot see any reason for
an author to have control over this.  If I want to spellcheck an area,
I want to spellcheck it.  If I don't, I don't.

~TJ


Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Kornel Lesiński


On 30.12.2008, at 13:45, Geoffrey Sneddon wrote:


I have therefore not added this feature to HTML5 for the time being. If
there is more interest in this feature, please speak up.


This seems stupid. If I want to have spell-checking, let me. Don't  
force it off. I don't see any reason to have it forced off, ever.



It's useful for fields that contain non-textual content, e.g. product  
ID, license plate number, CAPTCHA answer, etc.
Browser would mark these as misspelt, which might be confusing or at  
least distracting.


--
regards, Kornel





Re: [whatwg] Spellchecking mark III

2008-12-30 Thread timeless
On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński kor...@geekhood.net wrote:
 It's useful for fields that contain non-textual content, e.g. product ID,
 license plate number, CAPTCHA answer, etc.
 Browser would mark these as misspelt, which might be confusing or at least
 distracting.

this sounds like something browser vendors need to worry about on
their own and is not a reason to let web pages do anything about it.


Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Giovanni Campagna
2008/12/30 timeless timel...@gmail.com

 On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński kor...@geekhood.net
 wrote:
  It's useful for fields that contain non-textual content, e.g. product ID,
  license plate number, CAPTCHA answer, etc.
  Browser would mark these as misspelt, which might be confusing or at
 least
  distracting.

 this sounds like something browser vendors need to worry about on
 their own and is not a reason to let web pages do anything about it.


maybe we could just say that spellchecking is disabled when type is not text
(for email, uri and number you have validation) and when a pattern attribute
is specified
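
A minimal sketch of that heuristic, in Python and with a hypothetical function name (spellcheck_by_heuristic), might look like this. It only models the two signals mentioned above, the input type and the presence of a pattern attribute.

```python
from typing import Optional


def spellcheck_by_heuristic(input_type: str = "text",
                            pattern: Optional[str] = None) -> bool:
    """Guess whether a single-line field should be spellchecked."""
    # Types such as email, url or number already get validation.
    if input_type.lower() != "text":
        return False
    # A pattern suggests structured data rather than natural language.
    if pattern is not None:
        return False
    return True


subject_line = spellcheck_by_heuristic("text")
recipient = spellcheck_by_heuristic("email")
product_id = spellcheck_by_heuristic("text", pattern="[A-Z]{3}-[0-9]{4}")
```

Note that a plain text field holding a CAPTCHA answer or program code, with no pattern, would still be checked under this rule.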

Giovanni


Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Robert O'Callahan
2008/12/31 timeless timel...@gmail.com

 On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński kor...@geekhood.net
 wrote:
  It's useful for fields that contain non-textual content, e.g. product ID,
  license plate number, CAPTCHA answer, etc.
  Browser would mark these as misspelt, which might be confusing or at
 least
  distracting.

 this sounds like something browser vendors need to worry about on
 their own and is not a reason to let web pages do anything about it.


The browser can't know ahead of time that a text field is not supposed to
contain natural-language text.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Robert O'Callahan
2008/12/31 Giovanni Campagna scampa.giova...@gmail.com

 2008/12/30 timeless timel...@gmail.com

 On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński kor...@geekhood.net
 wrote:
  It's useful for fields that contain non-textual content, e.g. product
 ID,
  license plate number, CAPTCHA answer, etc.
  Browser would mark these as misspelt, which might be confusing or at
 least
  distracting.

 this sounds like something browser vendors need to worry about on
 their own and is not a reason to let web pages do anything about it.


 maybe we could just say that spellchecking is disabled when type is not
 text (for email, uri and number you have validation) and when a pattern
 attribute is specified


That handles some cases, but not others --- e.g. text boxes that contain
program code.

Rob
-- 
He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all. [Isaiah
53:5-6]


Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Calogero Alex Baldacchino

Robert O'Callahan ha scritto:
2008/12/31 Giovanni Campagna scampa.giova...@gmail.com 
mailto:scampa.giova...@gmail.com


2008/12/30 timeless timel...@gmail.com mailto:timel...@gmail.com

On Tue, Dec 30, 2008 at 5:20 PM, Kornel Lesiński
kor...@geekhood.net mailto:kor...@geekhood.net wrote:
 It's useful for fields that contain non-textual content,
e.g. product ID,
 license plate number, CAPTCHA answer, etc.
 Browser would mark these as misspelt, which might be
confusing or at least
 distracting.

this sounds like something browser vendors need to worry about on
their own and is not a reason to let web pages do anything
about it.


maybe we could just say that spellchecking is disabled when type
is not text (for email, uri and number you have validation) and
when a pattern attribute is specified


That handles some cases, but not others --- e.g. text boxes that 
contain program code.


Rob
--
He was pierced for our transgressions, he was crushed for our 
iniquities; the punishment that brought us peace was upon him, and by 
his wounds we are healed. We all, like sheep, have gone astray, each 
of us has turned to his own way; and the LORD has laid on him the 
iniquity of us all. [Isaiah 53:5-6]
Indeed, that's a valid use case. Anyway, I don't think such a spec 
should and _would_ prevent UAs from giving users a chance to bypass the 
'spellcheck=' attribute (e.g., such an attribute may overcome a UA 
default value, as spec'ed out, but the user may be notified of it, and a 
UA context menu option may allow a different setting, as a last resort
in case of misuse or error, such as a 'spellcheck=false' applied to a
box containing some code).


The language to check might be chosen from several sources, such as the
'lang' attribute of the contenteditable element itself, if different
from the document language. For instance, a blog editor's interface
document might not be translated into a certain language, while still
allowing content creation in that language and giving the author a chance
to set the proper language for a spell checker by changing (through
script) the editor box language.


A possible evolution, if required upon time, might involve a further 
attribute referencing an external dictionary file, perhaps in a standard 
format, or in a format a UA can recognize (thus, indicating 
alternatives), and using the 'spellcheck' attribute when no appropriate 
language/dictionary can be specified, or to say that just the specified 
dictionary/dictionaries must be used.


Best Regards,
Alex




Re: [whatwg] Spellchecking mark III

2008-12-30 Thread Calogero Alex Baldacchino

Calogero Alex Baldacchino ha scritto:



The language to check might be chosen from several sources, such as 
the 'lang' attribute of the contenteditable element itself, if 
different from the document language. For instance, a blog editor's 
interface document might not be translated in a certain language, 
whereas allowing content creation in that language and giving the 
author a chance to set the proper language for a spell checker by 
changing (through script) the editor box language.




Or, perhaps, the editor interface might be negotiated based on the
author's language settings, but he/she might be interested in writing
content in a foreign language, thus wishing for spellchecking in that
language (if allowed by a UA's capabilities).


Best Regards,
Alex




Re: [whatwg] Spellchecking mark III

2006-07-06 Thread Gervase Markham
Ian Hickson wrote:
  3. Otherwise, if the user has disabled the checking for this text,
 then the checking is disabled.
 
  4. Otherwise, if the user has forced the checking for this text to
 always be enabled, then the checking is enabled.
 
  5. Otherwise, if the element with which the text is associated has a
 spellcheck content attribute, then: if that attribute has the
 literal value on, then checking is enabled; otherwise, if that
 attribute has the literal value off, then checking is disabled;
 otherwise, move on to the next step.

How does this get away from the

Check Spelling:

( ) No
( ) Yes                  (i.e. when the page says)
( ) Really, really, yes  (i.e. always, whatever the page says)

preference problem?
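
For reference, the three quoted steps amount to something like the following Python sketch. The function name, the user_pref values ("disabled", "forced", None) and the None return for "move on to the next step" are assumptions made for illustration; steps 1, 2 and anything after 5 are not quoted here and are not modelled.

```python
from typing import Optional


def resolve_steps_3_to_5(user_pref: Optional[str],
                         attr: Optional[str]) -> Optional[bool]:
    """Return True (enabled), False (disabled) or None (undecided)."""
    # Step 3: the user has disabled checking for this text.
    if user_pref == "disabled":
        return False
    # Step 4: the user has forced checking to always be enabled.
    if user_pref == "forced":
        return True
    # Step 5: the page's spellcheck attribute, literal values only.
    if attr == "on":
        return True
    if attr == "off":
        return False
    # Any other value: move on to the next step (not modelled here).
    return None
```

The three user_pref values correspond to the three radio buttons above: "disabled" is No, None is Yes (when the page says), and "forced" is Really, really, yes.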

Gerv


Re: [whatwg] Spellchecking mark III

2006-06-30 Thread Mikko Rantalainen
The more I think about this the more I believe that the correct 
choice would be to describe the expected content more accurately. 
The UA may then proceed to accurately turn spellchecking on or off. 
The problem is that the lang attribute allows only stuff defined in 
RFC 3066, which seems to support only ISO 639 defined language tags. 
That is, the expressable languages are limited to *spoken* languages.


Ian Hickson wrote:

On Sun, 11 Jun 2006, Alexey Feldgendler wrote:
 Information like this input field should have autoindent is 
 presentational.


Yeah, but you'd have to say auto-indent this like C++, which isn't. 
IMHO.


Perhaps instead of using the |spellcheck| attribute as a toggle, allow 
a whitespace-separated list of expected input languages. If the user is 
expected to enter C++ code with English comments, then the author should 
use markup such as


<textarea lang="zzz" spellcheck="c++ en">

for no linguistic content with spell checking for C++ and English.

Another option would be to expand the lang attribute to allow 
languages other than human languages. This has the added bonus that the 
lang attribute could also describe other content more accurately. 
RFC 3066 reserves language codes starting with x- for private use 
and that could be used to aid spellchecking, too. Unfortunately only 
A-Z and 0-9 are allowed, so perhaps something like


<textarea lang="x-cpp-en">

for the private language "cpp-en", or "C++ with English comments". Or, 
if the lang attribute is extended to allow multiple languages to be 
listed, then one could write


<textarea lang="en x-cpp">

for English text mixed with C++ code (which is less accurate than 
the x-cpp-en above).


The GMail To: input field could be expressed as

<textarea lang="x-mail-to">

and UAs that don't recognize the language x-mail-to should turn off 
the spellchecking.
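
A minimal sketch of that rule in Python, assuming the proposed 
whitespace-separated lang list. KNOWN_CHECKERS, parse_lang_list and 
should_spellcheck are hypothetical names. Turning checking off when 
*any* listed tag is unrecognized is one reading of the proposal, not 
something it spells out.

```python
from typing import List, Set

# Hypothetical set of language tags this UA has a checker for.
KNOWN_CHECKERS: Set[str] = {"en", "fi", "sv", "x-cpp"}


def parse_lang_list(value: str) -> List[str]:
    """Split a whitespace-separated lang value into lower-cased tags."""
    return [tag.lower() for tag in value.split()]


def should_spellcheck(lang_value: str,
                      known: Set[str] = KNOWN_CHECKERS) -> bool:
    """Return True only if every listed tag has a checker available.

    Assumption: an empty list or any unrecognized tag turns checking off.
    """
    tags = parse_lang_list(lang_value)
    return bool(tags) and all(tag in known for tag in tags)
```

With these assumed checkers, lang="en x-cpp" would be checked while 
lang="x-mail-to" would not.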


A typical blog input field could be encoded as

<textarea lang="x-html-fragment-en">

Here one sees more need for multiple language tags inside the lang 
attribute. It would make more sense to use lang="x-html-fragment en", 
or there would be a need for *very* many private languages starting 
with x-html-fragment-, including x-html-fragment-sv-fi.



On Fri, 23 Jun 2006, Sander Tekelenburg wrote:

[AUTHOR REQUIREMENTS]

Authors should set the document's language information, to enable user 
agents to accurately determine which dictionary to use when checking 
the spelling or grammar of user input.

IMO this "should" should be a "must".


What about if the author doesn't know the language?


ISO 639 Part 2 includes "und" for undetermined language. A sane 
default for a UA is to disable spell checking, or to use some 
heuristic to determine the language itself.



On Sat, 24 Jun 2006, Alexey Feldgendler wrote:
Even worse: when entering text in textarea, the user actually has a 
choice which language to write in. I think the user agent should 
provide, besides just the control to turn spellchecking on and off, a 
choice of languages.


Agreed.


If a form expects some English text to be entered, it would be wise 
to mark text written in any other language as incorrectly spelled. 
If the author expects any language then he should specify lang="mul" 
for multiple languages (again, defined by ISO 639 Part 2).


Again, a list of acceptable languages would be nice here.
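
To make the two special codes concrete, here is a rough Python sketch 
of how a UA might pick dictionaries. The function name dictionaries_for 
and the treatment of "mul" as "any installed dictionary" are 
assumptions; only the "und means no checking" default comes from the 
text above.

```python
from typing import List, Set


def dictionaries_for(lang: str, installed: Set[str]) -> List[str]:
    """Pick the dictionaries to check against for a single lang value."""
    if lang == "und":
        # Undetermined language: the default suggested above is no checking.
        return []
    if lang == "mul":
        # Assumption: "any language" means every installed dictionary may match.
        return sorted(installed)
    # Ordinary tag: use its dictionary if installed, otherwise check nothing.
    return [lang] if lang in installed else []
```

An empty result means spellchecking is off for that field.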

--
Mikko


[whatwg] Spellchecking mark III

2006-06-29 Thread Ian Hickson

I believe this answers all outstanding e-mails on the subject; please let 
me know if I missed one. I include a new proposal (still with a 
spellcheck="" attribute, based mostly on implementation feedback from the 
Mozilla guys) at the end of the e-mail.


On Sun, 11 Jun 2006, Robert Gr�sdal wrote:

 How about something like cascading behaviour sheets?
 
 <style type="text/cbs">
 #first-name {
   inputmode: user startUpper;
   spellcheck: enabled;
 }
 .cplusplus {
   spellcheck: disabled;
   autoindent: C++;
   highlight: C++;
   auto-evaluate: disabled;
 }
 .math {
   spellcheck: math;
   inputmode: math;
   highlight: math;
   auto-evaluate: enabled;
 }
 </style>
 
 I'd hate to have to specify all those attributes to every single input 
 field for sure.

Well, it wouldn't be stylistic per se, so I don't think it would belong in 
the stylesheet. Even if you disable the styles, the spellcheck settings 
still apply.


 By the way - Hello everyone, this is my first post to the list.

Welcome!


On Sun, 11 Jun 2006, Matthew Raymond wrote:

 If, however, we're really just talking about adding words to the UA 
 dictionary temporarily and for a specific site, couldn't we just do that 
 with <meta> using the same format as we do with keywords?
 
 | <meta name="vocabulary" lang="en-us"
 |  content="HTML5, WHATWG, WF2, WA1, WD1, CSS3-UI, TARDIS, ZPM, DHD">
 
 Are there actually situations where different controls would need 
 different vocabulary?!?

That's an interesting idea, but I'd shy away from doing this for now. 
Let's start small and build up...


On Sun, 11 Jun 2006, Alexey Feldgendler wrote:

 Maybe features like spellchecking, syntax highlighting and so on should 
 be controlled via CSS? That way, they can be fine-tuned to any degree of 
 precision without complicating the HTML schema. This will also reduce 
 verbosity of input elements because they would otherwise have the same 
 repeated attributes.

While I certainly sympathise with the concern of heavy input elements 
(that's why I was against the spellcheck="" attribute in the first place), 
I don't think this is stylistic. Maybe we need another kind of macro 
capability, such as XBL, for this problem.


 Information like "this input field should have autoindent" is 
 presentational.

Yeah, but you'd have to say "auto-indent this like C++", which isn't. 
IMHO.

You should raise this in the www-style list, though, if you do think it 
is appropriate.




On Sun, 11 Jun 2006, Lachlan Hunt wrote:
 
 No, spell checking is a user agent feature that should be controlled by 
 the UA and the user.  Authors should have no explicit control over it. 
 Besides, spell checking *is not* presentation, it is UA functionality 
 and so it does not belong in the presentation layer.

I agree that it isn't presentation, but I disagree that the author 
shouldn't be able to suggest whether or not to enable it. More on this 
below.


On Sun, 11 Jun 2006, Alexey Feldgendler wrote:
  
  Besides, spell checking *is not* presentation, it is UA functionality 
  and so it does not belong in the presentation layer.
 
 Visual elements  = Presentation
 Interactive elements = Behavior
 
 I think these are similar relationships. BTW, isn't the cursor CSS 
 property about behavior?

The behaviour vs presentation debate is a rat hole. The key thing is 
"should this continue to work if you disable the author stylesheet" and 
"should this continue to work if you stop using a screen".

It is clear, IMHO, that spellchecking being enabled or not is independent 
of both whether the author's stylesheet is enabled or not and whether the 
content is being displayed on a screen or not.


On Sun, 11 Jun 2006, Alexey Feldgendler wrote:
   
   One can also say that authors should not have explicit control over 
   whether hyperlinks are underlined or not.
  
  The difference is that underlining is presentation, spell checking is 
  not. The functionality of a link cannot be changed with CSS, likewise 
  spell checking shouldn't either.
 
 Enabling or disabling spell checking doesn't change the functionality of 
 an input.

Sure it does. It changes whether or not the user's typos will be flagged 
to the user or not. That seems like quite a big difference.


 It can still be used to submit arbitrary text to the server.

By that argument, type=text vs type=password is a presentational aspect. 
Or even more, type=text vs type=checkbox.


 But misspelled words in an input with spellchecking enabled are 
 underlined with a wavy red line (and the underlining style could even be 
 changed by CSS), and that's presentation.

I agree that the line itself should be stylable in CSS (if at all), but I 
disagree that the presence or absence of the line is a stylistic matter.


On Mon, 12 Jun 2006, Lachlan Hunt wrote:
 
 While the core functionality of allowing the user to enter text isn't 
 changed, I'd consider spell checking to be part of the control's 
 functionality, and so disabling it 

Re: [whatwg] Spellchecking mark III

2006-06-29 Thread Sander Tekelenburg
At 23:56 + UTC, on 2006-06-29, Ian Hickson wrote:

[...]

 On Mon, 12 Jun 2006, Alexey Feldgendler wrote:

 There's nothing really bad in allowing CSS to control behavior to some
 extent.

 CSS is the part of the document that can be disabled/replaced. If
 disabling the author styles changes the functionality of the page, then
 that's bad.

Agreed.

[...]

 On Thu, 15 Jun 2006, Sander Tekelenburg wrote:

 [...] Just like authors cannot know what font size is
 best for a user they cannot know whether a spellchecker is useful or a
 nuisance.

 But they can suggest what font-size might be most appropriate.

I don't see how allowing authors to abuse one thing is an argument to give
them more things to abuse :)

I'm well aware that *everything* can be abused, but when something is useful
enough then its potential for abuse is a downside you choose to live with.
When something is not useful enough, the abusability argument should win.

[...]

 On Fri, 23 Jun 2006, Sander Tekelenburg wrote:

  [AUTHOR REQUIREMENTS]

  Authors should set the document's language information, to enable user
  agents to accurately determine which dictionary to use when checking
  the spelling or grammar of user input.

 IMO this "should" should be a "must".

 What about if the author doesn't know the language?

Of the user input you mean? Good point. But then what if a page is in English
but accepts input in any language? The author should still indicate the
content's language, thereby triggering the wrong spellchecker for those who
wish to input text in another language. In turn, the result may well be that
authors set no lang attribute at all for the page. Surely a spec shouldn't
push authors in that direction?

A solution might be to make this a *must* and allow lang=*, so the author
can state lang=en on the body, and lang=* on the input field. I still
seriously doubt authors will use this as intended though...
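
A rough Python sketch of how that might resolve, assuming the usual
rule that lang is inherited from the nearest ancestor that sets it.
Node and effective_lang are hypothetical names, and lang="*" is the
value proposed here, not an existing one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a DOM element."""
    name: str
    lang: Optional[str] = None
    parent: Optional["Node"] = None


def effective_lang(node: Optional[Node]) -> Optional[str]:
    """Return the nearest lang value, walking up through the ancestors."""
    while node is not None:
        if node.lang is not None:
            return node.lang
        node = node.parent
    return None


# The page is marked as English; one field accepts any language.
body = Node("body", lang="en")
comment_box = Node("textarea", lang="*", parent=body)
search_box = Node("input", parent=body)
```

Here comment_box resolves to "*" while search_box inherits "en" from
the body, so only fields the author explicitly marks would escape the
page's dictionary.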

  IMPLEMENTATION REQUIREMENTS
 
  All elements can have spellchecking enabled or disabled. UAs may allow
  the user to set this flag

 Why "may"? Why not "must"? [...]

 Because you can't require a particular UI. For example, the UA could be a
 kiosk-style system, where the user is not to have any ability to do
 anything but enter his text and hit submit.

Good point. But if I'm not mistaken HTML 4.01 already says that some things
do not apply to UAs that can't implement them. I think you should at least
change this to a "should", and add a note to the spec that explains that this
means user-agents that can, should, and only those that can't are excused.

(To be clear: my argument is that the spec should do its best to avoid giving
user-agents an excuse to not bother giving the user (easy) control.)

[...]

 On Fri, 23 Jun 2006, Lachlan Hunt wrote:

 I don't particularly like giving the authors any control over spell
 checking. For the majority of cases, I think browsers should become
 smart enough to know whether or not to enable/disable spell checking
 without any explicit author input, based on various heuristics (as I've
 written about before [1]).  In other words, for most cases, authors
 should not need to use this attribute.

 The request for this attribute came from a UA in the first place. This
 would seem to suggest they can't find a way to be smart enough, and would
 like author input.

Not meaning to be disrespectful, but "can't" suggests it's simply too
difficult technically, while it could just as well be that they prefer to
take the easier way out, for instance because they can't afford spending the
necessary resources on this. That can be a perfectly valid argument for an
individual browser vendor, but it's hardly a solid basis for an HTML spec.
Especially when the request comes from a single browser vendor and everybody
else seems to see more problems than benefits.

[...]

  Ok, so how can we ensure that spell checking is disabled for GMail's To:
  line but enabled for its Subject line?

 Ordinarily, <input type="email"> would handle no spell checking for
 email addresses, but given that Gmail uses a textarea that contains both
 people's names and email addresses, that may be one case where
 heuristics may not give optimal results.

 Indeed. So how should we do it, if not using an attribute to hint to the
 UA whether it should be enabled or not?

I can't follow this. If a site uses 2 types of content within the same field
and wants one of those types to be spellchecked, and the other not, how is a
|spellcheck| attribute going to help? They'll need to split those 2 types of
content into 2 different fields.

(That aside, I don't see how users would have a problem with spellcheckers
indicating spell errors on email addresses or names. Surely they're already
used to ignoring that.)

[...]

 since the entire discussion here was spawned by
 one such browser vendor saying "we need a way for authors to control
 this!", I would suggest that browser vendors have determined that they
 need a way for authors to control this. :-)