Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Mikko Rantalainen
Peter Kasting wrote:
 2009/1/20 Mikko Rantalainen mikko.rantalai...@peda.net
 
 I agree. I think that specifying the spellcheck attribute would be a
 mistake. It allows only forcing the automatic spell checking on or off
 but it doesn't help a bit to allow mixing different languages on a
 single page.
 
 I don't see how the second sentence is an argument for the first.

If the browser does not know the language of the content, how on earth
is it supposed to *correctly* spellcheck it? I'm daily hitting a
situation where browser is trying to spellcheck content with incorrect
language. I've toggled such automatic spellchecker off and those will
stay off until correct language is detected.

My second sentence was trying to argue that the page author has no
business forcing spellchecking on if the page author cannot force
the spellchecking language! Especially in a case where the page
contains a mix of multiple languages.

 Just specify that spell checking must follow the content language.
 
 How many pages specify the content language?  AFAIK the farthest most
 authors get is to specify the encoding, and even that is frequently done
 wrong, and browsers have all kinds of crazy heuristics to try and
 second-guess authors.
 
 This seems like it would make spellchecking function very poorly on the web
 at large, whereas adding the spellcheck attribute at worst would not harm
 anyone.

I'm aware that many web pages do not specify content language. There
aren't many web pages forcing the spellchecking on or off, either.
Forcing a spellchecking on with incorrect language would harm the user!

It really does not make any sense to ever force spellchecking if the
language that the spellchecker uses is the incorrect one. The current
spellcheck attribute does not define any language and it seems that
the page author has no way to know if the spell checking should really
be disabled or not.

My point is that if the page does not specify the language then the
behavior should be explicitly undefined. This should not be changed. On
the other hand, if the content language is explicitly defined, then the
user agent has the required knowledge to decide if the spellchecking
should be enabled or disabled. There's no need for the spellcheck
attribute.

Make specifying the language the *only* accepted method for triggering
the spell checking. Specify that any unknown language must not be
spellchecked automatically. Then you automatically have a method for
forcing the automatic spell checking off and in addition to that you
have some incentive to define correct language for the page.
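
A minimal sketch of the rule proposed above, assuming a UA that knows which
dictionaries it has installed. The function name, the returned object shape
and the fallback to the primary subtag are illustrative assumptions, not
anything taken from a spec or a browser:

```javascript
// Hypothetical helper: the declared content language is the only trigger
// for automatic spellchecking.
function decideSpellcheck(lang, installedDictionaries) {
  // No declared language: the proposal leaves behaviour up to the UA.
  if (!lang) return { mode: "ua-defined" };

  const wanted = lang.toLowerCase();
  const installed = installedDictionaries.map((d) => d.toLowerCase());

  // Exact match first, e.g. "en-gb".
  if (installed.includes(wanted)) return { mode: "check", dictionary: wanted };

  // Assumption: fall back to the primary subtag ("en" for "en-gb").
  const primary = wanted.split("-")[0];
  if (installed.includes(primary)) return { mode: "check", dictionary: primary };

  // Unknown or unsupported language: never spellcheck automatically.
  return { mode: "off" };
}
```

Under this sketch, declaring a language the UA has no dictionary for is what
switches automatic spellchecking off.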

If we can persuade content authors to specify the correct content
language, I believe that in the future there will be *other* benefits,
too. For example, automatic hyphenation would improve typographic
quality of web pages but automatic hyphenation is impossible unless you
know the language of the content.

-- 
Mikko






Re: [whatwg] Spellchecking mark III

2009-01-21 Thread James Graham

Mikko Rantalainen wrote:

My second sentence was trying to argue that the page author has no
business forcing spellchecking on if the page author cannot force
the spellchecking language! Especially in a case where the page
contains a mix of multiple languages.


Not really. Consider e.g. flickr in which photos may be given titles, 
descriptions and comments in the language of the user's choice but the 
site UI is not localised. If flickr decided to do <input type=text
lang=en> to get spellchecking to turn on for photo titles then that would
be much worse for the large number of non-native English speakers than
<input type=text spellcheck=on> which would likely use the user's
preferred dictionary (although this would be UA-dependent of course).


For another example, consider the case where I post on a Swedish forum 
in English, knowing that the general level of English in Sweden is 
excellent and in any case better than the level of my Swedish.


It doesn't seem reasonable to expect sites to always be localised or for 
 sites accepting multilingual user generated content to not exist. 
Therefore it seems totally counterproductive from the point of view of
people communicating in less dominant languages to require spellchecking 
to be tied to language.




Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Mikko Rantalainen
James Graham wrote:
 Mikko Rantalainen wrote:
 My second sentence was trying to argue that the page author has no
 business forcing spellchecking on if the page author cannot force
 the spellchecking language! Especially in a case where the page
 contains a mix of multiple languages.
 
 Not really. Consider e.g. flickr in which photos may be given titles, 
 descriptions and comments in the language of the user's choice but the 
 site UI is not localised. If flickr decided to do <input type=text
 lang=en> to get spellchecking to turn on for photo titles then that would
 be much worse for the large number of non-native English speakers than
 <input type=text spellcheck=on> which would likely use the user's
 preferred dictionary (although this would be UA-dependent of course).

How about <input type=text lang=mul> if the content author does not
want to specify a language? That would hint to the UA that this field
expects human language but the input may be in any language.

The current behaviour (heuristics) could be requested with <input type=text
lang=und>, which explicitly marks this input as containing text in an
undefined language.

 For another example, consider the case where I post on a Swedish forum 
 in English, knowing that the general level of English in Sweden is 
 excellent and in any case better than the level of my Swedish.

I agree. However, if the forum maintainer would rather have no text at
all than text in the wrong language, then the forum maintainer
should use <input type=text lang=sv> and the UA would correctly flag
any non-Swedish word as incorrect.

 It doesn't seem reasonable to expect sites to always be localised or for 
   sites accepting multilingual user generated content to not exist. 
 Therefore it seems totally counterproductive from the point of view of
 people communicating in less dominant languages to require spellchecking 
 to be tied to language.

I'm not suggesting spellchecking to require only a single language. I'm
requesting that if the page wants automatic spell checking it must
explicitly define the language that the spellchecking should check for.
For multiple languages case, the RFC 3066 defines the MUL language code
and for the undefined case, the UND code has been defined.

Currently the lang attribute accepts exactly one language code. For the
case where acceptable input for a forum message would be Swedish or
English, it would be nice to be able to write <input type=text
lang=sv,en> or perhaps even lang=sv,en;q=0.1.
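
To make the suggested syntax concrete, here is a rough sketch of how such a
weighted list could be parsed. The list syntax is hypothetical (the lang
attribute takes a single tag), the q weights are borrowed from HTTP
Accept-Language, and parseLangList is an invented name. The example uses sv,
the ISO 639-1 code for Swedish:

```javascript
// Hypothetical parser for a value such as "sv,en;q=0.1".
// Returns the language tags ordered by descending weight.
function parseLangList(value) {
  return value
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0)
    .map((item) => {
      const [tag, ...params] = item.split(";").map((s) => s.trim());
      let q = 1; // a tag without an explicit weight counts as fully preferred
      for (const p of params) {
        const m = /^q=([0-9]*\.?[0-9]+)$/.exec(p);
        if (m) q = Math.min(1, parseFloat(m[1]));
      }
      return { tag: tag.toLowerCase(), q };
    })
    .sort((a, b) => b.q - a.q);
}

// Swedish is preferred, English is tolerated with a low weight.
console.log(parseLangList("sv,en;q=0.1"));
```

A UA could then try dictionaries in that order, although how the weights
would actually influence spellchecking is left open here.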

-- 
Mikko






Re: [whatwg] Canvas arcTo all points on a line

2009-01-21 Thread Philip Taylor
On Sat, Dec 27, 2008 at 9:37 AM, Dirk Schulze vb...@gmx.de wrote:
 Hi,

 have two questions to the all points on a line part of canvas' arcTo.
 A short example:

 moveTo(50,0);
 arcTo(100,0,  0,0, 10);

 This should add a new, from p1 infinite far away, point to the subpath
 and draw a straight line to it.

 Two questions.

 1) If I add lineTo(50, 50); after arcTo(..). Wouldn't it draw a quasi
 parallel line to the line of arcTo? Because (Xx, Yx) (mentioned in the
 spec) is infinite far away. That means, we will never reach this point
 in reality.

It should draw a really parallel line, with one end at (50,50) and the
other end infinitely far away in the direction determined by the
arcTo.

 2) We don't allow infinite values for moveTo or lineTo, but can make
 this happen with arcTo.
 The example above would be the same as lineTo(-Infinite, 0);
 But we can make moveTo(-Infinite, 0) too with the example above. Just
 make strokeStyle transparent, use arcTo from above and you're done. And
 moveTo(infinite, infinite); would be possible too.

You can moveTo(-1e+300, 0) and moveTo(1e+300, 2e+300), which are much
more similar to what arcTo is meant to do.

Considering the general case where the arcTo's points are not
perfectly horizontal, the idea is that the point is not simply a point
with coordinates (+/-Infinity, +/-Infinity) - it's really the
(theoretical) limit of a point with coordinates (x+dx*t, y+dy*t) as t
approaches infinity, where x,y,dx,dy represent the position/direction
of the (x1,y1)--(x2,y2) line.

Where the spec says "(x∞, y∞) is the point that is infinitely far away
from (x1, y1), that lies on the same line as (x0, y0), (x1, y1), and
(x2, y2)", you could read it as "...the point that is very very far
away from ...", e.g. take the (x1,y1)--(x2,y2) line and then move
1e+100 units in that direction, and it would be good enough that
nobody would notice the tiny error.
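
As a rough illustration of that reading (farPointOnLine is an invented helper
and the default distance is simply the 1e+100 mentioned above), the "very far
away" point can be computed like this:

```javascript
// Hypothetical helper: approximate the "infinitely distant" point by moving
// a very large distance from (x1, y1) in the direction of (x2, y2).
function farPointOnLine(x1, y1, x2, y2, distance = 1e100) {
  const dx = x2 - x1;
  const dy = y2 - y1;
  const len = Math.hypot(dx, dy);
  if (len === 0) return null; // no direction if the two points coincide
  return {
    x: x1 + (dx / len) * distance,
    y: y1 + (dy / len) * distance,
  };
}

// Dirk's example: moveTo(50,0); arcTo(100,0, 0,0, 10);
const p = farPointOnLine(100, 0, 0, 0);
console.log(p.x, p.y); // a huge negative x, and y stays 0
```

Whether a real graphics backend accepts coordinates of this magnitude is a
separate question.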

You already have to handle something very similar to this case,
because (x2,y2) might be very very close to the line (x0,y0)--(x1,y1),
which means the start/end tangent points will be very very far away in
the appropriate direction. The special case where (x2,y2) is precisely
on the line is not really special - the points are just even further
(infinitely far) away in that direction.

As a concrete example: see
http://philip.html5.org/demos/canvas/arcto-inf.html, which I believe
should have output like
http://philip.html5.org/demos/canvas/arcto-inf.png (from Safari
3.0.4 for Windows). As (x2,y2) gets closer to the line of the first
two points, the start/end tangent points are pushed further over to
the left. When y2=0.1 they're far enough away that the two straight
lines are nearly horizontal; when y2=0 it's basically the same, except
now they're precisely horizontal.
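
The effect can be checked numerically with standard tangent-circle geometry:
the tangent points lie at distance r / tan(theta / 2) from (x1, y1), where
theta is the angle between the two lines at that point. This is only a
sketch; the function name is invented and the coordinates below mirror
Dirk's example rather than the demo page:

```javascript
// Hypothetical helper: where a circle of radius r touches the lines
// (x1,y1)--(x0,y0) and (x1,y1)--(x2,y2).
function arcToTangentPoints(x0, y0, x1, y1, x2, y2, r) {
  const ax = x0 - x1, ay = y0 - y1; // direction from P1 towards P0
  const bx = x2 - x1, by = y2 - y1; // direction from P1 towards P2
  const la = Math.hypot(ax, ay);
  const lb = Math.hypot(bx, by);
  const cross = ax * by - ay * bx;
  const dot = ax * bx + ay * by;
  const theta = Math.atan2(Math.abs(cross), dot); // angle at P1, 0..pi
  // theta is 0 for collinear points, so d becomes Infinity and the
  // coordinates below are only meaningful in the near-collinear case.
  const d = r / Math.tan(theta / 2);
  return {
    distance: d,
    start: { x: x1 + (ax / la) * d, y: y1 + (ay / la) * d },
    end: { x: x1 + (bx / lb) * d, y: y1 + (by / lb) * d },
  };
}

// As y2 shrinks, the tangent points move further to the left.
for (const y2 of [10, 1, 0.1]) {
  const t = arcToTangentPoints(50, 0, 100, 0, 0, y2, 10);
  console.log(y2, t.distance, t.start.x);
}
```

With y2 exactly 0 the computed distance is Infinity, which is the limiting
case discussed above.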

So I think the spec's behaviour makes sense from a theoretical
perspective, because it avoids any discontinuities in the output when
the input variables are changed a tiny bit. And it made sense from a
practical perspective, because it matched the behaviour of Safari 3.0
(though apparently things have changed in 3.1).

But I don't know if it makes sense from the perspective of someone
who's got to write an independent implementation of it. Does the above
explanation make more sense than the text in the spec? and if so, does
it seem implementable? If so, it seems best to keep the spec's
behaviour and try to clarify the spec's text. But this doesn't seem
like an important case where users will be unhappy if e.g. the arcTo
call draws nothing when all the points are on the same line, so if
it's still a pain to implement the spec's behaviour then I would be
happy with changing what the spec requires.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Aryeh Gregor
On Wed, Jan 21, 2009 at 4:15 AM, Mikko Rantalainen
mikko.rantalai...@peda.net wrote:
 If the browser does not know the language of the content, how on earth
 is it supposed to *correctly* spellcheck it? I'm daily hitting a
 situation where browser is trying to spellcheck content with incorrect
 language. I've toggled such automatic spellchecker off and those will
 stay off until correct language is detected.

In practice, I think the only way to avoid this problem is for
browsers to implement content-sniffing techniques of some kind to
figure out the language, at least per field but ideally on a
word-by-word basis.  If the browser is set to spellcheck in English
but you start putting in lots of non-Latin characters and every word
is therefore misspelled, the browser should be clever enough to try
switching the spellcheck language, or at least disabling spellcheck
for words that can't possibly be from the language it's checking
against.  More refined heuristics could detect even subtle
differences, like between British and American English, and remember
for next time which one the user usually types in.

None of this needs, or even could effectively use, author intervention:

1) The author cannot know what languages users will want to enter in
all cases.  I've sometimes found myself writing posts in Hebrew on
English-only sites, for instance.

2) The author certainly won't be able to determine the dialect or
variant of the language the user will want to use, which is necessary
for spellcheck.

3) Authors should not have to add extra markup if it's not really
necessary, because in practice, most won't.  To be as useful as
possible, spellcheck should Just Work without explicit author
intervention.
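
One very small piece of such a heuristic, skipping words that cannot belong
to a Latin-script language, might be sketched as below. This is an
illustration only: the function name is invented, and a real UA would need
far more than a script check (it says nothing about British versus American
English, for example):

```javascript
// Hypothetical heuristic: only hand words written entirely in Latin script
// to a Latin-script dictionary such as English; skip everything else.
const WORD = /[\p{L}\p{M}]+/gu;
const LATIN_ONLY = /^[\p{Script=Latin}\p{M}]+$/u;

function wordsForLatinDictionary(text) {
  const words = text.match(WORD) || [];
  return words.filter((w) => LATIN_ONLY.test(w));
}

// The Hebrew word is skipped instead of being flagged as misspelled.
console.log(wordsForLatinDictionary("hello \u05E9\u05DC\u05D5\u05DD world"));
```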


Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Bil Corry
Mikko Rantalainen wrote on 1/21/2009 5:03 AM: 
 For another example, consider the case where I post on a Swedish forum 
 in English, knowing that the general level of English in Sweden is 
 excellent and in any case better than the level of my Swedish.
 
 I agree. However, if the forum maintainer would rather have no text at
 all than text in the wrong language, then the forum maintainer
 should use <input type=text lang=sv> and the UA would correctly flag
 any non-Swedish word as incorrect.

I see value in being able to provide a hint to the UA that it should or should 
not spell check certain content, but the ultimate control should reside with 
the user.

I hate the idea of a web site dictating which dictionary must be used to spell 
check the user's content.  Spell checking is for the benefit of the user, not 
the web site, and forcing a dictionary in a language that the user doesn't 
speak is completely useless and would only serve to annoy (i.e. it wouldn't 
prevent the user from submitting content in any language of their choosing).

Beyond that, it has other problems.  Say I visit a site in the UK and it forces
the UK dictionary; as an American English speaker, I'll be confused as to why my UA is
flagging "color" as misspelled and will simply turn off spell checking entirely
since it's "broken".

Additionally, not all UAs ship with dictionaries for every single language (do 
any?), so the UA wouldn't be able to spell check when a dictionary isn't 
available for that user.  I guarantee that if my UA shipped with all of them, 
I'd remove them all except the languages I converse in to prevent the web site 
from forcing a particular dictionary.

Then there are some languages that do not have a dictionary available at all, 
such as Tamil in Firefox:

https://addons.mozilla.org/en-US/firefox/browse/type:3

I don't see any benefit to the user in forcing them to use a particular 
dictionary and the only benefit to the site is it might annoy someone into 
using a particular language (assuming they even have the dictionary for that 
language).


- Bil



Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Peter Kasting
On Wed, Jan 21, 2009 at 1:15 AM, Mikko Rantalainen 
mikko.rantalai...@peda.net wrote:

 If the browser does not know the language of the content, how on earth
 is it supposed to *correctly* spellcheck it?


As others have noted, the user's preferences are generally a better
indicator of how something should be spellchecked, for a number of reasons.
 (Bil Corry's email was on-point here.)


 I'm daily hitting a
 situation where browser is trying to spellcheck content with incorrect
 language. I've toggled such automatic spellchecker off and those will
 stay off until correct language is detected.


As I said, this seems a separate problem to me.  Dynamic language switching
or multi-language spellchecking based on various heuristics seems like the
solution here.  This applies to any spellchecked field anywhere and is
separate from the issue of whether an author wants to tell the UA that a
field is even appropriate for spellchecking or not.

My second sentence was trying to argue that the page author has no
 business forcing spellchecking on if the page author cannot force
 the spellchecking language!


I disagree completely.  Consider one of the original use cases for this:
Gmail instructing UAs to spellcheck the optional Subject field of a mail.
 There's no way Gmail can know what language(s) the user may type in this
field, but it's still appropriate to tell the UA that the field is
appropriate for spellchecking.  At this point it's up to the UA to determine
what language to use.

I also take issue with the word "force", which is imprecise.  The spellcheck
attribute spec was carefully written to ensure that the user and UA have
ultimate control over whether spellchecking actually occurs, regardless of
what the author specifies; the attribute is a hint to the UA, not a command.
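
The "hint, not a command" relationship can be sketched as a simple precedence
rule. This is a hypothetical model for illustration, not the algorithm in the
spec; the function name, parameter names and setting values are assumptions:

```javascript
// Hypothetical model: an explicit user setting always wins, the author's
// spellcheck attribute is consulted next, and the UA default comes last.
function shouldSpellcheck({ userSetting, authorHint, uaDefault }) {
  // userSetting: "on" | "off" | undefined (no explicit user preference)
  if (userSetting === "on") return true;
  if (userSetting === "off") return false;
  // authorHint: true | false | undefined (attribute absent)
  if (authorHint !== undefined) return authorHint;
  return uaDefault;
}

// The author asks for spellchecking, but the user has turned it off.
console.log(shouldSpellcheck({ userSetting: "off", authorHint: true, uaDefault: true }));
```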


 Forcing a spellchecking on with incorrect language would harm the user!


A good reason why the UA's spellchecking language should not be determined
by the author (and thus why your proposal leaves me cold).

On
 the other hand, if the content language is explicitly defined, then the
 user agent has the required knowledge to decide if the spellchecking
 should be enabled or disabled. There's no need for the spellcheck
 attribute.


The UA does not know which fields actually contain language and which
simply contain strings of characters.  Enumerating input types (e.g. "this
field contains email addresses") can address this, but suffers from two
problems:
* There are an unbounded number of input types, potentially
* Types should perhaps not always be treated equally.  For example, if an
author wrote a spelling quiz, then input boxes for a user to type in would
contain words and thus be of a spellcheckable type, but the author would
clearly prefer the UA not spellcheck them :)

If we can persuade content authors to specify the correct content
 language,


Proposals that sound like "if we could just get authors to write valid,
semantic content with no errors..." have always seemed naive to me.

PK


Re: [whatwg] Canvas arcTo all points on a line

2009-01-21 Thread Calogero Alex Baldacchino

Philip Taylor wrote:


But I don't know if it makes sense from the perspective of someone
who's got to write an independent implementation of it. Does the above
explanation make more sense than the text in the spec? and if so, does
it seem implementable? If so, it seems best to keep the spec's
behaviour and try to clarify the spec's text. But this doesn't seem
like an important case where users will be unhappy if e.g. the arcTo
call draws nothing when all the points are on the same line, so if
it's still a pain to implement the spec's behaviour then I would be
happy with changing what the spec requires.

  


I haven't checked this part of the spec so far; looking at the image
you posted, it seems the 3 points are used as control points in some
algorithm for drawing curves. Personally, thinking of an API function to
draw arcs, I would prefer the specified points to be part of the arc
itself (e.g., the two outer ones being the endpoints of a convex
elliptical arc). Anyway, what you say certainly makes sense for an arc
degenerating to a line (that is, if all points lie on the same line).
Assuming the slope and the start point of the line are known, it is easy
to find the intersection between it and a clip region (through the
midpoint algorithm). It should be the same with an (x2, y2) point very
close to the (x0, y0)--(x1, y1) segment, that is, if below a certain
threshold one can't draw an arc and the result must instead be
approximated by a half-infinite line. (I think all an implementation
needs is to remember that an infinite line has been drawn and that the
last point in the subpath is infinitely far away, so it can draw a
parallel line when .lineTo() is invoked.)


WBR, Alex




Re: [whatwg] Canvas arcTo all points on a line

2009-01-21 Thread Philip Taylor
On Wed, Jan 21, 2009 at 2:45 PM, Philip Taylor excors+wha...@gmail.com wrote:
 On Sat, Dec 27, 2008 at 9:37 AM, Dirk Schulze vb...@gmx.de wrote:
 Hi,

 have two questions to the all points on a line part of canvas' arcTo.
 A short example:

 moveTo(50,0);
 arcTo(100,0,  0,0, 10);

 This should add a new, from p1 infinite far away, point to the subpath
 and draw a straight line to it.

 [...]

After some discussion on IRC, it seems this part of the spec is not a
great idea.

As I understand it, the low-level graphics APIs have limited
coordinate range and rely on the "User agents may impose
implementation-specific limits on otherwise unconstrained inputs, e.g.
to prevent denial of service attacks, to guard against running out of
memory, or to work around platform-specific limitations." clause (and
common sense) to let them have undefined behaviour when people use
really large coordinate values. The infinitely-distant point required
by arcTo is a really large coordinate value, but we don't want this
case to be undefined behaviour (because it can occur with nice small
integer input values and people might accidentally use it).

Implementing the behaviour currently in the spec (with the
infinitely-distant point) is not trivial, because it requires code
unique to that special case (rather than falling naturally out of an
implementation of the rest of arcTo's behaviour) and has to be careful
to act enough like an infinitely-distant point while remaining within
the implementation limits.

And it seems like a rare edge case where people disagree on whether
the output is sensible, and nobody is really going to care what the
output is (as long as it's well defined); so it doesn't seem
worthwhile having everyone understand and implement the non-trivial
behaviour that's in the spec.

So, in the interest of having something that implementors are more
likely to converge on, I'd suggest replacing the behaviour in that
case (the "the direction from (x0, y0) to (x1, y1) is the opposite of
the direction from (x1, y1) to (x2, y2)" case) with simply drawing a
straight line from (x0, y0) to (x1, y1), which is easy and apparently
is what Safari on OS X already does. It's also the same as the other
case in that paragraph, so the whole paragraph can be collapsed to:

  Otherwise, if the points (x0, y0), (x1, y1), and (x2, y2) all lie
on a single straight line, then the method must add the point (x1, y1)
to the subpath, and connect that point to the previous point (x0, y0)
by a straight line.
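
A sketch of how the collapsed rule might look in code, assuming the subpath
is kept as a plain array of points. The function name and data structure are
hypothetical, and a real implementation might prefer a tolerance to an exact
zero test:

```javascript
// Hypothetical helper: handles only the collinear case of arcTo.
// Returns true if it added the straight line, false if the caller should
// go on to the normal arc computation.
function arcToCollinearCase(subpath, x1, y1, x2, y2) {
  const { x: x0, y: y0 } = subpath[subpath.length - 1];
  // The cross product of (P1 - P0) and (P2 - P1) is zero exactly when the
  // three points lie on one straight line.
  const cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1);
  if (cross === 0) {
    subpath.push({ x: x1, y: y1 }); // straight line from (x0, y0) to (x1, y1)
    return true;
  }
  return false;
}

// Dirk's example: moveTo(50,0); arcTo(100,0, 0,0, 10);
const path = [{ x: 50, y: 0 }];
console.log(arcToCollinearCase(path, 100, 0, 0, 0), path);
```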

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Window::applicationCache missing

2009-01-21 Thread Ian Hickson
On Fri, 2 Jan 2009, Cameron McCormack wrote:

 The applicationCache attribute on the Window interface seems to be 
 missing.

Fixed.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


[whatwg] Dynamic entries in the Application Cache removed

2009-01-21 Thread Ian Hickson
 
After consultation with the two implementors I'm aware of for the
Application Cache feature, I've removed the Dynamic Entries feature.

While the use case still needs to be addressed somehow, I would like to   
get more implementation experience before committing to a particular 
solution. There are other options, and the dynamic entries feature was
never really fully baked.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Calogero Alex Baldacchino

Aryeh Gregor wrote:

On Wed, Jan 21, 2009 at 4:15 AM, Mikko Rantalainen
mikko.rantalai...@peda.net wrote:
  

If the browser does not know the language of the content, how on earth
is it supposed to *correctly* spellcheck it? I'm daily hitting a
situation where browser is trying to spellcheck content with incorrect
language. I've toggled such automatic spellchecker off and those will
stay off until correct language is detected.



In practice, I think the only way to avoid this problem is for
browsers to implement content-sniffing techniques of some kind to
figure out the language, at least per field but ideally on a
word-by-word basis.  If the browser is set to spellcheck in English
but you start putting in lots of non-Latin characters and every word
is therefore misspelled, the browser should be clever enough to try
switching the spellcheck language, or at least disabling spellcheck
for words that can't possibly be from the language it's checking
against.  More refined heuristics could detect even subtle
differences, like between British and American English, and remember
for next time which one the user usually types in.

  


Why not let the user choose the language, as happens in word
processors? A UA can't choose accurately whether, for instance, "color"
is a correct American English word, a wrong British English one, or even
a correct (truncated) Italian word, while a human can do it better. Thus
a UA could provide an interface to change the language for spellchecking
a selection, or even for each misspelled word, starting from a hint
language, which could be the value of an element's lang attribute
(besides a default value and a user-preference forced one, the latter
bypassing any authored value). Also, using the lang attribute value as
the start language to check (if not in conflict with a user preference)
would allow an interactive interface with a script changing that value
according to a user's choice (UAs could also expose a list of supported
languages).
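
The precedence described above could be sketched as follows. Both function
names and the Map of per-word choices are invented for illustration; no UA is
claimed to work this way:

```javascript
// Hypothetical: a forced user preference bypasses any authored value, the
// element's lang attribute is the next hint, and the UA default comes last.
function pickStartLanguage({ userForced, authoredLang, uaDefault }) {
  if (userForced) return userForced;
  if (authoredLang) return authoredLang;
  return uaDefault;
}

// Hypothetical: the user may override the language for individual words.
function languageForWord(word, perWordChoice, startLanguage) {
  return perWordChoice.has(word) ? perWordChoice.get(word) : startLanguage;
}

const start = pickStartLanguage({ authoredLang: "en-GB", uaDefault: "en-US" });
const choices = new Map([["color", "en-US"]]); // the user marked this word
console.log(languageForWord("color", choices, start));
console.log(languageForWord("colour", choices, start));
```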


A declaration such as lang='und' sounds like telling the user agent to
"do whatever is computed as being a good choice", which is different from
telling it "don't even try to understand what the language is here, because
I know you can't guess it"; declaring a value known to be unsupported
(such as an invented one) to turn off spellchecking sounds like a hack
needed because we lack a more appropriate feature.


Everything IMHO.

WBR, Alex




Re: [whatwg] Spellchecking mark III

2009-01-21 Thread Peter Kasting
On Wed, Jan 21, 2009 at 7:38 PM, Calogero Alex Baldacchino 
alex.baldacch...@email.it wrote:

 Why not let the user choose the language, as happens in word
 processors? A UA can't choose accurately whether, for instance, "color"
 is a correct American English word, a wrong British English one, or even
 a correct (truncated) Italian word, while a human can do it better. Thus
 a UA could provide an interface to change the language for spellchecking
 a selection, or even for each misspelled word, starting from a hint
 language, which could be the value of an element's lang attribute
 (besides a default value and a user-preference forced one, the latter
 bypassing any authored value). Also, using the lang attribute value as
 the start language to check (if not in conflict with a user preference)
 would allow an interactive interface with a script changing that value
 according to a user's choice (UAs could also expose a list of supported
 languages).


I'm not sure I fully grasped everything here, but what I did grasp sounds
very much like a cross between what Chromium is doing today and what we want
to do in the future (I imagine similar things are true for other browser
vendors).  User specification and page hints are both useful tools for a UA.

But I still claim that all of those aspects are outside the scope of the
spellcheck attribute, and fall into the realm of things that should not
be in the HTML5 spec as they're very much UA-specific behavior.

PK