Re: [whatwg] Proposal in supporting the writing of Arabizi

2011-12-04 Thread Sami Eljabali
Thanks, Mark, for the clarification, and thanks all for the feedback. On the
valid point about web browsers becoming bloated by storing each language's
dictionary: I still feel more thought could be put into moving IMEs off
OSes, since tying them to the OS limits their availability. That said,
couldn't 'dictionary look-ups' be offered as a service? It could follow the
search-provider model available today, where users choose the provider the
browser uses. This would leave room for providers to emerge, given possible
incentives such as observing trends among users who speak language x, y,
or z. Worst case, one could look into a peer-to-peer solution, where users
donate their bandwidth/CPU to others. Your thoughts on this are appreciated.

Thanks for your time,
-Sami

On Thu, Dec 1, 2011 at 10:28 PM, Mark Callow callow_m...@hicorp.co.jp wrote:

 I think what is being requested by the OP is very very different from
 the things being requested in the W3C bugs linked from the below
 referenced wiki page (which seem like good ideas, but please ensure that
 '+' can be entered in phone numbers).

 As Sergiusz Wolicki pointed out, the OP is requesting an IME, and IMEs
 already exist for several languages. The corollary of this is that hooks for
 IMEs exist in all major operating systems. As Sergiusz also pointed out,
 users will want this functionality available in any text field.  I think
 it would be better to develop an Arabic IME for the OS rather than
 embedding it in browsers. Maybe such a thing already exists.

 Have you any idea of the size of the dictionary and supporting data
 needed for the Japanese IME? It is quite large. I do not think browser
 vendors will want to bloat their products with large IME dictionaries
 for even one language so any browser-based IMEs will inevitably become
 separate downloads. In which case there is no benefit compared to a
 separately downloaded OS-based IME and the disadvantage that it can't be
 used with any text field on the system.

 Regards

-Mark


 On 02/12/2011 03:36, Tab Atkins Jr. wrote:
  On Thu, Dec 1, 2011 at 1:07 AM, Sami Eljabali seljab...@gmail.com wrote:
  [snip]
  *Proposal:*
 
  Have the interpreter described above be embedded within browsers and
  enabled when users click and focus on text fields defined as: <input
  type=text lang=arabizi> to interpret Arabizi
  <http://en.wikipedia.org/wiki/Arabic_chat_alphabet> as Arabic.
  Should a browser not support it, then the <input type=text> would be the
  fallback attribute leaving users writing in a plain text field.
  We are looking into something like this for many languages.  I've
  attempted to record this as a use-case on
  http://wiki.whatwg.org/wiki/Text_input_keyboard_mode_control, but I
  can't figure out how to upload images yet.  Once I do, I'll add
  screenshots, an explanation, and a link to this thread.
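
To make the proposal above concrete, here is a minimal sketch of the kind of
character-level interpreter it seems to have in mind. The original description
of the interpreter is snipped from this thread, so everything below is an
assumption: the table `ARABIZI_TO_ARABIC` and the function `transliterate` are
hypothetical names, and the mapping is a small, simplified subset of common
Arabizi conventions rather than any agreed standard.

```python
# Hypothetical, heavily simplified Arabizi-to-Arabic table (an assumption for
# illustration only; real usage varies by region and by writer).
ARABIZI_TO_ARABIC = {
    # Two-character sequences are tried before single characters.
    "sh": "ش", "kh": "خ", "gh": "غ", "th": "ث",
    # Digits that stand in for sounds with no Latin letter.
    "2": "ء", "3": "ع", "5": "خ", "7": "ح",
    # Plain letters (vowels are mapped naively to long vowels).
    "a": "ا", "b": "ب", "d": "د", "f": "ف", "h": "ه", "i": "ي",
    "j": "ج", "k": "ك", "l": "ل", "m": "م", "n": "ن", "r": "ر",
    "s": "س", "t": "ت", "u": "و", "w": "و", "y": "ي", "z": "ز",
}


def transliterate(text: str) -> str:
    """Map Arabizi text to Arabic script one token at a time."""
    lowered = text.lower()
    out = []
    i = 0
    while i < len(lowered):
        # Prefer the longest key that matches at this position.
        for width in (2, 1):
            chunk = lowered[i:i + width]
            if len(chunk) == width and chunk in ARABIZI_TO_ARABIC:
                out.append(ARABIZI_TO_ARABIC[chunk])
                i += width
                break
        else:
            # Pass through anything the table does not cover.
            out.append(lowered[i])
            i += 1
    return "".join(out)


print(transliterate("3rby"))
```

A table like this cannot decide when a Latin vowel should be written and when
it should be dropped, which is one reason a usable input method needs the
dictionary data discussed elsewhere in this thread.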



Re: [whatwg] Proposal in supporting the writing of Arabizi

2011-12-04 Thread Mark Callow
Why do you feel it is necessary to move IMEs off OSes? As far as I know
the OS ones are all freely downloadable or included in OS distributions.
The downloadable ones are not even as hard to find as they used to be.
They're needed for all text input fields across the system. They're
complicated enough that I wouldn't want to have to learn different ones
in different applications.

I quite agree about the dictionaries and not just for IMEs. I have a
ridiculous number of English dictionaries installed on my system, e.g.,
one in Thunderbird, one in Firefox, one in MS Office, one in XMLMind,
one in Foxit Reader plus a host of others. I also have separate copies
of the _same_ Japanese dictionaries in Thunderbird and Firefox for use
by the Rikaichan plug-in. However, having dictionary look-up available
only as a network service is a very dangerous way to go from the
perspective of civil rights and liberties. It needs to be a service
available locally, perhaps with an option to go to the network.

Regards

-Mark
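
The "local first, network optional" arrangement described above could look
roughly like the following. This is only a sketch under assumptions: the class
`DictionaryLookup`, its `candidates` method, and the `Provider` callable type
are hypothetical names invented for illustration, and no real browser or OS
API is implied.

```python
from typing import Callable, Dict, List, Optional

# A provider takes the typed key and returns candidate conversions.
# Hypothetical stand-in for a user-chosen network service.
Provider = Callable[[str], List[str]]


class DictionaryLookup:
    """Look up conversion candidates locally, then optionally on the network."""

    def __init__(self, local: Dict[str, List[str]],
                 remote: Optional[Provider] = None) -> None:
        self.local = local
        self.remote = remote  # None means the user has not opted in

    def candidates(self, key: str) -> List[str]:
        hits = self.local.get(key)
        if hits:
            # Local data answers the query; nothing leaves the machine.
            return hits
        if self.remote is None:
            return []
        try:
            return self.remote(key)
        except OSError:
            # Network failures must not break text entry.
            return []


offline = DictionaryLookup({"salam": ["سلام"]})
print(offline.candidates("salam"))
```

With `remote` left as `None`, no keystrokes are ever sent to a third party,
which is the property the civil-liberties concern above turns on.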




Re: [whatwg] Proposal in supporting the writing of Arabizi

2011-12-04 Thread Sami Eljabali
By not moving IMEs off OSes, you're asking every OS that connects to the
internet to support this feature. A netbook, for example, may have nothing
but a native web browser on it. Would its OS then need to implement its own
IMEs for a few languages just to allow their entry? Instead, its web browser
could simply support the input field, given that browsers can already render
those languages.





Re: [whatwg] Proposal in supporting the writing of Arabizi

2011-12-04 Thread Ryosuke Niwa
On Sun, Dec 4, 2011 at 8:05 PM, Sami Eljabali seljab...@gmail.com wrote:

 By not moving IMEs off OSes, you're asking every OS that connects to the
 internet to support this feature. A netbook, for example, may have nothing
 but a native web browser on it. Would its OS then need to implement its own
 IMEs for a few languages just to allow their entry? Instead, its web browser
 could simply support the input field, given that browsers can already render
 those languages.


Why would implementing an IME for such an OS be harder than implementing one
for the web browser?

- Ryosuke


Re: [whatwg] Proposal in supporting the writing of Arabizi

2011-12-04 Thread Glenn Maynard
On Sun, Dec 4, 2011 at 11:05 PM, Sami Eljabali seljab...@gmail.com wrote:

 By not moving IMEs off OSes, you're asking every OS that connects to the
 internet to support this feature. A netbook, for example, may have nothing
 but a native web browser on it. Would its OS then need to implement its own
 IMEs for a few languages just to allow their entry? Instead, its web browser
 could simply support the input field, given that browsers can already render
 those languages.


Input methods are the job of the operating system, just like file access
and networking; they're a component of user input.  If a system wants to run
only a browser, it's still the *system's* responsibility to provide input
methods; they should no more be moved to browsers than should ext4 or
TCP/IP.

I can also guarantee that actual users don't want browsers to use a
different input method for complex scripts like Japanese, any more than
they want browsers to have their own built-in filesystems or networking
protocols.  They (which includes myself) want input methods to act the same
way in Firefox as they do in Office and Photoshop and terminal windows and
everything else.

-- 
Glenn Maynard


Re: [whatwg] Default encoding to UTF-8?

2011-12-04 Thread Henri Sivonen
On Fri, Dec 2, 2011 at 6:29 PM, Glenn Maynard gl...@zewt.org wrote:
 On Fri, Dec 2, 2011 at 10:46 AM, Henri Sivonen hsivo...@iki.fi wrote:

 Regarding your (and 16) remark, considering my personal happiness at
 work, I'd prioritize the eradication of UTF-16 as an interchange
 encoding much higher than eradicating ASCII-based non-UTF-8 encodings
 that all major browsers support. I think suggesting a solution to the
 encoding problem while implying that UTF-16 is not a problem isn't
 particularly appropriate. :-)
...
 I don't think I'd call it a bigger problem, though, since it's comparatively
 (even vanishingly) rare, where untagged legacy encodings are a widespread
 problem that gets worse every day we can't think of a way to curtail it.

From an implementation perspective, UTF-16 has its own class of bugs that
are unlike other encoding-related bugs, and fixing those bugs is
particularly annoying because you know that UTF-16 is so rare that the
fix has little actual utility.

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/
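
The message above does not say which UTF-16 bugs are meant, so the following
is only an illustration of why UTF-16 tends to behave differently from the
ASCII-based encodings: it is not ASCII-compatible, its meaning depends on byte
order, and some characters take two code units. The helper name
`utf16_byte_order` is hypothetical.

```python
import codecs
from typing import Optional


def utf16_byte_order(data: bytes) -> Optional[str]:
    """Return 'little' or 'big' if data starts with a UTF-16 BOM, else None."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return "little"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "big"
    return None


markup = "<p>"
little_endian = markup.encode("utf-16-le")

# 1. Not ASCII-compatible: every ASCII character is followed by a zero byte,
#    so byte-level scanning for "<p>" does not find it.
print(little_endian)
print(markup.encode("ascii") in little_endian)

# 2. Byte order matters: without a BOM the same bytes decode to
#    unrelated characters under the other byte order.
print(utf16_byte_order(little_endian))
print(little_endian.decode("utf-16-be") == markup)

# 3. Characters outside the Basic Multilingual Plane need two 16-bit units.
print(len("\U0001F600".encode("utf-16-le")) // 2)
```

None of these cases arise with the ASCII-based legacy encodings, which is
consistent with the point above that UTF-16 needs its own handling even
though it is rarely seen.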


Re: [whatwg] Default encoding to UTF-8?

2011-12-04 Thread Glenn Maynard
On Mon, Dec 5, 2011 at 1:30 AM, Henri Sivonen hsivo...@iki.fi wrote:

 From an implementation perspective, UTF-16 has its own class of bugs that
 are unlike other encoding-related bugs, and fixing those bugs is
 particularly annoying because you know that UTF-16 is so rare that the
 fix has little actual utility.


There are lots of things like that on the platform, though, and this one
doesn't really get worse over time.  More and more content with untagged
legacy encodings accumulates every day, regularly causing user-visible
problems, which is why I'd call it a much bigger issue.

-- 
Glenn Maynard
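
The user-visible problem with untagged legacy content can be shown in a few
lines. The function below is a sketch under assumptions, not what any browser
does: `decode_untagged` is a hypothetical name, and the strategy (UTF-8 BOM,
then strict UTF-8, then a guessed legacy default) is chosen only to make the
failure mode visible.

```python
import codecs


def decode_untagged(data: bytes, legacy_default: str = "windows-1252") -> str:
    """Decode bytes that carry no encoding label, guessing where needed."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    try:
        # Strict UTF-8 rarely accepts text written in a legacy encoding.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # From here on the result depends on a guess about the author's locale.
        return data.decode(legacy_default, errors="replace")


# A correct guess round-trips.
print(decode_untagged("é".encode("windows-1252")))

# A wrong guess silently produces the wrong character: this byte was
# written as ISO-8859-2 but is read as windows-1252.
print(decode_untagged("ž".encode("iso-8859-2")))

# The reverse mistake, UTF-8 read as windows-1252, gives familiar mojibake.
print("é".encode("utf-8").decode("windows-1252"))
```

Nothing in the bytes themselves says which guess is right, so every new
untagged page adds another case where the reader's default decides what is
displayed.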