Re: [whatwg] Proposal in supporting the writing of Arabizi
Thanks, Mark, for the clarification, and thanks, all, for the feedback.
To the valid point, however, regarding the result of bloated web browsers storing each language's dictionary: I feel more thought could be put into swaying IMEs off OSes, as it limits availability for all. That said, couldn't we have 'dictionary look-ups' be served as a service? It could follow the search services model available today, where users choose the provider to be used by the browser itself. This would leave room for providers to emerge, given possible incentives, including noting trends circulating among users speaking x, y, or z languages. Worst case, one could look into a peer-to-peer solution, where users donate their bandwidth/CPU for others.
Your thoughts on this are appreciated.
Thanks for your time,
-Sami
On Thu, Dec 1, 2011 at 10:28 PM, Mark Callow callow_m...@hicorp.co.jp wrote:
> I think what is being requested by the OP is very different from the things being requested in the W3C bugs linked from the wiki page referenced below (which seem like good ideas, but please ensure that '+' can be entered in phone numbers). As Sergiusz Wolicki pointed out, the OP is requesting an IME, and IMEs already exist for several languages. The corollary of this is that hooks for IMEs exist in all major operating systems. As Sergiusz also pointed out, users will want this functionality available in any text field. I think it would be better to develop an Arabic IME for the OS rather than embedding it in browsers. Maybe such a thing already exists.
> Have you any idea of the size of the dictionary and supporting data needed for the Japanese IME? It is quite large. I do not think browser vendors will want to bloat their products with large IME dictionaries for even one language, so any browser-based IMEs will inevitably become separate downloads. In which case there is no benefit compared to a separately downloaded OS-based IME, and there is the disadvantage that it can't be used with every text field on the system.
> Regards
> -Mark
> On 02/12/2011 03:36, Tab Atkins Jr. wrote:
>> On Thu, Dec 1, 2011 at 1:07 AM, Sami Eljabali seljab...@gmail.com wrote:
>>> [snip]
>>> *Proposal:* Have the interpreter described above be embedded within browsers and enabled when users click and focus on text fields defined as <input type=text lang=arabizi> to interpret Arabizi (http://en.wikipedia.org/wiki/Arabic_chat_alphabet) as Arabic. Should a browser not support it, then <input type=text> would be the fallback, leaving users writing in a plain text field.
>> We are looking into something like this for many languages. I've attempted to record this as a use-case on http://wiki.whatwg.org/wiki/Text_input_keyboard_mode_control, but I can't figure out how to upload images yet. Once I do, I'll add screenshots, an explanation, and a link to this thread.
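As a rough illustration of what the proposed interpreter would have to do, here is a minimal Python sketch that handles only a few of the digit conventions used in Arabizi. The table contents and the name `arabizi_digits_to_arabic` are assumptions made for illustration and do not come from the proposal. Regional usage varies, and a usable converter would also need the dictionary data discussed in this thread, because Latin letters and omitted vowels are ambiguous.

```python
# Hypothetical, partial mapping of Arabizi digits to Arabic letters.
# Only a few widely used conventions are listed; regional usage varies.
DIGIT_MAP = {
    "2": "\u0621",  # ARABIC LETTER HAMZA
    "3": "\u0639",  # ARABIC LETTER AIN
    "5": "\u062E",  # ARABIC LETTER KHAH
    "7": "\u062D",  # ARABIC LETTER HAH
}


def arabizi_digits_to_arabic(text: str) -> str:
    """Replace known Arabizi digits; leave every other character untouched."""
    return "".join(DIGIT_MAP.get(ch, ch) for ch in text)
```

The sketch deliberately leaves Latin letters alone: choosing the right Arabic word for them is the part that needs a dictionary, which is where the size concern raised above comes in.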
Re: [whatwg] Proposal in supporting the writing of Arabizi
Why do you feel it is necessary to sway IMEs off OSes? As far as I know, the OS ones are all freely downloadable or included in OS distributions. The downloadable ones are not even as hard to find as they used to be. They're needed for all text input fields across the system. They're complicated enough that I wouldn't want to have to learn different ones in different applications.
I quite agree about the dictionaries, and not just for IMEs. I have a ridiculous number of English dictionaries installed on my system, e.g., one in Thunderbird, one in Firefox, one in MS Office, one in XMLMind, one in Foxit Reader, plus a host of others. I also have separate copies of the _same_ Japanese dictionaries in Thunderbird and Firefox for use by the Rikaichan plug-in.
However, having dictionary look-up available only as a network service is a very dangerous way to go from the perspective of civil rights and liberties. It needs to be a service available locally, perhaps with an option to go to the network.
Regards
-Mark
On 05/12/2011 07:42, Sami Eljabali wrote:
> Thanks, Mark, for the clarification, and thanks, all, for the feedback.
> To the valid point, however, regarding the result of bloated web browsers storing each language's dictionary: I feel more thought could be put into swaying IMEs off OSes, as it limits availability for all. That said, couldn't we have 'dictionary look-ups' be served as a service? It could follow the search services model available today, where users choose the provider to be used by the browser itself. This would leave room for providers to emerge, given possible incentives, including noting trends circulating among users speaking x, y, or z languages. Worst case, one could look into a peer-to-peer solution, where users donate their bandwidth/CPU for others.
> Your thoughts on this are appreciated.
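Mark's suggestion of a locally available service with an optional network step could look roughly like the following Python sketch. Everything here is hypothetical: the `make_lookup` helper, the shape of the local dictionary, and the idea that a provider is a plain callable are assumptions for illustration, not an existing browser or OS API.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical type: a lookup takes a romanized key and returns candidates.
Lookup = Callable[[str], List[str]]


def make_lookup(local: Dict[str, List[str]],
                remote: Optional[Lookup] = None) -> Lookup:
    """Build a lookup that consults the local dictionary first.

    The remote provider is called only when the user has opted in
    (remote is not None) and the local dictionary has no entry.
    """
    def lookup(key: str) -> List[str]:
        if key in local:
            return local[key]
        if remote is not None:
            try:
                return remote(key)
            except OSError:
                # Network failure: behave as if no provider were configured.
                return []
        return []

    return lookup
```

With `remote` left as `None`, no key ever leaves the machine, which addresses the civil-liberties concern. Passing a provider models the user-selected service from the earlier message.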
Re: [whatwg] Proposal in supporting the writing of Arabizi
By not moving IMEs off OSes, you're asking every OS connecting to the internet to support this feature. Netbooks, for example, may just have a native web browser on them. Would their OS then need to implement its own IME for a few languages for their entry? Instead, the web browser could just support the input field, given they can render them.
On Sun, Dec 4, 2011 at 5:17 PM, Mark Callow callow_m...@hicorp.co.jp wrote:
> Why do you feel it is necessary to sway IMEs off OSes? As far as I know, the OS ones are all freely downloadable or included in OS distributions. The downloadable ones are not even as hard to find as they used to be. They're needed for all text input fields across the system. They're complicated enough that I wouldn't want to have to learn different ones in different applications.
> I quite agree about the dictionaries, and not just for IMEs. I have a ridiculous number of English dictionaries installed on my system, e.g., one in Thunderbird, one in Firefox, one in MS Office, one in XMLMind, one in Foxit Reader, plus a host of others. I also have separate copies of the _same_ Japanese dictionaries in Thunderbird and Firefox for use by the Rikaichan plug-in.
> However, having dictionary look-up available only as a network service is a very dangerous way to go from the perspective of civil rights and liberties. It needs to be a service available locally, perhaps with an option to go to the network.
> Regards
> -Mark
> On 05/12/2011 07:42, Sami Eljabali wrote:
>> Thanks, Mark, for the clarification, and thanks, all, for the feedback.
>> To the valid point, however, regarding the result of bloated web browsers storing each language's dictionary: I feel more thought could be put into swaying IMEs off OSes, as it limits availability for all. That said, couldn't we have 'dictionary look-ups' be served as a service? It could follow the search services model available today, where users choose the provider to be used by the browser itself. This would leave room for providers to emerge, given possible incentives, including noting trends circulating among users speaking x, y, or z languages. Worst case, one could look into a peer-to-peer solution, where users donate their bandwidth/CPU for others.
>> Your thoughts on this are appreciated.
Re: [whatwg] Proposal in supporting the writing of Arabizi
On Sun, Dec 4, 2011 at 8:05 PM, Sami Eljabali seljab...@gmail.com wrote:
> By not moving IMEs off OSes, you're asking every OS connecting to the internet to support this feature. Netbooks, for example, may just have a native web browser on them. Would their OS then need to implement its own IME for a few languages for their entry? Instead, the web browser could just support the input field, given they can render them.
Why would implementing an IME for such an OS be harder than implementing one for the web browser?
- Ryosuke
Re: [whatwg] Proposal in supporting the writing of Arabizi
On Sun, Dec 4, 2011 at 11:05 PM, Sami Eljabali seljab...@gmail.com wrote:
> By not moving IMEs off OSes, you're asking every OS connecting to the internet to support this feature. Netbooks, for example, may just have a native web browser on them. Would their OS then need to implement its own IME for a few languages for their entry? Instead, the web browser could just support the input field, given they can render them.
Input methods are the job of the operating system, just like file access and networking; they are a component of user input. If a system wants to run only a browser, it's still the *system's* responsibility to provide input methods; they should no more be moved to browsers than should ext4 or TCP/IP.
I can also guarantee that actual users don't want browsers to use a different input method for complex scripts like Japanese, any more than they want browsers to have their own built-in filesystems or networking protocols. They (which includes myself) want input methods to act the same way in Firefox as they do in Office and Photoshop and terminal windows and everything else.
--
Glenn Maynard
Re: [whatwg] Default encoding to UTF-8?
On Fri, Dec 2, 2011 at 6:29 PM, Glenn Maynard gl...@zewt.org wrote:
> On Fri, Dec 2, 2011 at 10:46 AM, Henri Sivonen hsivo...@iki.fi wrote:
>> Regarding your (and 16) remark, considering my personal happiness at work, I'd prioritize the eradication of UTF-16 as an interchange encoding much higher than eradicating ASCII-based non-UTF-8 encodings that all major browsers support. I think suggesting a solution to the encoding problem while implying that UTF-16 is not a problem isn't particularly appropriate. :-)
> ...
> I don't think I'd call it a bigger problem, though, since it's comparatively (even vanishingly) rare, whereas untagged legacy encodings are a widespread problem that gets worse every day we can't think of a way to curtail it.
From an implementation perspective, UTF-16 has its own class of bugs that are unlike other encoding-related bugs, and fixing those bugs is particularly annoying because you know that UTF-16 is so rare that the fix has little actual utility.
--
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/
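One well-known difference, offered here as an illustration and not as the specific bugs Henri has in mind: UTF-16 is not ASCII-compatible, so byte-level code that scans for ASCII markup behaves differently than it does for UTF-8 or the ASCII-based legacy encodings. A minimal Python sketch, using only the standard codecs:

```python
markup = "<p>"

utf8 = markup.encode("utf-8")
utf16 = markup.encode("utf-16-le")  # little-endian, no byte order mark

# UTF-8 keeps ASCII characters as single, unchanged bytes.
assert utf8 == b"<p>"

# UTF-16 pairs each ASCII character with a zero byte.
assert utf16 == b"<\x00p\x00>\x00"

# A byte-level search for ASCII markup therefore fails on UTF-16 data.
assert b"<p>" not in utf16
```

Any code path that sniffs bytes before the encoding is known has to treat UTF-16 as a separate case for this reason.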
Re: [whatwg] Default encoding to UTF-8?
On Mon, Dec 5, 2011 at 1:30 AM, Henri Sivonen hsivo...@iki.fi wrote:
> From an implementation perspective, UTF-16 has its own class of bugs that are unlike other encoding-related bugs, and fixing those bugs is particularly annoying because you know that UTF-16 is so rare that the fix has little actual utility.
There are lots of things like that on the platform, though, and this one doesn't really get worse over time. More and more content with untagged legacy encodings accumulates every day, regularly causing user-visible problems, which is why I'd call it a much bigger issue.
--
Glenn Maynard