Re: Experiments with classical Greek keyboard input

2006-01-30 Thread Thomas Wolff
I've only followed this discussion partially because I'm not familiar 
with ancient Greek, but I noticed a few things.

Jan Willem Stumpel wrote:

> Proposal (I tested this, with the small alpha only, and it seems to
> work):
>
> -- Greek (modern and ancient) should use the common (international)
>    Compose file.
> -- The international Compose file should have different definitions for
>    letters with simple tonos and letters with simple oxia. At present,
>    the Compose file has
>
>    <dead_acute> <Greek_alpha> : "ά" U03AC # GREEK SMALL LETTER ALPHA WITH TONOS
>
>    (and grep "GREEK SMALL LETTER ALPHA" Compose|grep -v AND|grep OXIA
>    gives nothing!)

It should actually list the following two entries from Unicode data:
1F71;GREEK SMALL LETTER ALPHA WITH OXIA;Ll;0;L;03AC;;;;N;;;1FBB;;1FBB
1FBB;GREEK CAPITAL LETTER ALPHA WITH OXIA;Lu;0;L;0386;;;;N;;;;1F71;

I guess that's due to the following comments quoted from 
en_US.UTF-8/Compose (SUSE Linux 10.0):
# Part 2
# Compose map for Korean Hangul(Choseongul) Conjoining Jamos  automatically
# generated  from UnicodeData-2.0.14.txt at
#ftp://ftp.unicode.org/Public/2.0-Update/UnicodeData-2.0.14.txt
#   by Jungshik Shin [EMAIL PROTECTED]  2002-10-17

This means the Compose data are quite outdated (Unicode 2.0!) and should 
be updated.
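
As a rough illustration of what the UnicodeData records quoted above contain, here is a minimal Python sketch that splits such a record into its 15 semicolon-separated fields. The field labels and the function name parse_record are my own shorthand, not official identifiers. The sixth field is the decomposition mapping, which for both oxia letters points at the corresponding tonos letter.

```python
# Field layout of one UnicodeData.txt record (15 semicolon-separated fields).
# The labels are informal shorthand chosen for this sketch.
FIELDS = [
    "code", "name", "category", "combining", "bidi", "decomposition",
    "decimal", "digit", "numeric", "mirrored", "old_name", "comment",
    "upper", "lower", "title",
]


def parse_record(line):
    """Split one UnicodeData.txt line into a dict keyed by field label."""
    values = line.strip().split(";")
    return dict(zip(FIELDS, values))


small = parse_record(
    "1F71;GREEK SMALL LETTER ALPHA WITH OXIA;Ll;0;L;03AC;;;;N;;;1FBB;;1FBB")
capital = parse_record(
    "1FBB;GREEK CAPITAL LETTER ALPHA WITH OXIA;Lu;0;L;0386;;;;N;;;;1F71;")

for record in (small, capital):
    print(record["name"], "-> decomposition U+" + record["decomposition"])
```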

Jungshik Shin, would you provide us with the script or program that you 
used to generate these entries automatically? That would be much 
appreciated.
Actually, I would also like to equip my editor mined (http://towo.net/mined) 
with compose data automatically generated from Unicode data. I could 
do that myself, but Jungshik Shin's contribution would help.
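
To make the idea of generating compose data from Unicode data concrete, here is a hedged Python sketch. It is not Jungshik Shin's script (which is not shown in this thread). It uses Python's built-in unicodedata module instead of parsing UnicodeData.txt, and the DEAD_KEYS mapping from combining marks to dead-key names is an assumption made for illustration. The base letter is written in the <Uxxxx> keysym notation.

```python
import unicodedata

# Assumed mapping from combining marks to dead-key names (illustrative only).
DEAD_KEYS = {
    "\u0301": "dead_acute",
    "\u0300": "dead_grave",
}


def compose_lines(first=0x0370, last=0x03FF):
    """Yield Compose-style lines for characters in [first, last] whose
    canonical decomposition is one base character plus one known mark."""
    for cp in range(first, last + 1):
        ch = chr(cp)
        parts = unicodedata.normalize("NFD", ch)
        if len(parts) != 2 or parts[1] not in DEAD_KEYS:
            continue
        name = unicodedata.name(ch, None)
        if name is None:  # unassigned code point
            continue
        yield '<%s> <U%04X> : "%s" U%04X # %s' % (
            DEAD_KEYS[parts[1]], ord(parts[0]), ch, cp, name)


for line in compose_lines():
    print(line)
```

Running the same function over the Greek Extended block (0x1F00 to 0x1FFF) yields the same key sequence for U+1F71 as for U+03AC, because both decompose to alpha plus the combining acute. A generator of this kind therefore cannot tell tonos from oxia by itself; distinct key sequences would have to be assigned by hand.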

Also, the following information would help:
* What are the preferred keys that users would like to use to enter 
  oxia, tonos, etc as accent prefix or combination keys?
* Are any common keys (like quote mark, grave, acute) typically 
  associated with Greek accents or is that rather random and subject 
  to individual preference?
* Are any common keyboard mappings in use that set some de facto standard 
  here? What are their mappings?

If someone would answer these questions in a generic way (i.e. not 
referring to X key names or mappings or even the more mysterious X 
keyboard configuration properties), I would be grateful.
(I admit the questions are a little redundant; they approach the same 
point from different angles.)

Thomas Wolff

--
Linux-UTF8:   i18n of Linux on all levels
Archive:  http://mail.nl.linux.org/linux-utf8/



Re: Experiments with classical Greek keyboard input

2006-01-30 Thread Simos Xenitellis

Thomas Wolff wrote:
> I've only followed this discussion partially because I'm not familiar
> with ancient Greek, but I noticed a few things.
>
> Jan Willem Stumpel wrote:
>
> > Proposal (I tested this, with the small alpha only, and it seems to
> > work):
> >
> > -- Greek (modern and ancient) should use the common (international)
> >    Compose file.
> > -- The international Compose file should have different definitions for
> >    letters with simple tonos and letters with simple oxia. At present,
> >    the Compose file has
> >
> >    <dead_acute> <Greek_alpha> : "ά" U03AC # GREEK SMALL LETTER ALPHA WITH TONOS
> >
> >    (and grep "GREEK SMALL LETTER ALPHA" Compose|grep -v AND|grep OXIA
> >    gives nothing!)
>
> It should actually list the following two entries from Unicode data:
> 1F71;GREEK SMALL LETTER ALPHA WITH OXIA;Ll;0;L;03AC;;;;N;;;1FBB;;1FBB
> 1FBB;GREEK CAPITAL LETTER ALPHA WITH OXIA;Lu;0;L;0386;;;;N;;;;1F71;
>
> I guess that's due to the following comments quoted from
> en_US.UTF-8/Compose (SUSE Linux 10.0):
>
> # Part 2
> # Compose map for Korean Hangul(Choseongul) Conjoining Jamos  automatically
> # generated  from UnicodeData-2.0.14.txt at
> #ftp://ftp.unicode.org/Public/2.0-Update/UnicodeData-2.0.14.txt
> #   by Jungshik Shin [EMAIL PROTECTED]  2002-10-17
>
> This means the Compose data are quite outdated (Unicode 2.0!) and should
> be updated.
>
> Jungshik Shin, would you provide us with the script or program that you
> used to generate these entries automatically? That would be much
> appreciated.
> Actually, I would also like to equip my editor mined http://towo.net/mined
> with compose data automatically generated from Unicode data. I could
> do that myself but Jungshik Shin's contribution would help.
>
> Also, the following information would help:
> * What are the preferred keys that users would like to use to enter
>   oxia, tonos, etc as accent prefix or combination keys?
> * Are any common keys (like quote mark, grave, acute) typically
>   associated with Greek accents or is that rather random and subject
>   to individual preference?
> * Are any common keyboard mappings in use that set some de facto standard
>   here? What are their mappings?
>
> If someone would answer these questions in a generic way (i.e. not
> referring to X key names or mappings or even the more mysterious X
> keyboard configuration properties), I would be grateful.
> (I admit the questions are a little bit redundant, trying to achieve
> the same result under different aspects.)

You can have a look at this document:
http://planet.hellug.gr/misc/polytonic/
Although it is in Greek, it should be feasible to discern the 
proposed combinations. For example, "Νεκρό πλήκτρο" in the list means 
"dead key".

If you have any questions, feel free to ask me.

The Compose file should be broken into smaller files per script rather 
than kept as one big monolithic file.
There is increasing interest in updating this area of Xorg 
(http://community.livejournal.com/xkbconfig/) and I hope it gets done soon.


Simos




Re: Experiments with classical Greek keyboard input

2006-01-30 Thread Jan Willem Stumpel
Simos Xenitellis wrote:

> You can have a look at this document,
> http://planet.hellug.gr/misc/polytonic/ Although it is in Greek, it
> should be feasible to discern the combinations proposed. For example,
> Νεκρό πλήκτρο is Dead key in the list. If there are queries, feel
> free to refer to me.

Very interesting. Is this a proposal, or has it been implemented?
According to Babelfish, you say "Your distribution of Linux that
has been published after October 2005 should include the renewed system
that we describe here". Mine does not, but I don't trust the Babelfish
translation.

As far as I can see, it would not be difficult to implement it. Nothing
would have to be changed in the binaries, only in the xkb and Compose
files.

I noticed you only want to use 'two level' keys (normal and shift), not
using AltGr. Is this some kind of standard (e.g. a Greek national
standard, or some other kind of standard)? The present pc/gr file in xkb
uses 'three level' keys.

BTW I suppose when you say that tonos/oxia is on the ; key, you mean the
key which is ; on US keyboards, not the key which is ; on Greek keyboards?

> The Compose file should be broken in smaller files per script
> rather than having a big monolithic file.

What advantage would this bring? If we have many small pieces of the
Compose file, how is the user (or the system) supposed to decide when to
use which piece? Wouldn't this create another configuration problem?

UTF-8 allows using one system for all languages and scripts, without
changing locales. There is only one, IMHO unavoidable, but small,
disadvantage: some files (like fonts, and the Compose file) tend to
become rather big. But memory and disk space are not as expensive as
they used to be. And the user does not notice anything of this. She just
thinks: wow! I can input any language anywhere, at any time!

> There is increasing interest in updating this area of Xorg
> (http://community.livejournal.com/xkbconfig/) and I hope it gets done
> soon.

Hmm.. xkb and Compose are two completely different mechanisms. One
is input to the other. People often complain about xkb being
'mysterious' or 'arcane'. Since xfree86 4.3 and x.org came around, it
isn't anymore. It just lacks user-level documentation. Recently, thanks
to this list, I have come close enough to enlightenment to attempt a
user-level description on my utf-8 page, sections 6.1 and 6.2
(http://www.jw-stumpel.nl/stestu).

Regards, Jan






Re: Experiments with classical Greek keyboard input

2006-01-30 Thread Simos Xenitellis

Jan Willem Stumpel wrote:

> Simos Xenitellis wrote:
>
> > You can have a look at this document,
> > http://planet.hellug.gr/misc/polytonic/ Although it is in Greek, it
> > should be feasible to discern the combinations proposed. For example,
> > Νεκρό πλήκτρο is Dead key in the list. If there are queries, feel
> > free to refer to me.
>
> Very interesting. Is this a proposal, or has it been implemented?
> According to Babelfish, you say "Your distribution of Linux that
> has been published after October 2005 should include the renewed system
> that we describe here". Mine does not, but I don't trust the Babelfish
> translation.
  

The referenced document is indeed a proposal.
You are correct about October 2005. Several distributions were released 
in October (Ubuntu, OpenSUSE), so the plan was to have the changes 
upstream by the end of the summer so that they would reach the new 
distributions as they appeared.
However, this plan did not work out and we have still not submitted 
these changes.

Konstantinos Pistiolis is working on this subject.

> As far as I can see, it would not be difficult to implement it. Nothing
> would have to be changed in the binaries, only in the xkb and Compose
> files.
>
> I noticed you only want to use 'two level' keys (normal and shift), not
> using AltGr. Is this some kind of standard? (e.g. Greek national
> standard, or some other kind of standard)? The present pc/gr file in xkb
> uses 'three level' keys.
  
As far as I know, there is no national standard for Greek polytonic. 
Windows XP supports Greek polytonic; however, it has an inherent 
disadvantage: you cannot stack more than one dead key, so quite a lot 
of keys have to be used as dead keys. In addition, if a character 
accepts more than one diacritic, then you need three dead keys to cover 
all the cases (diacritic A, diacritic B, diacritic A+B).
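
The arithmetic behind this can be sketched in a few lines of Python. The sketch assumes that every combination of diacritics is wanted and that dead keys cannot be stacked, so each non-empty subset needs its own key; the function name dead_keys_needed is made up for this illustration. In practice not every subset is a valid combination in Greek, so the figure is an upper bound.

```python
from itertools import combinations


def dead_keys_needed(diacritics):
    """Return every non-empty subset of the given diacritics. Without
    stacking, each subset would need a dedicated dead key."""
    keys = []
    for size in range(1, len(diacritics) + 1):
        keys.extend(combinations(diacritics, size))
    return keys


for combo in dead_keys_needed(["A", "B"]):
    print("+".join(combo))
```

For n diacritics this gives 2**n - 1 dead keys, whereas stackable dead keys need only n.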


Regarding the usage of AltGr: there have been quite a few discussions on 
whether to use it or not. I do not have the full details at my disposal.

Kostas, would you like to chip in on this?

> BTW I suppose when you say that tonos/oxia is on the ; key, you mean the
> key which is ; on US keyboards, not the key which is ; on Greek keyboards?
  

Indeed, it is the physical key that carries ; on the US keyboard.
The proposal document does not include a specific dead key to produce 
oxia. The Windows XP layout does have such a dead key for those 
end-users who would like to use it, although it is in an uncomfortable 
location.
  

> > The Compose file should be broken in smaller files per script
> > rather than having a big monolithic file.
>
> What advantage would this bring? If we have many small pieces of the
> Compose file, how is the user (or the system) supposed to decide when to
> use which piece? Wouldn't this create another configuration problem?
  
The configuration mechanism of Xorg would shield the end-user from this 
complexity. I am referring to the needs of the developers.
For example, suppose the community of a lesser-known language wants to 
make an installable package that adds writing support. This could be 
done by dropping the appropriate files into the appropriate directory. 
Otherwise, the monolithic file would need to be patched.
In addition, the Polytonic section in the Compose file is well suited to 
being auto-generated by a script, because the multiple diacritics on 
vowels produce many combinations.
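
A minimal sketch of such enumeration, in Python, under stated assumptions: the three groups of marks below (breathing, accent, iota subscript) and their labels are my own simplification, and the script only lists which stacked combinations have a single precomposed character according to Python's unicodedata module. It is not the script used for the proposal, and it does not emit Compose syntax.

```python
import itertools
import unicodedata

# Combining marks, grouped by role. The labels are informal.
BREATHINGS = {"": "", "psili": "\u0313", "dasia": "\u0314"}
ACCENTS = {"": "", "oxia": "\u0301", "varia": "\u0300",
           "perispomeni": "\u0342"}
SUBSCRIPT = {"": "", "ypogegrammeni": "\u0345"}


def polytonic_forms(base):
    """Map each combination of mark labels to the precomposed character
    for `base`, keeping only combinations that compose to one character."""
    forms = {}
    for (bn, b), (an, a), (sn, s) in itertools.product(
            BREATHINGS.items(), ACCENTS.items(), SUBSCRIPT.items()):
        if not (b or a or s):
            continue  # skip the bare letter
        composed = unicodedata.normalize("NFC", base + b + a + s)
        if len(composed) == 1:
            labels = tuple(n for n in (bn, an, sn) if n)
            forms[labels] = composed
    return forms


for labels, ch in polytonic_forms("\u03b1").items():
    print("+".join(labels), ch, "U+%04X" % ord(ch))
```

Note that normalization maps a lone acute on alpha to U+03AC (tonos) rather than U+1F71 (oxia), so a generated table still needs a manual decision on that point.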

> UTF-8 allows using one system for all languages and scripts, without
> changing locales. There is only one, IMHO unavoidable, but small,
> disadvantage: some files (like fonts, and the Compose file) tend to
> become rather big. But memory and disk space are not as expensive as
> they used to be. And the user does not notice anything of this. She just
> thinks: wow! I can input any language anywhere, at any time!
  
As I mentioned above, splitting the files would be an advantage for 
the developers.
The end-user would only see a GUI configuration tool. No setxkbmap or 
editing of xorg.conf.

> > There is increasing interest in updating this area of Xorg
> > (http://community.livejournal.com/xkbconfig/) and I hope it gets done
> > soon.
>
> Hmm.. xkb and Compose are two completely different mechanisms. One
> is input to the other. People often complain about xkb being
> 'mysterious' or 'arcane'. Since xfree86 4.3 and x.org came around, it
> isn't anymore. It just lacks user-level documentation. Recently, thanks
> to this list, I have come close enough to enlightenment to attempt a
> user-level description on my utf-8 page, sections 6.1 and 6.2
> (http://www.jw-stumpel.nl/stestu).
  

Thanks for this.
We need to put in effort so that gswitchit (the Keyboard Indicator 
applet in GNOME) becomes more capable and more widely available.

The plan is for gswitchit to be used for KDE as well.
This is the proper direction, so that end-users are happy that their 
settings just work.


Simos





Re: question on Linux UTF8 support

2006-01-30 Thread Sheshadrivasan B





> No.  Different users might be running different locales, and those
> mentioned old applications might assume filenames to be in users'
> locale encodings.  Of course, if some user switches locales often,
> then all kinds of mess-ups might occur, unless she's consistently
> using UTF-8 (or other language-agnostic encoding) for naming files.




So essentially, what this amounts to is that you cannot prevent
junk being displayed when a user does an ls at the prompt.
Users are shooting each other in the foot as far as the display
of file names is concerned, right?
Shesh.
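
A small Python sketch of the effect being described, assuming one user names a file under an ISO-8859-1 locale and another lists it under a UTF-8 locale (and the reverse). The file name is hypothetical, and the decoding here only imitates what a terminal or ls would do with the raw bytes.

```python
name = "caf\u00e9.txt"  # hypothetical file name containing a non-ASCII letter

# Bytes written by a user in an ISO-8859-1 locale, read as UTF-8:
latin1_bytes = name.encode("iso-8859-1")
seen_in_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# Bytes written by a user in a UTF-8 locale, read as ISO-8859-1:
utf8_bytes = name.encode("utf-8")
seen_in_latin1 = utf8_bytes.decode("iso-8859-1")

print(repr(seen_in_utf8))    # contains U+FFFD, the replacement character
print(repr(seen_in_latin1))  # contains two characters where one was meant
```

If both users stick to UTF-8 for file names, the bytes round-trip unchanged and neither case arises.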


