Attached is a humble attempt at Lao syllabification rules, in the hope of
integrating Lao with TeX.
I am sending this to the tex-hyphen list and CCing the xetex list, as a
lengthy discussion on this subject took place there over the last
couple of weeks.
I will be happy to work with the group in tweaking this and running tests.
Thank you,
--
Brian Wilson, Director
Asia-Pacific International University Translation Center
The following is a brief sketch of the syllabification rules in Lao. My
apologies for not using standard conventions. Feel free to edit.
On the most basic level of word-wrapping, syllables should never be split.
Lao syllables consist of
1. Beginning Consonant (bC) [required]
2. Secondary Beginning Consonant (sbC) [for consonant clusters]
3. Vowel (V) [required]
4. Tone Mark (T) [The order of 3 and 4 can be reversed]
5. Final Consonant (fC)
6. Extra Final Consonant (efC)
7. Galan (g) [the silencer mark 0ECC on a silent consonant]
##########
##########
Consonants and consonant clusters that can begin a syllable.
1. ກ 0E81
(1) ກຣ 0E81 + 0EA3 [uncommon]
(2) ກລ 0E81 + 0EA5 [uncommon]
(3) ກວ 0E81 + 0EA7
(4) ກຼ 0E81 + 0EBC [uncommon]
2. ຂ 0E82
(1) ຂຣ 0E82 + 0EA3 [uncommon]
(2) ຂລ 0E82 + 0EA5 [uncommon]
(3) ຂວ 0E82 + 0EA7
(4) ຂຼ 0E82 + 0EBC [uncommon]
3. ຄ 0E84
(1) ຄຣ 0E84 + 0EA3 [uncommon]
(2) ຄລ 0E84 + 0EA5 [uncommon]
(3) ຄວ 0E84 + 0EA7
(4) ຄຼ 0E84 + 0EBC [uncommon]
4. ງ 0E87
5. ຈ 0E88
6. ຊ 0E8A
7. ຍ 0E8D
8. ດ 0E94
(1) ດຣ 0E94 + 0EA3 [uncommon]
9. ຕ 0E95
(1) ຕຣ 0E95 + 0EA3 [uncommon]
10. ຖ 0E96
11. ທ 0E97
12. ນ 0E99
13. ບ 0E9A
(1) ບຣ 0E9A + 0EA3 [uncommon]
(2) ບລ 0E9A + 0EA5 [uncommon]
(3) ບຼ 0E9A + 0EBC [uncommon]
14. ປ 0E9B
(1) ປຣ 0E9B + 0EA3 [uncommon]
(2) ປລ 0E9B + 0EA5 [uncommon]
(3) ປຼ 0E9B + 0EBC [uncommon]
15. ຜ 0E9C
16. ຝ 0E9D
(1) ຝຣ 0E9D + 0EA3
(2) ຝຼ 0E9D + 0EBC
17. ພ 0E9E
18. ຟ 0E9F
19. ມ 0EA1
20. ຢ 0EA2
21. ຣ 0EA3
22. ລ 0EA5
23. ວ 0EA7
24. ສ 0EAA
(1) ສຣ 0EAA + 0EA3 [uncommon]
(2) ສລ 0EAA + 0EA5 [uncommon]
(3) ສວ 0EAA + 0EA7
(4) ສຼ 0EAA + 0EBC [uncommon]
25. ຫ 0EAB
(1) ຫງ 0EAB + 0E87
(2) ຫນ 0EAB + 0E99 [This is uncommon as it has its own
character, see below]
(3) ຫຍ 0EAB + 0E8D
(4) ຫມ 0EAB + 0EA1 [This is uncommon as it has its own
character, see below]
(5) ຫຣ 0EAB + 0EA3 [uncommon]
(6) ຫລ 0EAB + 0EA5
(7) ຫວ 0EAB + 0EA7
(8) ຫຼ 0EAB + 0EBC
26. ອ 0EAD
27. ຮ 0EAE [my mac is rendering this the same as 0EA3, shame on it]
28. ໜ 0EDC
29. ໝ 0EDD
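The cluster list above can be sketched as a lookup table. This is only an illustration: the names `CLUSTER_SECONDS` and `is_initial_cluster` are made up here, the table is transcribed by hand from the list, and ສ is taken to be 0EAA (its actual code point).

```python
# Second elements allowed after each cluster-forming initial consonant,
# transcribed from the list above (hypothetical helper, not an existing API).
CLUSTER_SECONDS = {
    "\u0e81": "\u0ea3\u0ea5\u0ea7\u0ebc",  # 0E81 + 0EA3 / 0EA5 / 0EA7 / 0EBC
    "\u0e82": "\u0ea3\u0ea5\u0ea7\u0ebc",  # 0E82 + same four
    "\u0e84": "\u0ea3\u0ea5\u0ea7\u0ebc",  # 0E84 + same four
    "\u0e94": "\u0ea3",                    # 0E94 + 0EA3
    "\u0e95": "\u0ea3",                    # 0E95 + 0EA3
    "\u0e9a": "\u0ea3\u0ea5\u0ebc",        # 0E9A + 0EA3 / 0EA5 / 0EBC
    "\u0e9b": "\u0ea3\u0ea5\u0ebc",        # 0E9B + 0EA3 / 0EA5 / 0EBC
    "\u0e9d": "\u0ea3\u0ebc",              # 0E9D + 0EA3 / 0EBC
    "\u0eaa": "\u0ea3\u0ea5\u0ea7\u0ebc",  # 0EAA + 0EA3 / 0EA5 / 0EA7 / 0EBC
    # 0EAB + 0E87 / 0E99 / 0E8D / 0EA1 / 0EA3 / 0EA5 / 0EA7 / 0EBC
    "\u0eab": "\u0e87\u0e99\u0e8d\u0ea1\u0ea3\u0ea5\u0ea7\u0ebc",
}


def is_initial_cluster(pair: str) -> bool:
    """Return True if the two-character string is a listed initial cluster."""
    if len(pair) != 2:
        return False
    first, second = pair
    return second in CLUSTER_SECONDS.get(first, "")
```

A hyphenation pattern generator could use such a table to decide whether two adjacent consonants may stay together at the start of a syllable.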
############
############
Consonants that commonly end a syllable
1. ກ 0E81
2. ງ 0E87
3. ຍ 0E8D [This is a /y/ and acts as a semivowel in certain
constructions that will be explained later]
4. ດ 0E94
5. ມ 0EA1
6. ນ 0E99
7. ບ 0E9A
8. ວ 0EA7 [This is a /w/ and acts as a semivowel in certain
constructions that will be explained later]
############
############
Consonants that could conceivably end a syllable in rare occasions when
transcribing certain foreign words.
1. ຂ 0E82
2. ຄ 0E84
3. ຈ 0E88
4. ຊ 0E8A
5. ດ 0E94
6. ຕ 0E95
7. ຖ 0E96
8. ທ 0E97
9. ປ 0E9B
10. ຜ 0E9C
11. ຝ 0E9D
12. ພ 0E9E
13. ຟ 0E9F
14. ມ 0EA1
15. ຣ 0EA3
16. ລ 0EA5
17. ສ 0EAA
############
############
Consonants that can never end a syllable [unless followed immediately by the
silencer 0ECC]
1. ຫ 0EAB
2. ຢ 0EA2
3. ອ 0EAD
4. ຮ 0EAE
5. ຼ 0EBC
6. ໜ 0EDC
7. ໝ 0EDD
############
############
Extra final consonant
In order to write foreign words, Lao adds 0ECC to extra final consonants.
Every consonant except
1. ຼ 0EBC
2. ໜ 0EDC
3. ໝ 0EDD
is theoretically possible, with some more common than others.
############
############
Vowels that are written before the beginning consonant [syllable breaks ALWAYS
occur before these characters and NEVER occur after these characters]
1. ເ 0EC0
2. ແ 0EC1
3. ໄ 0EC4
4. ໃ 0EC3
5. ໂ 0EC2
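The rule for these five vowels is simple enough to sketch directly. The function name `breaks_before_prevowels` is invented for illustration, and a break index i is assumed to mean "between character i-1 and character i".

```python
# The five vowels written before the beginning consonant: 0EC0 through 0EC4.
PRE_VOWELS = "\u0ec0\u0ec1\u0ec2\u0ec3\u0ec4"


def breaks_before_prevowels(text: str) -> list:
    """Return break indices that fall immediately before a pre-consonant vowel.

    Index 0 is skipped because there is nothing to break at the start of text.
    """
    return [i for i, ch in enumerate(text) if i > 0 and ch in PRE_VOWELS]
```

For example, in the four-character string 0E81 0EB2 0EC0 0E9F the only break found this way is at index 2, just before 0EC0.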
############
############
Vowels that are written after the beginning consonant [syllable breaks NEVER
occur before these characters. Some vowels in this section and the following
section can be stacked. I can specify if necessary.]
1. ະ 0EB0
2. າ 0EB2
3. ຳ 0EB3 [can also be written as 0ECD followed by 0EB2]
4. ິ 0EB4
5. ີ 0EB5
6. ຶ 0EB6
7. ື 0EB7
8. ຸ 0EB8
9. ູ 0EB9
10. ໍ 0ECD
############
############
Vowels that are written between two consonants [syllable breaks NEVER occur
before or after these characters]
1. ັ 0EB1 [The following character must be a consonant or 0EBD
semi-vowel]
2. ົ 0EBB [The following character, after an optional tone mark (T), must be
one of: 1. a consonant, 2. the າ 0EB2 vowel when used in the /ow/ diphthong
(<0EC0> <bC> <(sbC)> <0EBB> <(T)> <0EB2>), or 3. the ວ 0EA7 semi-vowel when
used in the /ua/ diphthong (<bC> <(sbC)> <0EBB> <(T)> <0EA7> <(0EB0)>; the ວ
may be followed by ະ 0EB0 for the shortened version of this diphthong).]
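The three contexts for 0EBB could be written as regular expressions, roughly as follows. This is a sketch under assumptions: the consonant class is approximated by the range 0E81-0EAE plus 0EDC and 0EDD, the /ow/ pattern is taken to end in 0EB2 as the prose says, and `classify_mai_kon` and the labels "ow", "ua", "closed" are names invented here.

```python
import re

# Rough character classes (assumptions, not a vetted inventory).
CONS = "[\u0e81-\u0eae\u0edc\u0edd]"   # beginning / final consonants
SBC = "[\u0e81-\u0eae\u0ebc]"          # secondary beginning consonant or 0EBC
TONE = "[\u0ec8-\u0ecb]"               # the four tone marks
ONSET = CONS + SBC + "?"               # <bC> <(sbC)>

# Checked in this order: the /ua/ form would otherwise also match "closed",
# because 0EA7 is itself a consonant.
PATTERNS = {
    "ow": re.compile("\u0ec0" + ONSET + "\u0ebb" + TONE + "?" + "\u0eb2"),
    "ua": re.compile(ONSET + "\u0ebb" + TONE + "?" + "\u0ea7\u0eb0?"),
    "closed": re.compile(ONSET + "\u0ebb" + TONE + "?" + CONS),
}


def classify_mai_kon(syllable: str):
    """Return which of the three 0EBB contexts a single syllable matches."""
    for name, pattern in PATTERNS.items():
        if pattern.fullmatch(syllable):
            return name
    return None
```

The function expects one isolated syllable; finding syllable boundaries in running text needs the other rules as well.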
############
############
Vowels that can't take a final consonant
1. ະ 0EB0 [syllable break ALWAYS occurs after this character]
2. ໍ 0ECD [syllable break ALWAYS occurs after this character or the
optional tone mark immediately following it.]
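A minimal sketch of these two "always break after" rules follows. The name `breaks_after_open_vowels` is hypothetical, and one extra assumption is made: when 0ECD is followed (after an optional tone mark) by 0EB2, the pair is treated as the decomposed form of 0EB3 and no break is reported.

```python
TONES = "\u0ec8\u0ec9\u0eca\u0ecb"


def breaks_after_open_vowels(text: str) -> list:
    """Return break indices after 0EB0, or after 0ECD plus an optional tone mark.

    A break index i means "between text[i-1] and text[i]".
    """
    breaks = []
    n = len(text)
    for i, ch in enumerate(text):
        if ch == "\u0eb0":
            pos = i + 1
        elif ch == "\u0ecd":
            pos = i + 1
            if pos < n and text[pos] in TONES:
                pos += 1  # keep the tone mark with its syllable
            if pos < n and text[pos] == "\u0eb2":
                continue  # assumed to be decomposed 0EB3, not a bare 0ECD
        else:
            continue
        if pos < n:  # no break is needed at the very end of the text
            breaks.append(pos)
    return breaks
```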
############
############
/ia/ vowel, which in the old orthography is also a /y/ that can replace the
final ຍ 0E8D (see above)
1. ຽ 0EBD [can NEVER break before. If it is a final /y/, a break can occur after it]
############
############
Tones. There are four tone marks that can sit on top of the initial consonant
or on ິ ີ ຶ ື ໍ (0EB4, 0EB5, 0EB6, 0EB7, 0ECD). Note that 0EB5 and 0EB7
are also part of diphthongs (see below). Breaks can NEVER occur before these.
1. ່ 0EC8
2. ້ 0EC9
3. ໊ 0ECA
4. ໋ 0ECB
############
############
The silencer: a mark placed on a consonant, rendering it silent. Only used to
write foreign words. Usually placed on the last letter of a syllable, although
it can occur in the middle of a syllable when placed on a ຣ 0EA3 or ລ 0EA5. A
break can NEVER occur before the consonant on which this character sits, as a
consonant carrying this character (galan) cannot begin a syllable.
1. ໌ 0ECC
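The silencer rule could be expressed as a set of forbidden break positions, as in this sketch. `forbidden_breaks_silencer` is an invented name, a break index i again means "between character i-1 and character i", and the mark is assumed to follow its consonant directly.

```python
SILENCER = "\u0ecc"


def forbidden_breaks_silencer(text: str) -> set:
    """Return break indices ruled out by the silencer 0ECC."""
    forbidden = set()
    for i, ch in enumerate(text):
        if ch == SILENCER and i > 0:
            forbidden.add(i)      # never split the mark from its consonant
            forbidden.add(i - 1)  # never break before the silenced consonant
    forbidden.discard(0)  # index 0 is the start of the text, not a break
    return forbidden
```

Candidate breaks proposed by other rules can then be filtered against this set.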
############
############
The following punctuation marks can never begin a new line. Also note that
English and French punctuation symbols and rules apply. (Lao tends to add a
space around punctuation as in French, but not always.) Quotes can be written
with " " or << >>.
1. ໆ 0EC6
2. ຯ 0EAF
############
############
Vowel Diphthongs. Here is where it gets hairy as three consonant semi-vowels
are involved. [See my explanation at the beginning of this document.
Parentheses refer to optional characters.]
1. <0EC0> <bC> <(sbC)> <0EB6 or 0EB7> <(T)> <0EAD> <(fC)> [eua
vowel. Note that the beginning consonant is in the middle]
[Well, that wasn't so bad. I think that the other diphthongs are taken care of
in previous rules and notes.]
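The eua pattern translates fairly directly into a regular expression. The sketch below rests on assumptions: the character classes are rough ranges, `find_eua_syllables` is an invented name, and the optional final consonant is only accepted when it is not followed by a vowel sign or tone mark (a heuristic added here so that the first consonant of the next syllable is not swallowed).

```python
import re

CONS = "[\u0e81-\u0eae\u0edc\u0edd]"     # consonants (rough range)
SBC = "[\u0e81-\u0eae\u0ebc]"            # secondary beginning consonant or 0EBC
TONE = "[\u0ec8-\u0ecb]"                 # tone marks
MARK = "[\u0eb0-\u0ebd\u0ec8-\u0ecd]"    # signs that attach to a preceding consonant

# <0EC0> <bC> <(sbC)> <0EB6 or 0EB7> <(T)> <0EAD> <(fC)>
EUA = re.compile(
    "\u0ec0" + CONS + SBC + "?" + "[\u0eb6\u0eb7]" + TONE + "?" + "\u0ead"
    + "(?:" + CONS + "(?!" + MARK + "))?"
)


def find_eua_syllables(text: str) -> list:
    """Return (start, end) spans of eua syllables found in text."""
    return [m.span() for m in EUA.finditer(text)]
```

The lookahead cannot tell a final consonant from the initial of a following syllable that carries no mark of its own, so real input would still need the other rules or a dictionary.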
############
############
Consonants used as vowels between consonants.
1. ວ 0EA7
2. ອ 0EAD
[If ວ|ອ is preceded by a consonant (note optional tone mark) and followed
immediately by a consonant that is not followed by a vowel or tone mark then
consider C(T)ວ|ອC to be a syllable.]
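The bracketed rule above can be sketched as a regular expression with a negative lookahead. The names and character ranges are assumptions made for illustration; "vowel or tone mark" is approximated here by the ranges 0EB0-0EBD and 0EC8-0ECD.

```python
import re

CONS = "[\u0e81-\u0eae\u0edc\u0edd]"     # consonants (rough range)
TONE = "[\u0ec8-\u0ecb]"                 # tone marks
MARK = "[\u0eb0-\u0ebd\u0ec8-\u0ecd]"    # vowel signs and tone marks after a consonant

# C (T) 0EA7|0EAD C, where the last C is not followed by a vowel or tone mark.
SEMIVOWEL_SYLLABLE = re.compile(
    CONS + TONE + "?" + "[\u0ea7\u0ead]" + CONS + "(?!" + MARK + ")"
)


def find_semivowel_syllables(text: str) -> list:
    """Return (start, end) spans matching the C(T)0EA7|0EAD C rule."""
    return [m.span() for m in SEMIVOWEL_SYLLABLE.finditer(text)]
```

Pre-consonant vowels (0EC0 through 0EC4) are deliberately left out of the lookahead, since they start a new syllable anyway.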
############
############
Yeah. The end.