For OpenOffice.org integration, we have developed a new
feature to specify the collation (sorting order) for all character-based columns
in the database.
The collations we plan to support are listed below. We
need word list files for each language that show the order in which
words should be sorted. If you speak a language with special
collation rules, please submit a word list.
Fred Toussi
Maintainer, HSQLDB Project
Word lists should be in CSV format, encoded as UTF-8. Each
file should contain words that demonstrate the collation rules
of one particular language, including the handling of accented characters.
Words should be listed in correct sorted order and numbered sequentially,
as in the following example:
1, "alpha"
2, "beta"
3, "gamma"
...
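Before submitting, it may help to check that a file follows the format above. The following is a minimal sketch, not part of HSQLDB: the function names `check_word_list` and `check_word_file` are hypothetical, and the check assumes two fields per line with numbering that starts at 1 and increases by one. It does not verify the sort order itself, since that is the information the word list is meant to supply.

```python
import csv
import io


def check_word_list(text):
    """Return a list of format problems found in word-list text.

    Hypothetical helper: assumes each line is `number, "word"` and that
    numbering starts at 1 and increases by 1.
    """
    problems = []
    expected = 1
    # skipinitialspace accepts the space after the comma, as in: 1, "alpha"
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if len(row) != 2:
            problems.append(
                f"line {reader.line_num}: expected 2 fields, got {len(row)}"
            )
            continue
        number, word = row[0].strip(), row[1]
        if number != str(expected):
            problems.append(
                f"line {reader.line_num}: expected number {expected}, got {number!r}"
            )
        if not word:
            problems.append(f"line {reader.line_num}: empty word")
        expected += 1
    return problems


def check_word_file(path):
    """Read a word-list file as UTF-8 and check its format.

    Raises UnicodeDecodeError if the file is not valid UTF-8.
    The utf-8-sig codec also accepts a leading byte order mark.
    """
    with open(path, encoding="utf-8-sig", newline="") as f:
        return check_word_list(f.read())


if __name__ == "__main__":
    sample = '1, "alpha"\n2, "beta"\n3, "gamma"\n'
    print(check_word_list(sample))
```

An empty result means no format problems were found. Each entry otherwise names the line and what was expected there.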
The collations are as follows:
| Collation name | Locale |
| --- | --- |
| Afrikaans | af-ZA |
| Amharic | am-ET |
| Arabic | ar |
| Assamese | as-IN |
| Azerbaijani_Latin | az-AZ |
| Azerbaijani_Cyrillic | az-cyrillic |
| Belarusian | be-BY |
| Bulgarian | bg-BG |
| Bengali | bn-IN |
| Tibetan | bo-CN |
| Bosnian | bs-BA |
| Catalan | ca-ES |
| Czech | cs-CZ |
| Welsh | cy-GB |
| Danish | da-DK |
| German | de-DE |
| Greek | el-GR |
| Latin1_General | en-US |
| Spanish | es-ES |
| Estonian | et-EE |
| Basque | eu |
| Finnish | fi-FI |
| French | fr-FR |
| Guarani | gn-PY |
| Gujarati | gu-IN |
| Hausa | ha-NG |
| Hebrew | he-IL |
| Hindi | hi-IN |
| Croatian | hr-HR |
| Hungarian | hu-HU |
| Armenian | hy-AM |
| Indonesian | id-ID |
| Igbo | ig-NG |
| Icelandic | is-IS |
| Italian | it-IT |
| Inuktitut | iu-CA |
| Japanese | ja-JP |
| Georgian | ka-GE |
| Kazakh | kk-KZ |
| Khmer | km-KH |
| Kannada | kn-IN |
| Korean | ko-KR |
| Konkani | kok-IN |
| Kashmiri | ks |
| Kirghiz | ky-KG |
| Lao | lo-LA |
| Lithuanian | lt-LT |
| Latvian | lv-LV |
| Maori | mi-NZ |
| Macedonian | mk-MK |
| Malayalam | ml-IN |
| Mongolian | mn-MN |
| Manipuri | mni-IN |
| Marathi | mr-IN |
| Malay | ms-MY |
| Maltese | mt-MT |
| Burmese | my-MM |
| Danish_Norwegian | nb-NO |
| Nepali | ne-NP |
| Dutch | nl-NL |
| Norwegian | nn-NO |
| Oriya | or-IN |
| Punjabi | pa-IN |
| Polish | pl-PL |
| Pashto | ps-AF |
| Portuguese | pt-PT |
| Romanian | ro-RO |
| Russian | ru-RU |
| Sanskrit | sa-IN |
| Sindhi | sd-IN |
| Slovak | sk-SK |
| Slovenian | sl-SI |
| Somali | so-SO |
| Albanian | sq-AL |
| Serbian_Cyrillic | sr-YU |
| Swedish | sv-SE |
| Swahili | sw-KE |
| Tamil | ta-IN |
| Telugu | te-IN |
| Tajik | tg-TJ |
| Thai | th-TH |
| Turkmen | tk-TM |
| Tswana | tn-BW |
| Turkish | tr-TR |
| Tatar | tt-RU |
| Ukrainian | uk-UA |
| Urdu | ur-PK |
| Uzbek_Latin | uz-UZ |
| Venda | ven-ZA |
| Vietnamese | vi-VN |
| Yoruba | yo-NG |
| Chinese | zh-CN |
| Zulu | zu-ZA |
