Re: [Firebird-devel] (Double) Metaphone?

m. Th. Mon, 01 Apr 2013 09:52:32 -0700

On 30/3/2013 12:32 μμ, Dimitry Sibiryakov wrote:

30.03.2013 11:27, m. Th. wrote:

Do you actually read the source?

    No, I actually learn languages. So, I have no idea how to transliterate Russian "щ" or "ы" or Czech "ř".


There is the "UNITED NATIONS GROUP OF EXPERTS ON GEOGRAPHICAL NAMES (UNGEGN)"
(sorry for the all caps, it was a copy / paste)

which gives romanization tables that are quite clear, cover enough alphabets, 
and are easy to implement in code.

See

http://www.eki.ee/wgrs/status.htm

Concretely, for Russian you have http://www.eki.ee/wgrs/rom1_ru.htm
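
To show how little code such a table needs, here is a minimal sketch in Python. The name romanize_ru and the table layout are my own invention for illustration, and the table is deliberately partial: the entries should be checked against the rom1_ru page above before anyone relies on them.

```python
# Partial, illustrative romanization table for Russian.
# ASSUMPTION: entries follow the UN-approved system listed at
# http://www.eki.ee/wgrs/rom1_ru.htm and must be verified against it.
RU_TO_LATIN = {
    "ж": "ž", "ч": "č", "ш": "š", "щ": "šč", "ы": "y",
    "а": "a", "к": "k", "о": "o", "т": "t", "у": "u",
}

# str.maketrans accepts a dict of single characters to replacement strings.
_RU_TABLE = str.maketrans(RU_TO_LATIN)


def romanize_ru(text: str) -> str:
    """Lowercase the input and replace every letter found in the table.

    Characters that are not in the table pass through unchanged.
    """
    return text.lower().translate(_RU_TABLE)


print(romanize_ru("щука"))
```

A real table would cover the whole alphabet, but the lookup itself would not get any more complicated than this.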

Also, let us not forget that we're talking here about a /phonetic/ engine which outputs an _aggregate_ code, hence in almost all cases we can ignore (at least in the beginning) the subtle differences between sounds (for example, see the use of apostrophes in the Georgian language at http://www.eki.ee/wgrs/rom2_ka.htm).

(One of) the main problem(s) in the design phase is the Latin alphabets which aren't already covered by the existing Double Metaphone codebase.

Such alphabets, especially in our case, are generally easy to process in order to achieve our goal, even if the scientific treatment of special letters etc. is sometimes somewhat daunting for the uninformed. For example, the Romanian alphabet is somewhat complicated in the general, scientific approach (see http://en.wikipedia.org/wiki/Romanian_alphabet ), but for our purpose it is quite easy to build the cases / phonetic conversion table.
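
As a rough sketch of what such a conversion table could look like for Romanian (Python again; the function name fold_romanian and the chosen replacements are assumptions for illustration, not a reviewed table), the special letters are folded to plain ASCII before the text would be handed to the existing Double Metaphone code:

```python
import unicodedata

# ASSUMPTION: illustrative replacements only, chosen so that an
# English-oriented phonetic encoder sees familiar letter groups.
# Both the comma-below letters and the older cedilla variants are listed,
# because both occur in real-world Romanian text.
RO_TO_ASCII = {
    "ă": "a",
    "â": "i",
    "î": "i",
    "ș": "sh", "ş": "sh",   # U+0219 (comma below), U+015F (cedilla)
    "ț": "ts", "ţ": "ts",   # U+021B (comma below), U+0163 (cedilla)
}
_RO_TABLE = str.maketrans(RO_TO_ASCII)


def fold_romanian(text: str) -> str:
    """Return a lowercase ASCII approximation of Romanian text."""
    # NFC composes "base letter + combining mark" into a single code point,
    # so decomposed input hits the same table entries.
    composed = unicodedata.normalize("NFC", text)
    return composed.lower().translate(_RO_TABLE)


print(fold_romanian("Știință"))
```

Whether "â" should fold to "a" or "i", and "ș" to "s" or "sh", is exactly the kind of decision a native speaker should make when writing the real table.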

Hence, the recommended approach is to implement the UN standard where it exists and where it is worth it (for example, I don't know if it is of high importance to implement the romanization for Tigrinya, Lao or Urdu - no offence intended), and for other languages, if someone with knowledge of the languages can provide the conversion tables for the most widespread Latin-derived alphabets from...


http://en.wikipedia.org/wiki/Category:Latin_alphabets

...it would be a plus, even if we could use Wikipedia for this. From what I see in the above list and in the Double Metaphone code, it seems that only the Eastern European alphabets are not covered by Metaphone and still need work; hence, we are in a rather good position. For the Romanian language, I (or Mariuz) can provide a translation table with ease. I don't know if there are other takers for the other EE languages...


Thoughts? Comments?

Ioan Th.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
