On 2013-04-03 09:29, Alex Peshkoff wrote:
> On 04/03/13 11:23, Kjell Rilbe wrote:
>> On 2013-04-03 09:09, Alex Peshkoff wrote:
>>> On 04/03/13 01:44, Thomas Beckmann wrote:
>>>> Second: I'd like to have this like a plugin architecture,
>>> The most simple requirement - just write a UDF (or, in FB3, an
>>> external function). No other plugins are needed.
>> I think what Thomas meant was a phonetic matching function that can be
>> configured in a way similar to collations, i.e. for this language, use
>> these phonetic rules. No programming, just config.
> Sorry - plugin means adding _code_, not just doing configuration.

:-) I know what he wrote. I just assumed he meant config, because that 
made more sense to me. If the phonetic rules have to be programmed in 
code for each language, then I see little point in building a separate 
plugin architecture for it, since UDFs and external functions already 
cover that, just as you mentioned. The complexity lies in the phonetic 
rules, not in the actual comparison of the already phonetically encoded 
data.

So, as I see it, these are the sensible options at hand:

1. Do nothing. No phonetic support at all.

2. Include a phonetic matching function that supports English only. 
Anyone needing a different language has to write a completely separate 
UDF or external function. This would be a rather simple task: just 
implement Double Metaphone and include it in the standard FB releases. 
But it adds little value for non-English data.

3. Include a phonetic matching function that uses phonetic rules from 
some kind of config. Possibly include a few languages by default, or 
provide them all as separate downloads. This is more complex and would 
probably require quite a bit of research to define how the config 
should be structured, but if successful it would provide better value 
for data in various languages.

4. Include a phonetic matching function that works well with data in any 
language or mixed languages. This would be even more difficult than 
option 3, but if successful it would provide maximum value for all 
languages. However, I think this would require several academic research 
projects, one for each language, so I consider it infeasible.
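
To make option 3 a bit more concrete, here is a rough Python sketch of 
what "rules from config" could look like. Everything in it is an 
assumption for illustration only: the RULES table layout, the function 
names phonetic_key and sounds_like, and the rules themselves, which are 
toy examples and not real phonetic rule sets for English or Swedish. It 
is not Double Metaphone and not a proposal for the actual config format.

```python
# Hypothetical per-language rule tables: ordered (pattern, replacement)
# pairs. These are toy rules for illustration, not real phonetic rules.
RULES = {
    "en": [("ph", "f"), ("ck", "k"), ("gh", ""), ("c", "k"), ("z", "s")],
    "sv": [("ck", "k"), ("sch", "sh"), ("w", "v"), ("z", "s"), ("c", "k")],
}


def phonetic_key(text, lang):
    """Encode text with the rule table configured for the given language.

    Scans left to right; at each position the first rule in the table
    whose pattern matches is applied, otherwise the character is copied.
    """
    rules = RULES[lang]
    s = text.lower()
    out = []
    i = 0
    while i < len(s):
        for pattern, replacement in rules:
            if s.startswith(pattern, i):
                out.append(replacement)
                i += len(pattern)
                break
        else:
            # No rule matched at this position: keep the character.
            out.append(s[i])
            i += 1

    # Collapse runs of the same letter so "kk" and "k" encode alike.
    key = []
    for ch in "".join(out):
        if not key or key[-1] != ch:
            key.append(ch)
    return "".join(key)


def sounds_like(a, b, lang):
    """The comparison itself is trivial: plain equality of the keys."""
    return phonetic_key(a, lang) == phonetic_key(b, lang)


if __name__ == "__main__":
    print(phonetic_key("Philip", "en"), phonetic_key("Filip", "en"))
    print(sounds_like("Wiktor", "Viktor", "sv"))
    print(sounds_like("Wiktor", "Viktor", "en"))
```

The encoder and the comparison are the same for every language. Only 
the rule table differs, which is where the real work (and the research) 
would go.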

Note: I'm not quite sure whether Double Metaphone actually claims to 
support mixed-language data, or whether it is simply intended to support 
English with some foreign/imported words and/or names subject to English 
pronunciation. Does anyone know?

Regards,
Kjell

-- 
--------------------------------------
Kjell Rilbe
DataDIA AB
E-mail: kj...@datadia.se
Phone: 08-761 06 55
Mobile: 0733-44 24 64



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel
