Hi Peter and Carlos,

Thank you for getting back in touch with me!

I appreciate your interest in the Metaphone family of algorithms. However, I 
think there has been some confusion about the objectives of your project due to 
the way that you have titled it - "Metaphone-Standards, formal specification 
for the Metaphone algorithms in different languages".

When I look up the meaning of the phrase "Formal Specification" I find this 
definition:

In computer science, a formal specification is a mathematical description of 
software or hardware that may be used to develop an implementation. It 
describes what the system should do, not (necessarily) how the system should do 
it. Given such a specification, it is possible to use formal verification 
techniques to demonstrate that a candidate system design is correct with 
respect to the specification.

In other words, a 'Formal Specification' is about the same as a 'Reference 
Implementation' - it purports to be the authoritative description of a design, 
algorithm, or other description of method. Additionally, the word 'Standards' 
implies the same thing.

Reading your project site, it seems to me that asserting a Formal Specification 
is perhaps not the main objective of the project - at least, I hope not, since 
it would be necessary to have the proper standing in relation to the technology 
in question to make such an assertion, and, as the designer of original 
Metaphone, I would be the only person with such standing.

It seems to me, however, that what you principally intend here is to implement 
a framework that would allow developers to specify phonetic encoding algorithms 
using an implementation-independent formalism, which you call an MLAV. This is 
an interesting and useful effort, but please allow me to discuss this issue a bit:

I've considered carefully the question of whether the Metaphone mappings 
specific to any particular language should be considered 'data' that should be 
separated from code. The principle that 'data' should be separated from 'code' 
is a rule of thumb (if you are not familiar with this English expression, a 
'rule of thumb' is a heuristic rule useful in many situations but perhaps not 
applicable in many others) that needs, like any engineering decision, to be 
evaluated from a practical point of view to determine whether or not applying 
it results in a net gain. My conclusion after thinking about it is this: 
segregating the mappings from the body of the code, with the aim of being 
easily able to replace the encoding rules for one language with the encoding 
rules for another, creates more problems than it solves. The reasons are these:

1) The mechanism for programmatically reading the expression of the mapping in 
a formalism and executing it necessarily requires its own code, and would be 
subject to its own bugs and other vulnerabilities; therefore, the benefit of 
adding this body of code needs to make up for the added burden.

2) Any formalism would need to be able to express any possible logic that could 
be implemented in normal program code, and it is not clear that a formalism 
exists that could express all the logic embodied in the code required to 
implement Metaphone 3 (however, I encourage you to purchase a copy of 
Metaphone 3 and try this for yourselves).

3) When maintaining and enhancing Metaphone code, I have found that the current 
code solutions as found in my implementations of Double Metaphone and 
Metaphone 3 have a high degree of clarity and maintainability, and I suspect 
that needing to revise a non-code formalism when fixing bugs or adding 
functionality would be considerably more confusing - thus, significantly 
degrading maintainability.

4) Finally, I have considered the question of what code would be shareable 
between implementations of Metaphone 3 for English and Metaphone 3 for other 
languages, which is an effort that I am working on. My conclusion is that a 
small amount of code would end up in common, but not very much, and that the 
abstraction of the ruleset for implementations of Metaphone 3 for different 
languages would require no more than a typical code refactoring, not the 
creation of a separate formalism to express mappings as data.
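
The trade-off described above can be illustrated with a minimal sketch. 
Everything in it is an assumption made for illustration: the three toy rules 
are NOT the actual Metaphone, Double Metaphone, or Metaphone 3 rule sets, and 
the names RULES, encode_from_table, and encode_in_code are hypothetical. The 
first function is a generic interpreter driven by a rule table (rules as 
'data'); the second expresses the same toy rules directly as program logic.

```python
import re

# Hypothetical toy rules for illustration only; NOT the real Metaphone rules.
# Each entry is (pattern matched at the current position, output code).
RULES = [
    (re.compile(r"PH"), "F"),         # PH sounds like F
    (re.compile(r"C(?=[EIY])"), "S"),  # C before E, I, Y sounds like S
    (re.compile(r"C"), "K"),          # any other C sounds like K
]


def encode_from_table(word, rules=RULES):
    """Rules as data: a generic interpreter; the first matching rule wins."""
    word = word.upper()
    out = []
    i = 0
    while i < len(word):
        for pattern, code in rules:
            m = pattern.match(word, i)
            if m:
                out.append(code)
                i = m.end()
                break
        else:
            # No rule matched: pass the letter through unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)


def encode_in_code(word):
    """Rules as code: the same toy rules written as ordinary branching."""
    word = word.upper()
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch == "P" and nxt == "H":
            out.append("F")
            i += 2
        elif ch == "C":
            out.append("S" if nxt in ("E", "I", "Y") else "K")
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


for sample in ("phonetic", "city", "cat"):
    print(sample, encode_from_table(sample), encode_in_code(sample))
```

Even at this toy scale, the table-driven version needs its own interpreter loop 
and a pattern language (here, regular expressions with lookahead) to express 
context, which is the added body of code and expressiveness question raised in 
points 1 and 2.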

To conclude, the "magic numbers" and the appearance of 'data' being mixed with 
code in my implementations of Double Metaphone and Metaphone 3 may look 
sub-optimal to an engineer casually examining the code, but please be aware 
that I have considered these questions extensively and I am satisfied that 
these implementations are structured optimally for maintenance of a body of 
code that, for Metaphone 3 at least, is of substantial complexity. So, while I 
encourage you to pursue your project of creating a formalism to represent 
Metaphone rulesets for different languages, please consider carefully whether 
or not this approach will create more problems than benefits. However, please 
do try mapping the logic in Metaphone 3 to your formalism. (If you do this, you 
should of course carefully compare the output of your implementation to mine 
using a large body of sample words and names. The database I used in developing 
Metaphone 3 runs to over 150 thousand words and names.)
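
A comparison of that kind could be sketched roughly as follows. This is only 
an illustration under stated assumptions: compare_encoders, reference_encoder, 
and candidate_encoder are hypothetical names, and the two stand-in encoders 
are trivial placeholders, not Metaphone implementations. In practice the word 
list would be a large corpus of words and names rather than two samples.

```python
def compare_encoders(reference, candidate, words):
    """Return (word, reference_code, candidate_code) for every disagreement."""
    mismatches = []
    for word in words:
        expected = reference(word)
        actual = candidate(word)
        if expected != actual:
            mismatches.append((word, expected, actual))
    return mismatches


# Placeholder encoders (assumptions for illustration only).
def reference_encoder(word):
    return word.upper().replace("PH", "F")


def candidate_encoder(word):
    return word.upper()


sample_words = ["phone", "tone"]
mismatches = compare_encoders(reference_encoder, candidate_encoder, sample_words)
for word, expected, actual in mismatches:
    print(f"{word}: reference={expected} candidate={actual}")
```

Collecting the disagreements, rather than stopping at the first one, makes it 
easier to see which rules of a formalism diverge from the reference.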

Also, please be aware that Double Metaphone and Metaphone 3 are both intended 
to be improvements over original Metaphone, so at the very least you should 
revise your implementation according to my reference implementation of Double 
Metaphone as available from the C/C++ Users Journal website. In other words, I 
consider original Metaphone to be deprecated.

Finally, back to the original issue: Please do not describe your project and 
website as a "Formal Specification" or "Metaphone-Standards". I've tried to 
think of a title for your project that will describe it as an effort to develop 
and implement a non-code formalism that could be used for different Metaphone 
implementations, but I'm not sure what that would be. But, please retitle your 
project so that it does not appear to claim to be a Reference Implementation, 
and find a name for it that better reflects your efforts to develop a formalism 
for phonetic encoding mappings.

Also - and please excuse me for my rudeness in bringing it up - please have a 
native English speaker review your comments on your site - clarity is important 
in engineering!

So, I thank both of you again for your interest in Metaphone algorithms, and 
the very interesting work you have already accomplished! I have been working on 
Metaphone 3 implementations for French, Spanish, and German, and I look forward 
to comparing our respective efforts.

Please let me know what you think and how you intend to proceed!

thanks,

Lawrence Philips



________________________________
 From: Peter Padua Krauss <[email protected]>
To: [email protected]; Carlos Jordão 
<[email protected]> 
Sent: Tuesday, November 22, 2011 6:55 AM
Subject: Re: Comment on MetaphoneEn_v1_1 in metaphone-standards
 

Dear Philips,

I think you are really Lawrence Philips, the father of the 1990's Metaphone 
algorithm...
You made this project possible!

This e-mail is being sent with both addresses, mine, Peter Krauss, and 
the Brazilian Portuguese Metaphone developer and second project's owner, Carlos 
Jordão.
Reply to both owners.

PS: the main objective of our project is to warrant an "interoperable Metaphone" 
for open-source ones, in a context of linguistic tools.



2011/11/21 <[email protected]>

Comment by [email protected]:
>
>I'm sorry, I meant to type: [email protected]
>
>You can also contact me at [email protected]
>
>
>For more information:
>http://code.google.com/p/metaphone-standards/wiki/MetaphoneEn_v1_1
