Hi Peter and Carlos,

Thank you for getting back in touch with me!
I appreciate your interest in the Metaphone family of algorithms. However, I think there has been some confusion about the objectives of your project due to the way that you have titled it - "Metaphone-Standards, formal specification for the Metaphone algorithms in different languages". When I look up the meaning of the phrase "Formal Specification" I find this definition:

"In computer science, a formal specification is a mathematical description of software or hardware that may be used to develop an implementation. It describes what the system should do, not (necessarily) how the system should do it. Given such a specification, it is possible to use formal verification techniques to demonstrate that a candidate system design is correct with respect to the specification."

In other words, a 'Formal Specification' is about the same as a 'Reference Implementation' - it purports to be the authoritative description of a design, algorithm, or other method. Additionally, the word 'Standards' implies the same thing.

Reading your project site, it seems to me that asserting a Formal Specification is perhaps not the main objective of the project - at least, I hope not, since it would be necessary to have the proper standing in relation to the technology in question to make such an assertion, and, as the designer of original Metaphone, I would be the only person with such standing.

It seems to me, however, that what you principally intend here is to implement a framework that would allow developers to specify phonetic encoding algorithms using an implementation-independent formalism, which you call an MLAV. This is an interesting and useful effort, but please allow me to discuss this issue a bit:

I've considered carefully the question of whether the Metaphone mappings specific to any particular language should be considered 'data' that should be separated from code.
The principle that 'data' should be separated from 'code' is a rule of thumb (if you are not familiar with this English expression, a 'rule of thumb' is a heuristic rule useful in many situations but perhaps not applicable in many others) that needs, like any engineering decision, to be evaluated from a practical point of view to determine whether or not applying it results in a net gain.

My conclusion after thinking about it is this: segregating the mappings from the body of the code, with the aim of being easily able to replace the encoding rules for one language with those for another, creates more problems than it solves. The reasons are these:

1) The mechanism for programmatically reading the expression of the mapping in a formalism and executing it necessarily requires its own code, and would be subject to its own bugs and other vulnerabilities; therefore, the benefit of adding this body of code needs to make up for the added burden.

2) Any formalism would need to be able to express any possible logic that could be implemented in normal program code, and it is not clear that a formalism exists that could express all the logic embodied in the code required to implement Metaphone 3 (however, I encourage you to purchase a copy of Metaphone 3 and try this for yourselves).

3) When maintaining and enhancing Metaphone code, I have found that the current code solutions in my implementations of Double Metaphone and Metaphone 3 have a high degree of clarity and maintainability, and I suspect that needing to revise a non-code formalism when fixing bugs or adding functionality would be considerably more confusing - thus significantly degrading maintainability.

4) Finally, I have considered the question of what code would be shareable between implementations of Metaphone 3 for English and Metaphone 3 for other languages, which is an effort that I am working on.
My conclusion is that a small amount of code would end up in common, but not very much, and that the abstraction of the ruleset for implementations of Metaphone 3 for different languages would require no more than a typical code refactoring, not the creation of a separate formalism to express mappings as data.

To conclude, the fact that there are "magic numbers" and the appearance of 'data' being mixed with code in my implementations of Double Metaphone and Metaphone 3 may look sub-optimal to an engineer casually examining the code, but please be aware that I have considered these questions extensively and I am satisfied that these implementations are structured optimally for maintenance of a body of code that, for Metaphone 3 at least, is of substantial complexity.

So, while I encourage you to pursue your project of creating a formalism to represent Metaphone rulesets for different languages, please consider carefully whether or not this approach will create more problems than benefits. However, please do try mapping the logic in Metaphone 3 to your formalism. (If you do this you should of course carefully compare the output of your implementation to mine using a large body of sample words and names. The database I used in developing Metaphone 3 runs to over 150 thousand words and names.)

Also, please be aware that Double Metaphone and Metaphone 3 are both intended to be improvements over original Metaphone, so at the very least you should revise your implementation according to my reference implementation of Double Metaphone as available from the C/C++ Users Journal website. In other words, I consider original Metaphone to be deprecated.

Finally, back to the original issue: Please do not describe your project and website as a "Formal Specification" or "Metaphone-Standards".
I've tried to think of a title for your project that will describe it as an effort to develop and implement a non-code formalism that could be used for different Metaphone implementations, but I'm not sure what that would be. But please retitle your project so that it does not appear to claim to be a Reference Implementation, and find a name for it that better reflects your efforts to develop a formalism for phonetic encoding mappings.

Also - and please excuse me for my rudeness in bringing it up - please have a native English speaker review your comments on your site - clarity is important in engineering!

So, I thank both of you again for your interest in Metaphone algorithms, and the very interesting work you have already accomplished! I have been working on Metaphone 3 implementations for French, Spanish, and German, and I look forward to comparing our respective efforts.

Please let me know what you think and how you intend to proceed!

thanks,
Lawrence Philips

________________________________
From: Peter Padua Krauss <[email protected]>
To: [email protected]; Carlos Jordão <[email protected]>
Sent: Tuesday, November 22, 2011 6:55 AM
Subject: Re: Comment on MetaphoneEn_v1_1 in metaphone-standards

Dear Philips,

I think you are really Lawrence Philips, the father of the 1990's Metaphone algorithm... You made this project possible!

This e-mail is being sent with both addresses: mine, Peter Krauss, and that of the Brazilian Portuguese Metaphone developer and the project's second owner, Carlos Jordão. Reply to both owners.

PS: the main objective of our project is to warrant an "interoperable Metaphone" for open-source ones, in a context of linguistic tools.

2011/11/21 <[email protected]>

Comment by [email protected]:
>
>I'm sorry, I meant to type: [email protected]
>
>You can also contact me at [email protected]
>
>
>For more information:
>http://code.google.com/p/metaphone-standards/wiki/MetaphoneEn_v1_1
>--
>You received this message because you starred the wiki page.

