Hi Catherine, Welcome!
I assume you are the same person as "jellyman" who posted earlier in this mailing list about exactly this same topic? If so, FYI, someone posting under multiple identities can be a predictor for various forms of mailing list nastiness, so you might want to pick one identity and stick with it here.

FYI, you're posting to the wrong mailing list. general@lucene.apache.org is not monitored by many people, and is intended for use in discussing top-level Lucene or cross-sub-project issues. You should instead be on d...@lucene.apache.org, which is devoted to *development* of Lucene and Solr. I'm sending this to both lists, but you should respond only on d...@lucene.apache.org.

For future reference, discussions of *usage* of Lucene go to java-u...@lucene.apache.org. More details here: <http://lucene.apache.org/core/discussion.html>.

Some of your tests should go into lucene/analysis/phonetic/src/test/org/apache/lucene/analysis/phonetic/TestPhoneticFilter.java - you should add your algorithm to each of the three test*() methods in that class.

As soon as possible, you should create a JIRA issue and upload your patch - it's much easier to provide guidance against a concrete implementation. In case you haven't seen it, some of the mechanics of this process are described here: <http://wiki.apache.org/lucene-java/HowToContribute>.

Happy coding!

Steve

-----Original Message-----
From: Catherine Tate [mailto:catherine_tat...@hotmail.com]
Sent: Tuesday, August 21, 2012 10:09 AM
To: general@lucene.apache.org
Subject: Making a contribution to Lucene - please help

Hi,

I'm trying to add a phonetic algorithm to the Apache Lucene project. I have the algorithm ready-to-go plus tests. All that remains is "hard-wiring" it into the project itself (which I'm finding difficult).
I need some help here (a little hand-holding).

I initially thought that I had to put it into the commons-codec sub-project, but I was corrected and told to put the algorithm into the lucene/analysis/phonetic/src/java area and implement org.apache.commons.codec.Encoder. I'm a little worried by this, as all I see there are filters, and I'm trying to follow the existing pattern before I upload the patch.

Anyway, what I have at the moment is something like this sketch:

    public class NewPhoenetic implements org.apache.commons.codec.Encoder {

        public static String GetMRA(String name) {
            // blah blah - gets the encoding
        }

        public static boolean ComparePhoeneticEncoding(String name1, String name2) {
            // compare 2 encoded names to see if they match
            // etc...
        }

        @Override
        public Object encode(Object arg0) throws EncoderException {
            // return null;
            if (arg0 == null)
                return null;
            return MatchRatingApproach.GetPhoeneticEncoding(arg0.toString());
        }
    }

My questions are:

1. Am I going in the right direction? Any suggestions, improvements, things to watch out for?

2. How do I integrate the unit tests that I have? I see that many are located in lucene/analysis/phonetic/src/test, but I don't understand the pattern.

I would appreciate it if you could write some small code samples/patterns along with any explanations - I'm still pretty new to Java, so what may be simple to you may not be so obvious to me.

Thanks in advance for any help,
Catherine
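
On the first question, here is one possible shape for the encoder sketch above. It is offered as a rough illustration, not as the pattern the project requires. It uses instance methods instead of static ones and implements StringEncoder (which extends Encoder), as existing commons-codec encoders such as Soundex do. The class name SketchPhoneticEncoder and the method isEncodeEqual are hypothetical. The three small types declared at the top are local stand-ins that mirror the commons-codec interfaces so the sketch compiles on its own. The encoding logic is a placeholder, not the real phonetic algorithm.

```java
// Local stand-ins mirroring org.apache.commons.codec.EncoderException,
// Encoder and StringEncoder, declared here only so the sketch compiles
// without commons-codec. A real patch would import the library types.
class EncoderException extends Exception {
    EncoderException(String message) {
        super(message);
    }
}

interface Encoder {
    Object encode(Object source) throws EncoderException;
}

interface StringEncoder extends Encoder {
    String encode(String source) throws EncoderException;
}

// Hypothetical class name, used for illustration only.
final class SketchPhoneticEncoder implements StringEncoder {

    // Generic entry point: accepts null or a String, rejects anything else.
    @Override
    public Object encode(Object source) throws EncoderException {
        if (source == null) {
            return null;
        }
        if (!(source instanceof String)) {
            throw new EncoderException("Only String input is supported");
        }
        return encode((String) source);
    }

    // PLACEHOLDER encoding (assumption): keeps letters and upper-cases them.
    // The real phonetic algorithm would replace the body of this method.
    @Override
    public String encode(String name) {
        if (name == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (Character.isLetter(c)) {
                sb.append(Character.toUpperCase(c));
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: true when both names produce the same encoding.
    public boolean isEncodeEqual(String name1, String name2) {
        String first = encode(name1);
        String second = encode(name2);
        return first != null && first.equals(second);
    }
}
```

Keeping the encoder stateless and instance-based means any code that accepts an Encoder can be handed an instance of it, and the comparison helper can be unit-tested without touching the analysis chain.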