On 11 August 2011 20:56, Gary Gregory <garydgreg...@gmail.com> wrote:
> Hello All!
>
> Topic 1: Housekeeping: package name and POM.
>
> The next codec release out of trunk will be major release labeled 2.0,
> the current release is 1.5.
>
> In trunk, I've removed deprecated methods and the project now requires
> Java 5. This means 2.0 will not be a drop-in binary compatible release
> for 1.5.
>
> I'd like to confirm or deny that this means the package name will
> change to o.a.c.codec2 and that the POM groupId will have to change
> from commons-codec to org.apache.commons. 2.0 and 1.5 would be able to
> live side by side.

Yes, the name changes are necessary to avoid problems with incompatible jars.

> I'd like to get this out of the way first hence topic 1.
>
>
> Topic 2: Beider-Morse (BM) Encoder API
> https://issues.apache.org/jira/browse/CODEC-125
>
> BM is a new codec for 2.0.
>
> The encode API returns a set of encodings.
>
> In trunk, this is currently a String in the format "s1|s2|s3".
>
> I think this is not the best design, a set should be a Set, in this
> case, an ordered set. Or, a List. Generally, it should be a Collection
> of Strings.
>
> There was concern with call sites that generically use a [codec]
> Encoder with the signature "Object encode(Object)" and call
> toString() on the result.
>
> If we set the API to "Set<CharSequence> encode(CharSequence)" or
> "Set<String> encode(String)", doing a toString() on a HashSet will
> yield a usable String similar to what trunk does now. For example,
> for a HashSet of Strings "a", "b" and "c", HashSet.toString() returns
> "[a, b, c]" which is no worse than "a|b|c" IMO. At least it is a
> documented and stable format.

+1

> Topic 3: Generics
>
> This will be in a separate thread but I'd like to get this in 2.0
> because this will likely break the API and I only want to break things
> once and not have to do a codec3 for generics.

+1.

> Thank you all,
> Gary
>
> On Thu, Aug 11, 2011 at 2:38 PM, Matthew Pocock
> <turingatemyhams...@gmail.com> wrote:
>> Hi,
>>
>> As those of you who've been following the CODEC-125 ticket will know, with
>> Greg's help I've got a port of the Beider-Morse phonetic
>> matching (bmpm) algorithm in as a string encoder. As far as I can tell, it's
>> ready for people to use and abuse. It ideally needs more test-case words,
>> but to the best of my knowledge it doesn't have any horrendous bugs or
>> performance issues.
>>
>> The discussion on the ticket started to stray off bmpm and on to policy for
>> releases and changing APIs, and Sebb said we should discuss it on the list.
>> So, here we are.
>>
>> Ideally, I'd like there to be a release of commons-codec some time soon so
>> that users can start to try out bmpm right away, and so that we can start
>> the process of adding it to the list of supported indexing methods in solr.
>> What do people think?
>>
>> Matthew
>>
>> --
>> Dr Matthew Pocock
>> Visitor, School of Computing Science, Newcastle University
>> mailto: turingatemyhams...@gmail.com
>> gchat: turingatemyhams...@gmail.com
>> msn: matthew_poc...@yahoo.co.uk
>> irc.freenode.net: drdozer
>> tel: (0191) 2566550
>> mob: +447535664143
>>
>
> --
> Thank you,
> Gary
>
> http://garygregory.wordpress.com/
> http://garygregory.com/
> http://people.apache.org/~ggregory/
> http://twitter.com/GaryGregory

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
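
For anyone following the Topic 2 point about calling toString() on a Set result, here is a rough sketch of the two string forms side by side. It is only an illustration: the class EncodingSetSketch and its methods sampleEncodings() and pipeJoined() are made-up names, not part of [codec], and it assumes a LinkedHashSet so that the iteration order is the insertion order (a plain HashSet does not promise any order). It sticks to Java 5 language features.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class EncodingSetSketch {

    // Hypothetical stand-in for an encoder result; NOT the [codec] API.
    // LinkedHashSet keeps insertion order, so the output is predictable.
    static Set<String> sampleEncodings() {
        Set<String> encodings = new LinkedHashSet<String>();
        encodings.add("a");
        encodings.add("b");
        encodings.add("c");
        return encodings;
    }

    // Rebuilds the "s1|s2|s3" form that the thread says trunk returns today.
    static String pipeJoined(Set<String> encodings) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String encoding : encodings) {
            if (!first) {
                sb.append('|');
            }
            sb.append(encoding);
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Set<String> encodings = sampleEncodings();
        // AbstractCollection.toString() gives the bracketed, comma-separated form.
        System.out.println(encodings);             // [a, b, c]
        System.out.println(pipeJoined(encodings)); // a|b|c
    }
}
```

A caller that only ever does toString() on the result would see "[a, b, c]" instead of "a|b|c", while callers that want the individual encodings no longer have to split a String.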