Gary, I'm expanding coverage right now.   Testing Metaphone is neither easy
nor fun.  We'll also need to increase coverage in DoubleMetaphone.



I uncovered a potential bug in Metaphone.  The code in question deals with
the encoding of 'B':

// START CODE from Metaphone

case 'B' :
    if ((n > 0) && !(n + 1 == wdsz) && 
        (local.charAt(n - 1) == 'M')) { // not MB at end of word 
        code.append(symb);
    } else {
        code.append(symb);
    }
    mtsz++;
    break;

// END CODE

My understanding is that we should not encode a 'B' if a word ends in "MB"
(following http://aspell.sourceforge.net/metaphone/metaphone-kuhn.txt). So
the Metaphone of "COMB" is "KM", not "KMB", and the Metaphone of "TOMB" is
"TM", not "TMB".  As written, both branches of the if append the 'B', so it
is never dropped.  I "refactored" this code a bit and came up with the
following:

case 'B' :
    if ( isPreviousChar(local, n, 'M') && 
         isLastChar(wdsz, n) ) { 
        // B is silent if word ends in MB
        break;
    } else {
        code.append(symb);
    }
    break;
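
For anyone following along, here is a minimal, standalone sketch of the rule. The bodies of isPreviousChar and isLastChar are not shown in this message, so the implementations below are assumptions, and the class SilentBSketch and the method dropSilentFinalB are hypothetical names used only for illustration. It applies just the silent-B rule and copies every other character unchanged, so it is not the Metaphone encoder itself.

```java
import java.util.Locale;

public class SilentBSketch {

    // Assumed implementation: true when the character just before index n is c.
    static boolean isPreviousChar(StringBuffer string, int n, char c) {
        return n > 0 && n < string.length() && string.charAt(n - 1) == c;
    }

    // Assumed implementation: true when n is the last index of a word of length wdsz.
    static boolean isLastChar(int wdsz, int n) {
        return n + 1 == wdsz;
    }

    // Hypothetical driver: applies only the "silent B after M at end of word" rule.
    static String dropSilentFinalB(String word) {
        StringBuffer local = new StringBuffer(word.toUpperCase(Locale.ENGLISH));
        int wdsz = local.length();
        StringBuffer code = new StringBuffer();
        for (int n = 0; n < wdsz; n++) {
            char symb = local.charAt(n);
            if (symb == 'B' && isPreviousChar(local, n, 'M') && isLastChar(wdsz, n)) {
                continue; // B is silent if word ends in MB
            }
            code.append(symb);
        }
        return code.toString();
    }

    public static void main(String[] args) {
        System.out.println(dropSilentFinalB("COMB"));   // COM
        System.out.println(dropSilentFinalB("TOMB"));   // TOM
        System.out.println(dropSilentFinalB("NUMBER")); // NUMBER (MB is not at the end)
    }
}
```

The "NUMBER" case is the one worth keeping in a test: it checks that a 'B' after 'M' in the middle of a word is still encoded.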

Also, this code was copied outright from a C++ program; there was no need
to keep track of the length of our StringBuffer in a variable named "mtsz".
That's gone, and removing it safely was only possible because of great code
coverage.

Tim

> -----Original Message-----
> From: Gary Gregory [mailto:[EMAIL PROTECTED]
> Sent: Sunday, April 18, 2004 1:24 PM
> To: Jakarta Commons Developers List
> Subject: RE: [codec] 1.3 Release Candidate Status
> 
> Ah, yes, thanks for catching it. I've fixed this in CVS HEAD. See
> Bugzilla Bug 28455: Hex converts illegal characters to 255.
> 
> Thank you,
> Gary
> 
> > -----Original Message-----
> > From: Oleg Kalnichevski [mailto:[EMAIL PROTECTED]
> > Sent: Sunday, April 18, 2004 10:40
> > To: Jakarta Commons Developers List
> > Subject: Re: [codec] 1.3 Release Candidate Status
> >
> > Gary
> > Has the problem reported by Tom van den Berge been looked into?
> >
> >
> > http://marc.theaimsgroup.com/?l=jakarta-commons-dev&m=108201900324974&w=2
> >
> > If not, I believe it should be
> >
> > Oleg
> >
> >
> >
> > On Sun, 2004-04-18 at 19:33, Gary Gregory wrote:
> > > Hello all,
> > >
> > > It seems to me that we should start the release process for Codec 1.3.
> > >
> > > Does the following need polish?
> > >
> > > - Better unit test code coverage (Clover). Some classes are 100%, others
> > > not (volunteers?). New classes are 100% I believe.
> > > - Check that the release notes file is up to date WRT fixes and new
> > > features.
> > >
> > > Depending on feedback from the list, we could build an RC1.
> > >
> > > Thank you,
> > > Gary
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > >
