Hi,

I've written a simple reproduction of LUCENENET-54 (ArgumentOutOfRangeException in SnowballProgram). I'm not sure about the correct workflow for reopening this issue (it was closed as invalid in 2007 due to missing information), so I'm posting what I have to the developer mailing list in the hope that someone else knows the correct approach. The problem originates in SnowballProgram.slice_to: in the call to StringBuilder.ToString(start, length), the second parameter is passed an index instead of a length.
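
To illustrate the index-versus-length mix-up in isolation, here is a minimal sketch. It is not the actual slice_to code; the variable names (current, bra, ket) and their values are assumptions chosen only to show how StringBuilder.ToString(startIndex, length) behaves when it is handed an end index.

```csharp
using System;
using System.Text;

class SliceSketch
{
    static void Main()
    {
        // Hypothetical values, not taken from the stemmer itself.
        var current = new StringBuilder("terve");
        int bra = 3; // start index of the slice
        int ket = 5; // end index of the slice (exclusive)

        // Passing a length (ket - bra) stays within the buffer.
        Console.WriteLine(current.ToString(bra, ket - bra)); // prints "ve"

        try
        {
            // Passing the end index as the length asks for 5 characters
            // starting at index 3, which runs past the end of "terve".
            current.ToString(bra, ket);
        }
        catch (ArgumentOutOfRangeException e)
        {
            Console.WriteLine(e.GetType().Name); // prints "ArgumentOutOfRangeException"
        }
    }
}
```

If slice_to does something along these lines, passing the difference of the two indices as the length should avoid the exception, but I have not verified that against the rest of the stemmer code.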

Reproduction:
using System.IO;
using Lucene.Net.Analysis.Snowball;
using Lucene.Net.Analysis.Tokenattributes;
using NUnit.Framework;

namespace ConsoleApplication {
    [TestFixture]
    public class LuceneRepo {
        [Test(Description = "LUCENENET-54")]
        public void Repro() {
            var analyzer = new SnowballAnalyzer("Finnish");

            var input = new StringReader("terve");
            var tokenStream = analyzer.TokenStream("fieldName", input);
            var termAttr = (TermAttribute)tokenStream.AddAttribute(typeof (TermAttribute));

            Assert.That(tokenStream.IncrementToken(), Is.True);
            Assert.That(termAttr.Term(), Is.EqualTo("terv"));
        }
    }
}

Unexpected exception:
System.ArgumentOutOfRangeException: Index and length must refer to a location within the string.
Parameter name: length
   at System.Text.StringBuilder.ToString(Int32 startIndex, Int32 length)
   at SF.Snowball.SnowballProgram.slice_to(StringBuilder s) in C:\Dev\Third Party\Lucene.NET\src\contrib\Snowball\SF\Snowball\SnowballProgram.cs:line 467
   at SF.Snowball.Ext.FinnishStemmer.r_tidy() in C:\Dev\Third Party\Lucene.NET\src\contrib\Snowball\SF\Snowball\Ext\FinnishStemmer.cs:line 974
   at SF.Snowball.Ext.FinnishStemmer.Stem() in C:\Dev\Third Party\Lucene.NET\src\contrib\Snowball\SF\Snowball\Ext\FinnishStemmer.cs:line 1137

Expected result:
The unit test should pass.

// Simon
