DIGY
-----Original Message-----
From: Robert Stewart [mailto:[email protected]]
Sent: Tuesday, September 13, 2011 5:55 PM
To:<[email protected]>
Subject: Re: [Lucene.Net] Test case for: possible infinite loop bug in
portuguese snowball stemmer?
Here is a test case:
string text = @"Califórnia";
Lucene.Net.Analysis.KeywordTokenizer tokenizer =
    new Lucene.Net.Analysis.KeywordTokenizer(new System.IO.StringReader(text));
Lucene.Net.Analysis.Snowball.SnowballFilter stemmer =
    new Lucene.Net.Analysis.Snowball.SnowballFilter(tokenizer, "Portuguese");
Lucene.Net.Analysis.Token token;
// Next() should return null once the single keyword token has been consumed.
while ((token = stemmer.Next()) != null)
{
    System.Console.WriteLine(token.TermText());
}
This seems to go into an infinite loop: the call to stemmer.Next() never
returns. I am not sure whether this is the only stemmer I am having trouble
with, but it does happen to us on a near-daily basis.
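To check other inputs or languages without hanging the whole process, the
same call can be run on a background thread with a timeout. This is only a
rough sketch: the class name StemmerHangCheck, the helper StemsWithin, and
the 5-second limit are made up for illustration, and it assumes the same
Lucene.Net 2.9-style API (TokenStream.Next()) used in the test case above.

```csharp
using System;
using System.IO;
using System.Threading;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Snowball;

static class StemmerHangCheck
{
    // Hypothetical helper: returns true if the stemmer consumed the whole
    // input within the timeout, false if the worker thread is still running.
    static bool StemsWithin(string text, string language, TimeSpan timeout)
    {
        var worker = new Thread(() =>
        {
            TokenStream stream = new SnowballFilter(
                new KeywordTokenizer(new StringReader(text)), language);
            // Drain the stream; this is the call that appears to never return.
            while (stream.Next() != null) { }
        });
        // Background thread, so a stuck stemmer does not keep the process alive.
        worker.IsBackground = true;
        worker.Start();
        // Join returns false if the thread has not finished when the timeout expires.
        return worker.Join(timeout);
    }

    static void Main()
    {
        bool finished = StemsWithin("Califórnia", "Portuguese", TimeSpan.FromSeconds(5));
        Console.WriteLine(finished
            ? "stemmer returned"
            : "stemmer did not return within 5 seconds");
    }
}
```

This only detects the hang; it does not fix it, and the stuck thread keeps
spinning until the process exits.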
Thanks,
Bob
On Sep 13, 2011, at 9:37 AM, Robert Stewart wrote:
Are there any known issues with the Snowball stemmers (Portuguese in
particular) going into an infinite loop? I have a recurring problem where
IndexWriter locks up in AddDocument and never returns (it has taken up to
3 days before we noticed), and the process then has to be killed manually.
So far it seems to happen only on Portuguese documents, and the stack trace
when the thread is aborted is always as follows:
System.Threading.ThreadAbortException: Thread was being aborted.
   at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
   at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at Lucene.Net.Analysis.Snowball.SnowballFilter.Next()

System.SystemException: System.Threading.ThreadAbortException: Thread was being aborted.
   at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
   at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at Lucene.Net.Analysis.Snowball.SnowballFilter.Next()
   at Lucene.Net.Analysis.Snowball.SnowballFilter.Next()
   at Lucene.Net.Analysis.TokenStream.IncrementToken()
   at Lucene.Net.Index.DocInverterPerField.ProcessFields(Fieldable[] fields, Int32 count)
   at Lucene.Net.Index.DocFieldProcessorPerThread.ProcessDocument()
   at Lucene.Net.Index.DocumentsWriter.UpdateDocument(Document doc, Analyzer analyzer, Term delTerm)
   at Lucene.Net.Index.IndexWriter.AddDocument(Document doc, Analyzer analyzer)
Is there another list of contrib/snowball issues? I have not yet been able
to reduce this to a small test case, however. Have there been any such
issues with stemmers in the past?
Thanks,
Bob