Shoot, I wish I had noticed this earlier: http://clucene.git.sourceforge.net/git/gitweb.cgi?p=clucene/clucene;a=commit;h=de5695332badddc264c3e187350463d9d6ee4a8a
Looks like someone else had already found and fixed the bug. I wish I had found that sooner; it would have saved me a lot of time. Oh well.

On Dec 4, 2010, at 1:40 PM, Matt Ronge wrote:

> I've been working off the head of CLucene (which has been great!) and I ran
> into a memory smasher.
>
> I had strange issues where after some queries the search results would start
> to get more and more incorrect until the app would crash. After much
> debugging I was able to confirm that this would only occur for queries whose
> lengths were multiples of 8.
>
> After even more debugging I found that the KeywordTokenizer (which I was
> using for my queries) was allocating term buffers whose sizes were multiples
> of 8 (suspicious!). It turns out that if the token length is a multiple of 8
> and the KeywordTokenizer attempts to null terminate the string, it writes off
> the end of the array, causing memory corruption. Normally you don't see this
> because the corruption is silent and it only happens when the token length is
> a multiple of 8.
>
> To fix this I make sure to add room for the null terminator if the buffer is
> already full. Here is my patch:
>
> diff --git a/src/core/CLucene/analysis/Analyzers.cpp b/src/core/CLucene/analysis/Analyzers.cpp
> index 0c34a60..39fec43 100644
> --- a/src/core/CLucene/analysis/Analyzers.cpp
> +++ b/src/core/CLucene/analysis/Analyzers.cpp
> @@ -556,6 +556,9 @@ Token* KeywordTokenizer::next(Token* token){
>      if ( termBuffer == NULL ){
>          termBuffer=token->resizeTermBuffer(token->bufferLength() + 8);
>      }
> +    if (upto == token->bufferLength()) {
> +        termBuffer = token->resizeTermBuffer(token->bufferLength() + 1);
> +    }
>      termBuffer[upto]=0;
>      token->setTermLength(upto);
>      return token;
>
> Let me know if I should open a bug for this.
>
> Thanks again for clucene!
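
For anyone who wants to see the off-by-one outside of CLucene, here is a minimal standalone sketch of the same guard. The TermBuffer struct and the fillAndTerminate function are made up for illustration and are not CLucene's Token API; I'm only assuming the behavior described above (the buffer grows in steps of 8, and the terminator is written at index upto).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a token's term buffer; NOT the CLucene Token class.
struct TermBuffer {
    std::vector<char> data;

    explicit TermBuffer(std::size_t capacity) : data(capacity) {}

    std::size_t bufferLength() const { return data.size(); }

    // Grow to newLength (never shrinks), preserving existing contents.
    char* resize(std::size_t newLength) {
        if (newLength > data.size()) data.resize(newLength);
        return data.data();
    }
};

// Copies text into the buffer, growing it in steps of 8 as needed, then
// null-terminates. Returns the term length, excluding the terminator.
std::size_t fillAndTerminate(TermBuffer& buf, const std::string& text) {
    std::size_t upto = 0;
    for (char c : text) {
        if (upto == buf.bufferLength()) buf.resize(buf.bufferLength() + 8);
        buf.data[upto++] = c;
    }
    assert(upto <= buf.bufferLength());

    // Same idea as the patch: when the text exactly fills the buffer,
    // index upto is one past the end, so make room for the terminator first.
    if (upto == buf.bufferLength()) buf.resize(buf.bufferLength() + 1);

    buf.data[upto] = '\0';
    return upto;
}
```

With an 8-slot buffer, an 8-character term leaves upto equal to bufferLength(), so without the guard the terminator write would land one element past the end. A 3-character term never hits that branch, which fits the bug only showing up for lengths that are multiples of 8.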
> --
> Matt Ronge
> Central Atomics
> Makers of Rocketbox
>
> http://www.getrocketbox.com
> mro...@centralatomics.com

--
Matt Ronge
Central Atomics
Makers of Rocketbox
http://www.getrocketbox.com
mro...@centralatomics.com

_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers