Keeping this thread alive. I would appreciate a response from the community about this issue.
Thanks in advance,
Adriano Crestani

On Tue, Jul 13, 2010 at 3:59 AM, Adriano Crestani wrote:
> Hi,
>
> Why does the TermAttributeImpl.clone() method use buff.clone() instead of
> System.arraycopy to clone its internal buffer? Performance reasons?
>
> I have the following scenario:
>
> ...
> public boolean incrementToken() {
>   ...
>   String twoHundredKCharsString = "abc....";
>   String smallString = "test";
>
>   termAttribute.setTermBuffer(twoHundredKCharsString);
>   State largeStringState = captureState();
>
>   termAttribute.setTermBuffer(smallString);
>   State smallStringState = captureState();
>
>   ...
> }
> ...
>
> And guess what?! smallStringState has a TermAttribute object that
> holds an internal buffer of 200k chars in size!!!
>
> I was googling and found out that cloning and arraycopy have the
> same performance for small arrays, and cloning just performs better
> for large arrays.
>
> So, if large string inputs are not a real scenario, why not use
> arraycopy instead of clone? But in case it's a real scenario, Lucene
> should definitely not be copying the entire buffer for small strings.
>
> Maybe the TermAttribute interface could expose a method like
> shrinkBuffer(), so the user could invoke it when needed.
>
> Thoughts?
>
> Best Regards,
> Adriano Crestani
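
To make the scenario easier to reproduce outside Lucene, here is a minimal, self-contained sketch. TermBufferSketch and all of its methods are hypothetical stand-ins, not Lucene's TermAttributeImpl; it only assumes a buffer that grows on demand and never shrinks, which is what the behaviour described above suggests. It contrasts cloning the whole backing array with copying only the chars in use via System.arraycopy.

```java
// Hypothetical stand-in for a term attribute. This is NOT Lucene code;
// it only assumes a buffer that grows on demand and never shrinks.
final class TermBufferSketch {
    private char[] buffer = new char[16];
    private int length = 0;

    // Copies the term into the buffer, growing it when needed.
    void setTermBuffer(String term) {
        if (term.length() > buffer.length) {
            buffer = new char[term.length()];
        }
        term.getChars(0, term.length(), buffer, 0);
        length = term.length();
    }

    // Clones the whole backing array, including unused capacity.
    TermBufferSketch cloneWholeBuffer() {
        TermBufferSketch copy = new TermBufferSketch();
        copy.buffer = buffer.clone();
        copy.length = length;
        return copy;
    }

    // Copies only the chars currently in use.
    TermBufferSketch cloneUsedPart() {
        TermBufferSketch copy = new TermBufferSketch();
        copy.buffer = new char[length];
        System.arraycopy(buffer, 0, copy.buffer, 0, length);
        copy.length = length;
        return copy;
    }

    int capacity() {
        return buffer.length;
    }

    String term() {
        return new String(buffer, 0, length);
    }

    public static void main(String[] args) {
        TermBufferSketch attr = new TermBufferSketch();
        attr.setTermBuffer(new String(new char[200_000])); // large term first
        attr.setTermBuffer("test");                        // then a small one

        System.out.println(attr.cloneWholeBuffer().capacity()); // 200000
        System.out.println(attr.cloneUsedPart().capacity());    // 4
    }
}
```

Both copies hold the term "test"; only the capacity differs. Whether the extra allocation matters in practice depends on how often large terms occur and how long captured states are retained.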
