Hello.
I am developing my own analyzer based on StandardAnalyzer.
I noticed that tokenizer.setMaxTokenLength is called many times: once in
createComponents and then again on every setReader call.
    protected TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
        final StandardTokenizer src = new StandardTokenizer(getVersion(), reader);
        src.setMaxTokenLength(maxTokenLength);
        TokenStream tok = new StandardFilter(getVersion(), src);
        tok = new LowerCaseFilter(getVersion(), tok);
        tok = new StopFilter(getVersion(), tok, stopwords);
        return new TokenStreamComponents(src, tok) {
            @Override
            protected void setReader(final Reader reader) throws IOException {
                src.setMaxTokenLength(StandardAnalyzer.this.maxTokenLength);
                super.setReader(reader);
            }
        };
    }
Does this make sense if the length stays the same? I see that it eventually
calls this method (in StandardTokenizerImpl):
    public final void setBufferSize(int numChars) {
        ZZ_BUFFERSIZE = numChars;
        char[] newZzBuffer = new char[ZZ_BUFFERSIZE];
        System.arraycopy(zzBuffer, 0, newZzBuffer, 0, Math.min(zzBuffer.length, ZZ_BUFFERSIZE));
        zzBuffer = newZzBuffer;
    }
So on every call it allocates a new array and copies the old array content
into it, even when the size has not changed.
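To illustrate what I mean, here is a minimal sketch of the kind of guard I would have expected. BufferSketch, its fields and the allocations counter are hypothetical names made up for this example; this is not the actual StandardTokenizerImpl code, only a stand-in with the same copy-on-resize behaviour plus an early return when the size is unchanged.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the scanner buffer; not Lucene's actual class. */
class BufferSketch {
    // Assumed initial size, chosen only for the example.
    private char[] zzBuffer = new char[255];
    // Counts real reallocations so the effect of the guard is visible.
    private int allocations = 0;

    public void setBufferSize(int numChars) {
        if (numChars == zzBuffer.length) {
            return; // same size requested: skip the allocation and the copy
        }
        // Arrays.copyOf truncates or zero-pads, like arraycopy with Math.min.
        zzBuffer = Arrays.copyOf(zzBuffer, numChars);
        allocations++;
    }

    public int allocations() {
        return allocations;
    }

    public int size() {
        return zzBuffer.length;
    }
}
```

With such a guard, calling setBufferSize(255) repeatedly on a fresh BufferSketch would leave allocations() at 0, and only a call with a different size would allocate a new array.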
Regards
Piotr Idzikowski