Hello. I am developing my own analyzer based on StandardAnalyzer, and I noticed that tokenizer.setMaxTokenLength is called many times:

    protected TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
      final StandardTokenizer src = new StandardTokenizer(getVersion(), reader);
      src.setMaxTokenLength(maxTokenLength);
      TokenStream tok = new StandardFilter(getVersion(), src);
      tok = new LowerCaseFilter(getVersion(), tok);
      tok = new StopFilter(getVersion(), tok, stopwords);
      return new TokenStreamComponents(src, tok) {
        @Override
        protected void setReader(final Reader reader) throws IOException {
          src.setMaxTokenLength(StandardAnalyzer.this.maxTokenLength);
          super.setReader(reader);
        }
      };
    }

Does this make sense if the length stays the same? I see that it eventually calls this method (in StandardTokenizerImpl):

    public final void setBufferSize(int numChars) {
      ZZ_BUFFERSIZE = numChars;
      char[] newZzBuffer = new char[ZZ_BUFFERSIZE];
      System.arraycopy(zzBuffer, 0, newZzBuffer, 0, Math.min(zzBuffer.length, ZZ_BUFFERSIZE));
      zzBuffer = newZzBuffer;
    }

So every call to setReader allocates a new array and copies the old array's contents into it, even when the size has not changed.

Regards,
Piotr Idzikowski
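
P.S. To make the question concrete, here is a minimal sketch of the kind of guard I would have expected. It is not Lucene code: ResizableBuffer and its allocations counter are hypothetical names used only for illustration, and the sketch assumes the buffer length always equals the last requested size. The copy logic mirrors the setBufferSize method quoted above, with an early return added when the size is unchanged.

```java
// Hypothetical stand-in for the scanner buffer; NOT Lucene's actual class.
final class ResizableBuffer {
    private char[] buffer;
    // Counts how many arrays were allocated (illustration only).
    private int allocations = 0;

    ResizableBuffer(int initialSize) {
        buffer = new char[initialSize];
        allocations++;
    }

    // Same copy logic as the quoted setBufferSize, plus an early return
    // when the requested size equals the current buffer length.
    void setBufferSize(int numChars) {
        if (numChars == buffer.length) {
            return; // size unchanged: nothing to allocate or copy
        }
        char[] newBuffer = new char[numChars];
        System.arraycopy(buffer, 0, newBuffer, 0, Math.min(buffer.length, numChars));
        buffer = newBuffer;
        allocations++;
    }

    int size() {
        return buffer.length;
    }

    int allocations() {
        return allocations;
    }
}
```

With a guard like this, repeated calls with the same length would leave the existing array in place, and a new array would be allocated only when the length actually changes. Whether such a check is safe inside StandardTokenizerImpl itself is exactly what I am unsure about.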