Hi,
It's been a while since I've written a custom TokenFilter, and I'm not having
much luck getting tokens out of the TokenStream with 2.3-dev.
I keep running into the default term buffer of size 10 with the following:
public final Token next(Token result) throws IOException {
  result = input.next(result);
  if (result != null) {
    // gives me the actual term length, not the buffer length of 10
    final int len = result.termLength();
    result.setTermLength(len);
    System.out.println("LEN 1: " + len); // prints the actual length
    // this still gives me the buffer of length 10
    final char[] buffer = result.termBuffer();
    System.out.println("LEN 2: " + buffer.length); // and this prints 10
  }
  return result;
}
Is the idea to:
1) get the char[] buffer from Token
2) get its real length via termLength()
3) manually copy the first termLength() chars into a new char[], dropping the
unused padding at the end of the buffer?
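
To make those three steps concrete, here is a minimal plain-Java sketch with no
dependency on the 2.3-dev classes. The class name TermBufferDemo and the helper
trimToLength are made up for illustration; the 10-char array only stands in for
what termBuffer() hands back, and the int stands in for termLength().

```java
class TermBufferDemo {
    // Hypothetical helper: copy only the first len chars of an oversized buffer.
    static char[] trimToLength(char[] buffer, int len) {
        char[] term = new char[len];
        System.arraycopy(buffer, 0, term, 0, len);
        return term;
    }

    public static void main(String[] args) {
        char[] buffer = new char[10];        // stand-in for the array from termBuffer()
        "lucene".getChars(0, 6, buffer, 0);  // only the first 6 slots hold the term
        int len = 6;                         // stand-in for termLength()

        char[] term = trimToLength(buffer, len);
        System.out.println(buffer.length);                // 10
        System.out.println(term.length);                  // 6
        System.out.println(new String(buffer, 0, len));   // lucene
    }
}
```

If all that is needed is the term as a String, new String(buffer, 0, len) reads
just the valid prefix and skips the intermediate char[].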
I've looked through Token for a way to get the *actual* term but don't see
one, so it looks like a filter writer has to do something like this for every
token:
public final Token next(Token result) throws IOException {
  result = input.next(result);
  if (result != null) {
    final int len = result.termLength();
    final char[] buffer = result.termBuffer();
    // copy just the valid part of the buffer into a right-sized array
    final char[] token = new char[len];
    System.arraycopy(buffer, 0, token, 0, len);
  }
  return result;
}
Am I missing a Token method I could use instead, or is this the intended way
to do it now?
Thanks,
Otis