I am trying to create a custom analyzer that checks for page breaks
and line breaks and adds payload data to each term. In the custom
filter I have this code:
    public boolean incrementToken() throws IOException {
        if (input.incrementToken()) {
            if (termAtt.term().equals(pageBreak)) {
                System.out.println("pageBreak");
                pageCount++;
            } else if (termAtt.term().equals(lineBreak)) {
                System.out.println("lineBreak");
                lineCount++;
            } else {
                addPayload(lineCount, pageCount);
            }
            return true;
        } else {
            return false;
        }
    }
where pageBreak and lineBreak are defined as:
    int pageBreakAscii = 12;
    String pageBreak = new Character((char) pageBreakAscii).toString();
    String lineBreak = System.getProperty("line.separator");
I am using the WhitespaceAnalyzer token stream, which drops the
pageBreak and lineBreak characters before they reach my filter. Is
there a way to create an analyzer that ignores the page break and line
break characters during search, but still gives access to them in
incrementToken() in the filter?
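
To make the problem concrete: as far as I can tell, the whitespace
tokenizer treats every character for which Character.isWhitespace()
is true as a delimiter, and that includes both form feed and newline,
so neither ever shows up as a term. One idea I am considering is to do
the counting at the character level, while the text is being split,
rather than in a filter afterwards. Below is a rough sketch of that
idea in plain Java, with no Lucene classes involved. The names
BreakAwareScanner, Positioned and scan are made up for illustration,
and the sketch assumes '\n' marks a line break (so "\r\n" counts once
and a lone '\r' is not counted). As in my filter, the line count is
not reset at a page break.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits on whitespace like a
// whitespace tokenizer would, but counts form feeds and newlines as it
// skips them, so each term knows its line and page.
public class BreakAwareScanner {

    /** A term plus the zero-based line and page it was found on. */
    public static final class Positioned {
        public final String term;
        public final int line;
        public final int page;

        Positioned(String term, int line, int page) {
            this.term = term;
            this.line = line;
            this.page = page;
        }
    }

    public static List<Positioned> scan(String text) {
        List<Positioned> out = new ArrayList<Positioned>();
        StringBuilder current = new StringBuilder();
        int line = 0;
        int page = 0;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!Character.isWhitespace(c)) {
                current.append(c);
                continue;
            }
            // Any whitespace character ends the term collected so far.
            if (current.length() > 0) {
                out.add(new Positioned(current.toString(), line, page));
                current.setLength(0);
            }
            // Update the counters only after the preceding term is emitted.
            if (c == '\f') {
                page++;
            } else if (c == '\n') {
                line++;
            }
        }
        // Emit a trailing term that is not followed by whitespace.
        if (current.length() > 0) {
            out.add(new Positioned(current.toString(), line, page));
        }
        return out;
    }
}
```

In a real analyzer the same counting would presumably have to live in
a custom tokenizer, with the line and page values then stored as the
payload for each term, but I have not worked out that part.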
--
Where there is a will, there is a way !