The problem is that your transformation method needs Strings, but your incrementToken method also has a serious bug: it does not respect the length of the buffer, so it may pick up additional garbage left over from earlier, longer tokens!
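A minimal standalone sketch of the bug described above (not Lucene-specific; the buffer size and contents are made up for illustration). A term attribute's buffer() is usually larger than the current token, so new String(buffer) drags in stale characters:

```java
// Demonstrates the length bug: the backing char[] is over-allocated,
// so converting the whole array to a String includes garbage.
public class BufferBugDemo {
    public static void main(String[] args) {
        char[] buffer = new char[16];        // over-allocated, like a term buffer
        "abc".getChars(0, 3, buffer, 0);     // current token: 3 chars
        buffer[3] = 'X';                     // leftovers from an earlier, longer token
        buffer[4] = 'Y';

        String wrong = new String(buffer);       // drags in "XY" and NUL padding
        String right = new String(buffer, 0, 3); // respects the token length

        System.out.println(right);               // prints: abc
        System.out.println(wrong.length());      // 16, not 3
    }
}
```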
The easiest way to do this, with much less code and without those bugs:

    public boolean incrementToken() throws IOException {
      if (!input.incrementToken()) {
        return false;
      }
      final String normalizedLCcallnum = getLCShelfkey(charTermAttr.toString());
      charTermAttr.setEmpty().append(normalizedLCcallnum);
      return true;
    }

This also fixes part of your performance problem: it no longer converts the result of your transformation twice between char arrays and Strings. To improve speed further, make the method getLCShelfkey operate directly on char[] and length.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Osullivan L. [mailto:l.osulli...@swansea.ac.uk]
> Sent: Friday, September 14, 2012 11:58 AM
> To: general@lucene.apache.org
> Subject: Custom Filter Indexing Slow
>
> Hi Folks,
>
> I have a custom filter which does everything I need it to, but it has
> reduced my indexing speed to a crawl. Are there any methods I need to
> call to clear / clean things up once my script (details below) has done
> its work?
>
> Thanks,
>
> Luke
>
> public LCCNormalizeFilter(TokenStream input)
> {
>     super(input);
>     this.charTermAttr = addAttribute(CharTermAttribute.class);
> }
>
> public boolean incrementToken() throws IOException {
>
>     if (!input.incrementToken()) {
>         return false;
>     }
>
>     char[] buffer = charTermAttr.buffer();
>     String rawLCcallnum = new String(buffer);
>     String normalizedLCcallnum = getLCShelfkey(rawLCcallnum);
>     char[] newBuffer = normalizedLCcallnum.toCharArray();
>     charTermAttr.setEmpty();
>     charTermAttr.copyBuffer(newBuffer, 0, newBuffer.length);
>     return true;
> }
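Uwe's last suggestion, operating directly on char[] and length, can be sketched like this. The normalization body here is a hypothetical placeholder (simple upper-casing); the real getLCShelfkey logic would replace it:

```java
// Hypothetical sketch: normalize a token directly in its char[] buffer,
// avoiding per-token String allocation. Only the first `length` chars
// belong to the token; anything past that is stale buffer content.
public class InPlaceNormalizer {
    // Returns the new token length. It is unchanged here, but a real
    // shelf-key routine might rewrite the buffer and return a new length.
    public static int normalizeInPlace(char[] buffer, int length) {
        for (int i = 0; i < length; i++) {
            buffer[i] = Character.toUpperCase(buffer[i]);
        }
        return length;
    }

    public static void main(String[] args) {
        char[] buf = "qa76.73 leftover".toCharArray();
        int len = 7; // only the first 7 chars are the token, like charTermAttr.length()
        int newLen = normalizeInPlace(buf, len);
        System.out.println(new String(buf, 0, newLen)); // prints: QA76.73
    }
}
```

Inside incrementToken this would be used roughly as: call normalizeInPlace(charTermAttr.buffer(), charTermAttr.length()) and then charTermAttr.setLength(newLen), so no Strings are created at all.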