On Thu, Apr 9, 2009 at 02:01, Uwe Schindler <u...@thetaphi.de> wrote:
>> >> Also, on the other topic - how hard is it to boost
>> >> TermEnum.skipTo(term) speed to IndexReader.terms(term) level? Would be
>> >> nice for TrieRangeFilter and probably some other filters.
>> > I think all that's needed is to implement SegmentTermEnum.skipTo,
>> > calling something like tis.terms(Term) but instead of returning a
>> > cloned SegmentTermEnum, overwrite the one passed in?
>> I bet at least MultiSegmentReader.MultiTermEnum should be affected
>> too? (I'm looking at 2.3.2 sources)
>>
>> > Does TrieRangeFilter use TermEnum.skipTo? If so, we should certainly
>> > fix this.
>> It doesn't, but only because skipTo is so obviously slow + I have
>> another filter in my project that could use skipTo.
>>
>> Refer to:
>> https://issues.apache.org/jira/browse/LUCENE-1470?focusedCommentId=12651318&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12651318
>> Uwe> I am fine with calling IndexReader.terms(Term) to use the cache
>> and faster seeking. The cost of creating new instances of TermEnums is
>> less than doing disk reads.
>
> I am fascinated; you remember my question... :-)

I don't; I dropped out of that issue's comments earlier :) But today I was
borrowing parts of your code for my version of the range filter (which we
discussed at the very beginning) and stumbled upon an obviously missed
skipTo opportunity. Then I checked the mailing list and found your
supporting voice there.

> Yes, if seekTo would work more performant, I could easily use it in
> TrieRange and would be happy as noted before. Currently, a new TermEnum is
> created on each sub-range. When TrieRange was committed and therefore
> updated, for me it was (and still is) not clear, why skipTo may not be as
> fast as a new TermEnum.

Check Michael's link below: this method (and its ugly implementation) is
the accidental offspring of some ancient bugfix. Nobody loved it, and it
grew up in neglect.

>> But other people (like me) might use mmapped indexes, so cost(new
>> TermEnum)/cost(index read) relation looks different for us.
>>
>> > See also this, for historical context:
>> >
>> > http://markmail.org/message/2e7kpvyi3bqtgjwt#query:lucene%20termenum%20skipto+page:1+mid:lb46mbbgpgbnnuxk+state:results
>> Darn! And api-wise it looks like a legitimate method :)
>
> Uwe
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>
>

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785