[ https://issues.apache.org/jira/browse/LUCENE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051170#comment-13051170 ]
Michael McCandless commented on LUCENE-3206:
--------------------------------------------

bq. Thanks Mike. I agree it'd be nice to have a flexible label type as well, but I have no idea how to make it efficient (and code-clean) yet. You could do a similar thing as with the outputs (using either a boxed type if you don't care about performance that much or a mutable wrapper if you do care about GC), but how this would affect the API I have no idea right now. There is also the lexicographic order that one would need to consider (a comparator would need to be passed as part of the construction process and then for traversals). It'll get complicated.

Yeah this was my fear :)

bq. I was also thinking of just dropping support for BYTE1/2 and leaving fixed int labels... This would bloat byte-labeled automata a little bit (if they're ASCII they'd v-code into a single byte anyway), but would strip down the ugliness of BYTE1/2/4... All methods accepting BytesRef and CharSequence would still be there, translated on the fly, but the representation of labels would always be an int.

Hmm, that makes me nervous -- this could be a non-negligible increase in FST size for the non-ASCII case I think?

bq. One more question: can you give me traversal use cases you're using FSTs for now? I'll try to implement them and see how the new API works out in practice. I looked at the FSTEnum and it has next(), seekCeil() and seekFloor().

I think SimpleText codec is a good example? Also VariableGapTermsIndexReader, and MemoryCodec? Each of these uses the BytesRefFSTEnum, I believe.

bq. I'm also a bit terrified by the amount of changes this would introduce if we decided to switch the APIs (tests, scattered use cases...). Don't know if I'll have the time to update this all.

I think it's still fairly contained at this point? (I.e., the number of tests that directly use the FST APIs.)
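
To give a rough sense of the size concern above, here is a minimal sketch, assuming labels would be written as variable-length ints with 7 payload bits per byte (the usual v-coding scheme). It only counts label bytes for a single term and ignores everything else an FST stores; `LabelSizeSketch` and its method names are made up for illustration and are not part of the FST package.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene: compares label bytes for one term
// under fixed one-byte labels versus v-coded int labels.
class LabelSizeSketch {

    // Bytes needed to write a non-negative int with 7 payload bits per byte.
    static int vIntLength(int value) {
        int length = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            length++;
        }
        return length;
    }

    // Label bytes if every UTF-8 byte of the term is stored in exactly one byte.
    static int fixedByteLabelBytes(String term) {
        return term.getBytes(StandardCharsets.UTF_8).length;
    }

    // Label bytes if every UTF-8 byte of the term is stored as a v-coded int
    // (assumption: one label per UTF-8 byte, value in 0..255).
    static int vCodedLabelBytes(String term) {
        int total = 0;
        for (byte b : term.getBytes(StandardCharsets.UTF_8)) {
            total += vIntLength(b & 0xFF);
        }
        return total;
    }

    public static void main(String[] args) {
        // "lucene" is pure ASCII; the second term contains three non-ASCII letters.
        String[] terms = {"lucene", "\u017c\u00f3\u0142w"};
        for (String term : terms) {
            System.out.println(term
                + ": byte labels=" + fixedByteLabelBytes(term)
                + ", v-coded int labels=" + vCodedLabelBytes(term));
        }
    }
}
```

Under these assumptions an ASCII term costs the same either way, while every UTF-8 byte at or above 0x80 needs two bytes once v-coded, so the label bytes of a mostly non-ASCII term nearly double.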

> FST package API refactoring
> ---------------------------
>
>                 Key: LUCENE-3206
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3206
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 3.2
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 3.3, 4.0
>
>         Attachments: LUCENE-3206.patch
>
>
> The current API is still marked @experimental, so I think there's still time
> to fiddle with it. I've been using the current API for some time and I do
> have some ideas for improvement. This is a placeholder for these -- I'll post
> a patch once I have a working proof of concept.