[ 
https://issues.apache.org/jira/browse/LUCENE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051170#comment-13051170
 ] 

Michael McCandless commented on LUCENE-3206:
--------------------------------------------

bq. Thanks Mike. I agree it'd be nice to have a flexible label type as well, 
but I have no idea how to make it efficient (and code-clean) yet. You could do 
a similar thing as with the outputs (using either a boxed type if you don't 
care about performance that much or a mutable wrapper if you do care about GC), 
but how this would affect the API I have no idea right now. There is also the 
lexicographic order that one would need to consider (a comparator would need to 
be passed as part of the construction process and then for traversals). It'll 
get complicated.

Yeah this was my fear :)

bq. I was also thinking of just dropping support for BYTE1/2 and leaving fixed 
int labels... This would bloat byte-labeled automata a little bit (if they're 
ASCII they'd v-code into a single byte anyway), but would strip down the 
ugliness of BYTE1/2/4... All methods accepting BytesRef and CharSequence would 
still be there, translated on the fly, but the representation of labels would 
always be an int.

Hmm, that makes me nervous -- I think this could be a non-negligible
increase in FST size for the non-ASCII case?
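
To make the size concern concrete, here is a minimal sketch (not code from the patch) that counts the bytes a label would take in a 7-bits-per-byte variable-length int encoding. The class and method names are hypothetical. The comparison in the comments assumes that a fixed-width BYTE1 label costs one byte and a BYTE2 label costs two bytes.

```java
public final class VIntLabelSize {

    /**
     * Number of bytes needed to write a non-negative int with 7 payload
     * bits per byte (the high bit of each byte marks "more bytes follow").
     */
    public static int vIntBytes(int value) {
        int bytes = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            bytes++;
        }
        return bytes;
    }

    public static void main(String[] args) {
        // ASCII labels fit in one byte, same as an assumed fixed BYTE1 label.
        System.out.println("0x61   -> " + vIntBytes(0x61));
        // Byte labels >= 0x80 (e.g. UTF-8 lead/continuation bytes) need two.
        System.out.println("0xE9   -> " + vIntBytes(0xE9));
        // 16-bit labels >= 0x4000 need three, versus an assumed two for BYTE2.
        System.out.println("0x4E2D -> " + vIntBytes(0x4E2D));
    }
}
```

Under those assumptions, every label at or above 0x80 in a byte-labeled automaton would grow from one byte to two, which is where the non-ASCII increase would come from.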

bq. One more question: can you give me traversal use cases you're using FSTs 
for now? I'll try to implement them and see how the new API works out in 
practice. I looked at the FSTEnum and it has next(), seekCeil() and seekFloor().

I think the SimpleText codec is a good example?  Also
VariableGapTermsIndexReader, and MemoryCodec?  Each of these uses the
BytesRefFSTEnum, I believe.
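
For readers unfamiliar with the three traversal operations named in the question, the sketch below mimics their usual sorted-enumeration semantics with a plain java.util.TreeMap. It does not use the FST API; the class name, the sample terms, and the String keys are all assumptions made for illustration (String ordering matches byte ordering only for ASCII input).

```java
import java.util.Map;
import java.util.TreeMap;

public final class SeekSemanticsSketch {

    // Hypothetical sorted term -> output mapping standing in for an FST.
    private static final TreeMap<String, Long> TERMS = new TreeMap<>();
    static {
        TERMS.put("apple", 1L);
        TERMS.put("banana", 2L);
        TERMS.put("cherry", 3L);
    }

    /** Smallest term >= target, or null if there is none. */
    public static String seekCeil(String target) {
        Map.Entry<String, Long> e = TERMS.ceilingEntry(target);
        return e == null ? null : e.getKey();
    }

    /** Largest term <= target, or null if there is none. */
    public static String seekFloor(String target) {
        Map.Entry<String, Long> e = TERMS.floorEntry(target);
        return e == null ? null : e.getKey();
    }

    /** Term strictly after current, or null at the end. */
    public static String next(String current) {
        Map.Entry<String, Long> e = TERMS.higherEntry(current);
        return e == null ? null : e.getKey();
    }

    public static void main(String[] args) {
        System.out.println(seekCeil("b"));
        System.out.println(seekFloor("b"));
        System.out.println(next("banana"));
    }
}
```

Any refactored API would presumably need to keep these three behaviours available to the codecs listed above.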

bq. I'm also a bit terrified by the amount of changes this would introduce if we 
decided to switch the APIs (tests, scattered use cases...). Don't know if I'll 
have the time to update this all.

I think it's still fairly contained at this point?  (I.e., the number of
tests that directly use the FST APIs is still small.)


> FST package API refactoring
> ---------------------------
>
>                 Key: LUCENE-3206
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3206
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 3.2
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 3.3, 4.0
>
>         Attachments: LUCENE-3206.patch
>
>
> The current API is still marked @experimental, so I think there's still time 
> to fiddle with it. I've been using the current API for some time and I do 
> have some ideas for improvement. This is a placeholder for these -- I'll post 
> a patch once I have a working proof of concept.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
