[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals
[ https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471514#comment-13471514 ]

Dawid Weiss commented on LUCENE-4285:

For the record, I don't have anything against null returned for an empty automaton, I just mentioned how this representation is accomplished elsewhere (for instance here http://www.eti.pg.gda.pl/katedry/kiw/pracownicy/Jan.Daciuk/personal/fsa.html).

Improve FST API usability for mere mortals
Key: LUCENE-4285
URL: https://issues.apache.org/jira/browse/LUCENE-4285
Project: Lucene - Core
Issue Type: Improvement
Components: core/FSTs
Reporter: David Smiley
Attachments: LUCENE-4285.patch

FST technology is something that has brought amazing advances to Lucene, yet the API is hard to use for the vast majority of users like me. I know that performance of FSTs is really important, but surely a lot can be done without sacrificing that. (comments will hold specific ideas and problems)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals
[ https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471081#comment-13471081 ]

David Smiley commented on LUCENE-4285:

I admit checked exceptions would have alerted me to my bug, but that doesn't make the API any nicer -- I would still need null checks littered through my FST user code. I don't know the FST internals, but I'd be surprised to hear that adding support for an empty FST adds appreciable overhead. If the overhead we're discussing is a simple conditional check, then it is net-zero: as it is, I need the same check on my end of the API because my FST is potentially null.
[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals
[ https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471083#comment-13471083 ]

Dawid Weiss commented on LUCENE-4285:

Things are more difficult than they seem on the surface. An elegant solution would encode an empty automaton without any extra flags or checks. In an arc-based representation there is simply no notion of an empty set of arcs, though -- there needs to be at least one, and if it's present on the root state then, well, it's no longer an empty automaton. Like I said -- this can be modeled with an initial state transition (the symbol doesn't matter); if this transition is final then the automaton is empty (there is no actual root state). But this also changes how traversals are implemented and would affect all of the existing code.
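The initial-transition encoding described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchArc` and `InitialArcAutomaton` are hypothetical names, the structure is a plain trie rather than Lucene's packed FST, and the sketch deliberately does not store the empty string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical arc type for illustration; this is not Lucene's FST.Arc.
class SketchArc {
    final int label;                                // ignored on the initial arc
    boolean isFinal;                                // an accepted input may end on this arc
    final List<SketchArc> next = new ArrayList<>(); // arcs leaving this arc's target state

    SketchArc(int label, boolean isFinal) {
        this.label = label;
        this.isFinal = isFinal;
    }

    // Linear scan for the outgoing arc with the given label, or null if absent.
    SketchArc follow(int wanted) {
        for (SketchArc arc : next) {
            if (arc.label == wanted) {
                return arc;
            }
        }
        return null;
    }
}

class InitialArcAutomaton {
    // Every automaton is entered through this arc; its label does not matter.
    final SketchArc initial;

    InitialArcAutomaton(String... words) {
        // No words: the initial arc is final and leads nowhere, which encodes "empty".
        initial = new SketchArc(0, words.length == 0);
        for (String word : words) {
            if (word.isEmpty()) {
                throw new IllegalArgumentException("this sketch does not store the empty string");
            }
            SketchArc current = initial;
            for (int i = 0; i < word.length(); i++) {
                SketchArc child = current.follow(word.charAt(i));
                if (child == null) {
                    child = new SketchArc(word.charAt(i), false);
                    current.next.add(child);
                }
                current = child;
            }
            current.isFinal = true;
        }
    }

    boolean isEmpty() {
        return initial.isFinal;
    }

    boolean accepts(String word) {
        // The extra step every traversal pays: inspect the initial arc first.
        if (initial.isFinal || word.isEmpty()) {
            return false;
        }
        SketchArc current = initial;
        for (int i = 0; i < word.length() && current != null; i++) {
            current = current.follow(word.charAt(i));
        }
        return current != null && current.isFinal;
    }

    public static void main(String[] args) {
        System.out.println(new InitialArcAutomaton().isEmpty());
        System.out.println(new InitialArcAutomaton("cat", "car").accepts("car"));
    }
}
```

The check at the top of `accepts` is the kind of change the comment warns about: every traversal has to step over the initial arc before it reaches real input.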
[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals
[ https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470555#comment-13470555 ]

David Smiley commented on LUCENE-4285:

I just found out that an FST builder.finish() returns null if there's no input, basically. That is bad API design; it should return an FST with nothing in it. For now I have to litter my code with null checks.
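To make the complaint concrete, here is a minimal sketch of the two behaviours. Everything in it is hypothetical: `SketchBuilder` is not Lucene's builder, and a `SortedMap` merely stands in for the finished FST. It only shows how a null result pushes a guard into caller code, while an empty result does not.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for an FST builder; it is NOT Lucene's builder class.
class SketchBuilder {
    private final TreeMap<String, Long> entries = new TreeMap<>();

    void add(String input, long output) {
        entries.put(input, output);
    }

    // Mirrors the behaviour described in the comment: no input means null.
    SortedMap<String, Long> finishOrNull() {
        if (entries.isEmpty()) {
            return null;
        }
        return Collections.unmodifiableSortedMap(new TreeMap<>(entries));
    }

    // The alternative asked for: an empty but usable result instead of null.
    SortedMap<String, Long> finishNeverNull() {
        return Collections.unmodifiableSortedMap(new TreeMap<>(entries));
    }
}

class NullReturnDemo {
    // Caller code written against the null-returning variant needs a guard.
    static Long lookupGuarded(SortedMap<String, Long> fstOrNull, String key) {
        if (fstOrNull == null) {
            return null; // nothing was ever added
        }
        return fstOrNull.get(key);
    }

    public static void main(String[] args) {
        SketchBuilder empty = new SketchBuilder();
        System.out.println(lookupGuarded(empty.finishOrNull(), "cat"));
        // No guard needed here: the empty result answers lookups itself.
        System.out.println(empty.finishNeverNull().get("cat"));
    }
}
```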
[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals
[ https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470560#comment-13470560 ]

Robert Muir commented on LUCENE-4285:

I rather like this: I don't think it's common to have an empty FST; it's usually indicative of a bug or misconfiguration. There is also some code in Lucene that uses this return value, e.g. SynonymFilterFactory: if you give it a file with no actual synonym entries, it just returns the underlying stream rather than decorating it with a useless synonyms filter.
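The pattern described here (skip the decorator when there is nothing to decorate with) might look roughly like the sketch below. It is not the SynonymFilterFactory code: `OptionalDecoratorSketch` and `wrap` are hypothetical names, a `UnaryOperator<String>` stands in for a token stream, and a `Map` stands in for the synonym FST.

```java
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration of "return the underlying stream when there are no synonyms".
class OptionalDecoratorSketch {
    // Returns the upstream operator itself when no synonym data exists (null),
    // otherwise a wrapper that rewrites tokens found in the synonym map.
    static UnaryOperator<String> wrap(UnaryOperator<String> upstream,
                                      Map<String, String> synonymsOrNull) {
        if (synonymsOrNull == null) {
            return upstream; // nothing to apply, so avoid a useless extra layer
        }
        return token -> {
            String processed = upstream.apply(token);
            return synonymsOrNull.getOrDefault(processed, processed);
        };
    }

    public static void main(String[] args) {
        UnaryOperator<String> trim = String::trim;
        System.out.println(wrap(trim, null) == trim);
        System.out.println(wrap(trim, Map.of("car", "automobile")).apply(" car "));
    }
}
```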
[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals
[ https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470562#comment-13470562 ]

Michael McCandless commented on LUCENE-4285:

bq. That is bad API design; it should return an FST with nothing in it.

+1, this has also bitten me ... a null return is bad when it's surprising, as it clearly is here. Somehow we just need an FST that accepts nothing.
[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals
[ https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470584#comment-13470584 ]

Dawid Weiss commented on LUCENE-4285:

This is typically (?) done by making the root state follow an epsilon transition. If it points to a final state it means the automaton is empty and accepts epsilon (in other words, nothing). But then it also adds overhead for every iteration which needs to skip over this epsilon transition...
[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals
[ https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470588#comment-13470588 ]

Robert Muir commented on LUCENE-4285:

If the problem with null is that the documentation isn't loud enough, we can make it bold? If the problem is that people don't read finish()'s javadocs and we think that's bad, another idea is to just throw a checked exception: then there are no sneaky bugs caused by the change (unlike returning EMPTY, which could easily introduce these).
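As a rough sketch of the checked-exception idea: the compiler forces every caller to decide what an empty input means, so code that previously relied on a null return cannot silently change behaviour. The names `EmptyInputException` and `CheckedFinishBuilder` are invented for this illustration and do not exist in Lucene; a `TreeMap` stands in for the finished FST.

```java
import java.util.TreeMap;

// Hypothetical checked exception; Lucene defines no such class.
class EmptyInputException extends Exception {
    EmptyInputException(String message) {
        super(message);
    }
}

// Hypothetical builder; a TreeMap stands in for the finished FST.
class CheckedFinishBuilder {
    private final TreeMap<String, Long> entries = new TreeMap<>();

    void add(String input, long output) {
        entries.put(input, output);
    }

    // Callers cannot compile without handling the "no input" case.
    TreeMap<String, Long> finish() throws EmptyInputException {
        if (entries.isEmpty()) {
            throw new EmptyInputException("no inputs were added");
        }
        return new TreeMap<>(entries);
    }
}

class CheckedFinishDemo {
    static String describe(CheckedFinishBuilder builder) {
        try {
            return "built " + builder.finish().size() + " entries";
        } catch (EmptyInputException e) {
            // The caller chooses the fallback explicitly.
            return "empty: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        CheckedFinishBuilder builder = new CheckedFinishBuilder();
        System.out.println(describe(builder));
        builder.add("cat", 1L);
        System.out.println(describe(builder));
    }
}
```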
[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals
[ https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470595#comment-13470595 ]

Robert Muir commented on LUCENE-4285:

(I hate checked exceptions and think this would only be annoying for most users who will just let their IDE fill in a crappy try/catch with a System.out.println, but I'm just looking out for our own code here :)
[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals
[ https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427435#comment-13427435 ]

David Smiley commented on LUCENE-4285:

Keep in mind that, to an FST outsider like me, *FSTs are basically a fancy SortedMap*. Yet Lucene's FST API is so complicated that there is a dedicated package of classes, and I need to understand a fair amount of it. I'm not saying the package should go away or that just one class is realistic, just that conceptually, for outsiders, it can and should be simpler than it is.

The Util.get* methods should be instance methods on the FST. I shouldn't need to look at Util, I think.

The BytesReader concept is confusing and should be hidden.

Outputs... this aspect of the API is over-exposed; maybe it can be hidden more? I know I need to choose an implementation at construction.

FSTEnum is pretty cool, and improving it or creating variants of it could help to simplify using the overall API. The FST should have a getter for it. It would be nice if FSTEnum could advance to the next arc by a label (I need this). It would be something like next(int). Can it be improved to the point where, for example, SynonymFilter can use it? It would be nice to reduce the number of use cases in which users/client code have to see an Arc at all.
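The "fancy SortedMap" view of an FST can be illustrated with a small facade. This is a sketch of the API shape only, under loud assumptions: `SortedMapStyleFst` is a hypothetical class, a `TreeMap` stands in for the real data structure, and none of the method names below are claimed to exist in Lucene. It does not cover the label-by-label `next(int)` idea, which depends on arc-level traversal.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical facade; a TreeMap stands in for the FST, so this shows only the API shape.
class SortedMapStyleFst {
    private final TreeMap<String, Long> entries = new TreeMap<>();

    void put(String input, long output) {
        entries.put(input, output);
    }

    // Exact lookup as an instance method, rather than a static helper in a utility class.
    Long get(String input) {
        return entries.get(input);
    }

    // Smallest entry greater than or equal to the target, or null if there is none.
    Map.Entry<String, Long> seekCeil(String target) {
        return entries.ceilingEntry(target);
    }

    // Entry strictly after the given input, for stepping through in sorted order.
    Map.Entry<String, Long> next(String input) {
        return entries.higherEntry(input);
    }

    public static void main(String[] args) {
        SortedMapStyleFst fst = new SortedMapStyleFst();
        fst.put("car", 1L);
        fst.put("cat", 2L);
        System.out.println(fst.get("cat"));
        System.out.println(fst.seekCeil("cas").getKey());
    }
}
```

Client code written against a facade like this never handles arcs, readers, or output types directly, which is the simplification the comment asks for.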