[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals

2012-10-08 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471514#comment-13471514
 ] 

Dawid Weiss commented on LUCENE-4285:
-

For the record, I don't have anything against returning null for an empty 
automaton; I just mentioned how this representation is accomplished elsewhere 
(for instance here 
http://www.eti.pg.gda.pl/katedry/kiw/pracownicy/Jan.Daciuk/personal/fsa.html).

 Improve FST API usability for mere mortals
 --

 Key: LUCENE-4285
 URL: https://issues.apache.org/jira/browse/LUCENE-4285
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/FSTs
Reporter: David Smiley
 Attachments: LUCENE-4285.patch


 FST technology is something that has brought amazing advances to Lucene, yet 
 the API is hard to use for the vast majority of users like me.  I know that 
 performance of FSTs is really important, but surely a lot can be done without 
 sacrificing that.
 (comments will hold specific ideas and problems)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals

2012-10-06 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471081#comment-13471081
 ] 

David Smiley commented on LUCENE-4285:
--

I admit checked exceptions would have alerted me to my bug, but that doesn't 
make the API any nicer -- I still need null checks littered through my FST user 
code now.  I don't know the FST internals, but I'd be surprised to hear that 
adding support for an empty FST adds appreciable overhead.  If the overhead 
we're discussing is a simple conditional check, then it is net-zero, since as 
it stands I need these null checks on my end of the API because my FST is 
potentially null.




[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals

2012-10-06 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471083#comment-13471083
 ] 

Dawid Weiss commented on LUCENE-4285:
-

Things are more difficult than they seem on the surface. An elegant solution 
would encode an empty automaton without any extra flags or checks. In an 
arc-based representation there is simply no notion of an empty set of arcs, 
though -- there needs to be at least one, and if it's present on the root state 
then, well, it's no longer an empty automaton. Like I said -- this can be 
modeled with an initial state transition (the symbol doesn't matter); if this 
transition is final then the automaton is empty (there is no actual root 
state). But this also changes how traversals are implemented and would affect 
all of the existing code.
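
A minimal sketch of one reading of the encoding described above, assuming a toy arc-based automaton rather than Lucene's FST classes. The names SketchArc, SketchAutomaton, follow, empty and of are all hypothetical and exist only for this illustration. The label on the initial transition is ignored, and a final initial transition is taken to mean "no root state, accepts nothing".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical arc-based automaton; these are not Lucene's FST classes.
final class SketchArc {
    final char label;            // ignored on the initial arc
    boolean isFinal;             // on the initial arc, true means "empty automaton"
    final List<SketchArc> next = new ArrayList<>(); // arcs leaving the target state

    SketchArc(char label, boolean isFinal) {
        this.label = label;
        this.isFinal = isFinal;
    }

    // Returns the outgoing arc with the given label, or null if there is none.
    SketchArc follow(char c) {
        for (SketchArc arc : next) {
            if (arc.label == c) {
                return arc;
            }
        }
        return null;
    }
}

final class SketchAutomaton {
    private final SketchArc initial; // extra transition in front of the root state

    private SketchAutomaton(SketchArc initial) {
        this.initial = initial;
    }

    // Empty automaton: the initial transition is final and there is no root state.
    static SketchAutomaton empty() {
        return new SketchAutomaton(new SketchArc('\0', true));
    }

    // Builds an automaton accepting exactly the given non-empty words.
    static SketchAutomaton of(String... words) {
        if (words.length == 0) {
            return empty();
        }
        SketchArc initial = new SketchArc('\0', false);
        for (String word : words) {
            if (word.isEmpty()) {
                throw new IllegalArgumentException("empty word is not supported in this sketch");
            }
            SketchArc current = initial;
            for (char c : word.toCharArray()) {
                SketchArc arc = current.follow(c);
                if (arc == null) {
                    arc = new SketchArc(c, false);
                    current.next.add(arc);
                }
                current = arc;
            }
            current.isFinal = true;
        }
        return new SketchAutomaton(initial);
    }

    boolean isEmpty() {
        return initial.isFinal;
    }

    boolean accepts(String word) {
        // Every traversal has to look at the initial transition before the root state.
        if (initial.isFinal) {
            return false;
        }
        SketchArc current = initial;
        for (char c : word.toCharArray()) {
            current = current.follow(c);
            if (current == null) {
                return false;
            }
        }
        return current.isFinal;
    }
}

final class SketchAutomatonDemo {
    public static void main(String[] args) {
        System.out.println(SketchAutomaton.empty().accepts("cat"));
        System.out.println(SketchAutomaton.of("cat", "car").accepts("cat"));
    }
}
```

Even in this toy version, accepts() has to inspect the initial transition before it reaches the root state, which is the kind of traversal change mentioned above.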




[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals

2012-10-05 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470555#comment-13470555
 ] 

David Smiley commented on LUCENE-4285:
--

I just found out that an FST builder.finish() basically returns null if there's 
no input.  That is bad API design; it should return an FST with nothing in it.  
For now I have to litter my code with null checks.
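
To make the complaint concrete, here is a minimal self-contained sketch. It does not use Lucene's classes: TinyFst, TinyBuilder and finishOrEmpty are hypothetical stand-ins, with a TreeMap playing the role of the FST. It only contrasts a finish() that returns null on no input with one that returns an object containing nothing.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for an FST: an immutable sorted map from term to output.
final class TinyFst {
    static final TinyFst EMPTY = new TinyFst(new TreeMap<>());
    private final SortedMap<String, Long> entries;

    TinyFst(SortedMap<String, Long> entries) {
        this.entries = Collections.unmodifiableSortedMap(new TreeMap<>(entries));
    }

    // Returns the output for the term, or null if the term is not accepted.
    Long get(String term) {
        return entries.get(term);
    }
}

// Hypothetical stand-in for the builder discussed above.
final class TinyBuilder {
    private final SortedMap<String, Long> pending = new TreeMap<>();

    void add(String term, long output) {
        pending.put(term, output);
    }

    // Mirrors the behavior described in this comment: null when nothing was added.
    TinyFst finish() {
        return pending.isEmpty() ? null : new TinyFst(pending);
    }

    // The alternative asked for here: an object that accepts nothing.
    TinyFst finishOrEmpty() {
        return pending.isEmpty() ? TinyFst.EMPTY : new TinyFst(pending);
    }
}

final class NullFinishDemo {
    public static void main(String[] args) {
        TinyBuilder builder = new TinyBuilder(); // no input added

        // With a nullable result, every use has to be guarded.
        TinyFst maybeNull = builder.finish();
        Long viaNullCheck = (maybeNull == null) ? null : maybeNull.get("foo");

        // With an empty object, the lookup simply finds nothing.
        Long viaEmpty = builder.finishOrEmpty().get("foo");

        System.out.println(viaNullCheck + " " + viaEmpty);
    }
}
```

Whether a real empty FST can be provided this cheaply depends on the FST internals, which this sketch does not model.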




[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals

2012-10-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470560#comment-13470560
 ] 

Robert Muir commented on LUCENE-4285:
-

I rather like this: I don't think it's common to have an empty FST; it's usually 
indicative of a bug or misconfiguration.

There is also some code in Lucene that uses this return value, e.g. 
SynonymFilterFactory: if you give it a file with no actual 
synonym entries, it just returns the underlying stream rather than decorating 
it with a useless synonyms filter.




[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals

2012-10-05 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470562#comment-13470562
 ] 

Michael McCandless commented on LUCENE-4285:


bq. That is bad API design; it should return an FST with nothing in it.

+1, this has also bitten me ... null return is bad when it's surprising, as 
clearly it is here.

Somehow we just need an FST that accepts nothing.




[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals

2012-10-05 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470584#comment-13470584
 ] 

Dawid Weiss commented on LUCENE-4285:
-

This is typically (?) done by making the root state follow an epsilon 
transition. If it points to a final state it means the automaton is empty and 
accepts epsilon (in other words, nothing).

But then it also adds overhead for every iteration which needs to skip over 
this epsilon transition...




[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals

2012-10-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470588#comment-13470588
 ] 

Robert Muir commented on LUCENE-4285:
-

If the problem with null is that the documentation isn't loud enough, we can 
make it bold?

If the problem is that people don't read build()'s javadocs and we think that's 
bad, another idea is to just throw a checked exception: then there are no sneaky 
bugs caused by the change (unlike returning EMPTY, which could easily introduce 
these).
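
A minimal sketch of the checked-exception idea, assuming hypothetical names throughout: EmptyInputException, StrictBuilder and describe are invented for this illustration and nothing like them is confirmed to exist in Lucene. A plain List stands in for the FST.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical checked exception signalling that no input was added.
final class EmptyInputException extends Exception {
    EmptyInputException(String message) {
        super(message);
    }
}

// Hypothetical builder; a List stands in for the FST it would produce.
final class StrictBuilder {
    private final List<String> terms = new ArrayList<>();

    void add(String term) {
        terms.add(term);
    }

    // The compiler forces every caller to decide what the no-input case means.
    List<String> finish() throws EmptyInputException {
        if (terms.isEmpty()) {
            throw new EmptyInputException("no input was added before finish()");
        }
        return Collections.unmodifiableList(new ArrayList<>(terms));
    }
}

final class CheckedFinishDemo {
    // Existing callers cannot silently keep compiling: they must add this handling.
    static String describe(StrictBuilder builder) {
        try {
            return builder.finish().size() + " term(s)";
        } catch (EmptyInputException e) {
            return "empty";
        }
    }

    public static void main(String[] args) {
        StrictBuilder builder = new StrictBuilder();
        System.out.println(describe(builder));
        builder.add("foo");
        System.out.println(describe(builder));
    }
}
```

The trade-off is the one raised in this thread: the change is impossible to miss at compile time, but every caller pays for it with a try/catch.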





[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals

2012-10-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470595#comment-13470595
 ] 

Robert Muir commented on LUCENE-4285:
-

(I hate checked exceptions and think this would only be annoying for most users, 
who will just let their IDE fill in a crappy try/catch with a 
System.out.println, but I'm just looking out for our own code here :) )





[jira] [Commented] (LUCENE-4285) Improve FST API usability for mere mortals

2012-08-02 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427435#comment-13427435
 ] 

David Smiley commented on LUCENE-4285:
--

Keep in mind, from an FST outsider like me, *FSTs are basically a fancy 
SortedMap*.  Yet Lucene's FST API is so complicated that there is a dedicated 
package of classes, and I need to understand a fair amount of it.  I'm not 
saying the package should go away or just one class is realistic, just that 
conceptually for outsiders it can and should be simpler than it is.

The Util.get* methods should be available as instance methods on the FST.  I 
shouldn't need to look at Util, I think.

The BytesReader concept is confusing and should be hidden.

Outputs... this aspect of the API is over-exposed; maybe it can be hidden more? 
 I know I need to choose an implementation at construction.

FSTEnum is pretty cool, and improving it or creating variants of it could help 
to simplify using the overall API.  The FST should have a getter for it.  It 
would be nice if FSTEnum could advance to the next arc by a label (I need 
this).  It would be something like next(int).  Can it be improved to the point 
where, for example, SynonymFilter can use it?  It would be nice to reduce the 
number of use-cases where users/client-code even have to see an Arc.
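
As a rough illustration of the "fancy SortedMap" view, here is a sketch of the kind of surface an outsider might want. SortedTermMap and its methods are hypothetical, and the class is backed by a java.util.TreeMap rather than an FST, so it shows only the shape of the API, not how Lucene would implement it.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical facade; a TreeMap stands in for the FST behind it.
final class SortedTermMap {
    private final TreeMap<String, Long> entries = new TreeMap<>();

    void put(String term, long output) {
        entries.put(term, output);
    }

    // Exact lookup as an instance method, with no static helper class involved.
    Long get(String term) {
        return entries.get(term);
    }

    // Smallest entry whose term is >= the given term, or null if there is none.
    Map.Entry<String, Long> seekCeil(String term) {
        return entries.ceilingEntry(term);
    }

    // All entries in sorted term order, without exposing arcs or readers.
    Iterable<Map.Entry<String, Long>> inOrder() {
        return entries.entrySet();
    }
}

final class SortedTermMapDemo {
    public static void main(String[] args) {
        SortedTermMap map = new SortedTermMap();
        map.put("banana", 2L);
        map.put("apple", 1L);

        System.out.println(map.get("apple"));
        System.out.println(map.seekCeil("b").getKey());
        for (Map.Entry<String, Long> entry : map.inOrder()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

A caller of such a facade never sees an Arc, a BytesReader or an Outputs implementation, which is the simplification being asked for here.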
