A user reported some cases of overflow in Fisher's exact test in the Ngram Statistics Package, in both the left and right variations: the reported scores come out slightly greater than 1. My own conclusion is that a bit of rounding error is at work here, since we sum a potentially large number of hypergeometric probabilities to arrive at each value. So it's not an alarming situation, but it is certainly one that needs to be fixed. Below you can see some specific cases of overflow:

Right Fisher output:

934064
cat:cc<>h_position_direction:-<>2 1.0490 1728 20006 169317
h_role:object<>relative_position:3<>3 1.0050 144 68362 15501
h_cat:jj<>h_role:locative<>4 1.0032 511 48419 35842
h_group_type:na<>cat:pos<>5 1.0000 14 59756 8709
h_group_type:na<>cat:sym<>5 1.0000 38 59756 6910

Left Fisher output:

934064
cat:nn<>h_relative_position:3<>1 1.0916 801 301347 1890
h_group_type:np<>role:predeterminer<>2 1.0390 1133 445387 1135
group_type:na<>leafp:na<>3 1.0000 1 1 1
group_type:na<>h_leafp:na<>3 1.0000 1 1 59756
h_group_type:na<>leafp:na<>3 1.0000 1 59756 1

The good news is that the Ngram Statistics Package is in line for a long-overdue facelift that will commence in August. A number of long-pending issues will be resolved at that time, and there will be some new enhancements and features. As we get closer to starting that work, I'll be posting our list of reported problems, etc., in order to make sure we have caught everything. And of course, please feel free to let us know of any other questions or concerns.

Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
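
P.S. For anyone wondering how a summed probability can drift past 1, here is a minimal Python sketch of the idea. It is not the NSP code. The function names (log_hypergeom, fisher_right, fisher_left) and the argument names (n11, n1p, np1, npp for the joint count, the two marginal totals, and the overall total) are made up for illustration. Each hypergeometric term is computed in log space, the terms are added with a compensated sum, and the result is clamped to 1.0. The clamp is just one possible remedy and should be read as an assumption, not as the fix that will go into the package.

```python
import math


def log_hypergeom(n11, n1p, np1, npp):
    """Log-probability of the 2x2 table with joint count n11 and fixed marginals."""
    n12 = n1p - n11
    n21 = np1 - n11
    n22 = npp - n1p - np1 + n11

    def log_fact(k):
        # log(k!) via the log-gamma function, which avoids huge factorials
        return math.lgamma(k + 1)

    return (log_fact(n1p) + log_fact(npp - n1p)
            + log_fact(np1) + log_fact(npp - np1)
            - log_fact(npp)
            - log_fact(n11) - log_fact(n12) - log_fact(n21) - log_fact(n22))


def _tail(first, last, n1p, np1, npp):
    # Add the table probabilities for joint counts first..last (inclusive).
    terms = [math.exp(log_hypergeom(k, n1p, np1, npp))
             for k in range(first, last + 1)]
    total = math.fsum(terms)  # compensated summation limits rounding drift
    return min(1.0, total)    # assumed remedy: never report more than 1


def fisher_right(n11, n1p, np1, npp):
    """Right-sided test: tables with a joint count at least n11."""
    lo = max(0, n1p + np1 - npp)
    hi = min(n1p, np1)
    return _tail(max(n11, lo), hi, n1p, np1, npp)


def fisher_left(n11, n1p, np1, npp):
    """Left-sided test: tables with a joint count at most n11."""
    lo = max(0, n1p + np1 - npp)
    hi = min(n1p, np1)
    return _tail(lo, min(n11, hi), n1p, np1, npp)
```

A small table makes a handy sanity check: with n11=3, n1p=5, np1=6 and npp=12, the right-sided sum is 462/924, so fisher_right(3, 5, 6, 12) should come out at 0.5 up to floating-point error.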