Can you characterize those 21,000 leaves that deviate significantly? For example, do some tiles predominate disproportionately? (I would guess C, G, and/or O, given your CANISTER bonus.)
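
One rough way to check this, as a sketch only: compare how often each tile shows up among the high-error leaves with how often it shows up among all leaves. The function name, the `errors` dictionary (leave string mapped to estimate minus superleave value), and the toy numbers below are all made up for illustration; I'm assuming you already have the per-leave errors in memory and am not assuming anything about the superleaves file format.

```python
from collections import Counter

def tile_overrepresentation(errors, threshold=5.0):
    """Hypothetical helper.

    errors: dict mapping a leave string (e.g. "CGO") to its signed error
            (strategy estimate minus superleave value).
    Returns {tile: ratio}, where ratio is the share of high-error leaves
    containing the tile divided by the share of all leaves containing it.
    Ratios well above 1 point at tiles the model handles badly.
    """
    all_counts = Counter()
    bad_counts = Counter()
    n_all = 0
    n_bad = 0
    for leave, err in errors.items():
        tiles = set(leave)  # count each distinct tile once per leave
        n_all += 1
        all_counts.update(tiles)
        if abs(err) > threshold:
            n_bad += 1
            bad_counts.update(tiles)
    if n_bad == 0:
        return {}  # nothing exceeds the threshold
    return {
        tile: (bad_counts[tile] / n_bad) / (all_counts[tile] / n_all)
        for tile in all_counts
    }

# Toy errors, invented purely to exercise the function.
toy_errors = {"CG": 7.2, "CO": -6.1, "ER": 0.4, "ST": 1.0, "GO": 5.5, "AE": -0.3}
ratios = tile_overrepresentation(toy_errors)
for tile, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(tile, round(ratio, 2))
```

The same tally could be repeated for letter pairs instead of single tiles, which might suggest which pair terms are worth adding.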

On Fri, Jan 23, 2009 at 5:06 PM, Eugene Deon <[email protected]> wrote:
> I've been trying to tune my leave estimation strategy by solving for the
> exact values of my commonly considered estimation variables so as to minimize
> the sum of squared errors between my strategy estimates and all the leaves
> in Quackle's "superleaves" file.
>
> So far my variables include only:
> -single letter values
> -double/triple/quad-letter penalties
> -vowel/consonant imbalance penalties
> -bonuses for number of tiles in CANISTER
> -a couple of letter pair values (QU, YY, IY, FF, ING)
>
> The results aren't acceptable yet. 21,000 of 148,000 five or fewer letter
> racks have errors over 5. But I would like to keep the number of terms
> manageable if possible.
>
> Has anyone else done a similar analysis? Are complete details of Maven's
> letter pair values available somewhere?
>
> Thanks,
> Eugene d'Eon
