I think we should try to put together a large corpus of actual games played
by players of a variety of skill levels (e.g. you play a game with Quackle,
save it as a .gcg file and email it to a repository somewhere).  Then we can
build some probabilistic models to estimate the likelihood that a player's
vocabulary contains a particular word, given the body of words played
historically by that player and by other players of relatively equal
strength.  If games in the corpus included ratings for the players involved,
we could incorporate those.  Otherwise, we could assess the strength of each
player by average points per turn, or by comparing the moves played in the
game to what Quackle would have played in the same situation.  I think that even
crude models constructed in this way could yield a big improvement over just
using published lists of common words (although those would certainly be
valuable as well).  Since people grow their Scrabble vocabulary so
differently from the way they grow their normal reading/writing/speaking
vocabulary, I think there is no substitute for real game data.
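
To make the idea concrete, here is a rough sketch of the crudest model I
can think of.  Everything in it is an assumption for illustration: the class
and method names are made up, players are grouped into 200-point rating
buckets, and the estimate is just the smoothed fraction of players in the
same bucket who have ever played the word.  A word the player has already
played is treated as known.  Note that never having played a word is only
weak evidence of not knowing it, so a real model would need to do better.

```python
from collections import defaultdict


class VocabularyModel:
    """Crude estimate of P(player knows word) from words played in a corpus."""

    def __init__(self, bucket_size=200, alpha=1.0):
        self.bucket_size = bucket_size  # width of a rating bucket (assumed)
        self.alpha = alpha              # add-alpha (Laplace) smoothing constant
        self.players_in_bucket = defaultdict(set)   # bucket -> player ids
        self.players_who_played = defaultdict(set)  # (bucket, word) -> player ids
        self.words_by_player = defaultdict(set)     # player id -> words played

    def bucket(self, rating):
        """Map a rating to the index of its strength bucket."""
        return int(rating // self.bucket_size)

    def add_play(self, player, rating, word):
        """Record that `player` (with the given rating) played `word`."""
        b = self.bucket(rating)
        word = word.upper()
        self.players_in_bucket[b].add(player)
        self.players_who_played[(b, word)].add(player)
        self.words_by_player[player].add(word)

    def probability(self, player, rating, word):
        """Estimate the chance that `word` is in the player's vocabulary."""
        word = word.upper()
        # A word the player has played before is assumed to be known.
        if word in self.words_by_player.get(player, ()):
            return 1.0
        b = self.bucket(rating)
        n = len(self.players_in_bucket.get(b, ()))
        k = len(self.players_who_played.get((b, word), ()))
        # Smoothed fraction of similarly rated players who have played it.
        return (k + self.alpha) / (n + 2 * self.alpha)


if __name__ == "__main__":
    model = VocabularyModel()
    model.add_play("alice", 1500, "QI")
    model.add_play("bob", 1450, "QI")
    model.add_play("carol", 1550, "ZA")
    print(model.probability("carol", 1550, "QI"))
```

If ratings are missing, the same bucketing could be driven by some other
strength measure, such as average points per turn.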

If there is interest in this type of thing, I think I can set up such a
repository for the accumulation of data.

Obviously, if any of the online Scrabble servers were to log played games
and contribute them, that would be nice too.

Mark

On 9/5/07, Matt Liberty <[EMAIL PROTECTED]> wrote:
>
>   My feeling is to start out with a basic vocabulary for the dumber
> player. I'm quite open to how that initial set should be chosen.
>
> There should be some options for adding various lists automatically -
> like all the legal two letter words. Likewise it would be nice to add
> any words I am currently quizzing in Zyzzyva. I think a general word
> list import mechanism would cover all that fairly flexibly. It should
> also be able to add any word I play (that is truly legal) to its
> lexicon on the fly.
>
> Of course the current GADDAG code assumes a static lexicon, so it would
> need work to make it more dynamic. Also having separate lexicons for
> generation and validation is some work. All this might get me
> motivated to put some more work into Quackle though.
>
> Matt
>
>  
>
