After discussions on this list last year, I decided to try 
categorizing queries on the list with a view to training a Bayes 
classifier for an autoresponder. I now have about 5 weeks' worth of 
/queries/ (not replies) sent to the list, which I've been classifying 
somewhat crudely, and I thought people might be interested in the 
stats.

The categories are necessarily more than a bit subjective, and a few 
queries were ambiguous; I picked the categories that seemed to me to 
be relevant to newcomers.

158 installation (incl download, dictionaries, removal)
95 M$ compatibility (not vista issues)
24 foreign language submissions
18 selling OOo
9 vista compatibility
7 envelope printing
424 uncategorized (may need to split)

giving a total of around 740 queries.
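
For anyone who wants the proportions rather than the raw counts, here 
is a small sketch that works them out. The counts are copied from the 
list above; the shortened label strings and the `shares` helper are 
just names chosen for this illustration. The counts as listed add up 
to 735.

```python
# Counts copied from the list above; labels are shortened for display.
counts = {
    "installation": 158,
    "M$ compatibility": 95,
    "foreign language": 24,
    "selling OOo": 18,
    "vista compatibility": 9,
    "envelope printing": 7,
    "uncategorized": 424,
}


def shares(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


total = sum(counts.values())
percentages = shares(counts)

# Print the categories from largest to smallest.
for name in sorted(counts, key=counts.get, reverse=True):
    print(f"{counts[name]:4d}  {percentages[name]:5.1f}%  {name}")
print(f"{total:4d}  total")
```

More than half of the queries fall in "uncategorized", so a classifier 
trained on these labels sees comparatively few examples of the 
smaller categories.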

Oh, and the classifier doesn't work at all well. Probably needs much 
more data.
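
For readers unfamiliar with the idea, here is a minimal sketch of a 
naive Bayes text classifier with add-one smoothing. It is not the 
classifier I am actually running, only an illustration of the 
technique. The class name, the tokenizer, and the four training 
messages are all invented for the example and are not real list 
traffic.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9$]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()               # messages seen per label
        self.word_counts = defaultdict(Counter)   # word counts per label
        self.vocab = set()                        # every word seen in training

    def train(self, text, label):
        self.doc_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocab.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        words = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # Log prior: how common this label is in the training data.
            score = math.log(n_docs / total_docs)
            n_words = sum(self.word_counts[label].values())
            denom = n_words + len(self.vocab)
            for word in words:
                # Smoothed log likelihood of the word given the label.
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training messages, for illustration only.
nb = NaiveBayes()
nb.train("how do I install the spelling dictionary", "installation")
nb.train("download fails during setup on windows", "installation")
nb.train("cannot open a word doc file correctly", "M$ compatibility")
nb.train("excel spreadsheet macros do not work", "M$ compatibility")

print(nb.classify("where can I download a dictionary"))
print(nb.classify("word doc file will not open"))
```

With only a handful of messages per category the word statistics are 
very noisy, which fits with the guess that much more data is needed.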

-- 
http://www.scottsonline.org.uk lists incoming sites blocked because 
of spam
[EMAIL PROTECTED]    Mike Scott, Harlow, Essex, England



