> If you use the --stat option to select a test of association for
> identifying bigram or co-occurrence features, you really should use the
> --stat_rank or --stat_score to select a subset of the bigrams or
> co-occurrences that are identified. If you do not, the effect will be the
> same as if you didn't use the --stat option at all, since all features

This is true only for order1. For order2, --stat puts the statistical
scores into the word vectors, even if you don't remove the insignificant
features. So using --context o2 and --stat without --stat_score or
--stat_rank will create word vectors for all bigrams/co-occurrences in
your training data, with scores rather than counts as the values.
For order1, yes, we only consider the frequency counts of features in the
test contexts, so the scores don't matter. But the word vectors hold
scores or frequencies, depending on whether or not you use --stat.
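
To make the difference concrete, here is a small toy sketch in Python. It
is not SenseClusters code: the function names are made up for this
example, the window is fixed at adjacent words, and pointwise mutual
information stands in for whichever test you would pick with --stat. It
only shows that, with no rank or score cutoff, the same set of bigrams
survives either way, but the order2-style word vectors hold different
values, while the order1-style vector just counts features in a test
context.

```python
import math
from collections import Counter

def bigram_counts(tokens):
    """Count adjacent word pairs (a fixed window of 2, assumed for this toy)."""
    return Counter(zip(tokens, tokens[1:]))

def pmi_scores(pairs):
    """Pointwise mutual information per bigram; a stand-in association score."""
    total = sum(pairs.values())
    left, right = Counter(), Counter()
    for (w1, w2), n in pairs.items():
        left[w1] += n
        right[w2] += n
    return {
        (w1, w2): math.log2(n * total / (left[w1] * right[w2]))
        for (w1, w2), n in pairs.items()
    }

def word_vectors(cells):
    """Order2 style: one vector per first word, indexed by the second word."""
    vectors = {}
    for (w1, w2), value in cells.items():
        vectors.setdefault(w1, {})[w2] = value
    return vectors

def order1_vector(context_tokens, features):
    """Order1 style: how often each feature occurs in one test context."""
    found = bigram_counts(context_tokens)
    return [found[f] for f in features]

training = "the bank of the river and the bank of the town".split()
pairs = bigram_counts(training)
scores = pmi_scores(pairs)

# No cutoff applied: both tables keep exactly the same bigrams.
freq_vectors = word_vectors(pairs)    # cell values are frequencies
stat_vectors = word_vectors(scores)   # cell values are association scores

# Order1 only looks at feature counts in the test context, so the
# training-side scores never enter this vector.
features = sorted(pairs)
test_vector = order1_vector("the bank of the town".split(), features)
```

In this toy, freq_vectors["bank"] and stat_vectors["bank"] have the same
single entry ("of") but different values, which is the order2 effect
described above.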

Also, one more relevant point: --window and --stat do not apply to
unigram features. So even if you do specify these parameters for unigram
features, they won't have any effect.

Amruta



_______________________________________________
senseclusters-users mailing list
https://lists.sourceforge.net/lists/listinfo/senseclusters-users
