Andres, (and also Linas and anyone else interested...)
I have refreshed my memory on clustering for unsupervised POS learning. This was the approach I had fiddled with long ago:

http://www.cs.rhul.ac.uk/home/alexc/papers/eacl2003.pdf
https://github.com/ninjin/clark_pos_induction

I note that Spitkovsky (at Google) uses a similar method in his more recent work on unsupervised part-of-speech learning:

https://web.stanford.edu/~jurafsky/goldtags.pdf

These guys are doing clustering on sparse vectors derived via co-occurrence of various sorts. They're not using dimension reduction, though Spitkovsky is doing some dependency parsing. This is at bottom just EM clustering, but it's used in a way that's nicely customized for part-of-speech induction.

This paper finds that Fuzzy C-Means outperforms EM on classifying word2vec output vectors, in a somewhat different context:

http://dm.snu.ac.kr/static/docs/TR/SNUDM-TR-2016-11.pdf

-- Ben

--
Ben Goertzel, PhD
http://goertzel.org

"I am God! I am nothing, I'm play, I am freedom, I am life. I am the boundary, I am the peak." -- Alexander Scriabin
