Hi Chris, thank you very much for the interesting literature pointers. Surprisingly, some of these studies are very recent.
I once replaced mkcls with supervised part-of-speech-based classes and saw some improvement in German-Polish AER, but I never worked on that any further. I cannot say whether it actually improved BLEU.

On 08.06.2012 18:08, Chris Dyer wrote:
> Hi Marcin,
> mkcls is best understood as implementing the Brown et al (1992)
> clustering model (i.e., a bigram HMM with some extra hard
> constraints), although it uses a different algorithm for parameter
> learning than the algorithm proposed by Brown.
>
> Its performance has been analyzed and compared to a few other
> techniques in this paper:
> Phil Blunsom; Trevor Cohn. (2011) A Hierarchical Pitman-Yor Process
> HMM for Unsupervised Part of Speech Induction.
> http://aclweb.org/anthology-new/P/P11/P11-1087.pdf
>
> Evaluated as an unsupervised POS tagger, mkcls works surprisingly
> well, especially considering its "age".
>
> There has been other work that has looked at using unsupervised word
> classes for various NLP tasks and found that Brown clusters are quite
> good for a variety of things, so I suspect mkcls is going to be hard
> to beat, although tuning the number of classes is likely to be a very
> good idea:
> Joseph Turian; Lev Ratinov; Yoshua Bengio. (2010) Word
> representations: A simple and general method for semi-supervised
> learning. http://www.aclweb.org/anthology-new/P/P10/P10-1040.pdf
>
> On Wed, Jun 6, 2012 at 5:33 AM, Marcin Junczys-Dowmunt
> <[email protected]> wrote:
>> Hi all,
>> I am training another model and started wondering about the mkcls tool
>> (again). Does anyone know if there have been any attempts to use
>> something different and to what result? It's a strange little tool,
>> everyone uses it, but probably hardly anyone knows what it exactly does
>> and why it does what it does :) apart from knowing that Models 3-5 need
>> its output.
>>
>> Best,
>> Marcin
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>

--
dr inż. Marcin Junczys-Dowmunt
Uniwersytet im. Adama Mickiewicza
Wydział Matematyki i Informatyki
ul. Umultowska 87
61-614 Poznań

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
