Hi Chris,
thank you very much for the interesting literature pointers.
Surprisingly, some of them are very recent studies.

I once replaced the mkcls classes with supervised part-of-speech classes and
saw some improvement in AER for German-Polish, but I never pursued that any
further. I cannot say whether it actually improved BLEU.


On 08.06.2012 18:08, Chris Dyer wrote:
> Hi Marcin,
> mkcls is best understood as implementing the Brown et al. (1992)
> clustering model (i.e., a bigram HMM with some extra hard
> constraints), although it uses a different algorithm for parameter
> learning than the algorithm proposed by Brown.
>
> Its performance has been analyzed and compared to a few other
> techniques in this paper:
> Phil Blunsom; Trevor Cohn. (2011) A Hierarchical Pitman-Yor Process
> HMM for Unsupervised Part of Speech Induction.
> http://aclweb.org/anthology-new/P/P11/P11-1087.pdf
>
> Evaluated as an unsupervised POS tagger, mkcls works surprisingly
> well, especially considering its "age".
>
> There has been other work that has looked at using unsupervised word
> classes for various NLP tasks and found that Brown clusters are quite
> good for a variety of things, so I suspect mkcls is going to be hard
> to beat, although tuning the number of classes is likely to be a very
> good idea:
> Joseph Turian; Lev Ratinov; Yoshua Bengio. (2010) Word
> representations: A simple and general method for semi-supervised
> learning. http://www.aclweb.org/anthology-new/P/P10/P10-1040.pdf
>
> On Wed, Jun 6, 2012 at 5:33 AM, Marcin Junczys-Dowmunt
> <[email protected]>  wrote:
>> Hi all,
>> I am training another model and started wondering about the mkcls tool
>> (again). Does anyone know if there have been any attempts to use
>> something different and to what result? It's a strange little tool,
>> everyone uses it, but probably hardly anyone knows exactly what it does
>> and why it does what it does :) apart from knowing that Models 3-5 need
>> its output.
>>
>> Best,
>> Marcin
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>


-- 
dr inż. Marcin Junczys-Dowmunt
Uniwersytet im. Adama Mickiewicza
Wydział Matematyki i Informatyki
ul. Umultowska 87
61-614 Poznań
