Thanks for clarifying things.
> > It is clear. I am wondering about the supervised training: is it
> > possible to train the tagger (in a supervised manner) without creating
> > all the lexical resources used by the MT system? What is
> > not obvious for me, that why are these parameters needed:
> > "apertium-tagger[-d] -s=n DIC CRP TSX TAGGER_DATA HTAG UNTAG"
>
> And FILES are:
> DIC: full expanded dictionary file
> CRP: training text corpus file
> TSX: tagger specification file, in XML format
> TAGGER_DATA: tagger data file, built in the training and used while
> tagging
> HTAG: hand-tagged text corpus
> UNTAG: untagged text corpus, morphological analysis of HTAG
> corpus to use both jointly with -s option
>
>
> For Hungarian, "DIC" is not going to be possible as it relies on
> dictionary expansion,[1] the rest is possible (you just need to convert
> the resources you already have).
>
> Felipe: What is the dictionary expansion file used for when training the
> tagger, and could it be approximated in some way?
>
> Fran
>
> 1. Well, you could just analyse the corpus with your morphological
> analyser, and then convert the set of analyses from the corpus to an
> Apertium .dix file, then expand it. This would be useless for most
> purposes but would allow you to train the tagger.
>
>
Could you please confirm whether I have understood the training process
correctly? For training we need an untagged corpus (UNTAG), a disambiguated
one (HTAG), and one which has all the possible analyses for each word (CRP).
We also need a dictionary which contains a large number of wordform-analysis
pairs (DIC). (Does it act as a simulated morphological analyser?) TAGGER_DATA
is created during training, and TSX contains the mapping between the tags of
the MA and those of the tagger. (One more question: is it possible to use the
identity relation as the mapping, since the tagset we use is the one that the
MA generates?)
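
To make the "wordform-analysis pairs" idea concrete, here is a minimal sketch
of how I imagine collecting them from a corpus that our analyser has already
processed. It assumes the analyser output has been converted to the Apertium
stream format (^surface/analysis1/analysis2$), that unknown words are marked
with a leading '*', and that no escaped characters occur. The function name,
the tags in the sample and the "surface:analysis" output layout are only
illustrative assumptions, not the exact format the tagger requires.

```python
import re

# One lexical unit in Apertium stream format: ^surface/analysis1/analysis2$
# (assumption: no escaped '^', '/' or '$' inside the unit)
LEXICAL_UNIT = re.compile(r"\^([^/^$]+)((?:/[^/^$]+)+)\$")


def collect_pairs(analysed_text):
    """Return the sorted, de-duplicated (surface form, analysis) pairs."""
    pairs = set()
    for match in LEXICAL_UNIT.finditer(analysed_text):
        surface = match.group(1)
        # group(2) starts with '/', so drop it before splitting the analyses
        for analysis in match.group(2)[1:].split("/"):
            # Analyses starting with '*' mark unknown words; skip them
            if not analysis.startswith("*"):
                pairs.add((surface, analysis))
    return sorted(pairs)


# Hypothetical sample; the tags are placeholders, not our real tagset
sample = (
    "^a/a<det>$ ^vár/vár<n><sg><nom>/vár<vblex><pres>$ "
    "^a/a<det>$ ^xyz/*xyz$"
)

for surface, analysis in collect_pairs(sample):
    print(f"{surface}:{analysis}")
```

The resulting pair list would then still have to be converted into a .dix
file and expanded, as described in the footnote above.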
Thanks,
Gyorgy
_______________________________________________
Apertium-stuff mailing list
https://lists.sourceforge.net/lists/listinfo/apertium-stuff