On 21 September 2013 18:45, Francis Tyers <[email protected]> wrote:
> On Sat, 21 Sep 2013 at 19:29 +0200, Mikel Forcada wrote:
>> On 09/21/2013 02:11 PM, Francis Tyers wrote:
>> > No, basically I'm asking if it can work without specifying the set of
>> > coarse tags.
>> What would happen if one did not specify the set of coarse tags in the
>> HMM tagger?
>
> The main thing is that it would make it easier to train taggers for use
> after CG. For many language pairs, defining the .tsx file is quite
> cumbersome if the tagset is large. And with an automatically defined
> tagset, the training does not finish because the tagset is too large.

If you had some tagged text, you could treat tag clustering as word
clustering -- delexicalise the text, generate word classes from it
(mkcls will do this), and use those as the coarse tags. Here are some
scripts to do that: https://github.com/jimregan/tag-clusterer
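
As a rough illustration of the idea (not the code in that repository),
here is a minimal Python sketch. It assumes the tagged text is in Apertium
stream format with one analysis per lexical unit (e.g. ^house<n><sg>$),
ignores escaped characters, and assumes the clustering tool writes one
"token<TAB>class" pair per line. The function names delexicalise and
read_classes are made up for this example.

```python
import re
from collections import defaultdict

# One lexical unit in Apertium stream format, e.g. ^house<n><sg>$
# (assumption: no escaped '$' or '/' inside the unit).
LU = re.compile(r"\^([^$]*)\$")
TAG = re.compile(r"<[^<>]+>")


def delexicalise(line):
    """Replace each lexical unit with its concatenated tag string.

    The result is a line of space-separated "words" such as <n><sg>,
    which a word-clustering tool can then group into classes.
    """
    tokens = []
    for lu in LU.findall(line):
        # If a surface form is present (^The/the<det>$), keep the last
        # analysis; disambiguated text is assumed to have only one.
        analysis = lu.split("/")[-1]
        tags = "".join(TAG.findall(analysis))
        if tags:
            tokens.append(tags)
    return " ".join(tokens)


def read_classes(lines):
    """Group fine-grained tag strings by class id.

    Assumes each line holds a tag string and a class id separated by
    whitespace; lines in any other shape are skipped.
    """
    classes = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            classes[parts[1]].append(parts[0])
    return dict(classes)
```

Each class returned by read_classes would then stand for one coarse tag,
with the fine-grained tag strings in it as its members; how that gets
written out as a .tsx file is left open here.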

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
