Hi Gang [hi Apertiumers!]
Thanks for merging your code into trunk. It will be part of the next
release but, in the meantime, people can already use it. It would also be
very, very nice if you could publish the document you prepared on the
Apertium wiki (you can add a page about the LSW tagger implementation,
etc., containing a summary and then linking to the document). I cannot
find any of your stuff there, and it would be nice to have it.
By the way, I have already passed you.
It has been a pleasure to work with you!
Mikel
On 09/23/2013 05:48 AM, Gang Chen wrote:
Hi, all,
Thanks for the discussions!
I've merged the code of branches/apertium-swpost/apertium into
trunk/apertium.
Now the Apertium PoS tagger supports both models: HMM and LSW. HMM is
the default choice, and its usage is the same as before. LSW is the
new part, and its usage is also simple: just add a "-w" option when
running apertium-tagger for training, retraining, or tagging.
I've tested the HMM functionality in the modified tagger:
unsupervised training, unsupervised retraining, supervised training,
and tagging with different options. All produced the same results as
the original tagger. The LSW functionality also works as expected.
For further information, see the wiki page:
http://wiki.apertium.org/w/index.php?title=User:Gang_Chen/GSoC_2013_Progress
If there are any questions/problems about the tagger, please email me :-)
Many thanks to the nice Apertium community!!
Best wishes,
Gang
2013/9/22 Jimmy O'Regan <[email protected]>
On 21 September 2013 18:45, Francis Tyers <[email protected]> wrote:
> On Sat, 21 Sep 2013 at 19:29 +0200, Mikel Forcada wrote:
>> On 09/21/2013 02:11 PM, Francis Tyers wrote:
>> > No, basically I'm asking if it can work without specifying the set of
>> > coarse tags.
>> What would happen if one did not specify the set of coarse tags in the
>> HMM tagger?
>
> The main thing is that it would make it easier to train taggers for use
> after CG. For many language pairs, defining the .tsx file is quite
> cumbersome if the tagset is large. And with an automatically-defined
> tagset, the training does not finish because it is too large.
If you had some tagged text, you could treat tag clustering as word
clustering -- delexicalise the text, generate word classes from it
(mkcls will do this), and use those as the coarse tags. Here are some
scripts to do that: https://github.com/jimregan/tag-clusterer
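As a rough illustration of the delexicalisation step only (a sketch, not
taken from the tag-clusterer scripts): assuming the tagged text is in
Apertium stream format with exactly one analysis per word, e.g.
^houses/house<n><pl>$, each word can be replaced by a single token built
from its tags, so that a word-clustering tool treats tag sequences as
"words". The function name delexicalise, the dot-joined token format, and
the UNK placeholder are all hypothetical choices made for this example.

```python
import re

# Matches one disambiguated lexical unit, e.g. ^houses/house<n><pl>$
# Assumption: exactly one analysis per unit (no further '/' alternatives).
LEXICAL_UNIT = re.compile(r"\^([^/$^]*)/([^$]*)\$")
TAG = re.compile(r"<([^<>]+)>")


def delexicalise(line: str) -> str:
    """Replace every lexical unit in `line` with a token built from its tags."""
    tokens = []
    for match in LEXICAL_UNIT.finditer(line):
        analysis = match.group(2)
        tags = TAG.findall(analysis)
        # Units without any tags (e.g. unknown words) get a placeholder token.
        tokens.append(".".join(tags) if tags else "UNK")
    return " ".join(tokens)


if __name__ == "__main__":
    print(delexicalise("^the/the<det><def><sp>$ ^houses/house<n><pl>$"))
```

Running this over a corpus, one sentence per line, gives a file of tag
tokens that a word-clustering tool such as mkcls can take as input; the
resulting classes would then serve as the coarse tags.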
--
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326