Gang,
could you provide a Makefile for it? I cannot simply compile it with a
plain g++ command; I'd need to know which libraries to link against, and
I don't want to work that hard ;-)
It would also be really nice if you could provide code for a round-trip
converter that reads the output of this one and regenerates the input,
as this is something you will have to deal with anyway when you write
the tagger.
All the best
Mikel
On 04/21/2013 11:41 AM, Gang Chen wrote:
Hi, Mikel, Jimmy,
Thanks for your responses!
I have done a C++ version, with some care for option errors. Here is
the code:
https://github.com/elephantgcc/gsoc-2013/blob/master/ApertiumFilter.cpp
Your explanations are very helpful for me to update my understanding.
As for the restrictions, as far as I know from the documentation, the
current HMM tagger handles them by modifying the transition matrix
during training, so that FORBID transitions have a probability of
0. However, in the case of the sliding-window PoS tagger, I think I
need some time to go into the details of the tagger.
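To illustrate what I mean by FORBID in the HMM case, here is a minimal sketch. It is not the actual Apertium tagger code: the names `Matrix` and `apply_forbid` are hypothetical, and renormalising each row after zeroing is my own assumption about how the matrix could be kept a valid probability distribution.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical representation: a[i][j] is the probability of moving
// from tag i to tag j.
using Matrix = std::vector<std::vector<double>>;
using Rule = std::pair<std::size_t, std::size_t>;  // (from tag, to tag)

// Set every forbidden transition to 0, then rescale each row so that its
// remaining entries sum to 1 again (the rescaling is an assumption).
void apply_forbid(Matrix& a, const std::vector<Rule>& forbidden) {
    for (const Rule& rule : forbidden) {
        assert(rule.first < a.size() && rule.second < a[rule.first].size());
        a[rule.first][rule.second] = 0.0;
    }
    for (std::vector<double>& row : a) {
        double sum = 0.0;
        for (double p : row) sum += p;
        if (sum > 0.0) {
            for (double& p : row) p /= sum;
        }
    }
}
```

In a real training loop this would presumably be applied after each re-estimation step, so that a forbidden transition can never regain probability mass.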
Best wishes,
Gang Chen
2013/4/21 Mikel Forcada <[email protected]>
Hi Gang,
your code seems to work correctly, at least in the few tests I have
performed. There is only one thing that I didn't like: the
program exits silently unless one of the options -r/-f is given.
It should report an error instead.
As Jimmy O'Regan says in his message, and even if it is not a
requirement, we do need to see if you could prepare a C++ version
of the coding challenge, as this is the language that is going to
be used for the sliding-window part-of-speech tagger. Can you do
that, Gang?
As to the intuitive idea, well, the sliding-window PoS tagger is
different from an HMM PoS tagger, but I would not say that it can
use "more information". In fact, if both the previous and the
following word are ambiguous, an HMM PoS tagger is actually using
a wider context, as it will not make a decision until a
nonambiguous word appears.
It is just a different way of tagging. We suspect it uses more
parameters, but it can easily be turned into a finite-state
transducer after training.
We are also interested in the way you can introduce restrictions
(FORBID) in the tagger.
I look forward to hearing from you.
Best
Mikel
--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff