Gang,
could you provide a Makefile for it? I cannot simply compile it with a g++ command --- I'd need to know the libraries, and I don't want to work that hard ;-)

It would be really nice if you could also provide code for a round-trip converter that reads the output of this one and regenerates the input, as this is something you will have to deal with anyway when you write the tagger.

All the best

Mikel

On 04/21/2013 11:41 AM, Gang Chen wrote:
Hi, Mikel, Jimmy,

Thanks for your responses!

I have written a C++ version, with some handling of option errors. Here is the code:
https://github.com/elephantgcc/gsoc-2013/blob/master/ApertiumFilter.cpp

Your explanations are very helpful for me to update my understanding.
As to the restrictions: as far as I know from the documentation, the current HMM tagger handles them by modifying the transition matrix during training, so that FORBID transitions have a probability of 0. However, in the case of the sliding-window PoS tagger, I think I need some time to go into the details of the tagger.
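
To make the idea above concrete, here is a minimal C++ sketch of zeroing forbidden tag bigrams in a transition matrix. It is an illustration only, not the Apertium tagger's actual code: the `Matrix` alias, the `apply_forbid` name, the use of tag indices, and the row renormalization step are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical representation: a[i][j] = P(tag j follows tag i).
using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: set every forbidden transition to 0, then
// renormalize each row so that it sums to 1 again (renormalizing is an
// assumption of this sketch, not something stated in the thread).
void apply_forbid(Matrix& a,
                  const std::vector<std::pair<std::size_t, std::size_t>>& forbidden) {
    for (const auto& rule : forbidden) {
        assert(rule.first < a.size() && rule.second < a[rule.first].size());
        a[rule.first][rule.second] = 0.0;  // FORBID: tag `first` followed by tag `second`
    }
    for (auto& row : a) {
        double sum = 0.0;
        for (double p : row) sum += p;
        if (sum > 0.0) {  // leave an all-zero row untouched
            for (double& p : row) p /= sum;
        }
    }
}
```

With a two-tag matrix whose first row is (0.5, 0.5), forbidding the pair (0, 1) leaves that row as (1, 0) and does not change the other row.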


Best wishes,

Gang Chen



2013/4/21 Mikel Forcada <[email protected]>

    Hi Gang,
    your code seems to work correctly, at least in a few tests I have
    performed.  There is only one thing that I didn't like: the
    program silently exits unless there is one of the options -r/-f.
    It should give an error.
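
The behaviour asked for above could look roughly like the following C++ sketch. It is not taken from ApertiumFilter.cpp: the `ModeResult` struct, the `parse_mode` name, and the error text are hypothetical, and the sketch assumes that when both options appear the last one simply wins.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct ModeResult {
    char flag;          // 'r', 'f', or '\0' when neither option was given
    std::string error;  // empty on success
};

// Scan the arguments for -r or -f and report an error instead of
// exiting silently when neither is present.
ModeResult parse_mode(const std::vector<std::string>& args) {
    ModeResult result{'\0', ""};
    for (const auto& arg : args) {
        if (arg == "-r") result.flag = 'r';
        else if (arg == "-f") result.flag = 'f';
    }
    if (result.flag == '\0') {
        result.error = "error: one of the options -r or -f is required";
    }
    return result;
}
```

A caller would typically print `result.error` to standard error and return a nonzero exit status when the string is not empty.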

    As Jimmy O'Regan says in his message, and even if it is not a
    requirement, we do need to see if you could prepare a C++ version
    of the coding challenge, as this is the language that is going to
    be used for the sliding-window part-of-speech tagger. Can you do
    that, Gang?

    As to the intuitive idea, well, the sliding-window PoS tagger is
    different from an HMM PoS tagger, but I would not say that it can
    use "more information". In fact, if both the previous and the
    following word are ambiguous, an HMM PoS tagger is actually using
    a wider context, as it will not make a decision until a
    nonambiguous word appears.

    It is just a different way of tagging. We suspect it uses more
    parameters, but it can easily be turned into a finite-state
    transducer after training.

    We are also interested in the way you can introduce restrictions
    (FORBID) in the tagger.

    I look forward to hearing from you.

    Best

    Mikel

-- Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
    Departament de Llenguatges i Sistemes Informàtics
    Universitat d'Alacant
    E-03071 Alacant, Spain
    Phone: +34 96 590 9776
    Fax: +34 96 590 9326




--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
