Also shown here for Spanish: http://www.lrec-conf.org/proceedings/lrec2012/summaries/1075.html
Fran

On Wed, 20 Mar 2013 at 23:00 +0000, Trosterud Trond wrote:
> My experience: already 100 handwritten .cg rules give PoS marking
> with an accuracy of around 93-95%. What takes a bit more is
> disambiguation of the full tag string.
>
> So .cg as part of the process should be considered.
>
> Trond
> Sent from a Samsung tablet
>
> Francis Tyers <[email protected]> wrote:
> > I also like the idea! Especially if we can have an optional
> > integration of CG to allow people to write rules to tag the
> > corpus -- if they so wish. In the end we win both ways: those who
> > are looking for a tagged corpus for training the tagger get it,
> > and those who would also like constraint rules get them too.
> >
> > I'll try writing it up now. :)
> >
> > Fran
> >
> > On Tue, 19 Mar 2013 at 20:19 +0100, Mikel Forcada wrote:
> > > +1
> > >
> > > Write it up, Gema! ;-)
> > >
> > > You'll mentor it with a co-mentor (!) I can easily think of a
> > > couple of names....
> > >
> > > Mikel
> > >
> > > On 03/19/2013 01:49 PM, Gema Ramírez-Sánchez wrote:
> > > > Hi there,
> > > >
> > > > as I see it, there is a need in Apertium for most released
> > > > pairs and the ones to come: better PoS taggers. In my
> > > > experience, training supervised taggers has never been a waste
> > > > of time, quite the opposite: we get a quality improvement and,
> > > > at the same time, we create invaluable linguistic resources
> > > > such as disambiguated tagged corpora.
> > > >
> > > > So, how to turn this into a GSoC idea?
> > > >
> > > > Following the wiki pages on how to train a tagger (see below),
> > > > and taking into account that the one on supervised training is
> > > > still to be written, this project would at least involve:
> > > >
> > > > 0) (must-have) making an interface where you can upload a raw
> > > >    text of, say, 25,000 words or (optional) create a corpus of
> > > >    X size for a given language from Wikipedia
> > > >
> > > > and, by choosing a language for which there is at least a
> > > > morphological dictionary in Apertium, you have:
> > > >
> > > > 1) (must-have) a non-disambiguated tagged corpus
> > > > 2) (must-have) a .dic file
> > > > 3) (must-have) a simple, fully functional, precalculated .tsx
> > > >    file in which coarse tags are defined taking into account
> > > >    the information from the .dic file
> > > >
> > > > then it will also include:
> > > >
> > > > 4) (must-have) a user-friendly interface to take your
> > > >    non-disambiguated tagged corpus and disambiguate it manually
> > > > 5) (must-have) user-friendly documentation on how to improve
> > > >    the .tsx (refine coarse tags, write rules)
> > > > 6) (must-have) a user-friendly interface to train a supervised
> > > >    tagger
> > > > 7) (must-have) some way to evaluate the performance of a .prob
> > > >
> > > > I'm surely forgetting some must-haves and I have to think
> > > > about it a little bit more, but what do you think about the
> > > > general idea of having tools to train supervised taggers?
> > > >
> > > > Another important question: I won't be able to technically
> > > > mentor this project, so, if no one else is interested...
> > > >
> > > > Best,
> > > >
> > > > Gema.
> > > >
> > > > --------------------
> > > > How to train a tagger in Apertium:
> > > > http://wiki.apertium.org/wiki/Tagger_training
> > > > http://wiki.apertium.org/wiki/Target_language_tagger_training
> > > > http://wiki.apertium.org/wiki/Unsupervised_tagger_training

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
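
For the "some way to evaluate performance of a .prob" item, and for the distinction made in the thread between PoS accuracy and accuracy on the full tag string, here is a minimal sketch of what such an evaluation could compute. It is not an existing Apertium tool. It assumes the hand-disambiguated corpus and the tagger output have already been aligned token by token, with one analysis per token written as lemma<tag1><tag2>... The function names and the example tags are illustrative only.

```python
import re
from typing import List, Tuple

# Matches one tag in an analysis such as "house<n><sg>".
TAG_RE = re.compile(r"<([^<>]+)>")


def split_analysis(analysis: str) -> Tuple[str, List[str]]:
    """Split 'lemma<tag1><tag2>' into ('lemma', ['tag1', 'tag2'])."""
    lemma = analysis.split("<", 1)[0]
    return lemma, TAG_RE.findall(analysis)


def tagger_accuracy(gold: List[str], predicted: List[str]) -> Tuple[float, float]:
    """Return (PoS accuracy, full-analysis accuracy) over aligned tokens.

    PoS accuracy compares only the first tag of each analysis;
    full-analysis accuracy requires lemma and all tags to match.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot evaluate an empty corpus")

    pos_hits = 0
    full_hits = 0
    for gold_item, pred_item in zip(gold, predicted):
        gold_lemma, gold_tags = split_analysis(gold_item)
        pred_lemma, pred_tags = split_analysis(pred_item)
        if gold_tags[:1] == pred_tags[:1]:
            pos_hits += 1
        if gold_lemma == pred_lemma and gold_tags == pred_tags:
            full_hits += 1

    total = len(gold)
    return pos_hits / total, full_hits / total


# Made-up example: four tokens, one number error and one PoS error.
gold = ["the<det><def><sp>", "house<n><sg>", "be<vbser><pri><p3><sg>", "big<adj>"]
predicted = ["the<det><def><sp>", "house<n><pl>", "be<vbser><pri><p3><sg>", "big<n><sg>"]

pos_acc, full_acc = tagger_accuracy(gold, predicted)
print(f"PoS accuracy: {pos_acc:.2f}, full-tag accuracy: {full_acc:.2f}")
```

Reporting both figures keeps the two cases apart: a tagger can get the part of speech right while still picking the wrong number, person or case.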
