On 2018-05-29 11:12, Grzegorz Kulik wrote:
Hi,

I've been developing the Polish - Silesian Apertium pair for some time,
and the translations have become reasonable, so I reckon it's time to
publish them. Of the 10,000 most frequent Polish words, it covers
nearly 9,000 (the rest are on their way), and it handles more than
21,000 words altogether.

https://github.com/gkkulik/apertium-pol

https://github.com/gkkulik/apertium-pol-szl

https://github.com/gkkulik/apertium-szl

It still needs some fine-tuning, as it sometimes gives slightly amusing
output. I improve it regularly because I use it daily to translate
news, so I fix any mistakes I spot. I hope you people can also
give me some tips, since you obviously know much more about the
technical aspects of Apertium.

Wow, great! Where is the news published?

Questions:

I want to improve the translation by developing hand-tagged corpora for
both languages. What size do I need to make it reasonable?

Well, starting from 10,000 tokens or so. You might be able to convert
some sentences from an existing corpus (e.g. UD_Polish), but it might
be better to tag from scratch. I would start by trying the unigram
tagger (it's much easier to train) and, if that is not good enough,
try the perceptron tagger.
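
To give a rough idea of why a unigram tagger is easy to train, here is a
minimal Python sketch of the general technique: count how often each
surface form carries each tag in the hand-tagged corpus, then pick the
most frequent of the candidate tags. This is only an illustration, not
the code of apertium-tagger; the function names, the toy corpus and the
tag labels are made up for the example.

```python
from collections import Counter, defaultdict


def train_unigram(tagged_tokens):
    """Count how often each surface form appears with each tag."""
    counts = defaultdict(Counter)
    for form, tag in tagged_tokens:
        counts[form.lower()][tag] += 1
    return counts


def tag_unigram(counts, form, candidates):
    """Choose the candidate tag seen most often with this form.

    Falls back to the first candidate when the form never occurred
    in the training data.
    """
    seen = counts.get(form.lower())
    if not seen:
        return candidates[0]
    return max(candidates, key=lambda tag: seen[tag])


# Toy hand-tagged corpus (hypothetical data): "piec" can be a noun
# ("oven") or a verb ("to bake").
corpus = [("piec", "n"), ("piec", "n"), ("piec", "vblex"), ("dom", "n")]
model = train_unigram(corpus)
```

With this toy model, tag_unigram(model, "piec", ["vblex", "n"]) picks
"n", because the noun reading was seen more often. The more tokens you
tag by hand, the fewer forms fall through to the fallback.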

There was a great PDF Apertium developer manual but I cannot find it
anywhere. Can anybody point me in the right direction?

Is this the one you are referring to?

http://xixona.dlsi.ua.es/~fran/apertium2-documentation.pdf

When the pair is published in trunk and on the website, I want to make
it a media event here in Upper Silesia which means increased traffic.
Is that okay? Sorry if this question is silly. :)


Yes that would be great!

Would you be interested in moving the code to the Apertium project to
be able to take advantage of including it in the website and APy?

Tino: Do you know what the process would be for that?

Fran

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff