On Mon, 28 Mar 2011 at 16:22 +0200, Kevin Brubeck Unhammer wrote:
> mirlan <[email protected]> writes:
>
> >> ** If you use trmorph, how will you trim the lemmas to the contents of
> >> the bilingual dictionary ?
> >>
> > I am working on it.
>
> Please explain how ;-)
>
> The regular method[1] is to take an lttoolbox analyser, and find the
> full set of possible input-output pairs using the program lt-expand, and
> run that through the translator to check for errors. Unfortunately, when
> your analyser is in SFST/HFST-format -- which opens for lots of "loops"
> in the analyser -- things get a bit more complicated. Brian Croom's
> hfst-fst2strings[2] attempts to do something similar to lt-expand, while
> providing some ways to filter the possibilities.

That is testvoc. I was talking more about trimming the dictionaries
before testvoc: for example, by expanding the bilingual dictionary and
then making a new analyser/generator which is a subset of the old one.
Ryan has a script for this in sme-fin (which I think we also use in other
language pairs with Sámi), and I have a script for it in af-nl and ca-sc.
But because trmorph is laid out differently from both of those, it will
probably be necessary to write one from scratch.

> >> * How will you make the bilingual lexicon ? I presume there are few
> >> freely-available (e.g. open-source/free software) dictionaries, so you
> >> will probably have to build your own. Someone with experience of
> >> Apertium can do ~400 words in a day, so we would like to see a start on
> >> the lexicon to make sure you understand the problems involved.
> >
> > Right now I have a StarDict tr-ky dictionary, I hope it could help me.
>
> Is there a link? Does it have part-of-speech (word class) information?
> (That would make it a lot easier to use.)
>
> >> * It would be a good idea to start looking at any transfer
> >> (syntactic/morphological) issues between the two languages.
> >
> > tr-ky have some similarities […]
>
> We are more interested in the differences ;) E.g. differences in case
> system, inflection, word order, etc.
>
> The best way to document such differences (or similarities) is to make a
> page like http://wiki.apertium.org/wiki/English_and_French/Pending_tests
> which you can then test your language pair on.
>
> Do come on IRC more so we can discuss the issues and any possible
> problems you have; we don't want anyone to waste lots of time on
> something that could be solved by discussing it on IRC :)

Agreed, being on IRC is probably a good idea to avoid wasting time on
problems which are easily solved.
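
To make the trimming idea concrete, here is a minimal sketch in Python. It
is not Ryan's script or mine, and it will not work on trmorph as it stands.
It assumes both dictionaries have already been expanded to plain
"left:right" lines where the analysis side looks like lemma<tag><tag>...
(roughly the shape lt-expand prints), that lemmas contain no ":" and that
matching on lemma plus first tag is good enough. The function names and the
toy entries are made up for illustration.

```python
def lemma_and_pos(analysis):
    """Split an analysis such as 'ev<n><pl>' into ('ev', 'n')."""
    lemma, _, rest = analysis.partition("<")
    first_tag = rest.split(">", 1)[0] if rest else ""
    return lemma, first_tag


def bidix_keys(expanded_bidix_lines):
    """Collect (lemma, first tag) pairs from the source side of the bidix."""
    keys = set()
    for line in expanded_bidix_lines:
        left, sep, _ = line.strip().partition(":")
        if sep:  # skip lines that are not 'left:right'
            keys.add(lemma_and_pos(left))
    return keys


def trim_analyser(expanded_analyser_lines, keys):
    """Keep only analyser entries whose lemma and first tag are in the bidix."""
    kept = []
    for line in expanded_analyser_lines:
        _surface, sep, analysis = line.strip().partition(":")
        if sep and lemma_and_pos(analysis) in keys:
            kept.append(line.strip())
    return kept


if __name__ == "__main__":
    # Toy data, invented for this example only.
    bidix = ["ev<n>:үй<n>"]
    analyser = [
        "evler:ev<n><pl><nom>",
        "kitap:kitap<n><sg><nom>",
        "ev:ev<n><sg><nom>",
    ]
    for entry in trim_analyser(analyser, bidix_keys(bidix)):
        print(entry)
```

The kept entries would then be the input for building the new, smaller
analyser/generator. With an SFST/HFST analyser the expansion step itself is
the hard part because of the loops, so this only covers the filtering.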

Regards,
Fran
