On Mon, 28 Mar 2011 at 16:22 +0200, Kevin Brubeck Unhammer wrote:
> mirlan <[email protected]> writes:
> 
> >> ** If you use trmorph, how will you trim the lemmas to the contents of
> >> the bilingual dictionary?
> > I am working on it.
> 
> Please explain how ;-) 
> 
> The regular method[1] is to take an lttoolbox analyser, and find the
> full set of possible input-output pairs using the program lt-expand, and
> run that through the translator to check for errors. Unfortunately, when
> your analyser is in SFST/HFST format -- which allows for lots of "loops"
> in the analyser -- things get a bit more complicated. Brian Croom's
> hfst-fst2strings[2] attempts to do something similar to lt-expand, while
> providing some ways to filter the possibilities.

That is testvoc. I was talking more about trimming the dictionaries
before testvoc: for example, by expanding the bilingual dictionary and
then making a new analyser/generator that is a subset of the old one,
containing only the lemmas found in the bilingual dictionary.
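
As a rough illustration of that trimming idea, here is a minimal Python
sketch. It assumes the bilingual dictionary has already been expanded with
lt-expand into lines of the form left:right (with :>: or :<: marking
direction-restricted entries), and that the monolingual lexicon can be read
as (lemma, rest) pairs. That pair layout, the function names and the sample
words are all hypothetical; trmorph's actual lexicon files would need their
own parser.

```python
import re


def bidix_lemmas(expanded_lines):
    """Collect source-side lemmas from expanded bilingual-dictionary lines."""
    lemmas = set()
    for line in expanded_lines:
        line = line.strip()
        if not line:
            continue
        # Take the left side only. Entries are assumed to look like
        # 'left:right', 'left:>:right' or 'left:<:right'.
        # Lemmas that themselves contain ':' are not handled here.
        left = re.split(r":[<>]:|:", line, maxsplit=1)[0]
        # The lemma is everything before the first tag, e.g. <n>.
        lemma = left.split("<", 1)[0]
        if lemma:
            lemmas.add(lemma)
    return lemmas


def trim_lexicon(entries, lemmas):
    """Keep only the (lemma, rest) entries whose lemma is in the bidix."""
    return [(lemma, rest) for lemma, rest in entries if lemma in lemmas]


# Made-up sample data, for illustration only.
sample_bidix = ["ev<n>:үй<n>", "kitap<n>:>:китеп<n>"]
sample_lexicon = [("ev", "N"), ("araba", "N"), ("kitap", "N")]
trimmed = trim_lexicon(sample_lexicon, bidix_lemmas(sample_bidix))
```

The trimmed entry list would then be written back out and compiled into the
new, smaller analyser/generator; that step depends on the toolchain and is
left out here.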

Ryan has a script for this in sme-fin (that I think we also use in other
language pairs with Sámi) and I have a script for it in af-nl and ca-sc.

But because trmorph is laid out differently from both of those, it will
probably be necessary to write one from scratch.

> > * How will you make the bilingual lexicon? I presume there are few
> > freely-available (e.g. open-source/free software) dictionaries, so you
> > will probably have to build your own. Someone with experience of
> > Apertium can do ~400 words in a day, so we would like to see a start on
> > the lexicon to make sure you understand the problems involved.
> >
> > Right now I have a StarDict tr-ky dictionary; I hope it could help me.
> 
> Is there a link? Does it have part-of-speech (word class) information?
> (That would make it a lot easier to use.)
> 
> > * It would be a good idea to start looking at any transfer
> > (syntactic/morphological) issues between the two languages.
> >
> > tr-ky have some similarities […]
> 
> We are more interested in the differences ;) E.g. differences in case
> system, inflection, word order, etc.
> 
> The best way to document such differences (or similarities) is to make a
> page like http://wiki.apertium.org/wiki/English_and_French/Pending_tests
> which you can then test your language pair on.
> 
> 
> Do come on IRC more so we can discuss the issues and any possible
> problems you have; we don't want anyone to waste lots of time on
> something that could be solved by discussing it on IRC :)

Agreed, being on IRC is probably a good idea, to avoid wasting time on
problems that are easily solved.

Regards,

Fran


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
