On Fri, 28 Sep 2012 at 21:07 +0000, Francis Tyers wrote:
> On Fri, 28 Sep 2012 at 16:53 -0400, Steve Rawlinson wrote:
> > Hello,
> > 
> > 
> > I'm looking to hire/sponsor a developer that is familiar with Apertium
> > dix/metadix formats.  I hope it's ok to post this request here.  I'd
> > be happy to have the resulting work put under the GPL (or similar open
> > source license) and contributed back to Apertium.  Perhaps as part of
> > dixtools or as a new feature?
> > 
> > 
> > Here is what I need: a program that can read a bilingual dictionary
> > (dix or metadix?) and output all the word pair translations, including
> > all the conjugations, inflected forms, plurals, etc. (any possible
> > variations on the words) that are available in the dictionary for the
> > left side.  I've looked around in the dixtools and lttoolbox and I
> > don't see anything that does this, but maybe I've missed it?
> > 
> > 
> > This command is pretty close to what I need:
> >  
> >   "apertium-dixtools list pairs"
> > 
> > 
> > But it doesn't seem to do the inflected forms.  If I understand things
> > correctly, this should be possible by making use of the paradigms in
> > the mono dictionaries.
> > 
> > 
> > Here's a quick example of what I need for the Spanish verb "tener":
> > 
> > 
> > tengo/to have
> > tienes/to have
> > tiene/to have
> > tenemos/to have
> > tenéis/to have
> > tienen/to have
> > 
> > 
> > Plus all the other conjugations that are in the mono dictionary
> > paradigm (future, imperfect, etc.).  If someone knows how to do this
> > with the current tools please let me know!
> > 
> > 
> > If nothing currently exists, then adding to dixtools in Java might
> > make the most sense.  I personally prefer to work in Python, but I'd
> > be open to any language.
> > 
> > 
> > If you're interested in working on this feature please post a reply or
> > email me directly with your interest and a proposal (just a quick
> > outline of how you'd do this, costs, etc.)
> 
> Hi there!
> 
> Here is a sequence of commands that will more or less get you what you
> want, without having to use Python, Java or anything else.
> 
> Step 1: Expand the source language morphological dictionary.
> 
> $ lt-expand apertium-en-es.es.dix  > /tmp/es.exp
> 
> $ head /tmp/es.exp 
> abyectas:abyecto<adj><f><pl>
> abyecta:abyecto<adj><f><sg>
> abyectos:abyecto<adj><m><pl>
> 
> Step 2: Pass the lexical form side through the bilingual dictionary.
> 
> $ cat /tmp/es.exp | sed 's/:>:/:/g' | sed 's/:<:/:/g' | cut -f2 -d':' |
> sed 's/^/^/g' | sed 's/$/$/g' |
> lt-proc -b es-en.autobil.bin > /tmp/es-en.exp
> 
> ^abyecto<adj><f><pl>/abject<adj><f><pl>$
> ^abyecto<adj><f><sg>/abject<adj><f><sg>$
> ^abyecto<adj><m><pl>/abject<adj><m><pl>$
> ^abyecto<adj><m><sg>/abject<adj><m><sg>$
> 
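The preparation half of Step 2 can be tried on its own, without Apertium installed. Below is a minimal sketch: `lexical_forms` is a hypothetical helper name (not part of lttoolbox), and the two sample lines are made up in the style of the lt-expand output above. It only rewrites text; the result would still need to be piped into `lt-proc -b`.

```shell
#!/bin/sh
# Hypothetical helper (not an Apertium tool): turn lt-expand output
# read from stdin into the ^lexical form$ lines fed to lt-proc -b.
lexical_forms() {
    # 1. Normalise the :>: and :<: separators to a plain colon.
    # 2. Drop the surface form, i.e. everything up to the first colon.
    # 3. Prepend a literal ^ and append a literal $ to what is left.
    sed -e 's/:[<>]:/:/' -e 's/^[^:]*://' -e 's/^/^/' -e 's/$/$/'
}

# Sample input in lt-expand format (the second line uses the :>: separator).
printf '%s\n' \
    'abyectas:abyecto<adj><f><pl>' \
    'abyecta:>:abyecto<adj><f><sg>' | lexical_forms
# prints:
# ^abyecto<adj><f><pl>$
# ^abyecto<adj><f><sg>$
```

In the full pipeline this would be used as `lexical_forms < /tmp/es.exp | lt-proc -b es-en.autobil.bin > /tmp/es-en.exp`.
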
> Step 3: Paste the output together.
> 
> $ paste /tmp/es.exp /tmp/es-en.exp | sed 's/:>:/:/g' | sed 's/:<:/:/g' |
> sed 's/:/\t/g' | sed 's/\//\t/1' | cut -f1,4 | cut -f1 -d'<' |
> sed 's/\t/\/ /g' | head
> 
> abyectas/ abject
> abyecta/ abject
> abyectos/ abject
> abyecto/ abject
> abyectísimas/ abject
> abyectísima/ abject
> 
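The pairing in Step 3 can also be written as a small function, which may be easier to adapt than the one-liner. This is only a sketch: `pair_forms` is a hypothetical name, the sample files are hand-written in the formats shown above, and it assumes both files have the same number of lines in the same order (which holds when the second file was produced from the first as in Step 2). It swaps the sed/cut chain for awk but aims at the same "surface/ lemma" output.

```shell
#!/bin/sh
# Hypothetical helper: join lt-expand output ($1) with lt-proc -b
# output ($2), line by line, into "surface form/ target lemma" pairs.
pair_forms() {
    paste "$1" "$2" | awk -F'\t' '{
        surface = $1
        sub(":.*", "", surface)       # keep the text before the first colon
        target = $2
        sub("^[^/]*/", "", target)    # drop the source side up to the first slash
        sub("[<$/].*", "", target)    # keep only the first target lemma, no tags
        print surface "/ " target
    }'
}

# Small self-contained demonstration with two sample lines per file.
exp=$(mktemp)
bil=$(mktemp)
printf '%s\n' 'abyectas:abyecto<adj><f><pl>' \
              'abyecta:abyecto<adj><f><sg>' > "$exp"
printf '%s\n' '^abyecto<adj><f><pl>/abject<adj><f><pl>$' \
              '^abyecto<adj><f><sg>/abject<adj><f><sg>$' > "$bil"
pair_forms "$exp" "$bil"
rm -f "$exp" "$bil"
# prints:
# abyectas/ abject
# abyecta/ abject
```

With the real files this would be `pair_forms /tmp/es.exp /tmp/es-en.exp`. Where the bilingual dictionary offers several translations for one entry, this sketch keeps only the first.
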
> If you want something more involved, you could try getting into contact
> with Prompsit Language Engineering, who offer services based around
> Apertium. Their email: [email protected] 
> 
> Regards,

PS. What were you intending to use the output for, if you don't mind
me asking? :)

F.


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
