On Fri, 28 Sep 2012 at 16:53 -0400, Steve Rawlinson wrote:
> Hello,
>
> I'm looking to hire/sponsor a developer that is familiar with Apertium
> dix/metadix formats. I hope it's ok to post this request here. I'd
> be happy to have the resulting work put under the GPL (or similar open
> source license) and contributed back to Apertium. Perhaps as part of
> dixtools or as a new feature?
>
> Here is what I need, a program that can read a bilingual dictionary
> (dix or metadix?) and output all the word pair translations, including
> all the conjugations, inflected forms, plurals, etc. (any possible
> variations on the words) that are available in the dictionary for the
> left side. I've looked around in the dixtools and lttoolbox and I
> don't see anything that does this, but maybe I've missed it?
>
> This command is pretty close to what I need:
>
> "apertium-dixtools list pairs"
>
> But it doesn't seem to do the inflected forms. If I understand things
> correctly, this should be possible by making use of the paradigms in
> the mono dictionaries.
>
> Here's a quick example of what I need for the Spanish verb "tener":
>
> tengo/to have
> tienes/to have
> tiene/to have
> tenemos/to have
> tenéis/to have
> tienen/to have
>
> Plus all the other conjugations that are in the mono dictionary
> paradigm (future, imperfect, etc.) If someone knows how to do this
> with the current tools please let me know!
>
> If nothing currently exists, then adding to dixtools in Java might
> make the most sense. I personally prefer to work in Python, but I'd
> be open to any language.
>
> If you're interested in working on this feature please post a reply or
> email me directly with your interest and a proposal (just a quick
> outline of how you'd do this, costs, etc.)

Hi there!

Here is a sequence of commands that will more or less get you what you
want, without having to use Python or Java or anything.

Step 1: Expand the source language morphological dictionary.

  $ lt-expand apertium-en-es.es.dix > /tmp/es.exp
  $ head /tmp/es.exp
  abyectas:abyecto<adj><f><pl>
  abyecta:abyecto<adj><f><sg>
  abyectos:abyecto<adj><m><pl>

Step 2: Pass the lexical form side through the bilingual dictionary.

  $ cat /tmp/es.exp | sed 's/:>:/:/g' | sed 's/:<:/:/g' | cut -f2 -d':' | sed 's/^/^/g' | sed 's/$/$/g' | lt-proc -b es-en.autobil.bin > /tmp/es-en.exp

The first lines of /tmp/es-en.exp look like this:

  ^abyecto<adj><f><pl>/abject<adj><f><pl>$
  ^abyecto<adj><f><sg>/abject<adj><f><sg>$
  ^abyecto<adj><m><pl>/abject<adj><m><pl>$
  ^abyecto<adj><m><sg>/abject<adj><m><sg>$

Step 3: Paste the output together.

  $ paste /tmp/es.exp /tmp/es-en.exp | sed 's/:>:/:/g' | sed 's/:<:/:/g' | sed 's/:/\t/g' | sed 's/\//\t/1' | cut -f1,4 | cut -f1 -d'<' | sed 's/\t/\/ /g' | head
  abyectas/ abject
  abyecta/ abject
  abyectos/ abject
  abyecto/ abject
  abyectísimas/ abject
  abyectísima/ abject

The final "head" only previews the first lines; drop it to get the full
list.

If you want something more involved, you could try getting in contact
with Prompsit Language Engineering, who offer services based around
Apertium. Their email: [email protected]

Regards,

Fran

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
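
Addendum: the paste-and-trim logic of Step 3 can also be written as a small shell function, which may be easier to read and adapt than the long sed/cut chain. This is only a sketch under stated assumptions: pair_forms is a hypothetical name (not part of Apertium), the two input files are assumed to have the line formats shown in Steps 1 and 2, and it additionally assumes that lt-proc -b marks entries missing from the bilingual dictionary with a leading "@", which it skips.

```shell
#!/bin/sh
# Hypothetical helper, not part of Apertium or lttoolbox.
# $1: lt-expand output, lines like   abyectas:abyecto<adj><f><pl>
# $2: lt-proc -b output, lines like  ^abyecto<adj><f><pl>/abject<adj><f><pl>$
# Prints one "surface form/ target lemma" pair per line.
pair_forms() {
    paste "$1" "$2" | awk -F '\t' '
    {
        # Surface form: everything before the first ":" of the expanded line
        # (this also covers the ":>:" and ":<:" direction markers).
        surface = $1
        sub(/:.*/, "", surface)

        # Bilingual output: take the first translation after the first "/".
        n = split($2, parts, "/")
        if (n < 2) next
        target = parts[2]

        # Keep only the lemma: cut at the first tag or the closing "$".
        sub(/[<$].*/, "", target)

        # Assumption: a leading "@" marks a word missing from the bidix.
        if (target == "" || target ~ /^@/) next

        print surface "/ " target
    }'
}
```

With the files produced in Steps 1 and 2, it would be called as pair_forms /tmp/es.exp /tmp/es-en.exp. Like the original pipeline, it keeps only the first translation when the bilingual dictionary offers several.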
