On Fri, 28 Sep 2012 at 21:07 +0000, Francis Tyers wrote:
> On Fri, 28 Sep 2012 at 16:53 -0400, Steve Rawlinson wrote:
> > Hello,
> >
> > I'm looking to hire/sponsor a developer who is familiar with Apertium
> > dix/metadix formats. I hope it's OK to post this request here. I'd
> > be happy to have the resulting work put under the GPL (or a similar
> > open source license) and contributed back to Apertium. Perhaps as
> > part of dixtools or as a new feature?
> >
> > Here is what I need: a program that can read a bilingual dictionary
> > (dix or metadix?) and output all the word pair translations,
> > including all the conjugations, inflected forms, plurals, etc. (any
> > possible variations on the words) that are available in the
> > dictionary for the left side. I've looked around in dixtools and
> > lttoolbox and I don't see anything that does this, but maybe I've
> > missed it?
> >
> > This command is pretty close to what I need:
> >
> > "apertium-dixtools list pairs"
> >
> > But it doesn't seem to do the inflected forms. If I understand things
> > correctly, this should be possible by making use of the paradigms in
> > the mono dictionaries.
> >
> > Here's a quick example of what I need for the Spanish verb "tener":
> >
> > tengo/to have
> > tienes/to have
> > tiene/to have
> > tenemos/to have
> > tenéis/to have
> > tienen/to have
> >
> > Plus all the other conjugations that are in the mono dictionary
> > paradigm (future, imperfect, etc.). If someone knows how to do this
> > with the current tools, please let me know!
> >
> > If nothing currently exists, then adding to dixtools in Java might
> > make the most sense. I personally prefer to work in Python, but I'd
> > be open to any language.
> >
> > If you're interested in working on this feature, please post a reply
> > or email me directly with your interest and a proposal (just a quick
> > outline of how you'd do this, costs, etc.)
>
> Hi there!
>
> Here is a sequence of commands that will more or less get you what you
> want, without having to use Python or Java or anything.
>
> Step 1: Expand the source language morphological dictionary.
>
> $ lt-expand apertium-en-es.es.dix > /tmp/es.exp
>
> $ head /tmp/es.exp
> abyectas:abyecto<adj><f><pl>
> abyecta:abyecto<adj><f><sg>
> abyectos:abyecto<adj><m><pl>
>
> Step 2: Pass the lexical form side through the bilingual dictionary.
>
> $ cat /tmp/es.exp | sed 's/:>:/:/g' | sed 's/:<:/:/g' |
>   cut -f2 -d':' | sed 's/^/^/g' | sed 's/$/$/g' |
>   lt-proc -b es-en.autobil.bin > /tmp/es-en.exp
>
> ^abyecto<adj><f><pl>/abject<adj><f><pl>$
> ^abyecto<adj><f><sg>/abject<adj><f><sg>$
> ^abyecto<adj><m><pl>/abject<adj><m><pl>$
> ^abyecto<adj><m><sg>/abject<adj><m><sg>$
>
> Step 3: Paste the output together.
>
> $ paste /tmp/es.exp /tmp/es-en.exp | sed 's/:>:/:/g' | sed 's/:<:/:/g' |
>   sed 's/:/\t/g' | sed 's/\//\t/1' | cut -f1,4 | head |
>   cut -f1 -d'<' | sed 's/\t/\/ /g' | head
>
> abyectas/ abject
> abyecta/ abject
> abyectos/ abject
> abyecto/ abject
> abyectísimas/ abject
> abyectísima/ abject
>
> If you want something more involved, you could try getting into contact
> with Prompsit Language Engineering, who offer services based around
> Apertium. Their email: [email protected]
>
> Regards,
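
In case a reusable version of Step 3 is handy, here is a rough sketch
that does the pairing in a single awk pass instead of a chain of sed and
cut calls. The function name pair_forms is made up for illustration (it
is not part of lttoolbox or dixtools). It assumes the two files are
line-aligned, as in Steps 1 and 2, and that lt-proc -b marks entries
missing from the bilingual dictionary with a leading "@"; treat that
filter as an assumption and drop it if your output looks different.

```shell
#!/bin/sh
# pair_forms: hypothetical helper, not an Apertium tool.
# $1: lt-expand output, lines like   abyectas:abyecto<adj><f><pl>
# $2: lt-proc -b output, lines like  ^abyecto<adj><f><pl>/abject<adj><f><pl>$
# Prints one "surface/ translation" pair per line.
pair_forms() {
    paste "$1" "$2" | awk -F'\t' '
    {
        # Surface form: everything before the first ":" (this also
        # drops the ":>:" and ":<:" direction markers).
        surface = $1
        sub(/:.*/, "", surface)

        # Translation: text after the first "/" in the bilingual output.
        i = index($2, "/")
        if (i == 0) next
        target = substr($2, i + 1)

        # Cut the tags (and anything after them) off the target lemma.
        sub(/<.*/, "", target)

        # Assumption: "@" marks a lemma the bilingual dictionary lacks.
        if (target == "" || target ~ /^@/) next

        print surface "/ " target
    }'
}
```

After sourcing the file it would be called as
"pair_forms /tmp/es.exp /tmp/es-en.exp". Unlike the pipeline in Step 3,
it does not stop after the first ten lines, so pipe it through head if
you only want a sample.
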
PS. What were you intending to use the output for -- if you don't mind me
asking :)

F.
