On Fri, 28 Sep 2012 at 16:53 -0400, Steve Rawlinson wrote:
> Hello,
> 
> 
> I'm looking to hire/sponsor a developer that is familiar with Apertium
> dix/metadix formats.  I hope it's ok to post this request here.  I'd
> be happy to have the resulting work put under the GPL (or similar open
> source license) and contributed back to Apertium.  Perhaps as part of
> dixtools or as a new feature?
> 
> 
> Here is what I need, a program that can read a bilingual dictionary
> (dix or metadix?) and output all the word pair translations, including
> all the conjugations, inflected forms, plurals, etc. (any possible
> variations on the words) that are available in the dictionary for the
> left side.  I've looked around in the dixtools and lttoolbox and I
> don't see anything that does this, but maybe I've missed it?
> 
> 
> This command is pretty close to what I need:
>  
>   "apertium-dixtools list pairs"
> 
> 
> But it doesn't seem to do the inflected forms.  If I understand things
> correctly, this should be possible by making use of the paradigms in
> the mono dictionaries.
> 
> 
> Here's a quick example of what I need for the Spanish verb "tener":
> 
> 
> tengo/to have
> tienes/to have
> tiene/to have
> tenemos/to have
> tenéis/to have
> tienen/to have
> 
> 
> Plus all the other conjugations that are in the mono dictionary
> paradigm (future, imperfect, etc.)  If someone knows how to do this
> with the current tools please let me know!
> 
> 
> If nothing currently exists, then adding to dixtools in Java might
> make the most sense.  I personally prefer to work in Python, but I'd
> be open to any language.
> 
> 
> If you're interested in working on this feature please post a reply or
> email me directly with your interest and a proposal (just a quick
> outline of how you'd do this, costs, etc.)

Hi there!

Here is a sequence of commands that will more or less get you what you
want, without having to write any Python or Java.

Step 1: Expand the source language morphological dictionary.

$ lt-expand apertium-en-es.es.dix  > /tmp/es.exp

$ head /tmp/es.exp 
abyectas:abyecto<adj><f><pl>
abyecta:abyecto<adj><f><sg>
abyectos:abyecto<adj><m><pl>

Step 2: Pass the lexical form side through the bilingual dictionary.

$ cat /tmp/es.exp | sed 's/:>:/:/g' | sed 's/:<:/:/g' | cut -f2 -d':' | \
  sed 's/^/^/g' | sed 's/$/$/g' | lt-proc -b es-en.autobil.bin \
  > /tmp/es-en.exp

$ head /tmp/es-en.exp
^abyecto<adj><f><pl>/abject<adj><f><pl>$
^abyecto<adj><f><sg>/abject<adj><f><sg>$
^abyecto<adj><m><pl>/abject<adj><m><pl>$
^abyecto<adj><m><sg>/abject<adj><m><sg>$

Step 3: Paste the output together.

$ paste /tmp/es.exp /tmp/es-en.exp | sed 's/:>:/:/g' | sed 's/:<:/:/g' | \
  sed 's/:/\t/g' | sed 's/\//\t/1' | cut -f1,4 | cut -f1 -d'<' | \
  sed 's/\t/\/ /g' | head

abyectas/ abject
abyecta/ abject
abyectos/ abject
abyecto/ abject
abyectísimas/ abject
abyectísima/ abject
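
If the sed chain in Step 3 is hard to follow, the same join can be
sketched with paste and a short awk program. This is an illustration
rather than a tested replacement: pair_forms is a made-up name, and it
assumes the two files line up one-to-one, in the formats shown in Steps 1
and 2. Where the bilingual dictionary gives several translations, only the
first is kept.

```shell
#!/bin/sh
# pair_forms is a hypothetical helper name, not part of Apertium.
# $1: lt-expand output           (surface:lemma<tags>)
# $2: matching lt-proc -b output (^lemma<tags>/translation<tags>$)
pair_forms() {
    paste "$1" "$2" | awk -F'\t' '
    {
        surface = $1
        sub(/:.*/, "", surface)       # keep only the text before the first ":"
        target = $2
        sub("^[^/]*/", "", target)    # drop the source side up to the first "/"
        sub(/[<$].*/, "", target)     # drop the tags and the closing "$"
        print surface "/ " target
    }'
}

# Demo on two sample lines copied from the output shown above.
tmpdir=$(mktemp -d)
printf '%s\n' 'abyectas:abyecto<adj><f><pl>' \
              'abyecta:abyecto<adj><f><sg>' > "$tmpdir/es.exp"
printf '%s\n' '^abyecto<adj><f><pl>/abject<adj><f><pl>$' \
              '^abyecto<adj><f><sg>/abject<adj><f><sg>$' > "$tmpdir/es-en.exp"
pair_forms "$tmpdir/es.exp" "$tmpdir/es-en.exp"
rm -r "$tmpdir"
```

On the real files this would be called as
pair_forms /tmp/es.exp /tmp/es-en.exp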

If you want something more involved, you could try getting into contact
with Prompsit Language Engineering, who offer services based around
Apertium. Their email: [email protected] 

Regards,

Fran


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
