Hi Fran,
Wow, thanks for the helpful instructions and quick reply! It looks like it
will do exactly what I need. I'll give it a try.
As for what I intend to use the output for: I'm looking into creating
a translation dictionary for a certain platform. It would be
sold commercially, but I would of course comply with the terms of the GPL. If
it's profitable, I'd be happy to give back to the Apertium community
by sponsoring work or contributing in some other way.
Thanks!
Steve
On Fri, Sep 28, 2012 at 5:10 PM, Francis Tyers <[email protected]> wrote:
> On Fri, 28 Sep 2012 at 21:07 +0000, Francis Tyers wrote:
> > On Fri, 28 Sep 2012 at 16:53 -0400, Steve Rawlinson wrote:
> > > Hello,
> > >
> > >
> > > I'm looking to hire/sponsor a developer who is familiar with Apertium
> > > dix/metadix formats. I hope it's OK to post this request here. I'd
> > > be happy to have the resulting work put under the GPL (or similar open
> > > source license) and contributed back to Apertium. Perhaps as part of
> > > dixtools or as a new feature?
> > >
> > >
> > > Here is what I need: a program that can read a bilingual dictionary
> > > (dix or metadix?) and output all the word pair translations, including
> > > all the conjugations, inflected forms, plurals, etc. (any possible
> > > variations on the words) that are available in the dictionary for the
> > > left side. I've looked around in the dixtools and lttoolbox and I
> > > don't see anything that does this, but maybe I've missed it?
> > >
> > >
> > > This command is pretty close to what I need:
> > >
> > > "apertium-dixtools list pairs"
> > >
> > >
> > > But it doesn't seem to do the inflected forms. If I understand things
> > > correctly, this should be possible by making use of the paradigms in
> > > the mono dictionaries.
> > >
> > >
> > > Here's a quick example of what I need for the Spanish verb "tener":
> > >
> > >
> > > tengo/to have
> > > tienes/to have
> > > tiene/to have
> > > tenemos/to have
> > > tenéis/to have
> > > tienen/to have
> > >
> > >
> > > Plus all the other conjugations that are in the mono dictionary
> > > paradigm (future, imperfect, etc.). If someone knows how to do this
> > > with the current tools please let me know!
> > >
> > >
> > > If nothing currently exists, then adding to dixtools in Java might
> > > make the most sense. I personally prefer to work in Python, but I'd
> > > be open to any language.
> > >
> > >
> > > If you're interested in working on this feature please post a reply or
> > > email me directly with your interest and a proposal (just a quick
> > > outline of how you'd do this, costs, etc.)
> >
> > Hi there!
> >
> > Here is a sequence of commands that will more or less get you what you
> > want, without having to use Python or Java or anything.
> >
> > Step 1: Expand the source language morphological dictionary.
> >
> > $ lt-expand apertium-en-es.es.dix > /tmp/es.exp
> >
> > $ head /tmp/es.exp
> > abyectas:abyecto<adj><f><pl>
> > abyecta:abyecto<adj><f><sg>
> > abyectos:abyecto<adj><m><pl>
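Inline note on Step 1: each expanded line has the form surface:lexical-form, so the forms of a single lemma (such as the "tener" example from my first mail) can be pulled out before going any further. This is only a minimal sketch. The function name forms_of is my own and not part of lttoolbox, and I'm assuming the lexical form is always the last colon-separated field, including on lines that carry the :>: or :<: direction markers.

```shell
# forms_of LEMMA
# Reads lt-expand output on stdin and prints the surface forms whose
# lexical form has the given lemma. (Hypothetical helper, not an lttoolbox tool.)
forms_of() {
    awk -F':' -v lemma="$1" '{
        lex = $NF                # last field: lemma followed by its tags
        sub(/<.*/, "", lex)      # drop everything from the first tag onwards
        if (lex == lemma) print $1
    }'
}
```

Assuming the same dictionary file as in Step 1, this could be used as: lt-expand apertium-en-es.es.dix | forms_of tener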
> >
> > Step 2: Pass the lexical form side through the bilingual dictionary.
> >
> > $ cat /tmp/es.exp | sed 's/:>:/:/g' | sed 's/:<:/:/g' | cut -f2 -d':' | \
> >     sed 's/^/^/g' | sed 's/$/$/g' | \
> >     lt-proc -b es-en.autobil.bin > /tmp/es-en.exp
> >
> > ^abyecto<adj><f><pl>/abject<adj><f><pl>$
> > ^abyecto<adj><f><sg>/abject<adj><f><sg>$
> > ^abyecto<adj><m><pl>/abject<adj><m><pl>$
> > ^abyecto<adj><m><sg>/abject<adj><m><sg>$
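Inline note on Step 2: the text-munging part of this step (everything before lt-proc) can be wrapped in a small function, which makes it easier to check on its own. A minimal sketch, assuming the input is lt-expand output; the name to_bidix_input is made up for illustration.

```shell
# to_bidix_input
# Reads lt-expand output on stdin and writes one ^lexical-form$ unit per line,
# the shape that the Step 2 pipeline feeds into lt-proc -b.
# (Hypothetical helper name; only sed and cut are used.)
to_bidix_input() {
    sed -e 's/:>:/:/' -e 's/:<:/:/' |   # normalise the :>: and :<: direction markers
    cut -d':' -f2 |                     # keep only the lexical form (lemma + tags)
    sed -e 's/^/^/' -e 's/$/$/'         # wrap each line as ^...$
}
```

With the file names from the steps above, the call would look like: to_bidix_input < /tmp/es.exp | lt-proc -b es-en.autobil.bin > /tmp/es-en.exp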
> >
> > Step 3: Paste the output together.
> >
> > $ paste /tmp/es.exp /tmp/es-en.exp | sed 's/:>:/:/g' | sed 's/:<:/:/g' | \
> >     sed 's/:/\t/g' | sed 's/\//\t/1' | cut -f1,4 | cut -f1 -d'<' | \
> >     sed 's/\t/\/ /g' | head
> >
> > abyectas/ abject
> > abyecta/ abject
> > abyectos/ abject
> > abyecto/ abject
> > abyectísimas/ abject
> > abyectísima/ abject
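Inline note on Step 3: the same join can be written as one function with a single awk call in place of the chain of sed and cut. This is a sketch of my own, not a replacement for the command above. The name pair_forms is hypothetical, and it assumes the two files are line-aligned (line N of the lt-proc -b output corresponds to line N of the lt-expand output) and that only the first translation of each entry is wanted.

```shell
# pair_forms EXPANDED BILINGUAL
# EXPANDED:  lt-expand output (surface:lexical-form per line)
# BILINGUAL: lt-proc -b output for the same lines (^source/target$ per line)
# Prints "surface/ target-lemma" for each line pair.
# (Hypothetical helper; assumes the two files have matching line order.)
pair_forms() {
    paste "$1" "$2" |
    awk -F'\t' '{
        split($1, s, ":")           # s[1] is the surface form
        t = $2
        sub(/^[^\/]*\//, "", t)     # drop the source side up to the first slash
        sub(/<.*/, "", t)           # drop tags and anything after them
        print s[1] "/ " t
    }'
}
```

With the files from Steps 1 and 2 this would be run as: pair_forms /tmp/es.exp /tmp/es-en.exp | head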
> >
> > If you want something more involved, you could try getting into contact
> > with Prompsit Language Engineering, who offer services based around
> > Apertium. Their email: [email protected]
> >
> > Regards,
>
> PS. What were you intending to use the output for -- if you don't mind
> me asking :)
>
> F.
>
>
>
> ------------------------------------------------------------------------------
> Got visibility?
> Most devs has no idea what their production app looks like.
> Find out how fast your code is with AppDynamics Lite.
> http://ad.doubleclick.net/clk;262219671;13503038;y?
> http://info.appdynamics.com/FreeJavaPerformanceDownload.html
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>