Well, of course you can try replacing the variables with paths manually (as
I told you, the ones to replace are the variables starting and ending with
__). I don't think I can help you much more because I have never done this,
but I'm sure that with a bit of patience you will manage ;) Good luck!
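
For what it's worth, a manual substitution could be sketched roughly like
this in Python. It is only an illustration under assumptions, not what the
bitextor build actually does: the helper name fill_template and the example
values are hypothetical, and __PREFIX__ and __PYTHON__ are simply the two
names that show up in your error output. The template may well contain
other placeholders, so the sketch reports them instead of guessing.

```python
import re

# Matches placeholders such as __PREFIX__ or __PYTHON__ (upper-case names only).
PLACEHOLDER = re.compile(r"__([A-Z][A-Z0-9_]*?)__")


def fill_template(text, values):
    """Replace __NAME__ placeholders in text using the values mapping.

    Returns the new text and the sorted names of the placeholders that
    had no entry in values (those are left untouched in the text).
    """
    missing = set()

    def substitute(match):
        name = match.group(1)
        if name in values:
            return values[name]
        missing.add(name)
        return match.group(0)

    return PLACEHOLDER.sub(substitute, text), sorted(missing)


def fill_template_file(template_path, output_path, values):
    """Read a template file, fill it in, and write the result."""
    with open(template_path) as template:
        script, missing = fill_template(template.read(), values)
    with open(output_path, "w") as output:
        output.write(script)
    return missing
```

You would call it as fill_template_file("bitextor-builddics.in",
"bitextor-builddics", {"PREFIX": "/home/per/local", "PYTHON":
"/usr/bin/python"}), with both values adjusted to your machine; any names
it returns are placeholders you still have to resolve by hand.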
Cheers,
Miquel.
2014-02-20 14:11 GMT+01:00 Per Tunedal <[email protected]>:
> Hi Miquel,
> yes, that was what I had in mind. But it doesn't help much, though.
>
> The next dependency is some Python library for Levenshtein distance ...
>
> There must be an easier way to test the script and see if it gives me
> something useful. I'm not interested in testing the other functions
> right now.
>
> Just compile the script somehow? Or just hard code paths into the
> script?
>
> Yours,
> Per Tunedal
>
>
> On Thu, Feb 20, 2014, at 10:46, Miquel Esplà wrote:
> > Hi Per,
> >
> > I didn't try to compile with the version of Python you are using, but you
> > can try to change this condition in configure.ac to do so.
> >
> > Cheers,
> >
> > Miquel.
> >
> >
> > 2014-02-20 10:19 GMT+01:00 Per Tunedal <[email protected]>:
> >
> > > Hi Miquel,
> > > Thanks for your thorough answer.
> > >
> > > I've tried ./autogen.sh
> > > I had to install httrack, but then got:
> > > checking for a Python interpreter with version >= 2.7... none
> > > configure: error: You don't have Python 2.7 or later installed.
> > >
> > > Is it really necessary to update Python?
> > >
> > > It appears that the configure script demands Python >= 2.7. In Debian
> > > Squeeze, Python 2.6.6 is the default.
> > > I'm afraid of messing things up if I install Python manually rather than
> > > with Synaptic. Lots of things depend on Python.
> > >
> > > And upgrading to Debian Wheezy might mess things up as well ...
> > >
> > > Yours,
> > > Per Tunedal
> > >
> > >
> > > On Wed, Feb 19, 2014, at 9:58, Miquel Esplà wrote:
> > >
> > > Hi Per,
> > >
> > > 2014-02-18 21:37 GMT+01:00 Per Tunedal <[email protected]>:
> > >
> > > Hi Miquel,
> > > thank you. Looks like a good approach.
> > >
> > > Looking at the script:
> > > It runs GIZA++ in both directions to begin with? I just have to supply
> > > the bitext files?
> > >
> > >
> > > Yes, you only need to provide the bitext files compressed with gzip.
> > >
> > >
> > >
> > > But the script has some trouble finding the GIZA++ files:
> > > per@Pers-debian:~/script$ sh bitextor-builddics.in sv fr "/home/per/corpora/OpenOffice3.fr-sv.sv" "/home/per/corpora/OpenOffice3.fr-sv.fr" "/home/per/block_world_corpus/GIZA++_wordlists/bitextor/OpenOffice3.gizadict.sv-fr"
> > > TOKENISING THE CORPUS...
> > > Can't open perl script "__PREFIX__/share/bitextor/utils/tokenizer.perl": No such file or directory
> > > gzip: /home/per/corpora/OpenOffice3.fr-sv.sv: not in gzip format
> > > Can't open perl script "__PREFIX__/share/bitextor/utils/tokenizer.perl": No such file or directory
> > > gzip: /home/per/corpora/OpenOffice3.fr-sv.fr: not in gzip format
> > > LOWERCASING THE CORPUS...
> > > FILTERING OUT TOO LONG SENTENCES...
> > > FORMATTING THE CORPUS FOR PROCESSING...
> > > mv: cannot stat "/tmp/tempcorpuspreproc.QP7LM/corpus.clean.sv_corpus.clean.fr.snt": No such file or directory
> > > mv: cannot stat "/tmp/tempcorpuspreproc.QP7LM/corpus.clean.fr_corpus.clean.sv.snt": No such file or directory
> > > mv: cannot stat "/tmp/tempcorpuspreproc.QP7LM/corpus.clean.sv.vcb": No such file or directory
> > > mv: cannot stat "/tmp/tempcorpuspreproc.QP7LM/corpus.clean.fr.vcb": No such file or directory
> > > BUILDING WORD CLASSES FOR IMPROVING ALIGNMENT...
> > > CHECKING COOCURRENCE OF WORDS IN THE CORPUS...
> > > BUILDING PROBABILISTIC DICTIONARIES...
> > > FILTERING DICTIONARY...
> > > egrep: /tmp/tempgizamodel.RlVVs/fr.vcbegrep: /tmp/tempgizamodel.RlVVs/sv.vcb: No such file or directory
> > > : No such file or directory
> > > bitextor-builddics.in: 173: __PYTHON__: not found
> > > DONE!
> > >
> > >
> > > I'm sorry, I didn't explain it well: as I said, bitextor-builddics.in is
> > > only the template of the script. What I didn't say is that you need to
> > > compile the project to get the actual script. If you have a look at the
> > > code of the template, you will see that there are many variables starting
> > > and ending with "__" (such as __PREFIX__). These variables are
> > > replaced by the corresponding paths at compilation time. So, to use the
> > > script, you have to download the whole trunk directory and then run:
> > > ./autogen.sh
> > > ./configure
> > > make
> > > make install
> > >
> > > As you know, you can use the option --prefix=LOCALDIR when running
> > > ./configure to install bitextor in a specific path (for example, LOCALDIR
> > > could be /home/per/local/).
> > >
> > > Best,
> > >
> > > Miquel.
> > >
> > >
> > >
> > > Yours,
> > > Per Tunedal
> > >
> > > On Tue, Feb 18, 2014, at 12:38, Miquel Esplà wrote:
> > >
> > > Hi Per,
> > >
> > > I think that the explanation on this website:
> > > http://rali.iro.umontreal.ca/rali/?q=en/node/1325 is quite useful. It
> > > helps a lot to understand the structure and the content of each file
> > > generated by OmegaT.
> > >
> > > About the script: in the latest release of bitextor we included a script
> > > called "bitextor-builddics" (you can find the template of this script here:
> > > https://svn.code.sf.net/p/bitextor/code/trunk/bitextor-builddics.in),
> > > which uses GIZA++ to obtain a plain-text bilingual dictionary that only
> > > includes pairs of words fulfilling two conditions: a) both words occur at
> > > least 10 times in the corpus, and b) the harmonic mean of their
> > > probabilities in both probabilistic dictionaries (S -> T and T -> S) is
> > > higher than 0.2. If you want to use this, I recommend using the version
> > > in the trunk, which fixes some minor bugs still present in the release.
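
To make the two conditions above concrete, here is a small Python sketch of
such a filter. It is not the code of bitextor-builddics: the function names
and the dictionary-based data layout are assumptions made for illustration,
and only the thresholds (10 occurrences, harmonic mean above 0.2) come from
the description above.

```python
def harmonic_mean(p, q):
    """Harmonic mean of two probabilities; 0.0 when both are zero."""
    if p + q == 0:
        return 0.0
    return 2.0 * p * q / (p + q)


def filter_pairs(s2t, t2s, src_counts, tgt_counts, min_count=10, threshold=0.2):
    """Return the (source, target) pairs that satisfy both conditions.

    s2t maps (source, target) to P(target | source), t2s maps
    (target, source) to P(source | target), and the *_counts mappings
    give the corpus frequency of each word.
    """
    kept = []
    for (src, tgt), p_st in sorted(s2t.items()):
        # A pair missing from the reverse dictionary gets probability 0.
        p_ts = t2s.get((tgt, src), 0.0)
        frequent = (src_counts.get(src, 0) >= min_count
                    and tgt_counts.get(tgt, 0) >= min_count)
        if frequent and harmonic_mean(p_st, p_ts) > threshold:
            kept.append((src, tgt))
    return kept
```

Using the harmonic mean means that a pair is kept only when both directions
agree: a high probability in one direction cannot compensate for a very low
one in the other.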
> > >
> > > Best,
> > >
> > > Miquel.
> > >
> > > 2014-02-17 14:21 GMT+01:00 Per Tunedal <[email protected]>:
> > >
> > > Hi Miquel,
> > > thank you for your informative answer. Indeed, I needed to create a
> > > cooccurrence file.
> > > I did successfully create such a file with snt2cooc.out
> > >
> > > And GIZA++ has run successfully and made a lot of files in my home
> > > directory (!).
> > >
> > > How do I redirect the output to a more suitable folder? -outputpath ?
> > >
> > > Where can I find an explanation of the content of the files?
> > >
> > > I suppose the dictionary is in the translation table *.t3.final
> > > Is there any convenient way to extract plain-text dictionaries (without
> > > going one step further and using Moses)?
> > > Is there some script available to decode the translation table using the
> > > vocabulary files *.vcb?
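
A decoding step of this kind could be sketched in Python as follows. The
sketch assumes the layout described in the GIZA++ documentation, namely
.vcb lines of the form "id word count" and *.t3.final lines of the form
"source_id target_id probability"; the function names are hypothetical, and
which .vcb file plays the source role depends on the direction of the run,
so please check a few lines of your own files first.

```python
def read_vocabulary(lines):
    """Map token ids to words from lines of the form 'id word count'."""
    vocab = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            vocab[fields[0]] = fields[1]
    return vocab


def decode_table(table_lines, src_vocab, tgt_vocab):
    """Yield (source word, target word, probability) triples.

    Lines that do not have exactly three fields are skipped. Ids with no
    entry in the vocabulary (for example the empty word) appear as NULL.
    """
    for line in table_lines:
        fields = line.split()
        if len(fields) != 3:
            continue
        src_id, tgt_id, prob = fields
        src = src_vocab.get(src_id, "NULL")
        tgt = tgt_vocab.get(tgt_id, "NULL")
        yield src, tgt, float(prob)
```

Both functions accept any iterable of lines, so opened files can be passed
directly, for instance read_vocabulary(open("corpus.clean.sv.vcb")).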
> > >
> > > Yours,
> > > Per Tunedal
> > >
> > >
> > >
> > > On Mon, Feb 17, 2014, at 11:08, Miquel Esplà wrote:
> > >
> > > Hi Per,
> > >
> > > If I am not mistaken, depending on how you compile it, GIZA++ either
> > > generates the cooccurrence files on the fly during alignment or requires
> > > you to generate them before running the alignment. Actually, I think
> > > that, with the standard compilation, you are in the second case. Have a
> > > look here: https://code.google.com/p/giza-pp/issues/detail?id=9 I hope
> > > the link will be helpful!
> > >
> > > Cheers,
> > >
> > > Miquel.
> > >
> > > 2014-02-17 10:30 GMT+01:00 Per Tunedal <[email protected]>:
> > >
> > >
> > > Hi,
> > > I tried the procedure described at
> > > http://wiki.apertium.org/wiki/Using_GIZA%2B%2B to get a rough
> > > dictionary, but encountered the following error in the last step:
> > >
> > > ERROR: NO COOCURRENCE FILE GIVEN!
> > >
> > > Is one step missing in the procedure?
> > >
> > > Yours,
> > > Per Tunedal
> > >
> > >
> > >
> > >
> ------------------------------------------------------------------------------
> > > Android apps run on BlackBerry 10
> > > Introducing the new BlackBerry 10.2.1 Runtime for Android apps.
> > > Now with support for Jelly Bean, Bluetooth, Mapview and more.
> > > Get your Android app in front of a whole new audience. Start now.
> > >
> > >
> http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk
> > > _______________________________________________
> > > Apertium-stuff mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/apertium-stuff