Hi Per,
I haven't tried compiling with the version of Python you are using, but you
can try changing the version check in configure.ac so that it accepts your version.
Cheers,
Miquel.
2014-02-20 10:19 GMT+01:00 Per Tunedal <[email protected]>:
> Hi Miquel,
> Thanks for your thorough answer.
>
> I've tried ./autogen.sh
> I had to install httrack, but then got:
> checking for a Python interpreter with version >= 2.7... none
> configure: error: You don't have Python 2.7 or later installed.
>
> Is it really necessary to update Python?
>
> It appears that the configure script demands Python >= 2.7. In Debian
> Squeeze, Python 2.6.6 is the default.
> I'm afraid of messing things up if I install Python manually, and not with
> Synaptic. Lots of things depend on Python.
>
> And upgrading to Debian Wheezy might mess things up as well ...
>
> Yours,
> Per Tunedal
>
>
> On Wed, Feb 19, 2014, at 9:58, Miquel Esplà wrote:
>
> Hi Per,
>
> 2014-02-18 21:37 GMT+01:00 Per Tunedal <[email protected]>:
>
> Hi Miquel,
> thank you. Looks like a good approach.
>
> Looking at the script:
> It runs GIZA++ in both directions to begin with? I just have to supply the
> bitext files?
>
>
> Yes, you only need to provide the bitext files compressed with gzip.
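
A minimal sketch of compressing a bitext file with gzip before handing it to the script, using only the Python standard library. The helper name gzip_file is hypothetical, and whether the script expects a particular file extension is an assumption to check against the script itself.

```python
import gzip
import shutil


def gzip_file(src_path, dst_path=None):
    """Compress src_path with gzip and return the path of the compressed file."""
    if dst_path is None:
        # Assumption: the compressed copy simply gets a ".gz" suffix.
        dst_path = src_path + ".gz"
    # Stream the copy so that a large corpus is not loaded into memory at once.
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path
```

For example, gzip_file("/home/per/corpora/OpenOffice3.fr-sv.sv") would write a compressed copy next to the original file and leave the original untouched.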
>
>
>
> But the script has some trouble finding the GIZA++ files:
> per@Pers-debian:~/script$ sh bitextor-builddics.in sv fr
> "/home/per/corpora/OpenOffice3.fr-sv.sv"
> "/home/per/corpora/OpenOffice3.fr-sv.fr"
> "/home/per/block_world_corpus/GIZA++_wordlists/bitextor/OpenOffice3.gizadict.sv-fr"
> TOKENISING THE CORPUS...
> Can't open perl script "__PREFIX__/share/bitextor/utils/tokenizer.perl":
> No such file or directory
> gzip: /home/per/corpora/OpenOffice3.fr-sv.sv: not in gzip format
> Can't open perl script "__PREFIX__/share/bitextor/utils/tokenizer.perl":
> No such file or directory
> gzip: /home/per/corpora/OpenOffice3.fr-sv.fr: not in gzip format
> LOWERCASING THE CORPUS...
> FILTERING OUT TOO LONG SENTENCES...
> FORMATTING THE CORPUS FOR PROCESSING...
> mv: cannot stat
> "/tmp/tempcorpuspreproc.QP7LM/corpus.clean.sv_corpus.clean.fr.snt": No such
> file or directory
> mv: cannot stat
> "/tmp/tempcorpuspreproc.QP7LM/corpus.clean.fr_corpus.clean.sv.snt": No such
> file or directory
> mv: cannot stat
> "/tmp/tempcorpuspreproc.QP7LM/corpus.clean.sv.vcb": No such file or
> directory
> mv: cannot stat
> "/tmp/tempcorpuspreproc.QP7LM/corpus.clean.fr.vcb": No such file or
> directory
> BUILDING WORD CLASSES FOR IMPROVING ALIGNMENT...
> CHECKING COOCURRENCE OF WORDS IN THE CORPUS...
> BUILDING PROBABILISTIC DICTIONARIES...
> FILTERING DICTIONARY...
> egrep: /tmp/tempgizamodel.RlVVs/fr.vcb: No such file or directory
> egrep: /tmp/tempgizamodel.RlVVs/sv.vcb: No such file or directory
> bitextor-builddics.in: 173: __PYTHON__: not found
> DONE!
>
>
> I'm sorry, I didn't explain it well: as I said, bitextor-builddics.in is
> only the template of the script. What I didn't say is that you need to
> compile the project to get the actual script. If you look at the
> code of the template, you will see that there are many variables starting
> and ending with "__" (such as __PREFIX__). These variables are
> replaced by the corresponding paths at compilation time. So, to use the
> script, you have to download the whole trunk directory and then run:
> ./autogen.sh
> ./configure
> make
> make install
>
> As you know, you can use the option --prefix=LOCALDIR when running
> ./configure to install bitextor in a specific path (for example LOCALDIR could
> be /home/per/local/).
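
To picture the placeholder replacement described above, here is a minimal sketch in Python. It only illustrates the idea and is not the mechanism the bitextor build actually uses; the function name fill_template and the two substituted values are hypothetical, while the placeholder names __PREFIX__ and __PYTHON__ are the ones visible in the error output.

```python
def fill_template(template_text, substitutions):
    """Replace each __NAME__ placeholder with its configured value."""
    for placeholder, value in substitutions.items():
        template_text = template_text.replace(placeholder, value)
    return template_text


# Hypothetical values; in a real build they are decided by ./configure.
substitutions = {
    "__PREFIX__": "/home/per/local",
    "__PYTHON__": "/usr/bin/python",
}

line = 'perl "__PREFIX__/share/bitextor/utils/tokenizer.perl"'
print(fill_template(line, substitutions))
# prints: perl "/home/per/local/share/bitextor/utils/tokenizer.perl"
```

Running the template directly skips this step, which is why the shell tries to open a path that literally starts with __PREFIX__ and to execute a command literally named __PYTHON__.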
>
> Best,
>
> Miquel.
>
>
>
> Yours,
> Per Tunedal
>
> On Tue, Feb 18, 2014, at 12:38, Miquel Esplà wrote:
>
> Hi Per,
>
> I think that the explanation in this website:
> http://rali.iro.umontreal.ca/rali/?q=en/node/1325 is quite useful. It
> helps a lot to understand the structure and the content of each file
> generated by OmegaT.
>
> About the script, in the last release of bitextor we included a script
> called "bitextor-builddics" (you can find the template of this script here:
> https://svn.code.sf.net/p/bitextor/code/trunk/bitextor-builddics.in)
> which uses GIZA++ to obtain a plain text bilingual dictionary, but only
> including pairs of words fulfilling: a) both words occur at least 10 times
> in the corpus, and b) the harmonic mean of their probabilities in both
> probabilistic dictionaries (S -> T and T -> S) is higher than 0.2. If you
> want to use this, I recommend using the version in the trunk, which
> fixes some minor bugs still present in the release.
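
The two filtering criteria can be sketched as follows. This is an illustration of the criteria as stated in the message, not the code of bitextor-builddics; the function names and the in-memory dictionaries are assumptions made for the example.

```python
def harmonic_mean(p, q):
    """Harmonic mean of two probabilities; 0.0 when both are zero."""
    if p + q == 0:
        return 0.0
    return 2.0 * p * q / (p + q)


def filter_dictionary(s2t, t2s, src_counts, tgt_counts,
                      min_count=10, min_score=0.2):
    """Keep word pairs that pass both the frequency and the score criterion.

    s2t maps (source_word, target_word) to the S -> T probability,
    t2s maps (target_word, source_word) to the T -> S probability,
    and the *_counts dictionaries map each word to its corpus frequency.
    """
    kept = {}
    for (src, tgt), p_st in s2t.items():
        # Criterion a): both words occur at least min_count times.
        if src_counts.get(src, 0) < min_count:
            continue
        if tgt_counts.get(tgt, 0) < min_count:
            continue
        # Criterion b): harmonic mean of both directions above min_score.
        p_ts = t2s.get((tgt, src), 0.0)
        score = harmonic_mean(p_st, p_ts)
        if score > min_score:
            kept[(src, tgt)] = score
    return kept
```

The harmonic mean is low whenever either direction is low, so a pair only survives if both probabilistic dictionaries support it.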
>
> Best,
>
> Miquel.
>
> 2014-02-17 14:21 GMT+01:00 Per Tunedal <[email protected]>:
>
> Hi Miquel,
> thank you for your informative answer. Indeed, I needed to create a
> cooccurrence file.
> I successfully created such a file with snt2cooc.out.
>
> And GIZA++ has run successfully and made a lot of files in my home
> directory (!).
>
> How do I redirect the output to a more suitable folder? -outputpath ?
>
> Where can I find an explanation of the content of the files?
>
> I suppose the dictionary is in the translation table *.t3.final
> Is there any convenient way to extract plain text dictionaries (without going
> one step further and using Moses)?
> Is there a script available to decode the translation table by using the
> vocabulary files *.vcb ?
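
A minimal sketch of such a decoding step, under assumptions that should be checked against the actual files: each *.vcb line is taken to be "id word count", and each line of the translation table is taken to be "source_id target_id probability". The function names are hypothetical.

```python
def load_vocabulary(vcb_lines):
    """Map numeric word ids (kept as strings) to words from *.vcb lines."""
    id_to_word = {}
    for line in vcb_lines:
        fields = line.split()
        # Assumed layout: id, word, count. Shorter lines are ignored.
        if len(fields) >= 2:
            id_to_word[fields[0]] = fields[1]
    return id_to_word


def decode_table(table_lines, src_vocab, tgt_vocab, min_prob=0.0):
    """Yield (source_word, target_word, probability) triples."""
    for line in table_lines:
        fields = line.split()
        # Assumed layout: source id, target id, probability.
        if len(fields) != 3:
            continue
        src_id, tgt_id, prob = fields[0], fields[1], float(fields[2])
        # Ids that are not in the vocabulary files are skipped.
        if src_id in src_vocab and tgt_id in tgt_vocab and prob >= min_prob:
            yield src_vocab[src_id], tgt_vocab[tgt_id], prob
```

Both functions accept any iterable of lines, so open file objects for the *.vcb and *.t3.final files can be passed in directly.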
>
> Yours,
> Per Tunedal
>
>
>
> On Mon, Feb 17, 2014, at 11:08, Miquel Esplà wrote:
>
> Hi Per,
>
> if I am not mistaken, depending on how you compile GIZA++, it can generate
> the cooccurrence files on-the-fly during alignment, or you may need to do so
> before running the alignment. Actually, I think that, with the standard
> compilation, you are in the second case. Have a look here:
> https://code.google.com/p/giza-pp/issues/detail?id=9 I hope the link will
> be helpful!
>
> Cheers,
>
> Miquel.
>
> 2014-02-17 10:30 GMT+01:00 Per Tunedal <[email protected]>:
>
>
> Hi,
> I tried the procedure described at
> http://wiki.apertium.org/wiki/Using_GIZA%2B%2B to get a rough
> dictionary, but encountered the following error in the last step:
>
> ERROR: NO COOCURRENCE FILE GIVEN!
>
> Is one step missing in the procedure?
>
> Yours,
> Per Tunedal
>
>
>
> ------------------------------------------------------------------------------
> Android apps run on BlackBerry 10
> Introducing the new BlackBerry 10.2.1 Runtime for Android apps.
> Now with support for Jelly Bean, Bluetooth, Mapview and more.
> Get your Android app in front of a whole new audience. Start now.
>
> http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff