Hi both,

Thanks for your help. I'm not convinced it's the hardware configuration, although I am looking into it. The 'reordering-table.binlexr.srctree' file is now 12 GB and still growing (4 days later). In the pre-built models I have seen, this file doesn't appear to get that big. I think the source file must be corrupt or something is set wrong. The other big one, 'phrase-table.binphr.tgtdata.wa', reached 7.5 GB and completed. The last thing it output to the console was:

..................................................[phrase:70500000]
..................................................[phrase:71000000]
distinct source phrases: 71001135 distinct first words of source phrases: 322879 number of phrase pairs (line count): 108170769
Count of lines with missing alignments: 0/108170769
WARNING: there are src voc entries with no phrase translation: count 27888
There exists phrase translations for 294991 entries

Does that sound about right?
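
The counts in that output can at least be checked against each other. What follows is only a rough sketch, not something processPhraseTable reports itself. It assumes the "src voc entries" in the warning are the same 322879 distinct first words from the line above, and the variable and function names are made up for illustration.

```python
# Counts copied from the console output above.
distinct_source_phrases = 71_001_135
distinct_first_words = 322_879        # assumed to be the "src voc entries"
phrase_pairs = 108_170_769            # "number of phrase pairs (line count)"
entries_without_translation = 27_888  # from the WARNING line
entries_with_translation = 294_991


def counts_are_consistent(vocab, missing, present):
    """True when the two reported vocabulary counts add up to the total."""
    return vocab - missing == present


def pairs_per_source_phrase(pairs, sources):
    """Average number of phrase pairs per distinct source phrase."""
    return pairs / sources


if __name__ == "__main__":
    print(counts_are_consistent(distinct_first_words,
                                entries_without_translation,
                                entries_with_translation))
    print(round(pairs_per_source_phrase(phrase_pairs,
                                        distinct_source_phrases), 2))
```

Under that assumption the two vocabulary counts add up exactly, and there are roughly 1.5 phrase pairs per distinct source phrase. That only shows the log is self-consistent; it says nothing about whether the reordering table itself is sane.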

I am wondering if something is wrong in the original files that I built, so I am going to wipe the working directory and restart from just the original prepared corpus files.

Also, my mgizapp compile problem is:

[ 55%] Built target mgiza_lib
[ 56%] Built target d4norm
[ 58%] Built target hmmnorm
[ 60%] Built target mgiza
[ 62%] Built target plain2snt
[ 63%] Built target snt2cooc
[ 65%] Built target snt2coocrmp
[ 67%] Built target snt2plain
[ 70%] Built target symal
[ 72%] Building CXX object src/mkcls/CMakeFiles/mkcls.dir/KategProblemTest.cpp.o
In file included from /home/dave/mgizapp/trunk/mgizapp/src/mkcls/StatVar.h:33:0,
                 from /home/dave/mgizapp/trunk/mgizapp/src/mkcls/Problem.h:34,
                 from /home/dave/mgizapp/trunk/mgizapp/src/mkcls/KategProblem.h:34,
                 from /home/dave/mgizapp/trunk/mgizapp/src/mkcls/KategProblemTest.h:30,
                 from /home/dave/mgizapp/trunk/mgizapp/src/mkcls/KategProblemTest.cpp:27:
/home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h: In instantiation of 'B& leda_h_array<A, B>::operator[](const A&) [with A = std::basic_string<char>; B = int]':
/home/dave/mgizapp/trunk/mgizapp/src/mkcls/KategProblemTest.cpp:96:18: required from here
/home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: error: 'insert' was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
/home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: note: declarations in dependent base '__gnu_cxx::hash_map<std::basic_string<char>, int, my_hash<std::basic_string<char>, std::less<std::basic_string<char> > >, std::equal_to<std::basic_string<char> >, std::allocator<int> >' are not found by unqualified lookup
/home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: note: use 'this->insert' instead
/home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h: In instantiation of 'B& leda_h_array<A, B>::operator[](const A&) [with A = std::pair<std::basic_string<char>, std::basic_string<char> >; B = int]':
/home/dave/mgizapp/trunk/mgizapp/src/mkcls/KategProblemTest.cpp:265:38: required from here
/home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: error: 'insert' was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
/home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: note: declarations in dependent base '__gnu_cxx::hash_map<std::pair<std::basic_string<char>, std::basic_string<char> >, int, my_hash<std::pair<std::basic_string<char>, std::basic_string<char> >, std::less<std::pair<std::basic_string<char>, std::basic_string<char> > > >, std::equal_to<std::pair<std::basic_string<char>, std::basic_string<char> > >, std::allocator<int> >' are not found by unqualified lookup
/home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: note: use 'this->insert' instead
make[2]: *** [src/mkcls/CMakeFiles/mkcls.dir/KategProblemTest.cpp.o] Error 1
make[1]: *** [src/mkcls/CMakeFiles/mkcls.dir/all] Error 2
make: *** [all] Error 2

Thank you all for the help.

Dave


On 29/12/2012 04:08, [email protected] wrote:
Send Moses-support mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
        [email protected]

You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

    1. Re: compile error with boost 1.48 (Kenneth Heafield)
    2. Re: compile error with boost 1.48 (Hieu Hoang)
    3. Re: train-model.perl with mgizapp fails when extended UTF-8
       characters in output path. (Tom Hoar)
    4. Re: Time taken by processPhraseTable and processLexicalTable
       (Tom Hoar)


----------------------------------------------------------------------

Message: 1
Date: Fri, 28 Dec 2012 17:21:14 +0000
From: Kenneth Heafield <[email protected]>
Subject: Re: [Moses-support] compile error with boost 1.48
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi,

        This looks like a mismatch between library and header versions.  Does
/opt/boost_1_51_0/lib64/libboost_program_options-mt.so exist?  Did you
use exactly this command for Boost, as recommended in
BUILD-INSTRUCTIONS.txt ?

./b2 --prefix=/opt/boost_1_51_0 --libdir=/opt/boost_1_51_0/lib64
--layout=tagged link=static,shared threading=multi,single install

Kenneth

On 12/28/12 14:37, Holger Schwenk wrote:
Hello Hieu,

thanks for your fast answer!

I installed boost 1.51.0 and moses did compile (but I don't understand
why Fedora doesn't upgrade the buggy boost version in FC 17 ...).

However, there are a couple of errors when compiling mira and pro. I
attached some of the error messages. It seems to me that this is related
to linking.

Hieu, do you manage to compile the current git version with
boost_1_51_0? Which compiler do you use?

Holger

On 12/28/2012 01:00 PM, Hieu Hoang wrote:
hi holger

i think there's a problem with boost 1.48.
http://article.gmane.org/gmane.comp.nlp.moses.user/7795/
i can't find any way around it.

would you be able to compile with another boost lib and see if it works?

in BUILD-INSTRUCTIONS.txt there are instructions on how to compile boost.
Then link into moses using
./bjam --with-boost=/Users/hieuhoang/workspace/boost/boost_1_51_0

On 28/12/2012 11:35, Holger Schwenk wrote:
Hello,

I just installed Fedora Core 17 (kernel 3.6.10, gcc 4.7.2, boost
1.48.0-13) and I get an error while compiling the latest version of
Moses:

./bjam -j4 --with-srilm=$SRILM --with-giza=/usr/local/bin
--prefix=/opt/mt/moses-smt/moses-2012-12-28

fails with

moses/FeatureVector.cpp:284:17: warning: reference to local variable
'backoff' returned [enabled by default]

and

ambiguous calls to boost in moses/FeatureVector.cpp

The log file is attached.

have you encountered this problem before, or does this seem to be
an issue with my installation?

thanks,

Holger



_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support



------------------------------

Message: 2
Date: Fri, 28 Dec 2012 17:44:00 +0000
From: Hieu Hoang <[email protected]>
Subject: Re: [Moses-support] compile error with boost 1.48
To: Kenneth Heafield <[email protected]>,       moses-support
        <[email protected]>
Message-ID:
        <CAEKMkbg1e5hkGXfLZeO1z=sNvbU3-15GCoPNP3Zzs0=4bsi...@mail.gmail.com>
Content-Type: text/plain; charset="windows-1252"

i did exactly what it said in  BUILD-INSTRUCTIONS.txt
   ./b2 --prefix=/home/s0565741/workspace/boost/boost_1_51_0
--libdir=/home/s0565741/workspace/boost/boost_1_51_0/lib64 --layout=tagged
link=static,shared threading=multi,single install -j15

   ./bjam --with-srilm=/home/s0565741/workspace/srilm
--with-irstlm=/home/s0565741/workspace/irstlm/trunk/
--with-cmph=/home/s0565741/workspace/cmph-2.0 -j15
--with-boost=/home/s0565741/workspace/boost/boost_1_51_0

it would only work when i also do this
    ln -s libboost_thread-mt.a libboost_thread.a



------------------------------

Message: 3
Date: Sat, 29 Dec 2012 10:30:23 +0700
From: Tom Hoar <[email protected]>
Subject: Re: [Moses-support] train-model.perl with mgizapp fails when
        extended UTF-8 characters in output path.
To: <[email protected]>
Message-ID:
        <[email protected]>
Content-Type: text/plain; charset="utf-8"

Ken,

I traced this. The problem is not specific to MGIZA++. It
manifested itself the first time with merge_alignment.py in
train-model.perl's step 2. In train-model.perl step 5, it happens again
when "extract" redirects output to a path.

This redirection problem
does not happen with 2- or 3-byte UTF-8 Latin, Japanese or Chinese
characters that I tried. All Thai UTF-8 characters are 3 bytes and there
are many documented cases where UTF handlers don't handle Thai properly.
It appears Perl's system() call is one of these cases.
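
The byte-length part of this is easy to check. Below is a minimal sketch; the sample characters are arbitrary picks for illustration, not the ones from the failing output paths, and the function name is made up.

```python
def utf8_length(ch):
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# One arbitrary sample per script mentioned above.
samples = {
    "Latin (e with acute)": "\u00e9",
    "Thai (ko kai)": "\u0e01",
    "Japanese (hiragana a)": "\u3042",
    "Chinese (zhong)": "\u4e2d",
}

if __name__ == "__main__":
    for label, ch in samples.items():
        print(f"{label} U+{ord(ch):04X}: {utf8_length(ch)} bytes")
    # Every code point in the Thai block (U+0E00..U+0E7F) encodes to 3 bytes.
    print(all(utf8_length(chr(cp)) == 3 for cp in range(0x0E00, 0x0E80)))
```

The Thai sample has the same 3-byte length as the Japanese and Chinese ones, so byte length alone does not distinguish the failing paths from the working ones.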

I'm working
through the problem on stackoverflow.com:
http://stackoverflow.com/questions/14020240/why-does-perl-system-corrupt-the-redirected-path
[1] but no time for more tests during the holiday. I'll share a fix
if/when there's a resolution.

For now, Hieu's recommendation to
document the problem and block such requests in the front end is a good
work-around.

Tom

On 2012-12-28 06:43, Kenneth Heafield wrote:

+Qin Gao
Is this a train-model.perl problem or an mgiza problem?


Links:
------
[1]
http://stackoverflow.com/questions/14020240/why-does-perl-system-corrupt-the-redirected-path

------------------------------

Message: 4
Date: Sat, 29 Dec 2012 11:07:53 +0700
From: Tom Hoar <[email protected]>
Subject: Re: [Moses-support] Time taken by processPhraseTable and
        processLexicalTable
To: <[email protected]>, David Wilson-Parr <[email protected]>
Message-ID:
        <[email protected]>
Content-Type: text/plain; charset="utf-8"

Dave,

You have an interesting hardware setup. We have several
users who use DoMY inside VMware guests on Windows host machines with
much less capable hardware. One frequently trains/tunes/binarizes and
evaluates models with 2+ million segments on as little as 4 GB real RAM
& 4 cores and the total time is about 2 days. This configuration uses
all cores for MGIZA++ and all other multi-threaded components.

Re MGIZA++, it compiles on Ubuntu 10.04 through 12.10. I can't imagine
why Mint would be any different. You might try installing DoMY CE to see
any differences: http://www.precisiontranslationtools.com/quick-start/ [4].
Alternatively, I posted our MGIZA++ setup script on this list last week.
You might try it.

The other thing that catches my attention is how you
"map the working (train/model) directory to the host computer harddisk."
I don't think any of my users have used this configuration, but rather
set the virtual disk to grow automatically. Hieu, is it possible this
hardware mapping configuration is incompatible with the virtualization
techniques used in the binarized tables?

Finally, the train-model.perl script uses different temp folders in
different steps. Sometimes it honors the standard /tmp folder for
Linux. In step 6, it honors the user-defined --temp-dir option. In step
5, the extract binary only uses a subfolder it creates under its output
folder. This unpredictable temp file storage is also true of the
language model building tools. Therefore, the easiest way to ensure you
have enough hard drive space for all the various temp outputs is to
have one root partition that is big enough for all of your temp files.
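
Since the temp files can land in several places, it may help to look at free space on each candidate location before starting a long run. Here is a minimal sketch using Python's standard shutil.disk_usage. The paths in the example are placeholders (the system temp folder and the current directory), and the function names are made up; substitute whatever was passed as --temp-dir and the extract output folder.

```python
import shutil
import tempfile


def free_gib(path):
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1024 ** 3


def report(paths):
    """Map each path to its free space in GiB, rounded to one decimal."""
    return {p: round(free_gib(p), 1) for p in paths}


if __name__ == "__main__":
    # Placeholder locations: the system temp folder and the current
    # directory stand in for the real --temp-dir and output folders.
    candidates = [tempfile.gettempdir(), "."]
    for path, gib in report(candidates).items():
        print(f"{path}: {gib} GiB free")
```

If two paths report the same figure they are probably on the same partition, which is the single-root-partition situation described above.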

On 2012-12-28 18:49, Hieu Hoang wrote:

Hi Dave

NB. Please subscribe to the mailing list before posting to it. You can
subscribe here:
http://mailman.mit.edu/mailman/listinfo/moses-support [1]

I've noticed recently that virtual machine disks tend to be REALLY
slow. Since binarization is all IO bound, that may severely affect the
speed. Try binarizing on a real PC and see how it goes.

Also, check that you haven't been bitten by this VMware bug:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=51306 [2]

For comparison, the de-en in the example models I created
http://www.statmt.org/moses/RELEASE-0.91/de-en/model/phrase-table.1.bin/ [3]
took 10 hours, and that while running about 20 binarizations
simultaneously.
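
One rough way to compare the VM disk with a real machine is to time a sequential write in the working directory on both. The sketch below assumes that a plain write followed by fsync is a fair enough proxy; binarization also does a lot of random IO, which this does not measure. The function name and the sizes are arbitrary choices.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, size_mb=64, chunk_mb=4):
    """Write about size_mb of zeros to a temp file in `directory`,
    force it to disk, and return the observed rate in MB/s."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    chunks = max(1, size_mb // chunk_mb)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(chunks):
                handle.write(chunk)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the data reached the disk
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    # Guard against a zero reading from a very coarse clock.
    return (chunks * chunk_mb) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # "." is a placeholder for the train/model working directory.
    print(f"{write_throughput_mb_s('.'):.1f} MB/s")
```

Running it once inside the VM against the mapped host directory and once against the VM's own virtual disk should show whether the mapping is the slow part.
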
On 28/12/2012 02:02, David Wilson-Parr wrote:

Hi all,

I am a new member going through the 'build baseline system' section of
the website using the Europarl Swedish-English V6 set. Training did not
take so long, maybe 2 days, although I swapped laptops halfway through
so it's hard to tell. I am running moses on Mint (Ubuntu type) linux on
VMWare under Windows 7. I map the working (train/model) directory to
the host computer harddisk because I want to keep the VM image smaller.

Anyway, cutting to the chase. I was training the Swedish/English pair
of Europarl. Training took a while, I would estimate 2 days, but I
wasn't using mgiza++, just giza++; incidentally, I can't get mgiza++ to
compile. I then tried to run the decoder and it took a while to start
up, but when it started, it would immediately say

Killed

even though I didn't kill it. So I decided to binarise the phrase table
and re-ordering models, but it has taken far longer than I expected. The
'build a baseline system tutorial' generally indicates when something
is a time-consuming process, but this was taking longer than the initial
training.

processPhraseTable - took 2 days+
processLexicalTable - 3 days and still running

Machine has 32 GB of RAM, Intel I7 3630-QM 2.40 GHz CPU (4/8 cores),
SSD drive SATA III. VMware image is set to use 4 cores and 29 GB memory.

I would really appreciate some help,
Dave




Links:
------
[1]
http://mailman.mit.edu/mailman/listinfo/moses-support
[2]
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=51306
[3]
http://www.statmt.org/moses/RELEASE-0.91/de-en/model/phrase-table.1.bin/
[4]
http://www.precisiontranslationtools.com/quick-start/

------------------------------



End of Moses-support Digest, Vol 74, Issue 36
*********************************************
