Hi Dave, 

12 GB isn't unusually large, but 4 days seems excessive.
It's possible there's something wrong with your corpus files, but I see
no evidence. 

When your cmake build for MGIZA++ fails, have you tried running
cmake again without changing anything? That's one work-around we've used
on some recent Ubuntu versions. 
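
For what it's worth, the compiler notes in your log ("use 'this->insert'
instead") point at C++ two-phase name lookup: inside a class template, an
unqualified call such as insert(...) is not looked up in a base class that
depends on the template parameters, and newer GCC releases enforce this
unless -fpermissive is given. Below is a minimal sketch of the pattern and
the fix the compiler suggests. It is not the actual myleda.h code: the
h_array class, its std::map base and the count_moses helper are hypothetical
stand-ins I made up for illustration.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for leda_h_array in myleda.h.
// The base class depends on the template parameters A and B, so its
// members are not visible to unqualified lookup inside the template.
template <class A, class B>
class h_array : public std::map<A, B> {
public:
    explicit h_array(const B& init) : init_(init) {}

    // Return the value stored for key, inserting the default value first
    // if the key has not been seen yet.
    B& operator[](const A& key) {
        typename std::map<A, B>::iterator pos = this->find(key);
        if (pos == this->end()) {
            // A bare insert(...) here produces the "was not declared in
            // this scope" error; this-> makes the name dependent, so it is
            // looked up in the base class at instantiation time.
            pos = this->insert(std::make_pair(key, init_)).first;
        }
        return pos->second;
    }

private:
    B init_;  // value given to keys that have not been inserted yet
};

// Hypothetical helper that exercises the class: counts "moses" twice.
int count_moses() {
    h_array<std::string, int> counts(0);
    counts["moses"] += 1;
    counts["moses"] += 1;
    counts["giza"] += 1;
    return counts["moses"];
}
```

If the same pattern applies at myleda.h:221, qualifying that call with
this-> (or compiling mkcls with -fpermissive, as the error message hints)
would be the corresponding change, but I have not verified that against the
MGIZA++ sources.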

On 2012-12-29 19:30, David Wilson-Parr wrote:

> Hi both,
> 
> Thanks for your help. I'm not convinced it's the hardware
> configuration, although I am looking into it. The
> 'reordering-table.binlexr.srctree' file is now 12 GB and growing (4 days
> later). In the pre-built models I have seen, this file doesn't appear to
> get that big. I think the source file must be corrupt or something is
> set wrong. The other big one, 'phrase-table.binphr.tgtdata.wa', reached
> 7.5 GB and was complete. The last thing it output to the console was:
>
> ..................................................[phrase:70500000]
> ..................................................[phrase:71000000]
> distinct source phrases: 71001135 distinct first words of source
> phrases: 322879 number of phrase pairs (line count): 108170769
> Count of lines with missing alignments: 0/108170769
> WARNING: there are src voc entries with no phrase translation: count 27888
> There exists phrase translations for 294991 entries
>
> Does that sound about right?
>
> I am wondering if something is wrong in the original files that I
> built, so I am going to wipe the working directory and restart it from
> just the original prepared corpus files.
>
> Also, my mgizapp compile problem is:
> 
> [ 55%] Built target mgiza_lib
> [ 56%] Built target d4norm
> [ 58%] Built target hmmnorm
> [ 60%] Built target mgiza
> [ 62%] Built target plain2snt
> [ 63%] Built target snt2cooc
> [ 65%] Built target snt2coocrmp
> [ 67%] Built target snt2plain
> [ 70%] Built target symal
> [ 72%] Building CXX object src/mkcls/CMakeFiles/mkcls.dir/KategProblemTest.cpp.o
> In file included from /home/dave/mgizapp/trunk/mgizapp/src/mkcls/StatVar.h:33:0,
>                  from /home/dave/mgizapp/trunk/mgizapp/src/mkcls/Problem.h:34,
>                  from /home/dave/mgizapp/trunk/mgizapp/src/mkcls/KategProblem.h:34,
>                  from /home/dave/mgizapp/trunk/mgizapp/src/mkcls/KategProblemTest.h:30,
>                  from /home/dave/mgizapp/trunk/mgizapp/src/mkcls/KategProblemTest.cpp:27:
> /home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h: In instantiation of 'B& leda_h_array<A, B>::operator[](const A&) [with A = std::basic_string<char>; B = int]':
> /home/dave/mgizapp/trunk/mgizapp/src/mkcls/KategProblemTest.cpp:96:18: required from here
> /home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: error: 'insert' was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
> /home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: note: declarations in dependent base '__gnu_cxx::hash_map<std::basic_string<char>, int, my_hash<std::basic_string<char>, std::less<std::basic_string<char> > >, std::equal_to<std::basic_string<char> >, std::allocator<int> >' are not found by unqualified lookup
> /home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: note: use 'this->insert' instead
> /home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h: In instantiation of 'B& leda_h_array<A, B>::operator[](const A&) [with A = std::pair<std::basic_string<char>, std::basic_string<char> >; B = int]':
> /home/dave/mgizapp/trunk/mgizapp/src/mkcls/KategProblemTest.cpp:265:38: required from here
> /home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: error: 'insert' was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
> /home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: note: declarations in dependent base '__gnu_cxx::hash_map<std::pair<std::basic_string<char>, std::basic_string<char> >, int, my_hash<std::pair<std::basic_string<char>, std::basic_string<char> >, std::less<std::pair<std::basic_string<char>, std::basic_string<char> > > >, std::equal_to<std::pair<std::basic_string<char>, std::basic_string<char> > >, std::allocator<int> >' are not found by unqualified lookup
> /home/dave/mgizapp/trunk/mgizapp/src/mkcls/myleda.h:221:5: note: use 'this->insert' instead
> make[2]: *** [src/mkcls/CMakeFiles/mkcls.dir/KategProblemTest.cpp.o] Error 1
> make[1]: *** [src/mkcls/CMakeFiles/mkcls.dir/all] Error 2
> make: *** [all] Error 2
> 
> Thank-you all for the help.
> 
> Dave
> 
> On 29/12/2012 04:08, [email protected] wrote:
> 
>> Today's Topics:
>> 
>> 1. Re: compile error with boost 1.48 (Kenneth Heafield)
>> 2. Re: compile error with boost 1.48 (Hieu Hoang)
>> 3. Re: train-model.perl with mgizapp fails when extended UTF-8
>>    characters in output path. (Tom Hoar)
>> 4. Re: Time taken by processPhraseTable and processLexicalTable
>>    (Tom Hoar)
>>
>> ----------------------------------------------------------------------
>>

>> Message: 1
>> Date: Fri, 28 Dec 2012 17:21:14 +0000
>> From: Kenneth Heafield <[email protected]>
>> Subject: Re: [Moses-support] compile error with boost 1.48
>> To: [email protected]
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset=windows-1252; format=flowed
>>
>> Hi,
>>
>> This looks like a mismatch between library and header versions. Does
>> /opt/boost_1_51_0/lib64/libboost_program_options-mt.so exist? Did you
>> use exactly this command for Boost, as recommended in
>> BUILD-INSTRUCTIONS.txt?
>>
>> ./b2 --prefix=/opt/boost_1_51_0 --libdir=/opt/boost_1_51_0/lib64
>> --layout=tagged link=static,shared threading=multi,single install
>>
>> Kenneth
>>
>> On 12/28/12 14:37, Holger Schwenk wrote:
>>
>>> Hello Hieu,
>>>
>>> thanks for your fast answer!
>>>
>>> I installed boost 1.51.0 and moses did compile (but I don't understand
>>> why Fedora doesn't upgrade the buggy boost version in FC 17 ...).
>>>
>>> However, there are a couple of errors when compiling mira and pro. I
>>> attached some of the error messages. It seems to me that this is
>>> related to linking.
>>>
>>> Hieu, do you manage to compile the current git version with
>>> boost_1_51_0? Which compiler do you use?
>>>
>>> Holger
>>>
>>> On 12/28/2012 01:00 PM, Hieu Hoang wrote:
>>>
>>>> hi holger
>>>>
>>>> i think there's a problem with boost 1.48.
>>>> http://article.gmane.org/gmane.comp.nlp.moses.user/7795/ [2]
>>>> i can't find any way around it.
>>>>
>>>> would you be able to compile with another boost lib and see if it
>>>> works?
>>>>
>>>> in BUILD-INSTRUCTIONS.txt there are instructions on how to compile
>>>> boost. Then link into moses using
>>>> ./bjam --with-boost=/Users/hieuhoang/workspace/boost/boost_1_51_0
>>>>
>>>> On 28/12/2012 11:35, Holger Schwenk wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I just installed Fedora Core 17 (kernel 3.6.10, gcc 4.7.2, boost
>>>>> 1.48.0-13) and I get an error while compiling the latest version of
>>>>> Moses:
>>>>>
>>>>> ./bjam -j4 --with-srilm=$SRILM --with-giza=/usr/local/bin
>>>>> --prefix=/opt/mt/moses-smt/moses-2012-12-28
>>>>>
>>>>> fails with
>>>>>
>>>>> moses/FeatureVector.cpp:284:17: warning: reference to local variable
>>>>> 'backoff' returned [enabled by default]
>>>>>
>>>>> and
>>>>>
>>>>> ambiguous calls to boost in moses/FeatureVector.cpp
>>>>>
>>>>> The log file is attached.
>>>>>
>>>>> Have you encountered this problem before, or does this seem to be
>>>>> rather an issue with my installation?
>>>>>
>>>>> thanks,
>>>>>
>>>>> Holger
>>>>>
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> [email protected]
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support [1]
>> 
>>
------------------------------
>> 
>> Message: 2
>> Date: Fri, 28 Dec 2012 17:44:00 +0000
>> From: Hieu Hoang <[email protected]>
>> Subject: Re: [Moses-support] compile error with boost 1.48
>> To: Kenneth Heafield <[email protected]>, moses-support
>>     <[email protected]>
>> Message-ID:
>>     <CAEKMkbg1e5hkGXfLZeO1z=sNvbU3-15GCoPNP3Zzs0=4bsi...@mail.gmail.com>
>> Content-Type: text/plain; charset="windows-1252"
>>
>> i did exactly what it said in BUILD-INSTRUCTIONS.txt
>> ./b2 --prefix=/home/s0565741/workspace/boost/boost_1_51_0
>> --libdir=/home/s0565741/workspace/boost/boost_1_51_0/lib64
>> --layout=tagged link=static,shared threading=multi,single install -j15
>>
>> ./bjam --with-srilm=/home/s0565741/workspace/srilm
>> --with-irstlm=/home/s0565741/workspace/irstlm/trunk/
>> --with-cmph=/home/s0565741/workspace/cmph-2.0 -j15
>> --with-boost=/home/s0565741/workspace/boost/boost_1_51_0
>>
>> it would only work when i also do this
>> ln -s libboost_thread-mt.a libboost_thread.a
>> 
>> 
>>
>> 
>> ------------------------------
>> 
>> Message: 3
>> Date: Sat, 29 Dec 2012 10:30:23 +0700
>> From: Tom Hoar <[email protected]>
>> Subject: Re: [Moses-support] train-model.perl with mgizapp fails when
>>     extended UTF-8 characters in output path.
>> To: <[email protected]>
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Ken,
>>
>> I traced this. The problem is not specific to MGIZA++. It manifested
>> itself the first time with merge_alignment.py in train-model.perl's
>> step 2. In train-model.perl step 5, it happens again when "extract"
>> redirects output to a path.
>>
>> This redirection problem does not happen with 2- or 3-byte UTF-8 Latin,
>> Japanese or Chinese characters that I tried. All Thai UTF-8 characters
>> are 3 bytes and there are many documented cases where UTF handlers
>> don't handle Thai properly. It appears Perl's system() call is one of
>> these cases.
>>
>> I'm working through the problem on stackoverflow.com:
>> http://stackoverflow.com/questions/14020240/why-does-perl-system-corrupt-the-redirected-path [5]
>> but no time for more tests during the holiday. I'll share a fix if/when
>> there's a resolution.
>>
>> For now, Hieu's recommendation to document the problem and block such
>> requests in the front end is a good work-around.
>>
>> Tom
>>
>> On 2012-12-28 06:43, Kenneth Heafield wrote:
>>
>> +Qin Gao
>>
>> Is this a train-model.perl problem or an mgiza problem?
>>
>> ------------------------------
>>
>> Message: 4
>> Date: Sat, 29 Dec 2012 11:07:53 +0700
>> From: Tom Hoar <[email protected]>
>> Subject: Re: [Moses-support] Time taken by processPhraseTable and
>>     processLexicalTable
>> To: <[email protected]>, David Wilson-Parr <[email protected]>
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Dave,
>>
>> You have an interesting hardware setup. We have several users who use
>> DoMY inside VMware guests on Windows host machines with much less
>> capable hardware. One frequently trains/tunes/binarizes and evaluates
>> models with 2+ million segments on as little as 4 GB real RAM & 4 cores
>> and the total time is about 2 days. This configuration uses all cores
>> for MGIZA++ and all other multi-threaded components.
>>
>> Re MGIZA++, it compiles on Ubuntu 10.04 through 12.10. I can't imagine
>> why Mint would be any different. You might try installing DoMY CE to
>> see any differences:
>> http://www.precisiontranslationtools.com/quick-start/ [3]
>> Alternately, I posted our MGIZA++ setup script on this list last week.
>> You might try it.
>>
>> The other thing that catches my attention is how you "map the working
>> (train/model) directory to the host computer harddisk." I don't think
>> any of my users have used this configuration, but rather set the
>> virtual disk to grow automatically. Hieu, is it possible this hardware
>> mapping configuration is incompatible with the virtualization
>> techniques used in the binarized tables?
>>
>> Finally, the train-model.perl script uses different temp folders in
>> different steps. Sometimes, it honors the standard /tmp folder for
>> Linux. In step 6, it honors the user-defined --temp-dir option. In
>> step 5, the extract binary only uses a subfolder it creates under its
>> output folder. This unpredictable temp file storage is also true with
>> the language model building tools. Therefore, the easiest way to ensure
>> you have enough hard drive space for all the various temp outputs is to
>> have one root partition that is big enough for all of your temp files.
>>
>> On 2012-12-28 18:49, Hieu Hoang wrote:
>>
>> Hi Dave
>>
>> NB. Please subscribe to the mailing list before posting to it. You can
>> subscribe here:
>> http://mailman.mit.edu/mailman/listinfo/moses-support [1]
>>
>> [...] slow.
>>
>> Since binarization is all IO bound, that may severely affect the [...]
>>
>> Also, check that you haven't been beaten by this VMWare bug:
>> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=51306
>>
>> For comparison, the de-en in the example models I created [...]
>>
>> I am a new member going through the 'build baseline system' section of
>> the website using the Europarl Swedish-English V6 set. Training took
>> not so long maybe 2 days although I swapped laptops halfway through so
>> its hard to tell. I am running moses on Mint (Ubuntu type) linux on
>> VMWare under Windows 7. I map t[...]
>>
>> Anyway cutting to the chase. I was training the Swedish/English Pair of
>> Europarl. Training took a while, I would estimate 2 days but I wasn't
>> using mgiza++ just giza++ , incidentally I can't get m[...]
>>
>> [...]ould immediately say [...]
>>
>> **Killed
>>
>> event thoug[...] re-ordering models but it has taken far longer than I
>> expected. The 'build a baseline system tutorial' generally indicates
>> when something is a time-consuming process but this was taking
>> longe[...] [...]itial training.
>>
>> processPhraseTable - took 2 days+
>>
>> I really appreciate some help,
>>
>> End of Moses-support Digest, Vol 74, Issue 36
>> *********************************************



Links:
------
[1] http://mailman.mit.edu/mailman/listinfo/moses-support
[2] http://article.gmane.org/gmane.comp.nlp.moses.user/7795/
[3] http://www.precisiontranslationtools.com/quick-start/
[4] http://mailman.mit.edu/mailman/private/moses-support/attachments/20121228/5be86ad9/attachment-0001.htm
[5] http://stackoverflow.com/questions/14020240/why-does-perl-system-corrupt-the-redirected-path
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
