Prasanth - what is the exact lmplz command that was run by the EMS?
This works
.../lmplz --order 5 --text lm/europarl.lowercased.1 --arpa
lm/europarl.lmplz -T /tmp -S 1G
This doesn't
.../lmplz --order 5 --text lm/europarl.lowercased.1 --arpa
lm/europarl.lmplz -T /tmp -S 80%
It gives the error
util/usage.cc:220 in uint64_t util::<anonymous
namespace>::ParseNum(const std::string &) [Num = double] threw
SizeParseError because `!mem'.
Failed to parse 80% into a memory size because % was specified but the
physical memory size could not be determined.
However, it worked even with the source code from 4 days ago.
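
To make the failure mode concrete, here is a minimal sketch of how a size argument such as 1G or 80% could be turned into bytes. ParseSizeSketch is a hypothetical name, not the ParseNum function in util/usage.cc. The K/M/G multipliers (powers of 1024) and the handling of a bare number as bytes are assumptions made for illustration. The point is only that a percentage needs a nonzero physical memory size, while an absolute size such as 1G does not.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not KenLM's ParseNum): converts "1G" or "80%" to bytes.
// physical_memory == 0 means the platform probe could not determine the size.
uint64_t ParseSizeSketch(const std::string &arg, uint64_t physical_memory) {
  std::size_t used = 0;
  const double value = std::stod(arg, &used);  // throws if there is no leading number
  assert(used > 0);
  if (!std::isfinite(value) || value < 0.0)
    throw std::invalid_argument("size must be a finite, non-negative number");
  const std::string suffix = arg.substr(used);
  double bytes;
  if (suffix == "%") {
    // A percentage is meaningless without knowing the total.
    if (physical_memory == 0)
      throw std::runtime_error(
          "% was specified but the physical memory size could not be determined");
    bytes = value * static_cast<double>(physical_memory) / 100.0;
  } else if (suffix == "K") {
    bytes = value * 1024.0;
  } else if (suffix == "M") {
    bytes = value * 1024.0 * 1024.0;
  } else if (suffix == "G") {
    bytes = value * 1024.0 * 1024.0 * 1024.0;
  } else if (suffix.empty()) {
    bytes = value;  // assumption: a bare number means bytes
  } else {
    throw std::invalid_argument("unknown size suffix: " + suffix);
  }
  if (bytes >= 18446744073709551616.0) throw std::out_of_range("size too large");
  return static_cast<uint64_t>(bytes);
}
```

With this sketch, ParseSizeSketch("1G", 0) succeeds even when the probe fails, while ParseSizeSketch("80%", 0) throws, which matches the behaviour of the two commands above.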
On 25/11/2013 19:07, Kenneth Heafield wrote:
> Hi,
>
> I've taken a shot in the dark based on physmem.c to support physical
> memory estimation on BSD and OS X. Please clone
>
> github.com/kpu/kenlm
>
> and compile with
>
> ./bjam
>
> If that fails, please let Hieu and me know (maybe Hieu can help since he
> has OS X). If it doesn't fail, run
>
> bin/lmplz
>
> with no argument. The help message will include a line e.g.
>
> "This machine has 135224176640 bytes of memory."
>
> or
>
> "Unable to determine the amount of memory on this machine."
>
> If it works, then I'll push to Moses. Trying to not break Moses master
> for OS X.
>
> Kenneth
>
> On 11/24/13 22:40, Prasanth K wrote:
>> Hi Kenneth,
>>
>> Thanks for the clarification w.r.t. calculating the memory size. But I
>> am running these on a Mac (10.9 Mavericks). Do you think I should still
>> port the lmplz code to Mac for the estimation of probabilities?
>>
>> One thing though, I did change the default clang compiler that comes
>> with this new Mac to a gcc-4.8 (not sure that changes anything in this
>> context).
>>
>> - Prasanth
>>
>>
>>
>>
>> On Fri, Nov 22, 2013 at 6:50 PM, Kenneth Heafield wrote:
>>
>> Hi,
>>
>> What OS are you on? Cygwin? Apparently every OS reports memory size in a
>> different way:
>>
>>
>> http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/physmem.c;h=2629936146e3042f927523322f18aca76996cd7f;hb=HEAD
>>
>> The good news is that the above code is LGPLv2:
>>
>>
>> http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=modules/physmem;h=9644522e0493a85a9fb4ae7c4449741c2c1500ea;hb=HEAD
>>
>> But currently I'm just using this short function that will fail on some
>> platforms:
>>
>> uint64_t GuessPhysicalMemory() {
>> #if defined(_WIN32) || defined(_WIN64)
>>   return 0;
>> #elif defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
>>   long pages = sysconf(_SC_PHYS_PAGES);
>>   if (pages == -1) return 0;
>>   long page_size = sysconf(_SC_PAGESIZE);
>>   if (page_size == -1) return 0;
>>   return static_cast<uint64_t>(pages) * static_cast<uint64_t>(page_size);
>> #else
>>   return 0;
>> #endif
>> }
>>
>> If it fails, I just don't let users specify memory as a percentage. So
>> one thing to fix is putting physmem.{h,c} in util then changing
>> calls to GuessPhysicalMemory. But I'm also not a fan of the way the GNU
>> code gives up and makes up a number at the end.
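
As an illustration of the idea discussed here, the following is a minimal sketch of a probe that tries sysconf first and then falls back to the hw.memsize sysctl on OS X. GuessPhysicalMemorySketch is a hypothetical name and this is not the code that went into KenLM. It covers POSIX systems only (no Windows branch), the BSDs expose the value under different sysctl names that are not handled, and unlike the GNU code it returns 0 rather than making up a number.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unistd.h>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical variant of GuessPhysicalMemory for POSIX systems.
// Returns the physical memory size in bytes, or 0 if it cannot be determined.
uint64_t GuessPhysicalMemorySketch() {
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
  {
    const long pages = sysconf(_SC_PHYS_PAGES);
    const long page_size = sysconf(_SC_PAGESIZE);
    if (pages > 0 && page_size > 0)
      return static_cast<uint64_t>(pages) * static_cast<uint64_t>(page_size);
  }
#endif
#if defined(__APPLE__)
  {
    // hw.memsize reports the installed memory as a 64-bit byte count.
    int mib[2] = {CTL_HW, HW_MEMSIZE};
    uint64_t bytes = 0;
    std::size_t len = sizeof(bytes);
    if (sysctl(mib, 2, &bytes, &len, NULL, 0) == 0) {
      assert(len == sizeof(bytes));
      return bytes;
    }
  }
#endif
  return 0;  // Unknown: callers should then reject sizes given as a percentage.
}
```

Returning 0 on failure keeps the existing contract, so a caller can still refuse percentage sizes instead of working from a guessed total.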
>>
>> The second porting issue is that lmplz makes parallel use of pread,
>> pwrite, and write. Windows is unsafe in this regard (POSIX requires
>> that pread/pwrite not change the file pointer; Windows has no way to
>> implement that atomically). To fix this, we'll always specify the file
>> offset in cases that happen concurrently. Extend util/stream/io.* with
>> a PWrite class based on PWriteOrThrow then change FileBuffer to use
>> PWrite. Then I guess one should rename PReadOrThrow/PWriteOrThrow to
>> something that indicates they're not-quite-POSIX on windows. Also, the
>> macros in these functions should detect cygwin, bypassing cygwin's
>> "Function not implemented" and calling Windows APIs directly (they're
>> already there for _WIN32).
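
The "always specify the file offset" approach can be sketched as below for the POSIX case. PWriteAllSketch and DemoOutOfOrderWrites are hypothetical names, not the existing PWriteOrThrow or the proposed PWrite class, and the /tmp path is an assumption. The sketch only shows the property being relied on: each write carries its own offset, so concurrent writers do not depend on a shared file pointer. It does not address the Windows or cygwin behaviour described above.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <unistd.h>

// Hypothetical analogue of a PWriteOrThrow-style helper (POSIX only).
// Writes all `size` bytes at `offset` without moving the file pointer.
void PWriteAllSketch(int fd, const void *data, std::size_t size, uint64_t offset) {
  assert(fd >= 0);
  const char *cursor = static_cast<const char *>(data);
  while (size > 0) {
    const ssize_t wrote = pwrite(fd, cursor, size, static_cast<off_t>(offset));
    if (wrote < 0) {
      if (errno == EINTR) continue;  // interrupted: retry the same range
      throw std::runtime_error("pwrite failed");
    }
    if (wrote == 0) throw std::runtime_error("pwrite made no progress");
    cursor += wrote;
    size -= static_cast<std::size_t>(wrote);
    offset += static_cast<uint64_t>(wrote);
  }
}

// Usage sketch: chunks written out of order still land at their own offsets.
std::string DemoOutOfOrderWrites() {
  char path[] = "/tmp/pwrite_sketch_XXXXXX";
  const int fd = mkstemp(path);
  if (fd < 0) throw std::runtime_error("mkstemp failed");
  unlink(path);  // the file is removed once fd is closed
  PWriteAllSketch(fd, "world", 5, 5);
  PWriteAllSketch(fd, "hello", 5, 0);
  char buffer[10];
  const ssize_t got = pread(fd, buffer, sizeof(buffer), 0);
  close(fd);
  if (got != 10) throw std::runtime_error("short read");
  return std::string(buffer, 10);
}
```

The second chunk is written first on purpose: because every call names its offset, the order in which writers run does not affect where the bytes end up.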
>>
>> I don't have a windows box so I can say what should be changed at a high
>> level, but need an actual user to ensure it compiles and runs correctly.
>>
>> Kenneth
>>
>> On 11/22/13 06:49, Prasanth K wrote:
>> > Hi,
>> >
>> > I am trying to use KenLM for building a language model on the Europarl
>> > corpus. Following the instructions in
>> > (http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel#ntoc19),
>> > I added the few lines for getting KenLM to estimate the LM probabilities
>> > (order/n=5) to my config file for the EMS. Language model training dies
>> > with "Function not implemented" at the counting and sorting n-grams
>> > stage (the very first stage). Does this mean there is something wrong
>> > with my installation? Or is it just insufficient memory?
>> >
>> > Incidentally, when I started giving the amount of memory in terms of %
>> > (80%) there was an error "Failed to parse .. into memory size because
>> > physical memory size could not be determined". I am also curious why
>> > this happens?
>> >
>> > Kenneth, can you shed some light on this? Thanks.
>> >
>> > - Regards,
>> > Prasanth
>> >
>> >
>> >
>> > --
>> > "Theories have four stages of acceptance. i) this is worthless nonsense;
>> > ii) this is an interesting, but perverse, point of view, iii) this is
>> > true, but quite unimportant; iv) I always said so."
>> >
>> > --- J.B.S. Haldane
>> >
>> >
>> > _______________________________________________
>> > Moses-support mailing list
>> > [email protected] <mailto:[email protected]>
>> > http://mailman.mit.edu/mailman/listinfo/moses-support
>> >