Re: [Moses-support] In-memory loading of compact phrases

Marcin Junczys-Dowmunt Wed, 11 Mar 2015 12:25:17 -0700

Maybe someone will correct me, but if I am not wrong, the gzipped version already calculates the future score while loading (i.e. the phrase is scored by the language model at load time). The compact phrase table cannot do this during loading and does it on-line instead. That would be the reason for the slow speed. I suppose your phrase table has not been pruned? So, for instance, function words like "the" can have hundreds of thousands of counterparts that need to be scored by the LM during collection.

You can limit your phrase table using Barry's prunePhraseTable tool. With this you can limit it to, say, the 20 best phrases (this corresponds to the ttable limit) and score only those 20 phrases during collection. That should be orders of magnitude faster.
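
To illustrate the idea of keeping only the N best translation options per source phrase, here is a minimal Python sketch. It is not the prunePhraseTable tool and does not reproduce its behaviour. It assumes the plain-text phrase table layout "source ||| target ||| scores ..." and ranks candidates by a single score column; the function name prune_phrase_table and the choice of column (score_index) are hypothetical, whereas Moses itself would rank by the weighted model score.

```python
from collections import defaultdict
import heapq


def prune_phrase_table(lines, limit=20, score_index=2):
    """Keep the `limit` best target phrases for each source phrase.

    Assumption: each line looks like 'src ||| tgt ||| s1 s2 s3 s4 ...'
    and candidates are ranked by one score column (hypothetical choice).
    """
    by_source = defaultdict(list)
    for line in lines:
        fields = [f.strip() for f in line.split("|||")]
        if len(fields) < 3:
            continue  # skip malformed lines
        scores = [float(s) for s in fields[2].split()]
        by_source[fields[0]].append((scores[score_index], line))

    pruned = []
    for entries in by_source.values():
        # nlargest returns the best entries in descending score order
        best = heapq.nlargest(limit, entries, key=lambda e: e[0])
        pruned.extend(line for _, line in best)
    return pruned


if __name__ == "__main__":
    table = [
        "the ||| der ||| 0.1 0.1 0.5 0.1",
        "the ||| die ||| 0.1 0.1 0.3 0.1",
        "the ||| das ||| 0.1 0.1 0.2 0.1",
    ]
    for line in prune_phrase_table(table, limit=2):
        print(line)
```

With a limit of 20, a function word with hundreds of thousands of candidates would leave only 20 entries to be scored by the LM during collection.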


Best,
Marcin

On 11.03.2015 at 20:12, Jesús González Rubio wrote:

Thanks for the quick response, I will try as you suggest.

Nevertheless, my main concern is the time spent collecting options. Is the difference observed with respect to the gzip'ed tables normal? With the tables cached, shouldn't the timings be closer?

2015-03-11 18:52 GMT+00:00 Marcin Junczys-Dowmunt <[email protected]>:


    Hi,
    Try measuring the differences again after a full system reboot
    (a fresh reboot before each measurement) or after purging the OS
    read/write caches. Your phrase tables are most likely cached,
    which means they are in fact in memory.
    Best,
    Marcin

    On 11.03.2015 at 19:31, Jesús González Rubio wrote:

    Hi,

    I'm obtaining some unintuitive timing results when using compact
    phrase tables. The average translation time per sentence is much
    higher for them in comparison to using gzip'ed phrase tables.
    Particularly important is the difference in time required to
    collect the options. This table summarizes the timings (in seconds):

                         Compact          Gzip'ed
                    on-disk  in-memory
    Init:               5.9        6.3     1882.8
    Per-sentence:
     - Collect:         5.9        5.8        0.2
     - Search:          1.6        1.6        3.3

    Results in the table were computed using Moses v2.1 with a
    single thread (-th 1), but I've seen similar results using the
    pre-compiled binary for Moses v3.0. The model comprises two
    phrase-tables (~2G and ~3M), two lexicalized reordering tables
    (~700M and ~1M) and two language models (~31G and ~38M). You can
    see the exact configuration in the attached moses.ini file.

    Interestingly, there is virtually no difference for the compact
    table between the on-disk and in-memory options.
    Additionally, timings were higher for the initial sentences in
    both cases, which I think should not happen with the in-memory
    option.

    Could it be that the in-memory option of compact tables
    (-minpht-memory -minlexr-memory) is not working properly?

    Cheers.

--Jesús



    _______________________________________________
    Moses-support mailing list
    [email protected]  <mailto:[email protected]>
    http://mailman.mit.edu/mailman/listinfo/moses-support



