Maybe someone will correct me, but if I am not wrong, the gzip'ed version already calculates the future score while loading (i.e. each phrase is scored by the language model at load time). The compact phrase table cannot do this during loading, so it does it on-line, during translation. This is probably the reason for the slow speed. I suppose your phrase table has not been pruned? If so, function words like "the" can have hundreds of thousands of counterparts that all need to be scored by the LM during collection.

You can limit your phrase table using Barry's prunePhraseTable tool. With it you can limit the table to, say, the 20 best phrases per source phrase (corresponding to the ttable limit), so that only these 20 phrases are scored during collection. That should be orders of magnitude faster.
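
To illustrate the idea only (this is not the prunePhraseTable tool itself), here is a minimal Python sketch of keeping the N best target phrases per source phrase. It assumes the usual "source ||| target ||| scores" text format with strictly positive scores. The function name and the uniform default weights are hypothetical; a real pruner would presumably rank entries with the feature weights from moses.ini.

```python
import math
from collections import defaultdict

def prune_phrase_table(lines, limit=20, weights=None):
    """Keep the `limit` best-scoring entries per source phrase.

    Hypothetical helper: ranks entries by a weighted sum of log scores.
    Assumes every score is strictly positive.
    """
    by_source = defaultdict(list)
    for line in lines:
        # Fields are separated by "|||": source, target, scores, ...
        fields = [f.strip() for f in line.split("|||")]
        source = fields[0]
        scores = [float(s) for s in fields[2].split()]
        w = weights or [1.0] * len(scores)  # assumption: uniform weights
        total = sum(wi * math.log(s) for wi, s in zip(w, scores))
        by_source[source].append((total, line))

    pruned = []
    for entries in by_source.values():
        # Best score first; ties keep their original order (stable sort).
        entries.sort(key=lambda e: e[0], reverse=True)
        pruned.extend(line for _, line in entries[:limit])
    return pruned
```

With limit=20, a source phrase such as "the" would contribute at most 20 entries to be scored by the LM during collection, instead of every counterpart in the unpruned table.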

Best,
Marcin

On 11.03.2015 at 20:12, Jesús González Rubio wrote:
Thanks for the quick response, I will try as you suggest.

Nevertheless, my main concern is the time spent collecting options. Is the observed difference with respect to the gzip'ed tables normal? Given that the tables are cached, shouldn't the timings be closer?

2015-03-11 18:52 GMT+00:00 Marcin Junczys-Dowmunt <[email protected]>:

    Hi,
    Try measuring the differences again after a full system reboot
    (a fresh reboot before each measurement) or after purging the OS
    read/write caches. Your phrase tables are most likely cached,
    which means they are in fact in memory.
    Best,
    Marcin

    On 11.03.2015 at 19:31, Jesús González Rubio wrote:
    Hi,

    I'm obtaining some unintuitive timing results when using compact
    phrase tables. The average translation time per sentence is much
    higher for them in comparison to using gzip'ed phrase tables.
    Particularly important is the difference in time required to
    collect the options. This table summarizes the timings (in seconds):

                       Compact      Gzip'ed
                  on-disk  in-memory
    Init:           5.9       6.3    1882.8
    Per-sentence:
     - Collect:     5.9       5.8       0.2
     - Search:      1.6       1.6       3.3

    Results in the table were computed using Moses v2.1 with a
    single thread (-th 1), but I've seen similar results using the
    pre-compiled binary for Moses v3.0. The model comprises two
    phrase-tables (~2G and ~3M), two lexicalized reordering tables
    (~700M and ~1M) and two language models (~31G and ~38M). You can
    see the exact configuration in the attached moses.ini file.

    Interestingly, there is virtually no difference for the compact
    table between the on-disk and in-memory options.
    Additionally, timings were higher for the initial sentences in
    both cases, which I think should not be the case for the in-memory
    option.

    Could it be that the in-memory option of compact tables
    (-minpht-memory -minlexr-memory) is not working properly?

    Cheers.
-- Jesús


    _______________________________________________
    Moses-support mailing list
    [email protected]  <mailto:[email protected]>
    http://mailman.mit.edu/mailman/listinfo/moses-support

