Maybe someone will correct me, but if I am not wrong, the gzipped version
already calculates the future score while loading (i.e. each phrase is
scored by the language model at load time). The compact phrase table
cannot do this during loading and has to do it on-line instead. That
would explain the slow speed. I suppose your phrase table has not been
pruned? If so, function words like "the" can have hundreds of thousands
of counterparts that need to be scored by the LM during collection.
You can limit your phrase table using Barry's prunePhraseTable tool.
With this you can limit it to, say, the 20 best phrases (this corresponds
to the ttable limit) and only score these 20 phrases during collection. That
should be orders of magnitude faster.
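
To illustrate the idea only (this is not Barry's prunePhraseTable, just a minimal Python sketch), the following keeps the n best target phrases per source phrase from a text-format phrase table with " ||| "-separated fields. It assumes the lines are already grouped by source phrase and ranks by a single score column, which is a simplification; the names prune_top_n and score_index are made up for this example.

```python
from itertools import groupby


def prune_top_n(lines, n=20, score_index=2):
    """Keep the n highest-scoring entries for each source phrase.

    Assumes Moses-style text lines 'src ||| tgt ||| s1 s2 ... ||| ...'
    that are already grouped by source phrase. Ranking by one score
    column (score_index) is a simplification made for this sketch.
    """
    def source(line):
        return line.split(" ||| ")[0]

    def score(line):
        return float(line.split(" ||| ")[2].split()[score_index])

    pruned = []
    for _, group in groupby(lines, key=source):
        # Sort each source phrase's candidates, best first, and cut at n.
        pruned.extend(sorted(group, key=score, reverse=True)[:n])
    return pruned


# Toy table with invented scores, only to show the effect of the limit.
table = [
    "the ||| der ||| 0.5 0.4 0.6 0.3",
    "the ||| die ||| 0.3 0.2 0.7 0.4",
    "the ||| das ||| 0.1 0.1 0.2 0.1",
    "house ||| Haus ||| 0.9 0.8 0.9 0.8",
]
for line in prune_top_n(table, n=2):
    print(line)
```

With n=2 the toy entry for "das" is dropped, so only two candidates for "the" would remain to be scored during collection.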
Best,
Marcin
On 11.03.2015 at 20:12, Jesús González Rubio wrote:
Thanks for the quick response, I will try as you suggest.
Nevertheless, my main concern is the time spent collecting options. Is
the observed difference with respect to the gzip'ed tables normal? If
the tables are cached, shouldn't the timings be closer?
2015-03-11 18:52 GMT+00:00 Marcin Junczys-Dowmunt:
Hi,
Try measuring the differences again after a full system reboot
(fresh reboot before each measurement) or after purging OS
read/write caches. Your phrase tables are most likely cached,
which means they are in fact in memory.
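
On Linux, purging the page cache between measurements can be sketched as below. Writing 3 to /proc/sys/vm/drop_caches is the kernel's interface for this and needs root; the helper name drop_page_cache and its path parameter are only illustrative, and other operating systems need a different mechanism.

```python
import os


def drop_page_cache(path="/proc/sys/vm/drop_caches"):
    """Flush dirty pages, then ask the Linux kernel to drop clean caches.

    Needs root when pointed at the real /proc file. The path is a
    parameter only so the helper can be tried out on an ordinary file.
    """
    os.sync()  # write pending data to disk first so those pages can be freed
    with open(path, "w") as f:
        f.write("3\n")  # 3 = page cache plus dentries and inodes


# Intended use (as root), before each timing run:
#     drop_page_cache()
```

Calling it before every run should make the on-disk and in-memory timings comparable without a full reboot.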
Best,
Marcin
On 11.03.2015 at 19:31, Jesús González Rubio wrote:
Hi,
I'm obtaining some unintuitive timing results when using compact
phrase tables. The average translation time per sentence is much
higher for them in comparison to using gzip'ed phrase tables.
Particularly important is the difference in time required to
collect the options. This table summarizes the timings (in seconds):
                    Compact           Gzip'ed
               on-disk  in-memory
Init:              5.9        6.3      1882.8
Per-sentence:
 - Collect:        5.9        5.8         0.2
 - Search:         1.6        1.6         3.3
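
As a rough back-of-the-envelope reading of these numbers, the Python sketch below estimates after how many sentences the long start-up of the gzip'ed table is paid back by its faster per-sentence time. It assumes the per-sentence cost is just Collect plus Search (any other stage is ignored) and uses the on-disk compact column; the variable names are illustrative.

```python
# Timings in seconds, copied from the table above.
compact_init, compact_collect, compact_search = 5.9, 5.9, 1.6  # on-disk
gzip_init, gzip_collect, gzip_search = 1882.8, 0.2, 3.3

# Assumption: per-sentence cost is Collect + Search only.
compact_per_sentence = compact_collect + compact_search
gzip_per_sentence = gzip_collect + gzip_search

# Number of sentences at which both set-ups have used the same total time.
break_even = (gzip_init - compact_init) / (compact_per_sentence - gzip_per_sentence)
print(f"break-even after about {break_even:.0f} sentences")
```

Under these assumptions the compact table is still ahead in total wall-clock time for runs of up to a few hundred sentences, even though each sentence is slower.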
Results in the table were computed using Moses v2.1 with a
single thread (-th 1), but I've seen similar results using the
pre-compiled binary for Moses v3.0. The model comprises two
phrase-tables (~2G and ~3M), two lexicalized reordering tables
(~700M and ~1M) and two language models (~31G and ~38M). You can
see the exact configuration in the attached moses.ini file.
Interestingly, there is virtually no difference for the compact
table between the on-disk and in-memory options.
Additionally, timings were higher for the initial sentences in
both cases, which I think should not happen with the in-memory
option.
Could it be that the in-memory option of compact tables
(-minpht-memory -minlexr-memory) is not working properly?
Cheers.
--
Jesús
_______________________________________________
Moses-support mailing list
[email protected] <mailto:[email protected]>
http://mailman.mit.edu/mailman/listinfo/moses-support