Awesome! Thanks, we will keep in touch.

On Mar 16, 2017 16:49, "Hieu Hoang" <[email protected]> wrote:

>
>
> * Looking for MT/NLP opportunities *
> Hieu Hoang
> http://moses-smt.org/
>
>
> On 16 March 2017 at 15:23, Ivan Zapreev <[email protected]> wrote:
>
>> Dear Hieu,
>>
>> Thank you for your time and the clear answers.
>>
>> >>> ok, if you are using plain text models, then this is the first
>> difference from my results. I don't optimize the plain text models: I don't
>> want to wait for 13 minutes, and this is not how most users use the decoder.
>>
>> Well, perhaps, but for me as a user it should not matter: the models are
>> loaded just once, when the system is started. So in production this is not a
>> big issue, unless one starts things up and shuts them down frequently.
>>
> Good point. It comes down to how you're using the decoder and your
> priorities.
>
> For completeness, please evaluate with binary models. I'll test my models
> with plain text models too.
>
>
>>
>>
>> >>> But if you see bad scaling in Moses2 with binary models, please let
>> me know
>>
>> Isn't it the case that when the text models are loaded into memory they
>> get binarized, so that the binary model is just a memory snapshot of the
>> loaded text model? I would expect so, because otherwise one would need
>> different data structures for storing binary and text models, and that
>> would also mean different execution times for the two model types. So I
>> suspect that using a text model should only influence the loading time of
>> the model but not the decoding times...
>>
>> >>> If you do load the plain text files, you should check that it doesn't
>> use up all memory and have to swap to disk.
>>
>> I have 5 times more memory than all the models take together, so
>> swapping is not an issue.
>>
>> >>> I know it's faster on 1 thread, it should be much faster on lots of
>> threads.
>>
>> Yes, I was expecting the same, but somehow I get a different trend,
>> which is why I am asking.
>>
>> >>>  no, just pre-load the binary files
>>
>> Ok, thanks!
>>
>> Kind regards,
>>
>> Dr. Ivan S. Zapreev
>>
>>
>>
>>> On Thu, Mar 16, 2017 at 3:11 PM, Hieu Hoang <[email protected]> wrote:
>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 16 March 2017 at 13:16, Ivan Zapreev <[email protected]> wrote:
>>>>
>>>>> Dear Hieu,
>>>>>
>>>>> Thank you for a prompt and detailed reply!
>>>>>
>>>>> >>> So your server has 20 cores (40 hyperthreads) and 16GB RAM? If
>>>>> that's correct, then the RAM size would be a problem - you need as much 
>>>>> RAM
>>>>> as the total size of your models, plus more for working memory and the OS.
>>>>>
>>>>> The amount of memory is 256 GB, not 16; there are a number of 16 GB
>>>>> modules installed. To my knowledge the machine is not hyperthreaded but
>>>>> simply has 40 cores, although I am now getting a bit doubtful about that.
>>>>>
>>>> 256 GB is good. Whether it is 20 cores or 40 hyperthreads is not
>>>> important for the moment, but you should find out exactly what it is.
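On Linux, one quick way to find out whether those 40 CPUs are physical cores or hyperthreads is to compare the logical CPU count with the number of distinct physical cores (a sketch; `lscpu` output varies by distribution):

```shell
# Sketch: compare logical CPUs against distinct physical cores on Linux.
logical=$(nproc)
physical=$(lscpu -p=CORE,SOCKET 2>/dev/null | grep -v '^#' | sort -u | wc -l)
echo "logical CPUs: $logical, physical cores: $physical"
# If logical is twice physical (e.g. 40 vs 20), hyperthreading is enabled.
```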
>>>>
>>>>
>>>>>
>>>>> >> Do you run Moses command line, or the server? My timings are based
>>>>> on the command line, the server is a little slower.
>>>>>
>>>>> Both Moses and Moses2 are run in console mode (not as a server). The
>>>>> model loading time is excluded from the measurements. I could not manage
>>>>> to get the asynchronous XML-RPC to work, so for my experiments it would
>>>>> have been as if I used Moses/Moses2 in single-threaded mode; therefore I
>>>>> used the command-line version.
>>>>>
>>>>> >>> Do you run Moses directly, or is another evaluation process
>>>>> running it? Are you sure that evaluation process is working as it should?
>>>>>
>>>>> Moses is run from the command line under the Linux "time" command, and
>>>>> so are the other systems we used in the comparison. We look at the
>>>>> wall-clock runtime rather than the CPU time, but we perform a number of
>>>>> experiments to measure the average times and control the standard
>>>>> deviations.
>>>>>
>>>>> >>> Do you minimise the effect of disk reads by pre-loading the models
>>>>> into the filesystem cache? This is usually done by running
>>>>> cat [binary model files] > /dev/null
>>>>> before running the decoder.
>>>>>
>>>>> No, we did not do pre-loading for any of the tools, but perhaps this
>>>>> is not an issue, as we measure the average model loading times and
>>>>> subtract them from the average run time including decoding; so the model
>>>>> loading times are excluded from the results. Our goal was to measure and
>>>>> compare the decoding times and how they scale with the number of threads.
>>>>>
>>>> You're using the Probing phrase-table with the integrated reordering
>>>> model, and a binary KenLM, right?
>>>>
>>>> If so, the loading time will be minimal (1-2 secs) since the binary
>>>> format just memory-maps the data but doesn't actually load it into
>>>> memory. However, the overwhelming majority of the decoding time will be
>>>> spent on page faults during random LM and phrase-table lookups.
>>>>
>>>> It would be no surprise if decoding speeds for Moses and Moses2 were
>>>> similar without pre-loading - they are looking up the same data.
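The pre-loading step in question amounts to reading each binary model file once so that its pages land in the OS filesystem cache; the helper below is a sketch, and the file names in the example comment are placeholders:

```shell
# Sketch: warm the filesystem cache by reading each binary model file once.
# Subsequent mmap lookups by the decoder then hit RAM instead of
# page-faulting to disk.
preload() {
    for f in "$@"; do
        cat "$f" > /dev/null
    done
}

# Example (placeholder file names):
# preload phrase-table.minphr lm.binlm reordering-table.minlexr
```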
>>>>
>>>>
>>>>>
>>>>> >>> it may take a while, but I can't replicate your results without
>>>>> it. Alternatively, I can provide you with my models so you can try &
>>>>> replicate my results.
>>>>>
>>>>> The experiments are run on an internal server which is not visible
>>>>> from outside. I shall explore the possibilities of sharing the models,
>>>>> but I am doubtful it is possible: the university network is very
>>>>> restricted. Yet, I am definitely open to re-running your experiments, if
>>>>> possible.
>>>>>
>>>> I can make it available, but the results will be the same unless you
>>>> sort out your pre-loading.
>>>>
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> Ivan
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Ivan
>>>
>>
>>
>>
>>
>>
>
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
