Hi Tom

Yes, that sounds right. Moses writes its output in input order, so if one sentence takes a long time to process, nothing after it is output until that sentence has finished. And if all the other sentences finish first, there will be one thread left processing the long sentence.

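To illustrate why stdout pauses, here is a minimal Python sketch of an in-order output collector. It is not the Moses code itself: the class name `OrderedOutputCollector` and its `write` method are made up for illustration, and it runs single-threaded for simplicity (a real multi-threaded collector would need a lock around `write`).

```python
import io


class OrderedOutputCollector:
    """Hypothetical collector that emits results strictly in input order."""

    def __init__(self, out):
        self.out = out
        self.next_id = 0   # id of the next sentence allowed to be written
        self.pending = {}  # finished sentences waiting for an earlier one

    def write(self, sentence_id, text):
        # Store the finished sentence, then flush every consecutive
        # sentence that is now ready, starting from next_id.
        self.pending[sentence_id] = text
        while self.next_id in self.pending:
            self.out.write(self.pending.pop(self.next_id) + "\n")
            self.next_id += 1


out = io.StringIO()
collector = OrderedOutputCollector(out)

# Sentences 1 and 2 finish quickly, but sentence 0 is still being processed,
# so nothing can be written yet.
collector.write(1, "short sentence 1")
collector.write(2, "short sentence 2")
buffered_output = out.getvalue()

# Once the slow sentence 0 finishes, everything is flushed in order.
collector.write(0, "very long sentence 0")
final_output = out.getvalue()
```

In this sketch the early finishers sit in `pending` until the slow sentence arrives, which matches the pause you are seeing.
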
I was under the impression that there was an option to override this behaviour, i.e. to output sentences as they are completed, adding an id so that the user can reorder them. However, I couldn't find it, so perhaps it didn't make it into trunk. The option was needed by some of the MateCat team when integrating Moses into a CAT tool.

cheers - Barry

On 14/06/13 16:44, Tom Hoar wrote:
> I built a custom recaser model that tokenizes the sentence into
> characters. It works well but it is slower than other models. In one
> batch, there is an unexpectedly long source sentence. Line 1017 (of 2400
> lines) has 3,854 characters, which becomes 3,854 tokens as the Moses input.
>
> When it gets to this long line, all stdout pauses. Top shows Moses,
> configured to run on 6 cores, is running @ 598%. That's great. CPU
> processing continues at <600% CPU load for about 5-6 minutes. Then, it
> throttles back to "only" 100% load.
>
> Is it possible that when Moses drops to 100% load, it has finished
> processing the balance of the 2400 lines and the final thread continues
> processing that extremely long line?
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
