Hi Barry,
Very high lock contention - deadlock is the wrong word. With 48 threads, 'top' shows me roughly 120% processor load instead of 4800%. Actual translation speed, however, is far below single-threaded.
Yes, we are running an online system, filtering is not an option.
Bye, Marcin

20/9/2011, "Barry Haddow" <[email protected]> wrote:
>Hi Marcin
>
>That makes sense. I looked at the locking in FactorCollection recently and
>realised that it wasn't implemented correctly, although I didn't know that it
>had the potential for deadlock.
>
>Do you know if it's an actual deadlock that you're observing, or very high
>lock contention?
>
>btw - why aren't you filtering the phrase table? Are you running an online
>system where the source sentences are not given in advance?
>
>cheers - Barry
>
>On Tuesday 20 September 2011 11:22:49 Marcin Junczys-Dowmunt wrote:
>> Hi all,
>> By the way, I have found the place where the heavy locking is occurring.
>> It's the lock in
>>
>> FactorCollection::AddFactor
>>
>> When I simply and naively remove that one, everything works at full
>> throttle with 48 threads and nothing bad seems to be happening. With
>> these locks in place the deadlock occurs starting at around 20 threads,
>> regardless of whether the binary phrase table is used or the in-memory
>> version.
>>
>> The size of the phrase table is also a factor. With a small phrase table
>> filtered according to the given test set there are no deadlocks. Does that
>> make any sense?
>>
>> Bye,
>> Marcin
>>
>> 19/9/2011, "Barry Haddow" <[email protected]> wrote:
>> >Hi Marcin
>> >
>> >On Monday 19 September 2011 07:58:48 Marcin Junczys-Dowmunt wrote:
>> >> The binary implementation seems to become unusable with more than 10-12
>> >> threads. Speed drops as more threads are used until it nearly deadlocks
>> >> at around 30 threads. I am using a 48-core server with 512 GB RAM. Even
>> >> copying the binary phrase tables to a ramdisk does not solve the
>> >> problem. The behavior stays the same. The in-memory version works fine
>> >> with 48 threads, but uses nearly all our RAM.
>> >
>> >There's a shared cache for the on-disk phrase table, which is probably
>> >where the contention is coming from. I don't think disabling the cache
>> >would help, as in a large phrase table you'll have tens of thousands of
>> >translations of common words and punctuation, which you don't want to
>> >reload for every sentence. A per-thread cache may improve things.
>> >
>> >> Pruning is also not enough; our filtered phrase table still takes around
>> >> 300 GB when loaded into memory. I did not even dare to try and load the
>> >> unfiltered phrase table into memory :). But I will take a look at the
>> >> implementation from the marathon, thanks.
>> >
>> >I think Hieu was referring to this
>> >http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc16
>> >rather than filtering, which may be of some use. It's hard to imagine that
>> >a 500 GB phrase table doesn't contain a lot of noise. I'm surprised that
>> >filtering doesn't remove more, though - are you decoding large batches of
>> >sentences?
>> >
>> >> At the moment I am thinking about using a perfect hash function as an
>> >> index and keeping target phrases as packed strings in memory. That
>> >> should use about as much memory as a gzipped phrase table on disk. It
>> >> will be slower, but probably still faster than the binary version.
>> >
>> >Will look forward to seeing how you get on,
>> >
>> >cheers - Barry
>> >
>> >--
>> >The University of Edinburgh is a charitable body, registered in
>> >Scotland, with registration number SC005336.
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>--
>Barry Haddow
>University of Edinburgh
>+44 (0) 131 651 3173
>
>--
>The University of Edinburgh is a charitable body, registered in
>Scotland, with registration number SC005336.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
