I took a look at the existing FactorCollection code and it made me cry, so I rewrote it for revision 4242, including a better locking strategy.
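
For anyone following the thread, the general idea behind reducing contention on a lookup-or-insert path such as FactorCollection::AddFactor can be sketched as below. This is not the code from revision 4242. It is a minimal illustration that assumes a reader-writer lock (std::shared_mutex from C++17, chosen here only for self-containment) and a hypothetical FactorPool class standing in for FactorCollection. Most calls find an existing entry and only take the shared lock, so they no longer serialise behind one another.

```cpp
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for FactorCollection; all names are illustrative.
class FactorPool {
 public:
  // Interns s and returns a pointer to the stored copy.
  // Pointers to unordered_set elements stay valid across rehashing.
  const std::string* Add(const std::string& s) {
    {
      // Fast path: any number of threads may hold the shared lock together.
      std::shared_lock<std::shared_mutex> read(mutex_);
      auto it = pool_.find(s);
      if (it != pool_.end()) return &*it;
    }
    // Slow path: exclusive lock. insert() checks again, so if another
    // thread added s in the meantime we simply get the existing element.
    std::unique_lock<std::shared_mutex> write(mutex_);
    return &*pool_.insert(s).first;
  }

  std::size_t Size() const {
    std::shared_lock<std::shared_mutex> read(mutex_);
    return pool_.size();
  }

 private:
  mutable std::shared_mutex mutex_;
  std::unordered_set<std::string> pool_;
};
```

With a large phrase table nearly every call hits the fast path once the common factors are loaded, which is where a single exclusive lock hurts most.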

On 09/20/11 12:10, Marcin Junczys-Dowmunt wrote:
> Hi Barry,
> very high lock contention. Deadlock is the wrong word. With 48 threads
> 'top' shows me roughly 120% of processor load instead of 4800%. Actual
> translation speed however is far below single thread.
>
> Yes, we are running an online system, filtering is not an option.
> Bye,
> Marcin
>
> 20/9/2011, "Barry Haddow" <[email protected]> wrote:
>
>> Hi Marcin
>>
>> That makes sense. I looked at the locking in FactorCollection recently and
>> realised that it wasn't implemented correctly, although I didn't know that
>> it had the potential for deadlock.
>>
>> Do you know if it's an actual deadlock that you're observing, or very high
>> lock contention?
>>
>> btw - why aren't you filtering the phrase table? Are you running an online
>> system where the source sentences are not given in advance?
>>
>> cheers - Barry
>>
>> On Tuesday 20 September 2011 11:22:49 Marcin Junczys-Dowmunt wrote:
>>> Hi all,
>>> by the way, I have found the place where the heavy locking is occurring.
>>> It's the lock in
>>>
>>> FactorCollection::AddFactor
>>>
>>> When I simply and naively remove that one, everything works on full
>>> throttle with 48 threads and nothing bad seems to be happening. With
>>> this lock in place the deadlock occurs starting with around 20 threads
>>> regardless whether the binary phrase table is used or the in-memory
>>> version.
>>>
>>> The size of the phrase table is also a factor. With a small phrase table
>>> filtered according to given test set there are no deadlocks. Does that
>>> make any sense?
>>>
>>> Bye,
>>> Marcin
>>>
>>> 19/9/2011, "Barry Haddow" <[email protected]> wrote:
>>>> Hi Marcin
>>>>
>>>> On Monday 19 September 2011 07:58:48 Marcin Junczys-Dowmunt wrote:
>>>>> The binary implementation seems to become unusable with more than 10-12
>>>>> threads. Speed drops as more threads are used until it nearly deadlocks
>>>>> at around 30 threads. I am using a 48-core server with 512 GB ram. Even
>>>>> copying the binary phrase tables to a ramdisk does not solve the
>>>>> problem. The behavior stays the same. The in-memory version works fine
>>>>> with 48 threads, but uses nearly all our ram.
>>>> There's a shared cache for the on-disk phrase table, which is probably
>>>> where the contention is coming from. I don't think disabling the cache
>>>> would help as in a large phrase table you'll have 10s of 1000s of
>>>> translations of common words and punctuation, which you don't want to
>>>> reload for every sentence. A per-thread cache may improve things.
>>>>
>>>>> Pruning is also not enough, our filtered phrase table still takes around
>>>>> 300 GB when loaded into memory, I did not even dare to try and load the
>>>>> unfiltered phrase-table into memory :). But I will take a look at the
>>>>> implementation from the marathon, thanks.
>>>> I think Hieu was referring to this
>>>> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc16
>>>> rather than filtering, which may be of some use. It's hard to imagine that
>>>> a 500G phrase table doesn't contain a lot of noise. I'm surprised that
>>>> filtering doesn't remove more though - are you decoding large batches of
>>>> sentences?
>>>>
>>>>> At the moment I am thinking about using a perfect hash function as an
>>>>> index and keeping target phrases as packed strings in memory. That
>>>>> should use about as much memory as a gzipped phrase table on disk, it
>>>>> will be slower though, but probably still faster than the binary
>>>>> version.
>>>> Will look forward to seeing how you get on,
>>>>
>>>> cheers - Barry
>>>>
>>>> --
>>>> The University of Edinburgh is a charitable body, registered in
>>>> Scotland, with registration number SC005336.
>>
>> --
>> Barry Haddow
>> University of Edinburgh
>> +44 (0) 131 651 3173

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
