Hi Marcin

That makes sense. I looked at the locking in FactorCollection recently and 
realised that it wasn't implemented correctly, although I didn't know that it 
had the potential for deadlock.
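
As an illustration only, one common pattern for a read-mostly interning
table is to take a shared lock for lookups and an exclusive lock only when
a new entry has to be inserted. The sketch below is not the Moses code:
InternTable, Add, Size and StressTest are hypothetical names, and it
assumes C++17 std::shared_mutex rather than whatever primitives
FactorCollection actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a factor-interning table; not the Moses class.
class InternTable {
public:
  // Returns a pointer to the single stored copy of `s`.
  const std::string* Add(const std::string& s) {
    {
      // Fast path: any number of threads may hold the shared lock at once.
      std::shared_lock<std::shared_mutex> read(mutex_);
      auto it = set_.find(s);
      if (it != set_.end()) return &*it;
    }
    // Slow path: exclusive lock. insert() looks the key up again, so a
    // thread that lost the race simply gets the existing element back.
    std::unique_lock<std::shared_mutex> write(mutex_);
    auto result = set_.insert(s);
    assert(result.first != set_.end());
    // Element addresses in std::unordered_set survive rehashing.
    return &*result.first;
  }

  std::size_t Size() const {
    std::shared_lock<std::shared_mutex> read(mutex_);
    return set_.size();
  }

private:
  mutable std::shared_mutex mutex_;
  std::unordered_set<std::string> set_;
};

// Hammers the table from `n_threads` threads with the same small vocabulary
// and returns the number of distinct entries stored afterwards.
std::size_t StressTest(InternTable& table, int n_threads, int vocab) {
  std::vector<std::thread> workers;
  for (int t = 0; t < n_threads; ++t) {
    workers.emplace_back([&table, vocab] {
      for (int i = 0; i < 1000; ++i)
        table.Add("w" + std::to_string(i % vocab));
    });
  }
  for (auto& w : workers) w.join();
  return table.Size();
}
```

Since only one mutex is involved and it is never held across a call that
could re-enter the table, this pattern cannot deadlock on its own. Under a
heavy insert load it can still show high contention on the exclusive lock.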

Do you know if it's an actual deadlock that you're observing, or very high 
lock contention?

btw - why aren't you filtering the phrase table? Are you running an online 
system where the source sentences are not given in advance?

cheers - Barry

On Tuesday 20 September 2011 11:22:49 Marcin Junczys-Dowmunt wrote:
> Hello all,
> by the way, I have found the place where the heavy locking is occurring.
> It's the lock in
> 
> FactorCollection::AddFactor
> 
> When I simply and naively remove that one, everything runs at full
> throttle with 48 threads and nothing bad seems to happen. With
> this lock in place, the deadlock occurs starting at around 20 threads,
> regardless of whether the binary phrase table or the in-memory
> version is used.
> 
> The size of the phrase table is also a factor. With a small phrase table
> filtered according to a given test set there are no deadlocks. Does that
> make any sense?
> 
> Bye,
> Marcin
> 
> 19/9/2011, "Barry Haddow" <[email protected]> wrote:
> >Hi Marcin
> >
> >On Monday 19 September 2011 07:58:48 Marcin Junczys-Dowmunt wrote:
> >> The binary implementation seems to become unusable with more than 10-12
> >> threads. Speed drops as more threads are used until it nearly deadlocks
> >> at around 30 threads. I am using a 48-core server with 512 GB ram. Even
> >> copying the binary phrase tables to a ramdisk does not solve the
> >> problem. The behavior stays the same. The in-memory version works fine
> >> with 48 threads, but uses nearly all our ram.
> >
> >There's a shared cache for the on-disk phrase table, which is probably
> > where the contention is coming from. I don't think disabling the cache
> > would help as in a large phrase table you'll have 10s of 1000s of
> > translations of common words and punctuation, which you don't want to
> > reload for every sentence. A per-thread cache may improve things.
> >
> >> Pruning is also not enough: our filtered phrase table still takes around
> >> 300 GB when loaded into memory. I did not even dare to try loading the
> >> unfiltered phrase table into memory :). But I will take a look at the
> >> implementation from the marathon, thanks.
> >
> >I think Hieu was referring to this
> >http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc16
> >rather than filtering, which may be of some use. It's hard to imagine that
> > a 500G phrase table doesn't contain a lot of noise. I'm surprised that
> > filtering doesn't remove more though - are you decoding large batches of
> > sentences?
> >
> >> At the moment I am thinking about using a perfect hash function as an
> >> index and keeping target phrases as packed strings in memory. That
> >> should use about as much memory as a gzipped phrase table on disk. It
> >> will be slower, but probably still faster than the binary
> >> version.
> >
> >Will look forward to seeing how you get on,
> >
> >cheers - Barry
> >
> >--
> >The University of Edinburgh is a charitable body, registered in
> >Scotland, with registration number SC005336.
> 
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
> 
 
--
Barry Haddow
University of Edinburgh
+44 (0) 131 651 3173

