Good point.  I now blame this code from
moses/TranslationModel/CompactPT/TargetPhraseCollectionCache.h

Looks like a case for a concurrent fixed-size hash table.  Failing that,
banded locks instead of a single lock?  Namely an array of hash tables,
each of which is independently locked.

  /** store translations for source phrase in persistent cache **/
  void Cache(const Phrase &sourcePhrase, TargetPhraseVectorPtr tpv,
             size_t bitsLeft = 0, size_t maxRank = 0) {
#ifdef WITH_THREADS
    boost::mutex::scoped_lock lock(m_mutex);
#endif

    // check if source phrase is already in cache
    iterator it = m_phraseCache.find(sourcePhrase);
    if(it != m_phraseCache.end())
      // if found, just update clock
      it->second.m_clock = clock();
    else {
      // else, add to cache
      if(maxRank && tpv->size() > maxRank) {
        TargetPhraseVectorPtr tpv_temp(new TargetPhraseVector());
        tpv_temp->resize(maxRank);
        std::copy(tpv->begin(), tpv->begin() + maxRank, tpv_temp->begin());
        m_phraseCache[sourcePhrase] = LastUsed(clock(), tpv_temp, bitsLeft);
      } else
        m_phraseCache[sourcePhrase] = LastUsed(clock(), tpv, bitsLeft);
    }
  }

  std::pair<TargetPhraseVectorPtr, size_t>
  Retrieve(const Phrase &sourcePhrase) {
#ifdef WITH_THREADS
    boost::mutex::scoped_lock lock(m_mutex);
#endif

    iterator it = m_phraseCache.find(sourcePhrase);
    if(it != m_phraseCache.end()) {
      LastUsed &lu = it->second;
      lu.m_clock = clock();
      return std::make_pair(lu.m_tpv, lu.m_bitsLeft);
    } else
      return std::make_pair(TargetPhraseVectorPtr(), 0);
  }
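
For concreteness, here is a minimal sketch of the banded-lock idea: an
array of hash tables, each with its own mutex, where the band is picked
by hashing the source phrase.  This is not Moses code.  The names
BandedPhraseCache, PhraseKey, TpvPtr and the default of 64 bands are
made up for illustration; std::string and std::vector stand in for
Phrase and TargetPhraseVector; and std::mutex replaces boost::mutex.
The maxRank truncation and any cache pruning are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-ins for the Moses types (assumptions, not the real definitions).
typedef std::string PhraseKey;
typedef std::shared_ptr<std::vector<std::string> > TpvPtr;

class BandedPhraseCache {
public:
  // numBands is a hypothetical tuning knob; more bands means less contention.
  explicit BandedPhraseCache(std::size_t numBands = 64) : m_bands(numBands) {
    assert(numBands > 0);
  }

  // Add the entry if absent, otherwise just refresh its clock.
  void Cache(const PhraseKey &source, TpvPtr tpv, std::size_t bitsLeft = 0) {
    Band &band = BandFor(source);
    std::lock_guard<std::mutex> lock(band.mutex);  // locks this band only
    auto it = band.map.find(source);
    if (it != band.map.end())
      it->second.clock = std::clock();
    else
      band.map.emplace(source, Entry{std::clock(), tpv, bitsLeft});
  }

  // Return (translations, bitsLeft), or (null pointer, 0) on a miss.
  std::pair<TpvPtr, std::size_t> Retrieve(const PhraseKey &source) {
    Band &band = BandFor(source);
    std::lock_guard<std::mutex> lock(band.mutex);
    auto it = band.map.find(source);
    if (it == band.map.end())
      return std::make_pair(TpvPtr(), std::size_t(0));
    it->second.clock = std::clock();
    return std::make_pair(it->second.tpv, it->second.bitsLeft);
  }

private:
  struct Entry {
    std::clock_t clock;
    TpvPtr tpv;
    std::size_t bitsLeft;
  };

  // One independently locked hash table.
  struct Band {
    std::mutex mutex;
    std::unordered_map<PhraseKey, Entry> map;
  };

  // The same key always maps to the same band, so lookups stay consistent.
  Band &BandFor(const PhraseKey &source) {
    return m_bands[std::hash<PhraseKey>()(source) % m_bands.size()];
  }

  std::vector<Band> m_bands;
};
```

Threads that touch different bands never wait on each other; only
threads whose phrases hash to the same band contend.  Whether that is
enough to fix the scaling here would have to be measured.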



On 10/08/2015 08:39 PM, Marcin Junczys-Dowmunt wrote:
> How is probing-pt avoiding the same problem then?
> 
> On 08.10.2015 at 21:36, Kenneth Heafield wrote:
>> There's a ton of object/malloc churn in creating Moses::TargetPhrase
>> objects, most of which are thrown away.  If PhraseDictionaryMemory
>> (which creates and keeps the objects) scales better than CompactPT,
>> that's the first thing I'd optimize.
>>
>> On 10/08/2015 08:30 PM, Marcin Junczys-Dowmunt wrote:
>>> We did quite a bit of experimenting with that; usually there is hardly
>>> any measurable quality loss until you get below 1000. Good enough for
>>> deployment systems. It seems, however, that you can get up to a 0.4 BLEU
>>> increase when going really high (about 5000 and beyond) with larger
>>> distortion limits. But that's rather uninteresting for commercial
>>> applications.
>>>
>>> On 08.10.2015 at 21:24, Michael Denkowski wrote:
>>>> Hi Vincent,
>>>>
>>>> That definitely helps.  I reran everything comparing the original
>>>> 2000/2000 to your suggestion of 400/400.  There isn't much difference
>>>> for a single multi-threaded instance, but there's about a 30% speedup
>>>> when using all single-threaded instances:
>>>>
>>>>               pop limit & stack
>>>> procs/threads    2000      400
>>>> 1x16             5.46     5.68
>>>> 2x8              7.58     8.70
>>>> 4x4              9.71    11.24
>>>> 8x2             12.50    15.87
>>>> 16x1            14.08    18.52
>>>>
>>>> There wasn't any degradation to BLEU/TER/Meteor but this is just one
>>>> data point and a fairly simple system.  I would be curious to see how
>>>> things work out in other users' systems.
>>>>
>>>> Best,
>>>> Michael
>>>>
>>>> On Thu, Oct 8, 2015 at 2:34 PM, Vincent Nguyen wrote:
>>>>
>>>>      Out of curiosity, what gain do you get with 400 for both stack and
>>>>      cube pruning?
>>>>
>>>>
>>>>      On 08/10/2015 20:26, Michael Denkowski wrote:
>>>>
>>>>          Hi Vincent,
>>>>
>>>>          I'm using cube pruning with the following options for all data
>>>>          points:
>>>>
>>>>          [search-algorithm]
>>>>          1
>>>>
>>>>          [cube-pruning-deterministic-search]
>>>>          true
>>>>
>>>>          [cube-pruning-pop-limit]
>>>>          2000
>>>>
>>>>          [stack]
>>>>          2000
>>>>
>>>>          Best,
>>>>          Michael
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> [email protected]
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
