Hi,
Two potential bugs I found trolling the source code.
Bug 1: FName::init in FeatureVector.cpp can renumber features in a race
condition.
void FName::init(const string& name) {
#ifdef WITH_THREADS
//reader lock
boost::shared_lock<boost::shared_mutex> lock(m_idLock);
#endif
Name2Id::iterator i = name2id.find(name);
if (i != name2id.end()) {
m_id = i->second;
} else {
#ifdef WITH_THREADS
//release the reader lock, and upgrade to writer lock
lock.unlock();
boost::unique_lock<boost::shared_mutex> write_lock(m_idLock);
#endif
//Need to check again if the id is in the map, as someone may
have added
//it while we were waiting on the writer lock.
if (i != name2id.end()) {
m_id = i->second;
} else {
m_id = name2id.size();
name2id[name] = m_id;
id2name.push_back(name);
}
}
}
The value of i does not change because some locking is done. It will
still be name2id.end(). Therefore, the code m_id = i->second will never
never be executed.
Bug 2: I'm not sure what alreadyScored in
moses/GlobalLexicalModelUnlimited.cpp does, but it looks like the
intended use isn't being met.
void GlobalLexicalModelUnlimited::AddFeature(ScoreComponentCollection*
accumulator, StringHash alreadyScored, string sourceTrigger, string
sourceWord, string targetTrigger, string targetWord) const {
// snip bunch of code.
alreadyScored[sourceWord] = 1;
}
The line alreadyScored[sourceWord] = 1 does nothing useful because the
parameter alreadyScored was passed by value. Note:
typedef std::map< std::string, short > StringHash;
I think it was meant to be passed by reference.
Finally, a performance rant. A lot of these features are using maps
whose key is std::string. Instead, the features should use
FactorCollection to intern their strings then do pointer comparison
inside Evaluate.
Kenneth
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support