Hi Rico,

no, I didn't. I chose sorted vectors because of the memory footprint. Before, caching used more than ten GB for a 10M corpus; now it uses around one or two GB. I remember reading somewhere that boost::unordered_set has even worse memory overhead than std::set. The speed-up is just a nice side effect, although I believe the main speed problem was the intersection function.
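To illustrate the idea, here is a minimal sketch of replacing a set with a sorted vector and intersecting two of them. It is not the actual code in filter-pt.cpp: the names SentenceIds, Normalize and Intersect are made up for this example, and it assumes the cached values are plain integer ids. A std::set typically stores each element in its own heap-allocated tree node with several pointers, whereas a sorted vector stores the elements contiguously, which is probably where most of the memory saving comes from.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical type for this sketch: ids kept in a sorted,
// duplicate-free vector instead of a std::set.
typedef std::vector<std::size_t> SentenceIds;

// Sort and deduplicate once, after all ids have been collected.
// From then on the vector can be used like a read-only set.
void Normalize(SentenceIds& ids) {
  std::sort(ids.begin(), ids.end());
  ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
}

// Intersect two normalized vectors in a single linear pass.
// Both inputs must already be sorted, as std::set_intersection requires.
SentenceIds Intersect(const SentenceIds& a, const SentenceIds& b) {
  SentenceIds out;
  out.reserve(std::min(a.size(), b.size()));
  std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out));
  return out;
}
```

The trade-off is that inserting into a sorted vector after normalization is expensive, so this only pays off when the ids are collected first and then mostly read and intersected.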
Thanks for the feedback and your fix.

Best,
Marcin

On 22.08.2012 16:29, Rico Sennrich wrote:
> Hi Marcin,
>
> there was a segfault with hierarchical models, but it's working now.
> Have you compared your speedup with that of using unordered_set instead
> of set?
>
> best,
> Rico
>
>> Date: Wed, 22 Aug 2012 12:05:51 +0200
>> From: Marcin Junczys-Dowmunt <[email protected]>
>> Subject: [Moses-support] Phrase table pruning, filter-pt.cpp
>> To: [email protected]
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset=UTF-8; format=flowed
>>
>> Hi,
>> I've just pushed a new version of filter-pt.cpp to master. Should be
>> about 2-3 times faster and use way less memory (just replaced sets with
>> sorted vectors). I did not check it with hierarchical models since I do
>> not use them. Could someone take a look if everything is working for those?
>>
>> Cheers,
>> Marcin
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
