I know I'm shooting my own foot here, but fastutil is now ASL licensed and it provides Java Collections compliance, so it should be a drop-in replacement for former Colt collections... It is a larger library though (but then, it provides more stuff ;).
Dawid On Fri, Mar 4, 2011 at 3:38 PM, Benson Margulies <[email protected]>wrote: > I am 100% in favor of replacing the colt-based thing in there now with > something better if there is something better. I'll commit the patch, > or help test, or whatever. > > I volunteered to put lipstick on the colt-based pig because I really > wanted *some* non-GPL alternative to Trove at Apache, and the > colt-based code sitting around Mahout was the shortest path. > > > On Fri, Mar 4, 2011 at 6:11 AM, Dawid Weiss > <[email protected]> wrote: > > A while ago we discussed if it'd make sense to integrate HPPC with > Mahout, I > > remember Ted mentioned it'd be good to have some kind of comparative > > benchmark that would simply compare Mahout Math collections against HPPC. > I > > finally found the time to do it, driven by some strong updates Sebastiano > > Vinga introduced in fastutil. But first, a disclaimer :) > > > > I am not trying to impose HPPC on anybody and I'm not trying to show > which > > library is "better" or "worse" in a general sense. I believe the > benchmarks > > presented below and any conclusions that can be drawn from them apply > only > > in a marginal sense to any algorithm complex enough to be doing something > > ELSE other than stressing a given hash map implementation. I also believe > > that diversity drives progress, so it's good to share ;) > > > > So. First, I've been thinking about how to measure hash map performance > in > > any sensible (read: reasonable) application. Sebastiano suggested some > clues > > and I came up with some other ideas, but any realistic benchmarks (in the > > sense of data or algorithm) would be a welcome extension. I've > implemented > > the following benchmarks (all on int-int maps): > > > > - BenchmarkPut: injecting a lot of integers in a map, basically what one > > would do to get a unique set of keys, for example. The input ints have > > different distributions: linear, random and high-bits linear (increments > on > > higher bits, then on lower bits). > > > > - BenchmarkGetWithRemoved: injecting a lot of integers, then removing a > > fraction of them (not benchmarked), and (benchmarked) querying this map > for > > a partially matching set of keys. A realistic scenario if you have > removals > > from the map. > > > > - BenchmarkBigramCounting: calculating bigram frequencies from a real > corpus > > of texts in Polish. The input is 10MB long, so it's not a big data set, > but > > the texts are realistic and hence so are bigram distributions. > > > > I tested the following hash map implementations (int-int primitives): > > > > - fastutil (6.1.0), > > - HPPC (trunk, but without improvements resulting from these benchmarks > yet, > > so I'm not biased ;) > > - Java HashMap (on boxed Integer keys) > > - GNU Trove (3.0.0rc1), > > - Mahout Math/Collections (1.0) > > > > The bigram counting also includes PCJ and fastutil's linked hash map > > implementation for historic reasons (but these are less relevant). > > > > The benchmarks were done on a Windows7 machine with I7 CPU, detailed log > > attached, using Google Caliper. Times below are in ms. Links > > should lead you to on-line charts. Some conclusions: > > > > 1) BenchmarkPut, > > > http://dawidweiss123.appspot.com/run/[email protected]/com.carrotsearch.hppc.caliper.BenchmarkPut > > > > distribution implementation ms linear runtime > > RANDOM HPPC 77.9 ========== > > LINEAR HPPC 79.1 ========== > > HIGHBITS HPPC 78.4 ========== > > RANDOM FASTUTIL 97.4 ============ > > LINEAR FASTUTIL 101.2 ============= > > HIGHBITS FASTUTIL 101.3 ============= > > RANDOM JAVA 232.6 ============================== > > LINEAR JAVA 72.7 ========= > > HIGHBITS JAVA 184.2 ======================= > > RANDOM MAHOUT 125.9 ================ > > LINEAR MAHOUT 60.3 ======= > > HIGHBITS MAHOUT 98.4 ============ > > RANDOM TROVE 113.8 ============== > > LINEAR TROVE 43.8 ===== > > HIGHBITS TROVE 83.7 ========== > > > > I believe additional good hashing of integer keys does help to even-out > any > > weird input key distributions (and these would occur a lot in practical > > situations, I bet). fastutil uses the finalization step from MurmurHash3, > > HPPC uses a simplified full-pass over int4 bytes from MurmurHash2, but > these > > yield an equal effect (the speed difference is a result of another > thing). > > Mahout Math currently doesn't hash ints, so it'd be probably a good > addition > > (at very little overhead). > > > > 2) BenchmarkGetWithRemoved: > > > http://dawidweiss123.appspot.com/run/[email protected]/com.carrotsearch.hppc.caliper.BenchmarkGetWithRemoved > > > > %rem.keys implementation ms linear runtime > > 0 HPPC 65.73 ============ > > 0 FASTUTIL 52.63 ========== > > 0 JAVA 156.37 ============================== > > 0 TROVE 75.31 ============== > > 0 MAHOUT 85.59 ================ > > 0.25 HPPC 87.19 ================ > > 0.25 FASTUTIL 58.24 =========== > > 0.25 JAVA 149.26 ============================ > > 0.25 TROVE 123.32 ======================= > > 0.25 MAHOUT 113.77 ===================== > > 0.5 HPPC 101.70 =================== > > 0.5 FASTUTIL 55.74 ========== > > 0.5 JAVA 135.01 ========================= > > 0.5 TROVE 93.19 ================= > > 0.5 MAHOUT 125.16 ======================== > > 0.75 HPPC 93.98 ================== > > 0.75 FASTUTIL 42.20 ======== > > 0.75 JAVA 101.29 =================== > > 0.75 TROVE 76.07 ============== > > 0.75 MAHOUT 94.67 ================== > > 1 HPPC 60.52 =========== > > 1 FASTUTIL 10.78 == > > 1 JAVA 44.30 ======== > > 1 TROVE 9.54 = > > 1 MAHOUT 30.98 ===== > > > > This shows two things. The first thing is that open addressing based on > > double hashing (with REMOVED state markers) requires either automatic or > > manual compaction (rehashing) to avoid performance degradation. HPPC > doesn't > > do this because we rely on the user and so the time grows with the number > of > > removed keys: > > > > 0 HPPC 65.73 ============ > > 0.25 HPPC 87.19 ================ > > 0.5 HPPC 101.70 =================== > > 0.75 HPPC 93.98 ================== > > 1 HPPC 60.52 =========== > > > > (the drop at 100% of removed keys is probably HotSpot overlearning and > > optimizing differently). It is curious as to why Mahout's implementation > was > > so much slower than anything else, but I didn't have the time to > investigate > > yet. The only thing that comes to my mind is using modulus % for sizing > > instead of bitmasking (restricted 2^n map sizes). Also note that even > > regular Java does a fairly good job here (the number of keys was > 2,000,000, > > but even for larger inputs the results were quite similar). Also, note > > fastutil here: > > > > 0 FASTUTIL 52.63 ========== > > 0.25 FASTUTIL 58.24 =========== > > 0.5 FASTUTIL 55.74 ========== > > 0.75 FASTUTIL 42.20 ======== > > 1 FASTUTIL 10.78 == > > > > Nice. Sebastiano reimplemented hash maps to use linear addressing instead > of > > double hashing and with linear addressing you can avoid REMOVED state > > markers entirely (so no need to rehash and you'll get empty spots for > reuse > > if adding new elements). The key here is the combination of linear > > addressing AND a good key hash mangling function (theoretically, with a > > perfect hashing function, the input key distribution wouldn't affect > linear > > addressing performance at all -- no clusters or chains effect. > > > > 3) Bigrams, > > > http://dawidweiss123.appspot.com/run/[email protected]/com.carrotsearch.hppc.caliper.BenchmarkBigramCounting > > > > This is a toy example, I know, but it's the only realistic one in the > set, > > so it's probably worth mentioning. It is a little bit puzzling to me why > > HPPC is (by a small margin) better than fastutil because it really > shouldn't > > be. This isn't exactly noise (variance is given in plain text attached), > but > > I still consider this a glitch within the regular variance. > > > > You can try these on your own machine -- I would actually appreciate if > you > > sent your results and feedback back to me or this list so that we can see > > how the above holds consistent betwen architectures, processors and VMs. > > > > git clone [email protected]:carrotsearch/hppc.git > > cd hppc-others > > (cd lib/; bash ./install-deps ) > > mvn clean package > > java -server -jar hppc-others-0.4.0-SNAPSHOT/hppc-others*.jar | tee > > benchmark.log > > > > I didn't check the performance of simple lists or other structures with > > minimal overhead because I don't imagine they being significantly faster > or > > slower in any library. For these other facts play a role (I personally > like > > the fact I can access the internal buffers in HPPC, but it may against be > a > > point against using it ;). > > > > Dawid > > > >
