Hi devs,

We are having a nice hackathon today at Cloudera's offices down in Palo Alto. 30+ people showed up, including most of the committers. In the morning, there were some discussions related to recent issues. Here are my notes:

JD - hypertable performance comparison
 - their tuning is wrong
 - JD tested both: same hypertable numbers; hbase tests finished, but hbase was slow
 - first do a lot of splits, then slow the splits
 - compactions are smarter for hypertable
 - smaller memstore is faster; as it fills up, it gets slower
 - client does not wait for flush commits, does that async. JD used the async client to get comparable numbers

Matt - hotpads
 - talked about prefix compression, trie data encoding (HBASE-4676)
 - went over the chart in the jira ticket
 - random reads, bigger block sizes
 - does not work very well for md5-prefixed keys; you should partition using a single byte
 - write speed is affected (an order of magnitude slower for writes compared to None encoding), see the attached pdf in the ticket
 - a lot of improvement options for the key-value heap / block cache / encoding internal APIs

Todd - performance
 - demo of oprofile, ycsb test
 - uses hw counters, shows actual CPU clocks, L1, L2 cache hits/misses, etc. Use a custom jvm agent for profiling java
 - crc32 from hadoop libzip, URI, KeyValue comparator, etc

Jimmy - pb
 - remaining things: coprocessors, rpc engine, meta table, some minor things
 - we should not expose too much of the rpc internals to coprocessors, and not make it too difficult
 - continue discussion on jira

Jesse - mvn modules
 - cross-module dependencies should be eliminated
 - hbase-server, hbase-client, hbase-shared at the lower level; we should think about mini-cluster

Lars - durable sync
 - hflush / hsync
 - hacky flush blocks on close mode
 - disk io is bursty as it is, we should smooth it out
 - maybe make it configurable per column family

David - testing
 - rc testing
 - aggregate test results in a wiki or something for each rc
 - binary / source release issues
 - need to recompile hbase with hadoop 1, 2. jenkins build for each
 - 0.96, hadoop-1 and hadoop-2
 - compatibility tests: we do not have any, we can add them to the checklist

Andrew - async hbase
 - build sync client on top of async

Jesse - snapshots

Go around the room for integrations
Huddle groups for topics above

Keep hacking,
Enis
