[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496031#comment-14496031 ]

Nicolas Liochon commented on HBASE-12116:
-----------------------------------------

I had a look at CRC a while ago. My understanding back then was that x86 processors have specific instructions to calculate a CRC, but unfortunately the result is a little bit different from the standard CRC32. When I was looking at the Intel/Hadoop roadmap 2 years ago, it looked like Intel was planning to make the changes in Hadoop to use the hardware one. There is some info here: http://www.strchr.com/crc32_popcnt

Hot contention spots; writing
-----------------------------

Key: HBASE-12116
URL: https://issues.apache.org/jira/browse/HBASE-12116
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: stack
Attachments: 12116.checkForReplicas.txt, 12116.stringify.and.cache.scanner.maxsize.txt, 12116.txt, Screen Shot 2014-09-29 at 5.12.51 PM.png, Screen Shot 2014-09-30 at 10.39.34 PM.png, Screen Shot 2015-04-13 at 2.03.05 PM.png, perf.write3.svg, perf.write4.svg

Playing with flight recorder, here are some write-time contentious synchronizations/locks (picture coming)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
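On the CRC point in the comment above: the SSE4.2 `crc32` instruction computes CRC-32C (Castagnoli polynomial), while the "standard" CRC32 from zlib uses a different polynomial, so the two are not interchangeable on existing data. A minimal sketch of the difference, using only the JDK (`java.util.zip.CRC32C` needs Java 9 or later). The class and method names are made up for illustration, and nothing here is HBase or Hadoop code.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

// Hypothetical demo class: compares the two CRC variants on the same input.
final class CrcVariants {

    /** Feeds the whole array to the given checksum and returns its value. */
    static long checksum(Checksum c, byte[] data) {
        c.update(data, 0, data.length);
        return c.getValue();
    }

    public static void main(String[] args) {
        // "123456789" is the conventional check input for CRC algorithms.
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("CRC-32  = %08x%n", checksum(new CRC32(), data));
        System.out.printf("CRC-32C = %08x%n", checksum(new CRC32C(), data));
    }
}
```

For the check input the two values are `cbf43926` (CRC-32) and `e3069283` (CRC-32C), which is why blocks written with one variant cannot be verified with the other.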
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493630#comment-14493630 ]

stack commented on HBASE-12116:
-------------------------------

Esteban and I did some messing with a YCSB pure write load this afternoon. We got to a place where the four disks were saturated writing (CPU was not). Here are a few notes from the session:

+ Esteban found that upping the ringbuffer slot count helped; default is 16k, he was running at 512k (hbase.regionserver.wal.disruptor.event.count). He also said deferred sync helped.
+ Would be good to figure out how to do bigger writes so we get more on disk per op.
+ Adding to the CSLM (doPut), CRC, and compares are the big consumers of CPU (apart from G1GC) according to perf (see below); no surprise. Flight Recorder has CRC and unsafe compareTo at 20% each, then CSLM#doPut at 12%.
+ Minor but interesting (according to FR) is ByteBloomFilter#set; it could do with a tune-up (the BB#gets are showing as 3.5% -- unsafe it). The murmur hashing shows up as a percent too (look at faster implementations -- guava?).

We had ACL audit on. That's 2% of CPU according to perf.

Perf top during run.

{code}
  6.53%  perf-18890.map  [.] Ljava/util/concurrent/ConcurrentSkipListMap;.doPut
  5.78%  libjvm.so       [.] OtherRegionsTable::add_reference(void*, int)
  4.14%  perf-18890.map  [.] Lorg/apache/hadoop/util/PureJavaCrc32;.update
  3.27%  libjvm.so       [.] G1RemSet::refine_card(signed char*, int, bool)
  2.99%  libjvm.so       [.] G1ParCopyClosure<false, (G1Barrier)2, false>::copy_to_survivor_space(oopDesc*)
  2.90%  libjvm.so       [.] G1BlockOffsetArrayContigSpace::block_start_unsafe(void const*)
  2.69%  libjvm.so       [.] HeapRegion::oops_on_card_seq_iterate_careful(MemRegion, FilterOutOfRegionClosure*, bool, signed char*)
  2.66%  libc-2.12.so    [.] memcpy
  2.54%  perf-18890.map  [.] Lorg/apache/hadoop/hbase/KeyValue$KVComparator;.compare
  2.54%  libjvm.so       [.] void G1ParCopyClosure<false, (G1Barrier)2, false>::do_oop_work<unsigned int>(unsigned int*)
  2.09%  perf-18890.map  [.] jbyte_disjoint_arraycopy
  1.84%  libjvm.so       [.] instanceKlass::oop_oop_iterate_backwards_nv(oopDesc*, G1ParScanClosure*)
  1.71%  libjvm.so       [.] G1ParScanThreadState::trim_queue()
  1.63%  libjvm.so       [.] SparsePRT::add_card(int, int)
  1.43%  libjvm.so       [.] G1BlockOffsetArray::forward_to_block_containing_addr_const(HeapWord*, HeapWord*, void const*) const
  1.39%  perf-18890.map  [.] Lorg/apache/hadoop/hbase/util/Bytes$LexicographicalComparerHolder$UnsafeComparer;.compareTo
  1.35%  libc-2.12.so    [.] vfprintf
  1.23%  perf-18890.map  [.] Lorg/apache/hadoop/fs/FSDataOutputStream$PositionCache;.write
  1.12%  libjvm.so       [.] G1UpdateRSOrPushRefOopClosure::do_oop(unsigned int*)
  1.09%  libjvm.so       [.] instanceKlass::oop_oop_iterate_nv(oopDesc*, FilterOutOfRegionClosure*)
  1.09%  libjvm.so       [.] G1SATBCardTableModRefBS::mark_card_deferred(unsigned long)
  1.04%  libjvm.so       [.] HeapRegionDCTOC::walk_mem_region_with_cl(MemRegion, HeapWord*, HeapWord*, OopClosure*)
{code}
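The perf output above charges time both to `ConcurrentSkipListMap.doPut` and to key comparison, and the two are linked: every insert walks the skip list and calls the comparator along the way. A rough sketch of how one might count that outside HBase, using a plain `ConcurrentSkipListMap` over random `byte[]` keys. The class name, key size, and seed are arbitrary choices for illustration, and `Arrays.compareUnsigned` (Java 9+) only stands in for HBase's own comparators.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical micro-experiment: counts comparator calls made by skip-list inserts.
final class SkipListCompareCount {

    /** Inserts n random 16-byte keys and returns the number of comparator calls. */
    static long comparesForInserts(int n, long seed) {
        AtomicLong compares = new AtomicLong();
        Comparator<byte[]> counting = (a, b) -> {
            compares.incrementAndGet();
            // Unsigned lexicographic order over the raw bytes.
            return Arrays.compareUnsigned(a, b);
        };
        ConcurrentSkipListMap<byte[], Boolean> map = new ConcurrentSkipListMap<>(counting);
        Random random = new Random(seed);
        for (int i = 0; i < n; i++) {
            byte[] key = new byte[16];
            random.nextBytes(key);
            map.put(key, Boolean.TRUE);
        }
        return compares.get();
    }

    public static void main(String[] args) {
        int n = 100_000;
        long compares = comparesForInserts(n, 42L);
        System.out.printf("%d inserts, %d compares, %.1f compares/insert%n",
                n, compares, (double) compares / n);
    }
}
```

The compares-per-insert figure grows with map size, which is one way to see why saving even a few compares per put can pay back on a write-heavy load.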
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494454#comment-14494454 ]

Andrew Purtell commented on HBASE-12116:

bq. Chatting w/ Esteban, it's an audit CP that is run in CDH

Ah, well 2% for audit is still not bad.

bq. On write path ditto (more awkward here even given no BB interface at write time).

Yep, but if raising the floor of supported Hadoop version on trunk, maybe we could work up something together with HDFS for a 2.0 release timeframe.

bq. A hbase native lib to do native CRC and compression so not tied to a particular HDFS version would be crazy?

Sure, if taking on checksumming at our layer instead of pushing it off to HDFS is something we still want to do. I think it still makes sense: we trade complexity in our code for a big cut in the IOPS needed for reading. Native bits in HBase should be optional and have a Java-only fallback. Native code is a PITA for deployment; even if compiled for the correct architecture, we may fail to dlopen if native symbol versioning gets in the way.
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494162#comment-14494162 ]

Andrew Purtell commented on HBASE-12116:

bq. We had ACL audit on. That's 2% of CPU according to perf.

AccessController? Let's look, but 2-3% in perf is performing to design expectations, so that's actually good.
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494185#comment-14494185 ]

Andrew Purtell commented on HBASE-12116:

bq. CRC and unsafe compareto at 20% each

I assume this is with HBase doing its own checksumming. If we know we have DFSInputStream#read variants that accept BBs (and zero-copy read under the covers), then we could move all reads over to this interface, check if Hadoop's native CRC (HADOOP-10838) is available to us, and if so run it over the contents of the BBs we get from HDFS? We'd need 2.6.0 as the floor of Hadoop support, so maybe do this in trunk for 2.0?
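The idea in the comment above is to checksum the ByteBuffers handed back by the read path without first copying them into a `byte[]`. As a rough illustration of that part only, here is a sketch using the JDK's `java.util.zip.CRC32`, whose `update(ByteBuffer)` accepts direct buffers. It is a stand-in and not Hadoop's native CRC from HADOOP-10838; `BufferChecksum` and `crcOf` are made-up names, and the direct buffer is filled by hand rather than by an HDFS read.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper: checksums a buffer's remaining bytes in place.
final class BufferChecksum {

    /** Returns the CRC-32 of buf's remaining bytes without moving buf's position. */
    static long crcOf(ByteBuffer buf) {
        CRC32 crc = new CRC32();
        // update(ByteBuffer) advances the position of the buffer it is given,
        // so pass a duplicate that shares content but has its own position.
        crc.update(buf.duplicate());
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] bytes = "123456789".getBytes(StandardCharsets.US_ASCII);
        // Stand-in for an off-heap buffer returned by a zero-copy read.
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes.length);
        direct.put(bytes);
        direct.flip();
        System.out.printf("crc=%08x position=%d%n", crcOf(direct), direct.position());
    }
}
```

Working on a duplicate keeps the caller's position and limit intact, so the same buffer can still be handed to whatever parses the block afterwards.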
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494434#comment-14494434 ]

stack commented on HBASE-12116:
-------------------------------

[~apurtell] Chatting w/ Esteban, it's an audit CP that is run in CDH.

Yeah, CSLM is worth digging in on -- big payback, especially if we can save a few compares (in parallel with the work to compress CSLM snapshots).

On CRC, we have work to do in HDFS if we want to continue doing our own CRC on the read path, so we can read blocks that have been natively CRC'd and decompressed all offheap. On the write path ditto (more awkward here even given no BB interface at write time). A hbase native lib to do native CRC and compression so not tied to a particular HDFS version would be crazy? Or have HDFS do the checksum for us again (CPU for an extra seek -- ok if SSD).
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494226#comment-14494226 ]

Andrew Purtell commented on HBASE-12116:

I know you were just using YCSB as a load generator, but if playing with YCSB you might find this interesting: https://github.com/apurtell/YCSB2/tree/new_hbase_client

Blog post on coordinated omission corrections in this version: http://psy-lob-saw.blogspot.co.at/2015/03/fixing-ycsb-coordinated-omission.html

(TL;DR: Use -target and -p measurementtype=hdrhistogram. Optionally, -p hdrhistogram.fileoutput=true|false -p hdrhistogram.output.path=path and [HistogramLogProcessor|https://github.com/HdrHistogram/HdrHistogram/blob/master/HistogramLogProcessor] and [plotFiles|https://github.com/HdrHistogram/HdrHistogram/blob/master/GoogleChartsExample/plotFiles.html] to plot latency by percentile.)

I've been meaning to try it out.
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494360#comment-14494360 ]

Esteban Gutierrez commented on HBASE-12116:
-------------------------------------------

Good stuff [~apurtell]! I'll try this out today.
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494176#comment-14494176 ]

Andrew Purtell commented on HBASE-12116:

bq. CSLM#doPut at 12%

Revive the CSLM alternative JIRAs? (HBASE-3484, HBASE-3993 (+subtasks), HBASE-10713)
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154418#comment-14154418 ]

Anoop Sam John commented on HBASE-12116:

On the TRT patch: for the first call to includeTimestamp() we need to set both minimumTimestamp and maximumTimestamp to the given ts, no? If the includeTimestamp() calls arrive with decreasing ts, the min value will change on every call while the max will never change and will stay at -1 always!
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155109#comment-14155109 ]

stack commented on HBASE-12116:
-------------------------------

Thanks for the review [~anoop.hbase]. It seems highly unlikely timestamps would go in reverse, but let me fix.

On TRT, [~mbertozzi] suggests doing the below to cut down on the need for synchronization:

{code}
  public void includeTimestamp(final long timestamp) {
    if (maximumTimestamp == -1) {
      synchronized (this) {
        if (maximumTimestamp == -1) {
          minimumTimestamp = timestamp;
          maximumTimestamp = timestamp;
        }
      }
    } else if (minimumTimestamp > timestamp) {
      synchronized (this) {
        if (minimumTimestamp > timestamp) {
          minimumTimestamp = timestamp;
        }
      }
    } else if (maximumTimestamp < timestamp) {
      synchronized (this) {
        if (maximumTimestamp < timestamp) {
          maximumTimestamp = timestamp;
        }
      }
    }
    return;
  }
{code}

Let me make a subissue and measure it.
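For comparison with the double-checked version quoted above, here is a sketch of a lock-free variant built on `AtomicLong`. This is not the patch under discussion and not HBase's TimeRangeTracker. The class name and the `Long.MAX_VALUE` starting value for the minimum are assumptions made for illustration; only the -1 "unset" maximum comes from the thread. It checks min and max independently rather than in an else-if chain, so strictly decreasing timestamps still set the maximum, and a timestamp that loses the first-call race is not dropped.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical lock-free time range: tracks the smallest and largest timestamp seen.
final class AtomicTimeRange {
    // Assumed sentinels: no minimum seen yet, and -1 for "no maximum seen yet".
    private final AtomicLong min = new AtomicLong(Long.MAX_VALUE);
    private final AtomicLong max = new AtomicLong(-1L);

    void includeTimestamp(long timestamp) {
        // Cheap volatile read first; only attempt a CAS loop when the range widens.
        if (timestamp < min.get()) {
            min.accumulateAndGet(timestamp, Math::min);
        }
        // Deliberately not an else-if: the first timestamp must update both ends.
        if (timestamp > max.get()) {
            max.accumulateAndGet(timestamp, Math::max);
        }
    }

    long getMin() { return min.get(); }

    long getMax() { return max.get(); }

    public static void main(String[] args) {
        AtomicTimeRange range = new AtomicTimeRange();
        for (long ts : new long[] {30L, 20L, 10L}) {
            range.includeTimestamp(ts);
        }
        System.out.println("min=" + range.getMin() + " max=" + range.getMax());
    }
}
```

One trade-off to keep in mind: the two ends are updated separately, so a concurrent reader can briefly observe a new minimum alongside an old maximum. Whether that matters depends on how the range is consumed, and it would need measuring against the synchronized version.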
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152857#comment-14152857 ]

stack commented on HBASE-12116:
-------------------------------

[~eclark] The sample was for 5 minutes. There is a small bit of getting values from configuration, which does getProperty, but the main culprit here is not very interesting... it's stringifying exceptions (it was throwing loads of NSRE around sample time). Let me get a better sampling.

This tool seems to do a better job reporting CPU use, or at least it agrees that CRC is the main consumer of CPU, as per our flame graph experiments. Hopefully it's good at reporting contention. It has a dtrace plugin. Let me see if I can get that to work.
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154407#comment-14154407 ]

Anoop Sam John commented on HBASE-12116:

As we work with TRT here, Stack:

{code}
  /**
   * Update the current TimestampRange to include the timestamp from Key.
   * If the Key is of type DeleteColumn or DeleteFamily, it includes the
   * entire time range from 0 to timestamp of the key.
   * @param key
   */
  public void includeTimestamp(final byte[] key) {
    includeTimestamp(Bytes.toLong(key,key.length-KeyValue.TIMESTAMP_TYPE_SIZE));
    int type = key[key.length - 1];
    if (type == Type.DeleteColumn.getCode() ||
        type == Type.DeleteFamily.getCode()) {
      includeTimestamp(0);
    }
  }
{code}

The above can be removed. It is not used by anyone, and it expects the KV-style key format.
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152581#comment-14152581 ]

Elliott Clark commented on HBASE-12116:
---------------------------------------

I assume that properties is coming from Hadoop's config?
[jira] [Commented] (HBASE-12116) Hot contention spots; writing
[ https://issues.apache.org/jira/browse/HBASE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152726#comment-14152726 ]

Andrew Purtell commented on HBASE-12116:

bq. I assume that properties is coming from hadoop's config ?

If so, see also HBASE-12117.