[jira] [Commented] (HBASE-13250) chown of ExportSnapshot does not cover all path and files
[ https://issues.apache.org/jira/browse/HBASE-13250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791615#comment-14791615 ]

Hudson commented on HBASE-13250:
--------------------------------

FAILURE: Integrated in HBase-0.98 #1125 (See [https://builds.apache.org/job/HBase-0.98/1125/])

HBASE-13250 Revert due to compilation error against hadoop-1 profile (tedyu: rev 38995fbd51ac4735b673dd1527cb2631b69b7474)
* hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java

HBASE-13250 chown of ExportSnapshot does not cover all path and files (He Liangliang) (tedyu: rev 88a620892883ac878bde3ea3c64c7275600b7085)
* hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java

> chown of ExportSnapshot does not cover all path and files
> ---------------------------------------------------------
>
>          Key: HBASE-13250
>          URL: https://issues.apache.org/jira/browse/HBASE-13250
>      Project: HBase
>   Issue Type: Bug
>     Reporter: He Liangliang
>     Assignee: He Liangliang
>     Priority: Critical
>      Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
>  Attachments: 13250-0.98-v2.txt, HBASE-13250-V0.patch
>
> The chuser/chgroup function only covers the leaf hfile. The ownership of
> hfile parent paths and snapshot reference files are not changed as expected.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
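The description above says chuser/chgroup touches only the leaf hfile. As a sketch of what "covering all paths" entails, the helper below (plain Java, hypothetical names, no Hadoop dependency; the real fix lives in ExportSnapshot.java and would use Hadoop's FileSystem.setOwner) collects the leaf file plus every ancestor directory up to the export root, so a single chown pass can cover the whole chain instead of only the leaf:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;

public class ChownPaths {
    // Collect the leaf file plus every ancestor directory up to (but excluding)
    // the export root, so ownership can be applied to the full path, not just
    // the leaf hfile. Names here are illustrative, not from ExportSnapshot.
    public static Set<String> collectPathsToChown(String root, String leafFile) {
        Path rootPath = Paths.get(root);
        Set<String> out = new LinkedHashSet<>();
        Path p = Paths.get(leafFile);
        while (p != null && !p.equals(rootPath)) {
            out.add(p.toString());
            p = p.getParent();
        }
        return out;
    }

    public static void main(String[] args) {
        // Leaf first, then each parent up to (excluding) /hbase.
        System.out.println(collectPathsToChown("/hbase", "/hbase/archive/data/ns/t/r/cf/hfile1"));
    }
}
```

Each returned path would then get one setOwner call.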
[jira] [Commented] (HBASE-14278) Fix NPE that is showing up since HBASE-14274 went in
[ https://issues.apache.org/jira/browse/HBASE-14278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791628#comment-14791628 ]

Hudson commented on HBASE-14278:
--------------------------------

FAILURE: Integrated in HBase-TRUNK #6817 (See [https://builds.apache.org/job/HBase-TRUNK/6817/])

HBASE-14278 Fix NPE that is showing up since HBASE-14274 went in (eclark: rev c1ac4bb8601f88eb3fe246eb62c3f40e95faf93d)
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java

> Fix NPE that is showing up since HBASE-14274 went in
> ----------------------------------------------------
>
>              Key: HBASE-14278
>              URL: https://issues.apache.org/jira/browse/HBASE-14278
>          Project: HBase
>       Issue Type: Sub-task
>       Components: test
> Affects Versions: 2.0.0, 1.2.0, 1.3.0
>         Reporter: stack
>         Assignee: Elliott Clark
>          Fix For: 2.0.0, 1.2.0, 1.3.0
>
>      Attachments: HBASE-14278-v1.patch, HBASE-14278-v2.patch, HBASE-14278-v3.patch,
>                   HBASE-14278-v4.patch, HBASE-14278-v5.patch, HBASE-14278.patch
>
> Saw this in TestDistributedLogSplitting after HBASE-14274 was applied.
> {code}
> 2015-08-20 15:31:10,704 WARN  [HBase-Metrics2-1] impl.MetricsConfig(124): Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
> 2015-08-20 15:31:10,710 ERROR [HBase-Metrics2-1] lib.MethodMetric$2(118): Error invoking method getBlocksTotal
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.GeneratedMethodAccessor72.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.metrics2.lib.MethodMetric$2.snapshot(MethodMetric.java:111)
>   at org.apache.hadoop.metrics2.lib.MethodMetric.snapshot(MethodMetric.java:144)
>   at org.apache.hadoop.metrics2.lib.MetricsRegistry.snapshot(MetricsRegistry.java:387)
>   at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder$1.getMetrics(MetricsSourceBuilder.java:79)
>   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195)
>   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172)
>   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151)
>   at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333)
>   at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319)
>   at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
>   at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:57)
>   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:221)
>   at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:96)
>   at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:245)
>   at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$1.postStart(MetricsSystemImpl.java:229)
>   at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:290)
>   at com.sun.proxy.$Proxy13.postStart(Unknown Source)
>   at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:185)
>   at org.apache.hadoop.metrics2.impl.JmxCacheBuster$JmxCacheBusterRunnable.run(JmxCacheBuster.java:81)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
[jira] [Commented] (HBASE-14082) Add replica id to JMX metrics names
[ https://issues.apache.org/jira/browse/HBASE-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791629#comment-14791629 ]

Hudson commented on HBASE-14082:
--------------------------------

FAILURE: Integrated in HBase-TRUNK #6817 (See [https://builds.apache.org/job/HBase-TRUNK/6817/])

HBASE-14082 Add replica id to JMX metrics names (Lei Chen) (enis: rev 17bdf9fa8cbe920578c09c38960dd0450746fe5c)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperImpl.java
* hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapper.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegion.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperStub.java

> Add replica id to JMX metrics names
> -----------------------------------
>
>          Key: HBASE-14082
>          URL: https://issues.apache.org/jira/browse/HBASE-14082
>      Project: HBase
>   Issue Type: Improvement
>   Components: metrics
>     Reporter: Lei Chen
>     Assignee: Lei Chen
>      Fix For: 2.0.0, 1.2.0, 1.3.0
>
>  Attachments: 14082-v6.patch, HBASE-14082-v1.patch, HBASE-14082-v2.patch,
>               HBASE-14082-v3.patch, HBASE-14082-v4.patch, HBASE-14082-v5.patch
>
> Today, via JMX, one cannot distinguish a primary region from a replica. A
> possible solution is to add the replica id to JMX metrics names. The benefits
> may include, for example:
> # Knowing the latency of a read request on a replica region means the first
>   attempt to the primary region has timed out.
> # Write requests on replicas are due to the replication process, while the
>   ones on the primary are from clients.
> # When looking for hot spots of read operations, replicas should be excluded,
>   since TIMELINE reads are sent to all replicas.
> To implement this, we can change the format of the metrics names found at
> {code}Hadoop->HBase->RegionServer->Regions->Attributes{code}
> from
> {code}namespace__table__region__metric_{code}
> to
> {code}namespace__table__region__replicaid__metric_{code}
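The before/after naming scheme quoted above can be made concrete with a tiny name builder (a hypothetical helper for illustration only, not HBase's actual MetricsRegionSourceImpl code):

```java
public class RegionMetricName {
    // Old format: namespace__table__region__metric
    public static String oldName(String ns, String table, String region, String metric) {
        return ns + "__" + table + "__" + region + "__" + metric;
    }

    // Proposed format inserts the replica id, so the primary (replica 0) and
    // each replica show up as distinct JMX attributes:
    // namespace__table__region__replicaid__metric
    public static String newName(String ns, String table, String region,
                                 int replicaId, String metric) {
        return ns + "__" + table + "__" + region + "__" + replicaId + "__" + metric;
    }

    public static void main(String[] args) {
        System.out.println(newName("default", "t1", "abc123", 1, "readRequestCount"));
    }
}
```

With this shape a monitoring tool can filter on the replica-id segment to separate primary traffic from replica traffic.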
[jira] [Commented] (HBASE-14274) Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs MetricsRegionAggregateSourceImpl
[ https://issues.apache.org/jira/browse/HBASE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791630#comment-14791630 ]

Hudson commented on HBASE-14274:
--------------------------------

FAILURE: Integrated in HBase-TRUNK #6817 (See [https://builds.apache.org/job/HBase-TRUNK/6817/])

HBASE-14278 Fix NPE that is showing up since HBASE-14274 went in (eclark: rev c1ac4bb8601f88eb3fe246eb62c3f40e95faf93d)
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java

> Deadlock in region metrics on shutdown: MetricsRegionSourceImpl vs
> MetricsRegionAggregateSourceImpl
> ---------------------------------------------------------------------
>
>          Key: HBASE-14274
>          URL: https://issues.apache.org/jira/browse/HBASE-14274
>      Project: HBase
>   Issue Type: Sub-task
>   Components: test
>     Reporter: stack
>     Assignee: Elliott Clark
>      Fix For: 2.0.0, 1.2.0, 1.3.0
>
>  Attachments: 14274-addendum.txt, 23612.stack, HBASE-14274-v1.patch, HBASE-14274.patch
>
> Looking into the parent issue, got a hang locally of TestDistributedLogReplay.
> We have region closes here:
> {code}
> "RS_CLOSE_META-localhost:59610-0" prio=5 tid=0x7ff65c03f800 nid=0x54347 waiting on condition [0x00011f7ac000]
>    java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for <0x00075636d8c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
>   at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegionAggregateSourceImpl.deregister(MetricsRegionAggregateSourceImpl.java:78)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.close(MetricsRegionSourceImpl.java:120)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegion.close(MetricsRegion.java:41)
>   at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1500)
>   at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344)
>   - locked <0x0007ff878190> (a java.lang.Object)
>   at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:102)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:103)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> {code}
> They are trying to MetricsRegionAggregateSourceImpl.deregister. They want to
> get a write lock on this class's local ReentrantReadWriteLock while holding
> MetricsRegionSourceImpl's readWriteLock write lock.
> Then, elsewhere, the JmxCacheBuster is running, trying to get metrics with the
> above locks held in reverse:
> {code}
> "HBase-Metrics2-1" daemon prio=5 tid=0x7ff65e14b000 nid=0x59a03 waiting on condition [0x000140ea5000]
>    java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for <0x0007cade1480> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
>   at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
>   at org.apache.hadoop.hbase.regionserver.MetricsRegionSourceImpl.snapshot(MetricsRegionSourceImpl.java:193)
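The two stacks quoted above acquire the aggregate-level and region-level ReentrantReadWriteLocks in opposite orders, the classic lock-order inversion. One standard cure is a single global acquisition order. The sketch below is standalone and illustrative, not the actual HBASE-14274/HBASE-14278 fix (which reworked the metrics locking): both code paths take the aggregate lock before the region lock, so the circular wait needed for deadlock cannot form.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockOrdering {
    static final ReentrantReadWriteLock aggregateLock = new ReentrantReadWriteLock();
    static final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();

    // Writer path: aggregateLock, then regionLock (same global order as below).
    static void deregister() {
        aggregateLock.writeLock().lock();
        try {
            regionLock.writeLock().lock();
            try { /* remove the region source */ } finally { regionLock.writeLock().unlock(); }
        } finally { aggregateLock.writeLock().unlock(); }
    }

    // Reader path: also aggregateLock first, then regionLock.
    static void snapshot() {
        aggregateLock.readLock().lock();
        try {
            regionLock.readLock().lock();
            try { /* read the metrics */ } finally { regionLock.readLock().unlock(); }
        } finally { aggregateLock.readLock().unlock(); }
    }

    // Run both paths concurrently; with one consistent order, both threads
    // always terminate, so this returns true instead of hanging.
    public static boolean runConcurrently() {
        Thread a = new Thread(() -> { for (int i = 0; i < 1000; i++) deregister(); });
        Thread b = new Thread(() -> { for (int i = 0; i < 1000; i++) snapshot(); });
        a.start();
        b.start();
        try {
            a.join(5000);
            b.join(5000);
        } catch (InterruptedException e) {
            return false;
        }
        return !a.isAlive() && !b.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently());
    }
}
```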
[jira] [Commented] (HBASE-14411) Fix unit test failures when using multiwal as default WAL provider
[ https://issues.apache.org/jira/browse/HBASE-14411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791650#comment-14791650 ]

Yu Li commented on HBASE-14411:
-------------------------------

From the [testReport|https://builds.apache.org/job/PreCommit-HBASE-Build/15614//testReport/org.apache.hadoop.hbase.regionserver/TestWALLockup/testLockupWhenSyncInMiddleOfZigZagSetup/], the failure of the case should be caused by an intermittent env issue; below is the exception thrown in TestWALLockup:
{noformat}
Caused by: java.io.IOException: FAKE! Failed to replace a bad datanode...APPEND
  at org.apache.hadoop.hbase.regionserver.TestWALLockup$1DodgyFSLog$1.append(TestWALLockup.java:173)
  at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1880)
  at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1748)
{noformat}
Thanks [~eclark] for the attention, and [~tedyu] for help taking a look.

> Fix unit test failures when using multiwal as default WAL provider
> ------------------------------------------------------------------
>
>          Key: HBASE-14411
>          URL: https://issues.apache.org/jira/browse/HBASE-14411
>      Project: HBase
>   Issue Type: Bug
>     Reporter: Yu Li
>     Assignee: Yu Li
>      Fix For: 2.0.0, 1.3.0
>
>  Attachments: HBASE-14411.branch-1.patch, HBASE-14411.patch, HBASE-14411_v2.patch
>
> If we set hbase.wal.provider to multiwal in
> hbase-server/src/test/resources/hbase-site.xml, which allows us to use
> BoundedRegionGroupingProvider in UT, we will observe the failures below in
> the current code base:
> {noformat}
> Failed tests:
>   TestHLogRecordReader>TestWALRecordReader.testPartialRead:164 expected:<1> but was:<2>
>   TestHLogRecordReader>TestWALRecordReader.testWALRecordReader:216 expected:<2> but was:<3>
>   TestWALRecordReader.testPartialRead:164 expected:<1> but was:<2>
>   TestWALRecordReader.testWALRecordReader:216 expected:<2> but was:<3>
>   TestDistributedLogSplitting.testRecoveredEdits:276 edits dir should have more than a single file in it. instead has 1
>   TestAtomicOperation.testMultiRowMutationMultiThreads:499 expected:<0> but was:<1>
>   TestHRegionServerBulkLoad.testAtomicBulkLoad:307
>     Expected: is
>       but: was
>   TestLogRolling.testCompactionRecordDoesntBlockRolling:611 Should have WAL; one table is not flushed expected:<1> but was:<0>
>   TestLogRolling.testLogRollOnDatanodeDeath:359 null
>   TestLogRolling.testLogRollOnPipelineRestart:472 Missing datanode should've triggered a log roll
>   TestReplicationSourceManager.testLogRoll:237 expected:<6> but was:<7>
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594 if skip.errors is false all files should remain in place expected:<11> but was:<12>
>   TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong number of files in the archive log expected:<11> but was:<12>
>   TestWALSplit.testMovedWALDuringRecovery:810->retryOverHdfsProblem:793 expected:<11> but was:<12>
>   TestWALSplit.testRetryOpenDuringRecovery:838->retryOverHdfsProblem:793 expected:<11> but was:<12>
>   TestWALSplitCompressed>TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594 if skip.errors is false all files should remain in place expected:<11> but was:<12>
>   TestWALSplitCompressed>TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong number of files in the archive log expected:<11> but was:<12>
>   TestWALSplitCompressed>TestWALSplit.testMovedWALDuringRecovery:810->TestWALSplit.retryOverHdfsProblem:793 expected:<11> but was:<12>
>   TestWALSplitCompressed>TestWALSplit.testRetryOpenDuringRecovery:838->TestWALSplit.retryOverHdfsProblem:793 expected:<11> but was:<12>
> {noformat}
> While the patch for HBASE-14306 could resolve the failures of TestHLogRecordReader,
> TestReplicationSourceManager and TestReplicationWALReaderManager, this JIRA
> will focus on resolving the others.
[jira] [Commented] (HBASE-14082) Add replica id to JMX metrics names
[ https://issues.apache.org/jira/browse/HBASE-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791641#comment-14791641 ]

Hudson commented on HBASE-14082:
--------------------------------

SUCCESS: Integrated in HBase-1.2-IT #153 (See [https://builds.apache.org/job/HBase-1.2-IT/153/])

HBASE-14082 Add replica id to JMX metrics names (Lei Chen) (enis: rev 9f420d0ac6175a7245efe68c27fc32458bca1b86)
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapper.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperStub.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegion.java
* hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java

> Add replica id to JMX metrics names
> -----------------------------------
>
>          Key: HBASE-14082
>          URL: https://issues.apache.org/jira/browse/HBASE-14082
>      Project: HBase
>   Issue Type: Improvement
>   Components: metrics
>     Reporter: Lei Chen
>     Assignee: Lei Chen
>      Fix For: 2.0.0, 1.2.0, 1.3.0
>
>  Attachments: 14082-v6.patch, HBASE-14082-v1.patch, HBASE-14082-v2.patch,
>               HBASE-14082-v3.patch, HBASE-14082-v4.patch, HBASE-14082-v5.patch
>
> Today, via JMX, one cannot distinguish a primary region from a replica. A
> possible solution is to add the replica id to JMX metrics names. The benefits
> may include, for example:
> # Knowing the latency of a read request on a replica region means the first
>   attempt to the primary region has timed out.
> # Write requests on replicas are due to the replication process, while the
>   ones on the primary are from clients.
> # When looking for hot spots of read operations, replicas should be excluded,
>   since TIMELINE reads are sent to all replicas.
> To implement this, we can change the format of the metrics names found at
> {code}Hadoop->HBase->RegionServer->Regions->Attributes{code}
> from
> {code}namespace__table__region__metric_{code}
> to
> {code}namespace__table__region__replicaid__metric_{code}
[jira] [Commented] (HBASE-14447) Spark tests failing: bind exception when putting up info server
[ https://issues.apache.org/jira/browse/HBASE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791746#comment-14791746 ]

Hadoop QA commented on HBASE-14447:
-----------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12756416/14447.patch
against master branch at commit 17bdf9fa8cbe920578c09c38960dd0450746fe5c.
ATTACHMENT ID: 12756416

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15627//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15627//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15627//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15627//console

This message is automatically generated.

> Spark tests failing: bind exception when putting up info server
> ----------------------------------------------------------------
>
>          Key: HBASE-14447
>          URL: https://issues.apache.org/jira/browse/HBASE-14447
>      Project: HBase
>   Issue Type: Sub-task
>   Components: test
>     Reporter: stack
>     Assignee: stack
>     Priority: Minor
>
>  Attachments: 14447.patch
>
> Got this:
> {code}
> Running org.apache.hadoop.hbase.spark.TestJavaHBaseContext
> Tests run: 8, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 540.875 sec <<< FAILURE! - in org.apache.hadoop.hbase.spark.TestJavaHBaseContext
> testBulkDelete(org.apache.hadoop.hbase.spark.TestJavaHBaseContext)  Time elapsed: 540.647 sec  <<< ERROR!
> java.lang.RuntimeException: java.io.IOException: Shutting down
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:444)
>   at sun.nio.ch.Net.bind(Net.java:436)
>   at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>   at org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1012)
>   at org.apache.hadoop.hbase.http.HttpServer.start(HttpServer.java:953)
>   at org.apache.hadoop.hbase.http.InfoServer.start(InfoServer.java:91)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1788)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:603)
>   at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:367)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139)
>   at org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:218)
>   at org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:154)
>   at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:214)
>   at org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:94)
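Bind exceptions like the one in the trace above usually mean a hard-coded info-server port is already taken by a parallel test. A common remedy in test setups (a generic sketch; the committed 14447.patch may well differ) is to let the OS pick a free ephemeral port by binding port 0 and then configure the server with that port:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class FreePort {
    // Ask the OS for an unused ephemeral port by binding port 0, then release
    // it immediately. Returns -1 if no socket could be opened at all.
    public static int pickFreePort() {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort();
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(pickFreePort());
    }
}
```

There is a small race between releasing the port and the server rebinding it, which is why servers that support binding port 0 directly (or disabling the UI in tests) are the more robust option.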
[jira] [Created] (HBASE-14448) Refine RegionGroupingProvider Phase-2: remove provider nesting and formalize wal group name
Yu Li created HBASE-14448:
--------------------------

        Summary: Refine RegionGroupingProvider Phase-2: remove provider nesting and formalize wal group name
            Key: HBASE-14448
            URL: https://issues.apache.org/jira/browse/HBASE-14448
        Project: HBase
     Issue Type: Improvement
       Reporter: Yu Li
       Assignee: Yu Li

Now we are nesting DefaultWALProvider inside RegionGroupingProvider, which makes the logic ambiguous, since a "provider" itself should provide logs. Suggest directly instantiating FSHLog in RegionGroupingProvider.

W.r.t. the wal group name, RegionGroupingProvider currently uses something like "-null-", which is quite long and unnecessary. Suggest directly using ".".

For more details, please refer to the initial patch.
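For context, a region-grouping WAL provider assigns each region to one of a bounded number of WAL groups. A minimal sketch of such an assignment (illustrative only; the names and the hashing choice here are assumptions, not the code in the initial patch):

```java
public class WalGrouping {
    // Map a region to one of numGroups WAL groups by hashing its encoded name.
    // Math.floorMod keeps the index non-negative even for negative hash codes.
    public static int groupIndex(String encodedRegionName, int numGroups) {
        return Math.floorMod(encodedRegionName.hashCode(), numGroups);
    }

    // A short, stable group name (hypothetical format), in contrast to the
    // generated "-null-" style names the issue complains about.
    public static String groupName(String encodedRegionName, int numGroups) {
        return "regiongrouping-" + groupIndex(encodedRegionName, numGroups);
    }

    public static void main(String[] args) {
        System.out.println(groupName("f1c9a2", 4));
    }
}
```

The key property is determinism: the same region always maps to the same WAL group, so its edits land in one log.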
[jira] [Commented] (HBASE-14082) Add replica id to JMX metrics names
[ https://issues.apache.org/jira/browse/HBASE-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791758#comment-14791758 ]

Hudson commented on HBASE-14082:
--------------------------------

FAILURE: Integrated in HBase-1.2 #181 (See [https://builds.apache.org/job/HBase-1.2/181/])

HBASE-14082 Add replica id to JMX metrics names (Lei Chen) (enis: rev 9f420d0ac6175a7245efe68c27fc32458bca1b86)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperStub.java
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegion.java
* hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapper.java
* hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperImpl.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java

> Add replica id to JMX metrics names
> -----------------------------------
>
>          Key: HBASE-14082
>          URL: https://issues.apache.org/jira/browse/HBASE-14082
>      Project: HBase
>   Issue Type: Improvement
>   Components: metrics
>     Reporter: Lei Chen
>     Assignee: Lei Chen
>      Fix For: 2.0.0, 1.2.0, 1.3.0
>
>  Attachments: 14082-v6.patch, HBASE-14082-v1.patch, HBASE-14082-v2.patch,
>               HBASE-14082-v3.patch, HBASE-14082-v4.patch, HBASE-14082-v5.patch
>
> Today, via JMX, one cannot distinguish a primary region from a replica. A
> possible solution is to add the replica id to JMX metrics names. The benefits
> may include, for example:
> # Knowing the latency of a read request on a replica region means the first
>   attempt to the primary region has timed out.
> # Write requests on replicas are due to the replication process, while the
>   ones on the primary are from clients.
> # When looking for hot spots of read operations, replicas should be excluded,
>   since TIMELINE reads are sent to all replicas.
> To implement this, we can change the format of the metrics names found at
> {code}Hadoop->HBase->RegionServer->Regions->Attributes{code}
> from
> {code}namespace__table__region__metric_{code}
> to
> {code}namespace__table__region__replicaid__metric_{code}
[jira] [Commented] (HBASE-14082) Add replica id to JMX metrics names
[ https://issues.apache.org/jira/browse/HBASE-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791802#comment-14791802 ]

Hudson commented on HBASE-14082:
--------------------------------

SUCCESS: Integrated in HBase-1.3-IT #164 (See [https://builds.apache.org/job/HBase-1.3-IT/164/])

HBASE-14082 Add replica id to JMX metrics names (Lei Chen) (enis: rev bb4a690b79a2485d24aa84b9635b7fea0ff6b0d4)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperStub.java
* hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java
* hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapper.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegion.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperImpl.java

> Add replica id to JMX metrics names
> -----------------------------------
>
>          Key: HBASE-14082
>          URL: https://issues.apache.org/jira/browse/HBASE-14082
>      Project: HBase
>   Issue Type: Improvement
>   Components: metrics
>     Reporter: Lei Chen
>     Assignee: Lei Chen
>      Fix For: 2.0.0, 1.2.0, 1.3.0
>
>  Attachments: 14082-v6.patch, HBASE-14082-v1.patch, HBASE-14082-v2.patch,
>               HBASE-14082-v3.patch, HBASE-14082-v4.patch, HBASE-14082-v5.patch
>
> Today, via JMX, one cannot distinguish a primary region from a replica. A
> possible solution is to add the replica id to JMX metrics names. The benefits
> may include, for example:
> # Knowing the latency of a read request on a replica region means the first
>   attempt to the primary region has timed out.
> # Write requests on replicas are due to the replication process, while the
>   ones on the primary are from clients.
> # When looking for hot spots of read operations, replicas should be excluded,
>   since TIMELINE reads are sent to all replicas.
> To implement this, we can change the format of the metrics names found at
> {code}Hadoop->HBase->RegionServer->Regions->Attributes{code}
> from
> {code}namespace__table__region__metric_{code}
> to
> {code}namespace__table__region__replicaid__metric_{code}
[jira] [Commented] (HBASE-14425) In Secure Zookeeper cluster superuser will not have sufficient permission if multiple values are configured in "hbase.superuser"
[ https://issues.apache.org/jira/browse/HBASE-14425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791839#comment-14791839 ] Rakesh R commented on HBASE-14425: -- ZooKeeper's authentication framework is pluggable. It provides set of built in schemes. ZooKeeper's built in ACL schemes are set on a per-user basis rather than based on user group. If there are 'n' users in the group, we need to manually add each authenticated user one by one. Please refer [ZooKeeperAccessControl |http://zookeeper.apache.org/doc/r3.4.6/zookeeperProgrammers.html#sc_ZooKeeperAccessControl] section to understand more. For example, ZK auth scheme allows us to have multiple authorized users to access a single znode say "/path" with the different username and password combination. Say we have 3 users: {code} username : password user_123 : pwd_123 user_456 : pwd_456 user_789 : pwd_789 {code} It needs to explicitly set all these users to the "/path" as list of ACL entries. Does this answer your question? > In Secure Zookeeper cluster superuser will not have sufficient permission if > multiple values are configured in "hbase.superuser" > > > Key: HBASE-14425 > URL: https://issues.apache.org/jira/browse/HBASE-14425 > Project: HBase > Issue Type: Bug >Reporter: Pankaj Kumar >Assignee: Pankaj Kumar > Fix For: 2.0.0 > > Attachments: HBASE-14425.patch > > > During master intialization we are setting ACLs for the znodes. > In ZKUtil.createACL(ZooKeeperWatcher zkw, String node, boolean > isSecureZooKeeper), > {code} > String superUser = zkw.getConfiguration().get("hbase.superuser"); > ArrayList acls = new ArrayList(); > // add permission to hbase supper user > if (superUser != null) { > acls.add(new ACL(Perms.ALL, new Id("auth", superUser))); > } > {code} > Here we are directly setting "hbase.superuser" value to Znode which will > cause an issue when multiple values are configured. In "hbase.superuser" > multiple superusers and supergroups can be configured separated by comma. 
We > need to iterate them and set ACL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
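The fix direction described above — splitting "hbase.superuser" on commas and producing one principal per entry, each of which would then get its own {{new ACL(Perms.ALL, new Id("auth", user))}} — can be sketched as follows. The group handling (skipping "@"-prefixed entries) and the helper name are illustrative assumptions, not the committed patch:

```java
import java.util.ArrayList;
import java.util.List;

public class SuperUserAcls {
  // Splits the raw "hbase.superuser" value into individual principals,
  // skipping group entries (prefixed "@"), which ZooKeeper's "auth"
  // scheme cannot express directly. Each returned name would get its
  // own ACL entry instead of one entry holding the whole raw string.
  static List<String> superUsersFor(String superUserConf) {
    List<String> users = new ArrayList<>();
    if (superUserConf == null) {
      return users;
    }
    for (String entry : superUserConf.split(",")) {
      entry = entry.trim();
      if (!entry.isEmpty() && !entry.startsWith("@")) {
        users.add(entry);
      }
    }
    return users;
  }

  public static void main(String[] args) {
    // Two users and one group configured; only the users become ZK ACLs.
    System.out.println(superUsersFor("hbase, admin ,@supergroup"));
  }
}
```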
[jira] [Commented] (HBASE-13250) chown of ExportSnapshot does not cover all path and files
[ https://issues.apache.org/jira/browse/HBASE-13250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802689#comment-14802689 ] Hudson commented on HBASE-13250: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1079 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1079/]) HBASE-13250 chown of ExportSnapshot does not cover all path and files (He Liangliang) (tedyu: rev 88a620892883ac878bde3ea3c64c7275600b7085) * hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java > chown of ExportSnapshot does not cover all path and files > - > > Key: HBASE-13250 > URL: https://issues.apache.org/jira/browse/HBASE-13250 > Project: HBase > Issue Type: Bug >Reporter: He Liangliang >Assignee: He Liangliang >Priority: Critical > Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3 > > Attachments: 13250-0.98-v2.txt, HBASE-13250-V0.patch > > > The chuser/chgroup function only covers the leaf hfile. The ownership of > hfile parent paths and snapshot reference files are not changed as expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14447) Spark tests failing: bind exception when putting up info server
[ https://issues.apache.org/jira/browse/HBASE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14447: -- Resolution: Fixed Fix Version/s: 1.3.0 1.2.0 2.0.0 Status: Resolved (was: Patch Available) Committing a test-only 'duh!' small patch to branch-1.2+. The six master build instances all passed on my rig last night. > Spark tests failing: bind exception when putting up info server > --- > > Key: HBASE-14447 > URL: https://issues.apache.org/jira/browse/HBASE-14447 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack >Priority: Minor > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14447.patch > > > Got this: > {code} > Running org.apache.hadoop.hbase.spark.TestJavaHBaseContext > Tests run: 8, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 540.875 sec > <<< FAILURE! - in org.apache.hadoop.hbase.spark.TestJavaHBaseContext > testBulkDelete(org.apache.hadoop.hbase.spark.TestJavaHBaseContext) Time > elapsed: 540.647 sec <<< ERROR! 
> java.lang.RuntimeException: java.io.IOException: Shutting down > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) > at > org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1012) > at org.apache.hadoop.hbase.http.HttpServer.start(HttpServer.java:953) > at org.apache.hadoop.hbase.http.InfoServer.start(InfoServer.java:91) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1788) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:603) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:367) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:218) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:154) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:214) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:94) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1075) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1041) > at > org.apache.hadoop.hbase.spark.TestJavaHBaseContext.setUp(TestJavaHBaseContext.java:82) > ... 
> {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
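The commit above touches only hbase-spark/src/test/resources/hbase-site.xml. A plausible sketch of the kind of change that avoids the bind exception — disabling the info servers so the mini cluster never competes for a fixed web UI port. The property choice is an assumption, not quoted from the patch (the property names themselves are standard HBase settings, where -1 means "do not start the info server"):

```xml
<configuration>
  <!-- Assumed fix: turn off master/regionserver info servers in tests
       so the mini cluster never binds a fixed web UI port. -->
  <property>
    <name>hbase.master.info.port</name>
    <value>-1</value>
  </property>
  <property>
    <name>hbase.regionserver.info.port</name>
    <value>-1</value>
  </property>
</configuration>
```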
[jira] [Commented] (HBASE-14430) TestHttpServerLifecycle#testStartedServerIsAlive times out
[ https://issues.apache.org/jira/browse/HBASE-14430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803160#comment-14803160 ] stack commented on HBASE-14430: --- My master builds all passed last night but I had this in place: {code} diff --git a/hbase-server/src/test/java/org/apache/hadoop/hbase/http/TestHttpServerLifecycle.java b/hbase-server/src/test/java/org/apache/hadoop/hbase/http/TestHttpServerLifecycle.java index de290e3..c28d50a 100644 --- a/hbase-server/src/test/java/org/apache/hadoop/hbase/http/TestHttpServerLifecycle.java +++ b/hbase-server/src/test/java/org/apache/hadoop/hbase/http/TestHttpServerLifecycle.java @@ -21,6 +21,7 @@ import org.apache.hadoop.hbase.testclassification.MiscTests; import org.apache.hadoop.hbase.testclassification.SmallTests; import org.apache.log4j.Logger; import org.junit.Test; +import org.junit.Ignore; import org.junit.experimental.categories.Category; @Category({MiscTests.class, SmallTests.class}) @@ -63,7 +64,7 @@ public class TestHttpServerLifecycle extends HttpServerFunctionalTest { * * @throws Throwable on failure */ - @Test(timeout=6) + @Ignore @Test(timeout=6) public void testStartedServerIsAlive() throws Throwable { HttpServer server = null; server = createTestServer(); {code} > TestHttpServerLifecycle#testStartedServerIsAlive times out > -- > > Key: HBASE-14430 > URL: https://issues.apache.org/jira/browse/HBASE-14430 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack > > Running on my test rig, I see this test timeout from time to time. It just > hangs after jetty setup. Port clash? 
> {code} > 2015-09-14 09:08:54,474 INFO [main] hbase.ResourceChecker(148): before: > http.TestHttpServerLifecycle#testCreatedServerIsNotAlive Thread=4, > OpenFileDescriptor=192, MaxFileDescriptor=32768, SystemLoadAverage=122, > ProcessCount=507, Availabl > 2015-09-14 09:08:54,592 INFO [Time-limited test] log.Slf4jLog(67): Logging > to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via > org.mortbay.log.Slf4jLog > 2015-09-14 09:08:54,911 INFO [Time-limited test] http.HttpRequestLog(69): > Http request log for http.requests.test is not defined > 2015-09-14 09:08:54,923 INFO [Time-limited test] http.HttpServer(821): Added > global filter 'safety' > (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter) > 2015-09-14 09:08:54,924 INFO [Time-limited test] http.HttpServer(821): Added > global filter 'clickjackingprevention' > (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter) > 2015-09-14 09:08:54,985 INFO [main] hbase.ResourceChecker(172): after: > http.TestHttpServerLifecycle#testCreatedServerIsNotAlive Thread=5 (was 4) > Potentially hanging thread: process reaper > sun.misc.Unsafe.park(Native Method) > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > java.lang.Thread.run(Thread.java:745) > - Thread LEAK? 
-, OpenFileDescriptor=192 (was 192), MaxFileDescriptor=32768 > (was 32768), SystemLoadAverage=122 (was 122), ProcessCount=507 (was 507), > AvailableMemoryMB=28014 (was 28054) > 2015-09-14 09:08:55,013 INFO [main] hbase.ResourceChecker(148): before: > http.TestHttpServerLifecycle#testWepAppContextAfterServerStop Thread=5, > OpenFileDescriptor=192, MaxFileDescriptor=32768, SystemLoadAverage=122, > ProcessCount=507, Ava > 2015-09-14 09:08:55,088 INFO [Time-limited test] http.HttpRequestLog(69): > Http request log for http.requests.test is not defined > 2015-09-14 09:08:55,089 INFO [Time-limited test] http.HttpServer(821): Added > global filter 'safety' > (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter) > 2015-09-14 09:08:55,090 INFO [Time-limited test] http.HttpServer(821): Added > global filter 'clickjackingprevention' > (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter) > 2015-09-14 09:08:55,113 INFO [Time-limited test] http.HttpServer(1013): > Jetty bound to port 60242 > 2015-09-14 09:08:55,113 INFO [Time-limited test] log.Slf4jLog(67): > jetty-6.1.26 > 2015-09-14 09:08:55,263 INFO [Time-limited test] log.Slf4jLog(67): Started > SelectChannelConnector@localhost:60242 > 2015-09-14 09:08:55,270 INFO [Time-limited test]
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802971#comment-14802971 ] Eric Owhadi commented on HBASE-13408: - Question on re-using the in-memory attribute: let’s imagine the use case of an online cart where people keep adding, deleting, and updating quantities before submitting the order. That use case will love this patch. But suppose we also have some processes doing daily or weekly statistics, or users infrequently asking “what did I buy over the last 6 months”. Those scans will populate the block cache with old data that inherits in-memory stickiness, even though such back-in-time use cases are not important enough to consume valuable block cache resources. Is that a concern? > HBase In-Memory Memstore Compaction > --- > > Key: HBASE-13408 > URL: https://issues.apache.org/jira/browse/HBASE-13408 > Project: HBase > Issue Type: New Feature >Reporter: Eshcar Hillel > Fix For: 2.0.0 > > Attachments: HBASE-13408-trunk-v01.patch, > HBASE-13408-trunk-v02.patch, HBASE-13408-trunk-v03.patch, > HBaseIn-MemoryMemstoreCompactionDesignDocument-ver02.pdf, > HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf, > InMemoryMemstoreCompactionEvaluationResults.pdf, > InMemoryMemstoreCompactionScansEvaluationResults.pdf > > > A store unit holds a column family in a region, where the memstore is its > in-memory component. The memstore absorbs all updates to the store; from time > to time these updates are flushed to a file on disk, where they are > compacted. Unlike disk components, the memstore is not compacted until it is > written to the filesystem and optionally to block-cache. This may result in > underutilization of the memory due to duplicate entries per row, for example, > when hot data is continuously updated. 
> Generally, the faster data accumulates in memory, the more flushes are > triggered and the more frequently data sinks to disk, slowing down retrieval > even of very recent data. > In high-churn workloads, compacting the memstore can help maintain the data > in memory, and thereby speed up data retrieval. > We suggest a new compacted memstore with the following principles: > 1. The data is kept in memory for as long as possible > 2. Memstore data is either compacted or in the process of being compacted > 3. Allow a panic mode, which may interrupt an in-progress compaction and > force a flush of part of the memstore. > We suggest applying this optimization only to in-memory column families. > A design document is attached. > This feature was previously discussed in HBASE-5311. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
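The core win described in the design above — removing duplicate versions per row while the data is still in memory — can be illustrated with a toy sketch. This is not the HBase implementation, just the idea of compacting a memstore segment by keeping only the newest version of each row, which is where the memory saving comes from for continuously-updated hot rows:

```java
import java.util.Map;
import java.util.TreeMap;

public class InMemoryCompaction {
  // Toy model of a memstore segment: rowKey -> (timestamp -> value).
  // "Compacting" keeps only the newest version per row.
  static Map<String, String> compact(Map<String, TreeMap<Long, String>> segment) {
    Map<String, String> compacted = new TreeMap<>();
    for (Map.Entry<String, TreeMap<Long, String>> row : segment.entrySet()) {
      // lastEntry() has the highest timestamp, i.e. the newest version.
      compacted.put(row.getKey(), row.getValue().lastEntry().getValue());
    }
    return compacted;
  }

  public static void main(String[] args) {
    // A hot "online cart" row updated three times: only qty=2 survives.
    TreeMap<Long, String> cartRow = new TreeMap<>();
    cartRow.put(1L, "qty=1");
    cartRow.put(2L, "qty=3");
    cartRow.put(3L, "qty=2");
    Map<String, TreeMap<Long, String>> segment = new TreeMap<>();
    segment.put("cart#user1", cartRow);
    System.out.println(compact(segment));
  }
}
```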
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802981#comment-14802981 ] Edward Bortnikov commented on HBASE-13408: -- JFYI, there is a discussion of a very similar feature in the rocksdb dev group: https://www.facebook.com/groups/rocksdb.dev/permalink/812072708891245/. > HBase In-Memory Memstore Compaction > --- > > Key: HBASE-13408 > URL: https://issues.apache.org/jira/browse/HBASE-13408 > Project: HBase > Issue Type: New Feature >Reporter: Eshcar Hillel > Fix For: 2.0.0 > > Attachments: HBASE-13408-trunk-v01.patch, > HBASE-13408-trunk-v02.patch, HBASE-13408-trunk-v03.patch, > HBaseIn-MemoryMemstoreCompactionDesignDocument-ver02.pdf, > HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf, > InMemoryMemstoreCompactionEvaluationResults.pdf, > InMemoryMemstoreCompactionScansEvaluationResults.pdf > > > A store unit holds a column family in a region, where the memstore is its > in-memory component. The memstore absorbs all updates to the store; from time > to time these updates are flushed to a file on disk, where they are > compacted. Unlike disk components, the memstore is not compacted until it is > written to the filesystem and optionally to block-cache. This may result in > underutilization of the memory due to duplicate entries per row, for example, > when hot data is continuously updated. > Generally, the faster the data is accumulated in memory, more flushes are > triggered, the data sinks to disk more frequently, slowing down retrieval of > data, even if very recent. > In high-churn workloads, compacting the memstore can help maintain the data > in memory, and thereby speed up data retrieval. 
> We suggest a new compacted memstore with the following principles: > 1.The data is kept in memory for as long as possible > 2.Memstore data is either compacted or in process of being compacted > 3.Allow a panic mode, which may interrupt an in-progress compaction and > force a flush of part of the memstore. > We suggest applying this optimization only to in-memory column families. > A design document is attached. > This feature was previously discussed in HBASE-5311. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14447) Spark tests failing: bind exception when putting up info server
[ https://issues.apache.org/jira/browse/HBASE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803202#comment-14803202 ] Hudson commented on HBASE-14447: SUCCESS: Integrated in HBase-1.2-IT #154 (See [https://builds.apache.org/job/HBase-1.2-IT/154/]) HBASE-14447 Spark tests failing: bind exception when putting up info server (stack: rev 91fd371909751a6919175e0c6e104dd6c85f7112) * hbase-spark/src/test/resources/hbase-site.xml > Spark tests failing: bind exception when putting up info server > --- > > Key: HBASE-14447 > URL: https://issues.apache.org/jira/browse/HBASE-14447 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack >Priority: Minor > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14447.patch > > > Go tthis: > {code} > Running org.apache.hadoop.hbase.spark.TestJavaHBaseContext > Tests run: 8, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 540.875 sec > <<< FAILURE! - in org.apache.hadoop.hbase.spark.TestJavaHBaseContext > testBulkDelete(org.apache.hadoop.hbase.spark.TestJavaHBaseContext) Time > elapsed: 540.647 sec <<< ERROR! 
> java.lang.RuntimeException: java.io.IOException: Shutting down > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) > at > org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1012) > at org.apache.hadoop.hbase.http.HttpServer.start(HttpServer.java:953) > at org.apache.hadoop.hbase.http.InfoServer.start(InfoServer.java:91) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1788) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:603) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:367) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:218) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:154) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:214) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:94) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1075) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1041) > at > org.apache.hadoop.hbase.spark.TestJavaHBaseContext.setUp(TestJavaHBaseContext.java:82) > ... 
> {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14447) Spark tests failing: bind exception when putting up info server
[ https://issues.apache.org/jira/browse/HBASE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803216#comment-14803216 ] Hudson commented on HBASE-14447: SUCCESS: Integrated in HBase-1.3-IT #165 (See [https://builds.apache.org/job/HBase-1.3-IT/165/]) HBASE-14447 Spark tests failing: bind exception when putting up info server (stack: rev cf135c444a3a8598a8da67474a8243eaa0d48347) * hbase-spark/src/test/resources/hbase-site.xml > Spark tests failing: bind exception when putting up info server > --- > > Key: HBASE-14447 > URL: https://issues.apache.org/jira/browse/HBASE-14447 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack >Priority: Minor > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14447.patch > > > Go tthis: > {code} > Running org.apache.hadoop.hbase.spark.TestJavaHBaseContext > Tests run: 8, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 540.875 sec > <<< FAILURE! - in org.apache.hadoop.hbase.spark.TestJavaHBaseContext > testBulkDelete(org.apache.hadoop.hbase.spark.TestJavaHBaseContext) Time > elapsed: 540.647 sec <<< ERROR! 
> java.lang.RuntimeException: java.io.IOException: Shutting down > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) > at > org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1012) > at org.apache.hadoop.hbase.http.HttpServer.start(HttpServer.java:953) > at org.apache.hadoop.hbase.http.InfoServer.start(InfoServer.java:91) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1788) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:603) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:367) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:218) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:154) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:214) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:94) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1075) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1041) > at > org.apache.hadoop.hbase.spark.TestJavaHBaseContext.setUp(TestJavaHBaseContext.java:82) > ... 
> {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12298) Support BB usage in PrefixTree
[ https://issues.apache.org/jira/browse/HBASE-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803286#comment-14803286 ] Anoop Sam John commented on HBASE-12298: +1 > Support BB usage in PrefixTree > -- > > Key: HBASE-12298 > URL: https://issues.apache.org/jira/browse/HBASE-12298 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Reporter: Anoop Sam John >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE-12298.patch, HBASE-12298_1.patch, > HBASE-12298_2.patch, HBASE-12298_3.patch, HBASE-12298_4 (1).patch, > HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, > HBASE-12298_4 (1).patch, HBASE-12298_4.patch, HBASE-12298_4.patch, > HBASE-12298_4.patch, HBASE-12298_4.patch, HBASE-12298_5.patch, > HBASE-12298_6.patch, HBASE-12298_7.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14447) Spark tests failing: bind exception when putting up info server
[ https://issues.apache.org/jira/browse/HBASE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803355#comment-14803355 ] Hudson commented on HBASE-14447: FAILURE: Integrated in HBase-1.3 #183 (See [https://builds.apache.org/job/HBase-1.3/183/]) HBASE-14447 Spark tests failing: bind exception when putting up info server (stack: rev cf135c444a3a8598a8da67474a8243eaa0d48347) * hbase-spark/src/test/resources/hbase-site.xml > Spark tests failing: bind exception when putting up info server > --- > > Key: HBASE-14447 > URL: https://issues.apache.org/jira/browse/HBASE-14447 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack >Priority: Minor > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14447.patch > > > Go tthis: > {code} > Running org.apache.hadoop.hbase.spark.TestJavaHBaseContext > Tests run: 8, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 540.875 sec > <<< FAILURE! - in org.apache.hadoop.hbase.spark.TestJavaHBaseContext > testBulkDelete(org.apache.hadoop.hbase.spark.TestJavaHBaseContext) Time > elapsed: 540.647 sec <<< ERROR! 
> java.lang.RuntimeException: java.io.IOException: Shutting down > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) > at > org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1012) > at org.apache.hadoop.hbase.http.HttpServer.start(HttpServer.java:953) > at org.apache.hadoop.hbase.http.InfoServer.start(InfoServer.java:91) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1788) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:603) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:367) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:218) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:154) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:214) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:94) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1075) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1041) > at > org.apache.hadoop.hbase.spark.TestJavaHBaseContext.setUp(TestJavaHBaseContext.java:82) > ... 
> {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14374) Backport parent issue to 1.1 and 1.0
[ https://issues.apache.org/jira/browse/HBASE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804347#comment-14804347 ] Nick Dimiduk commented on HBASE-14374: -- Anything I can help with here? I'd like to start machinations on 1.1.3 next week. > Backport parent issue to 1.1 and 1.0 > > > Key: HBASE-14374 > URL: https://issues.apache.org/jira/browse/HBASE-14374 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack > Fix For: 1.0.3, 1.1.3 > > Attachments: 14317-branch-1.1.txt, 14317.branch-1.1.v2.txt, > 14317.branch-1.1.v2.txt, 14317.branch-1.1.v2.txt, 14374.branch-1.1.v3.txt, > 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, > 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, > 14374.branch-1.1.v5.txt > > > Backport parent issue to branch-1.1. and branch-1.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14128) Fix inability to run Multiple MR over the same Snapshot
[ https://issues.apache.org/jira/browse/HBASE-14128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804470#comment-14804470 ] stack commented on HBASE-14128: --- +1 (after chatting about the patch offline). Nits below. Fix the latter minor on commit. Seems fine making this non-public since the class is private, evolving... static SnapshotManifest getSnapshotManifest It is opaque doing the below where a restoreId becomes a tablename... just have the passed-in arg be tablename? public static List getSplits(Scan scan, SnapshotManifest manifest, List regionManifests, Path restoreDir, String restoreId, Configuration conf) > Fix inability to run Multiple MR over the same Snapshot > --- > > Key: HBASE-14128 > URL: https://issues.apache.org/jira/browse/HBASE-14128 > Project: HBase > Issue Type: Bug > Components: mapreduce, snapshots >Reporter: Matteo Bertozzi >Assignee: santosh kumar >Priority: Minor > Labels: beginner, noob > Attachments: HBASE-14128-v0.patch > > > from the list, running multiple MR over the same snapshot does not work > {code} > public static void copySnapshotForScanner(Configuration conf, FileSystem .. > RestoreSnapshotHelper helper = new RestoreSnapshotHelper(conf, fs, > manifest, manifest.getTableDescriptor(), restoreDir, monitor, status); > {code} > the problem is that manifest.getTableDescriptor() will try to clone the > snapshot with the same target name, ending up in "file already exist" > exceptions. > we just need to clone that descriptor and generate a new target table name -- This message was sent by Atlassian JIRA (v6.3.4#6332)
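The fix direction described in the issue — clone the descriptor under a fresh, per-job target name instead of reusing the snapshot's table name — comes down to generating a unique restore name per MapReduce job. A minimal sketch of that naming step; the helper name and naming scheme are hypothetical, not taken from the patch:

```java
import java.util.UUID;

public class RestoreNames {
  // Hypothetical helper: derive a unique restore target name so two MR
  // jobs over the same snapshot never clone to the same path and collide
  // with "file already exist" errors.
  static String uniqueRestoreName(String snapshotTable) {
    return snapshotTable + "__mr_restore_"
        + UUID.randomUUID().toString().replace("-", "");
  }

  public static void main(String[] args) {
    // Two jobs over the same snapshot each get their own target name.
    String jobA = uniqueRestoreName("orders");
    String jobB = uniqueRestoreName("orders");
    System.out.println(jobA.equals(jobB)); // distinct names, no collision
  }
}
```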
[jira] [Commented] (HBASE-14447) Spark tests failing: bind exception when putting up info server
[ https://issues.apache.org/jira/browse/HBASE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804306#comment-14804306 ] Hudson commented on HBASE-14447: FAILURE: Integrated in HBase-1.2 #182 (See [https://builds.apache.org/job/HBase-1.2/182/]) HBASE-14447 Spark tests failing: bind exception when putting up info server (stack: rev 91fd371909751a6919175e0c6e104dd6c85f7112) * hbase-spark/src/test/resources/hbase-site.xml > Spark tests failing: bind exception when putting up info server > --- > > Key: HBASE-14447 > URL: https://issues.apache.org/jira/browse/HBASE-14447 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack >Priority: Minor > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14447.patch > > > Go tthis: > {code} > Running org.apache.hadoop.hbase.spark.TestJavaHBaseContext > Tests run: 8, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 540.875 sec > <<< FAILURE! - in org.apache.hadoop.hbase.spark.TestJavaHBaseContext > testBulkDelete(org.apache.hadoop.hbase.spark.TestJavaHBaseContext) Time > elapsed: 540.647 sec <<< ERROR! 
> java.lang.RuntimeException: java.io.IOException: Shutting down > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) > at > org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1012) > at org.apache.hadoop.hbase.http.HttpServer.start(HttpServer.java:953) > at org.apache.hadoop.hbase.http.InfoServer.start(InfoServer.java:91) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1788) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:603) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:367) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:218) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:154) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:214) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:94) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1075) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1041) > at > org.apache.hadoop.hbase.spark.TestJavaHBaseContext.setUp(TestJavaHBaseContext.java:82) > ... 
> {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11590) use a specific ThreadPoolExecutor
[ https://issues.apache.org/jira/browse/HBASE-11590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804373#comment-14804373 ] Nicolas Liochon commented on HBASE-11590: - If we cut down the timeout, it's more or less equivalent to not having a thread pool at all. One of the things I don't like in many solutions (the TPE I wrote myself included) is that we have a race condition: we may create a thread even if it's not needed. I'm off for 3 days, but I will try to find a reasonable solution next week. > use a specific ThreadPoolExecutor > - > > Key: HBASE-11590 > URL: https://issues.apache.org/jira/browse/HBASE-11590 > Project: HBase > Issue Type: Bug > Components: Client, Performance >Affects Versions: 1.0.0, 2.0.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon >Priority: Minor > Fix For: 2.0.0 > > Attachments: tp.patch > > > The JDK TPE creates all the threads in the pool. As a consequence, we create > (by default) 256 threads even if we just need a few. > The attached TPE creates threads only if we have something in the queue. > On a PE test with replica on, it improved the 99th latency percentile by 5%. > Warning: there are likely some race conditions, but I'm posting it here > because there may be an implementation available somewhere we can use, or > a good reason not to do that. So feedback welcome as usual. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
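The behavior the issue complains about is easy to reproduce with the stock JDK pool: ThreadPoolExecutor.execute() starts a brand-new worker for every submitted task until corePoolSize is reached, even when existing workers are idle. A minimal, self-contained demonstration (core size 256 mirrors the default mentioned above):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class EagerCoreThreads {
  // Runs `tasks` trivial jobs strictly one at a time and reports how many
  // worker threads the pool ended up creating.
  static int poolSizeAfter(int tasks) throws Exception {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        256, 256, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
    for (int i = 0; i < tasks; i++) {
      // Each submit() adds a new worker (poolSize < corePoolSize),
      // even though the previous worker is already idle.
      pool.submit(() -> { }).get();
    }
    int size = pool.getPoolSize();
    pool.shutdownNow();
    return size;
  }

  public static void main(String[] args) throws Exception {
    // One thread would suffice for serial work, yet 10 get created.
    System.out.println(poolSizeAfter(10));
  }
}
```

This is why the attached patch only spawns a thread when the work queue is non-empty: the JDK pool's eager worker creation wastes threads on bursty but mostly-serial client workloads.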
[jira] [Commented] (HBASE-14447) Spark tests failing: bind exception when putting up info server
[ https://issues.apache.org/jira/browse/HBASE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804408#comment-14804408 ] Hudson commented on HBASE-14447: FAILURE: Integrated in HBase-TRUNK #6818 (See [https://builds.apache.org/job/HBase-TRUNK/6818/]) HBASE-14447 Spark tests failing: bind exception when putting up info server (stack: rev a47ff1d998fb09870d50d60c0269140b595a59f7) * hbase-spark/src/test/resources/hbase-site.xml > Spark tests failing: bind exception when putting up info server > --- > > Key: HBASE-14447 > URL: https://issues.apache.org/jira/browse/HBASE-14447 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack >Priority: Minor > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14447.patch > > > Got this: > {code} > Running org.apache.hadoop.hbase.spark.TestJavaHBaseContext > Tests run: 8, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 540.875 sec > <<< FAILURE! - in org.apache.hadoop.hbase.spark.TestJavaHBaseContext > testBulkDelete(org.apache.hadoop.hbase.spark.TestJavaHBaseContext) Time > elapsed: 540.647 sec <<< ERROR! 
> java.lang.RuntimeException: java.io.IOException: Shutting down > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) > at > org.apache.hadoop.hbase.http.HttpServer.openListeners(HttpServer.java:1012) > at org.apache.hadoop.hbase.http.HttpServer.start(HttpServer.java:953) > at org.apache.hadoop.hbase.http.InfoServer.start(InfoServer.java:91) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1788) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:603) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:367) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:218) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:154) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:214) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:94) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1075) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1041) > at > org.apache.hadoop.hbase.spark.TestJavaHBaseContext.setUp(TestJavaHBaseContext.java:82) > ... 
> {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11590) use a specific ThreadPoolExecutor
[ https://issues.apache.org/jira/browse/HBASE-11590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804478#comment-14804478 ] stack commented on HBASE-11590: --- bq. If we cut down the timeout, it's more or less equivalent to not having a thread pool at all. Well, with a timeout of 1 or 10 seconds, the pool would be in place when we need it... in times of read/write. No hurry [~nkeywal]. On creating one thread too many, I'd not be too worried, given we seem to currently create 255 threads too many (smile). > use a specific ThreadPoolExecutor > - > > Key: HBASE-11590 > URL: https://issues.apache.org/jira/browse/HBASE-11590 > Project: HBase > Issue Type: Bug > Components: Client, Performance >Affects Versions: 1.0.0, 2.0.0 >Reporter: Nicolas Liochon >Assignee: Nicolas Liochon >Priority: Minor > Fix For: 2.0.0 > > Attachments: tp.patch > > > The JDK TPE creates all the threads in the pool. As a consequence, we create > (by default) 256 threads even if we just need a few. > The attached TPE creates threads only if we have something in the queue. > On a PE test with replica on, it improved the 99th latency percentile by 5%. > Warning: there are likely some race conditions, but I'm posting it here > because there may be an implementation available somewhere we can use, or > a good reason not to do that. So feedback welcome as usual. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
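The timeout being discussed can be sketched with a stock JDK pool: keep core == max (as the client configures it) but let idle core threads expire, so the pool shrinks back between bursts of work. This is a minimal sketch of the concept under those assumptions, not the {{tp.patch}} attached to the issue:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class LazyPool {
    // Sketch: a JDK pool whose idle threads are reclaimed after idleSecs,
    // instead of 256 workers staying pinned forever once created.
    public static ThreadPoolExecutor create(int maxThreads, long idleSecs) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            maxThreads, maxThreads,          // core == max, as in the client config
            idleSecs, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>());
        // Let even "core" threads die after idleSecs of inactivity,
        // so the pool can shrink back to zero between bursts.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = create(256, 1);
        // The JDK creates workers lazily: none exist before work arrives.
        System.out.println("threads before any work: " + pool.getPoolSize());
        pool.submit(() -> { }).get();
        // One task spawned one worker, not 256.
        System.out.println("threads after one task: " + pool.getPoolSize());
        pool.shutdown();
    }
}
```

The race Nicolas mentions still exists here in a milder form: a short timeout means a thread may be torn down just before the next burst and recreated, which is the cost stack argues is acceptable.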
[jira] [Commented] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
[ https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804548#comment-14804548 ] Andrew Purtell commented on HBASE-14404: I'll update the patch shortly to not make a change if the user doesn't add a configuration setting either way. > Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98 > --- > > Key: HBASE-14404 > URL: https://issues.apache.org/jira/browse/HBASE-14404 > Project: HBase > Issue Type: Task >Reporter: Andrew Purtell >Assignee: Andrew Purtell > Fix For: 0.98.15 > > Attachments: HBASE-14404-0.98.patch > > > HBASE-14098 adds a new configuration toggle - > "hbase.hfile.drop.behind.compaction" - which if set to "true" tells > compactions to drop pages from the OS blockcache after write. It's on by > default where committed so far but a backport to 0.98 would default it to > off. (The backport would also retain compat methods to LimitedPrivate > interface StoreFileScanner.) What could make it a controversial change in > 0.98 is it changes the default setting of > 'hbase.regionserver.compaction.private.readers' from "false" to "true". I > think it's fine, we use private readers in production. They're stable and do > not present perf issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-14383) Compaction improvements
[ https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804474#comment-14804474 ] Lars Hofhansl edited comment on HBASE-14383 at 9/17/15 8:49 PM: bq. Can we retire hbase.regionserver.maxlogs? I think so. maxlogs is really a function of heap available for the memstores and the HDFS block size used. Something like: {{maxlogs = memstore heap / (HDFS blocksize * 0.95)}} Can we just default it to that? Maybe with 10% padding. was (Author: lhofhansl): bq. Can we retire hbase.regionserver.maxlogs? I think so. maxlogs is really a function of heap available for the memstores and the HDFS block size used. Something like: {{maxlogs = memstore heap / (HDFS blocksize * 0.95)}} > Compaction improvements > --- > > Key: HBASE-14383 > URL: https://issues.apache.org/jira/browse/HBASE-14383 > Project: HBase > Issue Type: Improvement >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Fix For: 2.0.0 > > > Still major issue in many production environments. The general recommendation > - disabling region splitting and major compactions to reduce unpredictable > IO/CPU spikes, especially during peak times and running them manually during > off peak times. Still do not resolve the issues completely. > h3. Flush storms > * rolling WAL events across cluster can be highly correlated, hence flushing > memstores, hence triggering minor compactions, that can be promoted to major > ones. These events are highly correlated in time if there is a balanced > write-load on the regions in a table. > * the same is true for memstore flushing due to periodic memstore flusher > operation. > Both above may produce *flush storms* which are as bad as *compaction > storms*. > What can be done here. We can spread these events over time by randomizing > (with jitter) several config options: > # hbase.regionserver.optionalcacheflushinterval > # hbase.regionserver.flush.per.changes > # hbase.regionserver.maxlogs > h3. 
ExploringCompactionPolicy max compaction size > One more optimization can be added to ExploringCompactionPolicy. To limit > size of a compaction there is a config parameter one could use > hbase.hstore.compaction.max.size. It would be nice to have two separate > limits: for peak and off peak hours. > h3. ExploringCompactionPolicy selection evaluation algorithm > Too simple? Selection with more files always wins, selection of smaller size > wins if number of files is the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
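Lars's formula above is easy to make concrete. A sketch with illustrative inputs (the 40% memstore fraction and the 10% padding are assumptions for the example, not shipped defaults):

```java
public class MaxLogs {
    // Proposed derivation from the comment above (not shipped code):
    // maxlogs = memstore heap / (HDFS blocksize * 0.95), plus ~10% padding.
    public static long maxLogs(long heapBytes, double memstoreFraction,
                               long hdfsBlockBytes) {
        double memstoreHeap = heapBytes * memstoreFraction;
        long base = (long) (memstoreHeap / (hdfsBlockBytes * 0.95));
        return (long) Math.ceil(base * 1.10);   // 10% padding
    }

    public static void main(String[] args) {
        // 8 GB heap, assumed 40% memstore fraction, 128 MB HDFS blocks.
        long heap = 8L << 30;
        long block = 128L << 20;
        System.out.println(maxLogs(heap, 0.4, block)); // 29 for these inputs
    }
}
```

Note Enis's caveat below the formula: this bound assumes all memstores are taking writes, so a cold memstore may hold a WAL hostage for the periodic-flush interval.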
[jira] [Comment Edited] (HBASE-14383) Compaction improvements
[ https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804474#comment-14804474 ] Lars Hofhansl edited comment on HBASE-14383 at 9/17/15 8:49 PM: bq. Can we retire hbase.regionserver.maxlogs? I think so. maxlogs is really a function of heap available for the memstores and the HDFS block size used. Something like: {{maxlogs = memstore heap / (HDFS blocksize * 0.95)}} was (Author: lhofhansl): I think so. maxlogs is really a function of heap available for the memstores and the HDFS block size used. Something like: {{maxlogs = memstore heap / (HDFS blocksize * 0.95)}} > Compaction improvements > --- > > Key: HBASE-14383 > URL: https://issues.apache.org/jira/browse/HBASE-14383 > Project: HBase > Issue Type: Improvement >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Fix For: 2.0.0 > > > Still major issue in many production environments. The general recommendation > - disabling region splitting and major compactions to reduce unpredictable > IO/CPU spikes, especially during peak times and running them manually during > off peak times. Still do not resolve the issues completely. > h3. Flush storms > * rolling WAL events across cluster can be highly correlated, hence flushing > memstores, hence triggering minor compactions, that can be promoted to major > ones. These events are highly correlated in time if there is a balanced > write-load on the regions in a table. > * the same is true for memstore flushing due to periodic memstore flusher > operation. > Both above may produce *flush storms* which are as bad as *compaction > storms*. > What can be done here. We can spread these events over time by randomizing > (with jitter) several config options: > # hbase.regionserver.optionalcacheflushinterval > # hbase.regionserver.flush.per.changes > # hbase.regionserver.maxlogs > h3. ExploringCompactionPolicy max compaction size > One more optimization can be added to ExploringCompactionPolicy. 
To limit > size of a compaction there is a config parameter one could use > hbase.hstore.compaction.max.size. It would be nice to have two separate > limits: for peak and off peak hours. > h3. ExploringCompactionPolicy selection evaluation algorithm > Too simple? Selection with more files always wins, selection of smaller size > wins if number of files is the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14383) Compaction improvements
[ https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804474#comment-14804474 ] Lars Hofhansl commented on HBASE-14383: --- I think so. maxlogs is really a function of heap available for the memstores and the HDFS block size used. Something like: {{maxlogs = memstore heap / (HDFS blocksize * 0.95)}} > Compaction improvements > --- > > Key: HBASE-14383 > URL: https://issues.apache.org/jira/browse/HBASE-14383 > Project: HBase > Issue Type: Improvement >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Fix For: 2.0.0 > > > Still major issue in many production environments. The general recommendation > - disabling region splitting and major compactions to reduce unpredictable > IO/CPU spikes, especially during peak times and running them manually during > off peak times. Still do not resolve the issues completely. > h3. Flush storms > * rolling WAL events across cluster can be highly correlated, hence flushing > memstores, hence triggering minor compactions, that can be promoted to major > ones. These events are highly correlated in time if there is a balanced > write-load on the regions in a table. > * the same is true for memstore flushing due to periodic memstore flusher > operation. > Both above may produce *flush storms* which are as bad as *compaction > storms*. > What can be done here. We can spread these events over time by randomizing > (with jitter) several config options: > # hbase.regionserver.optionalcacheflushinterval > # hbase.regionserver.flush.per.changes > # hbase.regionserver.maxlogs > h3. ExploringCompactionPolicy max compaction size > One more optimization can be added to ExploringCompactionPolicy. To limit > size of a compaction there is a config parameter one could use > hbase.hstore.compaction.max.size. It would be nice to have two separate > limits: for peak and off peak hours. > h3. ExploringCompactionPolicy selection evaluation algorithm > Too simple? 
Selection with more files always wins, selection of smaller size > wins if number of files is the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14430) TestHttpServerLifecycle#testStartedServerIsAlive times out
[ https://issues.apache.org/jira/browse/HBASE-14430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803299#comment-14803299 ] stack commented on HBASE-14430: --- Just got this too on my test rig: TestHttpServerLifecycle.testStoppedServerIsNotAlive:97->HttpServerFunctionalTest.stop:195 » TestTimedOut > TestHttpServerLifecycle#testStartedServerIsAlive times out > -- > > Key: HBASE-14430 > URL: https://issues.apache.org/jira/browse/HBASE-14430 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack > > Running on my test rig, I see this test timeout from time to time. It just > hangs after jetty setup. Port clash? > {code} > 2015-09-14 09:08:54,474 INFO [main] hbase.ResourceChecker(148): before: > http.TestHttpServerLifecycle#testCreatedServerIsNotAlive Thread=4, > OpenFileDescriptor=192, MaxFileDescriptor=32768, SystemLoadAverage=122, > ProcessCount=507, Availabl > 2015-09-14 09:08:54,592 INFO [Time-limited test] log.Slf4jLog(67): Logging > to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via > org.mortbay.log.Slf4jLog > 2015-09-14 09:08:54,911 INFO [Time-limited test] http.HttpRequestLog(69): > Http request log for http.requests.test is not defined > 2015-09-14 09:08:54,923 INFO [Time-limited test] http.HttpServer(821): Added > global filter 'safety' > (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter) > 2015-09-14 09:08:54,924 INFO [Time-limited test] http.HttpServer(821): Added > global filter 'clickjackingprevention' > (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter) > 2015-09-14 09:08:54,985 INFO [main] hbase.ResourceChecker(172): after: > http.TestHttpServerLifecycle#testCreatedServerIsNotAlive Thread=5 (was 4) > Potentially hanging thread: process reaper > sun.misc.Unsafe.park(Native Method) > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > > 
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > java.lang.Thread.run(Thread.java:745) > - Thread LEAK? -, OpenFileDescriptor=192 (was 192), MaxFileDescriptor=32768 > (was 32768), SystemLoadAverage=122 (was 122), ProcessCount=507 (was 507), > AvailableMemoryMB=28014 (was 28054) > 2015-09-14 09:08:55,013 INFO [main] hbase.ResourceChecker(148): before: > http.TestHttpServerLifecycle#testWepAppContextAfterServerStop Thread=5, > OpenFileDescriptor=192, MaxFileDescriptor=32768, SystemLoadAverage=122, > ProcessCount=507, Ava > 2015-09-14 09:08:55,088 INFO [Time-limited test] http.HttpRequestLog(69): > Http request log for http.requests.test is not defined > 2015-09-14 09:08:55,089 INFO [Time-limited test] http.HttpServer(821): Added > global filter 'safety' > (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter) > 2015-09-14 09:08:55,090 INFO [Time-limited test] http.HttpServer(821): Added > global filter 'clickjackingprevention' > (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter) > 2015-09-14 09:08:55,113 INFO [Time-limited test] http.HttpServer(1013): > Jetty bound to port 60242 > 2015-09-14 09:08:55,113 INFO [Time-limited test] log.Slf4jLog(67): > jetty-6.1.26 > 2015-09-14 09:08:55,263 INFO [Time-limited test] log.Slf4jLog(67): Started > SelectChannelConnector@localhost:60242 > 2015-09-14 09:08:55,270 INFO [Time-limited test] log.Slf4jLog(67): Stopped > SelectChannelConnector@localhost:0 > 2015-09-14 09:08:55,401 INFO [main] hbase.ResourceChecker(172): after: > http.TestHttpServerLifecycle#testWepAppContextAfterServerStop Thread=5 (was > 5), OpenFileDescriptor=197 (was 192) - 
OpenFileDescriptor LEAK? -, > MaxFileDescriptor=32768 > 2015-09-14 09:08:55,428 INFO [main] hbase.ResourceChecker(148): before: > http.TestHttpServerLifecycle#testStopUnstartedServer Thread=5, > OpenFileDescriptor=197, MaxFileDescriptor=32768, SystemLoadAverage=122, > ProcessCount=507, AvailableMem > 2015-09-14 09:08:55,489 INFO [Time-limited test] http.HttpRequestLog(69): > Http request log for http.requests.test is not defined > 2015-09-14 09:08:55,489 INFO [Time-limited test] http.HttpServer(821): Added > global filter 'safety' > (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter) > 2015-09-14 09:08:55,490 INFO [Time-limited test] http.HttpServer(821): Added > global filter
[jira] [Commented] (HBASE-14386) Reset MutableHistogram's min/max/sum after snapshot
[ https://issues.apache.org/jira/browse/HBASE-14386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804683#comment-14804683 ] Ted Yu commented on HBASE-14386: [~aoxiang]: Can you answer Elliott's question above? > Reset MutableHistogram's min/max/sum after snapshot > --- > > Key: HBASE-14386 > URL: https://issues.apache.org/jira/browse/HBASE-14386 > Project: HBase > Issue Type: Bug >Reporter: binlijin >Assignee: Oliver > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-14386.patch > > > The current MutableHistogram does not reset min/max/sum after a snapshot, so we > are affected by historical data. For example, when I monitor the QueueCallTime_mean, > I see one host's QueueCallTime_mean metric is high, but when I trace the > host's regionserver log I see the QueueCallTime_mean has since dropped, yet the > metric is still high. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
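The behavior the issue asks for can be sketched as a stat that clears its extremes when snapshotted (a simplified stand-in tracking only count/sum/min/max, not the actual MutableHistogram code):

```java
public class ResettingStat {
    // Sketch of the reset-on-snapshot behavior requested above.
    private long count, sum;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    public synchronized void add(long v) {
        count++;
        sum += v;
        if (v < min) min = v;
        if (v > max) max = v;
    }

    // Emit the current window's {count, sum, min, max}, then clear them
    // so the next snapshot reflects only fresh data, not historical spikes.
    public synchronized long[] snapshotAndReset() {
        long[] out = { count, sum, min, max };
        count = 0; sum = 0;
        min = Long.MAX_VALUE; max = Long.MIN_VALUE;
        return out;
    }
}
```

With this shape, a one-off latency spike raises exactly one snapshot's max instead of pinning the reported metric high indefinitely, which is the symptom described in the report.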
[jira] [Commented] (HBASE-14383) Compaction improvements
[ https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804687#comment-14804687 ] Vladimir Rodionov commented on HBASE-14383: --- {quote} Where is this code? {quote} In FlushLargeStoresPolicy, we flush only stores 16MB in size or greater; otherwise we check if it is old enough to be flushed. > Compaction improvements > --- > > Key: HBASE-14383 > URL: https://issues.apache.org/jira/browse/HBASE-14383 > Project: HBase > Issue Type: Improvement >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Fix For: 2.0.0 > > > Still major issue in many production environments. The general recommendation > - disabling region splitting and major compactions to reduce unpredictable > IO/CPU spikes, especially during peak times and running them manually during > off peak times. Still do not resolve the issues completely. > h3. Flush storms > * rolling WAL events across cluster can be highly correlated, hence flushing > memstores, hence triggering minor compactions, that can be promoted to major > ones. These events are highly correlated in time if there is a balanced > write-load on the regions in a table. > * the same is true for memstore flushing due to periodic memstore flusher > operation. > Both above may produce *flush storms* which are as bad as *compaction > storms*. > What can be done here. We can spread these events over time by randomizing > (with jitter) several config options: > # hbase.regionserver.optionalcacheflushinterval > # hbase.regionserver.flush.per.changes > # hbase.regionserver.maxlogs > h3. ExploringCompactionPolicy max compaction size > One more optimization can be added to ExploringCompactionPolicy. To limit > size of a compaction there is a config parameter one could use > hbase.hstore.compaction.max.size. It would be nice to have two separate > limits: for peak and off peak hours. > h3. ExploringCompactionPolicy selection evaluation algorithm > Too simple? 
Selection with more files always wins, selection of smaller size > wins if number of files is the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
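Vladimir's description of the policy amounts to a two-branch decision, sketched below. The 16 MB lower bound comes from his comment; the one-hour age cutoff is an assumed placeholder, not a value taken from the source:

```java
public class FlushDecision {
    static final long FLUSH_SIZE_LOWER_BOUND = 16L << 20; // 16 MB, per the comment
    static final long MAX_AGE_MS = 3_600_000L;            // assumed age cutoff

    // Sketch of the selective-flush idea described above (not the actual
    // FlushLargeStoresPolicy source): flush a store if it is large enough,
    // otherwise only if its oldest unflushed edit has aged out.
    static boolean shouldFlush(long storeSizeBytes, long msSinceOldestEdit) {
        if (storeSizeBytes >= FLUSH_SIZE_LOWER_BOUND) {
            return true;
        }
        return msSinceOldestEdit > MAX_AGE_MS;
    }
}
```

Enis's follow-up question below is exactly about the second branch: whether the periodic flusher still force-flushes small stores, or whether they can be skipped indefinitely.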
[jira] [Commented] (HBASE-14367) Add normalization support to shell
[ https://issues.apache.org/jira/browse/HBASE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804747#comment-14804747 ] Jean-Marc Spaggiari commented on HBASE-14367: - any progress on that? ;) > Add normalization support to shell > -- > > Key: HBASE-14367 > URL: https://issues.apache.org/jira/browse/HBASE-14367 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 1.1.2 >Reporter: Lars George >Assignee: Mikhail Antonov > Fix For: 2.0.0, 1.2.0, 1.3.0 > > > https://issues.apache.org/jira/browse/HBASE-13103 adds support for setting a > normalization flag per {{HTableDescriptor}}, along with the server side chore > to do the work. > What is lacking is to easily set this from the shell, right now you need to > use the Java API to modify the descriptor. This issue is to add the flag as a > known attribute key and/or other means to toggle this per table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14383) Compaction improvements
[ https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804682#comment-14804682 ] Enis Soztutar commented on HBASE-14383: --- bq. flush policy ignores all files less than 15MB. Where is this code? I could not find anything in the periodic or non-periodic flush requests that prevents flush requests. bq. maxlogs is really a function of heap available for the memstores and the HDFS block size used. Something like: maxlogs = memstore heap / (HDFS blocksize * 0.95) This assumes that all memstores are getting updates. In case a memstore stops getting updates, it will not flush for ~0.5 hour (expected) unless it is the biggest memstore left. bq. Can we just default it to that? Maybe with 10% padding. Maybe we can instead do the limit as 2x or 3x. > Compaction improvements > --- > > Key: HBASE-14383 > URL: https://issues.apache.org/jira/browse/HBASE-14383 > Project: HBase > Issue Type: Improvement >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Fix For: 2.0.0 > > > Still major issue in many production environments. The general recommendation > - disabling region splitting and major compactions to reduce unpredictable > IO/CPU spikes, especially during peak times and running them manually during > off peak times. Still do not resolve the issues completely. > h3. Flush storms > * rolling WAL events across cluster can be highly correlated, hence flushing > memstores, hence triggering minor compactions, that can be promoted to major > ones. These events are highly correlated in time if there is a balanced > write-load on the regions in a table. > * the same is true for memstore flushing due to periodic memstore flusher > operation. > Both above may produce *flush storms* which are as bad as *compaction > storms*. > What can be done here. 
We can spread these events over time by randomizing > (with jitter) several config options: > # hbase.regionserver.optionalcacheflushinterval > # hbase.regionserver.flush.per.changes > # hbase.regionserver.maxlogs > h3. ExploringCompactionPolicy max compaction size > One more optimization can be added to ExploringCompactionPolicy. To limit > size of a compaction there is a config parameter one could use > hbase.hstore.compaction.max.size. It would be nice to have two separate > limits: for peak and off peak hours. > h3. ExploringCompactionPolicy selection evaluation algorithm > Too simple? Selection with more files always wins, selection of smaller size > wins if number of files is the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
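The jitter proposal in the description — spreading correlated flush and WAL-roll events — amounts to perturbing each configured interval by a bounded random fraction. A minimal sketch (method and parameter names are illustrative, not HBase APIs):

```java
import java.util.concurrent.ThreadLocalRandom;

public class Jitter {
    // Sketch of the jitter idea from the issue description: perturb a
    // configured interval by up to +/- fraction so that servers whose
    // timers started together drift apart instead of firing in lockstep.
    static long jittered(long intervalMs, double fraction) {
        double delta = ThreadLocalRandom.current().nextDouble(-fraction, fraction);
        return (long) (intervalMs * (1.0 + delta));
    }
}
```

Applied to something like hbase.regionserver.optionalcacheflushinterval, each server would recompute its next deadline with fresh jitter, de-correlating the cluster-wide flush storms described above.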
[jira] [Commented] (HBASE-14275) Backport to 0.98 HBASE-10785 Metas own location should be cached
[ https://issues.apache.org/jira/browse/HBASE-14275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804711#comment-14804711 ] Andrew Purtell commented on HBASE-14275: Bisect brought me here again. > Backport to 0.98 HBASE-10785 Metas own location should be cached > > > Key: HBASE-14275 > URL: https://issues.apache.org/jira/browse/HBASE-14275 > Project: HBase > Issue Type: Improvement >Reporter: Jerry He >Assignee: Jerry He > Fix For: 0.98.14 > > Attachments: HBASE-14275-0.98.patch > > > We've seen similar problem reported on 0.98. > It is good improvement to have. > This will cover HBASE-10785 and the a later HBASE-11332. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14383) Compaction improvements
[ https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804741#comment-14804741 ] Enis Soztutar commented on HBASE-14383: --- bq. In FlushLargeStoresPolicy, we flush only stores 16MB in size or greater; otherwise we check if it is old enough to be flushed. Hmm, does this policy mean that we may end up not flushing data even with the periodic flusher? The periodic flusher should be like a force flush to be effective. > Compaction improvements > --- > > Key: HBASE-14383 > URL: https://issues.apache.org/jira/browse/HBASE-14383 > Project: HBase > Issue Type: Improvement >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Fix For: 2.0.0 > > > Still major issue in many production environments. The general recommendation > - disabling region splitting and major compactions to reduce unpredictable > IO/CPU spikes, especially during peak times and running them manually during > off peak times. Still do not resolve the issues completely. > h3. Flush storms > * rolling WAL events across cluster can be highly correlated, hence flushing > memstores, hence triggering minor compactions, that can be promoted to major > ones. These events are highly correlated in time if there is a balanced > write-load on the regions in a table. > * the same is true for memstore flushing due to periodic memstore flusher > operation. > Both above may produce *flush storms* which are as bad as *compaction > storms*. > What can be done here. We can spread these events over time by randomizing > (with jitter) several config options: > # hbase.regionserver.optionalcacheflushinterval > # hbase.regionserver.flush.per.changes > # hbase.regionserver.maxlogs > h3. ExploringCompactionPolicy max compaction size > One more optimization can be added to ExploringCompactionPolicy. To limit > size of a compaction there is a config parameter one could use > hbase.hstore.compaction.max.size. 
It would be nice to have two separate > limits: for peak and off peak hours. > h3. ExploringCompactionPolicy selection evaluation algorithm > Too simple? Selection with more files always wins, selection of smaller size > wins if number of files is the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14367) Add normalization support to shell
[ https://issues.apache.org/jira/browse/HBASE-14367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804752#comment-14804752 ] Mikhail Antonov commented on HBASE-14367: - Oh, yep, sorry for delay, been busy :) should post first patch today or tomorrow. > Add normalization support to shell > -- > > Key: HBASE-14367 > URL: https://issues.apache.org/jira/browse/HBASE-14367 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 1.1.2 >Reporter: Lars George >Assignee: Mikhail Antonov > Fix For: 2.0.0, 1.2.0, 1.3.0 > > > https://issues.apache.org/jira/browse/HBASE-13103 adds support for setting a > normalization flag per {{HTableDescriptor}}, along with the server side chore > to do the work. > What is lacking is to easily set this from the shell, right now you need to > use the Java API to modify the descriptor. This issue is to add the flag as a > known attribute key and/or other means to toggle this per table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14374) Backport parent issue to 1.1 and 1.0
[ https://issues.apache.org/jira/browse/HBASE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804936#comment-14804936 ] stack commented on HBASE-14374: --- Let me get going on this [~ndimiduk] Patch needs updating. > Backport parent issue to 1.1 and 1.0 > > > Key: HBASE-14374 > URL: https://issues.apache.org/jira/browse/HBASE-14374 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack > Fix For: 1.0.3, 1.1.3 > > Attachments: 14317-branch-1.1.txt, 14317.branch-1.1.v2.txt, > 14317.branch-1.1.v2.txt, 14317.branch-1.1.v2.txt, 14374.branch-1.1.v3.txt, > 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, > 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, > 14374.branch-1.1.v5.txt > > > Backport parent issue to branch-1.1. and branch-1.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14374) Backport parent 'HBASE-14317 Stuck FSHLog' issue to 1.1 and 1.0
[ https://issues.apache.org/jira/browse/HBASE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14374: -- Summary: Backport parent 'HBASE-14317 Stuck FSHLog' issue to 1.1 and 1.0 (was: Backport parent issue to 1.1 and 1.0) > Backport parent 'HBASE-14317 Stuck FSHLog' issue to 1.1 and 1.0 > --- > > Key: HBASE-14374 > URL: https://issues.apache.org/jira/browse/HBASE-14374 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack > Fix For: 1.0.3, 1.1.3 > > Attachments: 14317-branch-1.1.txt, 14317.branch-1.1.v2.txt, > 14317.branch-1.1.v2.txt, 14317.branch-1.1.v2.txt, 14374.branch-1.1.v3.txt, > 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, > 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, > 14374.branch-1.1.v5.txt > > > Backport parent issue to branch-1.1. and branch-1.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14275) Backport to 0.98 HBASE-10785 Metas own location should be cached
[ https://issues.apache.org/jira/browse/HBASE-14275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804912#comment-14804912 ] Jerry He commented on HBASE-14275: -- Any interesting finding? > Backport to 0.98 HBASE-10785 Metas own location should be cached > > > Key: HBASE-14275 > URL: https://issues.apache.org/jira/browse/HBASE-14275 > Project: HBase > Issue Type: Improvement >Reporter: Jerry He >Assignee: Jerry He > Fix For: 0.98.14 > > Attachments: HBASE-14275-0.98.patch > > > We've seen similar problem reported on 0.98. > It is good improvement to have. > This will cover HBASE-10785 and the a later HBASE-11332. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14443) Add request parameter to the TooSlow/TooLarge warn message of RpcServer
[ https://issues.apache.org/jira/browse/HBASE-14443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jianwei Cui updated HBASE-14443: Attachment: HBASE-14443-trunk-v1.patch The patch will add request parameters into the warn message. For get/scan/mutate requests, the warn message will be formatted as: {code} (operationTooSlow): {"region":"test_table,,1442476299154.3ee9b59f45681b73c79b58b25d0be062.","Condition":{"family":"C","qualifier":""},"Put":{"totalColumns":1,"families":{"C":[{"timestamp":9223372036854775807,"tag":[],"qualifier":"","vlen":6}]},"row":"105"}, ... ,"method":"Mutate" {code} This is a warn log for a checkAndPut request; the condition and put info will both be included. For multi requests, the quantity of actions will be included as: {code} (operationTooSlow): {..."method":"Multi","MultiAction":{"Increment":1,"Put":1,"Append":1,"Delete":1,"regions":1}} {code} For coprocessor exec, the coprocessor service and method name will be extracted from the request and included as: {code} (responseTooSlow): {"call":"hbase.pb.MultiRowMutationService#MutateRows(region: test_table,,1442476299154.3ee9b59f45681b73c79b58b25d0be062., row:106)",...} {code} > Add request parameter to the TooSlow/TooLarge warn message of RpcServer > --- > > Key: HBASE-14443 > URL: https://issues.apache.org/jira/browse/HBASE-14443 > Project: HBase > Issue Type: Improvement > Components: rpc >Reporter: Jianwei Cui >Priority: Minor > Fix For: 1.2.1 > > Attachments: HBASE-14443-trunk-v1.patch > > > The RpcServer will log a warn message for a TooSlow or TooLarge request as: > {code} > logResponse(new Object[]{param}, > md.getName(), md.getName() + "(" + param.getClass().getName() + ")", > (tooLarge ? "TooLarge" : "TooSlow"), > status.getClient(), startTime, processingTime, qTime, > responseSize); > {code} > The RpcServer#logResponse will create the warn message as: > {code} > if (params.length == 2 && server instanceof HRegionServer && > params[0] instanceof byte[] && > params[1] instanceof Operation) { > ... > responseInfo.putAll(((Operation) params[1]).toMap()); > ... > } else if (params.length == 1 && server instanceof HRegionServer && > params[0] instanceof Operation) { > ... > responseInfo.putAll(((Operation) params[0]).toMap()); > ... > } else { > ... > } > {code} > Because the parameter is always a protobuf message, not an instance of > Operation, the request parameter will not be added to the warn message. The > parameter is helpful for finding the problem; for example, knowing the > startRow/endRow is useful for a TooSlow scan. To improve the warn message, we > can transform the protobuf request message to the corresponding Operation > subclass object via ProtobufUtil, so that it can be added to the warn message. > Suggestions and discussion are welcome.
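The logResponse dispatch quoted above can be sketched outside HBase. Everything below (the Operation stand-in, the Scan fields, the map keys) is a simplified, hypothetical stand-in for the real HBase classes, meant only to show why a raw protobuf parameter falls through to the uninformative branch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.hadoop.hbase.client.Operation.
abstract class Operation {
  abstract Map<String, Object> toMap();
}

// Hypothetical Operation carrying scan-like details.
class Scan extends Operation {
  @Override
  Map<String, Object> toMap() {
    Map<String, Object> m = new LinkedHashMap<>();
    m.put("startRow", "row-100");
    m.put("stopRow", "row-200");
    return m;
  }
}

public class LogResponseSketch {
  // Mirrors the branch structure of RpcServer#logResponse: Operation
  // parameters contribute their toMap() details to the warn message;
  // anything else (e.g. a raw protobuf message) only contributes its
  // class name, which is the gap the patch addresses.
  public static Map<String, Object> buildResponseInfo(Object[] params) {
    Map<String, Object> responseInfo = new LinkedHashMap<>();
    if (params.length == 2 && params[0] instanceof byte[]
        && params[1] instanceof Operation) {
      responseInfo.putAll(((Operation) params[1]).toMap());
    } else if (params.length == 1 && params[0] instanceof Operation) {
      responseInfo.putAll(((Operation) params[0]).toMap());
    } else {
      // Without a protobuf-to-Operation conversion, this fallback is all
      // a TooSlow/TooLarge warning can say about the request.
      responseInfo.put("param", params[0].getClass().getName());
    }
    return responseInfo;
  }

  public static void main(String[] args) {
    System.out.println(buildResponseInfo(new Object[] { new Scan() }));
    System.out.println(buildResponseInfo(new Object[] { "raw-protobuf" }));
  }
}
```

With an Operation the map carries startRow/stopRow; with a bare protobuf-like object only the class name survives, which is exactly the motivation given in the issue description.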
[jira] [Commented] (HBASE-14443) Add request parameter to the TooSlow/TooLarge warn message of RpcServer
[ https://issues.apache.org/jira/browse/HBASE-14443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804946#comment-14804946 ] Jianwei Cui commented on HBASE-14443: - The coprocessor info is also needed, as described in [HBASE-14333|https://issues.apache.org/jira/browse/HBASE-14333]. This patch includes the coprocessor service and method name in the warn message. > Add request parameter to the TooSlow/TooLarge warn message of RpcServer > --- > > Key: HBASE-14443 > URL: https://issues.apache.org/jira/browse/HBASE-14443 > Project: HBase > Issue Type: Improvement > Components: rpc >Reporter: Jianwei Cui >Priority: Minor > Fix For: 1.2.1 > > Attachments: HBASE-14443-trunk-v1.patch > > > The RpcServer will log a warn message for a TooSlow or TooLarge request as: > {code} > logResponse(new Object[]{param}, > md.getName(), md.getName() + "(" + param.getClass().getName() + ")", > (tooLarge ? "TooLarge" : "TooSlow"), > status.getClient(), startTime, processingTime, qTime, > responseSize); > {code} > The RpcServer#logResponse will create the warn message as: > {code} > if (params.length == 2 && server instanceof HRegionServer && > params[0] instanceof byte[] && > params[1] instanceof Operation) { > ... > responseInfo.putAll(((Operation) params[1]).toMap()); > ... > } else if (params.length == 1 && server instanceof HRegionServer && > params[0] instanceof Operation) { > ... > responseInfo.putAll(((Operation) params[0]).toMap()); > ... > } else { > ... > } > {code} > Because the parameter is always a protobuf message, not an instance of > Operation, the request parameter will not be added to the warn message. The > parameter is helpful for finding the problem; for example, knowing the > startRow/endRow is useful for a TooSlow scan. To improve the warn message, we > can transform the protobuf request message to the corresponding Operation > subclass object via ProtobufUtil, so that it can be added to the warn message. > Suggestions and discussion are welcome.
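For the coprocessor case, the (responseTooSlow) example above embeds a service#method(region, row) description of the call. A minimal sketch of composing that string; the names and structure are illustrative only, not HBase's actual CoprocessorServiceCall handling:

```java
public class CallDescription {
  // Builds a "Service#method(region: ..., row:...)" string shaped like the
  // (responseTooSlow) example in the comment above; purely illustrative.
  public static String describe(String serviceName, String methodName,
                                String region, String row) {
    return serviceName + "#" + methodName
        + "(region: " + region + ", row:" + row + ")";
  }

  public static void main(String[] args) {
    System.out.println(describe("hbase.pb.MultiRowMutationService", "MutateRows",
        "test_table,,1442476299154.3ee9b59f45681b73c79b58b25d0be062.", "106"));
  }
}
```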
[jira] [Updated] (HBASE-12298) Support BB usage in PrefixTree
[ https://issues.apache.org/jira/browse/HBASE-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-12298: --- Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master. Thanks for the reviews Anoop and Ted. > Support BB usage in PrefixTree > -- > > Key: HBASE-12298 > URL: https://issues.apache.org/jira/browse/HBASE-12298 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Reporter: Anoop Sam John >Assignee: ramkrishna.s.vasudevan > Attachments: HBASE-12298.patch, HBASE-12298_1.patch, > HBASE-12298_2.patch, HBASE-12298_3.patch, HBASE-12298_4 (1).patch, > HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, > HBASE-12298_4 (1).patch, HBASE-12298_4.patch, HBASE-12298_4.patch, > HBASE-12298_4.patch, HBASE-12298_4.patch, HBASE-12298_5.patch, > HBASE-12298_6.patch, HBASE-12298_7.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)