[jira] [Commented] (HBASE-16499) slow replication for small HBase clusters
[ https://issues.apache.org/jira/browse/HBASE-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426508#comment-16426508 ] Ashish Singhi commented on HBASE-16499: --- Thanks for the review. I have pushed the addendum to the master branch only. The addendum didn't apply, as we have not committed HBASE-20273 in branch-2 and branch-2.0.
> slow replication for small HBase clusters
> -
>
> Key: HBASE-16499
> URL: https://issues.apache.org/jira/browse/HBASE-16499
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Vikas Vishwakarma
> Assignee: Ashish Singhi
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-16499-addendum.patch, HBASE-16499.patch, HBASE-16499.patch
>
> For small clusters of 10-20 nodes we recently observed that replication progresses very slowly when we do bulk writes, and a lot of lag accumulates in AgeOfLastShipped / SizeOfLogQueue. From the logs we observed that the number of threads used for shipping wal edits in parallel comes from the following code in HBaseInterClusterReplicationEndpoint:
> int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1), replicationSinkMgr.getSinks().size());
> ...
> for (int i=0; i<n; i++) {
>   entryLists.add(new ArrayList<Entry>(entries.size()/n+1)); <-- batch size
> }
> ...
> for (int i=0; i<entryLists.size(); i++) {
>   if (!entryLists.get(i).isEmpty()) {
>     // RuntimeExceptions encountered here bubble up and are handled in ReplicationSource
>     pool.submit(createReplicator(entryLists.get(i), i)); <-- concurrency
>     futures++;
>   }
> }
> maxThreads is fixed & configurable, and since we take the min of the three values, n gets decided based on replicationSinkMgr.getSinks().size() when we have enough edits to replicate.
> replicationSinkMgr.getSinks().size() is decided by
> int numSinks = (int) Math.ceil(slaveAddresses.size() * ratio);
> where ratio is this.ratio = conf.getFloat("replication.source.ratio", DEFAULT_REPLICATION_SOURCE_RATIO);
> Currently DEFAULT_REPLICATION_SOURCE_RATIO is set to 10%, so for small clusters of 10-20 RegionServers the value we get for numSinks, and hence n, is very small, like 1 or 2. This substantially reduces the pool concurrency used for shipping wal edits in parallel, effectively slowing down replication for small clusters and causing a lot of lag accumulation in AgeOfLastShipped. Sometimes it takes tens of hours to clear the entire replication queue even after the client has finished writing on the source side.
> We are running tests varying replication.source.ratio and have seen multi-fold improvement in total replication time (will update the results here). I wanted to propose that we also increase the default value of replication.source.ratio so that we have sufficient concurrency even for small clusters. We figured this out after a lot of iterations and debugging, so a slightly higher default will probably save others the trouble.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
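The sink/thread math quoted in the description can be sketched stand-alone. This is our own illustration (class and method names are hypothetical, not HBase's), showing how a 10-RegionServer peer at the default 10% ratio collapses to a single shipping thread, while a 50% ratio allows five:

```java
// Hypothetical stand-alone sketch of the math quoted above:
// numSinks = ceil(slaveCount * ratio), and per-batch shipping concurrency
// n = min(maxThreads, entries/100 + 1, numSinks).
public class ReplicationFanout {

    // Mirrors the quoted sink count: (int) Math.ceil(slaveAddresses.size() * ratio).
    static int numSinks(int slaveCount, float ratio) {
        return (int) Math.ceil(slaveCount * ratio);
    }

    // Mirrors the quoted thread count for one batch of WAL entries.
    static int shippingThreads(int maxThreads, int entryCount, int sinks) {
        return Math.min(Math.min(maxThreads, entryCount / 100 + 1), sinks);
    }

    public static void main(String[] args) {
        // 10-RegionServer peer, default ratio 0.1: one sink, so one shipping
        // thread no matter how many edits are queued.
        int sinksAtDefault = numSinks(10, 0.1f);
        System.out.println("sinks @ 0.1 = " + sinksAtDefault
            + ", threads = " + shippingThreads(10, 10_000, sinksAtDefault));

        // Same cluster with ratio 0.5: five sinks, so up to five parallel shipments.
        int sinksAtHalf = numSinks(10, 0.5f);
        System.out.println("sinks @ 0.5 = " + sinksAtHalf
            + ", threads = " + shippingThreads(10, 10_000, sinksAtHalf));
    }
}
```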
[jira] [Comment Edited] (HBASE-16499) slow replication for small HBase clusters
[ https://issues.apache.org/jira/browse/HBASE-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426508#comment-16426508 ] Ashish Singhi edited comment on HBASE-16499 at 4/5/18 5:52 AM: --- Thanks for the review. I have pushed the addendum to the master branch only. The addendum didn't apply to branch-2 and branch-2.0, as we have not committed HBASE-20273 there. was (Author: ashish singhi): Thanks for the review. I have pushed the addendum to only master branch. Addendum didn't apply as we have not committed HBASE-20273 in branch-2 and branch-2.0
[jira] [Commented] (HBASE-20273) [DOC] include call out of additional changed config defaults in 2.0 upgrade
[ https://issues.apache.org/jira/browse/HBASE-20273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426507#comment-16426507 ] Ashish Singhi commented on HBASE-20273: --- Should we push this change to branch-2.0 as well? That is where users will be referring to the HBase book that ships as part of the release tarball.
> [DOC] include call out of additional changed config defaults in 2.0 upgrade
> ---
>
> Key: HBASE-20273
> URL: https://issues.apache.org/jira/browse/HBASE-20273
> Project: HBase
> Issue Type: Sub-task
> Components: documentation
> Affects Versions: 2.0.0
> Reporter: Sean Busbey
> Assignee: Mike Drob
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20273.patch
>
> Copied from feedback on HBASE-19158 from [~mdrob]:
> {quote}
> Default settings/configuration properties changed/renamed:
> HBASE-19919
> HBASE-19148
> HBASE-18307
> HBASE-17314
> HBASE-15784
> HBASE-15027
> HBASE-14906
> HBASE-14521
> {quote}
> More detail later from [~mdrob]:
> {quote}
> would like to see notes that:
> hbase.master.cleaner.interval changed from 1 min to 10 min
> MasterProcedureConstants.MASTER_PROCEDURE_THREADS defaults to CPU/4 instead of CPU
> hbase.rpc.server.nativetransport renamed to hbase.netty.nativetransport
> hbase.netty.rpc.server.worker.count renamed to hbase.netty.worker.count
> hbase.hfile.compactions.discharger.interval renamed to hbase.hfile.compaction.discharger.interval
> hbase.hregion.percolumnfamilyflush.size.lower.bound removed at site level, but can still be applied at table level
> hbase.client.retries.number now counts the total number of retries, not the total number of tries
> {quote}
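The renamed properties listed above lend themselves to a small upgrade shim. This is our own sketch, not an HBase API: the helper name is hypothetical, the map entries come from the list above (we assume "hbase.netty.worker.count" is the intended spelling of the second new name):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for upgrading pre-2.0 site files: map each old property
// name to its 2.0 replacement, per the renames quoted above.
public class ConfigRenames {

    static final Map<String, String> RENAMES = new HashMap<>();
    static {
        RENAMES.put("hbase.rpc.server.nativetransport", "hbase.netty.nativetransport");
        RENAMES.put("hbase.netty.rpc.server.worker.count", "hbase.netty.worker.count");
        RENAMES.put("hbase.hfile.compactions.discharger.interval",
                    "hbase.hfile.compaction.discharger.interval");
    }

    // Returns the 2.0 name for a property, or the name unchanged if not renamed.
    static String upgradeKey(String key) {
        return RENAMES.getOrDefault(key, key);
    }

    public static void main(String[] args) {
        System.out.println(upgradeKey("hbase.rpc.server.nativetransport"));
        System.out.println(upgradeKey("hbase.master.cleaner.interval")); // not renamed
    }
}
```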
[jira] [Commented] (HBASE-20006) TestRestoreSnapshotFromClientWithRegionReplicas is flakey
[ https://issues.apache.org/jira/browse/HBASE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426505#comment-16426505 ] Toshihiro Suzuki commented on HBASE-20006: -- The last build looks good. Could you please review? [~stack]
> TestRestoreSnapshotFromClientWithRegionReplicas is flakey
> -
>
> Key: HBASE-20006
> URL: https://issues.apache.org/jira/browse/HBASE-20006
> Project: HBase
> Issue Type: Bug
> Components: read replicas
> Reporter: stack
> Assignee: Toshihiro Suzuki
> Priority: Critical
> Attachments: HBASE-20006.branch-2.001.patch, HBASE-20006.master.001.patch, HBASE-20006.master.002.patch, HBASE-20006.master.003.patch, HBASE-20006.master.003.patch
>
> Failing 10% of the time. Interestingly, it is the log below that shows the cause of the failure. We go to split, but the region is already split. We then fail the split with an internal assert, which messes up procedures; at a minimum we should just not split (this is in the prepare stage).
> {code}
> 2018-02-15 23:21:42,162 INFO [PEWorker-12] procedure.MasterProcedureScheduler(571): pid=105, state=RUNNABLE:SPLIT_TABLE_REGION_PREPARE; SplitTableRegionProcedure table=testOnlineSnapshotAfterSplittingRegions-1518736887838, parent=3f850cea7d71a7ebd019f2f009efca4d, daughterA=06b5e6366efbef155d70e56cfdf58dc9, daughterB=8c175de1b33765a5683ac1e502edb0bd, table=testOnlineSnapshotAfterSplittingRegions-1518736887838, testOnlineSnapshotAfterSplittingRegions-1518736887838,,1518736887882.3f850cea7d71a7ebd019f2f009efca4d.
> 2018-02-15 23:21:42,162 INFO [PEWorker-12] assignment.SplitTableRegionProcedure(440): Split of {ENCODED => 3f850cea7d71a7ebd019f2f009efca4d, NAME => 'testOnlineSnapshotAfterSplittingRegions-1518736887838,,1518736887882.3f850cea7d71a7ebd019f2f009efca4d.', STARTKEY => '', ENDKEY => '1'} skipped; state is already SPLIT
> 2018-02-15 23:21:42,163 ERROR [PEWorker-12] procedure2.ProcedureExecutor(1480): CODE-BUG: Uncaught runtime exception: pid=105, state=RUNNABLE:SPLIT_TABLE_REGION_PREPARE; SplitTableRegionProcedure table=testOnlineSnapshotAfterSplittingRegions-1518736887838, parent=3f850cea7d71a7ebd019f2f009efca4d, daughterA=06b5e6366efbef155d70e56cfdf58dc9, daughterB=8c175de1b33765a5683ac1e502edb0bd
> java.lang.AssertionError: split region should have an exception here
>   at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:228)
>   at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:89)
>   at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:180)
>   at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1455)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1224)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
> {code}
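The suggested fix ("at a minimum we should just not split" in the prepare stage) might look like the following. This is a stand-alone sketch with our own stand-in types, not the actual SplitTableRegionProcedure code:

```java
// Stand-in for the region state tracked by the assignment manager.
enum RegionState { OPEN, SPLITTING, SPLIT }

public class SplitPrepare {

    // Sketch of the prepare step: instead of asserting (which surfaced as the
    // "CODE-BUG: Uncaught runtime exception" in the log above), report the
    // split as not executable when the parent region is already split.
    static boolean prepareSplitRegion(RegionState parentState) {
        if (parentState == RegionState.SPLIT) {
            // Already split: skip quietly rather than throw AssertionError.
            return false;
        }
        // Only an open region can start a split.
        return parentState == RegionState.OPEN;
    }

    public static void main(String[] args) {
        System.out.println(prepareSplitRegion(RegionState.OPEN));  // proceed
        System.out.println(prepareSplitRegion(RegionState.SPLIT)); // skip, no assert
    }
}
```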
[jira] [Commented] (HBASE-20351) Shell dumps netty properties on startup
[ https://issues.apache.org/jira/browse/HBASE-20351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426500#comment-16426500 ] Mike Drob commented on HBASE-20351: --- stack, do you have a custom log4j config on the classpath somewhere?
> Shell dumps netty properties on startup
> ---
>
> Key: HBASE-20351
> URL: https://issues.apache.org/jira/browse/HBASE-20351
> Project: HBase
> Issue Type: Bug
> Components: pain-in-the-ass, shell
> Reporter: stack
> Priority: Major
> Fix For: 2.0.0
>
> {code}
> stack@ve0524:~$ ./hbase/bin/hbase --config conf_hbase shell
> 2018-04-04 19:58:02,187 DEBUG [main] logging.InternalLoggerFactory: Using SLF4J as the default logging framework
> 2018-04-04 19:58:02,191 DEBUG [main] util.ResourceLeakDetector: -Dorg.apache.hbase.thirdparty.io.netty.leakDetection.level: simple
> 2018-04-04 19:58:02,192 DEBUG [main] util.ResourceLeakDetector: -Dorg.apache.hbase.thirdparty.io.netty.leakDetection.targetRecords: 4
> 2018-04-04 19:58:02,214 DEBUG [main] internal.PlatformDependent0: -Dio.netty.noUnsafe: false
> 2018-04-04 19:58:02,215 DEBUG [main] internal.PlatformDependent0: Java version: 8
> 2018-04-04 19:58:02,216 DEBUG [main] internal.PlatformDependent0: sun.misc.Unsafe.theUnsafe: available
> 2018-04-04 19:58:02,216 DEBUG [main] internal.PlatformDependent0: sun.misc.Unsafe.copyMemory: available
> 2018-04-04 19:58:02,217 DEBUG [main] internal.PlatformDependent0: java.nio.Buffer.address: available
> 2018-04-04 19:58:02,217 DEBUG [main] internal.PlatformDependent0: direct buffer constructor: available
> 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent0: java.nio.Bits.unaligned: available, true
> 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent0: jdk.internal.misc.Unsafe.allocateUninitializedArray(int): unavailable prior to Java9
> 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent0: java.nio.DirectByteBuffer.<init>(long, int): available
> 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent: sun.misc.Unsafe: available
> 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent: -Dio.netty.tmpdir: /tmp (java.io.tmpdir)
> 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent: -Dio.netty.bitMode: 64 (sun.arch.data.model)
> 2018-04-04 19:58:02,219 DEBUG [main] internal.PlatformDependent: -Dio.netty.noPreferDirect: false
> 2018-04-04 19:58:02,219 DEBUG [main] internal.PlatformDependent: -Dio.netty.maxDirectMemory: 1073741824 bytes
> 2018-04-04 19:58:02,219 DEBUG [main] internal.PlatformDependent: -Dio.netty.uninitializedArrayAllocationThreshold: -1
> 2018-04-04 19:58:02,220 DEBUG [main] internal.CleanerJava6: java.nio.ByteBuffer.cleaner(): available
> 2018-04-04 19:58:02,220 DEBUG [main] util.ResourceLeakDetectorFactory: Loaded default ResourceLeakDetector: org.apache.hbase.thirdparty.io.netty.util.ResourceLeakDetector@7dbae40
> 2018-04-04 19:58:02,229 DEBUG [main] internal.PlatformDependent: org.jctools-core.MpscChunkedArrayQueue: available
> 2018-04-04 19:58:02,260 DEBUG [main] channel.MultithreadEventLoopGroup: -Dio.netty.eventLoopThreads: 96
> 2018-04-04 19:58:02,282 DEBUG [main] nio.NioEventLoop: -Dio.netty.noKeySetOptimization: false
> 2018-04-04 19:58:02,282 DEBUG [main] nio.NioEventLoop: -Dio.netty.selectorAutoRebuildThreshold: 512
> HBase Shell
> Use "help" to get list of supported commands.
> Use "exit" to quit this interactive shell.
> Version 2.0.0, r0db342d312784a6663b406fdb0f7b3b3c1fa928d, Mon Apr 2 22:54:56 PDT 2018
> Took 0.0028 seconds
> hbase(main):001:0>
> {code}
> It does it each time I run a command:
> {code}
> hbase(main):001:0> describe 'ycsb'
> 2018-04-04 19:59:00,084 DEBUG [main] buffer.AbstractByteBuf: -Dorg.apache.hbase.thirdparty.io.netty.buffer.bytebuf.checkAccessible: true
> 2018-04-04 19:59:00,084 DEBUG [main] util.ResourceLeakDetectorFactory: Loaded default ResourceLeakDetector: org.apache.hbase.thirdparty.io.netty.util.ResourceLeakDetector@66ab924
> 2018-04-04 19:59:00,121 DEBUG [main] channel.DefaultChannelId: -Dio.netty.processId: 697 (auto-detected)
> 2018-04-04 19:59:00,123 DEBUG [main] util.NetUtil: -Djava.net.preferIPv4Stack: true
> 2018-04-04 19:59:00,123 DEBUG [main] util.NetUtil: -Djava.net.preferIPv6Addresses: false
> 2018-04-04 19:59:00,124 DEBUG [main] util.NetUtil: Loopback interface: lo (lo, 127.0.0.1)
> 2018-04-04 19:59:00,125 DEBUG [main] util.NetUtil: /proc/sys/net/core/somaxconn: 128
> 2018-04-04 19:59:00,125 DEBUG [main] channel.DefaultChannelId: -Dio.netty.machineId: 00:1e:67:ff:fe:c5:54:b4 (auto-detected)
> 2018-04-04 19:59:00,130 DEBUG [main] internal.InternalThreadLocalMap: -Dio.netty.threadLocalMap.stringBuilder.initialSize: 1024
> 2018-04-04 19:59:00,131 DEBUG [main]
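One workaround, an assumption on our part rather than something confirmed in this thread, is to raise the log level for the relocated netty packages in the client-side log4j.properties, since every noisy line above comes from classes under org.apache.hbase.thirdparty.io.netty:

```properties
# Hypothetical log4j.properties addition to quiet the shaded netty DEBUG output;
# the package prefix is taken from the log lines above.
log4j.logger.org.apache.hbase.thirdparty.io.netty=INFO
```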
[jira] [Commented] (HBASE-16499) slow replication for small HBase clusters
[ https://issues.apache.org/jira/browse/HBASE-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426499#comment-16426499 ] Mike Drob commented on HBASE-16499: --- +1 on addendum
[jira] [Commented] (HBASE-20006) TestRestoreSnapshotFromClientWithRegionReplicas is flakey
[ https://issues.apache.org/jira/browse/HBASE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426497#comment-16426497 ] Hadoop QA commented on HBASE-20006: --- (/) +1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 9s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 4m 35s | master passed |
| +1 | compile | 1m 41s | master passed |
| +1 | checkstyle | 1m 10s | master passed |
| +1 | shadedjars | 6m 3s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 1m 52s | master passed |
| +1 | javadoc | 0m 29s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 4m 37s | the patch passed |
| +1 | compile | 1m 41s | the patch passed |
| +1 | javac | 1m 41s | the patch passed |
| +1 | checkstyle | 1m 10s | hbase-server: The patch generated 0 new + 36 unchanged - 5 fixed = 36 total (was 41) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 51s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 19m 20s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 1s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 102m 3s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 146m 52s | |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20006 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12917635/HBASE-20006.master.003.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 2eda5cef612b 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 5fed7fd3d2 |
| maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12305/testReport/ |
| Max. process+thread count | 4274 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/12305/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
[jira] [Commented] (HBASE-20351) Shell dumps netty properties on startup
[ https://issues.apache.org/jira/browse/HBASE-20351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426489#comment-16426489 ] Sahil Aggarwal commented on HBASE-20351: --- Can you please help me reproduce this? I tried on branch-2:
```
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.1.0-SNAPSHOT, r039bc73571d2cc89378749573dfeec74c247b0b9, Thu Apr 5 10:33:31 IST 2018
Took 0.0030 seconds
hbase(main):001:0>
hbase(main):002:0* list
TABLE
0 row(s)
Took 0.3629 seconds
hbase(main):003:0>
```
I couldn't see such logs, even after turning everything to DEBUG in the log config.
[jira] [Commented] (HBASE-20188) [TESTING] Performance
[ https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426488#comment-16426488 ] ramkrishna.s.vasudevan commented on HBASE-20188: --- The latest results with the 8G cache are also with short-circuit reads ON? Is there any variation in the stack trace? Scans in 2.0 are slower because scans are also like preads now.
bq. 'dfs.client.read.shortcircuit.streams.cache.size' and 'dfs.client.socketcache.capacity' values?
These values were increased because the default size was causing an issue with the ShortCircuitCache:
{code:java}
2017-07-18 22:52:28,969 ERROR [ShortCircuitCache_SlotReleaser] shortcircuit.ShortCircuitCache: ShortCircuitCache(0x122da202): failed to release short-circuit shared memory slot Slot(slotIdx=26, shm=DfsClientShm(f0cce51b1df7a0c887c2b708b1bf702d)) by sending ReleaseShortCircuitAccessRequestProto to /var/lib/hadoop-hdfs/dn_socket. Closing shared memory segment.
java.net.SocketException: read(2) error: Connection reset by peer
{code}
We have not written any detailed doc, just collected the observations we got. As I said, when you have enough RAM, all data is in the page cache, and you have a lot of threads reading from HDFS, the short-circuit cache was really needed because TCP connections were a problem.
> [TESTING] Performance
> -
>
> Key: HBASE-20188
> URL: https://issues.apache.org/jira/browse/HBASE-20188
> Project: HBase
> Issue Type: Umbrella
> Components: Performance
> Reporter: stack
> Assignee: stack
> Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: CAM-CONFIG-V01.patch, HBASE-20188.sh, HBase 2.0 performance evaluation - Basic vs None_ system settings.pdf, ITBLL2.5B_1.2.7vs2.0.0_cpu.png, ITBLL2.5B_1.2.7vs2.0.0_gctime.png, ITBLL2.5B_1.2.7vs2.0.0_iops.png, ITBLL2.5B_1.2.7vs2.0.0_load.png, ITBLL2.5B_1.2.7vs2.0.0_memheap.png, ITBLL2.5B_1.2.7vs2.0.0_memstore.png, ITBLL2.5B_1.2.7vs2.0.0_ops.png, ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png, YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png, YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png, flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml, lock.127.workloadc.20180402T200918Z.svg, lock.2.memsize2.c.20180403T160257Z.svg, run_ycsb.sh, tree.txt
>
> How does 2.0.0 compare to old versions? Is it faster, slower? There is a rumor that it is much slower, that the problem is the asyncwal writing. Does in-memory compaction slow us down or speed us up? What happens when you enable offheaping?
> Keep notes here in this umbrella issue. Need to be able to say something about perf when 2.0.0 ships.
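The short-circuit read settings mentioned in the comment above live in the client's hdfs-site.xml. A hypothetical fragment follows as a sketch only: the property names are from the comment and standard Hadoop HDFS configuration, the socket path is from the log line, and the values are illustrative, not recommendations from this thread:

```xml
<!-- Hypothetical hdfs-site.xml client overrides for short-circuit reads. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
<property>
  <!-- Raised above the default per the tuning discussed in the comment. -->
  <name>dfs.client.read.shortcircuit.streams.cache.size</name>
  <value>1000</value>
</property>
```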
[jira] [Commented] (HBASE-17730) [DOC] Migration to 2.0 for coprocessors
[ https://issues.apache.org/jira/browse/HBASE-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426481#comment-16426481 ] Hudson commented on HBASE-17730: Results for branch branch-2.0 [build #131 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/131/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/131//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/131//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/131//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > [DOC] Migration to 2.0 for coprocessors > > > Key: HBASE-17730 > URL: https://issues.apache.org/jira/browse/HBASE-17730 > Project: HBase > Issue Type: Sub-task > Components: documentation, migration >Reporter: Appy >Assignee: Appy >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-17730.master.001.patch > > > Jiras breaking coprocessor compatibility should be marked with component ' > Coprocessor', and label 'incompatible'. > Close to releasing 2.0, we should go through all such jiras and write down > steps for migrating coprocessor easily. > The idea is, it might be very hard to fix coprocessor breakages by reverse > engineering errors, but will be easier we suggest easiest way to fix > breakages resulting from each individual incompatible change. > For eg. HBASE-17312 is incompatible change. 
It'll result in hundreds of errors > because the BaseXXXObserver classes are gone, and will probably cause a lot of > confusion; but if we explicitly mention the fix, which is just a one-line change > - replace "Foo extends BaseXXXObserver" with "Foo implements XXXObserver" - > it becomes very easy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
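The reason the one-line change suffices is that the 2.0 observer interfaces carry default no-op methods (the HBASE-17312 change), so a coprocessor can implement the interface directly and override only the hooks it cares about. A minimal standalone sketch of that pattern — `XXXObserver`, `prePut`, and `postPut` here are invented stand-ins for illustration, not the real HBase interfaces:

```java
// Stand-in for a 2.0-style observer interface: every hook has a default no-op
// body, so there is no need for a Base* adapter class anymore.
interface XXXObserver {
    default String prePut(String row) { return row; } // default no-op hook
    default void postPut(String row) { }              // default no-op hook
}

// Before (1.x): class Foo extends BaseXXXObserver { ... }
// After (2.0): implement the interface directly; only override what you need.
class Foo implements XXXObserver {
    @Override
    public String prePut(String row) {
        return "observed:" + row;
    }
}

public class CoprocessorMigrationSketch {
    public static void main(String[] args) {
        XXXObserver obs = new Foo();
        System.out.println(obs.prePut("r1")); // overridden hook
        obs.postPut("r1");                    // inherited default no-op
    }
}
```

The same idea applies to each of the removed Base* classes: the compile errors disappear once the subclass implements the corresponding interface instead.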
[jira] [Commented] (HBASE-19488) Move to using Apache commons CollectionUtils
[ https://issues.apache.org/jira/browse/HBASE-19488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426477#comment-16426477 ] Hudson commented on HBASE-19488: Results for branch branch-2 [build #571 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/571/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/571//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/571//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/571//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Move to using Apache commons CollectionUtils > > > Key: HBASE-19488 > URL: https://issues.apache.org/jira/browse/HBASE-19488 > Project: HBase > Issue Type: Improvement > Components: hbase >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Fix For: 2.1.0 > > Attachments: HBASE-19488.1.patch, HBASE-19488.2.patch, > HBASE-19488.3.patch, HBASE-19488.4.patch, HBASE-19488.5.patch > > > A bunch of unused code in CollectionUtils or code that can be found in Apache > Commons libraries. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-17730) [DOC] Migration to 2.0 for coprocessors
[ https://issues.apache.org/jira/browse/HBASE-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426476#comment-16426476 ] Hudson commented on HBASE-17730: Results for branch branch-2 [build #571 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/571/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/571//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/571//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/571//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > [DOC] Migration to 2.0 for coprocessors > > > Key: HBASE-17730 > URL: https://issues.apache.org/jira/browse/HBASE-17730 > Project: HBase > Issue Type: Sub-task > Components: documentation, migration >Reporter: Appy >Assignee: Appy >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-17730.master.001.patch > > > Jiras breaking coprocessor compatibility should be marked with component ' > Coprocessor', and label 'incompatible'. > Close to releasing 2.0, we should go through all such jiras and write down > steps for migrating coprocessor easily. > The idea is, it might be very hard to fix coprocessor breakages by reverse > engineering errors, but will be easier we suggest easiest way to fix > breakages resulting from each individual incompatible change. > For eg. HBASE-17312 is incompatible change. 
It'll result in hundreds of errors > because the BaseXXXObserver classes are gone, and will probably cause a lot of > confusion; but if we explicitly mention the fix, which is just a one-line change > - replace "Foo extends BaseXXXObserver" with "Foo implements XXXObserver" - > it becomes very easy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20188) [TESTING] Performance
[ https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426469#comment-16426469 ] stack commented on HBASE-20188: --- Added a 5th sheet to our doc. I ran compare of 1.2.7 to 2.0.0 and 2.0.0 w/o in-memory compaction w/ 8G of heap; i.e. lots of cache misses. 1.2.7 is > 2x the throughput for read-only loads and 10-20% better on writes. On mixed-load, 2.0.0 with in-memory-compaction OFF is better than 1.2.7 but with it on, its much worse. Something is wrong here when lots of cache misses (With previous runs with heap of 31G, most reads were from cache). > [TESTING] Performance > - > > Key: HBASE-20188 > URL: https://issues.apache.org/jira/browse/HBASE-20188 > Project: HBase > Issue Type: Umbrella > Components: Performance >Reporter: stack >Assignee: stack >Priority: Blocker > Fix For: 2.0.0 > > Attachments: CAM-CONFIG-V01.patch, HBASE-20188.sh, HBase 2.0 > performance evaluation - Basic vs None_ system settings.pdf, > ITBLL2.5B_1.2.7vs2.0.0_cpu.png, ITBLL2.5B_1.2.7vs2.0.0_gctime.png, > ITBLL2.5B_1.2.7vs2.0.0_iops.png, ITBLL2.5B_1.2.7vs2.0.0_load.png, > ITBLL2.5B_1.2.7vs2.0.0_memheap.png, ITBLL2.5B_1.2.7vs2.0.0_memstore.png, > ITBLL2.5B_1.2.7vs2.0.0_ops.png, > ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png, > YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png, > YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png, > flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml, > lock.127.workloadc.20180402T200918Z.svg, > lock.2.memsize2.c.20180403T160257Z.svg, run_ycsb.sh, tree.txt > > > How does 2.0.0 compare to old versions? Is it faster, slower? There is rumor > that it is much slower, that the problem is the asyncwal writing. Does > in-memory compaction slow us down or speed us up? What happens when you > enable offheaping? > Keep notes here in this umbrella issue. Need to be able to say something > about perf when 2.0.0 ships. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-16499) slow replication for small HBase clusters
[ https://issues.apache.org/jira/browse/HBASE-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426452#comment-16426452 ] Ashish Singhi commented on HBASE-16499: --- [~stack] or [~busbey] can you please check the addendum attached. > slow replication for small HBase clusters > - > > Key: HBASE-16499 > URL: https://issues.apache.org/jira/browse/HBASE-16499 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Vikas Vishwakarma >Assignee: Ashish Singhi >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-16499-addendum.patch, HBASE-16499.patch, > HBASE-16499.patch > > > For small clusters (10-20 nodes) we recently observed that replication progresses very slowly when we do bulk writes, and a lot of lag accumulates on AgeOfLastShipped / SizeOfLogQueue. From the logs we observed that the number of threads used for shipping wal edits in parallel comes from the following code in HBaseInterClusterReplicationEndpoint:
>
>   int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
>       replicationSinkMgr.getSinks().size());
>   ...
>   for (int i = 0; i < n; i++) {
>     entryLists.add(new ArrayList<Entry>(entries.size()/n+1)); // <-- batch size
>   }
>   ...
>   for (int i = 0; i < entryLists.size(); i++) {
>     if (!entryLists.get(i).isEmpty()) {
>       // RuntimeExceptions encountered here bubble up and are handled in ReplicationSource
>       pool.submit(createReplicator(entryLists.get(i), i)); // <-- concurrency
>       futures++;
>     }
>   }
>
> maxThreads is fixed and configurable, and since we take the min of the three values, n is decided by replicationSinkMgr.getSinks().size() whenever we have enough edits to replicate. replicationSinkMgr.getSinks().size() is in turn decided by
>
>   int numSinks = (int) Math.ceil(slaveAddresses.size() * ratio);
>
> where ratio is this.ratio = conf.getFloat("replication.source.ratio", DEFAULT_REPLICATION_SOURCE_RATIO);
>
> Currently DEFAULT_REPLICATION_SOURCE_RATIO is set to 10%, so for small clusters of 10-20 RegionServers the value we get for numSinks, and hence n, is very small, like 1 or 2. This substantially reduces the pool concurrency used for shipping wal edits in parallel, effectively slowing down replication for small clusters and causing a lot of lag accumulation in AgeOfLastShipped. Sometimes it takes tens of hours to clear the entire replication queue even after the client has finished writing on the source side. We are running tests varying replication.source.ratio and have seen multi-fold improvement in total replication time (will update the results here). I wanted to propose that we also increase the default value of replication.source.ratio so that we have sufficient concurrency even for small clusters. We figured this out after a lot of iterations and debugging, so a slightly higher default will probably save others the trouble. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
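The effect of the ratio on shipper concurrency is easy to check in isolation. The sketch below redoes the two quoted expressions in plain standalone Java — `numSinks` and `shipperThreads` are hypothetical helper names chosen here for illustration, and the maxThreads=10 figure assumes the stock replication.source.maxthreads default:

```java
// Standalone sketch of the sink/thread arithmetic described in the issue.
public class ReplicationSinkMath {
    // numSinks = ceil(slaveAddresses * ratio), mirroring the quoted expression
    static int numSinks(int slaveAddresses, float ratio) {
        return (int) Math.ceil(slaveAddresses * ratio);
    }

    // n = min(maxThreads, entries/100 + 1, sinks), mirroring the quoted expression
    static int shipperThreads(int maxThreads, int entries, int sinks) {
        return Math.min(Math.min(maxThreads, entries / 100 + 1), sinks);
    }

    public static void main(String[] args) {
        // 10-node peer, default ratio 0.1 -> 1 sink, so only 1 shipper thread,
        // even with 5000 edits queued and maxThreads at 10.
        System.out.println(shipperThreads(10, 5000, numSinks(10, 0.1f))); // 1
        // Raising the ratio to 0.5 yields 5 sinks -> 5 shipper threads.
        System.out.println(shipperThreads(10, 5000, numSinks(10, 0.5f))); // 5
    }
}
```

Under the default ratio, any peer cluster smaller than ten RegionServers caps the shipper pool at a single thread, which matches the lag behaviour reported above.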
[jira] [Commented] (HBASE-15227) HBase Backup Phase 3: Fault tolerance (client/server) support
[ https://issues.apache.org/jira/browse/HBASE-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426451#comment-16426451 ] stack commented on HBASE-15227: --- What is 'Done'? Where is it described? > HBase Backup Phase 3: Fault tolerance (client/server) support > - > > Key: HBASE-15227 > URL: https://issues.apache.org/jira/browse/HBASE-15227 > Project: HBase > Issue Type: Task >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Labels: backup > Attachments: HBASE-15227-v3.patch, HBASE-15277-v1.patch > > > The system must be tolerant to faults: > # Backup operations MUST be atomic (no partial completion state in the backup > system table) > # The process must detect any type of failure that can result in data loss > (partial backup or partial restore) > # Proper system table state restore and cleanup must be done in case of a > failure > # An additional utility to repair the backup system table, with corresponding file > system cleanup, must be implemented > h3. Backup > h4. General FT framework implementation > Before the actual backup operation starts, a snapshot of the backup system table is > taken and the system table is updated with the *ACTIVE_SNAPSHOT* flag. The flag is > removed upon backup completion. > In case of *any* server-side failure, the client catches errors/exceptions and > handles them: > # Cleans up the backup destination (removes partial backup data) > # Cleans up any temporary data > # Deletes any active snapshots of the tables being backed up (during full > backup we snapshot tables) > # Restores the backup system table from the snapshot > # Deletes the backup system table snapshot (we read the snapshot name from the backup > system table beforehand) > In case of *any* client-side failure: > Before any backup or restore operation runs, we check the backup system table for > *ACTIVE_SNAPSHOT*; if the flag is present, the operation aborts with a message that the > backup repair tool (see below) must be run > h4. 
Backup repair tool > The command line tool *backup repair* executes the following steps: > # Reads info of the last failed backup session > # Cleans up the backup destination (removes partial backup data) > # Cleans up any temporary data > # Deletes any active snapshots of the tables being backed up (during full > backup we snapshot tables) > # Restores the backup system table from the snapshot > # Deletes the backup system table snapshot (we read the snapshot name from the backup > system table beforehand) > h4. Detection of a partial loss of data > h5. Full backup > Export snapshot operation (?). > We count files and check sizes before and after the DistCp run > h5. Incremental backup > Conversion of WAL to HFiles, when a WAL file is moved from the active to the archive > directory. The code is in place to handle this situation > During the DistCp run (same as above) > h3. Restore > This operation does not modify the backup system table and is idempotent. No > special FT is required. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20057) update .gitignore file to ignores *.patch file
[ https://issues.apache.org/jira/browse/HBASE-20057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426439#comment-16426439 ] maoling commented on HBASE-20057: - Oh, yes. You are right. Thanks for your ack. > update .gitignore file to ignores *.patch file > --- > > Key: HBASE-20057 > URL: https://issues.apache.org/jira/browse/HBASE-20057 > Project: HBase > Issue Type: Improvement > Components: conf >Reporter: maoling >Assignee: maoling >Priority: Trivial > Attachments: HBASE-20057-master-v0.patch > > > When generating a patch with git format-patch, the resulting *.patch file can easily be > committed along with the real code changes. > Updating the .gitignore file to ignore *.patch files will improve the git workflow -- This message was sent by Atlassian JIRA (v7.6.3#76005)
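For reference, the change amounts to a one-line glob pattern in the repository's top-level .gitignore — a sketch only; the exact lines are whatever the attached patch contains:

```
# ignore locally generated patch files (git format-patch output)
*.patch
```

git matches the pattern in any directory of the working tree, so patches generated anywhere in the checkout stay out of `git add -A`.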
[jira] [Commented] (HBASE-20006) TestRestoreSnapshotFromClientWithRegionReplicas is flakey
[ https://issues.apache.org/jira/browse/HBASE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426431#comment-16426431 ] Toshihiro Suzuki commented on HBASE-20006: -- I just reattached the v3 patch to rerun a build. > TestRestoreSnapshotFromClientWithRegionReplicas is flakey > - > > Key: HBASE-20006 > URL: https://issues.apache.org/jira/browse/HBASE-20006 > Project: HBase > Issue Type: Bug > Components: read replicas >Reporter: stack >Assignee: Toshihiro Suzuki >Priority: Critical > Attachments: HBASE-20006.branch-2.001.patch, > HBASE-20006.master.001.patch, HBASE-20006.master.002.patch, > HBASE-20006.master.003.patch, HBASE-20006.master.003.patch > > > Failing 10% of the time. Interestingly, it is below that causes fail. We go > to split but it is already split. We will then fail the split with an > internal assert which messes up procedures; at a minimum we should just not > split (this is in the prepare stage). > {code} > 2018-02-15 23:21:42,162 INFO [PEWorker-12] > procedure.MasterProcedureScheduler(571): pid=105, > state=RUNNABLE:SPLIT_TABLE_REGION_PREPARE; SplitTableRegionProcedure > table=testOnlineSnapshotAfterSplittingRegions-1518736887838, > parent=3f850cea7d71a7ebd019f2f009efca4d, > daughterA=06b5e6366efbef155d70e56cfdf58dc9, > daughterB=8c175de1b33765a5683ac1e502edb0bd, > table=testOnlineSnapshotAfterSplittingRegions-1518736887838, > testOnlineSnapshotAfterSplittingRegions-1518736887838,,1518736887882.3f850cea7d71a7ebd019f2f009efca4d. 
> 2018-02-15 23:21:42,162 INFO [PEWorker-12] > assignment.SplitTableRegionProcedure(440): Split of {ENCODED => > 3f850cea7d71a7ebd019f2f009efca4d, NAME => > 'testOnlineSnapshotAfterSplittingRegions-1518736887838,,1518736887882.3f850cea7d71a7ebd019f2f009efca4d.', > STARTKEY => '', ENDKEY => '1'} skipped; state is already SPLIT > 2018-02-15 23:21:42,163 ERROR [PEWorker-12] > procedure2.ProcedureExecutor(1480): CODE-BUG: Uncaught runtime exception: > pid=105, state=RUNNABLE:SPLIT_TABLE_REGION_PREPARE; SplitTableRegionProcedure > table=testOnlineSnapshotAfterSplittingRegions-1518736887838, > parent=3f850cea7d71a7ebd019f2f009efca4d, > daughterA=06b5e6366efbef155d70e56cfdf58dc9, > daughterB=8c175de1b33765a5683ac1e502edb0bd > java.lang.AssertionError: split region should have an exception here > at > org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:228) > at > org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:89) > at > org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:180) > at > org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1455) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1224) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20006) TestRestoreSnapshotFromClientWithRegionReplicas is flakey
[ https://issues.apache.org/jira/browse/HBASE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Toshihiro Suzuki updated HBASE-20006: - Attachment: HBASE-20006.master.003.patch > TestRestoreSnapshotFromClientWithRegionReplicas is flakey > - > > Key: HBASE-20006 > URL: https://issues.apache.org/jira/browse/HBASE-20006 > Project: HBase > Issue Type: Bug > Components: read replicas >Reporter: stack >Assignee: Toshihiro Suzuki >Priority: Critical > Attachments: HBASE-20006.branch-2.001.patch, > HBASE-20006.master.001.patch, HBASE-20006.master.002.patch, > HBASE-20006.master.003.patch, HBASE-20006.master.003.patch > > > Failing 10% of the time. Interestingly, it is below that causes fail. We go > to split but it is already split. We will then fail the split with an > internal assert which messes up procedures; at a minimum we should just not > split (this is in the prepare stage). > {code} > 2018-02-15 23:21:42,162 INFO [PEWorker-12] > procedure.MasterProcedureScheduler(571): pid=105, > state=RUNNABLE:SPLIT_TABLE_REGION_PREPARE; SplitTableRegionProcedure > table=testOnlineSnapshotAfterSplittingRegions-1518736887838, > parent=3f850cea7d71a7ebd019f2f009efca4d, > daughterA=06b5e6366efbef155d70e56cfdf58dc9, > daughterB=8c175de1b33765a5683ac1e502edb0bd, > table=testOnlineSnapshotAfterSplittingRegions-1518736887838, > testOnlineSnapshotAfterSplittingRegions-1518736887838,,1518736887882.3f850cea7d71a7ebd019f2f009efca4d. 
> 2018-02-15 23:21:42,162 INFO [PEWorker-12] > assignment.SplitTableRegionProcedure(440): Split of {ENCODED => > 3f850cea7d71a7ebd019f2f009efca4d, NAME => > 'testOnlineSnapshotAfterSplittingRegions-1518736887838,,1518736887882.3f850cea7d71a7ebd019f2f009efca4d.', > STARTKEY => '', ENDKEY => '1'} skipped; state is already SPLIT > 2018-02-15 23:21:42,163 ERROR [PEWorker-12] > procedure2.ProcedureExecutor(1480): CODE-BUG: Uncaught runtime exception: > pid=105, state=RUNNABLE:SPLIT_TABLE_REGION_PREPARE; SplitTableRegionProcedure > table=testOnlineSnapshotAfterSplittingRegions-1518736887838, > parent=3f850cea7d71a7ebd019f2f009efca4d, > daughterA=06b5e6366efbef155d70e56cfdf58dc9, > daughterB=8c175de1b33765a5683ac1e502edb0bd > java.lang.AssertionError: split region should have an exception here > at > org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:228) > at > org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:89) > at > org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:180) > at > org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1455) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1224) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20351) Shell dumps netty properties on startup
stack created HBASE-20351: - Summary: Shell dumps netty properties on startup Key: HBASE-20351 URL: https://issues.apache.org/jira/browse/HBASE-20351 Project: HBase Issue Type: Bug Components: pain-in-the-ass, shell Reporter: stack Fix For: 2.0.0 {code} stack@ve0524:~$ ./hbase/bin/hbase --config conf_hbase shell 2018-04-04 19:58:02,187 DEBUG [main] logging.InternalLoggerFactory: Using SLF4J as the default logging framework 2018-04-04 19:58:02,191 DEBUG [main] util.ResourceLeakDetector: -Dorg.apache.hbase.thirdparty.io.netty.leakDetection.level: simple 2018-04-04 19:58:02,192 DEBUG [main] util.ResourceLeakDetector: -Dorg.apache.hbase.thirdparty.io.netty.leakDetection.targetRecords: 4 2018-04-04 19:58:02,214 DEBUG [main] internal.PlatformDependent0: -Dio.netty.noUnsafe: false 2018-04-04 19:58:02,215 DEBUG [main] internal.PlatformDependent0: Java version: 8 2018-04-04 19:58:02,216 DEBUG [main] internal.PlatformDependent0: sun.misc.Unsafe.theUnsafe: available 2018-04-04 19:58:02,216 DEBUG [main] internal.PlatformDependent0: sun.misc.Unsafe.copyMemory: available 2018-04-04 19:58:02,217 DEBUG [main] internal.PlatformDependent0: java.nio.Buffer.address: available 2018-04-04 19:58:02,217 DEBUG [main] internal.PlatformDependent0: direct buffer constructor: available 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent0: java.nio.Bits.unaligned: available, true 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent0: jdk.internal.misc.Unsafe.allocateUninitializedArray(int): unavailable prior to Java9 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent0: java.nio.DirectByteBuffer.(long, int): available 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent: sun.misc.Unsafe: available 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent: -Dio.netty.tmpdir: /tmp (java.io.tmpdir) 2018-04-04 19:58:02,218 DEBUG [main] internal.PlatformDependent: -Dio.netty.bitMode: 64 (sun.arch.data.model) 2018-04-04 19:58:02,219 DEBUG [main] 
internal.PlatformDependent: -Dio.netty.noPreferDirect: false 2018-04-04 19:58:02,219 DEBUG [main] internal.PlatformDependent: -Dio.netty.maxDirectMemory: 1073741824 bytes 2018-04-04 19:58:02,219 DEBUG [main] internal.PlatformDependent: -Dio.netty.uninitializedArrayAllocationThreshold: -1 2018-04-04 19:58:02,220 DEBUG [main] internal.CleanerJava6: java.nio.ByteBuffer.cleaner(): available 2018-04-04 19:58:02,220 DEBUG [main] util.ResourceLeakDetectorFactory: Loaded default ResourceLeakDetector: org.apache.hbase.thirdparty.io.netty.util.ResourceLeakDetector@7dbae40 2018-04-04 19:58:02,229 DEBUG [main] internal.PlatformDependent: org.jctools-core.MpscChunkedArrayQueue: available 2018-04-04 19:58:02,260 DEBUG [main] channel.MultithreadEventLoopGroup: -Dio.netty.eventLoopThreads: 96 2018-04-04 19:58:02,282 DEBUG [main] nio.NioEventLoop: -Dio.netty.noKeySetOptimization: false 2018-04-04 19:58:02,282 DEBUG [main] nio.NioEventLoop: -Dio.netty.selectorAutoRebuildThreshold: 512 HBase Shell Use "help" to get list of supported commands. Use "exit" to quit this interactive shell. 
Version 2.0.0, r0db342d312784a6663b406fdb0f7b3b3c1fa928d, Mon Apr 2 22:54:56 PDT 2018 Took 0.0028 seconds hbase(main):001:0> {code} Does it each time I run a command {code} hbase(main):001:0> describe 'ycsb' 2018-04-04 19:59:00,084 DEBUG [main] buffer.AbstractByteBuf: -Dorg.apache.hbase.thirdparty.io.netty.buffer.bytebuf.checkAccessible: true 2018-04-04 19:59:00,084 DEBUG [main] util.ResourceLeakDetectorFactory: Loaded default ResourceLeakDetector: org.apache.hbase.thirdparty.io.netty.util.ResourceLeakDetector@66ab924 2018-04-04 19:59:00,121 DEBUG [main] channel.DefaultChannelId: -Dio.netty.processId: 697 (auto-detected) 2018-04-04 19:59:00,123 DEBUG [main] util.NetUtil: -Djava.net.preferIPv4Stack: true 2018-04-04 19:59:00,123 DEBUG [main] util.NetUtil: -Djava.net.preferIPv6Addresses: false 2018-04-04 19:59:00,124 DEBUG [main] util.NetUtil: Loopback interface: lo (lo, 127.0.0.1) 2018-04-04 19:59:00,125 DEBUG [main] util.NetUtil: /proc/sys/net/core/somaxconn: 128 2018-04-04 19:59:00,125 DEBUG [main] channel.DefaultChannelId: -Dio.netty.machineId: 00:1e:67:ff:fe:c5:54:b4 (auto-detected) 2018-04-04 19:59:00,130 DEBUG [main] internal.InternalThreadLocalMap: -Dio.netty.threadLocalMap.stringBuilder.initialSize: 1024 2018-04-04 19:59:00,131 DEBUG [main] internal.InternalThreadLocalMap: -Dio.netty.threadLocalMap.stringBuilder.maxSize: 4096 2018-04-04 19:59:00,151 DEBUG [main] buffer.PooledByteBufAllocator: -Dio.netty.allocator.numHeapArenas: 82 2018-04-04 19:59:00,151 DEBUG [main] buffer.PooledByteBufAllocator: -Dio.netty.allocator.numDirectArenas: 10 2018-04-04 19:59:00,151 DEBUG [main] buffer.PooledByteBufAllocator: -Dio.netty.allocator.pageSize: 8192 2018-04-04 19:59:00,151 DEBUG [main]
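Until the logging itself is fixed, an operator can silence the chatter by raising the level on the shaded netty loggers — a sketch assuming the stock log4j.properties shipped in conf/, with the logger prefix taken from the -D property names visible in the dump above:

```properties
# conf/log4j.properties — quiet the shaded-netty DEBUG output on shell startup
log4j.logger.org.apache.hbase.thirdparty.io.netty=INFO
```

This is a workaround, not a fix: the underlying issue is that the shell attaches a DEBUG-level console appender that the shaded netty classes pick up.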
[jira] [Commented] (HBASE-19572) RegionMover should use the configured default port number and not the one from HConstants
[ https://issues.apache.org/jira/browse/HBASE-19572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426423#comment-16426423 ] Toshihiro Suzuki commented on HBASE-19572: -- I'm not familiar with shadedjars but it seems like it isn't related to the patch, because when I ran it in master branch locally, the same error occurred. Thanks. > RegionMover should use the configured default port number and not the one > from HConstants > - > > Key: HBASE-19572 > URL: https://issues.apache.org/jira/browse/HBASE-19572 > Project: HBase > Issue Type: Bug >Reporter: Esteban Gutierrez >Assignee: Toshihiro Suzuki >Priority: Major > Attachments: HBASE-19572.master.001.patch, HBASE-19572.patch > > > The issue I ran into in HBASE-19499 was due to RegionMover not using the port set > in {{hbase-site.xml}}. The tool should use the value from the > configuration before falling back to the hardcoded value > {{HConstants.DEFAULT_REGIONSERVER_PORT}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20219) An error occurs when scanning with reversed=true and loadColumnFamiliesOnDemand=true
[ https://issues.apache.org/jira/browse/HBASE-20219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426424#comment-16426424 ] Toshihiro Suzuki commented on HBASE-20219: -- I'm not familiar with shadedjars but it seems like it isn't related to the patch, because when I ran it in master branch locally, the same error occurred. Thanks. > An error occurs when scanning with reversed=true and > loadColumnFamiliesOnDemand=true > > > Key: HBASE-20219 > URL: https://issues.apache.org/jira/browse/HBASE-20219 > Project: HBase > Issue Type: Bug >Reporter: Toshihiro Suzuki >Assignee: Toshihiro Suzuki >Priority: Major > Attachments: HBASE-20219-UT.patch, HBASE-20219.master.001.patch, > HBASE-20219.master.002.patch, HBASE-20219.master.003.patch, > HBASE-20219.master.004.patch > > > I'm facing the following error when scanning with reversed=true and > loadColumnFamiliesOnDemand=true: > {code} > java.lang.IllegalStateException: requestSeek cannot be called on > ReversedKeyValueHeap > at > org.apache.hadoop.hbase.regionserver.ReversedKeyValueHeap.requestSeek(ReversedKeyValueHeap.java:66) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.joinedHeapMayHaveData(HRegion.java:6725) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6652) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6364) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3108) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3345) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41548) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at > 
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > {code} > I will attach a UT patch to reproduce this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19893) restore_snapshot is broken in master branch when region splits
[ https://issues.apache.org/jira/browse/HBASE-19893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426422#comment-16426422 ] Toshihiro Suzuki commented on HBASE-19893: -- It seems like TestReplicationKillMasterRS.killOneMasterRS failed in the last build, but I don't think it is related to the patch and the test was successful when I ran it locally. I'm not familiar with shadedjars but it seems like it isn't related to the patch, because when I ran it in master branch locally, the same error occurred. Thanks. > restore_snapshot is broken in master branch when region splits > -- > > Key: HBASE-19893 > URL: https://issues.apache.org/jira/browse/HBASE-19893 > Project: HBase > Issue Type: Bug > Components: snapshots >Reporter: Toshihiro Suzuki >Assignee: Toshihiro Suzuki >Priority: Critical > Attachments: HBASE-19893.master.001.patch, > HBASE-19893.master.002.patch, HBASE-19893.master.003.patch > > > When I was investigating HBASE-19850, I found restore_snapshot didn't work in > master branch. > > Steps to reproduce are as follows: > 1. Create a table > {code:java} > create "test", "cf" > {code} > 2. Load data (2000 rows) to the table > {code:java} > (0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"} > {code} > 3. Split the table > {code:java} > split "test" > {code} > 4. Take a snapshot > {code:java} > snapshot "test", "snap" > {code} > 5. Load more data (2000 rows) to the table and split the table agin > {code:java} > (2000...4000).each{|i| put "test", "row#{i}", "cf:col", "val"} > split "test" > {code} > 6. Restore the table from the snapshot > {code:java} > disable "test" > restore_snapshot "snap" > enable "test" > {code} > 7. 
Scan the table > {code:java} > scan "test" > {code} > However, this scan returns only 244 rows (it should return 2000 rows) like > the following: > {code:java} > hbase(main):038:0> scan "test" > ROW COLUMN+CELL > row78 column=cf:col, timestamp=1517298307049, value=val > > row999 column=cf:col, timestamp=1517298307608, value=val > 244 row(s) > Took 0.1500 seconds > {code} > > Also, the restored table should have 2 online regions but it has 3 online > regions. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20329) Add note for operators to refguide on AsyncFSWAL
[ https://issues.apache.org/jira/browse/HBASE-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426415#comment-16426415 ] Anoop Sam John commented on HBASE-20329: passing the WAL provder -> provider written concurrentl ("fan-out") -> concurrently default default client does. -> 'default ' repeating And you say it "Do not confuse the _ASYNC_WAL_ option on a Mutation or Table with the _AsyncFSWAL_ writer; they are distinctoptions unfortunately closely named" :-) > Add note for operators to refguide on AsyncFSWAL > > > Key: HBASE-20329 > URL: https://issues.apache.org/jira/browse/HBASE-20329 > Project: HBase > Issue Type: Sub-task > Components: documentation, wal >Reporter: stack >Assignee: stack >Priority: Critical > Fix For: 3.0.0 > > Attachments: HBASE-20329.master.001.patch > > > Need a few notes in refguide on this new facility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20329) Add note for operators to refguide on AsyncFSWAL
[ https://issues.apache.org/jira/browse/HBASE-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426413#comment-16426413 ] Anoop Sam John commented on HBASE-20329: I see the patch now.. We edit the Durability stuff also in this.. Fine.. Noticed few spell issues etc in the patch, can correct with addendum. > Add note for operators to refguide on AsyncFSWAL > > > Key: HBASE-20329 > URL: https://issues.apache.org/jira/browse/HBASE-20329 > Project: HBase > Issue Type: Sub-task > Components: documentation, wal >Reporter: stack >Assignee: stack >Priority: Critical > Fix For: 3.0.0 > > Attachments: HBASE-20329.master.001.patch > > > Need a few notes in refguide on this new facility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20329) Add note for operators to refguide on AsyncFSWAL
[ https://issues.apache.org/jira/browse/HBASE-20329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426410#comment-16426410 ] Anoop Sam John commented on HBASE-20329: That Q was not really on the new WAL impl 'AsyncFSWAL'. That was more on our Durability semantics ASYNC_WAL. This can be used with whatever WAL impl under the server. When it is ASYNC_WAL durability, we will not call sync to complete the write ops. The sync will happen in an async way later. So ya, when a crash happens, we may get data loss. But with our new WAL impl, AsyncFSWAL, there can NOT be data loss. :-).. As this is clubbed with this AsyncFSWAL doc jira, am saying boss.. Now these names confuse us a lot. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
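For operators, the two similarly named knobs discussed above live in different places. The per-operation ASYNC_WAL durability is set on a Mutation from client code (e.g. `put.setDurability(Durability.ASYNC_WAL)`), while the AsyncFSWAL writer is chosen server-side via the WAL provider property. A minimal hbase-site.xml sketch follows; the property name and values are as documented for HBase 2.0, so verify them against your release:

```xml
<!-- Server-side: selects the WAL writer implementation. "asyncfs" picks
     AsyncFSWAL; "filesystem" picks the older FSHLog. This is unrelated to
     the per-Mutation ASYNC_WAL durability setting discussed above. -->
<property>
  <name>hbase.wal.provider</name>
  <value>asyncfs</value>
</property>
```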
[jira] [Commented] (HBASE-20152) [AMv2] DisableTableProcedure versus ServerCrashProcedure
[ https://issues.apache.org/jira/browse/HBASE-20152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426388#comment-16426388 ] Mike Drob commented on HBASE-20152: --- [~stack] [~uagashe] [~Apache9] - all subtasks here are resolved. Is this issue done or do we have more work to do? > [AMv2] DisableTableProcedure versus ServerCrashProcedure > > > Key: HBASE-20152 > URL: https://issues.apache.org/jira/browse/HBASE-20152 > Project: HBase > Issue Type: Bug > Components: amv2 >Reporter: stack >Assignee: stack >Priority: Major > > Seeing a small spate of issues where disabled tables/regions are being > assigned. Usually they happen when a DisableTableProcedure is running > concurrent with a ServerCrashProcedure. See below. See associated > HBASE-20131. This is the umbrella issue for fixing. > h3. Deadlock > From HBASE-20137, 'TestRSGroups is Flakey', > https://issues.apache.org/jira/browse/HBASE-20137?focusedCommentId=16390325=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16390325 > {code} > * SCP is running because a server was aborted in test. > * SCP starts AssignProcedure of region X from crashed server. > * DisableTable Procedure runs because test has finished and we're doing > table delete. Queues > * UnassignProcedure for region X. > * Disable Unassign gets Lock on region X first. > * SCP AssignProcedure tries to get lock, waits on lock. > * DisableTable Procedure UnassignProcedure RPC fails because server is down > (That's why the SCP). > * Tries to expire the server it failed the RPC against. Fails (currently > being SCP'd). > * DisableTable Procedure Unassign is suspended. It is a suspend with lock on > region X held > * SCP can't run because lock on X is held > * Test times out. > {code} > Here is the actual log from around the deadlock. pid=308 is the SCP.
pid=309 > is the disable table: > {code} > 2018-03-05 11:29:21,224 DEBUG [PEWorker-7] > procedure.ServerCrashProcedure(225): Done splitting WALs pid=308, > state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS; ServerCrashProcedure > server=1cfd208ff882,40584,1520249102524, splitWal=true, meta=false > 2018-03-05 11:29:21,300 INFO > [RpcServer.default.FPBQ.Fifo.handler=2,queue=0,port=38498] > rsgroup.RSGroupAdminServer(371): Move server done: default=>appInfo > 2018-03-05 11:29:21,307 INFO > [RpcServer.default.FPBQ.Fifo.handler=2,queue=0,port=38498] > rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl(279): > Client=jenkins//172.17.0.2 list rsgroup > 2018-03-05 11:29:21,312 INFO [Time-limited test] client.HBaseAdmin$15(901): > Started disable of Group_ns:testKillRS > 2018-03-05 11:29:21,313 INFO > [RpcServer.default.FPBQ.Fifo.handler=2,queue=0,port=38498] > master.HMaster$7(2278): Client=jenkins//172.17.0.2 disable Group_ns:testKillRS > 2018-03-05 11:29:21,384 INFO [PEWorker-9] > procedure2.ProcedureExecutor(1495): Initialized subprocedures=[{pid=310, > ppid=308, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure > table=Group_ns:testKillRS, region=de7534c208a06502537cd95c248b3043}] > 2018-03-05 11:29:21,534 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=2,queue=0,port=38498] > procedure2.ProcedureExecutor(865): Stored pid=309, > state=RUNNABLE:DISABLE_TABLE_PREPARE; DisableTableProcedure > table=Group_ns:testKillRS > 2018-03-05 11:29:21,542 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=2,queue=0,port=38498] > master.MasterRpcServices(1134): Checking to see if procedure is done pid=309 > 2018-03-05 11:29:21,644 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=2,queue=0,port=38498] > master.MasterRpcServices(1134): Checking to see if procedure is done pid=309 > 2018-03-05 11:29:21,847 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=2,queue=0,port=38498] > master.MasterRpcServices(1134): Checking to see if procedure is done pid=309 > 2018-03-05 11:29:22,118 DEBUG [PEWorker-5] 
hbase.MetaTableAccessor(1944): Put > {"totalColumns":1,"row":"Group_ns:testKillRS","families":{"table":[{"qualifier":"state","vlen":2,"tag":[],"timestamp":1520249362117}]},"ts":1520249362117} > 2018-03-05 11:29:22,123 INFO [PEWorker-5] hbase.MetaTableAccessor(1646): > Updated table Group_ns:testKillRS state to DISABLING in META > 2018-03-05 11:29:22,148 DEBUG > [RpcServer.default.FPBQ.Fifo.handler=2,queue=0,port=38498] > master.MasterRpcServices(1134): Checking to see if procedure is done pid=309 > 2018-03-05 11:29:22,345 INFO [PEWorker-5] > procedure2.ProcedureExecutor(1495): Initialized subprocedures=[{pid=311, > ppid=309, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=Group_ns:testKillRS, region=de7534c208a06502537cd95c248b3043, > server=1cfd208ff882,40584,1520249102524}] > 2018-03-05 11:29:22,503 INFO [PEWorker-13] > procedure.MasterProcedureScheduler(571): pid=311,
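The stall described in the bullet list above boils down to a lock held across a suspension. Below is a minimal JDK-only sketch (illustrative; this is not HBase's procedure framework or its actual region-lock implementation): one thread stands in for the DisableTableProcedure's UnassignProcedure, which takes the region lock and is suspended without releasing it, so a second thread standing in for the SCP's AssignProcedure can never acquire it.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class RegionLockStall {
    // Returns whether the "SCP assign" thread manages to take the region
    // lock while the "disable unassign" (the current thread) still holds it.
    static boolean scpGetsLock() {
        ReentrantLock regionX = new ReentrantLock();
        regionX.lock(); // UnassignProcedure takes the lock, then suspends without unlocking

        final boolean[] acquired = new boolean[1];
        Thread scpAssign = new Thread(() -> {
            try {
                acquired[0] = regionX.tryLock(100, TimeUnit.MILLISECONDS);
                if (acquired[0]) {
                    regionX.unlock();
                }
            } catch (InterruptedException ignored) {
            }
        });
        scpAssign.start();
        try {
            scpAssign.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired[0]; // false: the assign can only wait
    }

    public static void main(String[] args) {
        System.out.println(scpGetsLock());
    }
}
```

Until the suspended holder releases the lock, every retry by the second party fails the same way, which is why the test above only ends by timing out.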
[jira] [Commented] (HBASE-20346) [DOC] document change to shell tests
[ https://issues.apache.org/jira/browse/HBASE-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426376#comment-16426376 ] Mike Drob commented on HBASE-20346: --- docs only change so test4tests and unit shouldn't be related > [DOC] document change to shell tests > > > Key: HBASE-20346 > URL: https://issues.apache.org/jira/browse/HBASE-20346 > Project: HBase > Issue Type: Sub-task > Components: documentation, shell >Affects Versions: 2.0.0-beta-2, 2.0.0 >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-20346.patch > > > HBASE-19903 changed how the shell tests are organized and executed, but it > missed updating the section on the ref guide that talks about the shell tests. > bring it up to date so that folks don't miss a bunch of the tests or add new > ones in the wrong place. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20330) ProcedureExecutor.start() gets stuck in recover lease on store.
[ https://issues.apache.org/jira/browse/HBASE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426351#comment-16426351 ] Hadoop QA commented on HBASE-20330: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 31s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 2s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 24s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 50s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 19m 54s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 39s{color} | {color:green} hbase-procedure in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 39m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f | | JIRA Issue | HBASE-20330 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12917624/hbase-20330.master.001.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux c1df15202630 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 5fed7fd3d2 | | maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) | | Default Java | 1.8.0_162 | | findbugs | v3.1.0-RC3 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12304/testReport/ | | Max. process+thread count | 295 (vs. ulimit of 1) | | modules | C: hbase-procedure U: hbase-procedure | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/12304/console | |
[jira] [Commented] (HBASE-20346) [DOC] document change to shell tests
[ https://issues.apache.org/jira/browse/HBASE-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426331#comment-16426331 ] Hadoop QA commented on HBASE-20346: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 10s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 26s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}109m 8s{color} | {color:red} root in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}123m 3s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.master.procedure.TestServerCrashProcedure | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f | | JIRA Issue | HBASE-20346 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12917614/HBASE-20346.patch | | Optional Tests | asflicense javac javadoc unit | | uname | Linux 433213201daa 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 5fed7fd3d2 | | maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) | | Default Java | 1.8.0_162 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/12303/artifact/patchprocess/patch-unit-root.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12303/testReport/ | | Max. process+thread count | 4449 (vs. ulimit of 1) | | modules | C: . U: . | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/12303/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20330) ProcedureExecutor.start() gets stuck in recover lease on store.
[ https://issues.apache.org/jira/browse/HBASE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426323#comment-16426323 ] Umesh Agashe commented on HBASE-20330: -- [~appy], [~xiaochen], can you please review the changes? > ProcedureExecutor.start() gets stuck in recover lease on store. > --- > > Key: HBASE-20330 > URL: https://issues.apache.org/jira/browse/HBASE-20330 > Project: HBase > Issue Type: Bug > Components: proc-v2 >Affects Versions: 2.0.0-beta-2 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.0.0 > > Attachments: hbase-20330.master.001.patch > > > We have an instance in our internal testing where the master log is getting > filled with the following messages: > {code} > 2018-04-02 17:11:17,566 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: > Recover lease on dfs file > hdfs://ns1/hbase/MasterProcWALs/pv2-0018.log > 2018-04-02 17:11:17,567 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: > Recovered lease, attempt=0 on > file=hdfs://ns1/hbase/MasterProcWALs/pv2-0018.log after 1ms > 2018-04-02 17:11:17,574 WARN > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: Unable to > read tracker for hdfs://ns1/hbase/MasterProcWALs/pv2-0018.log > - Invalid Trailer version. 
got 111 expected 1 > 2018-04-02 17:11:17,576 ERROR > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: Log file with > id=19 already exists > org.apache.hadoop.fs.FileAlreadyExistsException: > /hbase/MasterProcWALs/pv2-0019.log for client 10.17.202.11 > already exists > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.startFile(FSDirWriteFileOp.java:381) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2442) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2339) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:764) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:451) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {code} > Debugging it further with [~appy], [~avirmani] and [~xiaochen] we found that > when WALProcedureStore#rollWriter() fails and returns false for some reason, > it keeps looping continuously. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20204) Add locking to RefreshFileConnections in BucketCache
[ https://issues.apache.org/jira/browse/HBASE-20204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426318#comment-16426318 ] Zach York commented on HBASE-20204: --- Whoops, I didn't get around to actually committing this. I'll do that today. > Add locking to RefreshFileConnections in BucketCache > > > Key: HBASE-20204 > URL: https://issues.apache.org/jira/browse/HBASE-20204 > Project: HBase > Issue Type: Bug > Components: BucketCache >Affects Versions: 1.4.3, 2.0.0 >Reporter: Zach York >Assignee: Zach York >Priority: Major > Attachments: HBASE-20204.master.001.patch, > HBASE-20204.master.002.patch, HBASE-20204.master.003.patch > > > This is a follow-up to HBASE-20141 where [~anoop.hbase] suggested adding > locking for refreshing channels. > I have also seen this become an issue when a RS has to abort and it locks on > trying to flush out the remaining data to the cache (since cache on write was > turned on). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
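The fix named in the issue title, serializing channel refresh behind a lock, can be sketched in plain JDK code. Field and method names below are illustrative, not BucketCache's actual internals:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class ChannelGuard {
    private final ReentrantLock refreshLock = new ReentrantLock();
    private final AtomicInteger active = new AtomicInteger();
    private volatile int maxConcurrent = 0;

    // With the lock, at most one thread reopens the backing file channel
    // at a time; without it, concurrent refreshes could race.
    void refreshFileChannel() {
        refreshLock.lock();
        try {
            int now = active.incrementAndGet();
            if (now > maxConcurrent) {
                maxConcurrent = now; // safe: only mutated under refreshLock
            }
            // ... close and reopen the channel here ...
            active.decrementAndGet();
        } finally {
            refreshLock.unlock();
        }
    }

    // Hammers refreshFileChannel() from several threads and reports the
    // highest number of refreshers observed inside the critical section.
    static int maxConcurrentRefreshes(int threads) {
        ChannelGuard guard = new ChannelGuard();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < 1000; k++) {
                    guard.refreshFileChannel();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return guard.maxConcurrent;
    }

    public static void main(String[] args) {
        System.out.println(maxConcurrentRefreshes(4));
    }
}
```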
[jira] [Updated] (HBASE-20330) ProcedureExecutor.start() gets stuck in recover lease on store.
[ https://issues.apache.org/jira/browse/HBASE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20330: - Status: Patch Available (was: In Progress) rollWriter() fails after creating the file and returns false. In the next iteration of the while loop in recoverLease(), the file list is refreshed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
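The cause noted in the status update explains the spin: rollWriter() creates the next log file, then fails and returns false, and the retry loop collides with the very file it just created. A toy sketch of that shape follows (illustrative only, not WALProcedureStore code; the real remedy refreshes the file list or bounds the retries):

```java
import java.util.HashSet;
import java.util.Set;

public class RecoverLeaseLoop {
    // Simulates the stuck recoverLease() retry: every attempt either finds
    // the file left by a previous failed attempt already present, or
    // creates it and then "rollWriter()" fails anyway. Returns how many
    // attempts were burned before the caller gives up.
    static int attemptsBeforeGivingUp(int maxAttempts) {
        Set<String> fs = new HashSet<>(); // stands in for the MasterProcWALs dir
        long nextId = 19;
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            String log = "pv2-" + nextId + ".log";
            if (!fs.add(log)) {
                // "Log file with id=19 already exists": left over from a
                // prior attempt whose rollWriter() still returned false
                continue;
            }
            boolean rolled = false; // simulate rollWriter() failing after the create
            if (rolled) {
                return attempts; // success path, never reached in this simulation
            }
            // nextId is not advanced and the file is not cleaned up, so the
            // next iteration collides with the file created just above
        }
        return attempts; // without a bound, this loop would spin forever
    }

    public static void main(String[] args) {
        System.out.println(attemptsBeforeGivingUp(5));
    }
}
```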
[jira] [Updated] (HBASE-20330) ProcedureExecutor.start() gets stuck in recover lease on store.
[ https://issues.apache.org/jira/browse/HBASE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20330: - Attachment: hbase-20330.master.001.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HBASE-20330) ProcedureExecutor.start() gets stuck in recover lease on store.
[ https://issues.apache.org/jira/browse/HBASE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-20330 started by Umesh Agashe. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20322) CME in StoreScanner causes region server crash
[ https://issues.apache.org/jira/browse/HBASE-20322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426308#comment-16426308 ] Hudson commented on HBASE-20322: Results for branch branch-1.3 [build #284 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/284/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/284//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/284//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/284//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > CME in StoreScanner causes region server crash > -- > > Key: HBASE-20322 > URL: https://issues.apache.org/jira/browse/HBASE-20322 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.2 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.4 > > Attachments: HBASE-20322.branch-1.3.001.patch, > HBASE-20322.branch-1.3.002-addendum.patch, HBASE-20322.branch-1.3.002.patch, > HBASE-20322.branch-1.4.001.patch > > > RS crashed with ConcurrentModificationException on our 1.3 cluster, stack > trace below. [~toffer] and I checked and there is a race condition between > flush and scanner close. When StoreScanner.updateReaders() is updating the > scanners after a newly flushed file (in this trace below a region close > during a split), the client's scanner could be closing thus causing CME. > Its rare, but since it crashes the region server, needs to be fixed. 
> FATAL regionserver.HRegionServer [regionserver/] : ABORTING region server > : Replay of WAL required. Forcing server shutdown > org.apache.hadoop.hbase.DroppedSnapshotException: region: > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2579) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2255) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2217) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2207) > at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1501) > at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1420) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:398) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:566) > at > org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82) > at > org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.util.ConcurrentModificationException > at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) > at java.util.ArrayList$Itr.next(ArrayList.java:851) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.clearAndClose(StoreScanner.java:797) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.updateReaders(StoreScanner.java:825) > at > org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:1155) > PS: ignore the line no in the above stack trace, method calls should help > understand whats 
happening. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
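The race above boils down to Java's fail-fast iterators: one thread walks the scanner list while another mutates it. A minimal sketch reproducing that failure class, plus one standard mitigation — the `CmeDemo` class is hypothetical and not the actual StoreScanner code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CmeDemo {
    // Mutating an ArrayList while iterating it trips the fail-fast check in
    // ArrayList$Itr.checkForComodification -- the same frame as in the stack
    // trace above. Returns false when a CME was raised.
    static boolean closeAll(List<Object> scanners) {
        try {
            for (Object s : scanners) {
                // simulate the racing mutation landing mid-iteration
                scanners.remove(s);
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<Object> plain = new ArrayList<>(Arrays.asList("s1", "s2", "s3"));
        System.out.println(closeAll(plain)); // false: CME raised mid-iteration

        // CopyOnWriteArrayList iterates over a snapshot, so the identical
        // access pattern is safe (at the cost of copying on every write).
        List<Object> cow = new CopyOnWriteArrayList<>(Arrays.asList("s1", "s2", "s3"));
        System.out.println(closeAll(cow)); // true: snapshot iterator unaffected
    }
}
```

The actual fix in the patch has to coordinate `updateReaders()` and scanner close under a lock rather than swap collection types, but the failure mechanism is the one shown.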
[jira] [Commented] (HBASE-20231) Not able to delete column family from a row using RemoteHTable
[ https://issues.apache.org/jira/browse/HBASE-20231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426309#comment-16426309 ] Hudson commented on HBASE-20231: Results for branch branch-1.3 [build #284 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/284/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/284//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/284//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/284//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. 
> Not able to delete column family from a row using RemoteHTable > -- > > Key: HBASE-20231 > URL: https://issues.apache.org/jira/browse/HBASE-20231 > Project: HBase > Issue Type: Bug > Components: REST >Reporter: Pankaj Kumar >Assignee: Pankaj Kumar >Priority: Major > Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.4, 2.0.0 > > Attachments: HBASE-20231-branch-1-v2.patch, > HBASE-20231-branch-1-v3.patch, HBASE-20231-branch-1.3.patch, > HBASE-20231-branch-1.patch, HBASE-20231-v2.patch, HBASE-20231-v3.patch, > HBASE-20231.patch > > > Example code to reproduce the issue, > {code:java} > Cluster cluster = new Cluster(); > cluster.add("rest-server-IP", rest-server-port); > Client client = new Client(cluster); > RemoteHTable table = new RemoteHTable(client, "t1"); > // Insert few records, > Put put = new Put(Bytes.toBytes("r1")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c2"), Bytes.toBytes("c2")); > put.add(Bytes.toBytes("cf2"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > table.put(put); > put = new Put(Bytes.toBytes("r2")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c2"), Bytes.toBytes("c2")); > put.add(Bytes.toBytes("cf2"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > table.put(put); > // Delete the entire column family from the row > Delete del = new Delete(Bytes.toBytes("r2")); > del.addFamily(Bytes.toBytes("cf1")); > table.delete(del); > {code} > Here the problem is in building row specification in > RemoteHTable.buildRowSpec(). Row specification is framed as "/t1/r2/cf1:" > instead of "/t1/r2/cf1". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
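The trailing-colon bug is easy to see in a simplified sketch — the `RowSpecDemo` helper below is hypothetical, not the real `RemoteHTable.buildRowSpec()`, but it shows why the REST row specification must only carry the `:` separator when a qualifier is actually present:

```java
public class RowSpecDemo {
    // Buggy shape: the family/qualifier separator is appended
    // unconditionally, yielding "/t1/r2/cf1:" for a whole-family delete,
    // which the REST server does not treat as "the entire family cf1".
    static String buggyRowSpec(String table, String row, String family, String qualifier) {
        StringBuilder sb = new StringBuilder();
        sb.append('/').append(table).append('/').append(row)
          .append('/').append(family).append(':');
        if (qualifier != null) {
            sb.append(qualifier);
        }
        return sb.toString();
    }

    // Fixed shape: emit ':' only together with a qualifier.
    static String fixedRowSpec(String table, String row, String family, String qualifier) {
        StringBuilder sb = new StringBuilder();
        sb.append('/').append(table).append('/').append(row)
          .append('/').append(family);
        if (qualifier != null) {
            sb.append(':').append(qualifier);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buggyRowSpec("t1", "r2", "cf1", null)); // /t1/r2/cf1:
        System.out.println(fixedRowSpec("t1", "r2", "cf1", null)); // /t1/r2/cf1
    }
}
```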
[jira] [Commented] (HBASE-15386) PREFETCH_BLOCKS_ON_OPEN in HColumnDescriptor is ignored
[ https://issues.apache.org/jira/browse/HBASE-15386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426304#comment-16426304 ] stack commented on HBASE-15386: --- Yeah, wrong JIRA ID commit fa215a67e20da8c1a450b16db27c73ee3f9d02c0 Author: stack Date: Wed Apr 20 09:38:30 2016 -0700 HBASE-15385 PREFETCH_BLOCKS_ON_OPEN in HColumnDescriptor is ignored > PREFETCH_BLOCKS_ON_OPEN in HColumnDescriptor is ignored > --- > > Key: HBASE-15386 > URL: https://issues.apache.org/jira/browse/HBASE-15386 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 0.98.19, 1.4.0, 1.2.2, 1.3.1, 2.0.0 > > Attachments: 15386.branch-1.2.patch, 15386.branch-1.2.patch, > 15386.branch-1.patch, 15386.patch > > > We use the global flag hbase.rs.prefetchblocksonopen only and ignore the HCD > setting. > Purge from HCD or hook it up again (it probably worked once). > Thanks to Daniel Pol for finding this one. Let me fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-15314) Allow more than one backing file in bucketcache
[ https://issues.apache.org/jira/browse/HBASE-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426300#comment-16426300 ] stack commented on HBASE-15314: --- Got committed with a wonky JIRA ID commit e67eb6c424d76ee259f5076c277454a73e3a2bf4 Author: Ramkrishna Date: Thu Mar 16 16:11:35 2017 +0530 HBSE-15314 Allow more than one backing file in bucketcache (Chunhui Shen) > Allow more than one backing file in bucketcache > --- > > Key: HBASE-15314 > URL: https://issues.apache.org/jira/browse/HBASE-15314 > Project: HBase > Issue Type: Sub-task > Components: BucketCache >Reporter: stack >Assignee: chunhui shen >Priority: Major > Fix For: 2.0.0 > > Attachments: FileIOEngine.java, HBASE-15314-v2.patch, > HBASE-15314-v3.patch, HBASE-15314-v4.patch, HBASE-15314-v5.patch, > HBASE-15314-v6.patch, HBASE-15314-v7.patch, HBASE-15314-v8.patch, > HBASE-15314.master.001.patch, HBASE-15314.master.001.patch, HBASE-15314.patch > > > Allow bucketcache use more than just one backing file: e.g. chassis has more > than one SSD in it. > Usage (Setting the following configurations in hbase-site.xml): > {quote} > <property> > <name>hbase.bucketcache.ioengine</name> > <value>files:/mnt/disk1/bucketcache,/mnt/disk2/bucketcache,/mnt/disk3/bucketcache,/mnt/disk4/bucketcache</value> > </property> > <property> > <name>hbase.bucketcache.size</name> > <value>1048576</value> > </property> > {quote} > The above setting means the total capacity of cache is 1048576MB(1TB), each > file length will be set to 0.25TB. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
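The per-file sizing quoted above (1048576 MB spread over four backing files gives 0.25 TB each) can be sketched as a small helper. The even split is as described in the issue; the `BucketCacheSizing` class itself is a hypothetical illustration, not the actual FileIOEngine code:

```java
public class BucketCacheSizing {
    // Given the "files:" ioengine value and hbase.bucketcache.size (in MB),
    // compute the length each backing file will be set to: total / file count.
    static long perFileMb(String ioengine, long totalMb) {
        String[] paths = ioengine.substring("files:".length()).split(",");
        return totalMb / paths.length;
    }

    public static void main(String[] args) {
        String engine = "files:/mnt/disk1/bucketcache,/mnt/disk2/bucketcache,"
                + "/mnt/disk3/bucketcache,/mnt/disk4/bucketcache";
        // 1048576 MB (1 TB) over 4 files -> 262144 MB (0.25 TB) per file
        System.out.println(perFileMb(engine, 1048576L));
    }
}
```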
[jira] [Commented] (HBASE-15203) Reduce garbage created by path.toString() during Checksum verification
[ https://issues.apache.org/jira/browse/HBASE-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426297#comment-16426297 ] stack commented on HBASE-15203: --- Got committed w/ a misformatted JIRA ID commit 2cf8af5bf1d501156cbb3b421cf75c1051ead7d9 Author: ramkrishna Date: Thu Feb 4 11:44:46 2016 +0530 HBASE-HBASE-15203 Reduce garbage created by path.toString() during Checksum verification (Ram) > Reduce garbage created by path.toString() during Checksum verification > -- > > Key: HBASE-15203 > URL: https://issues.apache.org/jira/browse/HBASE-15203 > Project: HBase > Issue Type: Sub-task > Components: regionserver >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Minor > Fix For: 1.3.0, 2.0.0 > > Attachments: HBASE-15203.patch, HBASE-15203_1.patch, > HBASE-15203_2.patch, HBASE-15203_branch-1.1.patch > > > When we try to read a block we do checksum verification for which we need the > file name in which the block belongs to. So we do Path.toString() every time. > This seems to create around 163MB of char[] that is garbage collected in a > simple scan run. It is also visible in writes but the impact is lesser. In > overall write/read profile the top 2 factors are byte[] and char[]. This > toString() can easily be avoided and reduce its share from the total. To make > it more precise in 1 min of profiling, among the 1.8G of garbage created by > StringBuilder.toString - this path.toString() was contributing around 3.5%. > After the patch this is totally not there. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
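The idea behind the fix — compute the path's string form once and reuse it on every block read instead of regenerating a fresh `char[]` each time — can be sketched as below. `java.nio.file.Path` stands in for Hadoop's `Path`, and the `CachedPathName` class is hypothetical:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class CachedPathName {
    private final String pathName; // computed once at construction, then reused

    CachedPathName(Path path) {
        this.pathName = path.toString();
    }

    String name() {
        return pathName; // no new char[] allocated per checksum verification
    }

    public static void main(String[] args) {
        CachedPathName reader =
            new CachedPathName(Paths.get("/hbase/data/ns/t1/region/cf/hfile1"));
        // Repeated calls hand back the very same String instance: zero garbage.
        System.out.println(reader.name() == reader.name()); // true
    }
}
```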
[jira] [Commented] (HBASE-15194) TestStochasticLoadBalancer.testRegionReplicationOnMidClusterSameHosts flaky on trunk
[ https://issues.apache.org/jira/browse/HBASE-15194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426290#comment-16426290 ] stack commented on HBASE-15194: --- Got committed without a JIRA ID commit 9ec408e25b70f4ce586340b9396da67a1e38f6ca Author: stack Date: Sat Jan 30 07:51:21 2016 -0400 TestStochasticLoadBalancer.testRegionReplicationOnMidClusterSameHosts flaky on trunk > TestStochasticLoadBalancer.testRegionReplicationOnMidClusterSameHosts flaky > on trunk > > > Key: HBASE-15194 > URL: https://issues.apache.org/jira/browse/HBASE-15194 > Project: HBase > Issue Type: Sub-task > Components: flakey, test >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 1.3.0, 2.0.0 > > Attachments: disable.patch > > > Fails 25% of the time: > https://builds.apache.org/job/PreCommit-HBASE-Build/349/testReport/org.apache.hadoop.hbase.master.balancer/TestStochasticLoadBalancer/testRegionReplicationOnMidClusterSameHosts/history/ > I'm just going to disable till someone has time to dig in on the test. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-15186) HBASE-15158 Preamble 1 of 2: fix findbugs, add javadoc, change Region#getReadpoint to #getReadPoint, and some util
[ https://issues.apache.org/jira/browse/HBASE-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426289#comment-16426289 ] stack commented on HBASE-15186: --- Got committed under the JIRA ID HBASE-15158 tree 2b722aaecf4213be6ffe5d543c3f71e9ae637a94 parent 13a46df1815ed32bc9a2696f19cf620b4ce84bb4 author stack Sun Jan 31 20:21:48 2016 -0800 committer stack Sun Jan 31 20:21:48 2016 -0800 HBASE-15158 HBASE-15158 Preamble 1 of 2: fix findbugs, add javadoc, change Region#getReadpoint to #getReadPoint, and some util > HBASE-15158 Preamble 1 of 2: fix findbugs, add javadoc, change > Region#getReadpoint to #getReadPoint, and some util > -- > > Key: HBASE-15186 > URL: https://issues.apache.org/jira/browse/HBASE-15186 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0 > > Attachments: 15186v2.patch, 15186v3.patch, 15186v4.patch, > subpatch.patch > > > Break up the HBASE-15158 patch. Here is the first piece. Its a bunch of > findbugs fixes, a bit of utility for tag-handling (to be exploited in later > patches), some clarifying comments and javadoc (and javadoc fixes), cleanup > of a some Region API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-18309) Support multi threads in CleanerChore
[ https://issues.apache.org/jira/browse/HBASE-18309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426288#comment-16426288 ] Zach York commented on HBASE-18309: --- +1 for opening a new JIRA. > Support multi threads in CleanerChore > - > > Key: HBASE-18309 > URL: https://issues.apache.org/jira/browse/HBASE-18309 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Reid Chan >Priority: Major > Fix For: 2.0.0-beta-1, 2.0.0 > > Attachments: HBASE-18309.addendum.patch, > HBASE-18309.branch-1.001.patch, HBASE-18309.branch-1.002.patch, > HBASE-18309.branch-1.003.patch, HBASE-18309.branch-1.004.patch, > HBASE-18309.branch-1.005.patch, HBASE-18309.branch-1.006.patch, > HBASE-18309.master.001.patch, HBASE-18309.master.002.patch, > HBASE-18309.master.004.patch, HBASE-18309.master.005.patch, > HBASE-18309.master.006.patch, HBASE-18309.master.007.patch, > HBASE-18309.master.008.patch, HBASE-18309.master.009.patch, > HBASE-18309.master.010.patch, HBASE-18309.master.011.patch, > HBASE-18309.master.012.patch, space_consumption_in_archive.png > > > There is only one thread in LogCleaner to clean oldWALs and in our big > cluster we find this is not enough. The number of files under oldWALs reach > the max-directory-items limit of HDFS and cause region server crash, so we > use multi threads for LogCleaner and the crash not happened any more. > What's more, currently there's only one thread iterating the archive > directory, and we could use multiple threads cleaning sub directories in > parallel to speed it up. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
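The "multiple threads cleaning sub directories in parallel" idea can be sketched with a plain `ExecutorService`. The `ParallelCleaner` class below is hypothetical and uses `java.nio.file` in place of the HDFS `FileSystem` API, but the fan-out/join structure is the same:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCleaner {
    // Submit one cleaning task per subdirectory, then join on all futures
    // so the chore still finishes as a single unit of work.
    static int cleanAll(List<Path> subdirs, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Path dir : subdirs) {
                futures.add(pool.submit(() -> deleteFiles(dir)));
            }
            int deleted = 0;
            for (Future<Integer> f : futures) {
                deleted += f.get();
            }
            return deleted;
        } finally {
            pool.shutdown();
        }
    }

    // Delete every regular file directly under dir; returns how many.
    static int deleteFiles(Path dir) throws IOException {
        int n = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                if (Files.isRegularFile(f)) {
                    Files.delete(f);
                    n++;
                }
            }
        }
        return n;
    }

    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("oldWALs");
        List<Path> subdirs = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Path d = Files.createDirectory(root.resolve("rs" + i));
            for (int j = 0; j < 3; j++) {
                Files.createFile(d.resolve("wal" + j));
            }
            subdirs.add(d);
        }
        System.out.println(cleanAll(subdirs, 4)); // 12
    }
}
```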
[jira] [Commented] (HBASE-20276) [shell] Revert shell REPL change and document
[ https://issues.apache.org/jira/browse/HBASE-20276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426283#comment-16426283 ] Hadoop QA commented on HBASE-20276: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 36s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 12s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} rubocop {color} | {color:red} 0m 16s{color} | {color:red} The patch generated 7 new + 246 unchanged - 5 fixed = 253 total (was 251) {color} | | {color:green}+1{color} | {color:green} ruby-lint {color} | {color:green} 0m 7s{color} | {color:green} The patch generated 0 new + 189 unchanged - 1 fixed = 189 total (was 190) {color} | | {color:green}+1{color} | {color:green} whitespace 
{color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}245m 23s{color} | {color:green} root in the patch passed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 51s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}261m 54s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f | | JIRA Issue | HBASE-20276 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12917589/HBASE-20276.0.patch | | Optional Tests | asflicense rubocop ruby_lint javac javadoc unit | | uname | Linux b1f80b133ad4 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC 2017 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 0c0fe05bc4 | | maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) | | Default Java | 1.8.0_162 | | rubocop | v0.54.0 | | rubocop | https://builds.apache.org/job/PreCommit-HBASE-Build/12301/artifact/patchprocess/diff-patch-rubocop.txt | | ruby-lint | v2.3.1 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12301/testReport/ | | asflicense | https://builds.apache.org/job/PreCommit-HBASE-Build/12301/artifact/patchprocess/patch-asflicense-problems.txt | | Max. process+thread count | 4740 (vs. ulimit of 1) | | modules | C: hbase-shell . U: . 
| | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/12301/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > [shell] Revert shell REPL change and document > - > > Key: HBASE-20276 > URL: https://issues.apache.org/jira/browse/HBASE-20276 > Project: HBase > Issue Type: Sub-task > Components: documentation, shell >Affects Versions: 1.4.0, 2.0.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Blocker > Fix For: 1.4.4, 2.0.0 > > Attachments: HBASE-20276.0.patch > > > Feedback from [~mdrob] on HBASE-19158: > {quote} > Shell: > HBASE-19770. There was another issue opened where this was identified as a > problem so maybe the shape will change further, but I can't find it now. > {quote} > New commentary from [~busbey]: > This was a follow on to
[jira] [Commented] (HBASE-20348) [DOC] call out change to tracing in upgrade guide
[ https://issues.apache.org/jira/browse/HBASE-20348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426274#comment-16426274 ] Hadoop QA commented on HBASE-20348: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 42s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 47s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}170m 3s{color} | {color:green} root in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 28s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}186m 2s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f | | JIRA Issue | HBASE-20348 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12917602/HBASE-20348.patch | | Optional Tests | asflicense javac javadoc unit | | uname | Linux c34d315183b7 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 8bc723477b | | maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) | | Default Java | 1.8.0_162 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12302/testReport/ | | asflicense | https://builds.apache.org/job/PreCommit-HBASE-Build/12302/artifact/patchprocess/patch-asflicense-problems.txt | | Max. process+thread count | 4342 (vs. ulimit of 1) | | modules | C: . U: . | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/12302/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > [DOC] call out change to tracing in upgrade guide > - > > Key: HBASE-20348 > URL: https://issues.apache.org/jira/browse/HBASE-20348 > Project: HBase > Issue Type: Sub-task > Components: documentation, tracing >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-20348.patch > > > we changed our HTrace version across an incompatible boundary in HBASE-18601. 
> We should call out somewhere that folks who built their apps to do tracing > through our client will need to update. > might also be worth calling out our current doubts on utility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20188) [TESTING] Performance
[ https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426262#comment-16426262 ] stack commented on HBASE-20188: --- [~eshcar] I tried it here and seems to run. I don't get your complaint. I'm at commit d655a1ca3e208a829641837d027ced59ead243fc Author: Sean Busbey Date: Thu Mar 22 11:55:56 2018 -0500 Add HBase 2.0 binding. Are you running a released version? Could you try adding guava to your CLASSPATH to satisfy the java.lang.NoClassDefFoundError: com/google/common/base/Preconditions I made a summary fourth sheet in the document of where we are at the moment. We are 20% slower writing, 11% slower reading and 15% faster doing 50/50. If we disable in-memory compaction we are 9%/7%/17%. Am waiting now on a good story to tell about in-memory compaction. I'll try with smaller heaps in meantime to see if it helps but currently its detrimental in all of these basic YCSB runs. > [TESTING] Performance > - > > Key: HBASE-20188 > URL: https://issues.apache.org/jira/browse/HBASE-20188 > Project: HBase > Issue Type: Umbrella > Components: Performance >Reporter: stack >Assignee: stack >Priority: Blocker > Fix For: 2.0.0 > > Attachments: CAM-CONFIG-V01.patch, HBASE-20188.sh, HBase 2.0 > performance evaluation - Basic vs None_ system settings.pdf, > ITBLL2.5B_1.2.7vs2.0.0_cpu.png, ITBLL2.5B_1.2.7vs2.0.0_gctime.png, > ITBLL2.5B_1.2.7vs2.0.0_iops.png, ITBLL2.5B_1.2.7vs2.0.0_load.png, > ITBLL2.5B_1.2.7vs2.0.0_memheap.png, ITBLL2.5B_1.2.7vs2.0.0_memstore.png, > ITBLL2.5B_1.2.7vs2.0.0_ops.png, > ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png, > YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png, > YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png, > flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml, > lock.127.workloadc.20180402T200918Z.svg, > lock.2.memsize2.c.20180403T160257Z.svg, run_ycsb.sh, tree.txt > > > How does 2.0.0 compare 
to old versions? Is it faster, slower? There is rumor > that it is much slower, that the problem is the asyncwal writing. Does > in-memory compaction slow us down or speed us up? What happens when you > enable offheaping? > Keep notes here in this umbrella issue. Need to be able to say something > about perf when 2.0.0 ships. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20231) Not able to delete column family from a row using RemoteHTable
[ https://issues.apache.org/jira/browse/HBASE-20231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426237#comment-16426237 ] Hudson commented on HBASE-20231: Results for branch branch-1.2 [build #288 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/288/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/288//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/288//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/288//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. 
> Not able to delete column family from a row using RemoteHTable > -- > > Key: HBASE-20231 > URL: https://issues.apache.org/jira/browse/HBASE-20231 > Project: HBase > Issue Type: Bug > Components: REST >Reporter: Pankaj Kumar >Assignee: Pankaj Kumar >Priority: Major > Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.4, 2.0.0 > > Attachments: HBASE-20231-branch-1-v2.patch, > HBASE-20231-branch-1-v3.patch, HBASE-20231-branch-1.3.patch, > HBASE-20231-branch-1.patch, HBASE-20231-v2.patch, HBASE-20231-v3.patch, > HBASE-20231.patch > > > Example code to reproduce the issue, > {code:java} > Cluster cluster = new Cluster(); > cluster.add("rest-server-IP", rest-server-port); > Client client = new Client(cluster); > RemoteHTable table = new RemoteHTable(client, "t1"); > // Insert few records, > Put put = new Put(Bytes.toBytes("r1")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c2"), Bytes.toBytes("c2")); > put.add(Bytes.toBytes("cf2"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > table.put(put); > put = new Put(Bytes.toBytes("r2")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c2"), Bytes.toBytes("c2")); > put.add(Bytes.toBytes("cf2"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > table.put(put); > // Delete the entire column family from the row > Delete del = new Delete(Bytes.toBytes("r2")); > del.addFamily(Bytes.toBytes("cf1")); > table.delete(del); > {code} > Here the problem is in building row specification in > RemoteHTable.buildRowSpec(). Row specification is framed as "/t1/r2/cf1:" > instead of "/t1/r2/cf1". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-14672) Exorcise deprecated Delete#delete* apis
[ https://issues.apache.org/jira/browse/HBASE-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426224#comment-16426224 ] stack commented on HBASE-14672: --- Committed with wrong JIRA ID commit 094d65e6f52f5b3cb1210c4abbea2fb14bcbdf15 Author: Jonathan M Hsieh Date: Wed Oct 21 16:35:50 2015 -0700 HBASE-14673 Exorcise deprecated Delete#delete* api > Exorcise deprecated Delete#delete* apis > --- > > Key: HBASE-14672 > URL: https://issues.apache.org/jira/browse/HBASE-14672 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh >Priority: Major > Fix For: 2.0.0 > > Attachments: hbase-14672-v2.patch, hbase-14672-v2.patch, > hbase-14672.patch > > > Delete#delete* apis were replaced with Delete#add* apis. This converts all > instances of it and removes Delete#delete* apis. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20286) Improving shell command compaction_state
[ https://issues.apache.org/jira/browse/HBASE-20286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426208#comment-16426208 ] Appy commented on HBASE-20286: -- That issue is unrelated, it's about how to handle values returned by commands in interactive mode, and not about whether commands should return values or note in the first place. {quote}I believe users don't really rely on this hidden command, the idea of a compaction_state command came from a testing task. {quote} Can't know for sure, so let's put it back? Almost everything start with some idea and outgrows its initial scope. It's plausible that someone can be using it in a script as some sort of wait condition. > Improving shell command compaction_state > > > Key: HBASE-20286 > URL: https://issues.apache.org/jira/browse/HBASE-20286 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Csaba Skrabak >Priority: Minor > Attachments: HBASE-20286.patch > > > Command does not output anything, let's add a formatter.row call. > We should include the possible outputs in the help text. > *Further improvement possibility.* This command can be used for checking if > the compaction is done but very impractical if one wants to wait _until_ it > is done. Wish there would be a flag in the shell that enforces synchronous > compactions, that is, every time you issue a compact or major_compact in the > shell while this flag is set, you won't get back the prompt until it finishes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
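The wished-for "wait until compaction is done" behaviour is at bottom a poll loop over the compaction state. A generic sketch with a condition supplier — the `WaitUntil` helper is hypothetical; a real shell implementation would poll `Admin#getCompactionState` until it reports no compaction running:

```java
import java.util.function.BooleanSupplier;

public class WaitUntil {
    // Poll a condition (e.g. "compaction state is NONE") until it holds or
    // the timeout elapses; returns whether the condition was ever met.
    static boolean waitFor(BooleanSupplier done, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (done.getAsBoolean()) {
                return true;
            }
            Thread.sleep(pollMs);
        }
        return done.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition flips to true after ~50 ms, standing in for a
        // compaction that finishes while we poll.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 50, 1000, 10);
        System.out.println(ok); // true
    }
}
```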
[jira] [Updated] (HBASE-14460) [Perf Regression] Merge of MVCC and SequenceId (HBASE-8763) slowed Increments, CheckAndPuts, batch operations
[ https://issues.apache.org/jira/browse/HBASE-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14460: -- Issue Type: Umbrella (was: Bug) > [Perf Regression] Merge of MVCC and SequenceId (HBASE-8763) slowed > Increments, CheckAndPuts, batch operations > - > > Key: HBASE-14460 > URL: https://issues.apache.org/jira/browse/HBASE-14460 > Project: HBase > Issue Type: Umbrella > Components: Performance >Reporter: stack >Assignee: stack >Priority: Critical > Fix For: 1.2.0, 1.3.0, 1.1.4, 1.0.4, 2.0.0 > > Attachments: 0.94.test.patch, 0.98.test.patch, > 1.0.80.flamegraph-7932.svg, 14460.txt, 14460.v0.branch-1.0.patch, > 98.80.flamegraph-11428.svg, HBASE-14460-discussion.patch, client.test.patch, > flamegraph-13120.svg.master.singlecell.svg, flamegraph-26636.094.100.svg, > flamegraph-28066.098.singlecell.svg, flamegraph-28767.098.100.svg, > flamegraph-31647.master.100.svg, flamegraph-9466.094.singlecell.svg, > hack.flamegraph-16593.svg, hack.uncommitted.patch, m.test.patch, > region_lock.png, testincrement.094.patch, testincrement.098.patch, > testincrement.master.patch > > > As reported by 鈴木俊裕 up on the mailing list -- see "Performance degradation > between CDH5.3.1(HBase0.98.6) and CDH5.4.5(HBase1.0.0)" -- our unification of > sequenceid and MVCC slows Increments (and other ops) as the mvcc needs to > 'catch up' to our current point before we can read the last Increment value > that we need to update. > We can say that our Increment is just done wrong, we should just be writing > Increments and summing on read, but checkAndPut as well as batching > operations have the same issue. Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-14348) Update download mirror link
[ https://issues.apache.org/jira/browse/HBASE-14348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14348: -- Attachment: 0001-HBASE-14348-Update-download-mirror-link.patch > Update download mirror link > --- > > Key: HBASE-14348 > URL: https://issues.apache.org/jira/browse/HBASE-14348 > Project: HBase > Issue Type: Task > Components: documentation, website >Reporter: Andrew Purtell >Assignee: Lars Francke >Priority: Major > Fix For: 2.0.0 > > Attachments: 0001-HBASE-14348-Update-download-mirror-link.patch, > HBASE-14348-branch-1.patch, HBASE-14348.patch > > > Where we refer to www.apache.org/dyn/closer.cgi, we need to refer to > www.apache.org/dyn/closer.lua instead . -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-14348) Update download mirror link
[ https://issues.apache.org/jira/browse/HBASE-14348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-14348. --- Resolution: Fixed Attached 0001 which is what I pushed to master branch. Found a few other mentions of closer.cgi and converted them to closer.lua (Thanks Lars) > Update download mirror link > --- > > Key: HBASE-14348 > URL: https://issues.apache.org/jira/browse/HBASE-14348 > Project: HBase > Issue Type: Task > Components: documentation, website >Reporter: Andrew Purtell >Assignee: Lars Francke >Priority: Major > Fix For: 2.0.0 > > Attachments: 0001-HBASE-14348-Update-download-mirror-link.patch, > HBASE-14348-branch-1.patch, HBASE-14348.patch > > > Where we refer to www.apache.org/dyn/closer.cgi, we need to refer to > www.apache.org/dyn/closer.lua instead . -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20346) [DOC] document change to shell tests
[ https://issues.apache.org/jira/browse/HBASE-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-20346: -- Attachment: HBASE-20346.patch > [DOC] document change to shell tests > > > Key: HBASE-20346 > URL: https://issues.apache.org/jira/browse/HBASE-20346 > Project: HBase > Issue Type: Sub-task > Components: documentation, shell >Affects Versions: 2.0.0-beta-2, 2.0.0 >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-20346.patch > > > HBASE-19903 changed how the shell tests are organized and executed, but it > missed updating the section on the ref guide that talks about the shell tests. > bring it up to date so that folks don't miss a bunch of the tests or add new > ones in the wrong place. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20346) [DOC] document change to shell tests
[ https://issues.apache.org/jira/browse/HBASE-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-20346: -- Status: Patch Available (was: Open) > [DOC] document change to shell tests > > > Key: HBASE-20346 > URL: https://issues.apache.org/jira/browse/HBASE-20346 > Project: HBase > Issue Type: Sub-task > Components: documentation, shell >Affects Versions: 2.0.0-beta-2, 2.0.0 >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-20346.patch > > > HBASE-19903 changed how the shell tests are organized and executed, but it > missed updating the section on the ref guide that talks about the shell tests. > bring it up to date so that folks don't miss a bunch of the tests or add new > ones in the wrong place. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-14348) Update download mirror link
[ https://issues.apache.org/jira/browse/HBASE-14348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack reopened HBASE-14348: --- I don't see this in codebase. Let me reopen to reapply. > Update download mirror link > --- > > Key: HBASE-14348 > URL: https://issues.apache.org/jira/browse/HBASE-14348 > Project: HBase > Issue Type: Task > Components: documentation, website >Reporter: Andrew Purtell >Assignee: Lars Francke >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-14348-branch-1.patch, HBASE-14348.patch > > > Where we refer to www.apache.org/dyn/closer.cgi, we need to refer to > www.apache.org/dyn/closer.lua instead . -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19488) Move to using Apache commons CollectionUtils
[ https://issues.apache.org/jira/browse/HBASE-19488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Appy updated HBASE-19488: - Fix Version/s: 2.1.0 > Move to using Apache commons CollectionUtils > > > Key: HBASE-19488 > URL: https://issues.apache.org/jira/browse/HBASE-19488 > Project: HBase > Issue Type: Improvement > Components: hbase >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Fix For: 2.1.0 > > Attachments: HBASE-19488.1.patch, HBASE-19488.2.patch, > HBASE-19488.3.patch, HBASE-19488.4.patch, HBASE-19488.5.patch > > > A bunch of unused code in CollectionUtils or code that can be found in Apache > Commons libraries. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19488) Move to using Apache commons CollectionUtils
[ https://issues.apache.org/jira/browse/HBASE-19488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Appy updated HBASE-19488: - Summary: Move to using Apache commons CollectionUtils (was: Remove Unused Code from CollectionUtils) > Move to using Apache commons CollectionUtils > > > Key: HBASE-19488 > URL: https://issues.apache.org/jira/browse/HBASE-19488 > Project: HBase > Issue Type: Improvement > Components: hbase >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HBASE-19488.1.patch, HBASE-19488.2.patch, > HBASE-19488.3.patch, HBASE-19488.4.patch, HBASE-19488.5.patch > > > A bunch of unused code in CollectionUtils or code that can be found in Apache > Commons libraries. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20243) [Shell] Add shell command to create a new table by cloning the existent table
[ https://issues.apache.org/jira/browse/HBASE-20243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426172#comment-16426172 ] Appy commented on HBASE-20243: -- QA still not happy. Would have been fine committing it if QA was green, but since there's gonna be another revision, here are a couple of minor suggestions on the table which'll lead to more maintainable code. It's up to you to take/leave them.
* In our design, shell tests are not run when we submit a patch to just the hbase-server module. So if someone changes the error message, your last two tests will start failing. Probably checking for just the exception type is enough.
* *Prefer multiple small tests over one-for-all test*. It's easy to maintain since unittest code is more chunked -> easier to understand -> easier to fix when tests fail. For eg, if your current test fails in "test for existent destination table", someone will have to read through everything else before it, and that would be wasted effort since it's not needed; the test for 'existent destination table' could very well have been a simple separate test of fewer than 10 lines.
* Prefer to use constant declared variables if their value matters in multiple places. For eg. NUM_SPLITS=2, then init column families with it and use the same thing in asserts. Makes it easy to understand tests. Another example is, consider a random test with two figures - 10 and 50. 10 initializes something and we are asserting 50. It's not obvious whether we are expecting the test to assert 10+40 or 10*5.
* Avoid [coupling|https://en.wikipedia.org/wiki/Coupling_(computer_programming)] tests. If one breaks for a weird reason, others might too.
(ref: new_table = "test_clone_table_schema_table") > [Shell] Add shell command to create a new table by cloning the existent table > - > > Key: HBASE-20243 > URL: https://issues.apache.org/jira/browse/HBASE-20243 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng >Priority: Minor > Fix For: 2.1.0 > > Attachments: HBASE-20243.master.001.patch, > HBASE-20243.master.002.patch, HBASE-20243.master.003.patch, > HBASE-20243.master.004.patch, HBASE-20243.master.005.patch, > HBASE-20243.master.006.patch, HBASE-20243.master.007.patch > > > In the production environment, we need to create a new table every day. The > schema and the split keys of the table are the same as that of yesterday's > table, only the name of the table is different. For example, > x_20180321,x_20180322 etc.But now there is no convenient command to > do this. So we may need such a command(clone_table) to create a new table by > cloning the existent table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
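The constant-driven assertion style suggested in the review above can be sketched in plain Java. This is a hypothetical illustration only: the names `NUM_SPLITS` and `createTableAndCountRegions` are invented stand-ins, not real HBase shell or test APIs.

```java
public class CloneTableTestSketch {
    // Hypothetical constant: declared once, then driving both setup and the
    // assertion, per the review suggestion above.
    static final int NUM_SPLITS = 2;

    // Stand-in for "create the table with NUM_SPLITS split keys"; a real test
    // would call the HBase shell / Admin API here.
    static int createTableAndCountRegions(int splitKeys) {
        return splitKeys + 1; // n split keys produce n + 1 regions
    }

    public static void main(String[] args) {
        int regions = createTableAndCountRegions(NUM_SPLITS);
        // The expectation is derived from the same constant, so the arithmetic
        // (NUM_SPLITS + 1) is explicit instead of a bare magic number like 3.
        System.out.println(regions == NUM_SPLITS + 1);
    }
}
```

Because the assertion references `NUM_SPLITS` rather than a literal, a reader can see at a glance why the expected value is what it is, and changing the split count only touches one line.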
[jira] [Commented] (HBASE-20188) [TESTING] Performance
[ https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426168#comment-16426168 ] Eshcar Hillel commented on HBASE-20188: --- I am running the hbase12 client. This is the code of the client:
{code}
package com.yahoo.ycsb.db.hbase12;

/**
 * HBase 1.2 client for YCSB framework.
 *
 * A modified version of HBaseClient (which targets HBase v1.2) utilizing the
 * shaded client.
 *
 * It should run equivalent to following the hbase098 binding README.
 */
public class HBaseClient12 extends com.yahoo.ycsb.db.HBaseClient10 {
}
{code}
The difference from hbase10 is just in the pom.xml file I believe, which includes the shaded-client instead of hbase-client - could this be to blame?
{code}
<dependency>
  <groupId>com.yahoo.ycsb</groupId>
  <artifactId>hbase10-binding</artifactId>
  <version>${project.version}</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-client</artifactId>
    </exclusion>
  </exclusions>
</dependency>
...
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-shaded-client</artifactId>
  <version>${hbase12.version}</version>
</dependency>
{code}
> [TESTING] Performance > - > > Key: HBASE-20188 > URL: https://issues.apache.org/jira/browse/HBASE-20188 > Project: HBase > Issue Type: Umbrella > Components: Performance >Reporter: stack >Assignee: stack >Priority: Blocker > Fix For: 2.0.0 > > Attachments: CAM-CONFIG-V01.patch, HBASE-20188.sh, HBase 2.0 > performance evaluation - Basic vs None_ system settings.pdf, > ITBLL2.5B_1.2.7vs2.0.0_cpu.png, ITBLL2.5B_1.2.7vs2.0.0_gctime.png, > ITBLL2.5B_1.2.7vs2.0.0_iops.png, ITBLL2.5B_1.2.7vs2.0.0_load.png, > ITBLL2.5B_1.2.7vs2.0.0_memheap.png, ITBLL2.5B_1.2.7vs2.0.0_memstore.png, > ITBLL2.5B_1.2.7vs2.0.0_ops.png, > ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png, > YCSB_GC_TIME.png, > YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png, > YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png, > flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml, > lock.127.workloadc.20180402T200918Z.svg, > lock.2.memsize2.c.20180403T160257Z.svg, run_ycsb.sh, tree.txt > > > How does 2.0.0 compare to old versions? Is it faster, slower?
There is rumor > that it is much slower, that the problem is the asyncwal writing. Does > in-memory compaction slow us down or speed us up? What happens when you > enable offheaping? > Keep notes here in this umbrella issue. Need to be able to say something > about perf when 2.0.0 ships. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20348) [DOC] call out change to tracing in upgrade guide
[ https://issues.apache.org/jira/browse/HBASE-20348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426166#comment-16426166 ] Sean Busbey commented on HBASE-20348: - yeah the upgrade section was what I was looking for. nit: first reference (or maybe first reference per chapter?) should specify Apache HTrace (incubating) I think? otherwise +1 pending QA. > [DOC] call out change to tracing in upgrade guide > - > > Key: HBASE-20348 > URL: https://issues.apache.org/jira/browse/HBASE-20348 > Project: HBase > Issue Type: Sub-task > Components: documentation, tracing >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-20348.patch > > > we changed our HTrace version across an incompatible boundary in HBASE-18601. > We should call out somewhere that folks who built their apps to do tracing > through our client will need to update. > might also be worth calling out our current doubts on utility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20276) [shell] Revert shell REPL change and document
[ https://issues.apache.org/jira/browse/HBASE-20276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426162#comment-16426162 ] Mike Drob commented on HBASE-20276: --- +1 pending QA, fine to add a test in a follow-on issue, since a bunch of the implicit jruby stuff doesn't have testing anyway. > [shell] Revert shell REPL change and document > - > > Key: HBASE-20276 > URL: https://issues.apache.org/jira/browse/HBASE-20276 > Project: HBase > Issue Type: Sub-task > Components: documentation, shell >Affects Versions: 1.4.0, 2.0.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Blocker > Fix For: 1.4.4, 2.0.0 > > Attachments: HBASE-20276.0.patch > > > Feedback from [~mdrob] on HBASE-19158: > {quote} > Shell: > HBASE-19770. There was another issue opened where this was identified as a > problem so maybe the shape will change further, but I can't find it now. > {quote} > New commentary from [~busbey]: > This was a follow on to HBASE-15965. That change effectively makes it so none > of our ruby wrappers can be used to build expressions in an interactive REPL. > This is a pretty severe change (most of my tips on HBASE-15611 will break, I > think). > I think we should > a) Have a DISCUSS thread, spanning dev@ and user@ > b) based on the outcome of that thread, either default to the new behavior or > the old behavior > c) if we keep the HBASE-15965 behavior as the default, flag it as > incompatible, call it out in the hbase 2.0 upgrade section, and update docs > (two examples: the output in the shell_exercises sections would be wrong, and > the _table_variables section won't work) > d) In either case document the new flag in the ref guide -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20276) [shell] Revert shell REPL change and document
[ https://issues.apache.org/jira/browse/HBASE-20276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426158#comment-16426158 ] Sean Busbey commented on HBASE-20276: - because one of the things HBASE-15965 changed was to print out more details for the get_splits command, but to do it in the building block function {{HBase::Table::_get_splits_internal}}. That method currently also gets used by several tests that don't care about output, and it's also useful if someone needs to reach behind our supported interface to build something programmatic that will operate directly on the splits (ala the stuff in HBASE-15611). printing the splits in the internal part instead of in the user facing part where it presumably is usefully consumed seemed off, so I moved that part to the user facing command. The summary blurb about number of splits seemed similarly out of place though it's been there much longer, so I moved it over as well. The net result is that those properly using {{get_splits}} directly get the same output as before, but when we're hitting internal for it we skip having {{puts}} without reimplementing the function. > [shell] Revert shell REPL change and document > - > > Key: HBASE-20276 > URL: https://issues.apache.org/jira/browse/HBASE-20276 > Project: HBase > Issue Type: Sub-task > Components: documentation, shell >Affects Versions: 1.4.0, 2.0.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Blocker > Fix For: 1.4.4, 2.0.0 > > Attachments: HBASE-20276.0.patch > > > Feedback from [~mdrob] on HBASE-19158: > {quote} > Shell: > HBASE-19770. There was another issue opened where this was identified as a > problem so maybe the shape will change further, but I can't find it now. > {quote} > New commentary from [~busbey]: > This was a follow on to HBASE-15965. That change effectively makes it so none > of our ruby wrappers can be used to build expressions in an interactive REPL. 
> This is a pretty severe change (most of my tips on HBASE-15611 will break, I > think). > I think we should > a) Have a DISCUSS thread, spanning dev@ and user@ > b) based on the outcome of that thread, either default to the new behavior or > the old behavior > c) if we keep the HBASE-15965 behavior as the default, flag it as > incompatible, call it out in the hbase 2.0 upgrade section, and update docs > (two examples: the output in the shell_exercises sections would be wrong, and > the _table_variables section won't work) > d) In either case document the new flag in the ref guide -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20188) [TESTING] Performance
[ https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426154#comment-16426154 ] stack commented on HBASE-20188: --- Try hbase12 client instead of hbase10 [~eshcar]? > [TESTING] Performance > - > > Key: HBASE-20188 > URL: https://issues.apache.org/jira/browse/HBASE-20188 > Project: HBase > Issue Type: Umbrella > Components: Performance >Reporter: stack >Assignee: stack >Priority: Blocker > Fix For: 2.0.0 > > Attachments: CAM-CONFIG-V01.patch, HBASE-20188.sh, HBase 2.0 > performance evaluation - Basic vs None_ system settings.pdf, > ITBLL2.5B_1.2.7vs2.0.0_cpu.png, ITBLL2.5B_1.2.7vs2.0.0_gctime.png, > ITBLL2.5B_1.2.7vs2.0.0_iops.png, ITBLL2.5B_1.2.7vs2.0.0_load.png, > ITBLL2.5B_1.2.7vs2.0.0_memheap.png, ITBLL2.5B_1.2.7vs2.0.0_memstore.png, > ITBLL2.5B_1.2.7vs2.0.0_ops.png, > ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png, > YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png, > YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png, > flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml, > lock.127.workloadc.20180402T200918Z.svg, > lock.2.memsize2.c.20180403T160257Z.svg, run_ycsb.sh, tree.txt > > > How does 2.0.0 compare to old versions? Is it faster, slower? There is rumor > that it is much slower, that the problem is the asyncwal writing. Does > in-memory compaction slow us down or speed us up? What happens when you > enable offheaping? > Keep notes here in this umbrella issue. Need to be able to say something > about perf when 2.0.0 ships. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-20346) [DOC] document change to shell tests
[ https://issues.apache.org/jira/browse/HBASE-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob reassigned HBASE-20346: - Assignee: Mike Drob > [DOC] document change to shell tests > > > Key: HBASE-20346 > URL: https://issues.apache.org/jira/browse/HBASE-20346 > Project: HBase > Issue Type: Sub-task > Components: documentation, shell >Affects Versions: 2.0.0-beta-2, 2.0.0 >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Critical > Fix For: 2.0.0 > > > HBASE-19903 changed how the shell tests are organized and executed, but it > missed updating the section on the ref guide that talks about the shell tests. > bring it up to date so that folks don't miss a bunch of the tests or add new > ones in the wrong place. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20346) [DOC] document change to shell tests
[ https://issues.apache.org/jira/browse/HBASE-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426149#comment-16426149 ] Mike Drob commented on HBASE-20346: --- Nevermind, found it. Will get an update here. > [DOC] document change to shell tests > > > Key: HBASE-20346 > URL: https://issues.apache.org/jira/browse/HBASE-20346 > Project: HBase > Issue Type: Sub-task > Components: documentation, shell >Affects Versions: 2.0.0-beta-2, 2.0.0 >Reporter: Sean Busbey >Priority: Critical > Fix For: 2.0.0 > > > HBASE-19903 changed how the shell tests are organized and executed, but it > missed updating the section on the ref guide that talks about the shell tests. > bring it up to date so that folks don't miss a bunch of the tests or add new > ones in the wrong place. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20346) [DOC] document change to shell tests
[ https://issues.apache.org/jira/browse/HBASE-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426140#comment-16426140 ] Mike Drob commented on HBASE-20346: --- I don't see _any_ section in the ref guide explaining shell testing. Do you have a link? > [DOC] document change to shell tests > > > Key: HBASE-20346 > URL: https://issues.apache.org/jira/browse/HBASE-20346 > Project: HBase > Issue Type: Sub-task > Components: documentation, shell >Affects Versions: 2.0.0-beta-2, 2.0.0 >Reporter: Sean Busbey >Priority: Critical > Fix For: 2.0.0 > > > HBASE-19903 changed how the shell tests are organized and executed, but it > missed updating the section on the ref guide that talks about the shell tests. > bring it up to date so that folks don't miss a bunch of the tests or add new > ones in the wrong place. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20188) [TESTING] Performance
[ https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426134#comment-16426134 ] Eshcar Hillel commented on HBASE-20188: --- This is the code that initializes {{bufferedMutator}} in ycsb:
{code:java}
final TableName tName = TableName.valueOf(table);
this.currentTable = connection.getTable(tName);
if (clientSideBuffering) {
  final BufferedMutatorParams p = new BufferedMutatorParams(tName);
  p.writeBufferSize(writeBufferSize);
  this.bufferedMutator = connection.getBufferedMutator(p);
}
{code}
so we need to understand why {{connection.getBufferedMutator(p)}} returns null > [TESTING] Performance > - > > Key: HBASE-20188 > URL: https://issues.apache.org/jira/browse/HBASE-20188 > Project: HBase > Issue Type: Umbrella > Components: Performance >Reporter: stack >Assignee: stack >Priority: Blocker > Fix For: 2.0.0 > > Attachments: CAM-CONFIG-V01.patch, HBASE-20188.sh, HBase 2.0 > performance evaluation - Basic vs None_ system settings.pdf, > ITBLL2.5B_1.2.7vs2.0.0_cpu.png, ITBLL2.5B_1.2.7vs2.0.0_gctime.png, > ITBLL2.5B_1.2.7vs2.0.0_iops.png, ITBLL2.5B_1.2.7vs2.0.0_load.png, > ITBLL2.5B_1.2.7vs2.0.0_memheap.png, ITBLL2.5B_1.2.7vs2.0.0_memstore.png, > ITBLL2.5B_1.2.7vs2.0.0_ops.png, > ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png, > YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png, > YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png, > flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml, > lock.127.workloadc.20180402T200918Z.svg, > lock.2.memsize2.c.20180403T160257Z.svg, run_ycsb.sh, tree.txt > > > How does 2.0.0 compare to old versions? Is it faster, slower? There is rumor > that it is much slower, that the problem is the asyncwal writing. Does > in-memory compaction slow us down or speed us up? What happens when you > enable offheaping? > Keep notes here in this umbrella issue.
Need to be able to say something > about perf when 2.0.0 ships. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20188) [TESTING] Performance
[ https://issues.apache.org/jira/browse/HBASE-20188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426130#comment-16426130 ] Eshcar Hillel commented on HBASE-20188: --- I am trying to write a new workload that applies client-side buffering by using {{clientbuffering=true}}. However, when running the workload I get the following exception at the line {{Preconditions.checkNotNull(bufferedMutator);}}:
{code}
java.lang.NoClassDefFoundError: com/google/common/base/Preconditions
	at com.yahoo.ycsb.db.HBaseClient10.update(HBaseClient10.java:441)
	at com.yahoo.ycsb.DBWrapper.update(DBWrapper.java:198)
	at com.yahoo.ycsb.workloads.CoreWorkload.doTransactionUpdate(CoreWorkload.java:775)
	at com.yahoo.ycsb.workloads.CoreWorkload.doTransaction(CoreWorkload.java:608)
	at com.yahoo.ycsb.ClientThread.run(Client.java:454)
	at java.lang.Thread.run(Thread.java:745)
{code}
I was able to use this ycsb property in the past. Is anyone aware of changes to the client implementation that result in a null {{bufferedMutator}}? 
> [TESTING] Performance > - > > Key: HBASE-20188 > URL: https://issues.apache.org/jira/browse/HBASE-20188 > Project: HBase > Issue Type: Umbrella > Components: Performance >Reporter: stack >Assignee: stack >Priority: Blocker > Fix For: 2.0.0 > > Attachments: CAM-CONFIG-V01.patch, HBASE-20188.sh, HBase 2.0 > performance evaluation - Basic vs None_ system settings.pdf, > ITBLL2.5B_1.2.7vs2.0.0_cpu.png, ITBLL2.5B_1.2.7vs2.0.0_gctime.png, > ITBLL2.5B_1.2.7vs2.0.0_iops.png, ITBLL2.5B_1.2.7vs2.0.0_load.png, > ITBLL2.5B_1.2.7vs2.0.0_memheap.png, ITBLL2.5B_1.2.7vs2.0.0_memstore.png, > ITBLL2.5B_1.2.7vs2.0.0_ops.png, > ITBLL2.5B_1.2.7vs2.0.0_ops_NOT_summing_regions.png, YCSB_CPU.png, > YCSB_GC_TIME.png, YCSB_IN_MEMORY_COMPACTION=NONE.ops.png, YCSB_MEMSTORE.png, > YCSB_OPs.png, YCSB_in-memory-compaction=NONE.ops.png, YCSB_load.png, > flamegraph-1072.1.svg, flamegraph-1072.2.svg, hbase-env.sh, hbase-site.xml, > lock.127.workloadc.20180402T200918Z.svg, > lock.2.memsize2.c.20180403T160257Z.svg, run_ycsb.sh, tree.txt > > > How does 2.0.0 compare to old versions? Is it faster, slower? There is rumor > that it is much slower, that the problem is the asyncwal writing. Does > in-memory compaction slow us down or speed us up? What happens when you > enable offheaping? > Keep notes here in this umbrella issue. Need to be able to say something > about perf when 2.0.0 ships. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
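The {{NoClassDefFoundError}} above suggests Guava is simply absent from the runtime classpath; the shaded client relocates {{com.google.common}} classes, so an unshaded {{Preconditions}} reference cannot resolve. One JDK-only way around that dependency is {{java.util.Objects.requireNonNull}}. The sketch below is an assumption-laden illustration: the method name {{requireMutator}} and its message are invented, not part of YCSB or HBase.

```java
import java.util.Objects;

public class NullGuardSketch {
    // JDK-only equivalent of Guava's Preconditions.checkNotNull: fails fast
    // with a descriptive message instead of risking a NoClassDefFoundError
    // when Guava is shaded away. Name and message are illustrative only.
    static <T> T requireMutator(T bufferedMutator) {
        return Objects.requireNonNull(bufferedMutator,
            "bufferedMutator is null -- was clientbuffering=true handled in init()?");
    }

    public static void main(String[] args) {
        Object mutator = new Object(); // stand-in for a real BufferedMutator
        System.out.println(requireMutator(mutator) == mutator);
    }
}
```

Swapping the guard only changes where the failure surfaces, of course; the underlying question of why the mutator is null in the first place still stands.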
[jira] [Commented] (HBASE-20322) CME in StoreScanner causes region server crash
[ https://issues.apache.org/jira/browse/HBASE-20322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426119#comment-16426119 ] Hudson commented on HBASE-20322: Results for branch branch-1.4 [build #276 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > CME in StoreScanner causes region server crash > -- > > Key: HBASE-20322 > URL: https://issues.apache.org/jira/browse/HBASE-20322 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.2 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.4 > > Attachments: HBASE-20322.branch-1.3.001.patch, > HBASE-20322.branch-1.3.002-addendum.patch, HBASE-20322.branch-1.3.002.patch, > HBASE-20322.branch-1.4.001.patch > > > RS crashed with ConcurrentModificationException on our 1.3 cluster, stack > trace below. [~toffer] and I checked and there is a race condition between > flush and scanner close. When StoreScanner.updateReaders() is updating the > scanners after a newly flushed file (in this trace below a region close > during a split), the client's scanner could be closing thus causing CME. > Its rare, but since it crashes the region server, needs to be fixed. 
> FATAL regionserver.HRegionServer [regionserver/] : ABORTING region server > : Replay of WAL required. Forcing server shutdown > org.apache.hadoop.hbase.DroppedSnapshotException: region: > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2579) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2255) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2217) > at > org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2207) > at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1501) > at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1420) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:398) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278) > at > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:566) > at > org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82) > at > org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.util.ConcurrentModificationException > at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) > at java.util.ArrayList$Itr.next(ArrayList.java:851) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.clearAndClose(StoreScanner.java:797) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.updateReaders(StoreScanner.java:825) > at > org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:1155) > PS: ignore the line no in the above stack trace, method calls should help > understand whats 
happening. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
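The race described above is the classic {{ConcurrentModificationException}} pattern: one code path structurally modifies an {{ArrayList}} while another iterates it. This minimal single-threaded sketch reproduces the failure mode and shows a snapshot-iterator alternative; it is a generic illustration, not the actual HBASE-20322 patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CmeSketch {
    // Structurally modifying a plain ArrayList mid-iteration fails fast with
    // CME, which is the failure mode behind the StoreScanner crash above.
    static boolean plainListThrows() {
        List<String> scanners = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : scanners) {
                scanners.remove(s); // structural modification mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // A copy-on-write list hands out snapshot iterators, so concurrent
    // structural changes never invalidate an in-flight iteration.
    static boolean cowListSurvives() {
        List<String> scanners = new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));
        for (String s : scanners) {
            scanners.remove(s);
        }
        return scanners.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(plainListThrows() && cowListSurvives());
    }
}
```

Whether to synchronize the accesses or switch to a snapshot-based collection is a trade-off: locking keeps memory flat, while copy-on-write pays on every mutation but makes readers lock-free.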
[jira] [Commented] (HBASE-20301) Remove the meaningless plus sign from table.jsp
[ https://issues.apache.org/jira/browse/HBASE-20301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426121#comment-16426121 ] Hudson commented on HBASE-20301: Results for branch branch-1.4 [build #276 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Remove the meaningless plus sign from table.jsp > --- > > Key: HBASE-20301 > URL: https://issues.apache.org/jira/browse/HBASE-20301 > Project: HBase > Issue Type: Bug > Components: UI >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Minor > Fix For: 1.5.0, 1.4.4 > > Attachments: HBASE-20301.branch-1.v0.patch.patch, > screenshot(after).png, screenshot(before).jpg > > > see the screenshot -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20231) Not able to delete column family from a row using RemoteHTable
[ https://issues.apache.org/jira/browse/HBASE-20231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426120#comment-16426120 ] Hudson commented on HBASE-20231: Results for branch branch-1.4 [build #276 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/276//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. 
> Not able to delete column family from a row using RemoteHTable > -- > > Key: HBASE-20231 > URL: https://issues.apache.org/jira/browse/HBASE-20231 > Project: HBase > Issue Type: Bug > Components: REST >Reporter: Pankaj Kumar >Assignee: Pankaj Kumar >Priority: Major > Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.4, 2.0.0 > > Attachments: HBASE-20231-branch-1-v2.patch, > HBASE-20231-branch-1-v3.patch, HBASE-20231-branch-1.3.patch, > HBASE-20231-branch-1.patch, HBASE-20231-v2.patch, HBASE-20231-v3.patch, > HBASE-20231.patch > > > Example code to reproduce the issue, > {code:java} > Cluster cluster = new Cluster(); > cluster.add("rest-server-IP", rest-server-port); > Client client = new Client(cluster); > RemoteHTable table = new RemoteHTable(client, "t1"); > // Insert few records, > Put put = new Put(Bytes.toBytes("r1")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c2"), Bytes.toBytes("c2")); > put.add(Bytes.toBytes("cf2"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > table.put(put); > put = new Put(Bytes.toBytes("r2")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > put.add(Bytes.toBytes("cf1"), Bytes.toBytes("c2"), Bytes.toBytes("c2")); > put.add(Bytes.toBytes("cf2"), Bytes.toBytes("c1"), Bytes.toBytes("c1")); > table.put(put); > // Delete the entire column family from the row > Delete del = new Delete(Bytes.toBytes("r2")); > del.addFamily(Bytes.toBytes("cf1")); > table.delete(del); > {code} > Here the problem is in building row specification in > RemoteHTable.buildRowSpec(). Row specification is framed as "/t1/r2/cf1:" > instead of "/t1/r2/cf1". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20350) RegionServer is aborted due to NPE when scanner lease expired on a region
Umesh Agashe created HBASE-20350: Summary: RegionServer is aborted due to NPE when scanner lease expired on a region Key: HBASE-20350 URL: https://issues.apache.org/jira/browse/HBASE-20350 Project: HBase Issue Type: Bug Affects Versions: 2.0.0-beta-2 Reporter: Umesh Agashe Fix For: 2.0.0 From logs: {code} 2018-04-03 02:06:00,630 INFO org.apache.hadoop.hbase.regionserver.HRegion: Replaying edits from hdfs://ns1/hbase/data/default/IntegrationTestBigLinkedList_20180403004104/834545a2ae1baa47082a3bc7aab2be2f/recovered.edits/1032167 2018-04-03 02:06:00,724 INFO org.apache.hadoop.hbase.regionserver.RSRpcServices: Scanner 2120114333978460945 lease expired on region IntegrationTestBigLinkedList_20180403004104,\xF1\xFE\xCB\x98e1\xF8\xD4,1522742825561.ce0d91585a2d188123173c36d0b693a5. 2018-04-03 02:06:00,730 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: * ABORTING region server vd0510.halxg.cloudera.com,22101,1522626204176: Uncaught exception in executorService thread regionserver/vd0510.halxg.cloudera.com/10.17.226.13:22101.leaseChecker * java.lang.NullPointerException at org.apache.hadoop.hbase.CellComparatorImpl.compareRows(CellComparatorImpl.java:202) at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:74) at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:61) at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:207) at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:190) at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:178) at java.util.PriorityQueue.siftDownUsingComparator(PriorityQueue.java:721) at java.util.PriorityQueue.siftDown(PriorityQueue.java:687) at java.util.PriorityQueue.poll(PriorityQueue.java:595) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.close(KeyValueHeap.java:228) at
org.apache.hadoop.hbase.regionserver.StoreScanner.close(StoreScanner.java:483) at org.apache.hadoop.hbase.regionserver.StoreScanner.close(StoreScanner.java:464) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.close(KeyValueHeap.java:224) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.close(HRegion.java:) at org.apache.hadoop.hbase.regionserver.RSRpcServices$ScannerListener.leaseExpired(RSRpcServices.java:460) at org.apache.hadoop.hbase.regionserver.Leases.run(Leases.java:122) at java.lang.Thread.run(Thread.java:748) 2018-04-03 02:06:00,731 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.security.access.AccessController, org.apache.hadoop.hbase.security.token.TokenProvider, org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint, com.cloudera.navigator.audit.hbase.RegionAuditCoProcessor] 2018-04-03 02:06:00,737 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics as JSON on abort: { {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
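The abort in the stack trace above comes from comparing scanner heap entries whose current cell is already null while the heap is being closed. A generic null-tolerant comparison, shown here only to illustrate the defensive pattern (it is not the committed HBase fix), avoids that class of NPE:

```java
import java.util.Comparator;

final class NullSafe {
    // Illustrative null-tolerant wrapper (not the committed HBase fix):
    // entries whose current value is null — e.g. an already-exhausted or
    // closed scanner — sort last instead of hitting an NPE inside the
    // delegate comparator.
    static <T> int compare(Comparator<T> cmp, T a, T b) {
        if (a == b) return 0;      // covers both-null
        if (a == null) return 1;   // null sorts after any real value
        if (b == null) return -1;
        return cmp.compare(a, b);
    }
}
```

A PriorityQueue built over such a wrapper can be drained during close even when some of its entries have already released their current cell.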
[jira] [Commented] (HBASE-17730) [DOC] Migration to 2.0 for coprocessors
[ https://issues.apache.org/jira/browse/HBASE-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426113#comment-16426113 ] Appy commented on HBASE-17730: -- Pushed header addition to design doc and changes to upgrade section. Thanks for review guys. > [DOC] Migration to 2.0 for coprocessors > > > Key: HBASE-17730 > URL: https://issues.apache.org/jira/browse/HBASE-17730 > Project: HBase > Issue Type: Sub-task > Components: documentation, migration >Reporter: Appy >Assignee: Appy >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-17730.master.001.patch > > > Jiras breaking coprocessor compatibility should be marked with component ' > Coprocessor', and label 'incompatible'. > Close to releasing 2.0, we should go through all such jiras and write down > steps for migrating coprocessor easily. > The idea is, it might be very hard to fix coprocessor breakages by reverse > engineering errors, but will be easier we suggest easiest way to fix > breakages resulting from each individual incompatible change. > For eg. HBASE-17312 is incompatible change. It'll result in 100s of errors > because BaseXXXObserver classes are gone and will probably result in a lot of > confusion, but if we explicitly mention the fix which is just one line change > - replace "Foo extends BaseXXXObserver" with "Foo implements XXXObserver" - > it makes it very easy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
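The one-line migration described above can be modeled with a toy example. The interface and method names below are illustrative stand-ins, not the real HBase API: in 2.0 the `Base*` adapter classes are gone and the observer interfaces carry default no-op hooks, so `implements` replaces `extends` with no other changes.

```java
// Toy model of "Foo extends BaseXXXObserver" -> "Foo implements
// XXXObserver": the interface supplies default no-op hooks, so an
// implementor only overrides what it needs.
interface RegionObserverLike {
    default String prePut(String row) { return row; } // default no-op hook
}

class AuditObserver implements RegionObserverLike {
    @Override
    public String prePut(String row) { return "audited:" + row; }
}
```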
[jira] [Resolved] (HBASE-17730) [DOC] Migration to 2.0 for coprocessors
[ https://issues.apache.org/jira/browse/HBASE-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Appy resolved HBASE-17730. -- Resolution: Fixed > [DOC] Migration to 2.0 for coprocessors > > > Key: HBASE-17730 > URL: https://issues.apache.org/jira/browse/HBASE-17730 > Project: HBase > Issue Type: Sub-task > Components: documentation, migration >Reporter: Appy >Assignee: Appy >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-17730.master.001.patch > > > Jiras breaking coprocessor compatibility should be marked with component ' > Coprocessor', and label 'incompatible'. > Close to releasing 2.0, we should go through all such jiras and write down > steps for migrating coprocessor easily. > The idea is, it might be very hard to fix coprocessor breakages by reverse > engineering errors, but will be easier we suggest easiest way to fix > breakages resulting from each individual incompatible change. > For eg. HBASE-17312 is incompatible change. It'll result in 100s of errors > because BaseXXXObserver classes are gone and will probably result in a lot of > confusion, but if we explicitly mention the fix which is just one line change > - replace "Foo extends BaseXXXObserver" with "Foo implements XXXObserver" - > it makes it very easy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20330) ProcedureExecutor.start() gets stuck in recover lease on store.
[ https://issues.apache.org/jira/browse/HBASE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20330: - Description: We have instance in our internal testing where master log is getting filled with following messages: {code} 2018-04-02 17:11:17,566 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recover lease on dfs file hdfs://ns1/hbase/MasterProcWALs/pv2-0018.log 2018-04-02 17:11:17,567 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recovered lease, attempt=0 on file=hdfs://ns1/hbase/MasterProcWALs/pv2-0018.log after 1ms 2018-04-02 17:11:17,574 WARN org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: Unable to read tracker for hdfs://ns1/hbase/MasterProcWALs/pv2-0018.log - Invalid Trailer version. got 111 expected 1 2018-04-02 17:11:17,576 ERROR org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: Log file with id=19 already exists org.apache.hadoop.fs.FileAlreadyExistsException: /hbase/MasterProcWALs/pv2-0019.log for client 10.17.202.11 already exists at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.startFile(FSDirWriteFileOp.java:381) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2442) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2339) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:764) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:451) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) {code} Debugging it further with [~appy], [~avirmani] and [~xiaochen] we found that when WALProcedureStore#rollWriter() fails and returns false for some reason, it keeps looping continuously.
[jira] [Commented] (HBASE-17730) [DOC] Migration to 2.0 for coprocessors
[ https://issues.apache.org/jira/browse/HBASE-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426102#comment-16426102 ] Mike Drob commented on HBASE-17730: --- good catch on the license, Peter I'm +1 after that fix > [DOC] Migration to 2.0 for coprocessors > > > Key: HBASE-17730 > URL: https://issues.apache.org/jira/browse/HBASE-17730 > Project: HBase > Issue Type: Sub-task > Components: documentation, migration >Reporter: Appy >Assignee: Appy >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-17730.master.001.patch > > > Jiras breaking coprocessor compatibility should be marked with component ' > Coprocessor', and label 'incompatible'. > Close to releasing 2.0, we should go through all such jiras and write down > steps for migrating coprocessor easily. > The idea is, it might be very hard to fix coprocessor breakages by reverse > engineering errors, but will be easier we suggest easiest way to fix > breakages resulting from each individual incompatible change. > For eg. HBASE-17312 is incompatible change. It'll result in 100s of errors > because BaseXXXObserver classes are gone and will probably result in a lot of > confusion, but if we explicitly mention the fix which is just one line change > - replace "Foo extends BaseXXXObserver" with "Foo implements XXXObserver" - > it makes it very easy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-17730) [DOC] Migration to 2.0 for coprocessors
[ https://issues.apache.org/jira/browse/HBASE-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426098#comment-16426098 ] Appy edited comment on HBASE-17730 at 4/4/18 8:04 PM: -- Oh! Thanks [~psomogyi] for pointing it out. Correcting it. And committing the docs change was (Author: appy): Oh! Thanks [~psomogyi] for pointing it out. Correcting it. > [DOC] Migration to 2.0 for coprocessors > > > Key: HBASE-17730 > URL: https://issues.apache.org/jira/browse/HBASE-17730 > Project: HBase > Issue Type: Sub-task > Components: documentation, migration >Reporter: Appy >Assignee: Appy >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-17730.master.001.patch > > > Jiras breaking coprocessor compatibility should be marked with component ' > Coprocessor', and label 'incompatible'. > Close to releasing 2.0, we should go through all such jiras and write down > steps for migrating coprocessor easily. > The idea is, it might be very hard to fix coprocessor breakages by reverse > engineering errors, but will be easier we suggest easiest way to fix > breakages resulting from each individual incompatible change. > For eg. HBASE-17312 is incompatible change. It'll result in 100s of errors > because BaseXXXObserver classes are gone and will probably result in a lot of > confusion, but if we explicitly mention the fix which is just one line change > - replace "Foo extends BaseXXXObserver" with "Foo implements XXXObserver" - > it makes it very easy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20338) WALProcedureStore#recoverLease() should have fixed sleeps and/ or exponential backoff for retrying rollWriter()
[ https://issues.apache.org/jira/browse/HBASE-20338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426100#comment-16426100 ] Umesh Agashe commented on HBASE-20338: -- Thanks for taking this [~jojochuang]! > WALProcedureStore#recoverLease() should have fixed sleeps and/ or exponential > backoff for retrying rollWriter() > --- > > Key: HBASE-20338 > URL: https://issues.apache.org/jira/browse/HBASE-20338 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-beta-2 >Reporter: Umesh Agashe >Assignee: Wei-Chiu Chuang >Priority: Major > > In our internal testing we observed that logs are getting flooded due to > continuous loop in WALProcedureStore#recoverLease(): > {code} > while (isRunning()) { > // Get Log-MaxID and recover lease on old logs > try { > flushLogId = initOldLogs(oldLogs); > } catch (FileNotFoundException e) { > LOG.warn("Someone else is active and deleted logs. retrying.", e); > oldLogs = getLogFiles(); > continue; > } > // Create new state-log > if (!rollWriter(flushLogId + 1)) { > // someone else has already created this log > LOG.debug("Someone else has already created log " + flushLogId); > continue; > } > {code} > rollWriter() fails to create a new file. 
Error messages in HDFS namenode logs > around same time: > {code} > INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 8020, call > org.apache.hadoop.hdfs.protocol.ClientProtocol.create from > 172.31.121.196:38508 Call#3141 Retry#0 > java.io.IOException: Exeption while contacting value generator > at > org.apache.hadoop.crypto.key.kms.ValueQueue.getAtMost(ValueQueue.java:389) > at > org.apache.hadoop.crypto.key.kms.ValueQueue.getNext(ValueQueue.java:291) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.generateEncryptedKey(KMSClientProvider.java:724) > at > org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:511) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$2.run(FSNamesystem.java:2680) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$2.run(FSNamesystem.java:2676) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) > at > org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:477) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUser(SecurityUtil.java:458) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.generateEncryptedDataEncryptionKey(FSNamesystem.java:2675) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2815) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2712) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:604) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:115) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:412) > at > 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220) > Caused by: java.net.ConnectException: Connection refused (Connection refused) > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at
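The fixed-sleep/exponential-backoff proposed in this issue can be sketched as a small delay calculator. The constants and where the delay is applied in the recoverLease() loop are assumptions, not the committed HBASE-20338 patch:

```java
// Minimal capped exponential backoff for the retry loop discussed
// above: attempt 0 waits baseMillis, each retry doubles the wait, and
// the delay is clamped at maxMillis so a persistent failure (e.g. the
// KMS outage in the namenode log) throttles instead of flooding logs.
static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
    long delay = baseMillis << Math.min(attempt, 30); // cap shift to avoid overflow
    return Math.min(delay, maxMillis);
}
```

The loop would then call `Thread.sleep(backoffMillis(attempt++, 100, 30000))` before retrying rollWriter(), rather than spinning immediately on each failure.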
[jira] [Commented] (HBASE-17730) [DOC] Migration to 2.0 for coprocessors
[ https://issues.apache.org/jira/browse/HBASE-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426098#comment-16426098 ] Appy commented on HBASE-17730: -- Oh! Thanks [~psomogyi] for pointing it out. Correcting it. > [DOC] Migration to 2.0 for coprocessors > > > Key: HBASE-17730 > URL: https://issues.apache.org/jira/browse/HBASE-17730 > Project: HBase > Issue Type: Sub-task > Components: documentation, migration >Reporter: Appy >Assignee: Appy >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-17730.master.001.patch > > > Jiras breaking coprocessor compatibility should be marked with component ' > Coprocessor', and label 'incompatible'. > Close to releasing 2.0, we should go through all such jiras and write down > steps for migrating coprocessor easily. > The idea is, it might be very hard to fix coprocessor breakages by reverse > engineering errors, but will be easier we suggest easiest way to fix > breakages resulting from each individual incompatible change. > For eg. HBASE-17312 is incompatible change. It'll result in 100s of errors > because BaseXXXObserver classes are gone and will probably result in a lot of > confusion, but if we explicitly mention the fix which is just one line change > - replace "Foo extends BaseXXXObserver" with "Foo implements XXXObserver" - > it makes it very easy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20275) [DOC] clarify impact to hfile command from HBASE-17197
[ https://issues.apache.org/jira/browse/HBASE-20275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426087#comment-16426087 ] Mike Drob commented on HBASE-20275: --- [~balazs.meszaros] - were you already looking at this? > [DOC] clarify impact to hfile command from HBASE-17197 > -- > > Key: HBASE-20275 > URL: https://issues.apache.org/jira/browse/HBASE-20275 > Project: HBase > Issue Type: Sub-task > Components: documentation, tooling >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Priority: Major > > Feedback on HBASE-19158 from [~mdrob] > {quote} > Tools: > HBASE-17197 > {quote} > It's not clear to me from the patch on HBASE-17197 if this was actually a > change that needs to be called out. So tasks: > 1) Figure out if the {{hfile}} command args from HBase 1.y still works > 2) Update the title of HBASE-17197 to match what the change in the jira ended > up being > 3) If hfile changed in an incompatible way, add it to the upgrade section and > make sure the refguide section "hfile_tool" is up to date. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-13866) Add endpoint coprocessor to the section hbase.coprocessor.region.classes in HBase book
[ https://issues.apache.org/jira/browse/HBASE-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426086#comment-16426086 ] stack commented on HBASE-13866: --- Committed with wrong JIRA ID commit 8a2cef3315516501627c7a30bdcf989b12a32303 Author: g-bhardwaj Date: Sun Oct 25 17:29:35 2015 +0530 HBASE-13867: Add endpoint coprocessor guide to HBase book. > Add endpoint coprocessor to the section hbase.coprocessor.region.classes in > HBase book > -- > > Key: HBASE-13866 > URL: https://issues.apache.org/jira/browse/HBASE-13866 > Project: HBase > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.0 >Reporter: Vladimir Rodionov >Assignee: Misty Stanley-Jones >Priority: Trivial > Fix For: 2.0.0 > > Attachments: HBASE-13866.patch > > > {quote} > hbase.coprocessor.region.classes > Description > A comma-separated list of Coprocessors that are loaded by default on all > tables. For any override coprocessor method, these classes will be called in > order. After implementing your own Coprocessor, just put it in HBase’s > classpath and add the fully qualified class name here. A coprocessor can also > be loaded on demand by setting HTableDescriptor. > {quote} > This must be more specific: not Coprocessors, but Region observers and > *endpoint coprocessors*. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
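The clarification requested above can be illustrated with a hedged hbase-site.xml fragment; the class names below are placeholders, not shipped coprocessors:

```xml
<!-- Illustrative entry: both region observers and endpoint
     coprocessors are loaded through this one property, which is the
     distinction the issue asks the book to spell out. -->
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.example.MyRegionObserver,org.example.MyEndpoint</value>
</property>
```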
[jira] [Updated] (HBASE-20305) Add option to SyncTable that skip deletes on target cluster
[ https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-20305: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the patch, Wellington. Thanks for the review, Dave. > Add option to SyncTable that skip deletes on target cluster > --- > > Key: HBASE-20305 > URL: https://issues.apache.org/jira/browse/HBASE-20305 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 2.0.0-alpha-4 >Reporter: Wellington Chevreuil >Assignee: Wellington Chevreuil >Priority: Minor > Fix For: 3.0.0 > > Attachments: 0001-HBASE-20305.master.001.patch, > HBASE-20305.master.002.patch > > > We had a situation where two clusters with active-active replication got out > of sync, but both had data that should be kept. The tables in question never > have data deleted, but ingestion had happened on the two different clusters, > some rows had been even updated. > In this scenario, a cell that is present in one of the table clusters should > not be deleted, but replayed on the other. Also, for cells with same > identifier but different values, the most recent value should be kept. > Current version of SyncTable would not be applicable here, because it would > simply copy the whole state from source to target, then losing any additional > rows that might be only in target, as well as cell values that got most > recent update. This could be solved by adding an option to skip deletes for > SyncTable. This way, the additional cells not present on source would still > be kept. For cells with same identifier but different values, it would just > perform a Put for the cell version from source, but client scans would still > fetch the most recent timestamp. > I'm attaching a patch with this additional option shortly. Please share your > thoughts. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
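The merge rule motivating the option above can be sketched per cell. This is assumed logic for illustration, not SyncTable's implementation: with deletes skipped, cells present only in the target survive, and conflicting cells are replayed as a Put from source, letting the newest timestamp win on scan.

```java
// Per-cell resolution sketch (hypothetical, not SyncTable code).
// sourceVal == null means the cell exists only in the target cluster.
static String resolve(String sourceVal, String targetVal, boolean skipDeletes) {
    if (sourceVal == null) {
        // Normally deleted to mirror the source; kept when deletes are
        // skipped, which preserves target-only rows after a split-brain.
        return skipDeletes ? targetVal : null;
    }
    return sourceVal; // replayed as a Put over the target
}
```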
[jira] [Commented] (HBASE-13279) Add src/main/asciidoc/asciidoctor.css to RAT exclusion list in POM
[ https://issues.apache.org/jira/browse/HBASE-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426065#comment-16426065 ] stack commented on HBASE-13279: --- Committed w/o JIRA ID commit fd8c13f61ab3a6f86489d091f45820fe1372ea22 Author: Andrew PurtellDate: Wed Mar 18 18:16:46 2015 -0700 Add src/main/asciidoc/asciidoctor.css to RAT exclusion list in POM > Add src/main/asciidoc/asciidoctor.css to RAT exclusion list in POM > -- > > Key: HBASE-13279 > URL: https://issues.apache.org/jira/browse/HBASE-13279 > Project: HBase > Issue Type: Bug > Components: documentation >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Minor > Fix For: 2.0.0 > > Attachments: > 0001-Add-src-main-asciidoc-asciidoctor.css-to-RAT-exclusi.patch > > > After copying back the latest doc updates from trunk to 0.98 branch for a > release, the release audit failed due to src/main/asciidoc/asciidoctor.css, > which is MIT licensed but only by reference. Exclude it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-13133) NPE when running TestSplitLogManager
[ https://issues.apache.org/jira/browse/HBASE-13133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426060#comment-16426060 ] stack commented on HBASE-13133: --- Committed with a mangled JIRA ID: 70ecf18817 HBASE-NPE when running TestSplitLogManager (Andrey Stepachev and Zhangduo) kalashnikov:hbase.git stack$ git show 70ecf18817 commit 70ecf18817ef219389a9e024ff21ffb99b6615d9 Author: stackDate: Sun Mar 1 19:54:10 2015 -0800 HBASE-NPE when running TestSplitLogManager (Andrey Stepachev and Zhangduo) > NPE when running TestSplitLogManager > > > Key: HBASE-13133 > URL: https://issues.apache.org/jira/browse/HBASE-13133 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: Andrey Stepachev >Priority: Major > Fix For: 1.0.1, 1.1.0, 2.0.0 > > Attachments: HBASE-13133.patch > > > https://builds.apache.org/job/HBase-TRUNK/6187/testReport/junit/org.apache.hadoop.hbase.master/TestSplitLogManager/testOrphanTaskAcquisition/ > {noformat} > 2015-03-01 01:34:58,902 INFO [Thread-23] master.TestSplitLogManager(298): > TestOrphanTaskAcquisition > 2015-03-01 01:34:58,904 DEBUG [Thread-23] > coordination.ZKSplitLogManagerCoordination(870): Distributed log replay=true > 2015-03-01 01:34:58,907 INFO [Thread-23] > coordination.ZKSplitLogManagerCoordination(594): found orphan task > orphan%2Ftest%2Fslash > 2015-03-01 01:34:58,913 INFO [Thread-23] > coordination.ZKSplitLogManagerCoordination(598): Found 1 orphan tasks and 0 > rescan nodes > 2015-03-01 01:34:58,913 ERROR [main-EventThread] > zookeeper.ClientCnxn$EventThread(613): Caught unexpected throwable > java.lang.NullPointerException > at > org.apache.hadoop.hbase.coordination.ZKSplitLogManagerCoordination.findOrCreateOrphanTask(ZKSplitLogManagerCoordination.java:546) > at > org.apache.hadoop.hbase.coordination.ZKSplitLogManagerCoordination.heartbeat(ZKSplitLogManagerCoordination.java:556) > at > 
org.apache.hadoop.hbase.coordination.ZKSplitLogManagerCoordination.getDataSetWatchSuccess(ZKSplitLogManagerCoordination.java:467) > at > org.apache.hadoop.hbase.coordination.ZKSplitLogManagerCoordination.access$700(ZKSplitLogManagerCoordination.java:74) > at > org.apache.hadoop.hbase.coordination.ZKSplitLogManagerCoordination$GetDataAsyncCallback.processResult(ZKSplitLogManagerCoordination.java:1020) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:561) > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > {noformat} > I got this NPE almost every time when running TestSplitLogManager locally. I > am not sure whether it is the root cause of test failing, but seems related. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20348) [DOC] call out change to tracing in upgrade guide
[ https://issues.apache.org/jira/browse/HBASE-20348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-20348: -- Status: Patch Available (was: Open) > [DOC] call out change to tracing in upgrade guide > - > > Key: HBASE-20348 > URL: https://issues.apache.org/jira/browse/HBASE-20348 > Project: HBase > Issue Type: Sub-task > Components: documentation, tracing >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-20348.patch > > > we changed our HTrace version across an incompatible boundary in HBASE-18601. > We should call out somewhere that folks who built their apps to do tracing > through our client will need to update. > might also be worth calling out our current doubts on utility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20348) [DOC] call out change to tracing in upgrade guide
[ https://issues.apache.org/jira/browse/HBASE-20348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-20348: -- Attachment: HBASE-20348.patch > [DOC] call out change to tracing in upgrade guide > - > > Key: HBASE-20348 > URL: https://issues.apache.org/jira/browse/HBASE-20348 > Project: HBase > Issue Type: Sub-task > Components: documentation, tracing >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-20348.patch > > > we changed our HTrace version across an incompatible boundary in HBASE-18601. > We should call out somewhere that folks who built their apps to do tracing > through our client will need to update. > might also be worth calling out our current doubts on utility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-12676) Fix the misleading ASCII art in IntegrationTestBigLinkedList
[ https://issues.apache.org/jira/browse/HBASE-12676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426039#comment-16426039 ] stack commented on HBASE-12676: --- Applied without JIRA ID tree b66f82ea2294019c4c0331f40f4a80395340c215 parent da2b5a962725d775ed6db8d0db417e5d7af8c561 author Yi Deng Thu Dec 11 16:36:46 2014 -0800 committer stack Thu Dec 11 17:22:29 2014 -0800 Fix the ASCII art Signed-off-by: stack > Fix the misleading ASCII art in IntegrationTestBigLinkedList > > > Key: HBASE-12676 > URL: https://issues.apache.org/jira/browse/HBASE-12676 > Project: HBase > Issue Type: Improvement > Components: documentation, integration tests >Affects Versions: 2.0.0 >Reporter: Yi Deng >Assignee: Yi Deng >Priority: Trivial > Labels: document > Fix For: 1.0.0, 0.98.9, 2.0.0 > > Attachments: 2.0-0001-Fix-the-ASCII-art.patch > > > The ASCII art before IntegrationTestBigLinkedList.GeneratorMapper was wrongly > drawn. This diff will fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20348) [DOC] call out change to tracing in upgrade guide
[ https://issues.apache.org/jira/browse/HBASE-20348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426037#comment-16426037 ] Mike Drob commented on HBASE-20348: --- The tracing appendix was already updated for new syntax with HBASE-18601. You're looking for a note in the upgrade section now? > [DOC] call out change to tracing in upgrade guide > - > > Key: HBASE-20348 > URL: https://issues.apache.org/jira/browse/HBASE-20348 > Project: HBase > Issue Type: Sub-task > Components: documentation, tracing >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Major > Fix For: 2.0.0 > > > we changed our HTrace version across an incompatible boundary in HBASE-18601. > We should call out somewhere that folks who built their apps to do tracing > through our client will need to update. > might also be worth calling out our current doubts on utility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426028#comment-16426028 ] stack commented on HBASE-12386: --- [~apurtell] No worries sir. I have done 10 for any one by anyone else. Just noting these facts in issue as I try to align JIRA and git for branch-2. > Replication gets stuck following a transient zookeeper error to remote peer > cluster > --- > > Key: HBASE-12386 > URL: https://issues.apache.org/jira/browse/HBASE-12386 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 0.98.7 >Reporter: Adrian Muraru >Assignee: Adrian Muraru >Priority: Major > Fix For: 0.98.8, 0.99.2, 2.0.0 > > Attachments: HBASE-12386-0.98.patch, HBASE-12386.patch > > > Following a transient ZK error replication gets stuck and remote peers are > never updated. > Source region servers are reporting continuously the following error in logs: > "No replication sinks are available" -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-20348) [DOC] call out change to tracing in upgrade guide
[ https://issues.apache.org/jira/browse/HBASE-20348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob reassigned HBASE-20348: - Assignee: Mike Drob > [DOC] call out change to tracing in upgrade guide > - > > Key: HBASE-20348 > URL: https://issues.apache.org/jira/browse/HBASE-20348 > Project: HBase > Issue Type: Sub-task > Components: documentation, tracing >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Mike Drob >Priority: Major > Fix For: 2.0.0 > > > we changed our HTrace version across an incompatible boundary in HBASE-18601. > We should call out somewhere that folks who built their apps to do tracing > through our client will need to update. > might also be worth calling out our current doubts on utility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20243) [Shell] Add shell command to create a new table by cloning the existent table
[ https://issues.apache.org/jira/browse/HBASE-20243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426011#comment-16426011 ] Hadoop QA commented on HBASE-20243: ---
-1 overall

Vote | Subsystem  | Runtime | Comment
  0  | reexec     |   0m 8s | Docker mode activated.
Prechecks:
 +1  | hbaseanti  |   0m 0s | Patch does not have any anti-patterns.
 +1  | @author    |   0m 0s | The patch does not contain any @author tags.
 +1  | test4tests |   0m 0s | The patch appears to include 4 new or modified test files.
master Compile Tests:
  0  | mvndep     |  0m 12s | Maven dependency ordering for branch
 +1  | mvninstall |  4m 33s | master passed
 +1  | compile    |  2m 35s | master passed
 +1  | checkstyle |  1m 57s | master passed
 -1  | shadedjars |  0m 21s | branch has 7 errors when building our shaded downstream artifacts.
 +1  | findbugs   |  2m 41s | master passed
 +1  | javadoc    |  0m 57s | master passed
Patch Compile Tests:
  0  | mvndep     |  0m 14s | Maven dependency ordering for patch
 +1  | mvninstall |  4m 35s | the patch passed
 +1  | compile    |  2m 36s | the patch passed
 +1  | javac      |  2m 36s | the patch passed
 +1  | checkstyle |  1m 59s | the patch passed
 -1  | rubocop    |  0m 18s | The patch generated 36 new + 775 unchanged - 7 fixed = 811 total (was 782)
 -1  | ruby-lint  |  0m 21s | The patch generated 46 new + 1269 unchanged - 0 fixed = 1315 total (was 1269)
 +1  | whitespace |   0m 0s | The patch has no whitespace issues.
 -1  | shadedjars |  0m 11s | patch has 7 errors when building our shaded downstream artifacts.
 +1  | hadoopcheck|  15m 1s | Patch does not cause any errors with Hadoop 2.6.5, 2.7.4, or 3.0.0.
 +1  | findbugs   |  3m 14s | the patch passed
 +1  | javadoc    |   1m 0s | the patch passed
Other Tests:
 +1  | unit       |  2m 57s | hbase-client in the patch passed.
 +1  | unit       | 103m 2s | hbase-server in the patch passed.
 +1  | unit       |   7m 0s | hbase-shell in the patch passed.
 +1  | asflicense |  0m 55s | The patch does not generate ASF License warnings.
Total: 157m 8s

Report/Notes:
  Docker: Client=17.05.0-ce Server=17.05.0-ce Image: yetus/hbase:d8b550f
  JIRA Issue: HBASE-20243
  JIRA Patch URL: https://issues.apache.org/jira/secure/attachment/12917569/HBASE-20243.master.007.patch
  Optional Tests: asflicense javac javadoc unit findbugs
[jira] [Commented] (HBASE-12386) Replication gets stuck following a transient zookeeper error to remote peer cluster
[ https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426009#comment-16426009 ] Andrew Purtell commented on HBASE-12386: Sorry about that. > Replication gets stuck following a transient zookeeper error to remote peer > cluster > --- > > Key: HBASE-12386 > URL: https://issues.apache.org/jira/browse/HBASE-12386 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 0.98.7 >Reporter: Adrian Muraru >Assignee: Adrian Muraru >Priority: Major > Fix For: 0.98.8, 0.99.2, 2.0.0 > > Attachments: HBASE-12386-0.98.patch, HBASE-12386.patch > > > Following a transient ZK error replication gets stuck and remote peers are > never updated. > Source region servers are reporting continuously the following error in logs: > "No replication sinks are available" -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-15570) renewable delegation tokens for long-lived spark applications
[ https://issues.apache.org/jira/browse/HBASE-15570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426005#comment-16426005 ] Abhishek Talluri commented on HBASE-15570: -- Is there any update on whether this has been handled in Spark 2? It's confusing to see SPARK-14743 moved to the resolved state while this remains unresolved. I have a client who is looking for automatic renewal of HBase delegation tokens and we are unsure if this has been fixed. > renewable delegation tokens for long-lived spark applications > - > > Key: HBASE-15570 > URL: https://issues.apache.org/jira/browse/HBASE-15570 > Project: HBase > Issue Type: Improvement > Components: spark >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Major > > Right now our spark integration works on secure clusters by getting > delegation tokens and sending them to the executors. Unfortunately, > applications that need to run for longer than the delegation token lifetime > (by default 7 days) will fail. > In particular, this is an issue for Spark Streaming applications. Since they > expect to run indefinitely, we should have a means for renewing the > delegation tokens. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-20337) Update the doc on how to setup shortcircuit reads; its stale
[ https://issues.apache.org/jira/browse/HBASE-20337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425985#comment-16425985 ] stack edited comment on HBASE-20337 at 4/4/18 6:34 PM: --- Resolving as pushed. Pushed an addendum too with pointers to the advanced configs for upping cache and buffer sizes. Thanks for the reviews boys. Oh, fixed the whitespace on commit, and the ASF license is from the concurrent HBASE-17730 commit... it's getting fixed. was (Author: stack): Resolving as pushed. Pushed an addendum too with pointers to the advanced configs for upping cache and buffer sizes. Thanks for the reviews boys. > Update the doc on how to setup shortcircuit reads; its stale > > > Key: HBASE-20337 > URL: https://issues.apache.org/jira/browse/HBASE-20337 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-20337.master.001.patch, > HBASE-20337.master.002.patch > > > The doc is from another era. Update it. Short-circuit reads can make a *big* > difference when random reading in particular. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-20337) Update the doc on how to setup shortcircuit reads; its stale
[ https://issues.apache.org/jira/browse/HBASE-20337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425985#comment-16425985 ] stack edited comment on HBASE-20337 at 4/4/18 6:33 PM: --- Resolving as pushed. Pushed an addendum too with pointers to the advanced configs for upping cache and buffer sizes. Thanks for the reviews boys. was (Author: stack): Resolving as pushed. Pushed an addendum too with pointers to the advanced configs for upping cache and buffer sizes. > Update the doc on how to setup shortcircuit reads; its stale > > > Key: HBASE-20337 > URL: https://issues.apache.org/jira/browse/HBASE-20337 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-20337.master.001.patch, > HBASE-20337.master.002.patch > > > The doc is from another era. Update it. Short-circuit reads can make a *big* > difference when random reading in particular. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20337) Update the doc on how to setup shortcircuit reads; its stale
[ https://issues.apache.org/jira/browse/HBASE-20337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-20337: -- Resolution: Fixed Status: Resolved (was: Patch Available) Resolving as pushed. Pushed an addendum too with pointers to the advanced configs for upping cache and buffer sizes. > Update the doc on how to setup shortcircuit reads; its stale > > > Key: HBASE-20337 > URL: https://issues.apache.org/jira/browse/HBASE-20337 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-20337.master.001.patch, > HBASE-20337.master.002.patch > > > The doc is from another era. Update it. Short-circuit reads can make a *big* > difference when random reading in particular. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20337) Update the doc on how to setup shortcircuit reads; its stale
[ https://issues.apache.org/jira/browse/HBASE-20337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425982#comment-16425982 ] stack commented on HBASE-20337: --- Pushed the patch but added a note on setting the checksum flag to off when reading, as hbase does its own checksumming (to save on i/os -- added links to the existing explanation of why hbase does its own checksumming). Thanks to [~ram_krish] for the reminder. Also added pointers to the short-circuit HDFS issue, since it's a good read, and to Colin's overview blog post from the rollout. > Update the doc on how to setup shortcircuit reads; its stale > > > Key: HBASE-20337 > URL: https://issues.apache.org/jira/browse/HBASE-20337 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-20337.master.001.patch, > HBASE-20337.master.002.patch > > > The doc is from another era. Update it. Short-circuit reads can make a big > difference. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
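As a rough illustration of the settings the comment above is talking about, a short-circuit read setup typically looks something like the hbase-site.xml fragment below. This is a sketch, not the text of the patch: the property names are the stock HDFS/HBase short-circuit ones, and the socket path is only an example.

```xml
<!-- Illustrative sketch only; the patched ref-guide section is authoritative. -->
<property>
  <!-- Enable short-circuit local reads (DFSClient reads blocks directly). -->
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <!-- Domain socket shared by the DataNode and the DFSClient; example path. -->
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
<property>
  <!-- Skip HDFS checksumming on read; HBase checksums its own blocks,
       which saves an extra i/o per read (the point made in the comment). -->
  <name>dfs.client.read.shortcircuit.skip.checksum</name>
  <value>true</value>
</property>
```

These go in both the HDFS and HBase client configuration; restart the RegionServers for the change to take effect.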
[jira] [Updated] (HBASE-20337) Update the doc on how to setup shortcircuit reads; its stale
[ https://issues.apache.org/jira/browse/HBASE-20337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-20337: -- Description: The doc is from another era. Update it. Short-circuit reads can make a *big* difference when random reading in particular. (was: The doc is from another era. Update it. Short-circuit reads can make a big difference.) > Update the doc on how to setup shortcircuit reads; its stale > > > Key: HBASE-20337 > URL: https://issues.apache.org/jira/browse/HBASE-20337 > Project: HBase > Issue Type: Sub-task >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-20337.master.001.patch, > HBASE-20337.master.002.patch > > > The doc is from another era. Update it. Short-circuit reads can make a *big* > difference when random reading in particular. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-17730) [DOC] Migration to 2.0 for coprocessors
[ https://issues.apache.org/jira/browse/HBASE-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425974#comment-16425974 ] stack commented on HBASE-17730: --- [~appy] What [~psomogyi] said: missing ASF license. > [DOC] Migration to 2.0 for coprocessors > > > Key: HBASE-17730 > URL: https://issues.apache.org/jira/browse/HBASE-17730 > Project: HBase > Issue Type: Sub-task > Components: documentation, migration >Reporter: Appy >Assignee: Appy >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-17730.master.001.patch > > > Jiras breaking coprocessor compatibility should be marked with the component > 'Coprocessor' and the label 'incompatible'. > Close to releasing 2.0, we should go through all such jiras and write down > steps for migrating coprocessors easily. > The idea is, it might be very hard to fix coprocessor breakages by reverse > engineering errors, but it will be easier if we suggest the easiest way to fix > breakages resulting from each individual incompatible change. > For example, HBASE-17312 is an incompatible change. It'll result in 100s of errors > because the BaseXXXObserver classes are gone and will probably result in a lot of > confusion, but if we explicitly mention the fix, which is just a one-line change > - replace "Foo extends BaseXXXObserver" with "Foo implements XXXObserver" - > it makes it very easy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
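The one-line migration described in that issue can be sketched with a plain-Java analogy (hypothetical `Observer`/`MyObserver` names, not the real HBase types): in 1.x a coprocessor extended an abstract adapter class that supplied no-op hooks, while in 2.0 the observer interfaces carry default methods, so only the class declaration changes.

```java
// Hypothetical stand-in for a 2.0-style observer interface: the interface
// itself supplies default no-op hooks, taking over the role the abstract
// BaseXXXObserver adapter classes played in 1.x.
interface Observer {
    default String preGetOp() { return "no-op"; }  // default hook
}

// Before (1.x): class MyObserver extends BaseRegionObserver { ... }
// After (2.0) - the one-line change is in the class declaration:
class MyObserver implements Observer {
    @Override
    public String preGetOp() { return "handled"; } // override only what you need
}
```

The rest of the class body is untouched by the migration; unimplemented hooks fall through to the interface defaults instead of the old adapter's no-ops.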
[jira] [Commented] (HBASE-20276) [shell] Revert shell REPL change and document
[ https://issues.apache.org/jira/browse/HBASE-20276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425962#comment-16425962 ] Mike Drob commented on HBASE-20276: --- why the changes in get_splits? > [shell] Revert shell REPL change and document > - > > Key: HBASE-20276 > URL: https://issues.apache.org/jira/browse/HBASE-20276 > Project: HBase > Issue Type: Sub-task > Components: documentation, shell >Affects Versions: 1.4.0, 2.0.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Blocker > Fix For: 1.4.4, 2.0.0 > > Attachments: HBASE-20276.0.patch > > > Feedback from [~mdrob] on HBASE-19158: > {quote} > Shell: > HBASE-19770. There was another issue opened where this was identified as a > problem so maybe the shape will change further, but I can't find it now. > {quote} > New commentary from [~busbey]: > This was a follow on to HBASE-15965. That change effectively makes it so none > of our ruby wrappers can be used to build expressions in an interactive REPL. > This is a pretty severe change (most of my tips on HBASE-15611 will break, I > think). > I think we should > a) Have a DISCUSS thread, spanning dev@ and user@ > b) based on the outcome of that thread, either default to the new behavior or > the old behavior > c) if we keep the HBASE-15965 behavior as the default, flag it as > incompatible, call it out in the hbase 2.0 upgrade section, and update docs > (two examples: the output in the shell_exercises sections would be wrong, and > the _table_variables section won't work) > d) In either case document the new flag in the ref guide -- This message was sent by Atlassian JIRA (v7.6.3#76005)