[jira] [Commented] (HBASE-8871) The region server can crash at startup
[ https://issues.apache.org/jira/browse/HBASE-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700500#comment-13700500 ] Nicolas Liochon commented on HBASE-8871: committed, thanks for the review Stack! The region server can crash at startup -- Key: HBASE-8871 URL: https://issues.apache.org/jira/browse/HBASE-8871 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.0, 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Attachments: 8871.v1.patch I have this stack when I start a fresh region server. 5% of the time I would say (per region server). {code} 2013-07-04 12:00:22,609 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Initialization of RS failed. Hence aborting RS. java.util.ConcurrentModificationException at java.util.Hashtable$Enumerator.next(Hashtable.java:1200) at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:1820) at org.apache.hadoop.hbase.zookeeper.ZKConfig.makeZKProps(ZKConfig.java:92) at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:267) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:158) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:133) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:667) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:647) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:778) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,614 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,614 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Initialization of RS failed. Hence aborting RS. 
2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:798) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,617 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-07-04 12:00:22,767 INFO [main] regionserver.ShutdownHook: Installed shutdown hook thread: Shutdownhook:regionserver60020 2013-07-04 12:00:22,768 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting java.lang.RuntimeException: HRegionServer Aborted at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66) at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:78) at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2309) 2013-07-04 12:00:22,770 INFO [Thread-4] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@21f0dbb9 {code} There is one bug here in the region server: we should not start the snapshot handler if the server is supposed to stop: {code}
// start the snapshot handler, since the server is ready to run
this.snapshotManager.start();
{code} and the root cause is here in ZKConfig: {code}
for (Entry<String, String> entry : conf) { // <=== BUG
  String key = entry.getKey();
  if (key.startsWith(HConstants.ZK_CFG_PROPERTY_PREFIX)) {
    String zkKey = key.substring(HConstants.ZK_CFG_PROPERTY_PREFIX_LEN);
    String value = entry.getValue();
    // If the value has variable substitutions, need to do a get.
    if (value.contains(VARIABLE_START)) {
      value = conf.get(key);
    }
    zkProperties.put(zkKey, value);
  }
}
{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
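The ConcurrentModificationException in the stack above is the standard fail-fast behavior you get when one thread iterates a Hashtable-backed Configuration while another thread modifies it. A minimal, self-contained sketch of that failure mode (plain JDK code, not HBase; the key names are illustrative):

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Map;

public class CmeDemo {
    // Returns true if mutating the table mid-iteration trips the fail-fast
    // iterator, which is what happens to ZKConfig.makeZKProps when another
    // thread updates the shared Configuration concurrently.
    static boolean triggersCme() {
        Hashtable<String, String> props = new Hashtable<>();
        props.put("hbase.zookeeper.property.clientPort", "2181");
        props.put("hbase.zookeeper.quorum", "localhost");
        try {
            for (Map.Entry<String, String> e : props.entrySet()) {
                // structural modification while the iterator is live
                props.put("extra.key", "added-mid-iteration");
            }
        } catch (ConcurrentModificationException cme) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("CME triggered: " + triggersCme());
    }
}
```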
[jira] [Updated] (HBASE-8871) The region server can crash at startup
[ https://issues.apache.org/jira/browse/HBASE-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-8871: --- Resolution: Fixed Fix Version/s: 0.95.2 0.98.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) The region server can crash at startup -- Key: HBASE-8871 URL: https://issues.apache.org/jira/browse/HBASE-8871 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.0, 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.95.2 Attachments: 8871.v1.patch I have this stack when I start a fresh region server. 5% of the time I would say (per region server). {code} 2013-07-04 12:00:22,609 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Initialization of RS failed. Hence aborting RS. java.util.ConcurrentModificationException at java.util.Hashtable$Enumerator.next(Hashtable.java:1200) at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:1820) at org.apache.hadoop.hbase.zookeeper.ZKConfig.makeZKProps(ZKConfig.java:92) at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:267) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:158) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:133) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:667) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:647) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:778) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,614 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,614 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Initialization of RS failed. Hence aborting RS. 
2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:798) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,617 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-07-04 12:00:22,767 INFO [main] regionserver.ShutdownHook: Installed shutdown hook thread: Shutdownhook:regionserver60020 2013-07-04 12:00:22,768 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting java.lang.RuntimeException: HRegionServer Aborted at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66) at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:78) at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2309) 2013-07-04 12:00:22,770 INFO [Thread-4] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@21f0dbb9 {code} There is one bug here in the region server: we should not start the snapshot handler if the server is supposed to stop: {code}
// start the snapshot handler, since the server is ready to run
this.snapshotManager.start();
{code} and the root cause is here in ZKConfig: {code}
for (Entry<String, String> entry : conf) { // <=== BUG
  String key = entry.getKey();
  if (key.startsWith(HConstants.ZK_CFG_PROPERTY_PREFIX)) {
    String zkKey = key.substring(HConstants.ZK_CFG_PROPERTY_PREFIX_LEN);
    String value = entry.getValue();
    // If the value has variable substitutions, need to do a get.
    if (value.contains(VARIABLE_START)) {
      value = conf.get(key);
    }
    zkProperties.put(zkKey, value);
  }
}
{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8867) HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension
[ https://issues.apache.org/jira/browse/HBASE-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700502#comment-13700502 ] Nicolas Liochon commented on HBASE-8867: Committed, thanks for the review Ted! HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension Key: HBASE-8867 URL: https://issues.apache.org/jira/browse/HBASE-8867 Project: HBase Issue Type: Bug Components: MTTR Reporter: Nicolas Liochon Assignee: Nicolas Liochon Attachments: 8867.v1.patch + it could reuse some existing code -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
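For context, the -splitting extension in the issue title is the suffix appended to a WAL directory name while its logs are being split, which the server-name parsing missed. A hedged sketch of the normalization the fix needs (the helper and constant names are illustrative, not the actual HLogUtil code):

```java
public class SplittingSuffix {
    // Suffix appended to a WAL directory while its logs are being split.
    static final String SPLITTING_EXT = "-splitting";

    // Illustrative helper: drop the -splitting extension so the server name
    // embedded in the directory name can be parsed either way.
    static String stripSplittingExtension(String dirName) {
        return dirName.endsWith(SPLITTING_EXT)
            ? dirName.substring(0, dirName.length() - SPLITTING_EXT.length())
            : dirName;
    }

    public static void main(String[] args) {
        System.out.println(stripSplittingExtension("host,60020,1372939221819-splitting"));
    }
}
```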
[jira] [Updated] (HBASE-8867) HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension
[ https://issues.apache.org/jira/browse/HBASE-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-8867: --- Resolution: Fixed Fix Version/s: 0.95.2 0.98.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension Key: HBASE-8867 URL: https://issues.apache.org/jira/browse/HBASE-8867 Project: HBase Issue Type: Bug Components: MTTR Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.95.2 Attachments: 8867.v1.patch + it could reuse some existing code -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7840) Enhance the java it framework to start stop a distributed hbase hadoop cluster
[ https://issues.apache.org/jira/browse/HBASE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700503#comment-13700503 ] Nicolas Liochon commented on HBASE-7840: bq. do you think it reasonable to expect that hbase user have sudo on a test rig
I think it's reasonable to ask for it. I will see what I can do. But I will need to commit something sometimes :-) Enhance the java it framework to start stop a distributed hbase hadoop cluster --- Key: HBASE-7840 URL: https://issues.apache.org/jira/browse/HBASE-7840 Project: HBase Issue Type: New Feature Components: test Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Minor Fix For: 0.95.2 Attachments: 7840.v1.patch, 7840.v3.patch, 7840.v4.patch Needs are to use a development version of HBase & HDFS 1 & 2. Ideally, it should be nicely backportable to 0.94 to allow comparisons and regression tests between versions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8729) distributedLogReplay may hang during chained region server failure
[ https://issues.apache.org/jira/browse/HBASE-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700508#comment-13700508 ] Jeffrey Zhong commented on HBASE-8729: -- Thanks [~saint@gmail.com] and [~te...@apache.org] for the reviews! I've integrated the change into the 0.95 and trunk branches. {quote} rather than ask if lock is 'locked' and unlock it if current thread matches the lock 'owner'? {quote} needReleaseLock is a local boolean variable, so other threads won't see the value set by the current thread. Therefore other threads won't release splitLogLock without really owning it. The downside of checking the lock 'owner' is that it needs to cast splitLogLock down to the ReentrantLock type. distributedLogReplay may hang during chained region server failure -- Key: HBASE-8729 URL: https://issues.apache.org/jira/browse/HBASE-8729 Project: HBase Issue Type: Bug Components: MTTR Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: 8729-v2.patch, hbase-8729.patch, hbase-8729-v3.patch, hbase-8729-v4.patch, hbase-8729-v5.patch In a test, half the cluster (in terms of region servers) was down and some log replay had incurred chained RS failures (the receiving RS of a log replay failed again). By default, we only allow 3 concurrent SSH handlers, controlled by: {code}this.executorService.startExecutorService(ExecutorType.MASTER_SERVER_OPERATIONS, conf.getInt("hbase.master.executor.serverops.threads", 3));{code} If all 3 SSH handlers are doing logReplay (a blocking call) and one of the receiving RSs fails again, then logReplay will hang, because regions of the newly failed RS can't be re-assigned to another live RS (no SSH handler will be processed due to the max threads setting) and the existing log replay will keep routing replay traffic to the dead RS. The fix is to submit logReplay work into a separate type of executor queue in order not to block SSH region assignment, so that logReplay can route traffic to a live RS after retries and move forward. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
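The hang described in this issue is classic thread-pool starvation: every worker blocks on an event that only another task queued on the same bounded pool can produce. A self-contained sketch (plain java.util.concurrent, not HBase code; the 3-thread pool mirrors the default hbase.master.executor.serverops.threads):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class StarvationDemo {
    // Returns true when the queued "assignment" task never runs because all
    // 3 pool threads are blocked in "log replay" waiting for it.
    static boolean starves() throws InterruptedException {
        ExecutorService handlers = Executors.newFixedThreadPool(3);
        CountDownLatch regionsAssigned = new CountDownLatch(1);
        for (int i = 0; i < 3; i++) {
            handlers.submit(() -> {
                try {
                    regionsAssigned.await(); // blocking "log replay" call
                } catch (InterruptedException ignored) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // The task that would unblock the handlers is stuck behind them in the queue.
        handlers.submit(regionsAssigned::countDown);
        boolean assigned = regionsAssigned.await(500, TimeUnit.MILLISECONDS);
        handlers.shutdownNow(); // interrupt the stuck handlers
        return !assigned;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("starved: " + starves());
    }
}
```

The fix in this issue amounts to moving the blocking replay work onto a separate executor type, so the unblocking assignment task can always run.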
[jira] [Commented] (HBASE-8872) Can't launch the integration tests without hijacking the classpath
[ https://issues.apache.org/jira/browse/HBASE-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700535#comment-13700535 ] Jeffrey Zhong commented on HBASE-8872: -- The same issue was opened at HBASE-8200. We should fix this because I also hit this issue a few times. Can't launch the integration tests without hijacking the classpath -- Key: HBASE-8872 URL: https://issues.apache.org/jira/browse/HBASE-8872 Project: HBase Issue Type: Bug Components: scripts, test Affects Versions: 0.98.0, 0.95.1 Reporter: Nicolas Liochon The doc says in 16.7.5.2, "Running integration tests against distributed cluster": the configuration will be picked up by the bin/hbase script. On trunk I have this stack {noformat} bin/hbase org.apache.hadoop.hbase.IntegrationTestsDriver IntegrationTestRecoveryEmptyTableCleanStopBox Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/IntegrationTestsDriver Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.IntegrationTestsDriver at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) Could not find the main class: org.apache.hadoop.hbase.IntegrationTestsDriver. Program will exit. {noformat} I work around this by using: export HBASE_CLASSPATH=~/hbase/hbase-it/target/test-classes/ But 1) It's a workaround, not a fix. 2) It may (or does) not work when we're using the packaged version of hbase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8729) distributedLogReplay may hang during chained region server failure
[ https://issues.apache.org/jira/browse/HBASE-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-8729: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) distributedLogReplay may hang during chained region server failure -- Key: HBASE-8729 URL: https://issues.apache.org/jira/browse/HBASE-8729 Project: HBase Issue Type: Bug Components: MTTR Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: 8729-v2.patch, hbase-8729.patch, hbase-8729-v3.patch, hbase-8729-v4.patch, hbase-8729-v5.patch In a test, half the cluster (in terms of region servers) was down and some log replay had incurred chained RS failures (the receiving RS of a log replay failed again). By default, we only allow 3 concurrent SSH handlers, controlled by: {code}this.executorService.startExecutorService(ExecutorType.MASTER_SERVER_OPERATIONS, conf.getInt("hbase.master.executor.serverops.threads", 3));{code} If all 3 SSH handlers are doing logReplay (a blocking call) and one of the receiving RSs fails again, then logReplay will hang, because regions of the newly failed RS can't be re-assigned to another live RS (no SSH handler will be processed due to the max threads setting) and the existing log replay will keep routing replay traffic to the dead RS. The fix is to submit logReplay work into a separate type of executor queue in order not to block SSH region assignment, so that logReplay can route traffic to a live RS after retries and move forward. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8810) Bring in code constants in line with default xml's
[ https://issues.apache.org/jira/browse/HBASE-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700544#comment-13700544 ] Nicolas Liochon commented on HBASE-8810: I've written this {code}
package org.apache.hadoop.hbase;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.junit.Test;
import org.junit.experimental.categories.Category;

import java.lang.reflect.Field;
import java.util.Map;

@Category(SmallTests.class)
public class TestDefaultSettings {
  public static final Log LOG = LogFactory.getLog(TestDefaultSettings.class);

  private String codeNameFromXMLName(String xmlName) {
    String codeName = "DEFAULT_" + xmlName.toUpperCase().replace('.', '_').trim();
    return codeName;
  }

  @Test
  public void testDefaultSettings() throws IllegalAccessException {
    Configuration codeConf = new Configuration();
    Configuration xmlConf = HBaseConfiguration.create();
    Class<?> hcc = HConstants.class;
    for (Map.Entry<String, String> e : xmlConf) {
      String xmlName = e.getKey();
      String codeName = codeNameFromXMLName(xmlName);
      try {
        Field f = hcc.getField(codeName);
        String codeVal = ("" + f.get(null)).trim();
        if (e.getValue().equals(codeVal)) {
          System.out.println("OK xmlName: " + xmlName + ", codeConf: code=" + codeVal + " xml=" + e.getValue());
        } else {
          System.err.println("NOK xmlName: " + xmlName + ", codeConf: code=" + codeVal + " xml=" + e.getValue());
        }
      } catch (NoSuchFieldException e1) {
        System.out.println("NoSuchFieldException: xmlName: " + xmlName + ", codeConf: code=" + codeName + " xml=" + e.getKey());
      }
    }
  }
}
{code} But it's not a huge success, because the naming differs in most cases. 
Bring in code constants in line with default xml's -- Key: HBASE-8810 URL: https://issues.apache.org/jira/browse/HBASE-8810 Project: HBase Issue Type: Bug Components: Usability Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 0.95.2 Attachments: 8810.txt, 8810v2.txt, hbase-default_to_java_constants.xsl, HBaseDefaultXMLConstants.java After the defaults were changed in the xml some constants were left the same. DEFAULT_HBASE_CLIENT_PAUSE for example. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8873) minor concurrent issue about filesCompacting
Liang Xie created HBASE-8873: Summary: minor concurrent issue about filesCompacting Key: HBASE-8873 URL: https://issues.apache.org/jira/browse/HBASE-8873 Project: HBase Issue Type: Bug Reporter: Liang Xie Assignee: Liang Xie -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8873) minor concurrent issue about filesCompacting
[ https://issues.apache.org/jira/browse/HBASE-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HBASE-8873: - Description: I am reading the compaction code; it seems there's a minor thread-safety issue on filesCompacting in both 0.94 and trunk: we guard it with synchronized/lock everywhere except in the needsCompaction() function. The fix should be something like this: synchronized (filesCompacting) { ... } minor concurrent issue about filesCompacting Key: HBASE-8873 URL: https://issues.apache.org/jira/browse/HBASE-8873 Project: HBase Issue Type: Bug Reporter: Liang Xie Assignee: Liang Xie I am reading the compaction code; it seems there's a minor thread-safety issue on filesCompacting in both 0.94 and trunk: we guard it with synchronized/lock everywhere except in the needsCompaction() function. The fix should be something like this: synchronized (filesCompacting) { ... } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
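A toy model of the guard the description proposes (illustrative class and names, not the actual Store code): every access to the shared list, including the needsCompaction() check, synchronizes on the same monitor.

```java
import java.util.ArrayList;
import java.util.List;

class CompactionTracker {
    // Shared mutable state; every access below holds the list's own monitor,
    // matching how the rest of the compaction code guards filesCompacting.
    private final List<String> filesCompacting = new ArrayList<>();

    void markCompacting(String file) {
        synchronized (filesCompacting) {
            filesCompacting.add(file);
        }
    }

    // The proposed fix: the check takes the same lock instead of reading unguarded.
    boolean needsCompaction(int storefileCount, int compactionThreshold) {
        synchronized (filesCompacting) {
            return (storefileCount - filesCompacting.size()) > compactionThreshold;
        }
    }
}
```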
[jira] [Commented] (HBASE-7623) Username is not available for HConnectionManager to use in HConnectionKey
[ https://issues.apache.org/jira/browse/HBASE-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700580#comment-13700580 ] yogesh bedekar commented on HBASE-7623: --- Hi, I am using hbase-0.94.8.jar and getting the following exception while trying to connect - WARN client.HConnectionManager: Error obtaining current user, skipping username in HConnectionKey java.io.IOException: failure to login at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:490) at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:452) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37) at org.apache.hadoop.hbase.security.User.call(User.java:607) at org.apache.hadoop.hbase.security.User.callStatic(User.java:597) at org.apache.hadoop.hbase.security.User.access$400(User.java:51) at org.apache.hadoop.hbase.security.User$SecureHadoopUser.init(User.java:414) at org.apache.hadoop.hbase.security.User$SecureHadoopUser.init(User.java:409) at org.apache.hadoop.hbase.security.User.getCurrent(User.java:157) What is the solution to this problem? I have embedded this hbase jar in an osgi bundle and am unable to connect to my HBase database. Thanks in advance for any help. Rgds, Yogesh Bedekar Technical Lead Oracle Username is not available for HConnectionManager to use in HConnectionKey - Key: HBASE-7623 URL: https://issues.apache.org/jira/browse/HBASE-7623 Project: HBase Issue Type: Improvement Components: Client, security Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Attachments: trunk-7623.patch Sometimes, some non-IOException prevents User.getCurrent() to get a username. 
It makes it impossible to create an HConnection. We should catch all exceptions here: {noformat}
try {
  User currentUser = User.getCurrent();
  if (currentUser != null) {
    username = currentUser.getName();
  }
} catch (IOException ioe) {
  LOG.warn("Error obtaining current user, skipping username in HConnectionKey", ioe);
}
{noformat} Not just IOException, so that the client can move forward. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
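The proposed change is just to widen that catch. A standalone sketch of the idea (the Callable stands in for User.getCurrent(), which is not reproduced here): any exception, checked or not, results in a null username instead of a failed HConnection.

```java
import java.util.concurrent.Callable;

class SafeUsername {
    // Broadened from catch (IOException): any failure to resolve the current
    // user leaves username null, so HConnectionKey construction can proceed.
    static String usernameOrNull(Callable<String> getCurrentUser) {
        try {
            return getCurrentUser.call();
        } catch (Exception e) { // was: catch (IOException ioe)
            return null; // skip the username rather than abort connection setup
        }
    }
}
```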
[jira] [Updated] (HBASE-8873) minor concurrent issue about filesCompacting
[ https://issues.apache.org/jira/browse/HBASE-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HBASE-8873: - Affects Version/s: 0.95.1 0.94.9 Status: Patch Available (was: Open) minor concurrent issue about filesCompacting Key: HBASE-8873 URL: https://issues.apache.org/jira/browse/HBASE-8873 Project: HBase Issue Type: Bug Affects Versions: 0.94.9, 0.95.1 Reporter: Liang Xie Assignee: Liang Xie Attachments: HBase-8873-0.94.txt i am reading compaction code, seems there's a minor thread-safe issue on filesCompacting in both 0.94 and trunk, we guard it with synchronized/lock, except needsCompaction() function. and the fix should sth like this: synchronized (filesCompacting) { ... } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8873) minor concurrent issue about filesCompacting
[ https://issues.apache.org/jira/browse/HBASE-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HBASE-8873: - Attachment: HBase-8873-0.94.txt minor concurrent issue about filesCompacting Key: HBASE-8873 URL: https://issues.apache.org/jira/browse/HBASE-8873 Project: HBase Issue Type: Bug Reporter: Liang Xie Assignee: Liang Xie Attachments: HBase-8873-0.94.txt i am reading compaction code, seems there's a minor thread-safe issue on filesCompacting in both 0.94 and trunk, we guard it with synchronized/lock, except needsCompaction() function. and the fix should sth like this: synchronized (filesCompacting) { ... } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8873) minor concurrent issue about filesCompacting
[ https://issues.apache.org/jira/browse/HBASE-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700581#comment-13700581 ] Hadoop QA commented on HBASE-8873: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590983/HBase-8873-0.94.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6221//console This message is automatically generated. minor concurrent issue about filesCompacting Key: HBASE-8873 URL: https://issues.apache.org/jira/browse/HBASE-8873 Project: HBase Issue Type: Bug Affects Versions: 0.95.1, 0.94.9 Reporter: Liang Xie Assignee: Liang Xie Attachments: HBase-8873-0.94.txt i am reading compaction code, seems there's a minor thread-safe issue on filesCompacting in both 0.94 and trunk, we guard it with synchronized/lock, except needsCompaction() function. and the fix should sth like this: synchronized (filesCompacting) { ... } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8867) HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension
[ https://issues.apache.org/jira/browse/HBASE-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700589#comment-13700589 ] Hudson commented on HBASE-8867: --- Integrated in hbase-0.95 #289 (See [https://builds.apache.org/job/hbase-0.95/289/]) HBASE-8867 HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension (Revision 1499924) Result = FAILURE nkeywal : Files : * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension Key: HBASE-8867 URL: https://issues.apache.org/jira/browse/HBASE-8867 Project: HBase Issue Type: Bug Components: MTTR Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.95.2 Attachments: 8867.v1.patch + it could reuse some existing code -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8874) PutCombiner is skipping KeyValues while combing puts of same row during bulkload
rajeshbabu created HBASE-8874: - Summary: PutCombiner is skipping KeyValues while combing puts of same row during bulkload Key: HBASE-8874 URL: https://issues.apache.org/jira/browse/HBASE-8874 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.95.1, 0.95.0 Reporter: rajeshbabu Assignee: rajeshbabu Priority: Critical Fix For: 0.98.0, 0.95.2 While combining puts of the same row in the map phase, we use the logic below in PutCombiner#reduce. The first time through the for loop we add one Put object to the puts map. From then on we just override the key values of a family with the key values of the same family in the other put. So we mostly write one Put object to the map output and the remaining ones are skipped (data loss). {code}
Map<byte[], Put> puts = new TreeMap<byte[], Put>(Bytes.BYTES_COMPARATOR);
for (Put p : vals) {
  cnt++;
  if (!puts.containsKey(p.getRow())) {
    puts.put(p.getRow(), p);
  } else {
    puts.get(p.getRow()).getFamilyMap().putAll(p.getFamilyMap());
  }
}
{code} We need to change the logic to something like below, because we are sure the rowkey of all the puts will be the same. {code}
Put finalPut = null;
Map<byte[], List<? extends Cell>> familyMap = null;
for (Put p : vals) {
  cnt++;
  if (finalPut == null) {
    finalPut = p;
    familyMap = finalPut.getFamilyMap();
  } else {
    for (Entry<byte[], List<? extends Cell>> entry : p.getFamilyMap().entrySet()) {
      List<? extends Cell> list = familyMap.get(entry.getKey());
      if (list == null) {
        familyMap.put(entry.getKey(), entry.getValue());
      } else {
        ((List<KeyValue>) list).addAll((List<KeyValue>) entry.getValue());
      }
    }
  }
}
context.write(row, finalPut);
{code} Also need to implement the TODOs mentioned by Nick {code}
// TODO: would be better if we knew <code>K row</code> and Put rowkey were
// identical. Then this whole Put buffering business goes away.
// TODO: Could use HeapSize to create an upper bound on the memory size of
// the puts map and flush some portion of the content while looping. This
// flush could result in multiple Puts for a single rowkey. That is
// acceptable because Combiner is run as an optimization and it's not
// critical that all Puts are grouped perfectly.
{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
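The heart of the fix above is list concatenation instead of map replacement. A toy model with plain string lists (not the real Put/Cell types) showing why a blanket putAll drops data while per-family addAll keeps every value:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FamilyMerge {
    // Buggy shape: putAll replaces the whole list for a family,
    // discarding the values accumulated so far.
    static void mergeWithPutAll(Map<String, List<String>> into, Map<String, List<String>> from) {
        into.putAll(from);
    }

    // Fixed shape: concatenate per-family lists so no values are lost.
    static void mergeWithAddAll(Map<String, List<String>> into, Map<String, List<String>> from) {
        for (Map.Entry<String, List<String>> e : from.entrySet()) {
            List<String> existing = into.get(e.getKey());
            if (existing == null) {
                into.put(e.getKey(), new ArrayList<>(e.getValue()));
            } else {
                existing.addAll(e.getValue());
            }
        }
    }
}
```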
[jira] [Commented] (HBASE-8729) distributedLogReplay may hang during chained region server failure
[ https://issues.apache.org/jira/browse/HBASE-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700590#comment-13700590 ] Hudson commented on HBASE-8729: --- Integrated in hbase-0.95 #289 (See [https://builds.apache.org/job/hbase-0.95/289/]) HBASE-8729: distributedLogReplay may hang during chained region server failure (Revision 1499926) Result = FAILURE jeffreyz : Files : * /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/executor/EventType.java * /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java * /hbase/branches/0.95/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestDataIngestWithChaosMonkey.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/LogReplayHandler.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEditsReplaySink.java distributedLogReplay may hang during chained region server failure -- Key: HBASE-8729 URL: https://issues.apache.org/jira/browse/HBASE-8729 Project: HBase Issue Type: Bug Components: MTTR Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: 8729-v2.patch, hbase-8729.patch, hbase-8729-v3.patch, hbase-8729-v4.patch, hbase-8729-v5.patch In a test, half cluster(in terms of 
region servers) was down and some log replay had incurred chained RS failures (the receiving RS of a log replay failed again). By default we only allow 3 concurrent SSH handlers, controlled by {code}this.executorService.startExecutorService(ExecutorType.MASTER_SERVER_OPERATIONS, conf.getInt("hbase.master.executor.serverops.threads", 3));{code} If all 3 SSH handlers are doing logReplay (a blocking call) and one of the receiving RSs fails again, then logReplay will hang because regions of the newly failed RS can't be re-assigned to another live RS (no SSH handler will be processed due to the max threads setting) and the existing log replay will keep routing replay traffic to the dead RS. The fix is to submit logReplay work into a separate type of executor queue so as not to block SSH region assignment, so that logReplay can route traffic to a live RS after retries and move forward. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
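The hang pattern described here is a general one: a bounded pool whose tasks block on work that can only run on that same pool starves itself. The following self-contained sketch illustrates the failure mode and the fix (a separate executor); it is not HBase's actual SSH/logReplay code, and all names are hypothetical:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolStarvationSketch {

    // Submits a "replay" task that blocks until a "reassign" task has run.
    // Returns true iff the reassignment actually happened within the window.
    static boolean runScenario(ExecutorService replayPool, ExecutorService reassignPool) {
        CountDownLatch reassigned = new CountDownLatch(1);
        // "logReplay": occupies a worker until the region is reassigned (or times out).
        replayPool.submit(() -> {
            try {
                reassigned.await(2, TimeUnit.SECONDS);
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
        });
        // "region reassignment": the work that would unblock the replay.
        reassignPool.submit(reassigned::countDown);
        try {
            return reassigned.await(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // One shared worker: the blocking replay holds it, the reassignment
        // is stuck in the queue behind it -> starvation.
        ExecutorService shared = Executors.newFixedThreadPool(1);
        System.out.println("same pool completes: " + runScenario(shared, shared));
        shared.shutdownNow();

        // Separate executors: the reassignment runs, the replay unblocks.
        ExecutorService replay = Executors.newFixedThreadPool(1);
        ExecutorService reassign = Executors.newFixedThreadPool(1);
        System.out.println("separate pools complete: " + runScenario(replay, reassign));
        replay.shutdownNow();
        reassign.shutdownNow();
    }
}
```

With the shared single-thread pool the scenario reports false (starved); with two pools it reports true, which is the essence of moving logReplay onto its own executor type.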
[jira] [Commented] (HBASE-8871) The region server can crash at startup
[ https://issues.apache.org/jira/browse/HBASE-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700588#comment-13700588 ] Hudson commented on HBASE-8871: --- Integrated in hbase-0.95 #289 (See [https://builds.apache.org/job/hbase-0.95/289/]) HBASE-8871 The region server can crash at startup (Revision 1499922) Result = FAILURE nkeywal : Files : * /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java The region server can crash at startup -- Key: HBASE-8871 URL: https://issues.apache.org/jira/browse/HBASE-8871 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.0, 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.95.2 Attachments: 8871.v1.patch I have this stack when I start a fresh region server. 5% of the time I would say (per region server). {code} 2013-07-04 12:00:22,609 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Initialization of RS failed. Hence aborting RS. 
java.util.ConcurrentModificationException at java.util.Hashtable$Enumerator.next(Hashtable.java:1200) at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:1820) at org.apache.hadoop.hbase.zookeeper.ZKConfig.makeZKProps(ZKConfig.java:92) at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:267) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:158) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:667) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:647) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:778) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,614 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,614 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Initialization of RS failed. Hence aborting RS. 
2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:798) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,617 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-07-04 12:00:22,767 INFO [main] regionserver.ShutdownHook: Installed shutdown hook thread: Shutdownhook:regionserver60020 2013-07-04 12:00:22,768 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting java.lang.RuntimeException: HRegionServer Aborted at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66) at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:78) at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2309) 2013-07-04 12:00:22,770 INFO [Thread-4] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@21f0dbb9 {code} There is one bug here in the region server: we should not start the snapshot if the machine is supposed to stop: {code} // start the snapshot handler, since the server is ready to run this.snapshotManager.start(); {code} and the root cause is here in ZKConfig: {code} for (Entry<String, String> entry : conf) { // <=== BUG String key = entry.getKey(); if (key.startsWith(HConstants.ZK_CFG_PROPERTY_PREFIX)) { String zkKey = key.substring(HConstants.ZK_CFG_PROPERTY_PREFIX_LEN); String value = entry.getValue(); // If the value has variables substitutions, need to do a get. if (value.contains(VARIABLE_START)) { value = conf.get(key); } zkProperties.put(zkKey, value); } {code}
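The ConcurrentModificationException in the stack above comes from iterating a Configuration, which is backed by a Hashtable, while the underlying table is structurally modified (for example by another thread touching the shared Configuration). Hashtable's iterators are fail-fast, which is easy to reproduce with a bare Hashtable; this sketch is illustrative only, not the HBase code path, and the key names are arbitrary:

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Map;

public class HashtableCmeSketch {

    // Iterating a Hashtable's entry set while the table is structurally
    // modified invalidates the iterator, just like the Configuration
    // iteration in ZKConfig.makeZKProps when the conf changes underneath it.
    static boolean triggersCme() {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("hbase.zookeeper.quorum", "localhost");
        table.put("hbase.zookeeper.property.clientPort", "2181");
        try {
            for (Map.Entry<String, String> e : table.entrySet()) {
                // Structural modification mid-iteration: the next call to
                // the iterator's next() throws ConcurrentModificationException.
                table.put("injected." + e.getKey(), "x");
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME observed: " + triggersCme());
    }
}
```

The usual remedies are to iterate over a snapshot of the entries or to synchronize iteration with writers, which is why the fix touches how ZKConfig walks the configuration.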
[jira] [Commented] (HBASE-8874) PutCombiner is skipping KeyValues while combing puts of same row during bulkload
[ https://issues.apache.org/jira/browse/HBASE-8874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700591#comment-13700591 ] rajeshbabu commented on HBASE-8874: --- I have a basic working patch. I will implement the above TODOs and upload the patch on Monday. PutCombiner is skipping KeyValues while combing puts of same row during bulkload Key: HBASE-8874 URL: https://issues.apache.org/jira/browse/HBASE-8874 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.95.0, 0.95.1 Reporter: rajeshbabu Assignee: rajeshbabu Priority: Critical Fix For: 0.98.0, 0.95.2 While combining puts of same row in the map phase we use the below logic in PutCombiner#reduce. In the for loop, the first time we add one Put object to the puts map; from then on we just override the key values of a family with the key values of the same family from another put. So we mostly write one Put object to the map output and the remaining ones are skipped (data loss). {code} Map<byte[], Put> puts = new TreeMap<byte[], Put>(Bytes.BYTES_COMPARATOR); for (Put p : vals) { cnt++; if (!puts.containsKey(p.getRow())) { puts.put(p.getRow(), p); } else { puts.get(p.getRow()).getFamilyMap().putAll(p.getFamilyMap()); } } {code} We need to change the logic to something like the below, because we are sure the rowkey of all the puts will be the same. {code} Put finalPut = null; Map<byte[], List<? extends Cell>> familyMap = null; for (Put p : vals) { cnt++; if (finalPut == null) { finalPut = p; familyMap = finalPut.getFamilyMap(); } else { for (Entry<byte[], List<? extends Cell>> entry : p.getFamilyMap().entrySet()) { List<? extends Cell> list = familyMap.get(entry.getKey()); if (list == null) { familyMap.put(entry.getKey(), entry.getValue()); } else { ((List<KeyValue>) list).addAll((List<KeyValue>) entry.getValue()); } } } } context.write(row, finalPut); {code} Also need to implement the TODOs mentioned by Nick {code} // TODO: would be better if we knew <code>K row</code> and Put rowkey were // identical. Then this whole Put buffering business goes away. // TODO: Could use HeapSize to create an upper bound on the memory size of // the puts map and flush some portion of the content while looping. This // flush could result in multiple Puts for a single rowkey. That is // acceptable because Combiner is run as an optimization and it's not // critical that all Puts are grouped perfectly. {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8729) distributedLogReplay may hang during chained region server failure
[ https://issues.apache.org/jira/browse/HBASE-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700594#comment-13700594 ] Hudson commented on HBASE-8729: --- Integrated in HBase-TRUNK #4216 (See [https://builds.apache.org/job/HBase-TRUNK/4216/]) HBASE-8729: distributedLogReplay may hang during chained region server failure (Revision 1499925) Result = SUCCESS jeffreyz : Files : * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/executor/EventType.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java * /hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestDataIngestWithChaosMonkey.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/LogReplayHandler.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEditsReplaySink.java distributedLogReplay may hang during chained region server failure -- Key: HBASE-8729 URL: https://issues.apache.org/jira/browse/HBASE-8729 Project: HBase Issue Type: Bug Components: MTTR Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: 8729-v2.patch, hbase-8729.patch, hbase-8729-v3.patch, hbase-8729-v4.patch, hbase-8729-v5.patch In a test, half cluster(in terms of region servers) was down and some log replay had incurred chained RS 
failures (the receiving RS of a log replay failed again). By default we only allow 3 concurrent SSH handlers, controlled by {code}this.executorService.startExecutorService(ExecutorType.MASTER_SERVER_OPERATIONS, conf.getInt("hbase.master.executor.serverops.threads", 3));{code} If all 3 SSH handlers are doing logReplay (a blocking call) and one of the receiving RSs fails again, then logReplay will hang because regions of the newly failed RS can't be re-assigned to another live RS (no SSH handler will be processed due to the max threads setting) and the existing log replay will keep routing replay traffic to the dead RS. The fix is to submit logReplay work into a separate type of executor queue so as not to block SSH region assignment, so that logReplay can route traffic to a live RS after retries and move forward. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8867) HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension
[ https://issues.apache.org/jira/browse/HBASE-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700593#comment-13700593 ] Hudson commented on HBASE-8867: --- Integrated in HBase-TRUNK #4216 (See [https://builds.apache.org/job/HBase-TRUNK/4216/]) HBASE-8867 HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension (Revision 1499923) Result = SUCCESS nkeywal : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension Key: HBASE-8867 URL: https://issues.apache.org/jira/browse/HBASE-8867 Project: HBase Issue Type: Bug Components: MTTR Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.95.2 Attachments: 8867.v1.patch + it could reuse some existing code -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8871) The region server can crash at startup
[ https://issues.apache.org/jira/browse/HBASE-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700592#comment-13700592 ] Hudson commented on HBASE-8871: --- Integrated in HBase-TRUNK #4216 (See [https://builds.apache.org/job/HBase-TRUNK/4216/]) HBASE-8871 The region server can crash at startup (Revision 1499921) Result = SUCCESS nkeywal : Files : * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java The region server can crash at startup -- Key: HBASE-8871 URL: https://issues.apache.org/jira/browse/HBASE-8871 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.0, 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.95.2 Attachments: 8871.v1.patch I have this stack when I start a fresh region server. 5% of the time I would say (per region server). {code} 2013-07-04 12:00:22,609 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Initialization of RS failed. Hence aborting RS. 
java.util.ConcurrentModificationException at java.util.Hashtable$Enumerator.next(Hashtable.java:1200) at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:1820) at org.apache.hadoop.hbase.zookeeper.ZKConfig.makeZKProps(ZKConfig.java:92) at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:267) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:158) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:667) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:647) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:778) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,614 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,614 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Initialization of RS failed. Hence aborting RS. 
2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:798) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,617 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-07-04 12:00:22,767 INFO [main] regionserver.ShutdownHook: Installed shutdown hook thread: Shutdownhook:regionserver60020 2013-07-04 12:00:22,768 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting java.lang.RuntimeException: HRegionServer Aborted at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66) at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:78) at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2309) 2013-07-04 12:00:22,770 INFO [Thread-4] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@21f0dbb9 {code} There is one bug here in the region server: we should not start the snapshot if the machine is supposed to stop: {code} // start the snapshot handler, since the server is ready to run this.snapshotManager.start(); {code} and the root cause is here in ZKConfig: {code} for (Entry<String, String> entry : conf) { // <=== BUG String key = entry.getKey(); if (key.startsWith(HConstants.ZK_CFG_PROPERTY_PREFIX)) { String zkKey = key.substring(HConstants.ZK_CFG_PROPERTY_PREFIX_LEN); String value = entry.getValue(); // If the value has variables substitutions, need to do a get. if (value.contains(VARIABLE_START)) { value = conf.get(key); } zkProperties.put(zkKey, value); } {code}
[jira] [Updated] (HBASE-8874) PutCombiner is skipping KeyValues while combining puts of same row during bulkload
[ https://issues.apache.org/jira/browse/HBASE-8874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rajeshbabu updated HBASE-8874: -- Summary: PutCombiner is skipping KeyValues while combining puts of same row during bulkload (was: PutCombiner is skipping KeyValues while combing puts of same row during bulkload) PutCombiner is skipping KeyValues while combining puts of same row during bulkload -- Key: HBASE-8874 URL: https://issues.apache.org/jira/browse/HBASE-8874 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.95.0, 0.95.1 Reporter: rajeshbabu Assignee: rajeshbabu Priority: Critical Fix For: 0.98.0, 0.95.2 While combining puts of same row in the map phase we use the below logic in PutCombiner#reduce. In the for loop, the first time we add one Put object to the puts map; from then on we just override the key values of a family with the key values of the same family from another put. So we mostly write one Put object to the map output and the remaining ones are skipped (data loss). {code} Map<byte[], Put> puts = new TreeMap<byte[], Put>(Bytes.BYTES_COMPARATOR); for (Put p : vals) { cnt++; if (!puts.containsKey(p.getRow())) { puts.put(p.getRow(), p); } else { puts.get(p.getRow()).getFamilyMap().putAll(p.getFamilyMap()); } } {code} We need to change the logic to something like the below, because we are sure the rowkey of all the puts will be the same. {code} Put finalPut = null; Map<byte[], List<? extends Cell>> familyMap = null; for (Put p : vals) { cnt++; if (finalPut == null) { finalPut = p; familyMap = finalPut.getFamilyMap(); } else { for (Entry<byte[], List<? extends Cell>> entry : p.getFamilyMap().entrySet()) { List<? extends Cell> list = familyMap.get(entry.getKey()); if (list == null) { familyMap.put(entry.getKey(), entry.getValue()); } else { ((List<KeyValue>) list).addAll((List<KeyValue>) entry.getValue()); } } } } context.write(row, finalPut); {code} Also need to implement the TODOs mentioned by Nick {code} // TODO: would be better if we knew <code>K row</code> and Put rowkey were // identical. Then this whole Put buffering business goes away. // TODO: Could use HeapSize to create an upper bound on the memory size of // the puts map and flush some portion of the content while looping. This // flush could result in multiple Puts for a single rowkey. That is // acceptable because Combiner is run as an optimization and it's not // critical that all Puts are grouped perfectly. {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8874) PutCombiner is skipping KeyValues while combining puts of same row during bulkload
[ https://issues.apache.org/jira/browse/HBASE-8874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700607#comment-13700607 ] ramkrishna.s.vasudevan commented on HBASE-8874: --- Good one.. Your change above makes sense. PutCombiner is skipping KeyValues while combining puts of same row during bulkload -- Key: HBASE-8874 URL: https://issues.apache.org/jira/browse/HBASE-8874 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.95.0, 0.95.1 Reporter: rajeshbabu Assignee: rajeshbabu Priority: Critical Fix For: 0.98.0, 0.95.2 While combining puts of same row in the map phase we use the below logic in PutCombiner#reduce. In the for loop, the first time we add one Put object to the puts map; from then on we just override the key values of a family with the key values of the same family from another put. So we mostly write one Put object to the map output and the remaining ones are skipped (data loss). {code} Map<byte[], Put> puts = new TreeMap<byte[], Put>(Bytes.BYTES_COMPARATOR); for (Put p : vals) { cnt++; if (!puts.containsKey(p.getRow())) { puts.put(p.getRow(), p); } else { puts.get(p.getRow()).getFamilyMap().putAll(p.getFamilyMap()); } } {code} We need to change the logic to something like the below, because we are sure the rowkey of all the puts will be the same. {code} Put finalPut = null; Map<byte[], List<? extends Cell>> familyMap = null; for (Put p : vals) { cnt++; if (finalPut == null) { finalPut = p; familyMap = finalPut.getFamilyMap(); } else { for (Entry<byte[], List<? extends Cell>> entry : p.getFamilyMap().entrySet()) { List<? extends Cell> list = familyMap.get(entry.getKey()); if (list == null) { familyMap.put(entry.getKey(), entry.getValue()); } else { ((List<KeyValue>) list).addAll((List<KeyValue>) entry.getValue()); } } } } context.write(row, finalPut); {code} Also need to implement the TODOs mentioned by Nick {code} // TODO: would be better if we knew <code>K row</code> and Put rowkey were // identical. Then this whole Put buffering business goes away. // TODO: Could use HeapSize to create an upper bound on the memory size of // the puts map and flush some portion of the content while looping. This // flush could result in multiple Puts for a single rowkey. That is // acceptable because Combiner is run as an optimization and it's not // critical that all Puts are grouped perfectly. {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8875) incorrect javadoc for EXCLUDE_FROM_MINOR_COMPACTION
Liang Xie created HBASE-8875: Summary: incorrect javadoc for EXCLUDE_FROM_MINOR_COMPACTION Key: HBASE-8875 URL: https://issues.apache.org/jira/browse/HBASE-8875 Project: HBase Issue Type: Bug Components: Compaction Reporter: Liang Xie Assignee: Liang Xie Priority: Trivial -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8875) incorrect javadoc for EXCLUDE_FROM_MINOR_COMPACTION
[ https://issues.apache.org/jira/browse/HBASE-8875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HBASE-8875: - Attachment: HBase-8875.txt incorrect javadoc for EXCLUDE_FROM_MINOR_COMPACTION --- Key: HBASE-8875 URL: https://issues.apache.org/jira/browse/HBASE-8875 Project: HBase Issue Type: Bug Components: Compaction Reporter: Liang Xie Assignee: Liang Xie Priority: Trivial Attachments: HBase-8875.txt - /** Major compaction flag in FileInfo */ + /** Minor compaction flag in FileInfo */ public static final byte[] EXCLUDE_FROM_MINOR_COMPACTION_KEY = Bytes.toBytes("EXCLUDE_FROM_MINOR_COMPACTION"); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8875) incorrect javadoc for EXCLUDE_FROM_MINOR_COMPACTION
[ https://issues.apache.org/jira/browse/HBASE-8875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HBASE-8875: - Description: - /** Major compaction flag in FileInfo */ + /** Minor compaction flag in FileInfo */ public static final byte[] EXCLUDE_FROM_MINOR_COMPACTION_KEY = Bytes.toBytes("EXCLUDE_FROM_MINOR_COMPACTION"); incorrect javadoc for EXCLUDE_FROM_MINOR_COMPACTION --- Key: HBASE-8875 URL: https://issues.apache.org/jira/browse/HBASE-8875 Project: HBase Issue Type: Bug Components: Compaction Reporter: Liang Xie Assignee: Liang Xie Priority: Trivial Attachments: HBase-8875.txt - /** Major compaction flag in FileInfo */ + /** Minor compaction flag in FileInfo */ public static final byte[] EXCLUDE_FROM_MINOR_COMPACTION_KEY = Bytes.toBytes("EXCLUDE_FROM_MINOR_COMPACTION"); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8875) incorrect javadoc for EXCLUDE_FROM_MINOR_COMPACTION
[ https://issues.apache.org/jira/browse/HBASE-8875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HBASE-8875: - Status: Patch Available (was: Open) incorrect javadoc for EXCLUDE_FROM_MINOR_COMPACTION --- Key: HBASE-8875 URL: https://issues.apache.org/jira/browse/HBASE-8875 Project: HBase Issue Type: Bug Components: Compaction Reporter: Liang Xie Assignee: Liang Xie Priority: Trivial Attachments: HBase-8875.txt - /** Major compaction flag in FileInfo */ + /** Minor compaction flag in FileInfo */ public static final byte[] EXCLUDE_FROM_MINOR_COMPACTION_KEY = Bytes.toBytes("EXCLUDE_FROM_MINOR_COMPACTION"); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8729) distributedLogReplay may hang during chained region server failure
[ https://issues.apache.org/jira/browse/HBASE-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700624#comment-13700624 ] Hudson commented on HBASE-8729: --- Integrated in hbase-0.95-on-hadoop2 #165 (See [https://builds.apache.org/job/hbase-0.95-on-hadoop2/165/]) HBASE-8729: distributedLogReplay may hang during chained region server failure (Revision 1499926) Result = FAILURE jeffreyz : Files : * /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/executor/EventType.java * /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java * /hbase/branches/0.95/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestDataIngestWithChaosMonkey.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/LogReplayHandler.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEditsReplaySink.java distributedLogReplay may hang during chained region server failure -- Key: HBASE-8729 URL: https://issues.apache.org/jira/browse/HBASE-8729 Project: HBase Issue Type: Bug Components: MTTR Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: 8729-v2.patch, hbase-8729.patch, hbase-8729-v3.patch, hbase-8729-v4.patch, hbase-8729-v5.patch In a test, 
half the cluster (in terms of region servers) was down and some log replay had incurred chained RS failures (the receiving RS of a log replay failed again). By default we only allow 3 concurrent SSH handlers, controlled by {code}this.executorService.startExecutorService(ExecutorType.MASTER_SERVER_OPERATIONS, conf.getInt("hbase.master.executor.serverops.threads", 3));{code} If all 3 SSH handlers are doing logReplay (a blocking call) and one of the receiving RSs fails again, then logReplay will hang because regions of the newly failed RS can't be re-assigned to another live RS (no SSH handler will be processed due to the max threads setting) and the existing log replay will keep routing replay traffic to the dead RS. The fix is to submit logReplay work into a separate type of executor queue so as not to block SSH region assignment, so that logReplay can route traffic to a live RS after retries and move forward. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8871) The region server can crash at startup
[ https://issues.apache.org/jira/browse/HBASE-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700622#comment-13700622 ] Hudson commented on HBASE-8871: --- Integrated in hbase-0.95-on-hadoop2 #165 (See [https://builds.apache.org/job/hbase-0.95-on-hadoop2/165/]) HBASE-8871 The region server can crash at startup (Revision 1499922) Result = FAILURE nkeywal : Files : * /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java The region server can crash at startup -- Key: HBASE-8871 URL: https://issues.apache.org/jira/browse/HBASE-8871 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.0, 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.95.2 Attachments: 8871.v1.patch I have this stack when I start a fresh region server. 5% of the time I would say (per region server). {code} 2013-07-04 12:00:22,609 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Initialization of RS failed. Hence aborting RS. 
java.util.ConcurrentModificationException at java.util.Hashtable$Enumerator.next(Hashtable.java:1200) at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:1820) at org.apache.hadoop.hbase.zookeeper.ZKConfig.makeZKProps(ZKConfig.java:92) at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:267) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:158) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:133) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:667) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:647) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:778) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,614 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,614 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Initialization of RS failed. Hence aborting RS. 
2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:798) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,617 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-07-04 12:00:22,767 INFO [main] regionserver.ShutdownHook: Installed shutdown hook thread: Shutdownhook:regionserver60020 2013-07-04 12:00:22,768 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting java.lang.RuntimeException: HRegionServer Aborted at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66) at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:78) at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2309) 2013-07-04 12:00:22,770 INFO [Thread-4] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@21f0dbb9 {code} There is one bug here in the region server: we should not start the snapshot handler if the server is supposed to stop: {code} // start the snapshot handler, since the server is ready to run this.snapshotManager.start(); {code} and the root cause is here in ZKConfig: {code} for (Entry<String, String> entry : conf) { // <=== BUG String key = entry.getKey(); if (key.startsWith(HConstants.ZK_CFG_PROPERTY_PREFIX)) { String zkKey = key.substring(HConstants.ZK_CFG_PROPERTY_PREFIX_LEN); String value = entry.getValue(); // If the value has variable substitutions, need to do a get. if (value.contains(VARIABLE_START)) { value = conf.get(key); } zkProperties.put(zkKey, value); } {code}
[jira] [Commented] (HBASE-8867) HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension
[ https://issues.apache.org/jira/browse/HBASE-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700623#comment-13700623 ] Hudson commented on HBASE-8867: --- Integrated in hbase-0.95-on-hadoop2 #165 (See [https://builds.apache.org/job/hbase-0.95-on-hadoop2/165/]) HBASE-8867 HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension (Revision 1499924) Result = FAILURE nkeywal : Files : * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension Key: HBASE-8867 URL: https://issues.apache.org/jira/browse/HBASE-8867 Project: HBase Issue Type: Bug Components: MTTR Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.95.2 Attachments: 8867.v1.patch + it could reuse some existing code -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8867) HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension
[ https://issues.apache.org/jira/browse/HBASE-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700648#comment-13700648 ] Hudson commented on HBASE-8867: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #600 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/600/]) HBASE-8867 HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension (Revision 1499923) Result = FAILURE nkeywal : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java HLogUtils#getServerNameFromHLogDirectoryName does not take into account the -splitting extension Key: HBASE-8867 URL: https://issues.apache.org/jira/browse/HBASE-8867 Project: HBase Issue Type: Bug Components: MTTR Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.95.2 Attachments: 8867.v1.patch + it could reuse some existing code -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8871) The region server can crash at startup
[ https://issues.apache.org/jira/browse/HBASE-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700647#comment-13700647 ] Hudson commented on HBASE-8871: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #600 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/600/]) HBASE-8871 The region server can crash at startup (Revision 1499921) Result = FAILURE nkeywal : Files : * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java The region server can crash at startup -- Key: HBASE-8871 URL: https://issues.apache.org/jira/browse/HBASE-8871 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.0, 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.98.0, 0.95.2 Attachments: 8871.v1.patch I have this stack when I start a fresh region server. 5% of the time I would say (per region server). {code} 2013-07-04 12:00:22,609 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Initialization of RS failed. Hence aborting RS. 
java.util.ConcurrentModificationException at java.util.Hashtable$Enumerator.next(Hashtable.java:1200) at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:1820) at org.apache.hadoop.hbase.zookeeper.ZKConfig.makeZKProps(ZKConfig.java:92) at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:267) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:158) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:133) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:667) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:647) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:778) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,614 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,614 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Initialization of RS failed. Hence aborting RS. 
2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server ip-10-137-7-67.ec2.internal,60020,1372939221819: Unhandled: null java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:798) at java.lang.Thread.run(Thread.java:722) 2013-07-04 12:00:22,616 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: [] 2013-07-04 12:00:22,617 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Unhandled: null 2013-07-04 12:00:22,767 INFO [main] regionserver.ShutdownHook: Installed shutdown hook thread: Shutdownhook:regionserver60020 2013-07-04 12:00:22,768 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting java.lang.RuntimeException: HRegionServer Aborted at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66) at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:78) at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2309) 2013-07-04 12:00:22,770 INFO [Thread-4] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@21f0dbb9 {code} There is one bug here in the region server: we should not start the snapshot handler if the server is supposed to stop: {code} // start the snapshot handler, since the server is ready to run this.snapshotManager.start(); {code} and the root cause is here in ZKConfig: {code} for (Entry<String, String> entry : conf) { // <=== BUG String key = entry.getKey(); if (key.startsWith(HConstants.ZK_CFG_PROPERTY_PREFIX)) { String zkKey = key.substring(HConstants.ZK_CFG_PROPERTY_PREFIX_LEN); String value = entry.getValue(); // If the value has variable substitutions, need to do a get. if (value.contains(VARIABLE_START)) { value = conf.get(key); } zkProperties.put(zkKey, value); } {code}
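The root cause above — iterating the Hashtable-backed Configuration while another thread writes to it — can be avoided by iterating a snapshot instead of the live table. A minimal, self-contained sketch of that defensive-copy idea (hypothetical names; this is not the actual HBASE-8871 patch, which changed ZKConfig and HRegionServer):

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.Properties;

public class ZkPropsSketch {
    // Stand-in for HConstants.ZK_CFG_PROPERTY_PREFIX.
    static final String ZK_PREFIX = "hbase.zookeeper.property.";

    // Copy the live table first, then iterate the snapshot: a concurrent
    // writer can no longer trigger ConcurrentModificationException here.
    static Properties extractZkProps(Hashtable<String, String> conf) {
        Map<String, String> snapshot = new HashMap<>(conf); // defensive copy
        Properties zkProps = new Properties();
        for (Map.Entry<String, String> entry : snapshot.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(ZK_PREFIX)) {
                zkProps.put(key.substring(ZK_PREFIX.length()), entry.getValue());
            }
        }
        return zkProps;
    }

    public static void main(String[] args) {
        Hashtable<String, String> conf = new Hashtable<>();
        conf.put("hbase.zookeeper.property.clientPort", "2181");
        conf.put("hbase.rootdir", "/hbase");
        Properties p = extractZkProps(conf);
        assert "2181".equals(p.getProperty("clientPort"));
        assert p.size() == 1;
        System.out.println("clientPort=" + p.getProperty("clientPort"));
    }
}
```

The copy costs one pass over the configuration, which is negligible next to region server startup.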
[jira] [Commented] (HBASE-8729) distributedLogReplay may hang during chained region server failure
[ https://issues.apache.org/jira/browse/HBASE-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700649#comment-13700649 ] Hudson commented on HBASE-8729: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #600 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/600/]) HBASE-8729: distributedLogReplay may hang during chained region server failure (Revision 1499925) Result = FAILURE jeffreyz : Files : * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/executor/EventType.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/executor/ExecutorType.java * /hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestDataIngestWithChaosMonkey.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/LogReplayHandler.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEditsReplaySink.java distributedLogReplay may hang during chained region server failure -- Key: HBASE-8729 URL: https://issues.apache.org/jira/browse/HBASE-8729 Project: HBase Issue Type: Bug Components: MTTR Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.95.2 Attachments: 8729-v2.patch, hbase-8729.patch, hbase-8729-v3.patch, hbase-8729-v4.patch, hbase-8729-v5.patch In a test, half cluster(in terms of region servers) was down and some log replay had 
incurred chained RS failures (the receiving RS of a log replay failed again). By default we only allow 3 concurrent SSH handlers, controlled by {code}this.executorService.startExecutorService(ExecutorType.MASTER_SERVER_OPERATIONS, conf.getInt("hbase.master.executor.serverops.threads", 3));{code} If all 3 SSH handlers are doing logReplay (a blocking call) and one of the receiving RSs fails again, then logReplay will hang: regions of the newly failed RS can't be re-assigned to another live RS (no SSH handler can run due to the max-threads setting), and the existing log replay will keep routing replay traffic to the dead RS. The fix is to submit logReplay work into a separate type of executor queue so that it does not block SSH region assignment, and logReplay can route traffic to a live RS after retries and move forward. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
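The fix described above — moving the blocking logReplay work off the bounded SSH pool — can be sketched with two plain thread pools. Pool and method names below are hypothetical stand-ins, not the actual HBASE-8729 executor types:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class SeparateExecutorSketch {
    // Replay blocks until region assignment runs. With a single saturated
    // 3-thread pool the assignment task could never be scheduled and this
    // would deadlock; with its own pool, assignment proceeds and unblocks
    // the replay.
    static String runReplayWithSeparatePools() throws Exception {
        ExecutorService sshPool = Executors.newFixedThreadPool(3);    // region assignment
        ExecutorService replayPool = Executors.newFixedThreadPool(3); // blocking log replay

        CountDownLatch regionsAssigned = new CountDownLatch(1);

        // Blocking replay task: waits for the failed RS's regions to be reassigned.
        Future<String> replay = replayPool.submit(() -> {
            regionsAssigned.await();
            return "replay done";
        });

        // Assignment task runs on the separate SSH pool, so it is never starved.
        sshPool.submit(regionsAssigned::countDown);

        String result = replay.get(5, TimeUnit.SECONDS);
        sshPool.shutdown();
        replayPool.shutdown();
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runReplayWithSeparatePools());
    }
}
```

If both tasks were submitted to one full pool, the `get` above would time out instead — which is exactly the hang the patch removes.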
[jira] [Created] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
Lars George created HBASE-8876: -- Summary: Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Fix For: 0.98.0, 0.95.2, 0.94.10 HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8823) Ensure HBASE-7826 is covered by Thrift 2
[ https://issues.apache.org/jira/browse/HBASE-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700677#comment-13700677 ] Lars George commented on HBASE-8823: HBASE-8774 and HBASE-8876 add tests for scanners that also check the order. All working in Thrift 2. Ensure HBASE-7826 is covered by Thrift 2 Key: HBASE-8823 URL: https://issues.apache.org/jira/browse/HBASE-8823 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Labels: thrift2 HBASE-7826 is about sorted results; we need to check if Thrift 2 handles this as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4360) Maintain information on the time a RS went dead
[ https://issues.apache.org/jira/browse/HBASE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] samar updated HBASE-4360: - Attachment: HBASE-4360_5.patch formatting error corrected Maintain information on the time a RS went dead --- Key: HBASE-4360 URL: https://issues.apache.org/jira/browse/HBASE-4360 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0 Reporter: Harsh J Assignee: samar Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: ds_hbase_multiple_server_test.png, ds_hbase.png, HBASE-4360_1.patch, HBASE-4360_2.patch, HBASE-4360_3.patch, HBASE-4360_4.patch, HBASE-4360_5.patch, master-status1.png Just something that'd be generally helpful, is to maintain DeadServer info with the last timestamp when it was determined as dead. Makes it easier to hunt the logs, and I don't think its much too expensive to maintain (one additional update per dead determination). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
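The improvement asked for above is small: record one timestamp per dead-server determination so the logs are easier to hunt. A hypothetical sketch of the idea (illustrative names, not the actual DeadServer class or patch):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DeadServerSketch {
    // One entry per server, updated once each time the server is
    // determined dead — a single map write per determination.
    private final Map<String, Long> timeOfDeath = new LinkedHashMap<>();

    void notifyDead(String serverName, long whenMillis) {
        timeOfDeath.put(serverName, whenMillis);
    }

    // Returns null if the server was never declared dead.
    Long getTimeOfDeath(String serverName) {
        return timeOfDeath.get(serverName);
    }

    public static void main(String[] args) {
        DeadServerSketch ds = new DeadServerSketch();
        String rs = "ip-10-137-7-67.ec2.internal,60020,1372939221819";
        ds.notifyDead(rs, 1372939999000L);
        assert ds.getTimeOfDeath(rs) == 1372939999000L;
        assert ds.getTimeOfDeath("unknown-server") == null;
        System.out.println(rs + " died at " + ds.getTimeOfDeath(rs));
    }
}
```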
[jira] [Commented] (HBASE-4360) Maintain information on the time a RS went dead
[ https://issues.apache.org/jira/browse/HBASE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700718#comment-13700718 ] samar commented on HBASE-4360: -- [~nkeywal] maybe you can try the patch in your environment. It would be a double check. Also a UI review would be helpful :-) Maintain information on the time a RS went dead --- Key: HBASE-4360 URL: https://issues.apache.org/jira/browse/HBASE-4360 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0 Reporter: Harsh J Assignee: samar Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: ds_hbase_multiple_server_test.png, ds_hbase.png, HBASE-4360_1.patch, HBASE-4360_2.patch, HBASE-4360_3.patch, HBASE-4360_4.patch, HBASE-4360_5.patch, master-status1.png Just something that'd be generally helpful, is to maintain DeadServer info with the last timestamp when it was determined as dead. Makes it easier to hunt the logs, and I don't think its much too expensive to maintain (one additional update per dead determination). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4360) Maintain information on the time a RS went dead
[ https://issues.apache.org/jira/browse/HBASE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700914#comment-13700914 ] Nicolas Liochon commented on HBASE-4360: Trying right now Maintain information on the time a RS went dead --- Key: HBASE-4360 URL: https://issues.apache.org/jira/browse/HBASE-4360 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0 Reporter: Harsh J Assignee: samar Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: ds_hbase_multiple_server_test.png, ds_hbase.png, HBASE-4360_1.patch, HBASE-4360_2.patch, HBASE-4360_3.patch, HBASE-4360_4.patch, HBASE-4360_5.patch, master-status1.png Just something that'd be generally helpful, is to maintain DeadServer info with the last timestamp when it was determined as dead. Makes it easier to hunt the logs, and I don't think its much too expensive to maintain (one additional update per dead determination). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-8823) Ensure HBASE-7826 is covered by Thrift 2
[ https://issues.apache.org/jira/browse/HBASE-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George resolved HBASE-8823. Resolution: Not A Problem Assignee: Lars George Ensure HBASE-7826 is covered by Thrift 2 Key: HBASE-8823 URL: https://issues.apache.org/jira/browse/HBASE-8823 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 HBASE-7826 is about sorted results, we need to check if Thrift 2 handles this as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
[ https://issues.apache.org/jira/browse/HBASE-8876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-8876: --- Attachment: HBASE-8876.patch Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test --- Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8876.patch HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
[ https://issues.apache.org/jira/browse/HBASE-8876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-8876: --- Attachment: HBASE-8876-0.94.patch Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test --- Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8876-0.94.patch, HBASE-8876.patch HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
[ https://issues.apache.org/jira/browse/HBASE-8876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700930#comment-13700930 ] Lars George commented on HBASE-8876: Hey [~madani], added batchsize test. Please have a looksee if you like. Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test --- Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8876-0.94.patch, HBASE-8876.patch HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
[ https://issues.apache.org/jira/browse/HBASE-8876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-8876: --- Status: Patch Available (was: Open) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test --- Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8876-0.94.patch, HBASE-8876.patch HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
[ https://issues.apache.org/jira/browse/HBASE-8876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-8876: --- Resolution: Fixed Status: Resolved (was: Patch Available) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test --- Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8876-0.94.patch, HBASE-8876.patch HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8877) Reentrant row locks
Dave Latham created HBASE-8877: -- Summary: Reentrant row locks Key: HBASE-8877 URL: https://issues.apache.org/jira/browse/HBASE-8877 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Reporter: Dave Latham Assignee: Dave Latham Fix For: 0.95.2 HBASE-8806 revealed performance problems with batch mutations failing to reacquire the same row locks. It looks like HBASE-8806 will use a less intrusive change for 0.94 to have batch mutations track their own row locks and not attempt to reacquire them. Another approach will be to support reentrant row locks directly. This allows simplifying a great deal of calling code to no longer track and pass around lock ids. One effect this change will have is changing the RegionObserver coprocessor's methods preBatchMutate and postBatchMutate from taking a {{MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp}} to taking a {{MiniBatchOperationInProgress<Mutation> miniBatchOp}}. I don't believe CPs should be relying on these lock ids, but that's a potential incompatibility. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8753) Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp
[ https://issues.apache.org/jira/browse/HBASE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8753: -- Attachment: 8753-trunk-V2.patch TestFromClientSide hung because a protobuf translation was missing in ProtobufUtil. TestFromClientSide passes for patch v2. Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp --- Key: HBASE-8753 URL: https://issues.apache.org/jira/browse/HBASE-8753 Project: HBase Issue Type: New Feature Components: Deletes, Scanners Affects Versions: 0.95.1 Reporter: Feng Honghua Assignee: Feng Honghua Attachments: 8753-trunk-V2.patch, HBASE-8753-0.94-V0.patch, HBASE-8753-trunk-V0.patch, HBASE-8753-trunk-V1.patch In one of our production scenarios (Xiaomi message search), multiple cells will be put in batch using the same timestamp with different column names under a specific column-family. After some time these cells also need to be deleted in batch by giving a specific timestamp. But the column names are parsed tokens which can be arbitrary words, so such a batch delete is impossible without first retrieving all KVs from that CF, building the list of columns which have a KV with that given timestamp, and then issuing an individual deleteColumn for each column in that list. Though it's possible to do such a batch delete this way, its performance is poor, and customers also find their code quite clumsy: first retrieving and populating the column list, then issuing a deleteColumn for each column in it. This feature resolves this problem by introducing a new delete flag: DeleteFamilyVersion. 1). When you need to delete all KVs under a column-family with a given timestamp, just call Delete.deleteFamilyVersion(cfName, timestamp); only a DeleteFamilyVersion type KV is put to HBase (like DeleteFamily / DeleteColumn / Delete) without a read operation; 2).
Like other delete types, DeleteFamilyVersion takes effect in get/scan/flush/compact operations: the ScanDeleteTracker now parses out the DeleteFamilyVersion marker and prevents all KVs under the specific CF that have the same timestamp as the DeleteFamilyVersion KV from popping up as part of a get/scan result (also in flush/compact). Our customers find this feature efficient, clean and easy to use, since it does its work without knowing the exact column name list that needs to be deleted. This feature has been running smoothly for a couple of months in our production clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
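The masking behavior described above can be modeled outside HBase as a filter over cells. A hypothetical, simplified model of what ScanDeleteTracker does for a DeleteFamilyVersion marker (illustrative names; not the actual HBASE-8753 implementation):

```java
import java.util.List;
import java.util.stream.Collectors;

public class DeleteFamilyVersionSketch {
    // A toy KV: family, qualifier (the "parsed token" column name), timestamp, value.
    record Cell(String family, String qualifier, long ts, String value) {}

    // A DeleteFamilyVersion marker masks every cell in the family whose
    // timestamp equals the marker's timestamp, regardless of qualifier —
    // no prior read of the column list is needed.
    static List<Cell> applyDeleteFamilyVersion(List<Cell> cells, String family, long ts) {
        return cells.stream()
            .filter(c -> !(c.family().equals(family) && c.ts() == ts))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Cell> cells = List.of(
            new Cell("msg", "token1", 100L, "a"),
            new Cell("msg", "token2", 100L, "b"),   // arbitrary qualifier, same ts
            new Cell("msg", "token1", 101L, "c"));  // different ts: survives
        List<Cell> visible = applyDeleteFamilyVersion(cells, "msg", 100L);
        assert visible.size() == 1;
        assert visible.get(0).ts() == 101L;
        System.out.println(visible);
    }
}
```

The point of the feature is exactly this: one marker per (family, timestamp) replaces N deleteColumn calls for N unknown qualifiers.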
[jira] [Updated] (HBASE-8877) Reentrant row locks
[ https://issues.apache.org/jira/browse/HBASE-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Latham updated HBASE-8877: --- Attachment: HBASE-8877.patch Here's a patch implementing reentrant row locks. It changes the locking interface to these 3 methods: - void getRowLock(byte[] row) throws IOException (acquire, throw if unable to acquire before timeout) - boolean tryRowLock(byte[] row) (acquire without waiting and return true iff successful) - void releaseMyRowLocks() (release all row locks held by the current thread) Reentrant row locks --- Key: HBASE-8877 URL: https://issues.apache.org/jira/browse/HBASE-8877 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Reporter: Dave Latham Assignee: Dave Latham Fix For: 0.95.2 Attachments: HBASE-8877.patch HBASE-8806 revealed performance problems with batch mutations failing to reacquire the same row locks. It looks like HBASE-8806 will use a less intrusive change for 0.94 to have batch mutations track their own row locks and not attempt to reacquire them. Another approach will be to support reentrant row locks directly. This allows simplifying a great deal of calling code to no longer track and pass around lock ids. One effect this change will have is changing the RegionObserver coprocessor's methods preBatchMutate and postBatchMutate from taking a {{MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp}} to taking a {{MiniBatchOperationInProgress<Mutation> miniBatchOp}}. I don't believe CPs should be relying on these lock ids, but that's a potential incompatibility. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
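The three-method interface above can be sketched with java.util.concurrent primitives. This is a hypothetical simplification (String rows, no timeout, no lock-map cleanup) rather than the actual HBASE-8877 patch; the ReentrantLock gives the reentrancy, and a thread-local list plays the role of the patch's rowLocksHeldByThread field:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

public class RowLocksSketch {
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final ThreadLocal<List<ReentrantLock>> heldByThread =
        ThreadLocal.withInitial(ArrayList::new);

    // Blocking acquire; a second call on the same row by the same thread
    // simply re-enters the ReentrantLock instead of deadlocking.
    void getRowLock(String row) {
        ReentrantLock lock = locks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();
        heldByThread.get().add(lock); // record every acquisition, reentries included
    }

    // Non-blocking acquire; tryLock also succeeds when this thread already holds it.
    boolean tryRowLock(String row) {
        ReentrantLock lock = locks.computeIfAbsent(row, r -> new ReentrantLock());
        if (lock.tryLock()) {
            heldByThread.get().add(lock);
            return true;
        }
        return false;
    }

    // Release everything this thread holds, then drop the thread-local list
    // so idle threads don't keep long empty lists alive.
    void releaseMyRowLocks() {
        for (ReentrantLock lock : heldByThread.get()) {
            lock.unlock();
        }
        heldByThread.remove();
    }

    public static void main(String[] args) {
        RowLocksSketch region = new RowLocksSketch();
        region.getRowLock("row1");
        region.getRowLock("row1");                  // duplicate row in a batch: re-enters
        boolean gotRow2 = region.tryRowLock("row2");
        assert gotRow2;
        region.releaseMyRowLocks();
        boolean relockable = region.tryRowLock("row1"); // fully released, lockable again
        assert relockable;
        region.releaseMyRowLocks();
        System.out.println("ok");
    }
}
```

Because each acquisition (including reentries) is recorded, releaseMyRowLocks unlocks exactly as many times as the thread locked, which is what keeps the hold counts balanced.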
[jira] [Commented] (HBASE-8877) Reentrant row locks
[ https://issues.apache.org/jira/browse/HBASE-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700940#comment-13700940 ] Dave Latham commented on HBASE-8877: A few changes in this patch compared to the latest similar patch on HBASE-8806: - fixed the heap size logic to match the new fields - wrapped a couple lines that were longer than 100 chars - changed the releaseMyRowLocks method to free the list of locks from the thread local so that each thread doesn't wind up having very long empty lists hanging around - changed the row lock methods from public to package private as they are only used inside HRegion (and by some tests) - added some comments to checkAndMutate to make clear how it's reusing locks - renamed the new thread local field to rowLocksHeldByThread Reentrant row locks --- Key: HBASE-8877 URL: https://issues.apache.org/jira/browse/HBASE-8877 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Reporter: Dave Latham Assignee: Dave Latham Fix For: 0.95.2 Attachments: HBASE-8877.patch HBASE-8806 revealed performance problems with batch mutations failing to reacquire the same row locks. It looks like HBASE-8806 will use a less intrusive change for 0.94 to have batch mutations track their own row locks and not attempt to reacquire them. Another approach will be to support reentrant row locks directly. This allows simplifying a great deal of calling code to no longer track and pass around lock ids. One effect this change will have is changing the RegionObserver coprocessor's methods preBatchMutate and postBatchMutate from taking a {{MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp}} to taking a {{MiniBatchOperationInProgress<Mutation> miniBatchOp}}. I don't believe CPs should be relying on these lock ids, but that's a potential incompatibility. -- This message is automatically generated by JIRA.
[jira] [Updated] (HBASE-8877) Reentrant row locks
[ https://issues.apache.org/jira/browse/HBASE-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Latham updated HBASE-8877: --- Status: Patch Available (was: Open) Reentrant row locks --- Key: HBASE-8877 URL: https://issues.apache.org/jira/browse/HBASE-8877 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Reporter: Dave Latham Assignee: Dave Latham Fix For: 0.95.2 Attachments: HBASE-8877.patch HBASE-8806 revealed performance problems with batch mutations failing to reacquire the same row locks. It looks like HBASE-8806 will use a less intrusive change for 0.94 to have batch mutations track their own row locks and not attempt to reacquire them. Another approach will be to support reentrant row locks directly. This allows simplifying a great deal of calling code to no longer track and pass around lock ids. One effect this change will have is changing the RegionObserver coprocessor's methods preBatchMutate and postBatchMutate from taking a {{MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp}} to taking a {{MiniBatchOperationInProgress<Mutation> miniBatchOp}}. I don't believe CPs should be relying on these lock ids, but that's a potential incompatibility. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700942#comment-13700942 ] Dave Latham commented on HBASE-8806: Filed HBASE-8877 to track the reentrant approach separately and uploaded a new patch there with some cleanup (and fixed heap size logic). Let's continue that discussion there. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch If we already hold the lock in doMiniBatchMutation we don't need to re-acquire it. The fix is to keep a set of the row keys already locked for a miniBatchMutation; if a row key is already in that set, we don't try to acquire its lock again. We have tested this fix in our production environment and it has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys.
{code}
static int ACQUIRE_LOCK_COUNT = 0;

@Test
public void testRedundantRowKeys() throws Exception {
  final int batchSize = 10;
  String tableName = getClass().getSimpleName();
  Configuration conf = HBaseConfiguration.create();
  conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class);
  MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName),
      tableName, conf, Bytes.toBytes("a"));
  List<Pair<Mutation, Integer>> someBatch = Lists.newArrayList();
  int i = 0;
  while (i < batchSize) {
    if (i % 2 == 0) {
      someBatch.add(new Pair<Mutation, Integer>(new Put(Bytes.toBytes(0)), null));
    } else {
      someBatch.add(new Pair<Mutation, Integer>(new Put(Bytes.toBytes(1)), null));
    }
    i++;
  }
  long startTime = System.currentTimeMillis();
  region.batchMutate(someBatch.toArray(new Pair[0]));
  long endTime = System.currentTimeMillis();
  long duration = endTime - startTime;
  System.out.println("duration: " + duration + " ms");
  assertEquals(2, ACQUIRE_LOCK_COUNT);
}

@Override
public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException {
  ACQUIRE_LOCK_COUNT++;
  return super.getLock(lockid, row, waitForLock);
}
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
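The deduplication that the test above exercises (10 mutations alternating over 2 rows should cost only 2 lock acquisitions) can be sketched in isolation. This is a hypothetical helper, not the actual HRegion code; the class and method names are invented.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Count how many lock acquisitions a mini-batch needs when row keys are
// tracked in a set, instead of one acquisition per mutation.
class BatchLockDedup {
    static int locksNeeded(List<byte[]> rowKeys) {
        Set<String> alreadyLocked = new HashSet<>();
        int acquisitions = 0;
        for (byte[] row : rowKeys) {
            // Only acquire when this mini-batch has not locked the row yet;
            // Set.add returns false for a row key already present.
            if (alreadyLocked.add(new String(row))) {
                acquisitions++; // the real acquireRowLock(row) would go here
            }
        }
        return acquisitions;
    }
}
```

For the batch in the test above (rows "0" and "1" repeated), this counts 2 acquisitions, matching the assertEquals(2, ACQUIRE_LOCK_COUNT) expectation.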
[jira] [Commented] (HBASE-4360) Maintain information on the time a RS went dead
[ https://issues.apache.org/jira/browse/HBASE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700951#comment-13700951 ] Nicolas Liochon commented on HBASE-4360: Tested, it works. I'm going to commit, it's a one line addendum. Thanks for the fix, Samar! Maintain information on the time a RS went dead --- Key: HBASE-4360 URL: https://issues.apache.org/jira/browse/HBASE-4360 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0 Reporter: Harsh J Assignee: samar Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: ds_hbase_multiple_server_test.png, ds_hbase.png, HBASE-4360_1.patch, HBASE-4360_2.patch, HBASE-4360_3.patch, HBASE-4360_4.patch, HBASE-4360_5.patch, master-status1.png Just something that'd be generally helpful, is to maintain DeadServer info with the last timestamp when it was determined as dead. Makes it easier to hunt the logs, and I don't think its much too expensive to maintain (one additional update per dead determination). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700963#comment-13700963 ] Hudson commented on HBASE-8774: --- Integrated in HBase-TRUNK #4217 (See [https://builds.apache.org/job/HBase-TRUNK/4217/]) HBASE-8876 Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test (Revision 1500033) Result = FAILURE larsgeorge : Files : * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: Sub-task Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Assignee: Hamed Madani Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE_8774.patch, HBASE_8774_v2.patch, HBASE_8774_v3.patch, HBASE_8774_v4.patch, HBASE_8774_v5_0.94.patch, HBASE_8774_v5.patch, HBASE_8774_v5.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
[ https://issues.apache.org/jira/browse/HBASE-8876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700964#comment-13700964 ] Hudson commented on HBASE-8876: --- Integrated in HBase-TRUNK #4217 (See [https://builds.apache.org/job/HBase-TRUNK/4217/]) HBASE-8876 Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test (Revision 1500033) Result = FAILURE larsgeorge : Files : * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test --- Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8876-0.94.patch, HBASE-8876.patch HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8877) Reentrant row locks
[ https://issues.apache.org/jira/browse/HBASE-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700971#comment-13700971 ] Ted Yu commented on HBASE-8877: ---
{code}
-  public OperationStatus[] batchMutate(
-      Pair<Mutation, Integer>[] mutationsAndLocks) throws IOException {
-    return batchMutate(mutationsAndLocks, false);
+  public OperationStatus[] batchMutate(Mutation[] mutations) throws IOException {
{code}
Should dev@hbase be polled for the removal of lock Id ?
{code}
   /**
-   * Returns existing row lock if found, otherwise
-   * obtains a new row lock and returns it.
-   * @param lockid requested by the user, or null if the user didn't already hold lock
-   * @param row the row to lock
-   * @param waitForLock if true, will block until the lock is available, otherwise will
-   * simply return null if it could not acquire the lock.
-   * @return lockid or null if waitForLock is false and the lock was unavailable.
+   * public void getRowLock(byte [] row) throws IOException {
+   *   internalObtainRowLock(row, true);
    */
{code}
The change in javadoc seems unnecessary.
{code}
+  static class RowLockContext {
{code}
RowLockContext can be private, right ? Reentrant row locks --- Key: HBASE-8877 URL: https://issues.apache.org/jira/browse/HBASE-8877 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Reporter: Dave Latham Assignee: Dave Latham Fix For: 0.95.2 Attachments: HBASE-8877.patch HBASE-8806 revealed performance problems with batch mutations failing to reacquire the same row locks. It looks like HBASE-8806 will use a less intrusive change for 0.94 to have batch mutations track their own row locks and not attempt to reacquire them. Another approach will be to support reentrant row locks directly. This allows simplifying a great deal of calling code to no longer track and pass around lock ids.
One effect this change will have is changing the RegionObserver coprocessor's methods preBatchMutate and postBatchMutate from taking a {{MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp}} to taking a {{MiniBatchOperationInProgress<Mutation> miniBatchOp}}. I don't believe CPs should be relying on these lock ids, but that's a potential incompatibility. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700974#comment-13700974 ] Hudson commented on HBASE-8774: --- Integrated in HBase-0.94-security #193 (See [https://builds.apache.org/job/HBase-0.94-security/193/]) HBASE-8876 Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test (Revision 1500035) Result = FAILURE larsgeorge : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: Sub-task Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Assignee: Hamed Madani Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE_8774.patch, HBASE_8774_v2.patch, HBASE_8774_v3.patch, HBASE_8774_v4.patch, HBASE_8774_v5_0.94.patch, HBASE_8774_v5.patch, HBASE_8774_v5.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
[ https://issues.apache.org/jira/browse/HBASE-8876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700975#comment-13700975 ] Hudson commented on HBASE-8876: --- Integrated in HBase-0.94-security #193 (See [https://builds.apache.org/job/HBase-0.94-security/193/]) HBASE-8876 Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test (Revision 1500035) Result = FAILURE larsgeorge : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test --- Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8876-0.94.patch, HBASE-8876.patch HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4360) Maintain information on the time a RS went dead
[ https://issues.apache.org/jira/browse/HBASE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700973#comment-13700973 ] samar commented on HBASE-4360: -- thanks [~nkeywal] for catching the issue Maintain information on the time a RS went dead --- Key: HBASE-4360 URL: https://issues.apache.org/jira/browse/HBASE-4360 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0 Reporter: Harsh J Assignee: samar Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: ds_hbase_multiple_server_test.png, ds_hbase.png, HBASE-4360_1.patch, HBASE-4360_2.patch, HBASE-4360_3.patch, HBASE-4360_4.patch, HBASE-4360_5.patch, master-status1.png Just something that'd be generally helpful, is to maintain DeadServer info with the last timestamp when it was determined as dead. Makes it easier to hunt the logs, and I don't think its much too expensive to maintain (one additional update per dead determination). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700982#comment-13700982 ] Hudson commented on HBASE-8774: --- Integrated in HBase-0.94 #1039 (See [https://builds.apache.org/job/HBase-0.94/1039/]) HBASE-8876 Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test (Revision 1500035) Result = FAILURE larsgeorge : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: Sub-task Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Assignee: Hamed Madani Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE_8774.patch, HBASE_8774_v2.patch, HBASE_8774_v3.patch, HBASE_8774_v4.patch, HBASE_8774_v5_0.94.patch, HBASE_8774_v5.patch, HBASE_8774_v5.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
[ https://issues.apache.org/jira/browse/HBASE-8876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700983#comment-13700983 ] Hudson commented on HBASE-8876: --- Integrated in HBase-0.94 #1039 (See [https://builds.apache.org/job/HBase-0.94/1039/]) HBASE-8876 Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test (Revision 1500035) Result = FAILURE larsgeorge : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test --- Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8876-0.94.patch, HBASE-8876.patch HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8753) Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp
[ https://issues.apache.org/jira/browse/HBASE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700988#comment-13700988 ] Hadoop QA commented on HBASE-8753: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12591012/8753-trunk-V2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6223//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6223//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6223//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6223//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6223//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6223//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6223//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6223//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6223//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6223//console This message is automatically generated. 
Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp --- Key: HBASE-8753 URL: https://issues.apache.org/jira/browse/HBASE-8753 Project: HBase Issue Type: New Feature Components: Deletes, Scanners Affects Versions: 0.95.1 Reporter: Feng Honghua Assignee: Feng Honghua Attachments: 8753-trunk-V2.patch, HBASE-8753-0.94-V0.patch, HBASE-8753-trunk-V0.patch, HBASE-8753-trunk-V1.patch In one of our production scenario (Xiaomi message search), multiple cells will be put in batch using a same timestamp with different column names under a specific column-family. And after some time these cells also need to be deleted in batch by given a specific timestamp. But the column names are parsed tokens which can be arbitrary words , so such batch delete is impossible without first retrieving all KVs from that CF and get the column name list which has KV with that given timestamp, and then issuing individual deleteColumn for each column in that column-list. Though it's possible to do such batch delete, its performance is poor, and customers also find their code is quite clumsy by first retrieving and populating the column list and then issuing a deleteColumn for each column in that column-list. This feature resolves this problem by introducing a new delete flag: DeleteFamilyVersion. 1). When you need to delete all KVs under a column-family with a given timestamp, just call Delete.deleteFamilyVersion(cfName, timestamp); only a DeleteFamilyVersion type KV is put to HBase (like DeleteFamily / DeleteColumn / Delete) without read operation; 2). Like other delete types, DeleteFamilyVersion takes effect in get/scan/flush/compact operations, the ScanDeleteTracker now parses out and uses DeleteFamilyVersion to prevent all KVs under the
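The DeleteFamilyVersion semantics described above — a single marker (column family, timestamp) hides every cell in that family carrying exactly that timestamp, regardless of column name — can be modeled with a small standalone sketch. The class and method names are hypothetical, not the actual ScanDeleteTracker API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of DeleteFamilyVersion tracking: each marker records a
// (family, timestamp) pair, and a cell is suppressed when its family has a
// marker for exactly its timestamp, whatever its qualifier is.
class DeleteFamilyVersionTracker {
    private final Map<String, Set<Long>> deleted = new HashMap<>();

    void addDeleteFamilyVersion(String family, long ts) {
        deleted.computeIfAbsent(family, k -> new HashSet<>()).add(ts);
    }

    boolean isDeleted(String family, String qualifier, long ts) {
        // The qualifier is intentionally ignored: the marker applies to the
        // whole family at the given timestamp, which is what lets arbitrary
        // parsed-token column names be deleted in one call.
        return deleted.getOrDefault(family, Collections.emptySet()).contains(ts);
    }
}
```

This captures why a single Delete.deleteFamilyVersion(cfName, timestamp) call can replace a read-then-deleteColumn loop over every column name in the family.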
[jira] [Resolved] (HBASE-4360) Maintain information on the time a RS went dead
[ https://issues.apache.org/jira/browse/HBASE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon resolved HBASE-4360. Resolution: Fixed HBASE-4360_5.patch is what I applied as an addendum Maintain information on the time a RS went dead --- Key: HBASE-4360 URL: https://issues.apache.org/jira/browse/HBASE-4360 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0 Reporter: Harsh J Assignee: samar Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: ds_hbase_multiple_server_test.png, ds_hbase.png, HBASE-4360_1.patch, HBASE-4360_2.patch, HBASE-4360_3.patch, HBASE-4360_4.patch, HBASE-4360_5.patch, master-status1.png Just something that'd be generally helpful, is to maintain DeadServer info with the last timestamp when it was determined as dead. Makes it easier to hunt the logs, and I don't think its much too expensive to maintain (one additional update per dead determination). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8877) Reentrant row locks
[ https://issues.apache.org/jira/browse/HBASE-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700996#comment-13700996 ] Hadoop QA commented on HBASE-8877: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12591013/HBASE-8877.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6224//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6224//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6224//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6224//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6224//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6224//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6224//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6224//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6224//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6224//console This message is automatically generated. Reentrant row locks --- Key: HBASE-8877 URL: https://issues.apache.org/jira/browse/HBASE-8877 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Reporter: Dave Latham Assignee: Dave Latham Fix For: 0.95.2 Attachments: HBASE-8877.patch HBASE-8806 revealed performance problems with batch mutations failing to reacquire the same row locks. It looks like HBASE-8806 will use a less intrusive change for 0.94 to have batch mutations track their own row locks and not attempt to reacquire them. Another approach will be to support reentrant row locks directly. 
This allows simplifying a great deal of calling code to no longer track and pass around lock ids. One effect this change will have is changing the RegionObserver coprocessor's methods preBatchMutate and postBatchMutate from taking a {{MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp}} to taking a {{MiniBatchOperationInProgress<Mutation> miniBatchOp}}. I don't believe CPs should be relying on these lock ids, but that's a potential incompatibility. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-8877) Reentrant row locks
[ https://issues.apache.org/jira/browse/HBASE-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701013#comment-13701013 ] Andrew Purtell edited comment on HBASE-8877 at 7/5/13 5:31 PM: --- +1 on the CP interface change. There might (in theory - doubt in practice) have been some utility before for passing through lock ids but if this issue is committed obviously no longer. This proposal is only for >= 0.95? Edit: fix JIRA formatting. was (Author: apurtell): +1 on the CP interface change. There might (in theory - doubt in practice) have been some utility before for passing through lock ids but if this issue is committed obviously no longer. This proposal is only for 0.95+? Reentrant row locks --- Key: HBASE-8877 URL: https://issues.apache.org/jira/browse/HBASE-8877 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Reporter: Dave Latham Assignee: Dave Latham Fix For: 0.95.2 Attachments: HBASE-8877.patch HBASE-8806 revealed performance problems with batch mutations failing to reacquire the same row locks. It looks like HBASE-8806 will use a less intrusive change for 0.94 to have batch mutations track their own row locks and not attempt to reacquire them. Another approach will be to support reentrant row locks directly. This allows simplifying a great deal of calling code to no longer track and pass around lock ids. One effect this change will have is changing the RegionObserver coprocessor's methods preBatchMutate and postBatchMutate from taking a {{MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp}} to taking a {{MiniBatchOperationInProgress<Mutation> miniBatchOp}}. I don't believe CPs should be relying on these lock ids, but that's a potential incompatibility. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8877) Reentrant row locks
[ https://issues.apache.org/jira/browse/HBASE-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701013#comment-13701013 ] Andrew Purtell commented on HBASE-8877: --- +1 on the CP interface change. There might (in theory - doubt in practice) have been some utility before for passing through lock ids but if this issue is committed obviously no longer. This proposal is only for 0.95+? Reentrant row locks --- Key: HBASE-8877 URL: https://issues.apache.org/jira/browse/HBASE-8877 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Reporter: Dave Latham Assignee: Dave Latham Fix For: 0.95.2 Attachments: HBASE-8877.patch HBASE-8806 revealed performance problems with batch mutations failing to reacquire the same row locks. It looks like HBASE-8806 will use a less intrusive change for 0.94 to have batch mutations track their own row locks and not attempt to reacquire them. Another approach will be to support reentrant row locks directly. This allows simplifying a great deal of calling code to no longer track and pass around lock ids. One effect this change will have is changing the RegionObserver coprocessor's methods preBatchMutate and postBatchMutate from taking a {{MiniBatchOperationInProgress<Pair<Mutation, Integer>> miniBatchOp}} to taking a {{MiniBatchOperationInProgress<Mutation> miniBatchOp}}. I don't believe CPs should be relying on these lock ids, but that's a potential incompatibility. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
[ https://issues.apache.org/jira/browse/HBASE-8876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701028#comment-13701028 ] Hudson commented on HBASE-8876: --- Integrated in hbase-0.95 #290 (See [https://builds.apache.org/job/hbase-0.95/290/]) HBASE-8876 Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test (Revision 1500034) Result = SUCCESS larsgeorge : Files : * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test --- Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8876-0.94.patch, HBASE-8876.patch HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8774) Add BatchSize and Filter to Thrift2
[ https://issues.apache.org/jira/browse/HBASE-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701027#comment-13701027 ] Hudson commented on HBASE-8774: --- Integrated in hbase-0.95 #290 (See [https://builds.apache.org/job/hbase-0.95/290/]) HBASE-8876 Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test (Revision 1500034) Result = SUCCESS larsgeorge : Files : * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java Add BatchSize and Filter to Thrift2 --- Key: HBASE-8774 URL: https://issues.apache.org/jira/browse/HBASE-8774 Project: HBase Issue Type: Sub-task Components: Thrift Affects Versions: 0.95.1 Reporter: Hamed Madani Assignee: Hamed Madani Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE_8774.patch, HBASE_8774_v2.patch, HBASE_8774_v3.patch, HBASE_8774_v4.patch, HBASE_8774_v5_0.94.patch, HBASE_8774_v5.patch, HBASE_8774_v5.patch Attached Patch will add BatchSize and Filter support to Thrift2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HBASE-8814) Possible NPE in split if a region has empty store files.
[ https://issues.apache.org/jira/browse/HBASE-8814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack reopened HBASE-8814: -- Reverted again. Since this patch was applied, TestGet is failing w/ strange deserialization issue. I cannot see how this patch is responsible. Backing out for now to see if build starts passing again. Possible NPE in split if a region has empty store files. Key: HBASE-8814 URL: https://issues.apache.org/jira/browse/HBASE-8814 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.8 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8814_94.patch, HBASE-8814_94_v2.patch, HBASE-8814_trunk_addendum.patch, HBASE-8814_trunk.patch, HBASE-8814_v2.patch {code} 2013-06-27 14:12:54,472 INFO [RS:1;BLRY2R009039160:49833-splits-1372322572806] regionserver.SplitRequest(92): Running rollback/cleanup of failed split of testSplitShouldNotThrowNPEEvenARegionHasEmptySplitFiles,,1372322556662.276e00da1420119e2f91f3a4c4c41d78.; java.util.concurrent.ExecutionException: java.lang.NullPointerException java.io.IOException: java.util.concurrent.ExecutionException: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:602) at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:297) at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:466) at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:82) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at 
org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:596) ... 6 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.splitStoreFile(HRegionFileSystem.java:539) at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFile(SplitTransaction.java:610) at org.apache.hadoop.hbase.regionserver.SplitTransaction.access$1(SplitTransaction.java:607) at org.apache.hadoop.hbase.regionserver.SplitTransaction$StoreFileSplitter.call(SplitTransaction.java:633) at org.apache.hadoop.hbase.regionserver.SplitTransaction$StoreFileSplitter.call(SplitTransaction.java:1) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) ... 3 more {code} If a storefile is empty(can be because of puts and deletes) then first and lastkey of the file will be empty. Then we will get first or last key as null. Then we will end up in NPE when we will check splitkey in the range or not. {code} if (top) { //check if larger than last key. KeyValue splitKey = KeyValue.createFirstOnRow(splitRow); byte[] lastKey = f.createReader().getLastKey(); if (f.getReader().getComparator().compare(splitKey.getBuffer(), splitKey.getKeyOffset(), splitKey.getKeyLength(), lastKey, 0, lastKey.length) > 0) { return null; } } else { //check if smaller than first key KeyValue splitKey = KeyValue.createLastOnRow(splitRow); byte[] firstKey = f.createReader().getFirstKey(); if (f.getReader().getComparator().compare(splitKey.getBuffer(), splitKey.getKeyOffset(), splitKey.getKeyLength(), firstKey, 0, firstKey.length) < 0) { return null; } } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
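The failure mode described above, and the shape of a null guard, can be illustrated in isolation. This is a hypothetical sketch, not the actual HBase patch: the class SplitKeyCheck, its methods, and its simplified byte-wise comparator are stand-ins for StoreFile's reader and the KeyValue comparator. An empty store file has no first/last key, so the reader returns null and the unguarded compare throws the NPE from the stack trace.

```java
// Hypothetical illustration of the NPE and one possible guard. Names and
// logic are simplified stand-ins, not HBase's actual StoreFile code.
public class SplitKeyCheck {
    // Stand-in for StoreFile reader's getLastKey(): null when the file is empty.
    static byte[] getLastKey(boolean fileIsEmpty) {
        return fileIsEmpty ? null : new byte[] {'z'};
    }

    // "top" half check: does the split row fall past the file's last key?
    static boolean topHalfIsEmpty(byte[] splitKey, boolean fileIsEmpty) {
        byte[] lastKey = getLastKey(fileIsEmpty);
        if (lastKey == null) {
            // Guard: an empty file contributes no keys to either daughter,
            // so treat it as out of range instead of dereferencing null.
            return true;
        }
        return compare(splitKey, lastKey) > 0;
    }

    // Simplistic lexicographic compare, standing in for the KeyValue comparator.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        System.out.println(topHalfIsEmpty(new byte[] {'m'}, true));  // guarded: true
        System.out.println(topHalfIsEmpty(new byte[] {'m'}, false)); // 'm' <= 'z': false
    }
}
```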
[jira] [Commented] (HBASE-8814) Possible NPE in split if a region has empty store files.
[ https://issues.apache.org/jira/browse/HBASE-8814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701046#comment-13701046 ] stack commented on HBASE-8814: -- Here is sample: https://builds.apache.org/job/HBase-0.94-security/193/ Here is another: https://builds.apache.org/job/HBase-0.94/1038/ See how test started failing when this went in. Possible NPE in split if a region has empty store files. Key: HBASE-8814 URL: https://issues.apache.org/jira/browse/HBASE-8814 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.8 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8814_94.patch, HBASE-8814_94_v2.patch, HBASE-8814_trunk_addendum.patch, HBASE-8814_trunk.patch, HBASE-8814_v2.patch {code} 2013-06-27 14:12:54,472 INFO [RS:1;BLRY2R009039160:49833-splits-1372322572806] regionserver.SplitRequest(92): Running rollback/cleanup of failed split of testSplitShouldNotThrowNPEEvenARegionHasEmptySplitFiles,,1372322556662.276e00da1420119e2f91f3a4c4c41d78.; java.util.concurrent.ExecutionException: java.lang.NullPointerException java.io.IOException: java.util.concurrent.ExecutionException: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:602) at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:297) at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:466) at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:82) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at 
java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:596) ... 6 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.splitStoreFile(HRegionFileSystem.java:539) at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFile(SplitTransaction.java:610) at org.apache.hadoop.hbase.regionserver.SplitTransaction.access$1(SplitTransaction.java:607) at org.apache.hadoop.hbase.regionserver.SplitTransaction$StoreFileSplitter.call(SplitTransaction.java:633) at org.apache.hadoop.hbase.regionserver.SplitTransaction$StoreFileSplitter.call(SplitTransaction.java:1) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) ... 3 more {code} If a storefile is empty(can be because of puts and deletes) then first and lastkey of the file will be empty. Then we will get first or last key as null. Then we will end up in NPE when we will check splitkey in the range or not. {code} if (top) { //check if larger than last key. KeyValue splitKey = KeyValue.createFirstOnRow(splitRow); byte[] lastKey = f.createReader().getLastKey(); if (f.getReader().getComparator().compare(splitKey.getBuffer(), splitKey.getKeyOffset(), splitKey.getKeyLength(), lastKey, 0, lastKey.length) > 0) { return null; } } else { //check if smaller than first key KeyValue splitKey = KeyValue.createLastOnRow(splitRow); byte[] firstKey = f.createReader().getFirstKey(); if (f.getReader().getComparator().compare(splitKey.getBuffer(), splitKey.getKeyOffset(), splitKey.getKeyLength(), firstKey, 0, firstKey.length) < 0) { return null; } } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8799) TestAccessController#testBulkLoad failing on trunk/0.95
[ https://issues.apache.org/jira/browse/HBASE-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8799: -- Resolution: Fixed Fix Version/s: 0.98.0 Assignee: Andrew Purtell (was: stack) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk and 0.95 branches. TestAccessController passes locally. TestAccessController#testBulkLoad failing on trunk/0.95 --- Key: HBASE-8799 URL: https://issues.apache.org/jira/browse/HBASE-8799 Project: HBase Issue Type: Bug Components: Coprocessors, security, test Affects Versions: 0.98.0, 0.95.2 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.0, 0.95.2 Attachments: 8799disableFailingTest.txt, 8799.patch, 8799.txt I've observed this in Jenkins reports and also while I was working on HBASE-8692, only on trunk/0.95, not on 0.94: {quote} Failed tests: testBulkLoad(org.apache.hadoop.hbase.security.access.TestAccessController): Expected action to pass for user 'rwuser' but was denied {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8876) Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test
[ https://issues.apache.org/jira/browse/HBASE-8876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701052#comment-13701052 ] Hamed Madani commented on HBASE-8876: - Looks great [~larsgeorge] Thanks for sharing. Addendum to HBASE-8774 Add BatchSize and Filter to Thrift2 - Add BatchSize Test --- Key: HBASE-8876 URL: https://issues.apache.org/jira/browse/HBASE-8876 Project: HBase Issue Type: Sub-task Components: Thrift Reporter: Lars George Assignee: Lars George Labels: thrift2 Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8876-0.94.patch, HBASE-8876.patch HBASE-8774 adds support for batching through large rows. A unit test was missing though, which is added here. Further cleanup as well, to test scan, scan with filter, and scan with batch size separately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8878) Backport TestAccessController changes on HBASE-8799 to 0.94
Andrew Purtell created HBASE-8878: - Summary: Backport TestAccessController changes on HBASE-8799 to 0.94 Key: HBASE-8878 URL: https://issues.apache.org/jira/browse/HBASE-8878 Project: HBase Issue Type: Test Components: security, test Affects Versions: 0.94.10 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor The permissions granted to the USER_CREATE test user were changed on HBASE-8799. We should make the same changes to TestAccessController in 0.94 to keep the differences in access controller tests to a minimum between the branches. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8878) Backport TestAccessController changes on HBASE-8799 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8878: -- Attachment: 8799-0.94.patch Low risk test only change. TestAccessController passes locally with this patch applied. Backport TestAccessController changes on HBASE-8799 to 0.94 --- Key: HBASE-8878 URL: https://issues.apache.org/jira/browse/HBASE-8878 Project: HBase Issue Type: Test Components: security, test Affects Versions: 0.94.10 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: 8799-0.94.patch The permissions granted to the USER_CREATE test user were changed on HBASE-8799. We should make the same changes to TestAccessController in 0.94 to keep the differences in access controller tests to a minimum between the branches. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8878) Backport TestAccessController changes on HBASE-8799 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8878: -- Status: Patch Available (was: Open) Backport TestAccessController changes on HBASE-8799 to 0.94 --- Key: HBASE-8878 URL: https://issues.apache.org/jira/browse/HBASE-8878 Project: HBase Issue Type: Test Components: security, test Affects Versions: 0.94.10 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: 8799-0.94.patch The permissions granted to the USER_CREATE test user were changed on HBASE-8799. We should make the same changes to TestAccessController in 0.94 to keep the differences in access controller tests to a minimum between the branches. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8878) Backport TestAccessController changes on HBASE-8799 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701061#comment-13701061 ] Hadoop QA commented on HBASE-8878: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12591032/8799-0.94.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6225//console This message is automatically generated. Backport TestAccessController changes on HBASE-8799 to 0.94 --- Key: HBASE-8878 URL: https://issues.apache.org/jira/browse/HBASE-8878 Project: HBase Issue Type: Test Components: security, test Affects Versions: 0.94.10 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: 8799-0.94.patch The permissions granted to the USER_CREATE test user were changed on HBASE-8799. We should make the same changes to TestAccessController in 0.94 to keep the differences in access controller tests to a minimum between the branches. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4360) Maintain information on the time a RS went dead
[ https://issues.apache.org/jira/browse/HBASE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701070#comment-13701070 ] Hudson commented on HBASE-4360: --- Integrated in HBase-TRUNK #4218 (See [https://builds.apache.org/job/HBase-TRUNK/4218/]) HBASE-4360 Maintain information on the time a RS went dead - addendum (samar) (Revision 1500064) Result = FAILURE nkeywal : Files : * /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon Maintain information on the time a RS went dead --- Key: HBASE-4360 URL: https://issues.apache.org/jira/browse/HBASE-4360 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0 Reporter: Harsh J Assignee: samar Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: ds_hbase_multiple_server_test.png, ds_hbase.png, HBASE-4360_1.patch, HBASE-4360_2.patch, HBASE-4360_3.patch, HBASE-4360_4.patch, HBASE-4360_5.patch, master-status1.png Just something that'd be generally helpful, is to maintain DeadServer info with the last timestamp when it was determined as dead. Makes it easier to hunt the logs, and I don't think its much too expensive to maintain (one additional update per dead determination). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
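The idea behind HBASE-4360 is cheap to model: one map update per dead-server determination. The sketch below is hypothetical (the DeadServers class and its method names are made up, not HBase's DeadServer class), but it shows the data structure the issue describes: server name mapped to the wall-clock time it was declared dead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of HBASE-4360's idea: remember when each region server
// was determined dead, so operators can correlate the timestamp with logs.
public class DeadServers {
    // Insertion-ordered map: server name -> time of death in epoch millis.
    private final Map<String, Long> deadServers = new LinkedHashMap<>();

    public void markDead(String serverName, long whenMillis) {
        // One update per death determination; cheap to maintain.
        deadServers.put(serverName, whenMillis);
    }

    public Long timeOfDeath(String serverName) {
        return deadServers.get(serverName); // null if never declared dead
    }

    public static void main(String[] args) {
        DeadServers ds = new DeadServers();
        ds.markDead("ip-10-137-7-67.ec2.internal,60020,1372939221819",
                    1372939300000L);
        System.out.println(
            ds.timeOfDeath("ip-10-137-7-67.ec2.internal,60020,1372939221819"));
    }
}
```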
[jira] [Commented] (HBASE-8814) Possible NPE in split if a region has empty store files.
[ https://issues.apache.org/jira/browse/HBASE-8814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701073#comment-13701073 ] Hudson commented on HBASE-8814: --- Integrated in HBase-0.94-security #194 (See [https://builds.apache.org/job/HBase-0.94-security/194/]) HBASE-8814 Possible NPE in split if a region has empty store files; REVERT AGAIN (2nd TIME) (Revision 1500084) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java Possible NPE in split if a region has empty store files. Key: HBASE-8814 URL: https://issues.apache.org/jira/browse/HBASE-8814 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.8 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8814_94.patch, HBASE-8814_94_v2.patch, HBASE-8814_trunk_addendum.patch, HBASE-8814_trunk.patch, HBASE-8814_v2.patch {code} 2013-06-27 14:12:54,472 INFO [RS:1;BLRY2R009039160:49833-splits-1372322572806] regionserver.SplitRequest(92): Running rollback/cleanup of failed split of testSplitShouldNotThrowNPEEvenARegionHasEmptySplitFiles,,1372322556662.276e00da1420119e2f91f3a4c4c41d78.; java.util.concurrent.ExecutionException: java.lang.NullPointerException java.io.IOException: java.util.concurrent.ExecutionException: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:602) at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:297) at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:466) at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:82) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:596) ... 6 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.splitStoreFile(HRegionFileSystem.java:539) at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFile(SplitTransaction.java:610) at org.apache.hadoop.hbase.regionserver.SplitTransaction.access$1(SplitTransaction.java:607) at org.apache.hadoop.hbase.regionserver.SplitTransaction$StoreFileSplitter.call(SplitTransaction.java:633) at org.apache.hadoop.hbase.regionserver.SplitTransaction$StoreFileSplitter.call(SplitTransaction.java:1) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) ... 3 more {code} If a storefile is empty(can be because of puts and deletes) then first and lastkey of the file will be empty. Then we will get first or last key as null. Then we will end up in NPE when we will check splitkey in the range or not. {code} if (top) { //check if larger than last key. 
KeyValue splitKey = KeyValue.createFirstOnRow(splitRow); byte[] lastKey = f.createReader().getLastKey(); if (f.getReader().getComparator().compare(splitKey.getBuffer(), splitKey.getKeyOffset(), splitKey.getKeyLength(), lastKey, 0, lastKey.length) > 0) { return null; } } else { //check if smaller than first key KeyValue splitKey = KeyValue.createLastOnRow(splitRow); byte[] firstKey = f.createReader().getFirstKey(); if (f.getReader().getComparator().compare(splitKey.getBuffer(), splitKey.getKeyOffset(), splitKey.getKeyLength(), firstKey, 0, firstKey.length) < 0) { return null; } } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8814) Possible NPE in split if a region has empty store files.
[ https://issues.apache.org/jira/browse/HBASE-8814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701078#comment-13701078 ] Hudson commented on HBASE-8814: --- Integrated in HBase-0.94 #1040 (See [https://builds.apache.org/job/HBase-0.94/1040/]) HBASE-8814 Possible NPE in split if a region has empty store files; REVERT AGAIN (2nd TIME) (Revision 1500084) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java Possible NPE in split if a region has empty store files. Key: HBASE-8814 URL: https://issues.apache.org/jira/browse/HBASE-8814 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.8 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8814_94.patch, HBASE-8814_94_v2.patch, HBASE-8814_trunk_addendum.patch, HBASE-8814_trunk.patch, HBASE-8814_v2.patch {code} 2013-06-27 14:12:54,472 INFO [RS:1;BLRY2R009039160:49833-splits-1372322572806] regionserver.SplitRequest(92): Running rollback/cleanup of failed split of testSplitShouldNotThrowNPEEvenARegionHasEmptySplitFiles,,1372322556662.276e00da1420119e2f91f3a4c4c41d78.; java.util.concurrent.ExecutionException: java.lang.NullPointerException java.io.IOException: java.util.concurrent.ExecutionException: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:602) at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:297) at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:466) at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:82) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:596) ... 6 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.splitStoreFile(HRegionFileSystem.java:539) at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFile(SplitTransaction.java:610) at org.apache.hadoop.hbase.regionserver.SplitTransaction.access$1(SplitTransaction.java:607) at org.apache.hadoop.hbase.regionserver.SplitTransaction$StoreFileSplitter.call(SplitTransaction.java:633) at org.apache.hadoop.hbase.regionserver.SplitTransaction$StoreFileSplitter.call(SplitTransaction.java:1) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) ... 3 more {code} If a storefile is empty(can be because of puts and deletes) then first and lastkey of the file will be empty. Then we will get first or last key as null. Then we will end up in NPE when we will check splitkey in the range or not. {code} if (top) { //check if larger than last key. 
KeyValue splitKey = KeyValue.createFirstOnRow(splitRow); byte[] lastKey = f.createReader().getLastKey(); if (f.getReader().getComparator().compare(splitKey.getBuffer(), splitKey.getKeyOffset(), splitKey.getKeyLength(), lastKey, 0, lastKey.length) > 0) { return null; } } else { //check if smaller than first key KeyValue splitKey = KeyValue.createLastOnRow(splitRow); byte[] firstKey = f.createReader().getFirstKey(); if (f.getReader().getComparator().compare(splitKey.getBuffer(), splitKey.getKeyOffset(), splitKey.getKeyLength(), firstKey, 0, firstKey.length) < 0) { return null; } } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4360) Maintain information on the time a RS went dead
[ https://issues.apache.org/jira/browse/HBASE-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701091#comment-13701091 ] Hudson commented on HBASE-4360: --- Integrated in hbase-0.95 #291 (See [https://builds.apache.org/job/hbase-0.95/291/]) HBASE-4360 Maintain information on the time a RS went dead - addendum (samar) (Revision 1500066) Result = SUCCESS nkeywal : Files : * /hbase/branches/0.95/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon Maintain information on the time a RS went dead --- Key: HBASE-4360 URL: https://issues.apache.org/jira/browse/HBASE-4360 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.94.0 Reporter: Harsh J Assignee: samar Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: ds_hbase_multiple_server_test.png, ds_hbase.png, HBASE-4360_1.patch, HBASE-4360_2.patch, HBASE-4360_3.patch, HBASE-4360_4.patch, HBASE-4360_5.patch, master-status1.png Just something that'd be generally helpful, is to maintain DeadServer info with the last timestamp when it was determined as dead. Makes it easier to hunt the logs, and I don't think its much too expensive to maintain (one additional update per dead determination). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8879) Client Scanner spams the logs if there are lots of scanners.
Elliott Clark created HBASE-8879: Summary: Client Scanner spams the logs if there are lots of scanners. Key: HBASE-8879 URL: https://issues.apache.org/jira/browse/HBASE-8879 Project: HBase Issue Type: Bug Reporter: Elliott Clark The log in client scanner should probably be on the trace level otherwise you end up with this: {code} 2013-07-05 12:41:12,501 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,502 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,503 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,506 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,507 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,508 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,509 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,509 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,510 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
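The fix the reporter suggests, dropping the per-scan message from DEBUG to TRACE, can be sketched with standard JDK logging. This is a hypothetical illustration using java.util.logging (HBase itself used commons-logging at the time), and the class and method names are made up; FINEST plays the role of TRACE. At default log levels the per-scan message is suppressed entirely.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: log the per-scan message at trace level (FINEST in
// java.util.logging), guarded so the string is not even built unless trace
// logging is explicitly enabled.
public class ScannerLogDemo {
    private static final Logger LOG =
        Logger.getLogger(ScannerLogDemo.class.getName());

    static void openScanner(String table, String startRow) {
        if (LOG.isLoggable(Level.FINEST)) { // guard skips string concatenation
            LOG.finest("Scan table=" + table + ", startRow=" + startRow);
        }
    }

    public static void main(String[] args) {
        // At the default level (INFO) none of these produce output,
        // unlike the DEBUG spam quoted in the issue.
        for (int i = 0; i < 1000; i++) {
            openScanner("IntegrationTestMTTR", "");
        }
        System.out.println("no-spam");
    }
}
```

The level guard matters for hot paths like scanner creation: without it, every call pays for the message construction even when the logger discards it.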
[jira] [Updated] (HBASE-8879) Client Scanner spams the logs if there are lots of scanners.
[ https://issues.apache.org/jira/browse/HBASE-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-8879: - Component/s: Client Client Scanner spams the logs if there are lots of scanners. Key: HBASE-8879 URL: https://issues.apache.org/jira/browse/HBASE-8879 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.98.0, 0.95.1 Reporter: Elliott Clark The log in client scanner should probably be on the trace level otherwise you end up with this: {code} 2013-07-05 12:41:12,501 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,502 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,503 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,506 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,507 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,508 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,509 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,509 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,510 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8879) Client Scanner spams the logs if there are lots of scanners.
[ https://issues.apache.org/jira/browse/HBASE-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-8879: - Affects Version/s: 0.98.0 0.95.1 Client Scanner spams the logs if there are lots of scanners. Key: HBASE-8879 URL: https://issues.apache.org/jira/browse/HBASE-8879 Project: HBase Issue Type: Bug Affects Versions: 0.98.0, 0.95.1 Reporter: Elliott Clark The log in client scanner should probably be on the trace level otherwise you end up with this: {code} 2013-07-05 12:41:12,501 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,502 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,503 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,506 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,507 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,508 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,509 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,509 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,510 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8879) Client Scanner spams the logs if there are lots of scanners.
[ https://issues.apache.org/jira/browse/HBASE-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-8879: - Labels: noob (was: ) Client Scanner spams the logs if there are lots of scanners. Key: HBASE-8879 URL: https://issues.apache.org/jira/browse/HBASE-8879 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.98.0, 0.95.1 Reporter: Elliott Clark Labels: noob The log in client scanner should probably be on the trace level otherwise you end up with this: {code} 2013-07-05 12:41:12,501 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,502 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,503 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,506 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,507 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,508 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,509 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,509 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= 2013-07-05 12:41:12,510 DEBUG [pool-48-thread-3] client.ClientScanner(97): Scan table=IntegrationTestMTTR, startRow= {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
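The fix Elliott suggests for HBASE-8879 (demoting the scan-open message from DEBUG to TRACE) can be sketched outside HBase with plain java.util.logging, where FINE plays the role of DEBUG and FINEST the role of TRACE: under a typical DEBUG-level configuration, a message demoted to the finer level is simply never emitted. The `logScanOpen` helper and logger name below are illustrative, not the actual ClientScanner code.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class ScannerLogDemo {
    private static final Logger LOG = Logger.getLogger("client.ClientScanner");
    static { LOG.setLevel(Level.FINE); } // analogous to a cluster running with DEBUG enabled

    // Hypothetical stand-in for the log statement in ClientScanner's constructor;
    // returns whether the message would actually be emitted at the given level.
    static boolean logScanOpen(String levelName, String table) {
        Level level = Level.parse(levelName);
        if (!LOG.isLoggable(level)) {
            return false;            // cheap guard: the message string is never even built
        }
        LOG.log(level, "Scan table=" + table + ", startRow=");
        return true;
    }

    public static void main(String[] args) {
        // At DEBUG (FINE) every scanner open logs, producing the spam above...
        System.out.println(logScanOpen("FINE", "IntegrationTestMTTR"));
        // ...demoted to TRACE (FINEST), the same call is suppressed under the same config.
        System.out.println(logScanOpen("FINEST", "IntegrationTestMTTR"));
    }
}
```

The `isLoggable` guard also avoids building the message string for every scanner when the level is disabled, which matters when "lots of scanners" are opened.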
[jira] [Updated] (HBASE-8880) Integration Tests shouldn't set the number of retries.
[ https://issues.apache.org/jira/browse/HBASE-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-8880: - Component/s: test Client Integration Tests shouldn't set the number of retries. - Key: HBASE-8880 URL: https://issues.apache.org/jira/browse/HBASE-8880 Project: HBase Issue Type: Bug Components: Client, test Affects Versions: 0.98.0, 0.95.1 Reporter: Elliott Clark Setting the number of client retries should be a function of the environment, not of the test. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8880) Integration Tests shouldn't set the number of retries.
Elliott Clark created HBASE-8880: Summary: Integration Tests shouldn't set the number of retries. Key: HBASE-8880 URL: https://issues.apache.org/jira/browse/HBASE-8880 Project: HBase Issue Type: Bug Affects Versions: 0.95.1, 0.98.0 Reporter: Elliott Clark Setting the number of client retries should be a function of the environment, not of the test. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
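Elliott's point, that retry counts belong to the environment rather than the test body, is the usual Hadoop `Configuration` pattern: the test reads `hbase.client.retries.number` (the real HBase client key) with a default, and the site config or test runner overrides it. A minimal stand-in for `Configuration.getInt` is sketched below; the tiny `Conf` class and the default of 10 are illustrative, not the Hadoop implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class RetryConf {
    // Hypothetical miniature of org.apache.hadoop.conf.Configuration.getInt().
    static class Conf {
        private final Map<String, String> props = new HashMap<>();
        void set(String key, String value) { props.put(key, value); }
        int getInt(String key, int defaultValue) {
            String v = props.get(key);
            return v == null ? defaultValue : Integer.parseInt(v);
        }
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        // Nothing set: the test falls back to whatever default the environment ships.
        System.out.println(conf.getInt("hbase.client.retries.number", 10));
        // The environment (site file, test runner, CI harness) decides, not the test.
        conf.set("hbase.client.retries.number", "5");
        System.out.println(conf.getInt("hbase.client.retries.number", 5));
    }
}
```

An integration test written this way runs unchanged against a flaky EC2 cluster (high retries) and a local minicluster (low retries), which is exactly the separation the issue asks for.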
[jira] [Commented] (HBASE-8809) Include deletes in the scan (setRaw) method does not respect the time range or the filter
[ https://issues.apache.org/jira/browse/HBASE-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701107#comment-13701107 ] Jesse Yates commented on HBASE-8809: As a slight follow-up to this, it feels like raw scans should also ignore the column version/timestamp filtering. In particular, I'm talking about this section in ScanQueryMatcher: {code} MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength, timestamp, type, kv.getMemstoreTS() > maxReadPointToTrackVersions); /* * According to current implementation, colChecker can only be * SEEK_NEXT_COL, SEEK_NEXT_ROW, SKIP or INCLUDE. Therefore, always return * the MatchCode. If it is SEEK_NEXT_ROW, also set stickyNextRow. */ ... {code} Where the ScanWildcardColumnTracker will not ignore the timestamp in the simple case - four puts to the same row with different timestamps will ignore the oldest by default, even though it's still present in the store regardless of the rawness of the scan. Thoughts? Include deletes in the scan (setRaw) method does not respect the time range or the filter - Key: HBASE-8809 URL: https://issues.apache.org/jira/browse/HBASE-8809 Project: HBase Issue Type: Bug Components: Scanners Reporter: Vasu Mariyala Assignee: Lars Hofhansl Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8809-0.94.txt, 8809-trunk.txt, DeleteMarkers.doc If a row has been deleted at time stamp 'T' and a scan with time range (0, T-1) is executed, it still returns the delete marker at time stamp 'T'. It is because of the code in ScanQueryMatcher.java {code} if (retainDeletesInOutput || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes) || kv.getMemstoreTS() > maxReadPointToTrackVersions) { // always include or it is not time yet to check whether it is OK // to purge deletes or not return MatchCode.INCLUDE; } {code} The assumption is that a scan (even with setRaw set to true) should respect the filters and the time range specified.
Please let me know if you think this behavior can be changed so that I can provide a patch for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-8809) Include deletes in the scan (setRaw) method does not respect the time range or the filter
[ https://issues.apache.org/jira/browse/HBASE-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701107#comment-13701107 ] Jesse Yates edited comment on HBASE-8809 at 7/5/13 8:39 PM: As a slight follow-up to this, it feels like raw scans should also ignore the column version/timestamp filtering. In particular, I'm talking about this section in ScanQueryMatcher: {code} MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength, timestamp, type, kv.getMemstoreTS() > maxReadPointToTrackVersions); /* * According to current implementation, colChecker can only be * SEEK_NEXT_COL, SEEK_NEXT_ROW, SKIP or INCLUDE. Therefore, always return * the MatchCode. If it is SEEK_NEXT_ROW, also set stickyNextRow. */ ... {code} Where the ScanWildcardColumnTracker will not ignore the timestamp in the simple case - four (since the default is to keep 3 versions) puts to the same row with increasing timestamps will ignore the first by default, even though it's still present in the store regardless of the rawness of the scan. Thoughts? was (Author: jesse_yates): As a slight follow-up to this, it feels like raw scans should also ignore the column version/timestamp filtering. In particular, I'm talking about this section in ScanQueryMatcher: {code} MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength, timestamp, type, kv.getMemstoreTS() > maxReadPointToTrackVersions); /* * According to current implementation, colChecker can only be * SEEK_NEXT_COL, SEEK_NEXT_ROW, SKIP or INCLUDE. Therefore, always return * the MatchCode. If it is SEEK_NEXT_ROW, also set stickyNextRow. */ ... {code} Where the ScanWildcardColumnTracker will not ignore the timestamp in the simple case - four puts to the same row with different timestamps will ignore the oldest by default, even though it's still present in the store regardless of the rawness of the scan. Thoughts?
Include deletes in the scan (setRaw) method does not respect the time range or the filter - Key: HBASE-8809 URL: https://issues.apache.org/jira/browse/HBASE-8809 Project: HBase Issue Type: Bug Components: Scanners Reporter: Vasu Mariyala Assignee: Lars Hofhansl Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8809-0.94.txt, 8809-trunk.txt, DeleteMarkers.doc If a row has been deleted at time stamp 'T' and a scan with time range (0, T-1) is executed, it still returns the delete marker at time stamp 'T'. It is because of the code in ScanQueryMatcher.java {code} if (retainDeletesInOutput || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes) || kv.getMemstoreTS() > maxReadPointToTrackVersions) { // always include or it is not time yet to check whether it is OK // to purge deletes or not return MatchCode.INCLUDE; } {code} The assumption is that a scan (even with setRaw set to true) should respect the filters and the time range specified. Please let me know if you think this behavior can be changed so that I can provide a patch for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8809) Include deletes in the scan (setRaw) method does not respect the time range or the filter
[ https://issues.apache.org/jira/browse/HBASE-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-8809: --- Attachment: hbase-8809-0.94-addendum-example.patch Attaching a small example patch for what I'm talking about with ignoring versions for a column on a raw scan (in 0.94; should be the same for 0.95/6 and trunk). Include deletes in the scan (setRaw) method does not respect the time range or the filter - Key: HBASE-8809 URL: https://issues.apache.org/jira/browse/HBASE-8809 Project: HBase Issue Type: Bug Components: Scanners Reporter: Vasu Mariyala Assignee: Lars Hofhansl Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8809-0.94.txt, 8809-trunk.txt, DeleteMarkers.doc, hbase-8809-0.94-addendum-example.patch If a row has been deleted at time stamp 'T' and a scan with time range (0, T-1) is executed, it still returns the delete marker at time stamp 'T'. It is because of the code in ScanQueryMatcher.java {code} if (retainDeletesInOutput || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes) || kv.getMemstoreTS() > maxReadPointToTrackVersions) { // always include or it is not time yet to check whether it is OK // to purge deletes or not return MatchCode.INCLUDE; } {code} The assumption is that a scan (even with setRaw set to true) should respect the filters and the time range specified. Please let me know if you think this behavior can be changed so that I can provide a patch for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8753) Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp
[ https://issues.apache.org/jira/browse/HBASE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701113#comment-13701113 ] Lars Hofhansl commented on HBASE-8753: -- Looked at v2. Three comments: 1. {code} + } else if (type == KeyValue.Type.DeleteFamilyVersion.getCode()) { +if (familyVersionStamps.isEmpty()) { + familyVersionStamps.add(timestamp); +} else { + long minTimeStamp = familyVersionStamps.first(); + assert timestamp <= minTimeStamp : "deleteFamilyStamp " + minTimeStamp + " followed by a bigger one " + timestamp; + + // remove duplication (ignore deleteFamilyVersion with same timestamp) + if (timestamp < minTimeStamp) { +familyVersionStamps.add(timestamp); + } +} +return; {code} This all seems overkill just to check the correct sort order. Can just do {code} + } else if (type == KeyValue.Type.DeleteFamilyVersion.getCode()) { +familyVersionStamps.add(timestamp); +return; {code} 2. Also, if there is a normal family delete marker with a timestamp newer than the family version marker, we do not need to store the version delete marker at all (as the row is already targeted for delete). 3. Lastly, this is the only delete marker type for which multiple ones need to be kept in memory during a scan... There can never be more than one family, column, or version delete marker that needs to be tracked, but for the family version marker we need to potentially track arbitrarily many. That *is* a concern.
Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp --- Key: HBASE-8753 URL: https://issues.apache.org/jira/browse/HBASE-8753 Project: HBase Issue Type: New Feature Components: Deletes, Scanners Affects Versions: 0.95.1 Reporter: Feng Honghua Assignee: Feng Honghua Attachments: 8753-trunk-V2.patch, HBASE-8753-0.94-V0.patch, HBASE-8753-trunk-V0.patch, HBASE-8753-trunk-V1.patch In one of our production scenario (Xiaomi message search), multiple cells will be put in batch using a same timestamp with different column names under a specific column-family. And after some time these cells also need to be deleted in batch by given a specific timestamp. But the column names are parsed tokens which can be arbitrary words , so such batch delete is impossible without first retrieving all KVs from that CF and get the column name list which has KV with that given timestamp, and then issuing individual deleteColumn for each column in that column-list. Though it's possible to do such batch delete, its performance is poor, and customers also find their code is quite clumsy by first retrieving and populating the column list and then issuing a deleteColumn for each column in that column-list. This feature resolves this problem by introducing a new delete flag: DeleteFamilyVersion. 1). When you need to delete all KVs under a column-family with a given timestamp, just call Delete.deleteFamilyVersion(cfName, timestamp); only a DeleteFamilyVersion type KV is put to HBase (like DeleteFamily / DeleteColumn / Delete) without read operation; 2). Like other delete types, DeleteFamilyVersion takes effect in get/scan/flush/compact operations, the ScanDeleteTracker now parses out and uses DeleteFamilyVersion to prevent all KVs under the specific CF which has the same timestamp as the DeleteFamilyVersion KV to pop-up as part of a get/scan result (also in flush/compact). 
Our customers find this feature efficient, clean and easy-to-use since it does its work without knowing the exact column names list that needs to be deleted. This feature has been running smoothly for a couple of months in our production clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
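Lars's first comment (a plain add() into the sorted set suffices) and his third concern (the set can grow without bound during a scan) are both visible in a small stand-alone model of the tracker: a TreeSet of DeleteFamilyVersion timestamps, consulted for every cell under the family. This is a sketch of the semantics only, with illustrative names, not the actual ScanDeleteTracker code.

```java
import java.util.TreeSet;

public class FamilyVersionDeletes {
    // Sorted set of timestamps covered by DeleteFamilyVersion markers.
    // Per Lars's comment 1, a plain add() suffices: TreeSet already
    // de-duplicates equal timestamps, so no ordering assertions are needed.
    private final TreeSet<Long> familyVersionStamps = new TreeSet<>();

    void addDeleteFamilyVersion(long timestamp) {
        familyVersionStamps.add(timestamp);
    }

    // A cell under this family is masked iff a marker with exactly
    // its timestamp exists -- the point of the new delete type.
    boolean isDeleted(long cellTimestamp) {
        return familyVersionStamps.contains(cellTimestamp);
    }

    // Lars's comment 3: unlike the other delete marker types, memory here
    // grows with the number of distinct marker timestamps seen in the row.
    int trackedMarkers() {
        return familyVersionStamps.size();
    }
}
```

Usage: after `addDeleteFamilyVersion(cf, T)`, every cell in the family written at exactly timestamp T is filtered from get/scan results, while cells at other timestamps survive.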
[jira] [Commented] (HBASE-1936) ClassLoader that loads from hdfs; useful adding filters to classpath without having to restart services
[ https://issues.apache.org/jira/browse/HBASE-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701119#comment-13701119 ] stack commented on HBASE-1936: -- [~jxiang] Hey Jimmy, what is this: {code} + private static final String WRITABLE_GET = + "AgD//wEBD3Rlc3QuTW9ja0ZpbHRlcgEAAH//AQAA"; {code} It seems to be a class named test.MockFilter. When Get goes to read into an instance of this class, it is failing with an AbstractMethodException in 0.94 when it looks like you were expecting a fail on can't find class test.MockFilter... See https://builds.apache.org/job/HBase-0.94/1040/testReport/org.apache.hadoop.hbase.client/TestGet/testDynamicFilter/ It has been happening over the last few builds both in 0.94 and in security. I'm not sure what changed and it is hard to track (it was failing locally regardless of what version of 0.94 I checked out and now it is passing again -- as though it were a library issue or something stuck in my repo). Let me add catching this exception for now so tests pass again... HBASE-8881 ClassLoader that loads from hdfs; useful adding filters to classpath without having to restart services --- Key: HBASE-1936 URL: https://issues.apache.org/jira/browse/HBASE-1936 Project: HBase Issue Type: New Feature Reporter: stack Assignee: Jimmy Xiang Labels: noob Fix For: 0.98.0, 0.94.7, 0.95.1 Attachments: 0.94-1936.patch, cp_from_hdfs.patch, HBASE-1936-trunk(forReview).patch, trunk-1936.patch, trunk-1936_v2.1.patch, trunk-1936_v2.2.patch, trunk-1936_v2.patch, trunk-1936_v3.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8809) Include deletes in the scan (setRaw) method does not respect the time range or the filter
[ https://issues.apache.org/jira/browse/HBASE-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701118#comment-13701118 ] Lars Hofhansl commented on HBASE-8809: -- Heh... I just had exactly the same thought! But I would suggest a different fix: {code} -int maxVersions = Math.min(scan.getMaxVersions(), scanInfo.getMaxVersions()); +int maxVersions = scan.isRaw() ? scan.getMaxVersions() : Math.min(scan.getMaxVersions(), +scanInfo.getMaxVersions()); {code} That way the number of versions is controlled by the scan, similar to how we do time ranges. Include deletes in the scan (setRaw) method does not respect the time range or the filter - Key: HBASE-8809 URL: https://issues.apache.org/jira/browse/HBASE-8809 Project: HBase Issue Type: Bug Components: Scanners Reporter: Vasu Mariyala Assignee: Lars Hofhansl Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8809-0.94.txt, 8809-trunk.txt, DeleteMarkers.doc, hbase-8809-0.94-addendum-example.patch If a row has been deleted at time stamp 'T' and a scan with time range (0, T-1) is executed, it still returns the delete marker at time stamp 'T'. It is because of the code in ScanQueryMatcher.java {code} if (retainDeletesInOutput || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes) || kv.getMemstoreTS() > maxReadPointToTrackVersions) { // always include or it is not time yet to check whether it is OK // to purge deletes or not return MatchCode.INCLUDE; } {code} The assumption is that a scan (even with setRaw set to true) should respect the filters and the time range specified. Please let me know if you think this behavior can be changed so that I can provide a patch for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
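Lars's one-line fix is easy to model: the effective version limit is the scan's own limit when the scan is raw, and otherwise the minimum of the scan's limit and the column family's configured limit. A stand-alone sketch of that expression (the method name is ours for illustration, not an HBase API):

```java
public class RawScanVersions {
    // Sketch of the proposed expression; effectiveMaxVersions is an
    // illustrative name, not a method that exists in HBase.
    static int effectiveMaxVersions(boolean isRaw, int scanMaxVersions, int familyMaxVersions) {
        return isRaw
            ? scanMaxVersions                                   // raw: the scan alone decides
            : Math.min(scanMaxVersions, familyMaxVersions);     // normal: family cap still applies
    }

    public static void main(String[] args) {
        // Family keeps 3 versions; a raw scan asking for all versions sees them all...
        System.out.println(effectiveMaxVersions(true, Integer.MAX_VALUE, 3));
        // ...while a normal scan remains capped by the family setting.
        System.out.println(effectiveMaxVersions(false, Integer.MAX_VALUE, 3));
    }
}
```

This addresses Jesse's four-puts example: with the family at 3 versions, a raw scan requesting all versions now returns the fourth (oldest) put as well, while normal scans behave exactly as before.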
[jira] [Created] (HBASE-8881) TestGet failing in testDynamicFilter with AbstractMethodException
stack created HBASE-8881: Summary: TestGet failing in testDynamicFilter with AbstractMethodException Key: HBASE-8881 URL: https://issues.apache.org/jira/browse/HBASE-8881 Project: HBase Issue Type: Bug Reporter: stack See https://builds.apache.org/job/HBase-0.94/1040/testReport/org.apache.hadoop.hbase.client/TestGet/testDynamicFilter/ It has been happening in the last set of builds. It does not seem related to the checkin it started happening on. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8881) TestGet failing in testDynamicFilter with AbstractMethodException
[ https://issues.apache.org/jira/browse/HBASE-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8881: - Attachment: 8881.txt Catch the abstractmethodexception and keep going... TestGet failing in testDynamicFilter with AbstractMethodException - Key: HBASE-8881 URL: https://issues.apache.org/jira/browse/HBASE-8881 Project: HBase Issue Type: Bug Reporter: stack Attachments: 8881.txt See https://builds.apache.org/job/HBase-0.94/1040/testReport/org.apache.hadoop.hbase.client/TestGet/testDynamicFilter/ It has been happening in the last set of builds. It does not seem related to the checkin it started happening on. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8881) TestGet failing in testDynamicFilter with AbstractMethodException
[ https://issues.apache.org/jira/browse/HBASE-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8881: - Affects Version/s: 0.94.9 Assignee: stack TestGet failing in testDynamicFilter with AbstractMethodException - Key: HBASE-8881 URL: https://issues.apache.org/jira/browse/HBASE-8881 Project: HBase Issue Type: Bug Affects Versions: 0.94.9 Reporter: stack Assignee: stack Attachments: 8881.txt See https://builds.apache.org/job/HBase-0.94/1040/testReport/org.apache.hadoop.hbase.client/TestGet/testDynamicFilter/ It has been happening in the last set of builds. It does not seem related to the checkin it started happening on. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8809) Include deletes in the scan (setRaw) method does not respect the time range or the filter
[ https://issues.apache.org/jira/browse/HBASE-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701121#comment-13701121 ] Lars Hofhansl commented on HBASE-8809: -- If that is acceptable I will commit my fix to all branches under this jira. Include deletes in the scan (setRaw) method does not respect the time range or the filter - Key: HBASE-8809 URL: https://issues.apache.org/jira/browse/HBASE-8809 Project: HBase Issue Type: Bug Components: Scanners Reporter: Vasu Mariyala Assignee: Lars Hofhansl Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8809-0.94.txt, 8809-trunk.txt, DeleteMarkers.doc, hbase-8809-0.94-addendum-example.patch If a row has been deleted at time stamp 'T' and a scan with time range (0, T-1) is executed, it still returns the delete marker at time stamp 'T'. It is because of the code in ScanQueryMatcher.java {code} if (retainDeletesInOutput || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes) || kv.getMemstoreTS() > maxReadPointToTrackVersions) { // always include or it is not time yet to check whether it is OK // to purge deletes or not return MatchCode.INCLUDE; } {code} The assumption is that a scan (even with setRaw set to true) should respect the filters and the time range specified. Please let me know if you think this behavior can be changed so that I can provide a patch for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8814) Possible NPE in split if a region has empty store files.
[ https://issues.apache.org/jira/browse/HBASE-8814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701122#comment-13701122 ] stack commented on HBASE-8814: -- I put this patch back into the 0.94 branch. It is not responsible for the strange serialization issue I was accrediting it with above. Possible NPE in split if a region has empty store files. Key: HBASE-8814 URL: https://issues.apache.org/jira/browse/HBASE-8814 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.8 Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: HBASE-8814_94.patch, HBASE-8814_94_v2.patch, HBASE-8814_trunk_addendum.patch, HBASE-8814_trunk.patch, HBASE-8814_v2.patch {code} 2013-06-27 14:12:54,472 INFO [RS:1;BLRY2R009039160:49833-splits-1372322572806] regionserver.SplitRequest(92): Running rollback/cleanup of failed split of testSplitShouldNotThrowNPEEvenARegionHasEmptySplitFiles,,1372322556662.276e00da1420119e2f91f3a4c4c41d78.; java.util.concurrent.ExecutionException: java.lang.NullPointerException java.io.IOException: java.util.concurrent.ExecutionException: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:602) at org.apache.hadoop.hbase.regionserver.SplitTransaction.createDaughters(SplitTransaction.java:297) at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:466) at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:82) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at 
org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFiles(SplitTransaction.java:596) ... 6 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.splitStoreFile(HRegionFileSystem.java:539) at org.apache.hadoop.hbase.regionserver.SplitTransaction.splitStoreFile(SplitTransaction.java:610) at org.apache.hadoop.hbase.regionserver.SplitTransaction.access$1(SplitTransaction.java:607) at org.apache.hadoop.hbase.regionserver.SplitTransaction$StoreFileSplitter.call(SplitTransaction.java:633) at org.apache.hadoop.hbase.regionserver.SplitTransaction$StoreFileSplitter.call(SplitTransaction.java:1) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) ... 3 more {code} If a storefile is empty (which can happen because of puts followed by deletes), then the file has no first or last key, so getFirstKey()/getLastKey() return null and we end up with an NPE when we check whether the split key is in the file's range: {code} if (top) { //check if larger than last key. KeyValue splitKey = KeyValue.createFirstOnRow(splitRow); byte[] lastKey = f.createReader().getLastKey(); if (f.getReader().getComparator().compare(splitKey.getBuffer(), splitKey.getKeyOffset(), splitKey.getKeyLength(), lastKey, 0, lastKey.length) > 0) { return null; } } else { //check if smaller than first key KeyValue splitKey = KeyValue.createLastOnRow(splitRow); byte[] firstKey = f.createReader().getFirstKey(); if (f.getReader().getComparator().compare(splitKey.getBuffer(), splitKey.getKeyOffset(), splitKey.getKeyLength(), firstKey, 0, firstKey.length) < 0) { return null; } } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
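The NPE comes from dereferencing `lastKey.length` (or `firstKey.length`) when the empty store file yields a null key, so the natural fix is a null guard before the comparison. A stand-alone sketch with plain byte arrays follows; the method name and the lexicographic comparator are simplified stand-ins for the HRegionFileSystem code, not the actual patch.

```java
import java.util.Arrays;

public class SplitRangeCheck {
    // Simplified stand-in for the top-half check in splitStoreFile():
    // returns false ("no reference file needed") when the store file has
    // no keys, or when the split key lies beyond the file's last key.
    static boolean topHalfHasData(byte[] splitKey, byte[] lastKey) {
        if (lastKey == null) {
            return false;   // empty store file: nothing to split, and no NPE
        }
        // Lexicographic stand-in for f.getReader().getComparator().compare(...)
        return Arrays.compare(splitKey, lastKey) <= 0;
    }

    public static void main(String[] args) {
        byte[] split = {0x20};
        System.out.println(topHalfHasData(split, null));             // guarded: false, no NPE
        System.out.println(topHalfHasData(split, new byte[]{0x30})); // split key within range
    }
}
```

The bottom-half check is symmetric: guard `firstKey == null` the same way, then compare the split key against the first key with `>= 0`.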
[jira] [Commented] (HBASE-8799) TestAccessController#testBulkLoad failing on trunk/0.95
[ https://issues.apache.org/jira/browse/HBASE-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701124#comment-13701124 ] Hudson commented on HBASE-8799: --- Integrated in HBase-TRUNK #4219 (See [https://builds.apache.org/job/HBase-TRUNK/4219/]) HBASE-8799. TestAccessController#testBulkLoad failing on trunk/0.95 (Revision 1500088) Result = SUCCESS apurtell : Files : * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java TestAccessController#testBulkLoad failing on trunk/0.95 --- Key: HBASE-8799 URL: https://issues.apache.org/jira/browse/HBASE-8799 Project: HBase Issue Type: Bug Components: Coprocessors, security, test Affects Versions: 0.98.0, 0.95.2 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.98.0, 0.95.2 Attachments: 8799disableFailingTest.txt, 8799.patch, 8799.txt I've observed this in Jenkins reports and also while I was working on HBASE-8692, only on trunk/0.95, not on 0.94: {quote} Failed tests: testBulkLoad(org.apache.hadoop.hbase.security.access.TestAccessController): Expected action to pass for user 'rwuser' but was denied {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8809) Include deletes in the scan (setRaw) method does not respect the time range or the filter
[ https://issues.apache.org/jira/browse/HBASE-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701125#comment-13701125 ] Jesse Yates commented on HBASE-8809: +1 [~lhofhansl] LGTM (and much cleaner than my solution :) Include deletes in the scan (setRaw) method does not respect the time range or the filter - Key: HBASE-8809 URL: https://issues.apache.org/jira/browse/HBASE-8809 Project: HBase Issue Type: Bug Components: Scanners Reporter: Vasu Mariyala Assignee: Lars Hofhansl Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8809-0.94.txt, 8809-trunk.txt, DeleteMarkers.doc, hbase-8809-0.94-addendum-example.patch If a row has been deleted at time stamp 'T' and a scan with time range (0, T-1) is executed, it still returns the delete marker at time stamp 'T'. It is because of the code in ScanQueryMatcher.java {code} if (retainDeletesInOutput || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes) || kv.getMemstoreTS() > maxReadPointToTrackVersions) { // always include or it is not time yet to check whether it is OK // to purge deletes or not return MatchCode.INCLUDE; } {code} The assumption is that a scan (even with setRaw set to true) should respect the filters and the time range specified. Please let me know if you think this behavior can be changed so that I can provide a patch for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8809) Include deletes in the scan (setRaw) method does not respect the time range or the filter
[ https://issues.apache.org/jira/browse/HBASE-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701126#comment-13701126 ] Jesse Yates commented on HBASE-8809: Mind putting in a test for that too? Include deletes in the scan (setRaw) method does not respect the time range or the filter - Key: HBASE-8809 URL: https://issues.apache.org/jira/browse/HBASE-8809 Project: HBase Issue Type: Bug Components: Scanners Reporter: Vasu Mariyala Assignee: Lars Hofhansl Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8809-0.94.txt, 8809-trunk.txt, DeleteMarkers.doc, hbase-8809-0.94-addendum-example.patch If a row has been deleted at time stamp 'T' and a scan with time range (0, T-1) is executed, it still returns the delete marker at time stamp 'T'. It is because of the code in ScanQueryMatcher.java {code} if (retainDeletesInOutput || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes) || kv.getMemstoreTS() > maxReadPointToTrackVersions) { // always include or it is not time yet to check whether it is OK // to purge deletes or not return MatchCode.INCLUDE; } {code} The assumption is that a scan (even with setRaw set to true) should respect the filters and the time range specified. Please let me know if you think this behavior can be changed so that I can provide a patch for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira