[jira] Commented: (HBASE-3255) Allow Export tool to choose subset of rows
[ https://issues.apache.org/jira/browse/HBASE-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934436#action_12934436 ] Lars George commented on HBASE-3255: Hi Ted, Isn't this a dupe of your HBASE-2495? And for the change to Import, please create a new JIRA with the details of what you suggest. Lars
Allow Export tool to choose subset of rows -- Key: HBASE-3255 URL: https://issues.apache.org/jira/browse/HBASE-3255 Project: HBase Issue Type: Improvement Components: util Affects Versions: 0.20.6 Reporter: Ted Yu
org.apache.hadoop.hbase.mapreduce.Export should allow the user to specify a subset of rows. This capability would help develop a solution for problems that produce unwanted rows (e.g. in the .META. table) that must be deleted. One such case is https://issues.apache.org/jira/browse/HBASE-3251 We can export the dangling row(s) from .META., delete them, and later import the row(s) into (another) HBase instance.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3243) Disable Table closed region on wrong host
[ https://issues.apache.org/jira/browse/HBASE-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934535#action_12934535 ] Jonathan Gray commented on HBASE-3243: Well, try running again with my patch. Or you could even run it again without the patch to see if it happens again, and we could get another set of logs. I guess run it with the patch, and then if it never happens again we can punt the issue or resolve it until we see it again.
Disable Table closed region on wrong host - Key: HBASE-3243 URL: https://issues.apache.org/jira/browse/HBASE-3243 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.0 Reporter: Todd Lipcon Priority: Blocker Fix For: 0.90.0 Attachments: hbase-3243-logs.tar.bz2, HBASE-3243-v1.patch
I ran some YCSB benchmarks which resulted in about 150 regions' worth of data overnight. Then I disabled the table, and the master for some reason closed one region on the wrong server. The server ignored this, but the region remained open on a different server, which later flipped out when it tried to flush due to hlog accumulation.
[jira] Created: (HBASE-3258) EOF when version file is empty
EOF when version file is empty -- Key: HBASE-3258 URL: https://issues.apache.org/jira/browse/HBASE-3258 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.0, 0.92.0
I somehow was able to get an empty hbase.version file on a test machine and when I start HBase I see:
{noformat}
starting master, logging to /data/jdcryans/git/hbase/bin/../logs/hbase-jdcryans-master-hbasedev.out
Exception in thread master-hbasedev:6 java.lang.NullPointerException
    at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:559)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:286)
{noformat}
And in the master's log:
{noformat}
2010-11-22 10:08:43,003 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.io.EOFException
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323)
    at java.io.DataInputStream.readUTF(DataInputStream.java:572)
    at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:151)
    at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:170)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:226)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:104)
    at org.apache.hadoop.hbase.master.MasterFileSystem.init(MasterFileSystem.java:89)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:337)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:273)
2010-11-22 10:08:43,006 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
{noformat}
I thought that this kind of issue was solved a long time ago, but somehow it's there again. I'll fix it by handling the EOF, and I'll also look at that ugly NPE.
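The EOF handling proposed in HBASE-3258 could look roughly like the sketch below. It is not the actual FSUtils code: the class name VersionFileReader, the method readVersion, and the choice to return null for an empty file are all assumptions made for illustration, and the version string "7" is arbitrary. It only relies on the fact, visible in the stack trace, that DataInputStream.readUTF throws EOFException on an empty stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper for illustration; not the real FSUtils implementation. */
final class VersionFileReader {
  private VersionFileReader() {}

  /**
   * Reads a version string written with DataOutputStream.writeUTF.
   * Returns null (an assumed convention) when the stream is empty or truncated.
   */
  static String readVersion(InputStream in) throws IOException {
    try (DataInputStream data = new DataInputStream(in)) {
      return data.readUTF();
    } catch (EOFException e) {
      // Empty or truncated version file: report "no version" instead of aborting.
      return null;
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buf)) {
      out.writeUTF("7"); // arbitrary example version
    }
    System.out.println(readVersion(new ByteArrayInputStream(buf.toByteArray())));
    System.out.println(readVersion(new ByteArrayInputStream(new byte[0])));
  }
}
```

Returning null lets the caller decide whether an empty file means "rewrite the version file" or "refuse to start", which keeps that policy out of the reader.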
[jira] Commented: (HBASE-3258) EOF when version file is empty
[ https://issues.apache.org/jira/browse/HBASE-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934547#action_12934547 ] Jean-Daniel Cryans commented on HBASE-3258: For the record, the reason is that I was testing 0.90 with 0.20-append, and since both are currently incompatible at the data transfer level (ugly ugly), the master is able to create the file but unable to write to it. This HDFS-724 situation is bad.
EOF when version file is empty -- Key: HBASE-3258 URL: https://issues.apache.org/jira/browse/HBASE-3258 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.0, 0.92.0
I somehow was able to get an empty hbase.version file on a test machine and when I start HBase I see:
{noformat}
starting master, logging to /data/jdcryans/git/hbase/bin/../logs/hbase-jdcryans-master-hbasedev.out
Exception in thread master-hbasedev:6 java.lang.NullPointerException
    at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:559)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:286)
{noformat}
And in the master's log:
{noformat}
2010-11-22 10:08:43,003 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.io.EOFException
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323)
    at java.io.DataInputStream.readUTF(DataInputStream.java:572)
    at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:151)
    at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:170)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:226)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:104)
    at org.apache.hadoop.hbase.master.MasterFileSystem.init(MasterFileSystem.java:89)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:337)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:273)
2010-11-22 10:08:43,006 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
{noformat}
I thought that this kind of issue was solved a long time ago, but somehow it's there again. I'll fix it by handling the EOF, and I'll also look at that ugly NPE.
[jira] Created: (HBASE-3259) Can't kill the region servers when they wait on the master or the cluster state znode
Can't kill the region servers when they wait on the master or the cluster state znode - Key: HBASE-3259 URL: https://issues.apache.org/jira/browse/HBASE-3259 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.0, 0.92.0
With a situation like HBASE-3258, it's easy to have the region servers stuck waiting for either the master or the cluster state znode, since the wait has no timeout. You have to kill -9 them to get them to shut down. This is very bad for usability.
[jira] Commented: (HBASE-3259) Can't kill the region servers when they wait on the master or the cluster state znode
[ https://issues.apache.org/jira/browse/HBASE-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934556#action_12934556 ] Jonathan Gray commented on HBASE-3259: Like you said, maybe this is bad for usability, not sure this is blocking or a bug. You want to make it so you can just 'kill' without -9? Or you want to add timeout on RS on startup? The former seems no different for usability. The latter might be okay but not sure it's expected behavior. What would the default timeout be?
Can't kill the region servers when they wait on the master or the cluster state znode - Key: HBASE-3259 URL: https://issues.apache.org/jira/browse/HBASE-3259 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.0, 0.92.0
With a situation like HBASE-3258, it's easy to have the region servers stuck waiting for either the master or the cluster state znode, since the wait has no timeout. You have to kill -9 them to get them to shut down. This is very bad for usability.
[jira] Commented: (HBASE-3259) Can't kill the region servers when they wait on the master or the cluster state znode
[ https://issues.apache.org/jira/browse/HBASE-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934561#action_12934561 ] Jean-Daniel Cryans commented on HBASE-3259:
bq. Like you said, maybe this is bad for usability, not sure this is blocking or a bug.
I foresee that a majority of our new users will hit this issue if they have any sort of trouble setting up their cluster, so I think this is a blocker.
bq. You want to make it so you can just 'kill' without -9?
Not just kill, but also hbase-daemon.sh stop regionserver, since it also hangs. Imagine a few machines in that state where you have to manually kill -9 every one of them.
bq. Or you want to add timeout on RS on startup?
A timeout on the blocking wait, with a retry until either the data is available or the region server is stopped. Something like 1 or 2 seconds. I'm currently writing the patch.
Can't kill the region servers when they wait on the master or the cluster state znode - Key: HBASE-3259 URL: https://issues.apache.org/jira/browse/HBASE-3259 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.0, 0.92.0
With a situation like HBASE-3258, it's easy to have the region servers stuck waiting for either the master or the cluster state znode, since the wait has no timeout. You have to kill -9 them to get them to shut down. This is very bad for usability.
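The "timeout plus retry until available or stopped" idea from the comment above can be sketched in plain Java as follows. This is not the HBASE-3259 patch: StoppableWait and awaitUnlessStopped are hypothetical names, a CountDownLatch stands in for "the master or cluster state znode became available", and an AtomicBoolean stands in for the region server's stopped flag.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical illustration of a bounded wait that re-checks a stop flag. */
final class StoppableWait {
  private StoppableWait() {}

  /**
   * Waits in slices of sliceMillis until the latch is released or stop is requested.
   * Returns true if the awaited data became available, false if stopped first.
   */
  static boolean awaitUnlessStopped(CountDownLatch available, AtomicBoolean stopped,
      long sliceMillis) throws InterruptedException {
    while (!stopped.get()) {
      // Block for at most one slice, then loop back to re-check the stop flag.
      if (available.await(sliceMillis, TimeUnit.MILLISECONDS)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    CountDownLatch masterAvailable = new CountDownLatch(1); // never released here
    AtomicBoolean stopped = new AtomicBoolean(false);
    // Simulate a stop request arriving while the master is still absent.
    new Thread(() -> stopped.set(true)).start();
    System.out.println(awaitUnlessStopped(masterAvailable, stopped, 50));
  }
}
```

With a slice of 1 or 2 seconds, as suggested in the comment, a plain kill or a stop script would be noticed within one slice rather than never.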
[jira] Commented: (HBASE-3258) EOF when version file is empty
[ https://issues.apache.org/jira/browse/HBASE-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934576#action_12934576 ] Todd Lipcon commented on HBASE-3258: When we create the .version file, we should create it in a tmp location and then move it into place. It's probably empty in the case that a server crashes between writing and closing.
EOF when version file is empty -- Key: HBASE-3258 URL: https://issues.apache.org/jira/browse/HBASE-3258 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.0, 0.92.0 Attachments: HBASE-3258.patch
I somehow was able to get an empty hbase.version file on a test machine and when I start HBase I see:
{noformat}
starting master, logging to /data/jdcryans/git/hbase/bin/../logs/hbase-jdcryans-master-hbasedev.out
Exception in thread master-hbasedev:6 java.lang.NullPointerException
    at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:559)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:286)
{noformat}
And in the master's log:
{noformat}
2010-11-22 10:08:43,003 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.io.EOFException
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323)
    at java.io.DataInputStream.readUTF(DataInputStream.java:572)
    at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:151)
    at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:170)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:226)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:104)
    at org.apache.hadoop.hbase.master.MasterFileSystem.init(MasterFileSystem.java:89)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:337)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:273)
2010-11-22 10:08:43,006 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
{noformat}
I thought that this kind of issue was solved a long time ago, but somehow it's there again. I'll fix it by handling the EOF, and I'll also look at that ugly NPE.
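Todd's suggestion in this thread (write the version file in a tmp location, then move it into place) can be illustrated on a local filesystem with java.nio.file. This is only an analogue: HBase writes through the Hadoop FileSystem API rather than java.nio, and the class AtomicVersionWriter, the ".tmp" suffix, and the version string "7" are assumptions for the sketch.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical local-filesystem analogue of "write to tmp, then move into place". */
final class AtomicVersionWriter {
  private AtomicVersionWriter() {}

  static void writeVersion(Path target, String version) throws IOException {
    // Write the complete content to a sibling temporary file first.
    Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
    try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(tmp))) {
      out.writeUTF(version);
    }
    // Only a fully written and closed file is ever renamed to the final name.
    Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("version-demo");
    Path target = dir.resolve("hbase.version");
    writeVersion(target, "7"); // arbitrary example version
    System.out.println(Files.size(target) > 0);
  }
}
```

A crash between writing and closing then leaves at most a stray temporary file, while readers of the final name see either no file or a complete one.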
[jira] Updated: (HBASE-3259) Can't kill the region servers when they wait on the master or the cluster state znode
[ https://issues.apache.org/jira/browse/HBASE-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-3259: Attachment: HBASE-3259.patch Small refactoring in HRS to handle the timeout and the check for stopped. I thought of doing it down in ZooKeeperNodeTracker, but I'm not sure we want that behavior everywhere.
Can't kill the region servers when they wait on the master or the cluster state znode - Key: HBASE-3259 URL: https://issues.apache.org/jira/browse/HBASE-3259 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.0, 0.92.0 Attachments: HBASE-3259.patch
With a situation like HBASE-3258, it's easy to have the region servers stuck waiting for either the master or the cluster state znode, since the wait has no timeout. You have to kill -9 them to get them to shut down. This is very bad for usability.
[jira] Commented: (HBASE-3256) Coprocessors: Coprocessor host and observer for HMaster
[ https://issues.apache.org/jira/browse/HBASE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934581#action_12934581 ] Gary Helmling commented on HBASE-3256: Some additional details on the code changes here:
# add a {{org.apache.hadoop.hbase.coprocessor.MasterObserver}} interface defining pre/post methods for createTable, deleteTable, modifyTable, addColumn, modifyColumn, deleteColumn, enable/disable, move, balance, and shutdown
# extract a common base class from the current {{org.apache.hadoop.hbase.regionserver.CoprocessorHost}} to {{org.apache.hadoop.hbase.coprocessor.CoprocessorHost}}
# rename the existing region-specific {{org.apache.hadoop.hbase.regionserver.CoprocessorHost}} to {{RegionCoprocessorHost}}
# add a new {{org.apache.hadoop.hbase.master.MasterCoprocessorHost}} for HMaster integration
# refactor the current {{org.apache.hadoop.hbase.coprocessor.CoprocessorEnvironment}} into a base interface with {{RegionCoprocessorEnvironment}} and {{MasterCoprocessorEnvironment}} extensions
Coprocessors: Coprocessor host and observer for HMaster --- Key: HBASE-3256 URL: https://issues.apache.org/jira/browse/HBASE-3256 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Gary Helmling Fix For: 0.92.0
Implement a coprocessor host for HMaster. Hook observers into administrative operations performed on tables: create, alter, assignment, load balance, and allow observers to modify base master behavior. Support automatic loading of coprocessor implementation. Consider refactoring the master coprocessor host and regionserver coprocessor host into a common base class.
[jira] Created: (HBASE-3260) Coprocessors: Lifecycle management
Coprocessors: Lifecycle management -- Key: HBASE-3260 URL: https://issues.apache.org/jira/browse/HBASE-3260 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell
[jira] Created: (HBASE-3261) NPE out of HRS.run at startup when clock is out of sync
NPE out of HRS.run at startup when clock is out of sync --- Key: HBASE-3261 URL: https://issues.apache.org/jira/browse/HBASE-3261 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Fix For: 0.90.0, 0.92.0
This is what I get when I start a region server that's not properly sync'ed:
{noformat}
Exception in thread regionserver60020 java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:603)
    at java.lang.Thread.run(Thread.java:637)
{noformat}
In this case the line was:
{noformat}
hlogRoller.interruptIfNecessary();
{noformat}
I guess we could add a bunch of other null checks. The end result is the same, the RS dies, but I think the NPE is misleading.
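The null checks discussed in HBASE-3261 amount to guarding each shutdown call against components that were never created because startup failed early. A minimal sketch follows; the Interruptible interface and the NullSafeShutdown helper are hypothetical stand-ins, not HRegionServer code, and only the method name interruptIfNecessary is taken from the report above.

```java
/** Hypothetical illustration of null-safe shutdown of optional components. */
final class NullSafeShutdown {
  private NullSafeShutdown() {}

  /** Stand-in for a helper thread such as the log roller. */
  interface Interruptible {
    void interruptIfNecessary();
  }

  /**
   * Interrupts the component only if it was actually created.
   * Returns true if the component existed and was interrupted.
   */
  static boolean interruptIfStarted(Interruptible component) {
    if (component == null) {
      // Startup aborted before this component was initialized; nothing to stop.
      return false;
    }
    component.interruptIfNecessary();
    return true;
  }

  public static void main(String[] args) {
    Interruptible hlogRoller = null; // never initialized, as when startup fails early
    System.out.println(interruptIfStarted(hlogRoller));
  }
}
```

The server still shuts down, but the log then shows the real startup failure rather than a NullPointerException from the cleanup path.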
[jira] Updated: (HBASE-3260) Coprocessors: Lifecycle management
[ https://issues.apache.org/jira/browse/HBASE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-3260:
Description: Considering extending CPs to the master, we have no equivalent to pre/postOpen and pre/postClose as on the regionserver. We also should consider how to resolve dependencies and initialization ordering if loading coprocessors that depend on others.
OSGi (http://en.wikipedia.org/wiki/OSGi) has a lifecycle API and is familiar to many Java programmers, so we propose to borrow its terminology and state machine.
A lifecycle layer manages coprocessors as they are dynamically installed, started, stopped, updated and uninstalled. Coprocessors rely on the framework for dependency resolution and class loading. In turn, the framework calls up to lifecycle management methods in the coprocessor as needed.
A coprocessor transitions between the below states over its lifetime:
||State||Description||
|UNINSTALLED|The coprocessor implementation is not installed. This is the default implicit state.|
|INSTALLED|The coprocessor implementation has been successfully installed.|
|STARTING|A coprocessor instance is being started.|
|ACTIVE|The coprocessor instance has been successfully activated and is running.|
|STOPPING|A coprocessor instance is being stopped.|
See attached state diagram. Transitions to STOPPING will only happen as the region is being closed. If a coprocessor throws an unhandled exception, this will cause the RegionServer to close the region, stopping all coprocessor instances on it.
Transitions from INSTALLED->STARTING and ACTIVE->STOPPING would go through upcall methods into the coprocessor via the CoprocessorLifecycle interface:
{code:java}
public interface CoprocessorLifecycle {
  void start(CoprocessorEnvironment env) throws IOException;
  void stop(CoprocessorEnvironment env) throws IOException;
}
{code}
Fix Version/s: 0.92.0
Coprocessors: Lifecycle management -- Key: HBASE-3260 URL: https://issues.apache.org/jira/browse/HBASE-3260 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Fix For: 0.92.0
Considering extending CPs to the master, we have no equivalent to pre/postOpen and pre/postClose as on the regionserver. We also should consider how to resolve dependencies and initialization ordering if loading coprocessors that depend on others.
OSGi (http://en.wikipedia.org/wiki/OSGi) has a lifecycle API and is familiar to many Java programmers, so we propose to borrow its terminology and state machine.
A lifecycle layer manages coprocessors as they are dynamically installed, started, stopped, updated and uninstalled. Coprocessors rely on the framework for dependency resolution and class loading. In turn, the framework calls up to lifecycle management methods in the coprocessor as needed.
A coprocessor transitions between the below states over its lifetime:
||State||Description||
|UNINSTALLED|The coprocessor implementation is not installed. This is the default implicit state.|
|INSTALLED|The coprocessor implementation has been successfully installed.|
|STARTING|A coprocessor instance is being started.|
|ACTIVE|The coprocessor instance has been successfully activated and is running.|
|STOPPING|A coprocessor instance is being stopped.|
See attached state diagram. Transitions to STOPPING will only happen as the region is being closed. If a coprocessor throws an unhandled exception, this will cause the RegionServer to close the region, stopping all coprocessor instances on it.
Transitions from INSTALLED->STARTING and ACTIVE->STOPPING would go through upcall methods into the coprocessor via the CoprocessorLifecycle interface:
{code:java}
public interface CoprocessorLifecycle {
  void start(CoprocessorEnvironment env) throws IOException;
  void stop(CoprocessorEnvironment env) throws IOException;
}
{code}
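The state table in HBASE-3260 can be sketched as a small Java enum with an explicit transition check. The five state names come from the description; the transition set is an assumption, since the attached state diagram is not reproduced here. Only INSTALLED to STARTING and ACTIVE to STOPPING are named in the text, and the remaining edges are modeled loosely on the OSGi lifecycle that the proposal says it borrows from. The names CoprocessorState, next, and canTransitionTo are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of the proposed lifecycle states; transitions are partly assumed. */
enum CoprocessorState {
  UNINSTALLED, INSTALLED, STARTING, ACTIVE, STOPPING;

  /** States reachable in one step from this state. */
  Set<CoprocessorState> next() {
    switch (this) {
      case UNINSTALLED:
        return EnumSet.of(INSTALLED);              // assumed
      case INSTALLED:
        return EnumSet.of(STARTING, UNINSTALLED);  // STARTING per the issue; UNINSTALLED assumed
      case STARTING:
        return EnumSet.of(ACTIVE);                 // assumed
      case ACTIVE:
        return EnumSet.of(STOPPING);               // per the issue text
      case STOPPING:
        return EnumSet.of(INSTALLED);              // assumed, by analogy with OSGi
      default:
        return EnumSet.noneOf(CoprocessorState.class);
    }
  }

  boolean canTransitionTo(CoprocessorState target) {
    return next().contains(target);
  }

  public static void main(String[] args) {
    System.out.println(INSTALLED.canTransitionTo(STARTING));
    System.out.println(UNINSTALLED.canTransitionTo(ACTIVE));
  }
}
```

Under this sketch, a host would invoke the coprocessor's start upcall on the INSTALLED to STARTING edge and its stop upcall on the ACTIVE to STOPPING edge, and reject any transition that canTransitionTo does not allow.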
[jira] Updated: (HBASE-3260) Coprocessors: Lifecycle management
[ https://issues.apache.org/jira/browse/HBASE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-3260: Attachment: statechart.png
Coprocessors: Lifecycle management -- Key: HBASE-3260 URL: https://issues.apache.org/jira/browse/HBASE-3260 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Fix For: 0.92.0 Attachments: statechart.png
Considering extending CPs to the master, we have no equivalent to pre/postOpen and pre/postClose as on the regionserver. We also should consider how to resolve dependencies and initialization ordering if loading coprocessors that depend on others.
OSGi (http://en.wikipedia.org/wiki/OSGi) has a lifecycle API and is familiar to many Java programmers, so we propose to borrow its terminology and state machine.
A lifecycle layer manages coprocessors as they are dynamically installed, started, stopped, updated and uninstalled. Coprocessors rely on the framework for dependency resolution and class loading. In turn, the framework calls up to lifecycle management methods in the coprocessor as needed.
A coprocessor transitions between the below states over its lifetime:
||State||Description||
|UNINSTALLED|The coprocessor implementation is not installed. This is the default implicit state.|
|INSTALLED|The coprocessor implementation has been successfully installed.|
|STARTING|A coprocessor instance is being started.|
|ACTIVE|The coprocessor instance has been successfully activated and is running.|
|STOPPING|A coprocessor instance is being stopped.|
See attached state diagram. Transitions to STOPPING will only happen as the region is being closed. If a coprocessor throws an unhandled exception, this will cause the RegionServer to close the region, stopping all coprocessor instances on it.
Transitions from INSTALLED->STARTING and ACTIVE->STOPPING would go through upcall methods into the coprocessor via the CoprocessorLifecycle interface:
{code:java}
public interface CoprocessorLifecycle {
  void start(CoprocessorEnvironment env) throws IOException;
  void stop(CoprocessorEnvironment env) throws IOException;
}
{code}
[jira] Commented: (HBASE-3261) NPE out of HRS.run at startup when clock is out of sync
[ https://issues.apache.org/jira/browse/HBASE-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934664#action_12934664 ] Jonathan Gray commented on HBASE-3261: Yeah, this is something I had to add a lot of checks for in HMaster as well. +1 on adding null checks before we stop/interrupt stuff.
NPE out of HRS.run at startup when clock is out of sync --- Key: HBASE-3261 URL: https://issues.apache.org/jira/browse/HBASE-3261 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Fix For: 0.90.0, 0.92.0
This is what I get when I start a region server that's not properly sync'ed:
{noformat}
Exception in thread regionserver60020 java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:603)
    at java.lang.Thread.run(Thread.java:637)
{noformat}
In this case the line was:
{noformat}
hlogRoller.interruptIfNecessary();
{noformat}
I guess we could add a bunch of other null checks. The end result is the same, the RS dies, but I think the NPE is misleading.
[jira] Commented: (HBASE-3227) Edit of log messages before branching...
[ https://issues.apache.org/jira/browse/HBASE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934688#action_12934688 ] HBase Review Board commented on HBASE-3227: Message from: Nicolas nspiegelb...@facebook.com
--- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1212/#review1971 ---
trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java http://review.cloudera.org/r/1212/#comment6227
I'd suggest keeping the store name in this debug message since we're considering thread pools for compactions...
- Nicolas
Edit of log messages before branching... Key: HBASE-3227 URL: https://issues.apache.org/jira/browse/HBASE-3227 Project: HBase Issue Type: Improvement Reporter: stack Fix For: 0.90.0
[jira] Updated: (HBASE-3262) TestHMasterRPCException uses non-ephemeral port for master
[ https://issues.apache.org/jira/browse/HBASE-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Gray updated HBASE-3262: Attachment: HBASE-3262-v1.patch Uses an ephemeral port for the master and cleans up unused imports that were causing warnings.
TestHMasterRPCException uses non-ephemeral port for master -- Key: HBASE-3262 URL: https://issues.apache.org/jira/browse/HBASE-3262 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.90.0 Attachments: HBASE-3262-v1.patch
TestHMasterRPCException instantiates an HMaster but doesn't use an ephemeral port, which can cause the test to fail if the port is already in use.
[jira] Created: (HBASE-3262) TestHMasterRPCException uses non-ephemeral port for master
TestHMasterRPCException uses non-ephemeral port for master -- Key: HBASE-3262 URL: https://issues.apache.org/jira/browse/HBASE-3262 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.90.0 Attachments: HBASE-3262-v1.patch
TestHMasterRPCException instantiates an HMaster but doesn't use an ephemeral port, which can cause the test to fail if the port is already in use.
[jira] Commented: (HBASE-3262) TestHMasterRPCException uses non-ephemeral port for master
[ https://issues.apache.org/jira/browse/HBASE-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934724#action_12934724 ] Jonathan Gray commented on HBASE-3262: Maybe we should push this setting of port 0 as the master/rs ports into the constructor of HBaseTestingUtility?
TestHMasterRPCException uses non-ephemeral port for master -- Key: HBASE-3262 URL: https://issues.apache.org/jira/browse/HBASE-3262 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Fix For: 0.90.0 Attachments: HBASE-3262-v1.patch
TestHMasterRPCException instantiates an HMaster but doesn't use an ephemeral port, which can cause the test to fail if the port is already in use.
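The "port 0" setting discussed in HBASE-3262 relies on standard socket behavior: binding to port 0 asks the operating system for any free ephemeral port. The sketch below shows only that mechanism with java.net.ServerSocket; the class EphemeralPortDemo is hypothetical, and how HBaseTestingUtility or the attached patch actually configures the master port is not shown here.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical illustration of letting the OS pick a free port by binding to port 0. */
final class EphemeralPortDemo {
  private EphemeralPortDemo() {}

  /** Binds to port 0 and returns the port the OS actually assigned. */
  static int bindToFreePort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }

  public static void main(String[] args) throws IOException {
    // The value differs from run to run, so no fixed output is expected.
    System.out.println(bindToFreePort());
  }
}
```

Because the port is chosen at bind time, two test runs on the same machine cannot collide on a hard-coded port number.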
[jira] Commented: (HBASE-2888) Review all our metrics
[ https://issues.apache.org/jira/browse/HBASE-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934763#action_12934763 ] Alex Baranau commented on HBASE-2888: As per the small discussion here: http://search-hadoop.com/m/DZcdNHsTOe2, here are some extra things we might want to expose:
1. Split stats. We have flush and compaction data (time spent and data amount) in JMX. Should we also add stats for split procedures, as they affect HBase behaviour too?
2. Flush/Compaction/Split rate. For flush and compaction we expose only time spent and data amount stats, but we might also want to show something like the operation rate (number of actions). Based on the flush/compaction/split rate one can judge whether some configuration is properly set (e.g. hbase.hregion.memstore.flush.size).
3. Events log. I also think it would be very useful for ops to be able to watch events (like splits, flushes, compactions) on a web interface or in JMX and know when they occur, i.e. an events log. Then one can go to a web page and see what might have caused performance degradation during a particular period of time. Currently we have to (and do) go to log files for that kind of info.
Review all our metrics -- Key: HBASE-2888 URL: https://issues.apache.org/jira/browse/HBASE-2888 Project: HBase Issue Type: Improvement Components: master Reporter: Jean-Daniel Cryans Fix For: 0.92.0
HBase publishes a bunch of metrics, some useful some wasteful, that should be improved to deliver a better ops experience. Examples:
- Block cache hit ratio converges at some point and stops moving
- fsReadLatency goes down when compactions are running
- storefileIndexSizeMB is the exact same number once a system is serving production load
We could use new metrics too.
[jira] Commented: (HBASE-2888) Review all our metrics
[ https://issues.apache.org/jira/browse/HBASE-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934765#action_12934765 ] Alex Baranau commented on HBASE-2888: In general, does it make sense to create a separate issue to review the way we expose particular metrics/stats? E.g. should we show a particular metric on a web interface or just put it into JMX, in what form (in the case of web), etc.?
Review all our metrics -- Key: HBASE-2888 URL: https://issues.apache.org/jira/browse/HBASE-2888 Project: HBase Issue Type: Improvement Components: master Reporter: Jean-Daniel Cryans Fix For: 0.92.0
HBase publishes a bunch of metrics, some useful some wasteful, that should be improved to deliver a better ops experience. Examples:
- Block cache hit ratio converges at some point and stops moving
- fsReadLatency goes down when compactions are running
- storefileIndexSizeMB is the exact same number once a system is serving production load
We could use new metrics too.