[jira] [Created] (HBASE-4669) Suggest to add a choice of using round-robin assignment on enable-table
Suggest to add a choice of using round-robin assignment on enable-table --- Key: HBASE-4669 URL: https://issues.apache.org/jira/browse/HBASE-4669 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.90.4, 0.94.0 Reporter: Jieshan Bean Priority: Minor Fix For: 0.94.0, 0.90.5 In some scenarios we use the disable/enable HTable feature, but currently enabling an HTable uses random assignment. We would like the regions to end up well distributed no matter how many regions and region servers there are, so I suggest adding an option to use round-robin assignment on enable-table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
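The requested behavior can be sketched in a few lines. This is a hypothetical helper, not the actual AssignmentManager code: round-robin deals regions across servers like cards, so every server ends up with at most one more region than any other, regardless of region and server counts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of round-robin region assignment, assuming simple
// String identifiers for regions and servers (not the real HBase API).
public class RoundRobinAssign {
    static Map<String, List<String>> assign(List<String> regions, List<String> servers) {
        Map<String, List<String>> plan = new HashMap<>();
        for (String s : servers) {
            plan.put(s, new ArrayList<>());
        }
        for (int i = 0; i < regions.size(); i++) {
            // Deal regions like cards: region i goes to server i mod N.
            plan.get(servers.get(i % servers.size())).add(regions.get(i));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> regions = List.of("r1", "r2", "r3", "r4", "r5");
        List<String> servers = List.of("rs1", "rs2");
        Map<String, List<String>> plan = assign(regions, servers);
        // 5 regions over 2 servers: sizes differ by at most one.
        System.out.println(plan.get("rs1").size() + " " + plan.get("rs2").size()); // prints "3 2"
    }
}
```

With random assignment the skew can be arbitrarily bad for small clusters; the round-robin plan above guarantees a max/min difference of one.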
[jira] [Updated] (HBASE-4670) Fix javadoc warnings
[ https://issues.apache.org/jira/browse/HBASE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4670: - Attachment: javadoc.txt Fixes for javadoc warnings. Fix javadoc warnings Key: HBASE-4670 URL: https://issues.apache.org/jira/browse/HBASE-4670 Project: HBase Issue Type: Bug Reporter: stack Fix For: 0.92.0 Attachments: javadoc.txt We have hundreds of javadoc warnings emitted on every build. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-4670) Fix javadoc warnings
[ https://issues.apache.org/jira/browse/HBASE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-4670. -- Resolution: Fixed Assignee: stack Committed to 0.92 branch and trunk. Neither have warnings now. Fix javadoc warnings Key: HBASE-4670 URL: https://issues.apache.org/jira/browse/HBASE-4670 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.92.0 Attachments: javadoc.txt We have hundreds of javadoc warnings emitted on every build. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4610) Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug)
[ https://issues.apache.org/jira/browse/HBASE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134798#comment-13134798 ] stack commented on HBASE-4610: -- Any update on this one? Port HBASE-3380 (Master failover can split logs of live servers) to 92/trunk (definitely bring in config params, decide if we need to do more to fix the bug) - Key: HBASE-4610 URL: https://issues.apache.org/jira/browse/HBASE-4610 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Over in HBASE-3380 we were having some TestMasterFailover flakiness. We added some more config parameters to better control the master startup loop where it waits for RS to heartbeat in. We had thought at the time that 92 would have a different solution but it is still relying on heartbeats to learn about RSs. For now, we should definitely bring these config params into 92/trunk. Otherwise this is an incompatible regression and adding these will also make things like what was just reported over in HBASE-4603 trivial to fix in an optimal way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4436) Remove methods deprecated in 0.90 from TRUNK and 0.92
[ https://issues.apache.org/jira/browse/HBASE-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134799#comment-13134799 ] stack commented on HBASE-4436: -- Would be good to get this into 0.92. You fellas are busy I hear, Jon. Remove methods deprecated in 0.90 from TRUNK and 0.92 - Key: HBASE-4436 URL: https://issues.apache.org/jira/browse/HBASE-4436 Project: HBase Issue Type: Task Reporter: stack Assignee: Jonathan Hsieh Priority: Critical Labels: noob Fix For: 0.92.0 Remove methods deprecated in 0.90 from the codebase. I took a quick look. The messy bit is thrift referring to old stuff; it will take a little work to do the conversion over to the new methods.
[jira] [Commented] (HBASE-4304) requestsPerSecond counter stuck at 0
[ https://issues.apache.org/jira/browse/HBASE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134801#comment-13134801 ] stack commented on HBASE-4304: -- @Li Any update on this one boss? requestsPerSecond counter stuck at 0 Key: HBASE-4304 URL: https://issues.apache.org/jira/browse/HBASE-4304 Project: HBase Issue Type: Bug Components: master, regionserver Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Li Pi Priority: Critical Fix For: 0.92.0 Running trunk @ r1163343, all of the requestsPerSecond counters are showing 0 both in the master UI and in the RS UI. The writeRequestsCount metric is properly updating in the RS UI. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4191) hbase load balancer needs locality awareness
[ https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134803#comment-13134803 ] Mikhail Bautin commented on HBASE-4191: --- @Ted: could you please elaborate on how you express the region assignment problem as a Max Flow problem? If we define the cost of assigning a region to a server based on locality, and define a constraint of load balancedness to be such that each regionserver is assigned no more than approximately ceil(numRegions / numServers) + C regions for some small value of C, then I can see how the problem becomes a min-cost max flow (http://en.wikipedia.org/wiki/Minimum_cost_flow_problem). However, I don't see how we could reduce the assignment problem to the max-flow problem directly (http://en.wikipedia.org/wiki/Maximum_flow_problem). hbase load balancer needs locality awareness Key: HBASE-4191 URL: https://issues.apache.org/jira/browse/HBASE-4191 Project: HBase Issue Type: New Feature Reporter: Ted Yu Assignee: Liyin Tang Previously, HBASE-4114 implemented the metrics for HFile HDFS block locality, which provide HFile-level locality information. But in order to work with the load balancer and region assignment, we need region-level locality information. Let's define the region locality information first; it is almost the same as the HFile locality index. HRegion locality index (HRegion A, RegionServer B) = (Total number of HDFS blocks that can be retrieved locally by RegionServer B for HRegion A) / (Total number of HDFS blocks for Region A) So the HRegion locality index tells us how much locality we can get if the HMaster assigns HRegion A to RegionServer B. There will be 2 steps involved in assigning regions based on locality. 1) During cluster start-up, the master will scan HDFS to calculate the HRegion locality index for each pair of HRegion and RegionServer. It is pretty expensive to scan the dfs, so we only need to do this once, at start-up time. 2) During cluster run time, each region server will update the HRegion locality index as metrics periodically, as HBASE-4114 did. The RegionServer can expose them to the Master through ZK, the meta table, or just RPC messages. Based on the HRegion locality index, the assignment manager in the master would have global knowledge about the region locality distribution. Treating the HRegion locality index as the capacity between the region server set and the region set, the assignment manager could then run a MAXIMUM FLOW solver to reach the global optimum. The master should also share this global view with the secondary master in case a master failover happens. In addition, HBASE-4491 (Locality Checker) is a tool, based on the same metrics, that proactively scans the dfs to calculate the global locality information in the cluster. It will help us verify data locality information at run time.
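The locality-index formula in the description above (local blocks divided by total blocks) is easy to sketch. The data model here is hypothetical: we assume we already know, for each HDFS block of a region, which hosts store a replica, which is not how the real HBase/HDFS API exposes it.

```java
import java.util.List;

// Sketch of the HRegion locality index defined above:
// locality(region, server) = blocks with a replica on `server` / total blocks.
public class LocalityIndex {
    // blockReplicaHosts: one entry per HDFS block of the region,
    // each entry listing the hosts that hold a replica of that block.
    static double localityIndex(List<List<String>> blockReplicaHosts, String serverHost) {
        if (blockReplicaHosts.isEmpty()) {
            return 0.0;
        }
        long local = blockReplicaHosts.stream()
            .filter(hosts -> hosts.contains(serverHost))
            .count();
        return (double) local / blockReplicaHosts.size();
    }

    public static void main(String[] args) {
        // Region with 4 blocks; host "rs1" holds replicas of 3 of them.
        List<List<String>> blocks = List.of(
            List.of("rs1", "rs2"),
            List.of("rs1", "rs3"),
            List.of("rs2", "rs3"),
            List.of("rs1", "rs2"));
        System.out.println(localityIndex(blocks, "rs1")); // prints 0.75
    }
}
```

In the max-flow framing discussed in the comment, these index values would become edge capacities (or, in the min-cost variant, costs) between the region-server set and the region set.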
[jira] [Commented] (HBASE-3515) [replication] ReplicationSource can miss a log after RS comes out of GC
[ https://issues.apache.org/jira/browse/HBASE-3515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134805#comment-13134805 ] stack commented on HBASE-3515: -- @J-D What you thinking regards this issue now? Should it hold up 0.92? [replication] ReplicationSource can miss a log after RS comes out of GC --- Key: HBASE-3515 URL: https://issues.apache.org/jira/browse/HBASE-3515 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3515.patch This is from Hudson build 1738, if a log is about to be rolled and the ZK connection is already closed then the replication code will fail at adding the new log in ZK but the log will still be rolled and it's possible that some edits will make it in. From the log: {quote} 2011-02-08 10:21:20,618 FATAL [RegionServer:0;vesta.apache.org,46117,1297160399378.logRoller] regionserver.HRegionServer(1383): ABORTING region server serverName=vesta.apache.org,46117,1297160399378, load=(requests=1525, regions=12, usedHeap=273, maxHeap=1244): Failed add log to list org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /1/replication/rs/vesta.apache.org,46117,1297160399378/2/vesta.apache.org%3A46117.1297160480509 ... 2011-02-08 10:21:22,444 DEBUG [MASTER_META_SERVER_OPERATIONS-vesta.apache.org:56008-0] wal.HLogSplitter(258): Splitting hlog 8 of 8: hdfs://localhost:55474/user/hudson/.logs/vesta.apache.org,46117,1297160399378/vesta.apache.org%3A46117.1297160480509, length=0 2011-02-08 10:21:22,862 DEBUG [MASTER_META_SERVER_OPERATIONS-vesta.apache.org:56008-0] wal.HLogSplitter(436): Pushed=31 entries from hdfs://localhost:55474/user/hudson/.logs/vesta.apache.org,46117,1297160399378/vesta.apache.org%3A46117.1297160480509 {quote} The easiest thing to do would be let the exception out and cancel the log roll. 
[jira] [Updated] (HBASE-4274) RS should periodically ping its HLog pipeline even if no writes are arriving
[ https://issues.apache.org/jira/browse/HBASE-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4274: - Fix Version/s: (was: 0.92.0) 0.94.0 Moving out of 0.92. This does not seem to be a critical 0.92 issue any more given Gary's work. RS should periodically ping its HLog pipeline even if no writes are arriving Key: HBASE-4274 URL: https://issues.apache.org/jira/browse/HBASE-4274 Project: HBase Issue Type: Improvement Components: regionserver, wal Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.94.0 If you restart HDFS underneath HBase when HBase isn't taking any write load, the region servers won't notice that there's any problem until the next time they take a write, at which point they will abort (because the pipeline is gone from beneath them). It would be better if they wrote some garbage to their HLog once every few seconds as a sort of keepalive, so they will aggressively abort as soon as there's an issue.
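The keepalive idea above can be sketched with a scheduled task: if no real edit has been written for some idle interval, append a no-op marker so a dead HDFS pipeline surfaces promptly. `Wal` here is a hypothetical stand-in interface, not the real HLog API.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a WAL keepalive: after `idleMillis` of inactivity, write an
// empty edit. An append failure here surfaces a broken pipeline early,
// instead of on the next client write.
public class WalKeepalive {
    interface Wal { void append(byte[] edit); }

    private final AtomicLong lastWrite = new AtomicLong(System.currentTimeMillis());
    private final Wal wal;

    WalKeepalive(Wal wal, long idleMillis, ScheduledExecutorService pool) {
        this.wal = wal;
        pool.scheduleAtFixedRate(() -> {
            if (System.currentTimeMillis() - lastWrite.get() >= idleMillis) {
                wal.append(new byte[0]); // no-op marker, exercises the pipeline
                lastWrite.set(System.currentTimeMillis());
            }
        }, idleMillis, idleMillis, TimeUnit.MILLISECONDS);
    }

    // Real writes reset the idle clock so keepalives only fire when quiet.
    void recordEdit(byte[] edit) {
        wal.append(edit);
        lastWrite.set(System.currentTimeMillis());
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        WalKeepalive k = new WalKeepalive(
            edit -> System.out.println("appended " + edit.length + " bytes"), 100, pool);
        Thread.sleep(350);
        pool.shutdownNow();
    }
}
```

The trade-off the issue mentions is deliberate: an aggressive abort on keepalive failure is preferable to discovering a dead pipeline only when a client write arrives.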
[jira] [Updated] (HBASE-4457) Starting region server on non-default info port is resulting in broken URL's in master UI
[ https://issues.apache.org/jira/browse/HBASE-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4457: - Priority: Critical (was: Minor) Making this critical; it's an ugly regression not being able to browse to a regionserver from the master UI. Starting region server on non-default info port is resulting in broken URL's in master UI - Key: HBASE-4457 URL: https://issues.apache.org/jira/browse/HBASE-4457 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: Praveen Patibandla Priority: Critical Labels: newbie Fix For: 0.92.0 Attachments: 4457-V1.patch, 4457.patch When hbase.regionserver.info.port is set to a non-default port, the Master UI has broken URL's in the region server table because it is hard-coded to the default port.
[jira] [Commented] (HBASE-4467) Handle inconsistencies in Hadoop libraries naming in hbase script
[ https://issues.apache.org/jira/browse/HBASE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134815#comment-13134815 ] stack commented on HBASE-4467: -- @LarsG You going to commit boss? Handle inconsistencies in Hadoop libraries naming in hbase script - Key: HBASE-4467 URL: https://issues.apache.org/jira/browse/HBASE-4467 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.92.0, 0.94.0 Reporter: Lars George Assignee: Lars George Priority: Trivial Fix For: 0.92.0, 0.94.0 Attachments: HBASE-4467.patch When using an Hadoop tarball that has a library naming of hadoop-x.y.z-core as opposed to hadoop-core-x.y.z then the hbase script throws errors. {noformat} $ bin/start-hbase.sh ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file or directory Exception in thread main java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file or directory Exception in thread main java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) 
localhost: starting zookeeper, logging to /projects/opensource/hbase-trunk-rw//logs/hbase-larsgeorge-zookeeper-de1-app-mbp-2.out localhost: /projects/opensource/hadoop-0.20.2-append localhost: ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file or directory localhost: Exception in thread main java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName localhost: Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName localhost:at java.net.URLClassLoader$1.run(URLClassLoader.java:202) localhost:at java.security.AccessController.doPrivileged(Native Method) localhost:at java.net.URLClassLoader.findClass(URLClassLoader.java:190) localhost:at java.lang.ClassLoader.loadClass(ClassLoader.java:306) localhost:at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) localhost:at java.lang.ClassLoader.loadClass(ClassLoader.java:247) starting master, logging to /projects/opensource/hbase-trunk-rw/bin/../logs/hbase-larsgeorge-master-de1-app-mbp-2.out /projects/opensource/hadoop-0.20.2-append ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file or directory Exception in thread main java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) localhost: starting regionserver, logging to /projects/opensource/hbase-trunk-rw//logs/hbase-larsgeorge-regionserver-de1-app-mbp-2.out localhost: /projects/opensource/hadoop-0.20.2-append localhost: ls: /projects/opensource/hadoop-0.20.2-append/hadoop-core*.jar: No such file or directory localhost: Exception in 
thread main java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName localhost: Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName localhost:at java.net.URLClassLoader$1.run(URLClassLoader.java:202) localhost:at java.security.AccessController.doPrivileged(Native Method) localhost:at java.net.URLClassLoader.findClass(URLClassLoader.java:190) localhost:at java.lang.ClassLoader.loadClass(ClassLoader.java:306) localhost:at
[jira] [Updated] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Bauer updated HBASE-4377: --- Attachment: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch It is based on 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch. It has a small enhancement in the MetaEntry constructor: I had a really broken testing env, and after recreating META I had empty TableName values, so this enhancement fills TableName from regionName. This needs a setTableName method in the HRegionInfo class (it is public now, but it could be protected). After recreating META I finally have my testing env working again :) [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, hbase-4377-trunk.v2.patch In a worst-case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could add options to merge or select regions that overlap, or do online rebuilds.
[jira] [Commented] (HBASE-4578) NPE when altering a table that has moving regions
[ https://issues.apache.org/jira/browse/HBASE-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135006#comment-13135006 ] Hudson commented on HBASE-4578: --- Integrated in HBase-TRUNK #2366 (See [https://builds.apache.org/job/HBase-TRUNK/2366/]) HBASE-4578 NPE when altering a table that has moving regions stack : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java NPE when altering a table that has moving regions - Key: HBASE-4578 URL: https://issues.apache.org/jira/browse/HBASE-4578 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: gaojinchao Priority: Blocker Fix For: 0.92.0 Attachments: HBASE-4578_Trunk_V1.patch, HBASE-4578_trial_Trunk.patch I'm still not a 100% sure on the source of this error, but here's what I was able to get twice while altering a table that was doing a bunch of splits: {quote} 2011-10-11 23:48:59,344 INFO org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handled SPLIT report); parent=TestTable,0002608338,1318376880454.a75d6815fdfc513fb1c8aabe086c6763. daughter a=TestTable,0002608338,1318376938764.ef170ff6cd8695dc8aec92e542dc9ac1.daughter b=TestTable,0003301408,1318376938764.36eb2530341bd46888ede312c5559b5d. 2011-10-11 23:49:09,579 DEBUG org.apache.hadoop.hbase.master.handler.TableEventHandler: Ignoring table not disabled exception for supporting online schema changes. 2011-10-11 23:49:09,580 INFO org.apache.hadoop.hbase.master.handler.TableEventHandler: Handling table operation C_M_MODIFY_TABLE on table TestTable 2011-10-11 23:49:09,612 INFO org.apache.hadoop.hbase.util.FSUtils: TableInfoPath = hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo tmpPath = hdfs://sv4r11s38:9100/hbase/TestTable/.tmp/.tableinfo.1318376949612 2011-10-11 23:49:09,692 INFO org.apache.hadoop.hbase.util.FSUtils: TableDescriptor stored. 
TableInfoPath = hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo 2011-10-11 23:49:09,693 INFO org.apache.hadoop.hbase.util.FSUtils: Updated tableinfo=hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo to blah 2011-10-11 23:49:09,695 INFO org.apache.hadoop.hbase.master.handler.TableEventHandler: Bucketing regions by region server... 2011-10-11 23:49:09,695 DEBUG org.apache.hadoop.hbase.client.MetaScanner: Scanning .META. starting at row=TestTable,,00 for max=2147483647 rows 2011-10-11 23:49:09,709 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: The connection to hconnection-0x132f043bbde02e9 has been closed. 2011-10-11 23:49:09,709 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event C_M_MODIFY_TABLE java.lang.NullPointerException at java.util.TreeMap.getEntry(TreeMap.java:324) at java.util.TreeMap.containsKey(TreeMap.java:209) at org.apache.hadoop.hbase.master.handler.TableEventHandler.reOpenAllRegions(TableEventHandler.java:114) at org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:90) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {quote} The first time the shell reported that all the regions were updated correctly, the second time it got stuck for a while: {quote} 6/14 regions updated. 0/14 regions updated. ... 0/14 regions updated. 2/16 regions updated. ... 2/16 regions updated. 8/9 regions updated. ... 8/9 regions updated. {quote} After which I killed it, redid the alter and it worked. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-4374) Up default regions size from 256M to 1G
[ https://issues.apache.org/jira/browse/HBASE-4374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135007#comment-13135007 ] Hudson commented on HBASE-4374: --- Integrated in HBase-TRUNK #2366 (See [https://builds.apache.org/job/HBase-TRUNK/2366/]) HBASE-4374 Up default regions size from 256M to 1G stack : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/resources/hbase-default.xml Up default regions size from 256M to 1G --- Key: HBASE-4374 URL: https://issues.apache.org/jira/browse/HBASE-4374 Project: HBase Issue Type: Task Reporter: stack Priority: Blocker Fix For: 0.92.0 Attachments: 4374.txt HBASE-4365 has some discussion of why we default for a table should tend to fewer bigger regions. It doesn't look like this issue will be done for 0.92. For 0.92, lets up default region size from 256M to 1G and talk up pre-split on table creation in manual. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
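Concretely, the change is a bump of the default split threshold in hbase-default.xml. A sketch of the equivalent hbase-site.xml override (property name as used in this era of HBase; the value is 1G expressed in bytes):

```xml
<!-- Override the default region split size (old default: 256M = 268435456). -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>1073741824</value>
</property>
```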
[jira] [Commented] (HBASE-4656) Note how dfs.support.append has to be enabled in 0.20.205.0 clusters
[ https://issues.apache.org/jira/browse/HBASE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135009#comment-13135009 ] Hudson commented on HBASE-4656: --- Integrated in HBase-TRUNK #2366 (See [https://builds.apache.org/job/HBase-TRUNK/2366/]) HBASE-4656 Note how dfs.support.append has to be enabled in 0.20.205.0 clusters stack : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/docbkx/configuration.xml * /hbase/trunk/src/main/resources/hbase-default.xml Note how dfs.support.append has to be enabled in 0.20.205.0 clusters Key: HBASE-4656 URL: https://issues.apache.org/jira/browse/HBASE-4656 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.92.0 Attachments: 4656.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4070) [Coprocessors] Improve region server metrics to report loaded coprocessors to master
[ https://issues.apache.org/jira/browse/HBASE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135005#comment-13135005 ] Hudson commented on HBASE-4070: --- Integrated in HBase-TRUNK #2366 (See [https://builds.apache.org/job/HBase-TRUNK/2366/]) HBASE-4070 Improve region server metrics to report loaded coprocessors to master apurtell : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/jamon/org/apache/hbase/tmpl/master/MasterStatusTmpl.jamon * /hbase/trunk/src/main/jamon/org/apache/hbase/tmpl/regionserver/RSStatusTmpl.jamon * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HServerLoad.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java [Coprocessors] Improve region server metrics to report loaded coprocessors to master Key: HBASE-4070 URL: https://issues.apache.org/jira/browse/HBASE-4070 Project: HBase Issue Type: Improvement Affects Versions: 0.90.3 Reporter: Mingjie Lai Assignee: Eugene Koontz Fix For: 0.92.0, 0.94.0 Attachments: HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, master-web-ui.jpg, rs-status-web-ui.jpg HBASE-3512 is about listing loaded cp classes at shell. To make it more generic, we need a way to report this piece of information from region to master (or just at region server level). So later on, we can display the loaded class names at shell as well as web console. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-4367) Deadlock in MemStore flusher due to JDK internally synchronizing on current thread
[ https://issues.apache.org/jira/browse/HBASE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135008#comment-13135008 ] Hudson commented on HBASE-4367: --- Integrated in HBase-TRUNK #2366 (See [https://builds.apache.org/job/HBase-TRUNK/2366/]) HBASE-4367 Deadlock in MemStore flusher due to JDK internally synchronizing on current thread stack : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/Chore.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Leases.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HasThread.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java Deadlock in MemStore flusher due to JDK internally synchronizing on current thread -- Key: HBASE-4367 URL: https://issues.apache.org/jira/browse/HBASE-4367 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.92.0 Attachments: 4367.txt, hbase-4367.txt We observed a deadlock in production between the following threads: - IPC handler thread holding the 
monitor lock on MemStoreFlusher inside reclaimMemStoreMemory, waiting to obtain MemStoreFlusher.lock (the reentrant lock member) - cacheFlusher thread inside flushRegion holds MemStoreFlusher.lock, and then calls PriorityCompactionQueue.add, which calls PriorityCompactionQueue.addToRegionsInQueue, which calls CompactionRequest.toString(), which calls Date.toString. If this occurs just after a GC under memory pressure, Date.toString needs to reload locale information (stored in a soft reference), so it calls ResourceBundle.loadBundle, which uses Thread.currentThread() as a synchronizer (see sun bug http://bugs.sun.com/view_bug.do?bug_id=6915621). Since the current thread is the MemStoreFlusher itself, we have a lock order inversion and a deadlock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
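The file list above includes util/HasThread.java, which points at the shape of the fix: have worker classes delegate to an internal Thread instead of extending Thread, so JDK-internal code that synchronizes on Thread.currentThread() (as ResourceBundle does in the cited Sun bug) can never contend for the monitor the application holds. A simplified sketch of that pattern, not the real HBase class:

```java
// Sketch of the HasThread pattern: the worker's monitor (`this`) and the
// executing Thread object are distinct, so JDK-internal synchronization on
// Thread.currentThread() cannot participate in a lock-order inversion with
// application monitors like the MemStoreFlusher's.
public class HasThreadSketch implements Runnable {
    private final Thread thread = new Thread(this, "worker");
    private volatile boolean ran = false;

    // Locks `this`, which is a different object from `thread`; had the class
    // extended Thread, `this` and the current Thread would be the same monitor.
    public synchronized void doGuardedWork() {
        ran = true;
    }

    public boolean ran() { return ran; }
    public void start() { thread.start(); }
    public void join() throws InterruptedException { thread.join(); }

    @Override
    public void run() {
        doGuardedWork();
    }

    public static void main(String[] args) throws InterruptedException {
        HasThreadSketch w = new HasThreadSketch();
        w.start();
        w.join();
        System.out.println(w.ran()); // prints true
    }
}
```

The other half of the defense, visible in the MemStoreFlusher change itself, is avoiding arbitrary calls like Date.toString() while holding a lock, since such calls can take unexpected JDK-internal locks.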
[jira] [Commented] (HBASE-3512) Coprocessors: Shell support for listing currently loaded coprocessor set
[ https://issues.apache.org/jira/browse/HBASE-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135010#comment-13135010 ] Hudson commented on HBASE-3512: --- Integrated in HBase-TRUNK #2366 (See [https://builds.apache.org/job/HBase-TRUNK/2366/]) HBASE-3512 Shell support for listing currently loaded coprocessors apurtell : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/ruby/hbase/admin.rb Coprocessors: Shell support for listing currently loaded coprocessor set Key: HBASE-3512 URL: https://issues.apache.org/jira/browse/HBASE-3512 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: Andrew Purtell Assignee: Eugene Koontz Fix For: 0.92.0, 0.94.0 Attachments: HBASE-3512-only.patch, HBASE-3512-only.patch, HBASE-3512.patch, HBASE-3512.patch, HBASE-3512.patch, hbase-shell-session.txt Add support to the shell for listing the coprocessors loaded globally on the regionserver and those loaded on a per-table basis. Perhaps by extending the 'status' command. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4191) hbase load balancer needs locality awareness
[ https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135066#comment-13135066 ] Ted Yu commented on HBASE-4191: --- Liyin modified the JIRA description and added the max-flow problem; the original description didn't mention it. hbase load balancer needs locality awareness Key: HBASE-4191 URL: https://issues.apache.org/jira/browse/HBASE-4191 Project: HBase Issue Type: New Feature Reporter: Ted Yu Assignee: Liyin Tang Previously, HBASE-4114 implemented the metrics for HFile HDFS block locality, which provide HFile-level locality information. But in order to work with the load balancer and region assignment, we need region-level locality information. Let's define the region locality information first; it is almost the same as the HFile locality index. HRegion locality index (HRegion A, RegionServer B) = (Total number of HDFS blocks that can be retrieved locally by RegionServer B for HRegion A) / (Total number of HDFS blocks for Region A) So the HRegion locality index tells us how much locality we can get if the HMaster assigns HRegion A to RegionServer B. There will be 2 steps involved in assigning regions based on locality. 1) During cluster start-up, the master will scan HDFS to calculate the HRegion locality index for each pair of HRegion and RegionServer. It is pretty expensive to scan the DFS, so we only need to do this once, during start-up. 2) During cluster run time, each region server will update the HRegion locality index as metrics periodically, as HBASE-4114 did. The RegionServer can expose them to the Master through ZK, the meta table, or just RPC messages. Based on the HRegion locality index, the assignment manager in the master would have global knowledge of the region locality distribution. 
Imagining the HRegion locality index as the capacity between the region-server set and the region set, the assignment manager could then run a MAXIMUM FLOW solver to reach the global optimum. The master should also share this global view with the secondary master in case a master failover happens. In addition, HBASE-4491 (Locality Checker) is a tool, based on the same metrics, that proactively scans the DFS to calculate the global locality information in the cluster. It will help us verify data locality information at run time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
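The index defined above is just a per-(region, server) ratio; a minimal sketch of the computation, where the method and variable names are illustrative rather than actual HBase APIs:

```java
public class RegionLocality {
    /**
     * localityIndex(HRegion A, RegionServer B) =
     *   (HDFS blocks of A readable locally on B) / (total HDFS blocks of A)
     */
    static float localityIndex(long localBlocks, long totalBlocks) {
        // A region with no blocks yet has no locality to speak of.
        return totalBlocks == 0 ? 0f : (float) localBlocks / totalBlocks;
    }

    public static void main(String[] args) {
        // e.g. 45 of a region's 60 blocks sit on datanodes co-located with B
        System.out.println(localityIndex(45, 60)); // prints 0.75
    }
}
```

The assignment manager would evaluate this ratio for every candidate (region, server) pair and feed the values, as edge capacities, into the max-flow formulation sketched in the comment.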
[jira] [Commented] (HBASE-4658) Put attributes are not exposed via the ThriftServer
[ https://issues.apache.org/jira/browse/HBASE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135087#comment-13135087 ] Andrew Purtell commented on HBASE-4658: --- +1 for updating Thrift with support for attributes on all of Get, Put, Scan, Delete, Increment, ... Put attributes are not exposed via the ThriftServer --- Key: HBASE-4658 URL: https://issues.apache.org/jira/browse/HBASE-4658 Project: HBase Issue Type: Bug Components: thrift Reporter: dhruba borthakur Assignee: dhruba borthakur The Put api also takes in a bunch of arbitrary attributes that an application can use to associate metadata with each put operation. This is not exposed via Thrift. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4668) List HDFS enhancements to speed up backups for HBase
[ https://issues.apache.org/jira/browse/HBASE-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135090#comment-13135090 ] Andrew Purtell commented on HBASE-4668: --- Please consider both 0.20.x and 0.23.x variants. List HDFS enhancements to speed up backups for HBase Key: HBASE-4668 URL: https://issues.apache.org/jira/browse/HBASE-4668 Project: HBase Issue Type: Sub-task Components: documentation, regionserver Reporter: Karthik Ranganathan There is a host of improvements that help: - HDFS fast copy - Various enhancements to fast copy to speed things up - File-level hard links - which use ext3 hard links instead of copying blocks, thereby saving a lot of IOPS Need to list out the HDFS JIRAs and get patches on them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4660) Place to publish RegionServer information such as webuiport and coprocessors loaded
[ https://issues.apache.org/jira/browse/HBASE-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135099#comment-13135099 ] Andrew Purtell commented on HBASE-4660: --- bq. HBASE-4070 added loaded CoProcessors to HServerLoad which seems like wrong place to carry this info. This issue didn't come up during code review. Agree that HSL is overloaded and at least should be renamed. Place to publish RegionServer information such as webuiport and coprocessors loaded --- Key: HBASE-4660 URL: https://issues.apache.org/jira/browse/HBASE-4660 Project: HBase Issue Type: Bug Reporter: stack HBASE-4070 added loaded CoProcessors to HServerLoad which seems like wrong place to carry this info. We need a locus for static info of this type such as loaded CoProcessors and webuiport as well as stuff like how many cpus, RAM, etc: e.g. in regionserver znode or available on invocation of an HRegionServer method (master can ask HRegionServer when it needs it). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4367) Deadlock in MemStore flusher due to JDK internally synchronizing on current thread
[ https://issues.apache.org/jira/browse/HBASE-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135116#comment-13135116 ] Hudson commented on HBASE-4367: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4367 Deadlock in MemStore flusher due to JDK internally synchronizing on current thread stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/Chore.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/Leases.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/LogRoller.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/HasThread.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java Deadlock in MemStore flusher due to JDK internally synchronizing on current thread -- Key: HBASE-4367 URL: https://issues.apache.org/jira/browse/HBASE-4367 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.4 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.92.0 Attachments: 4367.txt, 
hbase-4367.txt We observed a deadlock in production between the following threads: - IPC handler thread holding the monitor lock on MemStoreFlusher inside reclaimMemStoreMemory, waiting to obtain MemStoreFlusher.lock (the reentrant lock member) - cacheFlusher thread inside flushRegion holds MemStoreFlusher.lock, and then calls PriorityCompactionQueue.add, which calls PriorityCompactionQueue.addToRegionsInQueue, which calls CompactionRequest.toString(), which calls Date.toString. If this occurs just after a GC under memory pressure, Date.toString needs to reload locale information (stored in a soft reference), so it calls ResourceBundle.loadBundle, which uses Thread.currentThread() as a synchronizer (see sun bug http://bugs.sun.com/view_bug.do?bug_id=6915621). Since the current thread is the MemStoreFlusher itself, we have a lock order inversion and a deadlock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3929) Add option to HFile tool to produce basic stats
[ https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135115#comment-13135115 ] Hudson commented on HBASE-3929: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-3929 Add option to HFile tool to produce basic stats todd : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java Add option to HFile tool to produce basic stats --- Key: HBASE-3929 URL: https://issues.apache.org/jira/browse/HBASE-3929 Project: HBase Issue Type: New Feature Components: io Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Matteo Bertozzi Fix For: 0.92.0 Attachments: HBASE-3929-v2.patch, HBASE-3929-v3.patch, hbase-3929-draft.patch, hbase-3929-draft.txt In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce some basic statistics about it: - min/mean/max key size, value size (uncompressed) - min/mean/max number of columns per row (uncompressed) - min/mean/max number of bytes per row (uncompressed) - the key of the largest row -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4374) Up default regions size from 256M to 1G
[ https://issues.apache.org/jira/browse/HBASE-4374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135114#comment-13135114 ] Hudson commented on HBASE-4374: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4374 Up default regions size from 256M to 1G stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/resources/hbase-default.xml Up default regions size from 256M to 1G --- Key: HBASE-4374 URL: https://issues.apache.org/jira/browse/HBASE-4374 Project: HBase Issue Type: Task Reporter: stack Priority: Blocker Fix For: 0.92.0 Attachments: 4374.txt HBASE-4365 has some discussion of why the default for a table should tend toward fewer, bigger regions. It doesn't look like that issue will be done for 0.92. For 0.92, let's up the default region size from 256M to 1G and talk up pre-splitting on table creation in the manual. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
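As a sketch of what the change above means operationally: the region size ceiling is the hbase.hregion.max.filesize property (the value the commit touches in hbase-default.xml), so a site wanting the 1G value ahead of the new default could override it in hbase-site.xml; 1G = 1073741824 bytes:

```xml
<!-- hbase-site.xml: let a region grow to 1 GB before it splits -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>1073741824</value>
</property>
```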
[jira] [Commented] (HBASE-4656) Note how dfs.support.append has to be enabled in 0.20.205.0 clusters
[ https://issues.apache.org/jira/browse/HBASE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135117#comment-13135117 ] Hudson commented on HBASE-4656: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4656 Note how dfs.support.append has to be enabled in 0.20.205.0 clusters stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/resources/hbase-default.xml Note how dfs.support.append has to be enabled in 0.20.205.0 clusters Key: HBASE-4656 URL: https://issues.apache.org/jira/browse/HBASE-4656 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.92.0 Attachments: 4656.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
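For reference, the setting being documented above is a plain HDFS property set on the Hadoop side, not in HBase's own files; a sketch of the hdfs-site.xml entry on an 0.20.205.0 cluster (a cluster restart is assumed for it to take effect):

```xml
<!-- hdfs-site.xml: enable append/sync support, required for a durable HBase WAL -->
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```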
[jira] [Commented] (HBASE-4670) Fix javadoc warnings
[ https://issues.apache.org/jira/browse/HBASE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135118#comment-13135118 ] Hudson commented on HBASE-4670: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4670 Fix javadoc warnings stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HConstants.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HServerAddress.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HServerInfo.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/KeyValue.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ServerName.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/catalog/MetaMigrationRemovingHTD.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnection.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HTablePool.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/Mutation.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/RetriesExhaustedWithDetailsException.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java * 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/BitComparator.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/ParseFilter.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/filter/RowFilter.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/InlineBlockWriter.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/slab/SlabCache.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/hadoopbackport/InputSampler.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/LoadBalancerFactory.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/monitoring/MonitoredTask.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/monitoring/ThreadMonitoring.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java *
[jira] [Commented] (HBASE-4588) The floating point arithmetic to validate memory allocation configurations need to be done as integers
[ https://issues.apache.org/jira/browse/HBASE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135125#comment-13135125 ] Hudson commented on HBASE-4588: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4588 The floating point arithmetic to validate memory allocation configurations need to be done as integers (dhruba) jgray : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java The floating point arithmetic to validate memory allocation configurations need to be done as integers -- Key: HBASE-4588 URL: https://issues.apache.org/jira/browse/HBASE-4588 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: Jonathan Gray Assignee: dhruba borthakur Priority: Minor Fix For: 0.92.0 Attachments: configVerify1.txt, configVerify2.txt The floating point arithmetic to validate memory allocation configurations need to be done as integers. On our cluster, we had block cache = 0.6 and memstore = 0.2. It was saying this was 0.8 when it is actually equal. Minor bug but annoying nonetheless. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
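The mixed-precision comparison described above is easy to reproduce: the float sum 0.6f + 0.2f rounds to exactly the float 0.8f, but widening that float to double for a comparison against the double literal 0.8 yields 0.800000011920929, which is strictly greater. A minimal sketch — the property names in the comments are assumptions about which settings map to these fractions, and the integer workaround shown is the general technique the issue title suggests, not the committed patch:

```java
public class MemoryConfigCheck {
    public static void main(String[] args) {
        float blockCache = 0.6f;  // e.g. the block cache fraction
        float memstore = 0.2f;    // e.g. the global memstore fraction
        // Float sum is exactly 0.8f, but the float widens to
        // 0.800000011920929 when compared against the double 0.8.
        System.out.println((blockCache + memstore) > 0.8);  // true (the bug)
        // Doing the check in whole percentage points sidesteps the issue.
        int sumPct = Math.round(blockCache * 100) + Math.round(memstore * 100);
        System.out.println(sumPct > 80);                    // false
    }
}
```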
[jira] [Commented] (HBASE-4437) Update hadoop in 0.92 (0.20.205?)
[ https://issues.apache.org/jira/browse/HBASE-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135124#comment-13135124 ] Hudson commented on HBASE-4437: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4437 Update hadoop in 0.92 (0.20.205?) stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/pom.xml * /hbase/branches/0.92/src/main/jamon/org/apache/hbase/tmpl/master/MasterStatusTmpl.jamon * /hbase/branches/0.92/src/main/jamon/org/apache/hbase/tmpl/regionserver/RSStatusTmpl.jamon * /hbase/branches/0.92/src/main/resources/hbase-webapps/master/table.jsp * /hbase/branches/0.92/src/main/resources/hbase-webapps/master/tablesDetailed.jsp * /hbase/branches/0.92/src/main/resources/hbase-webapps/master/zk.jsp Update hadoop in 0.92 (0.20.205?) - Key: HBASE-4437 URL: https://issues.apache.org/jira/browse/HBASE-4437 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.92.0 Attachments: 4437.txt We ship with branch-0.20-append a few versions back from the tip. If 205 comes out and hbase works on it, we should ship 0.92 with it (while also ensuring it work w/ 0.22 and 0.23 branches). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4642) Add Apache License Header
[ https://issues.apache.org/jira/browse/HBASE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135122#comment-13135122 ] Hudson commented on HBASE-4642: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4642 Add Apache License Header stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/bin/local-master-backup.sh * /hbase/branches/0.92/bin/local-regionservers.sh * /hbase/branches/0.92/bin/set_meta_block_caching.rb * /hbase/branches/0.92/bin/set_meta_memstore_size.rb * /hbase/branches/0.92/pom.xml Add Apache License Header -- Key: HBASE-4642 URL: https://issues.apache.org/jira/browse/HBASE-4642 Project: HBase Issue Type: Improvement Reporter: Giridharan Kesavan Assignee: stack Fix For: 0.92.0 Attachments: 4642.txt, ratswarnings.txt executing mvn apache-rat:check fails with [ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.6:check (default-cli) on project hbase: Too many unapproved licenses: 84 - [Help 1] org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.rat:apache-rat-plugin:0.6:check (default-cli) on project hbase: Too many unapproved licenses: 84 there are about 70 + files which are missing the Apache License Headers and rest of them should be added to the exclude list. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4070) [Coprocessors] Improve region server metrics to report loaded coprocessors to master
[ https://issues.apache.org/jira/browse/HBASE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135123#comment-13135123 ] Hudson commented on HBASE-4070: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4070 Improve region server metrics to report loaded coprocessors to master apurtell : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/jamon/org/apache/hbase/tmpl/master/MasterStatusTmpl.jamon * /hbase/branches/0.92/src/main/jamon/org/apache/hbase/tmpl/regionserver/RSStatusTmpl.jamon * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HServerLoad.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java [Coprocessors] Improve region server metrics to report loaded coprocessors to master Key: HBASE-4070 URL: https://issues.apache.org/jira/browse/HBASE-4070 Project: HBase Issue Type: Improvement Affects Versions: 0.90.3 Reporter: Mingjie Lai Assignee: Eugene Koontz Fix For: 0.92.0, 0.94.0 Attachments: HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, HBASE-4070.patch, master-web-ui.jpg, rs-status-web-ui.jpg HBASE-3512 is about listing loaded cp classes at shell. To make it more generic, we need a way to report this piece of information from region to master (or just at region server level). So later on, we can display the loaded class names at shell as well as web console. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4651) ConcurrentModificationException might be thrown in TestHCM.testConnectionUniqueness
[ https://issues.apache.org/jira/browse/HBASE-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135121#comment-13135121 ] Hudson commented on HBASE-4651: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4651 ConcurrentModificationException might be thrown in TestHCM.testConnectionUniqueness (Jinchao) tedyu : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java ConcurrentModificationException might be thrown in TestHCM.testConnectionUniqueness --- Key: HBASE-4651 URL: https://issues.apache.org/jira/browse/HBASE-4651 Project: HBase Issue Type: Test Affects Versions: 0.92.0 Reporter: gaojinchao Assignee: gaojinchao Priority: Minor Fix For: 0.92.0 Attachments: HBASE-4651_Trunk.patch See https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2357/testReport/junit/org.apache.hadoop.hbase.client/TestHCM/testConnectionUniqueness From java.util.LinkedHashMap: Entry<K,V> nextEntry() { if (modCount != expectedModCount) throw new ConcurrentModificationException(); ... } HCM uses proper synchronization when accessing HBASE_INSTANCES. Looking at TestHCM.getValidKeyCount(), it puts the values of HBASE_INSTANCES in a Set and returns the size of the Set. However, post HBASE-3777, the values (HConnectionImplementation's) in HBASE_INSTANCES would be unique. TestHCM.getValidKeyCount() can be removed from the test. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
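The iterator check quoted in that report fires whenever a collection is structurally modified while being iterated; a self-contained illustration of the same failure mode (not the TestHCM code itself):

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

public class CmeDemo {
    public static void main(String[] args) {
        Map<String, Integer> instances = new LinkedHashMap<>();
        instances.put("a", 1);
        instances.put("b", 2);
        boolean caught = false;
        try {
            for (String key : instances.keySet()) {
                instances.put("c", 3);  // structural change bumps modCount
            }
        } catch (ConcurrentModificationException e) {
            caught = true;              // thrown by the iterator's nextEntry()
        }
        System.out.println("caught CME: " + caught); // caught CME: true
    }
}
```

The fail-fast iterator compares modCount against the expectedModCount it recorded at creation; the mid-iteration put() makes them diverge, so the next call to next() throws.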
[jira] [Commented] (HBASE-4447) Allow hbase.version to be passed in as command-line argument
[ https://issues.apache.org/jira/browse/HBASE-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135127#comment-13135127 ] Hudson commented on HBASE-4447: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4447 Allow hbase.version to be passed in as command-line argument; REVERTsvn diffsvn diff stack : Files : * /hbase/branches/0.92/pom.xml Allow hbase.version to be passed in as command-line argument Key: HBASE-4447 URL: https://issues.apache.org/jira/browse/HBASE-4447 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.92.0 Reporter: Joep Rottinghuis Assignee: Joep Rottinghuis Fix For: 0.92.0 Attachments: HBASE-4447-0.92.patch Currently the build always produces the jars and tarball according to the version baked into the POM. When we modify this to allow the version to be passed in as a command-line argument, it can still default to the same behavior, yet give the flexibility for an internal build to tag on own version. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4578) NPE when altering a table that has moving regions
[ https://issues.apache.org/jira/browse/HBASE-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135126#comment-13135126 ] Hudson commented on HBASE-4578: --- Integrated in HBase-0.92 #79 (See [https://builds.apache.org/job/HBase-0.92/79/]) HBASE-4578 NPE when altering a table that has moving regions stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java NPE when altering a table that has moving regions - Key: HBASE-4578 URL: https://issues.apache.org/jira/browse/HBASE-4578 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: gaojinchao Priority: Blocker Fix For: 0.92.0 Attachments: HBASE-4578_Trunk_V1.patch, HBASE-4578_trial_Trunk.patch I'm still not a 100% sure on the source of this error, but here's what I was able to get twice while altering a table that was doing a bunch of splits: {quote} 2011-10-11 23:48:59,344 INFO org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handled SPLIT report); parent=TestTable,0002608338,1318376880454.a75d6815fdfc513fb1c8aabe086c6763. daughter a=TestTable,0002608338,1318376938764.ef170ff6cd8695dc8aec92e542dc9ac1.daughter b=TestTable,0003301408,1318376938764.36eb2530341bd46888ede312c5559b5d. 2011-10-11 23:49:09,579 DEBUG org.apache.hadoop.hbase.master.handler.TableEventHandler: Ignoring table not disabled exception for supporting online schema changes. 2011-10-11 23:49:09,580 INFO org.apache.hadoop.hbase.master.handler.TableEventHandler: Handling table operation C_M_MODIFY_TABLE on table TestTable 2011-10-11 23:49:09,612 INFO org.apache.hadoop.hbase.util.FSUtils: TableInfoPath = hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo tmpPath = hdfs://sv4r11s38:9100/hbase/TestTable/.tmp/.tableinfo.1318376949612 2011-10-11 23:49:09,692 INFO org.apache.hadoop.hbase.util.FSUtils: TableDescriptor stored. 
TableInfoPath = hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo 2011-10-11 23:49:09,693 INFO org.apache.hadoop.hbase.util.FSUtils: Updated tableinfo=hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo to blah 2011-10-11 23:49:09,695 INFO org.apache.hadoop.hbase.master.handler.TableEventHandler: Bucketing regions by region server... 2011-10-11 23:49:09,695 DEBUG org.apache.hadoop.hbase.client.MetaScanner: Scanning .META. starting at row=TestTable,,00 for max=2147483647 rows 2011-10-11 23:49:09,709 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: The connection to hconnection-0x132f043bbde02e9 has been closed. 2011-10-11 23:49:09,709 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event C_M_MODIFY_TABLE java.lang.NullPointerException at java.util.TreeMap.getEntry(TreeMap.java:324) at java.util.TreeMap.containsKey(TreeMap.java:209) at org.apache.hadoop.hbase.master.handler.TableEventHandler.reOpenAllRegions(TableEventHandler.java:114) at org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:90) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {quote} The first time the shell reported that all the regions were updated correctly, the second time it got stuck for a while: {quote} 6/14 regions updated. 0/14 regions updated. ... 0/14 regions updated. 2/16 regions updated. ... 2/16 regions updated. 8/9 regions updated. ... 8/9 regions updated. {quote} After which I killed it, redid the alter and it worked. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
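The trace above dies inside `TreeMap.containsKey`, which throws `NullPointerException` when handed a null key under natural ordering — consistent with a region whose server location is unknown because it is mid-split or mid-move. A minimal sketch of the kind of guard that avoids it (class and method names here are illustrative, not the actual `TableEventHandler` code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReopenSketch {
    // Bucket regions by hosting server, as reOpenAllRegions does, but skip
    // regions whose location is currently unknown. TreeMap.containsKey(null)
    // throws NullPointerException under natural ordering, which matches the
    // stack trace above; the null-check sidesteps it.
    public static List<String> bucketByServer(Map<String, String> regionToServer,
                                              TreeMap<String, List<String>> serverBuckets) {
        List<String> skipped = new ArrayList<>();
        for (Map.Entry<String, String> e : regionToServer.entrySet()) {
            String server = e.getValue();
            if (server == null) {            // region in transition: retry later
                skipped.add(e.getKey());
                continue;
            }
            serverBuckets.computeIfAbsent(server, s -> new ArrayList<>()).add(e.getKey());
        }
        return skipped;
    }
}
```

Skipped regions would be retried on a later pass, which also matches the shell behavior reported above (the alter eventually succeeded on a second attempt).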
[jira] [Created] (HBASE-4671) HBaseTestingUtility unable to connect to regionserver because of 127.0.0.1 / 127.0.1.1 discrepancy
HBaseTestingUtility unable to connect to regionserver because of 127.0.0.1 / 127.0.1.1 discrepancy -- Key: HBASE-4671 URL: https://issues.apache.org/jira/browse/HBASE-4671 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.90.4 Environment: At least Ubuntu 11.10 with a default hosts file. Reporter: Ferdy When /etc/hosts contains following lines (and this is not uncommon) it will cause HBaseTestingUtility to malfunction. 127.0.0.1 localhost 127.0.1.1 myMachineName Symptoms: 2011-10-25 17:38:30,875 WARN master.AssignmentManager - Failed assignment of -ROOT-,,0.70236052 to serverName=localhost,34462,1319557102914, load=(requests=0, regions=0, usedHeap=46, maxHeap=865), trying to assign elsewhere instead; retry=0 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to /127.0.0.1:34462 after attempts=1 because 2011-10-25 17:38:28,371 INFO regionserver.HRegionServer - Serving as localhost,34462,1319557102914, RPC listening on /127.0.1.1:34462, sessionid=0x1333bbb7a180002 caused by /127.0.0.1:34462 vs /127.0.1.1:34462 Workaround: Changing 127.0.1.1 to 127.0.0.1 works. Permanent solution: Dunno, my understanding of inner workings is not sufficient enough. Although it seems like it has something to do with changing the machine name from myMachineName to localhost during the test: 2011-10-25 17:38:28,056 INFO regionserver.HRegionServer - Master passed us address to use. Was=myMachineName:34462, Now=localhost:34462 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4671) HBaseTestingUtility unable to connect to regionserver because of 127.0.0.1 / 127.0.1.1 discrepancy
[ https://issues.apache.org/jira/browse/HBASE-4671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135180#comment-13135180 ] Ferdy commented on HBASE-4671: -- (Changing 127.0.1.1 to 127.0.0.1 in the hosts file that is.) HBaseTestingUtility unable to connect to regionserver because of 127.0.0.1 / 127.0.1.1 discrepancy -- Key: HBASE-4671 URL: https://issues.apache.org/jira/browse/HBASE-4671 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.90.4 Environment: At least Ubuntu 11.10 with a default hosts file. Reporter: Ferdy When /etc/hosts contains following lines (and this is not uncommon) it will cause HBaseTestingUtility to malfunction. 127.0.0.1 localhost 127.0.1.1 myMachineName Symptoms: 2011-10-25 17:38:30,875 WARN master.AssignmentManager - Failed assignment of -ROOT-,,0.70236052 to serverName=localhost,34462,1319557102914, load=(requests=0, regions=0, usedHeap=46, maxHeap=865), trying to assign elsewhere instead; retry=0 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to /127.0.0.1:34462 after attempts=1 because 2011-10-25 17:38:28,371 INFO regionserver.HRegionServer - Serving as localhost,34462,1319557102914, RPC listening on /127.0.1.1:34462, sessionid=0x1333bbb7a180002 caused by /127.0.0.1:34462 vs /127.0.1.1:34462 Workaround: Changing 127.0.1.1 to 127.0.0.1 works. Permanent solution: Dunno, my understanding of inner workings is not sufficient enough. Although it seems like it has something to do with changing the machine name from myMachineName to localhost during the test: 2011-10-25 17:38:28,056 INFO regionserver.HRegionServer - Master passed us address to use. Was=myMachineName:34462, Now=localhost:34462 -- This message is automatically generated by JIRA. 
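The workaround described above amounts to pointing the machine's hostname at the same loopback address as localhost, so the address the regionserver binds and the address the master hands out agree. A sketch of the fixed /etc/hosts (myMachineName is the placeholder from the report):

```
127.0.0.1   localhost
127.0.0.1   myMachineName
```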
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135222#comment-13135222 ] Ted Yu commented on HBASE-4377: --- @Sebastian: 0.92 is close to release. I got the following: {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure: Compilation failure: [ERROR] /Users/zhihyu/92hbase/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java:[277,21] doFsck(org.apache.hadoop.conf.Configuration,boolean) in org.apache.hadoop.hbase.util.hbck.HbckTestingUtil cannot be applied to (boolean) [ERROR] [ERROR] /Users/zhihyu/92hbase/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java:[286,23] doFsck(org.apache.hadoop.conf.Configuration,boolean) in org.apache.hadoop.hbase.util.hbck.HbckTestingUtil cannot be applied to (boolean) {code} Do you mind refresh patch for 0.92 ? Thanks [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. 
[jira] [Updated] (HBASE-4669) Add an option of using round-robin assignment for enabling table
[ https://issues.apache.org/jira/browse/HBASE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4669: -- Description: Under some scenarios, we use the function of disable/enable HTable. But currently, enable HTable uses the random-assignment. We hope all the regions show a better distribution, no matter how many regions and how many regionservers. So I suggest to add an option of using round-robin assignment on enable-table. was: Under some scenarios, we use the function of disable/enable HTable. But currently, enable HTable using the random-assignment. We hope all the regions show a better distribution, no matter how many regions and how many regionservers. So I suggest to add a choice of using round-robin assignment on enable-table. Summary: Add an option of using round-robin assignment for enabling table (was: Suggest to add a choice of using round-robin assignment on enable-table) Add an option of using round-robin assignment for enabling table Key: HBASE-4669 URL: https://issues.apache.org/jira/browse/HBASE-4669 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.90.4, 0.94.0 Reporter: Jieshan Bean Priority: Minor Fix For: 0.94.0, 0.90.5 Under some scenarios, we use the function of disable/enable HTable. But currently, enable HTable uses the random-assignment. We hope all the regions show a better distribution, no matter how many regions and how many regionservers. So I suggest to add an option of using round-robin assignment on enable-table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4377: -- Status: Open (was: Patch Available) Prepare for Jenkins patch testing. [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, hbase-4377-trunk.v2.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2611) Handle RS that fails while processing the failure of another one
[ https://issues.apache.org/jira/browse/HBASE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135292#comment-13135292 ] Jean-Daniel Cryans commented on HBASE-2611: --- Actually it would be nice if it was in a separate utility package since atomically moving a znode folder recursively would be a very useful function in general. It might even already exist on the net. Handle RS that fails while processing the failure of another one Key: HBASE-2611 URL: https://issues.apache.org/jira/browse/HBASE-2611 Project: HBase Issue Type: Sub-task Components: replication Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans HBASE-2223 doesn't manage region servers that fail while doing the transfer of HLogs queues from other region servers that failed. Devise a reliable way to do it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
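The recursive znode move J-D describes reduces to copy-then-delete over a sorted path set: create destination nodes top-down (parents before children), then delete source nodes bottom-up (children before parents). This toy models the namespace as an in-memory map; a real implementation would batch the same create/delete ops into a single ZooKeeper `multi()` transaction to get the atomicity he asks for. All names here are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

public class ZNodeMoveSketch {
    // Toy in-memory stand-in for a ZooKeeper namespace (path -> data).
    public static void moveRecursively(NavigableMap<String, byte[]> tree,
                                       String src, String dst) {
        List<String> paths = new ArrayList<>();
        for (String p : tree.navigableKeySet()) {
            if (p.equals(src) || p.startsWith(src + "/")) {
                paths.add(p);                      // sorted => parents first
            }
        }
        for (String p : paths) {                   // create parents before children
            tree.put(dst + p.substring(src.length()), tree.get(p));
        }
        for (int i = paths.size() - 1; i >= 0; i--) {
            tree.remove(paths.get(i));             // delete children before parents
        }
    }
}
```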
[jira] [Updated] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-4377: -- Attachment: hbase-4377.trunk.v3.txt attached a non git style version of v3 of the patch. applies on trunk and 0.92. [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4377: -- Affects Version/s: 0.92.0 Status: Patch Available (was: Open) [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135324#comment-13135324 ] Jonathan Hsieh commented on HBASE-4377: --- Seb glad to hear that this basically worked for you. Would it make sense to add Seb's change as a separate jira after the original patch gets committed? IMO, it feels like it needs a test case as well. [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Bauer updated HBASE-4377: --- Attachment: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch patch with Ted comments and corrected tests PS. for python programer it's still strange that i need to use equals not == for objects ;) [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
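Sebastian's aside about `equals` versus `==` is the classic Java pitfall: `==` compares object references, while `equals` compares values. A minimal illustration:

```java
public class EqualsDemo {
    public static void main(String[] args) {
        String a = new String("region");
        String b = new String("region");
        // Two distinct objects holding equal contents:
        System.out.println(a == b);       // false: reference identity
        System.out.println(a.equals(b));  // true: value equality
    }
}
```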
[jira] [Commented] (HBASE-4648) Bytes.toBigDecimal() doesn't use offset
[ https://issues.apache.org/jira/browse/HBASE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135371#comment-13135371 ] Lars Hofhansl commented on HBASE-4648: -- Patch looks good to me. I'm find applying to 0.90, 0.92, and trunk. Nobody should use toBigDecimal(byte[], int). If nobody objects I'll commit this later today. Bytes.toBigDecimal() doesn't use offset --- Key: HBASE-4648 URL: https://issues.apache.org/jira/browse/HBASE-4648 Project: HBase Issue Type: Bug Components: util Affects Versions: 0.90.4 Environment: Java 1.6.0_26, Mac OS X 10.7 and CentOS 6 Reporter: Bryan Keller Attachments: bigdec.patch, bigdec2.patch The Bytes.toBigDecimal(byte[], offset, len) method does not use the offset, thus you will get an incorrect result for the BigDecimal unless the BigDecimal's bytes are at the beginning of the byte array. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (HBASE-4648) Bytes.toBigDecimal() doesn't use offset
[ https://issues.apache.org/jira/browse/HBASE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135371#comment-13135371 ] Lars Hofhansl edited comment on HBASE-4648 at 10/25/11 7:56 PM: Patch looks good to me. I'm fine applying to 0.90, 0.92, and trunk. Nobody should use toBigDecimal(byte[], int). If nobody objects I'll commit this later today. was (Author: lhofhansl): Patch looks good to me. I'm find applying to 0.90, 0.92, and trunk. Nobody should use toBigDecimal(byte[], int). If nobody objects I'll commit this later today. Bytes.toBigDecimal() doesn't use offset --- Key: HBASE-4648 URL: https://issues.apache.org/jira/browse/HBASE-4648 Project: HBase Issue Type: Bug Components: util Affects Versions: 0.90.4 Environment: Java 1.6.0_26, Mac OS X 10.7 and CentOS 6 Reporter: Bryan Keller Attachments: bigdec.patch, bigdec2.patch The Bytes.toBigDecimal(byte[], offset, len) method does not use the offset, thus you will get an incorrect result for the BigDecimal unless the BigDecimal's bytes are at the beginning of the byte array. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
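The bug is that the decoder ignored its offset argument, so any BigDecimal whose bytes did not start at position 0 decoded garbage. A hedged sketch of an offset-honoring codec — this mirrors a scale-then-unscaled-bytes layout for illustration and is not the actual `Bytes` source:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BigDecimalCodec {
    // Encode the scale as a 4-byte int followed by the unscaled value's
    // two's-complement bytes (illustrative layout).
    public static byte[] toBytes(BigDecimal d) {
        byte[] unscaled = d.unscaledValue().toByteArray();
        return ByteBuffer.allocate(4 + unscaled.length)
                .putInt(d.scale()).put(unscaled).array();
    }

    // The corrected decoder applies the offset to BOTH the scale and the
    // unscaled bytes; the buggy version read from position 0 regardless.
    public static BigDecimal toBigDecimal(byte[] bytes, int offset, int length) {
        int scale = ByteBuffer.wrap(bytes, offset, 4).getInt();
        byte[] unscaled = Arrays.copyOfRange(bytes, offset + 4, offset + length);
        return new BigDecimal(new BigInteger(unscaled), scale);
    }
}
```

A round-trip through a non-zero offset is exactly the case the original method failed.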
[jira] [Updated] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated HBASE-4377: -- Status: Open (was: Patch Available) [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4191) hbase load balancer needs locality awareness
[ https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135395#comment-13135395 ] Karthik Ranganathan commented on HBASE-4191: A couple of initial thoughts on things we would need to consider: 1. I think there should be a weight for node-locality, rack-locality and cross-rack reads while computing the flow. 2. Also, I think we need one more constraint - we want the final state to have roughly the same number of regions per regionserver (within the slop). So we would need a decision tree or some such. hbase load balancer needs locality awareness Key: HBASE-4191 URL: https://issues.apache.org/jira/browse/HBASE-4191 Project: HBase Issue Type: New Feature Reporter: Ted Yu Assignee: Liyin Tang Previously, HBASE-4114 implements the metrics for HFile HDFS block locality, which provides the HFile level locality information. But in order to work with load balancer and region assignment, we need the region level locality information. Let's define the region locality information first, which is almost the same as HFile locality index. HRegion locality index (HRegion A, RegionServer B) = (Total number of HDFS blocks that can be retrieved locally by the RegionServer B for the HRegion A) / ( Total number of the HDFS blocks for the Region A) So the HRegion locality index tells us that how much locality we can get if the HMaster assign the HRegion A to the RegionServer B. So there will be 2 steps involved to assign regions based on the locality. 1) During the cluster start up time, the master will scan the hdfs to calculate the HRegion locality index for each pair of HRegion and Region Server. It is pretty expensive to scan the dfs. So we only needs to do this once during the start up time. 2) During the cluster run time, each region server will update the HRegion locality index as metrics periodically as HBASE-4114 did. 
The Region Server can expose them to the Master through ZK, meta table, or just RPC messages. Based on the HRegion locality index, the assignment manager in the master would have a global knowledge about the region locality distribution and can run the MIN COST MAXIMUM FLOW solver to reach the global optimization. Let's construct the graph first: [Graph] Imaging there is a bipartite graph and the left side is the set of regions and the right side is the set of region servers. There is a source node which links itself to each node in the region set. There is a sink node which is linked from each node in the region server set. [Capacity] The capacity between the source node and region nodes is 1. And the capacity between the region nodes and region server nodes is also 1. (The purpose is each region can ONLY be assigned to one region server at one time) The capacity between the region server nodes and sink node are the avg number of regions which should be assigned each region server. (The purpose is balance the load for each region server) [Cost] The cost between each region and region server is the opposite of locality index, which means the higher locality is, if region A is assigned to region server B, the lower cost it is. The cost function could be more sophisticated when we put more metrics into account. So after running the min-cost max flow solver, the master could assign the regions based on the global locality optimization. Also the master should share this global view to secondary master in case the master fail over happens. In addition, the HBASE-4491 (Locality Checker) is the tool, which is based on the same metrics, to proactively to scan dfs to calculate the global locality information in the cluster. It will help us to verify data locality information during the run time. -- This message is automatically generated by JIRA. 
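The locality index and flow cost defined in the description reduce to two one-liners; a minimal sketch (names are illustrative):

```java
public class LocalityIndex {
    // HRegion locality index (HRegion A, RegionServer B) =
    //   HDFS blocks of A readable locally on B / total HDFS blocks of A.
    public static double localityIndex(int localBlocks, int totalBlocks) {
        return totalBlocks == 0 ? 0.0 : (double) localBlocks / totalBlocks;
    }

    // Edge cost for the min-cost max-flow formulation: the higher the
    // locality, the lower the cost, so the solver prefers local assignments.
    public static double cost(int localBlocks, int totalBlocks) {
        return 1.0 - localityIndex(localBlocks, totalBlocks);
    }
}
```

Summing this cost over all chosen region-to-server edges is the objective the min-cost max-flow solver minimizes.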
[jira] [Commented] (HBASE-4191) hbase load balancer needs locality awareness
[ https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135404#comment-13135404 ] Ted Yu commented on HBASE-4191: --- @Liyin: Thanks for formulating the requirement. I think the description should be modified after consensus is reached. In the meantime, feel free to make comments in this JIRA. I think the following goals may not be achieved at the same time: 1. maximum (node/rack) locality 2. the same number of regions on each live region server As Karthik said, slop is an important factor in decision making. hbase load balancer needs locality awareness Key: HBASE-4191 URL: https://issues.apache.org/jira/browse/HBASE-4191 Project: HBase Issue Type: New Feature Reporter: Ted Yu Assignee: Liyin Tang Previously, HBASE-4114 implements the metrics for HFile HDFS block locality, which provides the HFile level locality information. But in order to work with load balancer and region assignment, we need the region level locality information. Let's define the region locality information first, which is almost the same as HFile locality index. HRegion locality index (HRegion A, RegionServer B) = (Total number of HDFS blocks that can be retrieved locally by the RegionServer B for the HRegion A) / ( Total number of the HDFS blocks for the Region A) So the HRegion locality index tells us that how much locality we can get if the HMaster assign the HRegion A to the RegionServer B. So there will be 2 steps involved to assign regions based on the locality. 1) During the cluster start up time, the master will scan the hdfs to calculate the HRegion locality index for each pair of HRegion and Region Server. It is pretty expensive to scan the dfs. So we only needs to do this once during the start up time. 2) During the cluster run time, each region server will update the HRegion locality index as metrics periodically as HBASE-4114 did. 
The Region Server can expose them to the Master through ZK, meta table, or just RPC messages. Based on the HRegion locality index, the assignment manager in the master would have a global knowledge about the region locality distribution and can run the MIN COST MAXIMUM FLOW solver to reach the global optimization. Let's construct the graph first: [Graph] Imaging there is a bipartite graph and the left side is the set of regions and the right side is the set of region servers. There is a source node which links itself to each node in the region set. There is a sink node which is linked from each node in the region server set. [Capacity] The capacity between the source node and region nodes is 1. And the capacity between the region nodes and region server nodes is also 1. (The purpose is each region can ONLY be assigned to one region server at one time) The capacity between the region server nodes and sink node are the avg number of regions which should be assigned each region server. (The purpose is balance the load for each region server) [Cost] The cost between each region and region server is the opposite of locality index, which means the higher locality is, if region A is assigned to region server B, the lower cost it is. The cost function could be more sophisticated when we put more metrics into account. So after running the min-cost max flow solver, the master could assign the regions based on the global locality optimization. Also the master should share this global view to secondary master in case the master fail over happens. In addition, the HBASE-4491 (Locality Checker) is the tool, which is based on the same metrics, to proactively to scan dfs to calculate the global locality information in the cluster. It will help us to verify data locality information during the run time. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
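The locality index defined above is a plain ratio over a region's HDFS blocks. A minimal Java sketch of the computation, with illustrative names only (the real implementation would pull block locations from the HDFS NameNode, not an in-memory list):

```java
import java.util.List;

// Hypothetical sketch of the HRegion locality index from HBASE-4191:
// index(region A, server B) = (blocks of A with a replica on B) / (total blocks of A).
public class RegionLocality {
    /** blockHosts: for each HDFS block of the region, the hosts holding a replica. */
    public static double localityIndex(List<List<String>> blockHosts, String server) {
        if (blockHosts.isEmpty()) return 0.0;
        int local = 0;
        for (List<String> hosts : blockHosts) {
            if (hosts.contains(server)) local++; // this block is readable locally on `server`
        }
        return (double) local / blockHosts.size();
    }

    public static void main(String[] args) {
        // Four blocks, three replicas each; rs3 holds replicas of the first three.
        List<List<String>> blocks = List.of(
            List.of("rs1", "rs2", "rs3"),
            List.of("rs2", "rs3", "rs4"),
            List.of("rs1", "rs3", "rs5"),
            List.of("rs2", "rs4", "rs5"));
        System.out.println(localityIndex(blocks, "rs3")); // 0.75
    }
}
```

The min-cost max-flow solver would then use (1 - localityIndex) as the edge cost between each region node and region server node.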
[jira] [Created] (HBASE-4672) Ability to expand HBase racks gracefully
Ability to expand HBase racks gracefully Key: HBASE-4672 URL: https://issues.apache.org/jira/browse/HBASE-4672 Project: HBase Issue Type: Umbrella Reporter: Karthik Ranganathan Assignee: Karthik Ranganathan When an HBase cell gets pretty saturated on top-of-rack bandwidth, adding another rack of regionservers actually hurts. This is a task to be able to gracefully expand the cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4673) NPE in HFileReaderV2.close during major compaction when hfile.block.cache.size is set to 0
NPE in HFileReaderV2.close during major compaction when hfile.block.cache.size is set to 0 --- Key: HBASE-4673 URL: https://issues.apache.org/jira/browse/HBASE-4673 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Lars Hofhansl Priority: Minor On a test system we got this exception when hfile.block.cache.size is set to 0: java.lang.NullPointerException at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.close(HFileReaderV2.java:321) at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.close(StoreFile.java:1065) at org.apache.hadoop.hbase.regionserver.StoreFile.closeReader(StoreFile.java:539) at org.apache.hadoop.hbase.regionserver.StoreFile.deleteReader(StoreFile.java:549) at org.apache.hadoop.hbase.regionserver.Store.completeCompaction(Store.java:1314) at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:686) at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1016) at org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest.run(CompactionRequest.java:178) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Minor issue, as nobody in their right mind would have hfile.block.cache.size=0. Looks like this is due to HBASE-4422 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
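The kind of guard this NPE calls for can be shown with a small self-contained sketch: when the block cache is disabled (hfile.block.cache.size=0) the cache reference may be null, so close() must not dereference it unconditionally. Class and field names here are illustrative, not the actual HBase patch:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: mimics a reader whose block cache may be absent when
// hfile.block.cache.size is 0. The null check in close() is the shape of fix
// the stack trace above suggests; names are hypothetical, not HBase's.
public class GuardedReader {
    private final Map<String, byte[]> blockCache; // null when the cache is disabled
    private final String name;

    GuardedReader(Map<String, byte[]> cache, String name) {
        this.blockCache = cache;
        this.name = name;
    }

    public void close(boolean evictOnClose) {
        // Guard: with the cache disabled there is nothing to evict from.
        if (blockCache != null && evictOnClose) {
            blockCache.remove(name);
        }
    }

    public static void main(String[] args) {
        new GuardedReader(null, "hfile-1").close(true); // no NPE with cache disabled
        Map<String, byte[]> cache = new HashMap<>();
        cache.put("hfile-2", new byte[0]);
        new GuardedReader(cache, "hfile-2").close(true); // evicts the cached block
        System.out.println(cache.containsKey("hfile-2")); // false
    }
}
```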
[jira] [Created] (HBASE-4674) splitLog silently fails
splitLog silently fails --- Key: HBASE-4674 URL: https://issues.apache.org/jira/browse/HBASE-4674 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Environment: splitLog() can fail silently and region can open w/o its edits getting replayed. Reporter: Prakash Khemani Assignee: Prakash Khemani Priority: Blocker -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4508) Backport HBASE-3777 to 0.90 branch
[ https://issues.apache.org/jira/browse/HBASE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4508: -- Release Note: A new config parameter, hbase.connection.per.config, has been added. If set to true, there is no connection sharing. If set to false, connection sharing is enabled so that fewer connections to ZooKeeper are used. Backport HBASE-3777 to 0.90 branch -- Key: HBASE-4508 URL: https://issues.apache.org/jira/browse/HBASE-4508 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Bright Fulton Fix For: 0.90.5 Attachments: HBASE-4508.v1.patch, HBASE-4508.v2.patch, HBASE-4508.v3.patch, HBASE-4508.v4.git.patch, HBASE-4508.v4.patch, HBASE-4508.v5.patch See discussion here: http://search-hadoop.com/m/MJBId1aazTR1/backporting+HBASE-3777+to+0.90subj=backporting+HBASE+3777+to+0+90 Rocketfuel has been running 0.90.3 with HBASE-3777 since its resolution. They have 10 RS nodes, 1 Master, and 1 ZooKeeper. Live writes and reads, but super heavy on reads. The cache hit rate is pretty high. The qps in one of their data centers is 50K. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
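Per the release note, the new flag would go in hbase-site.xml; a minimal sketch (the value shown, true, is just one possible choice, disabling connection sharing):

```xml
<!-- Sketch of the hbase.connection.per.config setting from the release note above.
     true  = one connection per Configuration (no sharing)
     false = share connections, using fewer ZooKeeper connections -->
<property>
  <name>hbase.connection.per.config</name>
  <value>true</value>
</property>
```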
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135459#comment-13135459 ] Ted Yu commented on HBASE-4377: --- Sebastian's latest patch applies to 0.92 and the hbck-related tests passed. I think we should include his enhancement. A test for his scenario can be added later. @Jonathan: What's your opinion? [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
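The first step such an offline rebuild needs is a walk of the table directory collecting each region's .regioninfo file. A self-contained sketch of that step (paths and names are hypothetical; the real hbck code uses the Hadoop FileSystem API, not java.nio):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: find every region directory under a table directory
// that carries a .regioninfo file, the raw input for rebuilding .META.
public class RegioninfoScan {
    static List<Path> findRegioninfos(Path tableDir) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> regions = Files.newDirectoryStream(tableDir)) {
            for (Path region : regions) {
                Path ri = region.resolve(".regioninfo");
                if (Files.exists(ri)) found.add(ri); // a region dir with metadata
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Build a fake table layout: two regions with metadata, one without.
        Path table = Files.createTempDirectory("table");
        for (String r : new String[] {"r1", "r2"}) {
            Path dir = Files.createDirectories(table.resolve(r));
            Files.createFile(dir.resolve(".regioninfo"));
        }
        Files.createDirectories(table.resolve("empty")); // no .regioninfo here
        System.out.println(findRegioninfos(table).size()); // 2
    }
}
```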
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135478#comment-13135478 ] Jonathan Hsieh commented on HBASE-4377: --- @Ted I'm basically ok with it. @Seb can you post some of the bad .regioninfo files? I'm curious about what you did to need to use a full rebuild! [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135486#comment-13135486 ] Hadoop QA commented on HBASE-4377: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12500737/0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 18 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -167 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestMasterObserver org.apache.hadoop.hbase.master.TestDistributedLogSplitting Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/66//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/66//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/66//console This message is automatically generated. [hbck] Offline rebuild .META. from fs data only. 
Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135494#comment-13135494 ] Ted Yu commented on HBASE-4377: --- Some tests failed due to: {code} Caused by: java.io.IOException: Too many open files at sun.nio.ch.IOUtil.initPipe(Native Method) at sun.nio.ch.EPollSelectorImpl.init(EPollSelectorImpl.java:49) at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18) {code} One test failure is tracked by HBASE-4675 [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, hbase-4377-trunk.v2.patch, hbase-4377.trunk.v3.txt In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow-on work could give options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135512#comment-13135512 ] Jonathan Hsieh commented on HBASE-4379: --- This one is very similar to HBASE-4378, which was recently reviewed/committed. Any comments/complaints about this one? [hbck] Does not complain about tables with no end region [Z,] - Key: HBASE-4379 URL: https://issues.apache.org/jira/browse/HBASE-4379 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.92.0, 0.90.5 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4379-hbck-does-not-complain-about-tables-with-.patch hbck does not detect or have an error condition when the last region of a table is missing (end key != ''). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4379) [hbck] Does not complain about tables with no end region [Z,]
[ https://issues.apache.org/jira/browse/HBASE-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135517#comment-13135517 ] Ted Yu commented on HBASE-4379: --- {code} The last region in table was not null. There should be a region {code} I think the above message can be improved. How about: {code} The last region in table doesn't have null end key. There should be a region {code} [hbck] Does not complain about tables with no end region [Z,] - Key: HBASE-4379 URL: https://issues.apache.org/jira/browse/HBASE-4379 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.92.0, 0.90.5 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4379-hbck-does-not-complain-about-tables-with-.patch hbck does not detect or have an error condition when the last region of a table is missing (end key != ''). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
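The check HBASE-4379 asks for is simple: in a healthy table the regions, sorted by start key, chain end-to-start and the last region's end key is the empty string. A toy version of the missing detection (names and the pair-of-strings representation are illustrative, not hbck's actual data model):

```java
import java.util.List;

// Illustrative version of the hbck check discussed above: flag a table whose
// last region does not have an empty ('') end key.
public class EndRegionCheck {
    /** regions: [startKey, endKey] pairs sorted by start key; "" means open-ended. */
    static boolean lastRegionMissing(List<String[]> regions) {
        if (regions.isEmpty()) return true;
        String lastEnd = regions.get(regions.size() - 1)[1];
        return !lastEnd.isEmpty(); // end key != '' means the tail region is gone
    }

    public static void main(String[] args) {
        List<String[]> ok = List.of(new String[]{"", "m"}, new String[]{"m", ""});
        List<String[]> broken = List.of(new String[]{"", "m"}, new String[]{"m", "z"});
        System.out.println(lastRegionMissing(ok));     // false: table is complete
        System.out.println(lastRegionMissing(broken)); // true: [z,] region missing
    }
}
```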
[jira] [Commented] (HBASE-4304) requestsPerSecond counter stuck at 0
[ https://issues.apache.org/jira/browse/HBASE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135534#comment-13135534 ] Li Pi commented on HBASE-4304: -- On it! requestsPerSecond counter stuck at 0 Key: HBASE-4304 URL: https://issues.apache.org/jira/browse/HBASE-4304 Project: HBase Issue Type: Bug Components: master, regionserver Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Li Pi Priority: Critical Fix For: 0.92.0 Running trunk @ r1163343, all of the requestsPerSecond counters are showing 0 both in the master UI and in the RS UI. The writeRequestsCount metric is properly updating in the RS UI. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
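A requests-per-second gauge is typically derived from a monotonically growing request counter sampled at intervals; if the counter feeding it is never incremented (or the delta is never taken), the rate sticks at 0. A minimal sketch of the delta computation, with hypothetical names, not HBase's metrics classes:

```java
// Illustrative rate-from-counter computation: the shape of metric that the
// stuck requestsPerSecond counter should be producing.
public class RequestsPerSecond {
    private long lastCount = 0;
    private long lastTimeMs = 0;

    /** Called periodically with the running total; returns the rate since last call. */
    double update(long totalRequests, long nowMs) {
        double rate = 0.0;
        if (lastTimeMs > 0 && nowMs > lastTimeMs) {
            rate = (totalRequests - lastCount) * 1000.0 / (nowMs - lastTimeMs);
        }
        lastCount = totalRequests;
        lastTimeMs = nowMs;
        return rate; // stays 0 forever if totalRequests never advances
    }

    public static void main(String[] args) {
        RequestsPerSecond m = new RequestsPerSecond();
        m.update(0, 1_000);                       // first sample: no previous point
        System.out.println(m.update(500, 2_000)); // 500 requests over 1 second
    }
}
```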
[jira] [Created] (HBASE-4676) Prefix Compression - Trie data block encoding
Prefix Compression - Trie data block encoding - Key: HBASE-4676 URL: https://issues.apache.org/jira/browse/HBASE-4676 Project: HBase Issue Type: New Feature Components: io Affects Versions: 0.94.0 Reporter: Matt Corgan The HBase data block format has room for 2 significant improvements for applications that have high block cache hit ratios. First, there is no prefix compression, and the current KeyValue format is somewhat metadata heavy, so there can be tremendous memory bloat for many common data layouts, specifically those with long keys and short values. Second, there is no random access to KeyValues inside data blocks. This means that every time you double the datablock size, average seek time (or average cpu consumption) goes up by a factor of 2. The standard 64KB block size is ~10x slower for random seeks than a 4KB block size, but block sizes as small as 4KB cause problems elsewhere. Using block sizes of 256KB or 1MB or more may be more efficient from a disk access and block-cache perspective in many big-data applications, but doing so is infeasible from a random seek perspective. The PrefixTrie block encoding format attempts to solve both of these problems. Some features: * trie format for row key encoding completely eliminates duplicate row keys and encodes similar row keys into a standard trie structure which also saves a lot of space * the column family is currently stored once at the beginning of each block. this could easily be modified to allow multiple family names per block * all qualifiers in the block are stored in their own trie format which caters nicely to wide rows. duplicate qualifiers between rows are eliminated. the size of this trie determines the width of the block's qualifier fixed-width-int * the minimum timestamp is stored at the beginning of the block, and deltas are calculated from that. 
the maximum delta determines the width of the block's timestamp fixed-width-int. The block is structured with metadata at the beginning, then a section for the row trie, then the column trie, then the timestamp deltas, and then all the values. Most work is done in the row trie, where every leaf node (corresponding to a row) contains a list of offsets/references corresponding to the cells in that row. Each cell is fixed-width to enable binary searching and is represented by [1 byte operationType, X bytes qualifier offset, X bytes timestamp delta offset]. If all operation types are the same for a block, there will be zero per-cell overhead. Same for timestamps. Same for qualifiers when I get a chance. So, the compression aspect is very strong, but makes a few small sacrifices on VarInt size to enable faster binary searches in trie fan-out nodes. A more compressed but slower version might build on this by also applying further (suffix, etc) compression on the trie nodes at the cost of slower write speed. Even further compression could be obtained by using all VInts instead of FInts with a sacrifice on random seek speed (though not huge). One current drawback is the current write speed. While programmed with good constructs like TreeMaps, ByteBuffers, binary searches, etc, it's not programmed with the same level of optimization as the read path. Work will need to be done to optimize the data structures used for encoding and could probably show a 10x increase. It will still be slower than delta encoding, but with a much higher decode speed. I have not yet created a thorough benchmark for write speed nor sequential read speed. Though the trie is reaching a point where it is internally very efficient (probably within half or a quarter of its max read speed) the way that hbase currently uses it is far from optimal. 
The KeyValueScanner and related classes that iterate through the trie will eventually need to be smarter and have methods to do things like skipping to the next row of results without scanning every cell in between. When that is accomplished it will also allow much faster compactions because the full row key will not have to be compared as often as it is now. Current code is on github. The trie code is in a separate project than the slightly modified hbase. There is an hbase project there as well with the DeltaEncoding patch applied, and it builds on top of that. https://github.com/hotpads/hbase/tree/delta-encoding-plus-trie https://github.com/hotpads/hbase-prefix-trie I'll follow up later with more implementation ideas. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
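The space win the trie targets can be illustrated with the much simpler cousin it is compared against: prefix delta encoding, where each sorted row key is stored as (shared-prefix length, suffix) relative to the previous key. This toy sketch is not the PrefixTrie format above, just a demonstration of how much duplication sorted row keys carry:

```java
import java.util.ArrayList;
import java.util.List;

// Toy prefix delta encoding of sorted row keys: store only the length of the
// prefix shared with the previous key, plus the differing suffix.
public class PrefixDelta {
    static List<String> encode(List<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int shared = 0;
            int max = Math.min(prev.length(), key.length());
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) shared++;
            out.add(shared + ":" + key.substring(shared)); // prefix length + suffix
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("user1111", "user1112", "user1120", "user2000");
        System.out.println(encode(keys)); // [0:user1111, 7:2, 6:20, 4:2000]
    }
}
```

The trie goes further: it shares prefixes across all keys at once (not just adjacent pairs) and keeps fan-out nodes binary-searchable, which is where the random-access seek advantage described above comes from.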
[jira] [Updated] (HBASE-4676) Prefix Compression - Trie data block encoding
[ https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Corgan updated HBASE-4676: --- Attachment: SeeksPerSec by blockSize.png See the attached chart for a comparison of seeks/second between the current block format (NONE), the PREFIX delta encoding, and the TRIE encoding. This shows how important the random block access is when blocks are at the default size, let alone if they're larger. With 16 software threads running on a 4 core (8 hyperthread) cpu, the current block format NONE can do ~38k seeks/s, the PREFIX encoding can do ~20k, and the TRIE encoding can do ~1.6mm. So 40-80x faster seeks at medium block size. Prefix Compression - Trie data block encoding - Key: HBASE-4676 URL: https://issues.apache.org/jira/browse/HBASE-4676 Project: HBase Issue Type: New Feature Components: io Affects Versions: 0.94.0 Reporter: Matt Corgan Attachments: SeeksPerSec by blockSize.png The HBase data block format has room for 2 significant improvements for applications that have high block cache hit ratios. First, there is no prefix compression, and the current KeyValue format is somewhat metadata heavy, so there can be tremendous memory bloat for many common data layouts, specifically those with long keys and short values. Second, there is no random access to KeyValues inside data blocks. This means that every time you double the datablock size, average seek time (or average cpu consumption) goes up by a factor of 2. The standard 64KB block size is ~10x slower for random seeks than a 4KB block size, but block sizes as small as 4KB cause problems elsewhere. Using block sizes of 256KB or 1MB or more may be more efficient from a disk access and block-cache perspective in many big-data applications, but doing so is infeasible from a random seek perspective. The PrefixTrie block encoding format attempts to solve both of these problems. 
Some features: * trie format for row key encoding completely eliminates duplicate row keys and encodes similar row keys into a standard trie structure which also saves a lot of space * the column family is currently stored once at the beginning of each block. this could easily be modified to allow multiple family names per block * all qualifiers in the block are stored in their own trie format which caters nicely to wide rows. duplicate qualifiers between rows are eliminated. the size of this trie determines the width of the block's qualifier fixed-width-int * the minimum timestamp is stored at the beginning of the block, and deltas are calculated from that. the maximum delta determines the width of the block's timestamp fixed-width-int. The block is structured with metadata at the beginning, then a section for the row trie, then the column trie, then the timestamp deltas, and then all the values. Most work is done in the row trie, where every leaf node (corresponding to a row) contains a list of offsets/references corresponding to the cells in that row. Each cell is fixed-width to enable binary searching and is represented by [1 byte operationType, X bytes qualifier offset, X bytes timestamp delta offset]. If all operation types are the same for a block, there will be zero per-cell overhead. Same for timestamps. Same for qualifiers when I get a chance. So, the compression aspect is very strong, but makes a few small sacrifices on VarInt size to enable faster binary searches in trie fan-out nodes. A more compressed but slower version might build on this by also applying further (suffix, etc) compression on the trie nodes at the cost of slower write speed. Even further compression could be obtained by using all VInts instead of FInts with a sacrifice on random seek speed (though not huge). One current drawback is the current write speed. 
While programmed with good constructs like TreeMaps, ByteBuffers, binary searches, etc, it's not programmed with the same level of optimization as the read path. Work will need to be done to optimize the data structures used for encoding and could probably show a 10x increase. It will still be slower than delta encoding, but with a much higher decode speed. I have not yet created a thorough benchmark for write speed nor sequential read speed. Though the trie is reaching a point where it is internally very efficient (probably within half or a quarter of its max read speed) the way that hbase currently uses it is far from optimal. The KeyValueScanner and related classes that iterate through the trie will eventually need to be smarter and have methods to do things like skipping to the next row of results without scanning every cell in between. When that is accomplished
[jira] [Commented] (HBASE-3515) [replication] ReplicationSource can miss a log after RS comes out of GC
[ https://issues.apache.org/jira/browse/HBASE-3515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135572#comment-13135572 ] Jean-Daniel Cryans commented on HBASE-3515: --- It seems the code changed a lot because now in trunk postLogRoll throws an IOE that kills the server. All I need to do is to bubble it up! [replication] ReplicationSource can miss a log after RS comes out of GC --- Key: HBASE-3515 URL: https://issues.apache.org/jira/browse/HBASE-3515 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3515.patch This is from Hudson build 1738, if a log is about to be rolled and the ZK connection is already closed then the replication code will fail at adding the new log in ZK but the log will still be rolled and it's possible that some edits will make it in. From the log: {quote} 2011-02-08 10:21:20,618 FATAL [RegionServer:0;vesta.apache.org,46117,1297160399378.logRoller] regionserver.HRegionServer(1383): ABORTING region server serverName=vesta.apache.org,46117,1297160399378, load=(requests=1525, regions=12, usedHeap=273, maxHeap=1244): Failed add log to list org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /1/replication/rs/vesta.apache.org,46117,1297160399378/2/vesta.apache.org%3A46117.1297160480509 ... 
2011-02-08 10:21:22,444 DEBUG [MASTER_META_SERVER_OPERATIONS-vesta.apache.org:56008-0] wal.HLogSplitter(258): Splitting hlog 8 of 8: hdfs://localhost:55474/user/hudson/.logs/vesta.apache.org,46117,1297160399378/vesta.apache.org%3A46117.1297160480509, length=0 2011-02-08 10:21:22,862 DEBUG [MASTER_META_SERVER_OPERATIONS-vesta.apache.org:56008-0] wal.HLogSplitter(436): Pushed=31 entries from hdfs://localhost:55474/user/hudson/.logs/vesta.apache.org,46117,1297160399378/vesta.apache.org%3A46117.1297160480509 {quote} The easiest thing to do would be let the exception out and cancel the log roll. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4677) Remove old single bulkLoadHFile method
Remove old single bulkLoadHFile method -- Key: HBASE-4677 URL: https://issues.apache.org/jira/browse/HBASE-4677 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh In the review for HBASE-4649, there is some debate as to whether to remove, deprecate, or leave the HRegionServer.bulkLoadHFile method. https://reviews.apache.org/r/2545/ . This jira will take care of that for the 0.92 and trunk releases, and allow the same patch to remain for an optional 0.90.x patch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-4677) Remove old single bulkLoadHFile method
[ https://issues.apache.org/jira/browse/HBASE-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh reassigned HBASE-4677: - Assignee: Jonathan Hsieh Remove old single bulkLoadHFile method -- Key: HBASE-4677 URL: https://issues.apache.org/jira/browse/HBASE-4677 Project: HBase Issue Type: Sub-task Components: regionserver Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Fix For: 0.92.0 In the review for HBASE-4649, there is some debate as to whether to remove, deprecate, or leave the HRegionServer.bulkLoadHFile method. https://reviews.apache.org/r/2545/ . This jira will take care of that for the 0.92 and trunk releases, and allow the same patch to remain for an optional 0.90.x patch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4565) Maven HBase build broken on cygwin with copynativelib.sh call.
[ https://issues.apache.org/jira/browse/HBASE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135598#comment-13135598 ] Li Pi commented on HBASE-4565: -- +1. Not having to dual boot on my desktop would be awesome. Maven HBase build broken on cygwin with copynativelib.sh call. -- Key: HBASE-4565 URL: https://issues.apache.org/jira/browse/HBASE-4565 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.0 Environment: cygwin (on xp and win7) Reporter: Suraj Varma Assignee: Suraj Varma Labels: build, maven Fix For: 0.94.0 This is broken in both the 0.92 and trunk pom.xml. Here's a sample maven log snippet from trunk (from Mayuresh on the user mailing list): [INFO] [antrun:run {execution: package}] [INFO] Executing tasks main: [mkdir] Created dir: D:\workspace\mkshirsa\hbase-trunk\target\hbase-0.93-SNAPSHOT\hbase-0.93-SNAPSHOT\lib\native\${build.platform} [exec] ls: cannot access D:workspacemkshirsahbase-trunktarget/nativelib: No such file or directory [exec] tar (child): Cannot connect to D: resolve failed [INFO] [ERROR] BUILD ERROR [INFO] [INFO] An Ant BuildException has occured: exec returned: 3328 There are two issues: 1) The ant run task below doesn't resolve the windows file separator returned by project.build.directory - this causes the "resolve failed" above. <!-- Using Unix cp to preserve symlinks, using script to handle wildcards --> <echo file="${project.build.directory}/copynativelibs.sh"> if [ `ls ${project.build.directory}/nativelib | wc -l` -ne 0 ]; then 2) The tar argument value below also has a similar issue in that the path arg doesn't resolve right. 
!-- Using Unix tar to preserve symlinks -- exec executable=tar failonerror=yes dir=${project.build.directory}/${project.artifactId}-${project.version} arg value=czf/ arg value=/cygdrive/c/workspaces/hbase-0.92-svn/target/${project.artifactId}-${project.version}.tar.gz/ arg value=./ /exec In both cases, the fix would probably be to use a cross-platform way to handle the directory locations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
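The backslash-eating symptom in the log above ("D:workspacemkshirsahbase-trunktarget") can be modeled outside Maven. The following is only a crude sketch of what a Unix shell does to an unquoted Windows path (an assumption for illustration, not HBase or Maven code):

```java
public class SeparatorLoss {
    // Crude model: a POSIX shell treats a lone backslash as an escape
    // character, so unquoted Windows separators simply vanish, which is
    // why ${project.build.directory} collapses to D:workspace... above.
    static String shellUnescape(String s) {
        return s.replace("\\", "");
    }

    public static void main(String[] args) {
        String windowsPath = "D:\\workspace\\mkshirsa\\hbase-trunk\\target";
        // The separators are gone once the shell has "interpreted" them.
        assert shellUnescape(windowsPath).equals("D:workspacemkshirsahbase-trunktarget");
        System.out.println(shellUnescape(windowsPath));
    }
}
```

This is why the suggested fix is to let the build tool resolve paths in a platform-neutral way rather than splicing `${project.build.directory}` into shell text.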
[jira] [Created] (HBASE-4679) Thrift null mutation error
Thrift null mutation error -- Key: HBASE-4679 URL: https://issues.apache.org/jira/browse/HBASE-4679 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg When using null as a value for a mutation, the HBase thrift client failed and threw an error. We should instead check for a null byte buffer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
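A minimal sketch of the kind of guard the description implies: normalize a possibly-null value buffer before use instead of dereferencing it. The helper name is illustrative, not the actual Thrift server code:

```java
import java.nio.ByteBuffer;

public class MutationValueGuard {
    // Hypothetical helper mirroring the proposed direction: a null value
    // buffer is treated as an empty byte array rather than throwing.
    static byte[] getBytes(ByteBuffer buf) {
        if (buf == null) {
            return new byte[0];
        }
        // duplicate() so reading does not disturb the caller's position
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        assert getBytes(null).length == 0;
        assert getBytes(ByteBuffer.wrap(new byte[]{1, 2, 3})).length == 3;
    }
}
```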
[jira] [Updated] (HBASE-4679) Thrift null mutation error
[ https://issues.apache.org/jira/browse/HBASE-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Spiegelberg updated HBASE-4679: --- Status: Patch Available (was: Open) Thrift null mutation error -- Key: HBASE-4679 URL: https://issues.apache.org/jira/browse/HBASE-4679 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Attachments: HBASE-4679.patch When using null as a value for a mutation, the HBase thrift client failed and threw an error. We should instead check for a null byte buffer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4679) Thrift null mutation error
[ https://issues.apache.org/jira/browse/HBASE-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Spiegelberg updated HBASE-4679: --- Attachment: HBASE-4679.patch Thrift null mutation error -- Key: HBASE-4679 URL: https://issues.apache.org/jira/browse/HBASE-4679 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Attachments: HBASE-4679.patch When using null as a value for a mutation, the HBase thrift client failed and threw an error. We should instead check for a null byte buffer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4679) Thrift null mutation error
[ https://issues.apache.org/jira/browse/HBASE-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135652#comment-13135652 ] Hadoop QA commented on HBASE-4679: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12500808/HBASE-4679.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/67//console This message is automatically generated. Thrift null mutation error -- Key: HBASE-4679 URL: https://issues.apache.org/jira/browse/HBASE-4679 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Attachments: HBASE-4679.patch When using null as a value for a mutation, the HBase thrift client failed and threw an error. We should instead check for a null byte buffer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4577) Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB
[ https://issues.apache.org/jira/browse/HBASE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135686#comment-13135686 ] gaojinchao commented on HBASE-4577: --- All store files have the same block structure, so I think storefileUncompressedSizeMB means the sum of all uncompressedSizeWithHeader values. Here are some comments: /** Total uncompressed bytes, maybe calculate a compression ratio later. */ protected long totalUncompressedBytes = 0; I will try to make a patch. Can you review whether it makes sense? Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB - Key: HBASE-4577 URL: https://issues.apache.org/jira/browse/HBASE-4577 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: gaojinchao Priority: Minor Fix For: 0.92.0 Minor issue while looking at the RS metrics: bq. numberOfStorefiles=8, storefileUncompressedSizeMB=2418, storefileSizeMB=2420, compressionRatio=1.0008 I guess there's a truncation somewhere when it's adding the numbers up. FWIW there's no compression on that table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
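The truncation guessed at above can be illustrated directly: converting each store file's size to MB before summing loses up to a megabyte per file, while summing bytes first truncates only once, so the two totals can legitimately disagree even with no compression. This is only an illustration of the arithmetic, not the actual metrics code:

```java
public class StorefileSizeTruncation {
    static final long MB = 1024 * 1024;

    // Convert each file's size to MB first, then sum (per-file truncation).
    static long sumAsMb(long[] bytes) {
        long mb = 0;
        for (long b : bytes) {
            mb += b / MB; // drops the fractional MB of every file
        }
        return mb;
    }

    // Sum the raw bytes first, then convert once (single truncation).
    static long sumThenMb(long[] bytes) {
        long total = 0;
        for (long b : bytes) {
            total += b;
        }
        return total / MB;
    }

    public static void main(String[] args) {
        // Eight files of 1.5 MB each, uncompressed size == on-disk size.
        long[] sizes = new long[8];
        java.util.Arrays.fill(sizes, 3L * MB / 2);
        assert sumAsMb(sizes) == 8;    // loses 0.5 MB per file
        assert sumThenMb(sizes) == 12; // truncates only the final sum
    }
}
```

If one metric is accumulated per-file in MB and the other from total bytes, a gap like 2418 vs 2420 falls out naturally.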
[jira] [Updated] (HBASE-4300) Start of new-version master fails if old master's znode is hanging around
[ https://issues.apache.org/jira/browse/HBASE-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4300: - Status: In Progress (was: Patch Available) Start of new-version master fails if old master's znode is hanging around - Key: HBASE-4300 URL: https://issues.apache.org/jira/browse/HBASE-4300 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: stack Priority: Critical Fix For: 0.92.0 Attachments: 4300-v2.txt, 4300.txt I shut down an 0.90 cluster, and had to do so uncleanly. I then started a trunk (0.92) cluster before the old master znode had expired. This caused: java.lang.StringIndexOutOfBoundsException: String index out of range: -1 at java.lang.String.substring(String.java:1937) at org.apache.hadoop.hbase.ServerName.parseHostname(ServerName.java:81) at org.apache.hadoop.hbase.ServerName.<init>(ServerName.java:63) at org.apache.hadoop.hbase.master.ActiveMasterManager.blockUntilBecomingActiveMaster(ActiveMasterManager.java:148) at org.apache.hadoop.hbase.master.HMaster.becomeActiveMaster(HMaster.java:342) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:297) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4300) Start of new-version master fails if old master's znode is hanging around
[ https://issues.apache.org/jira/browse/HBASE-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4300: - Status: Patch Available (was: In Progress) Seeing if I can trigger new patch build Start of new-version master fails if old master's znode is hanging around - Key: HBASE-4300 URL: https://issues.apache.org/jira/browse/HBASE-4300 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: stack Priority: Critical Fix For: 0.92.0 Attachments: 4300-v2.txt, 4300.txt I shut down an 0.90 cluster, and had to do so uncleanly. I then started a trunk (0.92) cluster before the old master znode had expired. This caused: java.lang.StringIndexOutOfBoundsException: String index out of range: -1 at java.lang.String.substring(String.java:1937) at org.apache.hadoop.hbase.ServerName.parseHostname(ServerName.java:81) at org.apache.hadoop.hbase.ServerName.<init>(ServerName.java:63) at org.apache.hadoop.hbase.master.ActiveMasterManager.blockUntilBecomingActiveMaster(ActiveMasterManager.java:148) at org.apache.hadoop.hbase.master.HMaster.becomeActiveMaster(HMaster.java:342) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:297) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
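A hypothetical reconstruction of the parse failure in the stack trace above: the stale znode holds old-format server data without the comma-delimited host,port,startcode form the new parser expects, so indexOf returns -1 and substring throws. The guard below is illustrative only, not the committed fix:

```java
public class ServerNameParseSketch {
    // Sketch of a defensive parseHostname: reject data that is not in the
    // expected "host,port,startcode" form instead of calling
    // substring(0, -1), which is what produces the
    // StringIndexOutOfBoundsException in the report.
    static String parseHostname(String serverName) {
        int index = serverName.indexOf(',');
        if (index < 0) {
            throw new IllegalArgumentException(
                "Not a host,port,startcode server name: " + serverName);
        }
        return serverName.substring(0, index);
    }

    public static void main(String[] args) {
        assert parseHostname("rs1.example.com,60020,1319700000000")
            .equals("rs1.example.com");
        boolean rejected = false;
        try {
            parseHostname("rs1.example.com:60020"); // hypothetical stale znode data
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        assert rejected;
    }
}
```

With a check like this, the master could log and delete (or wait out) the stale znode rather than dying in its constructor path.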
[jira] [Updated] (HBASE-4669) Add an option of using round-robin assignment for enabling table
[ https://issues.apache.org/jira/browse/HBASE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-4669: Attachment: HBASE-4669-Trunk.patch Add an option of using round-robin assignment for enabling table Key: HBASE-4669 URL: https://issues.apache.org/jira/browse/HBASE-4669 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.90.4, 0.94.0 Reporter: Jieshan Bean Priority: Minor Fix For: 0.94.0, 0.90.5 Attachments: HBASE-4669-Trunk.patch Under some scenarios, we use the disable/enable HTable function. But currently, enabling an HTable uses random assignment. We hope all the regions show a better distribution, no matter how many regions and regionservers there are. So I suggest adding an option of using round-robin assignment on enable-table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4508) Backport HBASE-3777 to 0.90 branch
[ https://issues.apache.org/jira/browse/HBASE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135694#comment-13135694 ] Ted Yu commented on HBASE-4508: --- From Shrijeet:
1. Rolling restart with 0.90 trunk + patch jar: PASSED
2. Verify that 0.90 trunk jar leaks connections with HTable pool size set to 10, hbase.zookeeper.property.maxClientCnxns set to 20 and launching 10 client threads: PASSED
3. Verify that 0.90 trunk jar + patch does not leak connections with HTable pool size set to 10, hbase.zookeeper.property.maxClientCnxns set to 20 and launching 10 client threads: PASSED
4. Verify that 0.90 trunk jar + patch still holds up with HTable pool size set to 10, hbase.zookeeper.property.maxClientCnxns set to 5 and launching 100 client threads: PASSED
The number of established connections in my final test was 2. This I verified using both netstat and the zookeeper dump in the UI. One caveat: remember to set hbase.connection.per.config to false in order to use connection sharing.
5. hbase.connection.per.config=true, max connections set to 300 and bumping ulimit from 1024 to 2048, client threads 100: PASSED
Backport HBASE-3777 to 0.90 branch -- Key: HBASE-4508 URL: https://issues.apache.org/jira/browse/HBASE-4508 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Bright Fulton Fix For: 0.90.5 Attachments: HBASE-4508.v1.patch, HBASE-4508.v2.patch, HBASE-4508.v3.patch, HBASE-4508.v4.git.patch, HBASE-4508.v4.patch, HBASE-4508.v5.patch See discussion here: http://search-hadoop.com/m/MJBId1aazTR1/backporting+HBASE-3777+to+0.90&subj=backporting+HBASE+3777+to+0+90 Rocketfuel has been running 0.90.3 with HBASE-3777 since its resolution. They have 10 RS nodes, 1 Master and 1 Zookeeper. Live writes and reads but super heavy on reads. Cache hit is pretty high. The qps on one of their data centers is 50K. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4669) Add an option of using round-robin assignment for enabling table
[ https://issues.apache.org/jira/browse/HBASE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135697#comment-13135697 ] Ted Yu commented on HBASE-4669: --- Do we have to introduce assignUserRegionsToOnlineServers() ? We already have: {code} public void assignUserRegions(List<HRegionInfo> regions, List<ServerName> servers) {code} I don't think IOE should be swallowed: {code} +try { + assignmentManager.assignUserRegionsToOnlineServers(regions); +} catch (IOException e) { + LOG.error("Error while assigning user regions to online servers", e); +} catch (InterruptedException e) { {code} Please add javadoc for the new unit test. Publishing on reviewboard would allow more people to comment. Add an option of using round-robin assignment for enabling table Key: HBASE-4669 URL: https://issues.apache.org/jira/browse/HBASE-4669 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.90.4, 0.94.0 Reporter: Jieshan Bean Priority: Minor Fix For: 0.94.0, 0.90.5 Attachments: HBASE-4669-Trunk.patch Under some scenarios, we use the disable/enable HTable function. But currently, enabling an HTable uses random assignment. We hope all the regions show a better distribution, no matter how many regions and regionservers there are. So I suggest adding an option of using round-robin assignment on enable-table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
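For reference, the round-robin placement being proposed for enable-table can be sketched with plain strings standing in for HRegionInfo and ServerName. This is a minimal sketch of the distribution property, not the patch's actual code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RoundRobinSketch {
    // Region i goes to server i % n, so every server ends up within one
    // region of every other server - unlike random assignment, whose
    // imbalance grows with the region count.
    static Map<String, List<String>> roundRobin(List<String> regions,
                                                List<String> servers) {
        Map<String, List<String>> plan = new HashMap<>();
        for (String s : servers) {
            plan.put(s, new ArrayList<>());
        }
        for (int i = 0; i < regions.size(); i++) {
            plan.get(servers.get(i % servers.size())).add(regions.get(i));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> regions = List.of("r0", "r1", "r2", "r3", "r4");
        List<String> servers = List.of("rs1", "rs2");
        Map<String, List<String>> plan = roundRobin(regions, servers);
        assert plan.get("rs1").size() == 3; // r0, r2, r4
        assert plan.get("rs2").size() == 2; // r1, r3
    }
}
```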
[jira] [Commented] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master
[ https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135729#comment-13135729 ] stack commented on HBASE-4470: -- We still think this is a blocker on 0.90.5? ServerNotRunningException coming out of assignRootAndMeta kills the Master -- Key: HBASE-4470 URL: https://issues.apache.org/jira/browse/HBASE-4470 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Priority: Blocker Fix For: 0.90.5 I'm surprised we still have issues like that and I didn't get a hit while googling, so forgive me if there's already a jira about it. When the master starts it verifies the locations of root and meta before assigning them; if the server is started but not running you'll get this: {quote} 2011-09-23 04:47:44,859 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771) at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) at $Proxy6.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969) at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388) at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287) at 
org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282) {quote} I hit that 3-4 times this week while debugging something else. The worst is that when you restart the master it sees that as a failover, but none of the regions are assigned so it takes an eternity to get back fully online. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
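One direction for making startup survive this, sketched with illustrative names rather than the real CatalogTracker API: treat the "not running yet" response as transient and retry the verification with a short backoff instead of letting the exception escape and kill the master:

```java
import java.util.function.BooleanSupplier;

public class VerifyWithRetry {
    // Hypothetical retry wrapper: the check models something like
    // verifyMetaRegionLocation(); a RuntimeException stands in for
    // ServerNotRunningException. Retry a bounded number of times,
    // rethrowing only after the budget is exhausted.
    static boolean verifyWithRetry(BooleanSupplier check, int attempts) {
        for (int i = 0; i < attempts; i++) {
            try {
                return check.getAsBoolean();
            } catch (RuntimeException e) { // e.g. "Server is not running yet"
                if (i == attempts - 1) {
                    throw e; // out of retries: surface the failure
                }
                try {
                    Thread.sleep(10); // back off before retrying
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(ie);
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice ("not running yet"), then succeeds on attempt three.
        boolean ok = verifyWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("Server is not running yet");
            }
            return true;
        }, 5);
        assert ok;
        assert calls[0] == 3;
    }
}
```

This also avoids the slow-failover restart described above, since the master never dies in the first place.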