[jira] [Commented] (HBASE-5776) HTableMultiplexer
[ https://issues.apache.org/jira/browse/HBASE-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278600#comment-13278600 ]

Phabricator commented on HBASE-5776:

Kannan has accepted the revision [jira][89-fb][HBASE-5776] HTableMultiplexer.

Looks good, Liyin. The remaining comments are cosmetic, hence accepting!

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java:1731 space after to
  src/main/java/org/apache/hadoop/hbase/client/HTableMultiplexer.java:174 failed is unused
  src/main/java/org/apache/hadoop/hbase/client/HTableMultiplexer.java:159 failed is unused
  src/main/java/org/apache/hadoop/hbase/client/HTableMultiplexer.java:359 -1 - add space between - and 1.

REVISION DETAIL
  https://reviews.facebook.net/D2775

BRANCH
  HBASE-5776

To: Kannan, Liyin
Cc: JIRA, tedyu

HTableMultiplexer

Key: HBASE-5776
URL: https://issues.apache.org/jira/browse/HBASE-5776
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
Attachments: D2775.1.patch, D2775.1.patch, D2775.2.patch, D2775.2.patch, D2775.3.patch, D2775.4.patch

There is a known issue in the HBase client: a single slow or dead region server can slow down multiput operations across all region servers, so the client is only as fast as the slowest region server in the cluster.

To solve this problem, HTableMultiplexer separates the threads that submit multiputs from the threads that flush them, which makes the multiput a nonblocking operation. The submitting thread shards the puts into different queues based on their destination region server and returns immediately. The flush threads flush the puts from each queue to its destination region server.

Currently HTableMultiplexer supports only the put operation.

-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
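The queueing scheme described in HBASE-5776 can be illustrated with a minimal sketch. This is not the HTableMultiplexer class from the patch: the class name `PutMultiplexerSketch`, the row-to-server `locator` function, the bounded per-server capacity, and the choice to reject a put when its queue is full are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Hypothetical sketch of per-region-server put queues; not the patch's code. */
class PutMultiplexerSketch {
    // One bounded queue per destination region server.
    private final Map<String, BlockingQueue<String>> queues = new ConcurrentHashMap<>();
    // Assumed lookup: row key -> name of the region server hosting that row.
    private final Function<String, String> locator;
    private final int capacityPerServer;

    PutMultiplexerSketch(Function<String, String> locator, int capacityPerServer) {
        this.locator = locator;
        this.capacityPerServer = capacityPerServer;
    }

    /**
     * Non-blocking submit: shard the put into the queue of its destination
     * server and return immediately. Returns false if that queue is full
     * (an assumed policy), so a slow server never blocks the caller.
     */
    boolean put(String rowKey) {
        String server = locator.apply(rowKey);
        BlockingQueue<String> queue = queues.computeIfAbsent(
                server, s -> new LinkedBlockingQueue<>(capacityPerServer));
        return queue.offer(rowKey);
    }

    /** What a flush thread for one server would do: drain its queue into a batch. */
    List<String> drainFor(String server) {
        List<String> batch = new ArrayList<>();
        BlockingQueue<String> queue = queues.get(server);
        if (queue != null) {
            queue.drainTo(batch);
        }
        return batch;
    }
}
```

Because each server has its own queue, a full or slow queue affects only puts bound for that server; puts for other servers are still accepted and flushed independently.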
[jira] [Updated] (HBASE-5882) Process RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Jindal updated HBASE-5882:

Status: Patch Available (was: Open)

Process RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor

Key: HBASE-5882
URL: https://issues.apache.org/jira/browse/HBASE-5882
Project: HBase
Issue Type: Improvement
Affects Versions: 0.92.1, 0.90.6
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Attachments: hbase_5882.patch, hbase_5882_V2.patch

Currently, when the master restarts and runs processRIT, any region found on a dead server is not given a new assignment; it is left for the timeout monitor to take care of. This case is more prominent if the node is found in the RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan.
[jira] [Updated] (HBASE-5546) Master assigns region in the original region server when opening region failed
[ https://issues.apache.org/jira/browse/HBASE-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Jindal updated HBASE-5546:

Status: Open (was: Patch Available)

Master assigns region in the original region server when opening region failed

Key: HBASE-5546
URL: https://issues.apache.org/jira/browse/HBASE-5546
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: Ashutosh Jindal
Priority: Minor
Fix For: 0.96.0
Attachments: hbase-5546.patch, hbase-5546_1.patch, hbase-5546_2.patch, hbase-5546_3.patch, hbase-5546_4.patch

The master assigns the region to the original region server again when the RS_ZK_REGION_FAILED_OPEN event arrives. Maybe we should choose another region server.

[2012-03-07 10:14:21,750] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:31,826] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:41,903] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:51,975] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:02,056] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:12,167] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:22,231] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:32,303] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:42,375] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:52,447] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:02,528] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:12,600] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:22,676] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053

-- This message is automatically generated by JIRA.
[jira] [Updated] (HBASE-5546) Master assigns region in the original region server when opening region failed
[ https://issues.apache.org/jira/browse/HBASE-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Jindal updated HBASE-5546:

Fix Version/s: (was: 0.96.0)
Status: Patch Available (was: Open)

Master assigns region in the original region server when opening region failed

Key: HBASE-5546
URL: https://issues.apache.org/jira/browse/HBASE-5546
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: Ashutosh Jindal
Priority: Minor
Attachments: hbase-5546.patch, hbase-5546_1.patch, hbase-5546_2.patch, hbase-5546_3.patch, hbase-5546_4.patch

The master assigns the region to the original region server again when the RS_ZK_REGION_FAILED_OPEN event arrives. Maybe we should choose another region server.

[2012-03-07 10:14:21,750] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:31,826] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:41,903] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:51,975] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:02,056] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:12,167] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:22,231] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:32,303] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:42,375] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:52,447] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:02,528] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:12,600] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:22,676] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053

-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
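The suggestion in HBASE-5546 ("choose another region server") could look roughly like the sketch below. It is not taken from any of the attached patches: `ReassignSketch`, `chooseDestination`, the random pick among the remaining servers, and the fallback when no alternative exists are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch: avoid the server that just reported FAILED_OPEN. */
class ReassignSketch {
    /**
     * Picks a destination for a region whose open failed on failedServer.
     * Prefers any other online server; falls back to the first online server
     * only when there is no alternative, and returns null when none is online.
     */
    static String chooseDestination(List<String> onlineServers, String failedServer, Random random) {
        List<String> candidates = new ArrayList<>(onlineServers);
        candidates.remove(failedServer); // exclude the server that failed to open the region
        if (candidates.isEmpty()) {
            return onlineServers.isEmpty() ? null : onlineServers.get(0);
        }
        return candidates.get(random.nextInt(candidates.size()));
    }
}
```

With such an exclusion, the repeated RS_ZK_REGION_FAILED_OPEN transitions on the same server seen in the log above would not be expected as long as a second region server is online.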
[jira] [Commented] (HBASE-5997) Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
[ https://issues.apache.org/jira/browse/HBASE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278605#comment-13278605 ]

Anoop Sam John commented on HBASE-5997:

The issue I mentioned above can be taken out of this JIRA, so I will focus on the HalfStoreFileReader issue here and deal with the other one in a separate JIRA issue.

Fix concerns raised in HBASE-5922 related to HalfStoreFileReader

Key: HBASE-5997
URL: https://issues.apache.org/jira/browse/HBASE-5997
Project: HBase
Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: Anoop Sam John
Attachments: Testcase.patch.txt

Please refer to the comment https://issues.apache.org/jira/browse/HBASE-5922?focusedCommentId=13269346&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269346. This issue was raised to address that comment, so that we don't forget it.
[jira] [Commented] (HBASE-6031) RegionServer does not go down while aborting
[ https://issues.apache.org/jira/browse/HBASE-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278607#comment-13278607 ]

ramkrishna.s.vasudevan commented on HBASE-6031:

HBASE-5641 is already included here. In the thread dump, regionserver60020.decayingSampleTick.1 shows up as a daemon thread. So I suspect that something in the Jetty server is hanging and, being non-daemon, keeps the RS process alive.
@Lars, is this the same thread dump you saw in HBASE-5641?

RegionServer does not go down while aborting

Key: HBASE-6031
URL: https://issues.apache.org/jira/browse/HBASE-6031
Project: HBase
Issue Type: Bug
Reporter: ramkrishna.s.vasudevan

Following is the thread dump.
{code}
1997531088@qtp-716941846-5 prio=10 tid=0x7f7c5820c800 nid=0xe1b in Object.wait() [0x7f7c56ae8000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.mortbay.io.nio.SelectChannelEndPoint.blockWritable(SelectChannelEndPoint.java:279)
    - locked 0x7f7cfe0616d0 (a org.mortbay.jetty.nio.SelectChannelConnector$ConnectorEndPoint)
    at org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:545)
    at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:639)
    at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:580)
    at java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:109)
    - locked 0x7f7cfe74d758 (a org.mortbay.util.ByteArrayOutputStream2)
    at org.mortbay.jetty.AbstractGenerator$OutputWriter.write(AbstractGenerator.java:904)
    at java.io.Writer.write(Writer.java:96)
    - locked 0x7f7cfca02fc0 (a org.mortbay.jetty.HttpConnection$OutputWriter)
    at java.io.PrintWriter.write(PrintWriter.java:361)
    - locked 0x7f7cfca02fc0 (a org.mortbay.jetty.HttpConnection$OutputWriter)
    at org.jamon.escaping.HtmlEscaping.write(HtmlEscaping.java:43)
    at org.jamon.escaping.AbstractCharacterEscaping.write(AbstractCharacterEscaping.java:35)
    at org.apache.hadoop.hbase.tmpl.regionserver.RSStatusTmplImpl.renderNoFlush(RSStatusTmplImpl.java:222)
    at org.apache.hadoop.hbase.tmpl.regionserver.RSStatusTmpl.renderNoFlush(RSStatusTmpl.java:180)
    at org.apache.hadoop.hbase.tmpl.regionserver.RSStatusTmpl.render(RSStatusTmpl.java:171)
    at org.apache.hadoop.hbase.regionserver.RSStatusServlet.doGet(RSStatusServlet.java:48)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:932)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

1374615312@qtp-716941846-3 prio=10 tid=0x7f7c58214800 nid=0xc42 in Object.wait() [0x7f7c55bd9000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at
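The reasoning in the HBASE-6031 comment rests on a JVM rule: the process exits only once every non-daemon thread has finished, so one blocked non-daemon thread is enough to keep it alive. The sketch below shows one way to list such threads from inside a process. The class `DaemonThreadSketch`, its method names, and the thread names in the usage note are made up for illustration and are not HBase or Jetty code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: find the non-daemon threads that keep a JVM alive. */
class DaemonThreadSketch {
    /** Creates a thread that blocks until interrupted, as daemon or non-daemon. */
    static Thread newBlockedWorker(String name, boolean daemon) {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE); // stands in for a hung request handler
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        worker.setDaemon(daemon); // must be set before start()
        return worker;
    }

    /** Names of all live non-daemon threads; any of these blocks JVM exit. */
    static List<String> liveNonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            if (thread.isAlive() && !thread.isDaemon()) {
                names.add(thread.getName());
            }
        }
        return names;
    }
}
```

Starting `newBlockedWorker("stuck-http-worker", false)` and `newBlockedWorker("sample-tick", true)` and then calling `liveNonDaemonThreadNames()` would list the first thread but not the second, which is the distinction the comment draws between the Jetty pool threads and decayingSampleTick.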
[jira] [Commented] (HBASE-5546) Master assigns region in the original region server when opening region failed
[ https://issues.apache.org/jira/browse/HBASE-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278612#comment-13278612 ]

Hadoop QA commented on HBASE-5546:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12528003/hbase-5546_4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 32 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1924//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1924//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1924//console

This message is automatically generated.

Master assigns region in the original region server when opening region failed

Key: HBASE-5546
URL: https://issues.apache.org/jira/browse/HBASE-5546
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: Ashutosh Jindal
Priority: Minor
Attachments: hbase-5546.patch, hbase-5546_1.patch, hbase-5546_2.patch, hbase-5546_3.patch, hbase-5546_4.patch

The master assigns the region to the original region server again when the RS_ZK_REGION_FAILED_OPEN event arrives. Maybe we should choose another region server.
[2012-03-07 10:14:21,750] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:31,826] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:41,903] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:51,975] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:02,056] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:12,167] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:22,231] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:32,303] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:42,375] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:52,447] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:02,528] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:12,600] [DEBUG] [main-EventThread]
[jira] [Commented] (HBASE-5699) Run with 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278615#comment-13278615 ]

ramkrishna.s.vasudevan commented on HBASE-5699:

We are also interested in this. We worked on a prototype that keeps one HLog instance but has multiple writer instances underneath. Each region is allocated one of the writer instances and writes to the HLog through the instance associated with it. Even on log rolling, the instance mapped to each region is updated and the region continues to use its mapping.

Without patch: ~53K puts/sec
With patch: ~78-80K puts/sec

This was measured on a 3-node cluster with a record size of 1k.
Number of regions: 2800
By default, 3 writer instances were used.

I was able to pass the test cases in TestHLog and TestDistributedLogSplitting, but TestMasterReplication did not pass. Replication needs some changes for this, which I have not worked on much. The pendingWrites list that we use is now converted into a map from each writer to its list of pending writes.

Please provide your suggestions on this.
BTW, Li Pi, any progress on this? I would love to help you with this. Maybe I can prepare a more formal patch and upload it here.

Run with 1 WAL in HRegionServer

Key: HBASE-5699
URL: https://issues.apache.org/jira/browse/HBASE-5699
Project: HBase
Issue Type: Improvement
Reporter: binlijin
Assignee: Li Pi
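The prototype described in the HBASE-5699 comment (one log facade, several writers, a fixed region-to-writer mapping, and pendingWrites kept per writer) can be sketched as follows. This is not the prototype's code: the class `MultiWriterLogSketch`, the use of integer writer ids, and the round-robin assignment of regions to writers are assumptions, since the comment only says that each region is allocated "any one" of the writers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of one log with several underlying writers. */
class MultiWriterLogSketch {
    private final int writerCount;
    // Sticky mapping: once assigned, a region keeps using the same writer.
    private final Map<String, Integer> regionToWriter = new HashMap<>();
    // pendingWrites as a map: writer id -> edits waiting to be synced by that writer.
    private final Map<Integer, List<String>> pendingWrites = new HashMap<>();
    private int nextWriter = 0;

    MultiWriterLogSketch(int writerCount) {
        this.writerCount = writerCount;
        for (int writer = 0; writer < writerCount; writer++) {
            pendingWrites.put(writer, new ArrayList<>());
        }
    }

    /** Returns the writer for a region, assigning one round-robin on first use (assumed policy). */
    synchronized int writerFor(String region) {
        Integer writer = regionToWriter.get(region);
        if (writer == null) {
            writer = nextWriter;
            nextWriter = (nextWriter + 1) % writerCount;
            regionToWriter.put(region, writer);
        }
        return writer;
    }

    /** Queues an edit on the pending list of the writer mapped to the region. */
    synchronized void append(String region, String edit) {
        pendingWrites.get(writerFor(region)).add(edit);
    }

    /** Snapshot of the edits pending for one writer. */
    synchronized List<String> pendingFor(int writer) {
        return new ArrayList<>(pendingWrites.get(writer));
    }
}
```

Keeping the pending edits per writer means each writer can sync its own list without waiting on the others, which is presumably where the reported throughput gain comes from; the sketch does not model log rolling or replication.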
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278620#comment-13278620 ]

ramkrishna.s.vasudevan commented on HBASE-6011:

This fix makes the test case added in HBASE-5806 fail.

Unable to start master in local mode

Key: HBASE-6011
URL: https://issues.apache.org/jira/browse/HBASE-6011
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6011-v2.patch, 6011.patch

Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too:
{noformat}
12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master
java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761)
{noformat}
[jira] [Comment Edited] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278620#comment-13278620 ]

ramkrishna.s.vasudevan edited comment on HBASE-6011 at 5/18/12 7:04 AM:

This fix makes the test case added in HBASE-5806 fail. See build https://builds.apache.org/job/HBase-0.94/198/
The subsequent hadoopqa bot run is also failing.

was (Author: ram_krish):
This fix makes the test case added in HBASE-5806 fail.

Unable to start master in local mode

Key: HBASE-6011
URL: https://issues.apache.org/jira/browse/HBASE-6011
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6011-v2.patch, 6011.patch

Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too:
{noformat}
12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master
java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761)
{noformat}
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278622#comment-13278622 ]

ramkrishna.s.vasudevan commented on HBASE-6011:

The test case change was made by us in HBASE-5806. :(
Anyway, I will check it and get back with the right addendum, either here or for that fix.

Unable to start master in local mode

Key: HBASE-6011
URL: https://issues.apache.org/jira/browse/HBASE-6011
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6011-v2.patch, 6011.patch

Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too:
{noformat}
12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master
java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761)
{noformat}
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278628#comment-13278628 ]

Andrew Purtell commented on HBASE-6011:

Sorry Ram, I don't follow; this change seems totally unrelated.

Unable to start master in local mode

Key: HBASE-6011
URL: https://issues.apache.org/jira/browse/HBASE-6011
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6011-v2.patch, 6011.patch

Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too:
{noformat}
12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master
java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761)
{noformat}
[jira] [Commented] (HBASE-5882) Process RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278630#comment-13278630 ] Hadoop QA commented on HBASE-5882: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528004/hbase_5882_V2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 32 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.TestRegionRebalancing org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster org.apache.hadoop.hbase.master.TestAssignmentManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1925//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1925//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1925//console This message is automatically generated. 
Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor - Key: HBASE-5882 URL: https://issues.apache.org/jira/browse/HBASE-5882 Project: HBase Issue Type: Improvement Affects Versions: 0.90.6, 0.92.1 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Attachments: hbase_5882.patch, hbase_5882_V2.patch Currently, when the master restarts and runs processRIT, any region found on a dead server skips the new assignment and is left for the timeout monitor to handle. This case is more prominent if the node is found in RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5926: --- Status: Patch Available (was: Open) Delete the master znode after a master crash Key: HBASE-5926 URL: https://issues.apache.org/jira/browse/HBASE-5926 Project: HBase Issue Type: Improvement Components: master, scripts Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5926.v10.patch, 5926.v11.patch, 5926.v13.patch, 5926.v6.patch, 5926.v8.patch, 5926.v9.patch This is the continuation of the work done in HBASE-5844. But we can't apply exactly the same strategy: for the region server, there is a znode per region server, while for the master backup master there is a single znode for both. So if we apply the same strategy as for a regionserver, we may have this scenario: 1) Master starts 2) Backup master starts 3) Master dies 4) ZK detects it 5) Backup master receives the update from ZK 6) Backup master creates the new master node and become the main master 7) Previous master script continues 8) Previous master script deletes the master node in ZK 9) = issue: we deleted the node just created by the new master This should not happen often (usually the znode will be deleted soon enough), but it can happen. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
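The race in steps 6 to 9 of the HBASE-5926 description can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class `MasterZnodeSketch`, the path `/hbase/master`, and the server names are hypothetical, a `ConcurrentMap` stands in for ZooKeeper, and the guarded delete shown here is one possible way to avoid step 9, not necessarily what the attached patches do.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory stand-in for the single master znode; this is not the ZooKeeper API. */
final class MasterZnodeSketch {
    // Assumed path, used only for illustration.
    static final String MASTER_PATH = "/hbase/master";

    private final ConcurrentMap<String, String> znodes = new ConcurrentHashMap<>();

    /** Claims the master znode if it is free; returns true if the caller is now the active master. */
    boolean becomeMaster(String serverName) {
        return znodes.putIfAbsent(MASTER_PATH, serverName) == null;
    }

    /** Unconditional delete: removes the znode no matter which master wrote it (steps 8 and 9). */
    void deleteUnconditionally() {
        znodes.remove(MASTER_PATH);
    }

    /** Guarded delete: removes the znode only if it still names the given (crashed) master. */
    boolean deleteIfOwnedBy(String serverName) {
        return znodes.remove(MASTER_PATH, serverName);
    }

    /** Returns the name stored in the master znode, or null if there is none. */
    String currentMaster() {
        return znodes.get(MASTER_PATH);
    }
}
```

In this model, once the backup master has claimed the znode, a late `deleteIfOwnedBy` call from the previous master's script leaves the new entry in place, whereas `deleteUnconditionally` would remove it. Against a real ZooKeeper ensemble the comparison and the delete would also have to be tied together, for example through a version-conditional delete, which this sketch does not attempt to show.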
[jira] [Updated] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5926: --- Status: Open (was: Patch Available) Delete the master znode after a master crash Key: HBASE-5926 URL: https://issues.apache.org/jira/browse/HBASE-5926 Project: HBase Issue Type: Improvement Components: master, scripts Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5926.v10.patch, 5926.v11.patch, 5926.v13.patch, 5926.v6.patch, 5926.v8.patch, 5926.v9.patch This is the continuation of the work done in HBASE-5844. But we can't apply exactly the same strategy: for the region server, there is a znode per region server, while for the master backup master there is a single znode for both. So if we apply the same strategy as for a regionserver, we may have this scenario: 1) Master starts 2) Backup master starts 3) Master dies 4) ZK detects it 5) Backup master receives the update from ZK 6) Backup master creates the new master node and become the main master 7) Previous master script continues 8) Previous master script deletes the master node in ZK 9) = issue: we deleted the node just created by the new master This should not happen often (usually the znode will be deleted soon enough), but it can happen. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278654#comment-13278654 ] nkeywal commented on HBASE-5926: v13 should do it... Delete the master znode after a master crash Key: HBASE-5926 URL: https://issues.apache.org/jira/browse/HBASE-5926 Project: HBase Issue Type: Improvement Components: master, scripts Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5926.v10.patch, 5926.v11.patch, 5926.v13.patch, 5926.v6.patch, 5926.v8.patch, 5926.v9.patch This is the continuation of the work done in HBASE-5844. But we can't apply exactly the same strategy: for the region server, there is a znode per region server, while for the master backup master there is a single znode for both. So if we apply the same strategy as for a regionserver, we may have this scenario: 1) Master starts 2) Backup master starts 3) Master dies 4) ZK detects it 5) Backup master receives the update from ZK 6) Backup master creates the new master node and become the main master 7) Previous master script continues 8) Previous master script deletes the master node in ZK 9) = issue: we deleted the node just created by the new master This should not happen often (usually the znode will be deleted soon enough), but it can happen. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5926: --- Attachment: 5926.v13.patch Delete the master znode after a master crash Key: HBASE-5926 URL: https://issues.apache.org/jira/browse/HBASE-5926 Project: HBase Issue Type: Improvement Components: master, scripts Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5926.v10.patch, 5926.v11.patch, 5926.v13.patch, 5926.v6.patch, 5926.v8.patch, 5926.v9.patch This is the continuation of the work done in HBASE-5844. But we can't apply exactly the same strategy: for the region server, there is a znode per region server, while for the master backup master there is a single znode for both. So if we apply the same strategy as for a regionserver, we may have this scenario: 1) Master starts 2) Backup master starts 3) Master dies 4) ZK detects it 5) Backup master receives the update from ZK 6) Backup master creates the new master node and become the main master 7) Previous master script continues 8) Previous master script deletes the master node in ZK 9) = issue: we deleted the node just created by the new master This should not happen often (usually the znode will be deleted soon enough), but it can happen. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5920) New Compactions Logic can silently prevent user-initiated compactions from occurring
[ https://issues.apache.org/jira/browse/HBASE-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278655#comment-13278655 ] Hadoop QA commented on HBASE-5920: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528014/HBASE-5920-trunk-1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 32 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1926//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1926//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1926//console This message is automatically generated. 
New Compactions Logic can silently prevent user-initiated compactions from occurring Key: HBASE-5920 URL: https://issues.apache.org/jira/browse/HBASE-5920 Project: HBase Issue Type: Bug Components: client, regionserver Affects Versions: 0.92.1 Reporter: Derek Wollenstein Priority: Minor Labels: compaction Attachments: HBASE-5920-0.92.1-1.patch, HBASE-5920-0.92.1-2.patch, HBASE-5920-0.92.1.patch, HBASE-5920-trunk-1.patch, HBASE-5920-trunk.patch There seem to be some tuning settings in which manually triggered major compactions will do nothing, including logic from Store.java in the function List<StoreFile> compactSelection(List<StoreFile> candidates). When a user manually triggers a compaction, this follows the same logic as a normal compaction check. When a user manually triggers a major compaction, something similar happens. Putting this all together: 1. If a user triggers a major compaction, this is checked against a max files threshold (hbase.hstore.compaction.max). If the number of storefiles to compact is > max files, then we downgrade to a minor compaction. 2. If we are in a minor compaction, we do the following checks: a. If the file is less than a minimum size (hbase.hstore.compaction.min.size) we automatically include it. b. Otherwise, we check how the size compares to the next largest size, based on hbase.hstore.compaction.ratio. c. If the number of files included is less than a minimum count (hbase.hstore.compaction.min) then don't compact. In many of the exit strategies, we aren't seeing an error message. The net-net of this is that if we have a mix of very large and very small files, we may end up having too many files to do a major compact, but too few files to do a minor compact. 
I'm trying to go through and see if I'm understanding things correctly, but this seems like the bug. To put it another way: 2012-05-02 20:09:36,389 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Large Compaction requested: regionName=str,44594594594594592,1334939064521.f7aed25b55d4d7988af763bede9ce74e., storeName=c, fileCount=15, fileSize=1.5g (20.2k, 362.5m, 155.3k, 3.0m, 30.7k, 361.2m, 6.9m, 4.7m, 14.7k, 363.4m, 30.9m, 3.2m, 7.3k, 362.9m, 23.5m), priority=-9, time=3175046817624398; Because: Recursive enqueue; compaction_queue=(59:0), split_queue=0 When we had a minimum compaction size of 128M, and default settings for hbase.hstore.compaction.min, hbase.hstore.compaction.max, and hbase.hstore.compaction.ratio, we were not getting a compaction to run even if we ran major_compact 'str,44594594594594592,1334939064521.f7aed25b55d4d7988af763bede9ce74e.' from the ruby shell. Note that we had many tiny regions (20k, 155k, 3m, 30k,..) and several large regions (362.5m,361.2m,363.4m,362.9m). I think the bimodal nature of the sizes prevented us from doing a compaction. I'm not 100% sure where this errored out because when I manually triggered a compaction, I did not see ' // if we don't have enough files to compact, just wait if (filesToCompact.size() < this.minFilesToCompact) { if (LOG.isDebugEnabled()) {
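The checks listed in the HBASE-5920 description (steps 1 and 2a to 2c) can be modelled roughly as follows. This is a simplified sketch and not the code in Store.java: the class `CompactionSelectionSketch` and its fields are hypothetical stand-ins for the hbase.hstore.compaction.* settings, and step 2b is interpreted here, as an assumption, as "skip a file while it is larger than ratio times the combined size of the files after it".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified model of the selection checks described above; not Store.java. */
final class CompactionSelectionSketch {
    // Stand-ins for hbase.hstore.compaction.min, .max, .ratio and .min.size.
    private final int minFiles;
    private final int maxFiles;
    private final double ratio;
    private final long minSize;

    CompactionSelectionSketch(int minFiles, int maxFiles, double ratio, long minSize) {
        this.minFiles = minFiles;
        this.maxFiles = maxFiles;
        this.ratio = ratio;
        this.minSize = minSize;
    }

    /**
     * @param sizes store file sizes in bytes, oldest first
     * @return sizes of the files that would be compacted; empty means nothing happens
     */
    List<Long> select(List<Long> sizes, boolean majorRequested) {
        // Step 1: a major request stays major only if the file count fits under the maximum.
        if (majorRequested && sizes.size() <= maxFiles) {
            return new ArrayList<>(sizes);
        }
        // Steps 2a and 2b (assumed reading): skip leading files that are at least the minimum
        // size and larger than ratio * (total size of the files after them).
        int start = 0;
        while (start < sizes.size()) {
            long later = 0;
            for (int i = start + 1; i < sizes.size(); i++) {
                later += sizes.get(i);
            }
            long size = sizes.get(start);
            if (size < minSize || size <= ratio * later) {
                break;
            }
            start++;
        }
        int end = Math.min(sizes.size(), start + maxFiles);
        List<Long> picked = new ArrayList<>(sizes.subList(start, end));
        // Step 2c: too few files survive, so the selection is silently empty.
        if (picked.size() < minFiles) {
            return Collections.emptyList();
        }
        return picked;
    }
}
```

Under this model, a store with more files than the maximum and a mix of a few large and a few small files can end with an empty selection even when a major compaction was requested, which is the kind of silent no-op the report describes. Whether the real selection code takes the same path for the file sizes in the log above is not established by this sketch.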
[jira] [Commented] (HBASE-5882) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278670#comment-13278670 ] ramkrishna.s.vasudevan commented on HBASE-5882: --- The latest test failure in TestAssignmentManager is caused by the test case that went in with HBASE-5927. A small tweak will make it work. Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor - Key: HBASE-5882 URL: https://issues.apache.org/jira/browse/HBASE-5882 Project: HBase Issue Type: Improvement Affects Versions: 0.90.6, 0.92.1 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Attachments: hbase_5882.patch, hbase_5882_V2.patch Currently, when the master restarts and runs processRIT, any region found on a dead server skips the new assignment and is left for the timeout monitor to handle. This case is more prominent if the node is found in RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5997) Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
[ https://issues.apache.org/jira/browse/HBASE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-5997: -- Attachment: HBASE-5997_0.94.patch Patch for the 0.94 version. I have changed the check so that it compares the rowkey only. Looking at how this API is used within the HBase code, we always do the seekBefore based on the rowkey only, not on rowkey+cf+qualifier. Please provide your suggestions. Once the solution is OK I can backport it to other versions. Fix concerns raised in HBASE-5922 related to HalfStoreFileReader Key: HBASE-5997 URL: https://issues.apache.org/jira/browse/HBASE-5997 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: Anoop Sam John Attachments: HBASE-5997_0.94.patch, Testcase.patch.txt Please refer to the comment https://issues.apache.org/jira/browse/HBASE-5922?focusedCommentId=13269346&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269346. Raised this issue to address that comment, so that we don't forget it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5997) Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
[ https://issues.apache.org/jira/browse/HBASE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-5997: -- Affects Version/s: 0.96.0 0.90.6 0.92.1 0.94.0 Fix concerns raised in HBASE-5922 related to HalfStoreFileReader Key: HBASE-5997 URL: https://issues.apache.org/jira/browse/HBASE-5997 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0 Reporter: ramkrishna.s.vasudevan Assignee: Anoop Sam John Attachments: HBASE-5997_0.94.patch, Testcase.patch.txt Please refer to the comment https://issues.apache.org/jira/browse/HBASE-5922?focusedCommentId=13269346&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269346. Raised this issue to address that comment, so that we don't forget it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278682#comment-13278682 ] Hadoop QA commented on HBASE-5926: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528018/5926.v13.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 33 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1927//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1927//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1927//console This message is automatically generated. Delete the master znode after a master crash Key: HBASE-5926 URL: https://issues.apache.org/jira/browse/HBASE-5926 Project: HBase Issue Type: Improvement Components: master, scripts Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5926.v10.patch, 5926.v11.patch, 5926.v13.patch, 5926.v6.patch, 5926.v8.patch, 5926.v9.patch This is the continuation of the work done in HBASE-5844. But we can't apply exactly the same strategy: for the region server, there is a znode per region server, while for the master backup master there is a single znode for both. 
So if we apply the same strategy as for a regionserver, we may have this scenario: 1) Master starts 2) Backup master starts 3) Master dies 4) ZK detects it 5) Backup master receives the update from ZK 6) Backup master creates the new master node and become the main master 7) Previous master script continues 8) Previous master script deletes the master node in ZK 9) = issue: we deleted the node just created by the new master This should not happen often (usually the znode will be deleted soon enough), but it can happen. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5882) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal updated HBASE-5882: --- Status: Open (was: Patch Available) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor - Key: HBASE-5882 URL: https://issues.apache.org/jira/browse/HBASE-5882 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.90.6 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Attachments: hbase_5882.patch, hbase_5882_V2.patch Currently, when the master restarts and runs processRIT, any region found on a dead server skips the new assignment and is left for the timeout monitor to handle. This case is more prominent if the node is found in RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5882) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal updated HBASE-5882: --- Attachment: hbase_5882_V3.patch Updated patch for 0.96. Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor - Key: HBASE-5882 URL: https://issues.apache.org/jira/browse/HBASE-5882 Project: HBase Issue Type: Improvement Affects Versions: 0.90.6, 0.92.1 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Attachments: hbase_5882.patch, hbase_5882_V2.patch, hbase_5882_V3.patch Currently, when the master restarts and runs processRIT, any region found on a dead server skips the new assignment and is left for the timeout monitor to handle. This case is more prominent if the node is found in RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5882) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal updated HBASE-5882: --- Status: Patch Available (was: Open) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor - Key: HBASE-5882 URL: https://issues.apache.org/jira/browse/HBASE-5882 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.90.6 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Attachments: hbase_5882.patch, hbase_5882_V2.patch, hbase_5882_V3.patch Currently, when the master restarts and runs processRIT, any region found on a dead server skips the new assignment and is left for the timeout monitor to handle. This case is more prominent if the node is found in RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6011: -- Attachment: HBASE-6011_1.patch Please review this fix. TestLocalHBaseCluster and the HBASE-5806 related test cases run with it. It lets us mock the master, and the master can also be run locally. Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5882) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278716#comment-13278716 ] Hadoop QA commented on HBASE-5882: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528025/hbase_5882_V3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 32 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestSplitLogManager org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1928//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1928//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1928//console This message is automatically generated. 
Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor - Key: HBASE-5882 URL: https://issues.apache.org/jira/browse/HBASE-5882 Project: HBase Issue Type: Improvement Affects Versions: 0.90.6, 0.92.1 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Attachments: hbase_5882.patch, hbase_5882_V2.patch, hbase_5882_V3.patch Currently, when the master restarts and runs processRIT, any region found on a dead server skips the new assignment and is left for the timeout monitor to handle. This case is more prominent if the node is found in RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reopened HBASE-6011: --- I will just reopen this defect and run QA for the patch. If we need to raise a separate issue, I can close this one and raise a new one. Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6011: -- Status: Patch Available (was: Reopened) Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278725#comment-13278725 ] Hadoop QA commented on HBASE-6011: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528026/HBASE-6011_1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 7 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1929//console This message is automatically generated. Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-6013) Polish sharp edges from CopyTable
[ https://issues.apache.org/jira/browse/HBASE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278749#comment-13278749 ] Hudson commented on HBASE-6013: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #9 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/9/]) HBASE-6013 Polish sharp edges from CopyTable (Revision 1339929) Result = FAILURE jmhsieh : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java Polish sharp edges from CopyTable - Key: HBASE-6013 URL: https://issues.apache.org/jira/browse/HBASE-6013 Project: HBase Issue Type: Improvement Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: hbase-6013-92.patch, hbase-6013.patch CopyTable doesn't report errors when invalid arguments are specified. For example, having a typo in --peer.adr (such as --peer.addr or -peer.adr) silently uses the default cluster and does a same-cluster copy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
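The failure mode described in HBASE-6013 (a mistyped option such as --peer.addr or -peer.adr being silently ignored) can be illustrated with a small strict parser. This is only a sketch of the general idea, not the CopyTable code or the attached patch: the class name StrictArgs, the "tablename" map key, and the set of known options are assumptions made for illustration. Only --peer.adr is taken from the report.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the actual CopyTable implementation.
final class StrictArgs {
    // Assumed option list for illustration; only --peer.adr comes from the report.
    static final List<String> KNOWN_OPTIONS = Arrays.asList("--peer.adr");

    /** Parses "--key=value" options and fails loudly on anything unrecognized. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-")) {
                // Treat a bare word as the positional table name (assumption).
                opts.put("tablename", arg);
                continue;
            }
            int eq = arg.indexOf('=');
            String key = eq < 0 ? arg : arg.substring(0, eq);
            if (!KNOWN_OPTIONS.contains(key)) {
                // A typo is rejected here instead of falling back to the default cluster.
                throw new IllegalArgumentException("Invalid argument: " + arg);
            }
            opts.put(key, eq < 0 ? "" : arg.substring(eq + 1));
        }
        return opts;
    }
}
```

With this shape, both --peer.addr=... and -peer.adr=... raise an error, so a typo cannot quietly turn a cross-cluster copy into a same-cluster copy.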
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278794#comment-13278794 ] ramkrishna.s.vasudevan commented on HBASE-6011: --- I attached patch for 0.94 as i was working on that. Will upload patch for Trunk. Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
Gopinathan A created HBASE-6046: --- Summary: Master retry on ZK session expiry causes inconsistent region assignments. Key: HBASE-6046 URL: https://issues.apache.org/jira/browse/HBASE-6046 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0, 0.92.1 Reporter: Gopinathan A Fix For: 0.92.2, 0.94.1
1. A ZK session timeout in the HMaster leads to bulk assignment even though all the RSs are online.
2. During bulk assignment, if the master goes down again and restarts (or a backup master comes up), all the nodes created in ZK are reassigned to the new RSs. This leads to double assignment.
We had 2800 regions; 1900 of them were double-assigned, taking the region count to 4700.
[jira] [Assigned] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-6046: - Assignee: ramkrishna.s.vasudevan Master retry on ZK session expiry causes inconsistent region assignments. - Key: HBASE-6046 URL: https://issues.apache.org/jira/browse/HBASE-6046 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.1, 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Fix For: 0.92.2, 0.94.1
1. A ZK session timeout in the HMaster leads to bulk assignment even though all the RSs are online.
2. During bulk assignment, if the master goes down again and restarts (or a backup master comes up), all the nodes created in ZK are reassigned to the new RSs. This leads to double assignment.
We had 2800 regions; 1900 of them were double-assigned, taking the region count to 4700.
[jira] [Created] (HBASE-6047) Put.has() can't determine result correctly
Wang Qiang created HBASE-6047: - Summary: Put.has() can't determine result correctly Key: HBASE-6047 URL: https://issues.apache.org/jira/browse/HBASE-6047 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.1 Reporter: Wang Qiang
The public method 'has(byte [] family, byte [] qualifier)' internally invokes the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], ignoreTS=true, ignoreValue=true', but there is a logical error in the body. It enters this block:

    else if (ignoreValue) {
      for (KeyValue kv : list) {
        if (Arrays.equals(kv.getFamily(), family) && Arrays.equals(kv.getQualifier(), qualifier) && kv.getTimestamp() == ts) {
          return true;
        }
      }
    }

The expression 'kv.getTimestamp() == ts' should be part of the condition only when 'ignoreTS=false'. As written, the following code prints false:

    Put put = new Put(Bytes.toBytes("row-01"));
    put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"), 1234567L, Bytes.toBytes("value-01"));
    System.out.println(put.has(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01")));
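One way to express the condition the report asks for is to apply each comparison only when its ignore flag is off. The sketch below is not the HBase client code or any committed fix: PutHasSketch and Cell are stand-ins for Put and KeyValue so that the example runs without HBase on the classpath, and the timestamp 0L passed by the two-argument overload is an arbitrary placeholder.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types for illustration only; not org.apache.hadoop.hbase.client.Put.
final class PutHasSketch {
    static final class Cell {
        final byte[] family;
        final byte[] qualifier;
        final long ts;
        final byte[] value;

        Cell(byte[] family, byte[] qualifier, long ts, byte[] value) {
            this.family = family;
            this.qualifier = qualifier;
            this.ts = ts;
            this.value = value;
        }
    }

    /** Timestamp and value are compared only when the matching ignore flag is false. */
    static boolean has(List<Cell> cells, byte[] family, byte[] qualifier,
                       long ts, byte[] value, boolean ignoreTS, boolean ignoreValue) {
        for (Cell c : cells) {
            if (Arrays.equals(c.family, family)
                    && Arrays.equals(c.qualifier, qualifier)
                    && (ignoreTS || c.ts == ts)
                    && (ignoreValue || Arrays.equals(c.value, value))) {
                return true;
            }
        }
        return false;
    }

    /** Family/qualifier lookup that ignores both timestamp and value. */
    static boolean has(List<Cell> cells, byte[] family, byte[] qualifier) {
        return has(cells, family, qualifier, 0L, new byte[0], true, true);
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("family-01".getBytes(), "qualifier-01".getBytes(),
                1234567L, "value-01".getBytes()));
        System.out.println(has(cells, "family-01".getBytes(), "qualifier-01".getBytes()));
    }
}
```

Folding the flags into a single condition avoids a separate branch per flag combination, which is where the reported version let the timestamp check slip into the ignoreTS=true path.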
[jira] [Created] (HBASE-6048) Table Scan is failing if offheap cache enabled
Gopinathan A created HBASE-6048: --- Summary: Table Scan is failing if offheap cache enabled Key: HBASE-6048 URL: https://issues.apache.org/jira/browse/HBASE-6048 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Gopinathan A Table Scan is failing if offheap cache enabled. {noformat} 2012-05-18 20:03:38,446 DEBUG org.apache.hadoop.hbase.io.hfile.HFileWriterV2: Initialized with CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false] 2012-05-18 20:03:38,446 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter type for hdfs://10.18.40.217:9000/hbase/ufdr/1d4656fd417a07c9171a38b8f4d08510/.tmp/03742024b28f443bb63cfc338d4ca422: CompoundBloomFilterWriter 2012-05-18 20:04:25,576 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 120.57 MB of total=1020.57 MB 2012-05-18 20:04:25,655 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=120.82 MB, total=907.89 MB, single=1012.11 MB, multi=6.12 MB, memory=0 KB 2012-05-18 20:04:25,733 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner java.lang.IllegalStateException: Schema metrics requested before table/CF name initialization: {tableName:null,cfName:null} at org.apache.hadoop.hbase.regionserver.metrics.SchemaConfigured.getSchemaMetrics(SchemaConfigured.java:182) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.updateSizeMetrics(LruBlockCache.java:310) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:274) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:293) at org.apache.hadoop.hbase.io.hfile.DoubleBlockCache.getBlock(DoubleBlockCache.java:102) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:296) at 
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:455) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:475) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145) at org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:130) at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3274) at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1604) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1596) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1572) at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2310) at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376) 2012-05-18 20:04:25,828 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6048) Table Scan is failing if offheap cache enabled
[ https://issues.apache.org/jira/browse/HBASE-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6048: -- Priority: Critical (was: Major) Table Scan is failing if offheap cache enabled -- Key: HBASE-6048 URL: https://issues.apache.org/jira/browse/HBASE-6048 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Gopinathan A Priority: Critical Table Scan is failing if offheap cache enabled. {noformat} 2012-05-18 20:03:38,446 DEBUG org.apache.hadoop.hbase.io.hfile.HFileWriterV2: Initialized with CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false] 2012-05-18 20:03:38,446 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter type for hdfs://10.18.40.217:9000/hbase/ufdr/1d4656fd417a07c9171a38b8f4d08510/.tmp/03742024b28f443bb63cfc338d4ca422: CompoundBloomFilterWriter 2012-05-18 20:04:25,576 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 120.57 MB of total=1020.57 MB 2012-05-18 20:04:25,655 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=120.82 MB, total=907.89 MB, single=1012.11 MB, multi=6.12 MB, memory=0 KB 2012-05-18 20:04:25,733 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner java.lang.IllegalStateException: Schema metrics requested before table/CF name initialization: {tableName:null,cfName:null} at org.apache.hadoop.hbase.regionserver.metrics.SchemaConfigured.getSchemaMetrics(SchemaConfigured.java:182) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.updateSizeMetrics(LruBlockCache.java:310) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:274) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:293) at 
org.apache.hadoop.hbase.io.hfile.DoubleBlockCache.getBlock(DoubleBlockCache.java:102) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:296) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:455) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:475) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145) at org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:130) at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3274) at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1604) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1596) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1572) at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2310) at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376) 2012-05-18 20:04:25,828 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner {noformat} -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-6048) Table Scan is failing if offheap cache enabled
[ https://issues.apache.org/jira/browse/HBASE-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278817#comment-13278817 ] ramkrishna.s.vasudevan commented on HBASE-6048: --- Also we get the following trace also while scanning {code} 2012-05-18 10:38:29,709 ERROR org.apache.hadoop.hbase.io.hfile.HFileReaderV2: Current pos = 655578; currKeyLen = 38; currValLen = 800; block limit = 655578; HFile name = 5325593bcaba4a8ba6f61f026abb82a9; currBlock currBlockOffset = 222963500 2012-05-18 10:38:29,709 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.lang.IllegalArgumentException at java.nio.Buffer.position(Buffer.java:218) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:630) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:406) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3354) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3310) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3327) at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2393) at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376) {code} Table Scan is failing if offheap cache enabled -- Key: HBASE-6048 URL: https://issues.apache.org/jira/browse/HBASE-6048 Project: HBase Issue Type: Bug Affects 
Versions: 0.94.0 Reporter: Gopinathan A Priority: Critical Table Scan is failing if offheap cache enabled. {noformat} 2012-05-18 20:03:38,446 DEBUG org.apache.hadoop.hbase.io.hfile.HFileWriterV2: Initialized with CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false] 2012-05-18 20:03:38,446 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter type for hdfs://10.18.40.217:9000/hbase/ufdr/1d4656fd417a07c9171a38b8f4d08510/.tmp/03742024b28f443bb63cfc338d4ca422: CompoundBloomFilterWriter 2012-05-18 20:04:25,576 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 120.57 MB of total=1020.57 MB 2012-05-18 20:04:25,655 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=120.82 MB, total=907.89 MB, single=1012.11 MB, multi=6.12 MB, memory=0 KB 2012-05-18 20:04:25,733 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner java.lang.IllegalStateException: Schema metrics requested before table/CF name initialization: {tableName:null,cfName:null} at org.apache.hadoop.hbase.regionserver.metrics.SchemaConfigured.getSchemaMetrics(SchemaConfigured.java:182) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.updateSizeMetrics(LruBlockCache.java:310) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:274) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:293) at org.apache.hadoop.hbase.io.hfile.DoubleBlockCache.getBlock(DoubleBlockCache.java:102) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:296) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:455) at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:475) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145) at org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:130) at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3274)
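For readers unfamiliar with the second trace in the HBASE-6048 comment (java.lang.IllegalArgumentException at java.nio.Buffer.position): that exception is what java.nio throws when a buffer is asked to move its position past its limit. The logged values show the current position already equal to the block limit (655578), so any further advance would be rejected. The sketch below only demonstrates that java.nio behaviour with a small buffer. It does not claim to reproduce the HBase bug or identify its root cause, and BufferLimitSketch is a hypothetical name.

```java
import java.nio.ByteBuffer;

// Illustration of java.nio behaviour only; not HBase code.
final class BufferLimitSketch {
    /** Returns true if moving the position forward by step is rejected by the buffer. */
    static boolean advanceThrows(ByteBuffer buf, int step) {
        try {
            buf.position(buf.position() + step);
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown when the new position is negative or larger than the limit.
            return true;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.position(buf.limit()); // position == limit is still legal
        System.out.println(advanceThrows(buf, 1)); // one byte past the limit is not
    }
}
```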
[jira] [Assigned] (HBASE-6048) Table Scan is failing if offheap cache enabled
[ https://issues.apache.org/jira/browse/HBASE-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-6048: - Assignee: ramkrishna.s.vasudevan Table Scan is failing if offheap cache enabled -- Key: HBASE-6048 URL: https://issues.apache.org/jira/browse/HBASE-6048 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Critical Table Scan is failing if offheap cache enabled. {noformat} 2012-05-18 20:03:38,446 DEBUG org.apache.hadoop.hbase.io.hfile.HFileWriterV2: Initialized with CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false] 2012-05-18 20:03:38,446 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter type for hdfs://10.18.40.217:9000/hbase/ufdr/1d4656fd417a07c9171a38b8f4d08510/.tmp/03742024b28f443bb63cfc338d4ca422: CompoundBloomFilterWriter 2012-05-18 20:04:25,576 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 120.57 MB of total=1020.57 MB 2012-05-18 20:04:25,655 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=120.82 MB, total=907.89 MB, single=1012.11 MB, multi=6.12 MB, memory=0 KB 2012-05-18 20:04:25,733 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner java.lang.IllegalStateException: Schema metrics requested before table/CF name initialization: {tableName:null,cfName:null} at org.apache.hadoop.hbase.regionserver.metrics.SchemaConfigured.getSchemaMetrics(SchemaConfigured.java:182) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.updateSizeMetrics(LruBlockCache.java:310) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:274) at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:293) at 
org.apache.hadoop.hbase.io.hfile.DoubleBlockCache.getBlock(DoubleBlockCache.java:102) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:296) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:455) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:475) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145) at org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:130) at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2001) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3274) at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1604) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1596) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1572) at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2310) at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376) 2012-05-18 20:04:25,828 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner {noformat} -- This message is automatically generated by JIRA. 
[jira] [Updated] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6011: -- Attachment: HBASE-6011_addendum_0.92.patch Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6011: -- Attachment: HBASE-6011_addendum_0.94.patch Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6011: -- Attachment: HBASE-6011_addendum_trunk.patch Updated patches. Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6011: -- Status: Patch Available (was: Open) Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-6011: -- Status: Open (was: Patch Available) Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278835#comment-13278835 ] ramkrishna.s.vasudevan commented on HBASE-6011: --- In the latest JIRA even HadoopQA and Hudson are becoming watchers :) Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278839#comment-13278839 ]

ramkrishna.s.vasudevan commented on HBASE-6046:
---

The problem here is that when the master retries after a ZK session expiry exception and succeeds, almost the entire master is recreated:

{code}
try {
  if (!becomeActiveMaster(status)) {
    return Boolean.FALSE;
  }
  initializeZKBasedSystemTrackers();
  // Update in-memory structures to reflect our earlier Root/Meta assignment.
  assignRootAndMeta(status);
  // process RIT if any
  // TODO: Why does this not call AssignmentManager.joinCluster? Otherwise
  // we are not processing dead servers if any.
  assignmentManager.processDeadServersAndRegionsInTransition();
{code}

Here initializeZKBasedSystemTrackers() even creates a new AssignmentManager, so whatever it does in processDeadServersAndRegionsInTransition() is like a fresh start. In processDeadServersAndRegionsInTransition():

{code}
for (Map.Entry<HRegionInfo, ServerName> e: this.regions.entrySet()) {
  if (!e.getKey().isMetaTable() && e.getValue() != null) {
    LOG.debug("Found " + e + " out on cluster");
    this.failover = true;
    break;
  }
{code}

Though all the RSs are online, 'this.regions' will be empty, and hence we go with a completely new assignment.

Master retry on ZK session expiry causes inconsistent region assignments.
-
Key: HBASE-6046
URL: https://issues.apache.org/jira/browse/HBASE-6046
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Fix For: 0.92.2, 0.94.1

1. A ZK session timeout in the HMaster leads to bulk assignment though all the RSs are online.
2. While doing bulk assignment, if the master goes down again and restarts (or a backup comes up), all the nodes created in ZK will be reassigned to the new RSs. This leads to double assignment.
We had 2800 regions; 1900 of them got double assignment, taking the region count to 4700.
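A minimal sketch of why a freshly recreated AssignmentManager never sets the failover flag. It is not the HBase code: the class name FailoverCheckSketch is hypothetical, plain strings stand in for HRegionInfo and ServerName, and the meta check is a simple name-prefix test assumed for illustration. It only mirrors the loop quoted in the comment above.

```java
import java.util.HashMap;
import java.util.Map;

public class FailoverCheckSketch {
    /**
     * Mirrors the quoted loop: failover is detected only when at least one
     * non-meta region is already known to be hosted on some server.
     * Keys are region names, values are server names (both plain strings here).
     */
    static boolean detectFailover(Map<String, String> regions) {
        for (Map.Entry<String, String> e : regions.entrySet()) {
            // Assumption for this sketch: catalog regions are recognised by name prefix.
            boolean isMeta = e.getKey().startsWith(".META.") || e.getKey().startsWith("-ROOT-");
            if (!isMeta && e.getValue() != null) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A long-running master knows where user regions live.
        Map<String, String> populated = new HashMap<>();
        populated.put("usertable,row-100", "rs1.example.org,60020");

        // A recreated AssignmentManager starts with an empty map,
        // even though every region server is still online.
        Map<String, String> recreated = new HashMap<>();

        System.out.println("populated map -> failover=" + detectFailover(populated));
        System.out.println("recreated map -> failover=" + detectFailover(recreated));
    }
}
```

With the empty map the check returns false, which corresponds to the "completely new assignment" path described in the comment.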
[jira] [Created] (HBASE-6049) Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject()
Maryann Xue created HBASE-6049:
--

Summary: Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject()
Key: HBASE-6049
URL: https://issues.apache.org/jira/browse/HBASE-6049
Project: HBase
Issue Type: Bug
Components: io
Affects Versions: 0.94.0
Reporter: Maryann Xue

One error case is in the Coprocessor AggregationClient: when the median() function handles an empty region, it returns a List object whose first element is null. The NPE occurs in the RPC response stage, and the response never gets sent.
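A minimal sketch of one way to serialize a list that may contain null elements: write a presence flag before each element. This is an assumption for illustration and not necessarily how the attached patch or HbaseObjectWritable handles it; the class NullSafeListCodec and its methods are hypothetical and use only java.io streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NullSafeListCodec {
    /** Writes the list length, then for each element a presence flag followed by the value if present. */
    static byte[] write(List<String> list) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(list.size());
            for (String s : list) {
                out.writeBoolean(s != null); // flag: false marks a null element
                if (s != null) {
                    out.writeUTF(s);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads back what write() produced, restoring null elements. */
    static List<String> read(byte[] data) {
        List<String> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int n = in.readInt();
            for (int i = 0; i < n; i++) {
                result.add(in.readBoolean() ? in.readUTF() : null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Same shape as the reported case: first element is null.
        List<String> input = Arrays.asList(null, "42");
        System.out.println(read(write(input)));
    }
}
```

The flag costs one byte per element; the point is only that a null element is written as data instead of being dereferenced.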
[jira] [Updated] (HBASE-6049) Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject()
[ https://issues.apache.org/jira/browse/HBASE-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maryann Xue updated HBASE-6049: --- Attachment: HBASE-6049.patch handle null values in a list in writeObject() Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject() Key: HBASE-6049 URL: https://issues.apache.org/jira/browse/HBASE-6049 Project: HBase Issue Type: Bug Components: io Affects Versions: 0.94.0 Reporter: Maryann Xue Attachments: HBASE-6049.patch An error case could be in Coprocessor AggregationClient, the median() function handles an empty region and returns a List Object with the first element as a Null value. NPE occurs in the RPC response stage and the response never gets sent. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6049) Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject()
[ https://issues.apache.org/jira/browse/HBASE-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maryann Xue updated HBASE-6049: --- Status: Patch Available (was: Open) Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject() Key: HBASE-6049 URL: https://issues.apache.org/jira/browse/HBASE-6049 Project: HBase Issue Type: Bug Components: io Affects Versions: 0.94.0 Reporter: Maryann Xue Attachments: HBASE-6049.patch An error case could be in Coprocessor AggregationClient, the median() function handles an empty region and returns a List Object with the first element as a Null value. NPE occurs in the RPC response stage and the response never gets sent. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6049) Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject()
[ https://issues.apache.org/jira/browse/HBASE-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278848#comment-13278848 ] Zhihong Yu commented on HBASE-6049: --- Patch makes sense. Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject() Key: HBASE-6049 URL: https://issues.apache.org/jira/browse/HBASE-6049 Project: HBase Issue Type: Bug Components: io Affects Versions: 0.94.0 Reporter: Maryann Xue Attachments: HBASE-6049.patch An error case could be in Coprocessor AggregationClient, the median() function handles an empty region and returns a List Object with the first element as a Null value. NPE occurs in the RPC response stage and the response never gets sent. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278854#comment-13278854 ] Hadoop QA commented on HBASE-6011: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528062/HBASE-6011_addendum_trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 32 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSide org.apache.hadoop.hbase.master.TestSplitLogManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1930//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1930//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1930//console This message is automatically generated. 
Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
[ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278881#comment-13278881 ]

ramkrishna.s.vasudevan commented on HBASE-6050:
---

Why are we trying to create the dstdir? What is the reason for it? Should the fix be applied here, or on the HBCK side so that it does not report an inconsistency? If we make the change in HBCK, though, it is not clear how the recovered.edits file that was created would get deleted, because the master will never try to open this region.

HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
---
Key: HBASE-6050
URL: https://issues.apache.org/jira/browse/HBASE-6050
Project: HBase
Issue Type: Bug
Reporter: ramkrishna.s.vasudevan

The scenario is like this:
- A region is being split.
- The master has not yet processed the split.
- The region server goes down.
- The split log manager starts splitting the logs and creates the recovered.edits in the splitlog path.
- CJ (CatalogJanitor) starts, deletes the entry from META, and completes the deletion of the region dir.
- In HLogSplitter, as the final step, we rename the recovered.edits to come under the regiondir.

There, if the regiondir does not exist, we create it and then add the recovered.edits. Because of this, HBCK treats it as an orphan region: the regiondir exists but has no regioninfo. The cluster is actually fine, but the report is misleading.

{code}
    } else {
      Path dstdir = dst.getParent();
      if (!fs.exists(dstdir)) {
        if (!fs.mkdirs(dstdir)) LOG.warn("mkdir failed on " + dstdir);
      }
    }
    fs.rename(src, dst);
    LOG.debug(" moved " + src + " => " + dst);
  } else {
    LOG.debug("Could not move recovered edits from " + src + " as it doesn't exist");
  }
}
archiveLogs(null, corruptedLogs, processedLogs, oldLogDir, fs, conf);
{code}
[jira] [Created] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
ramkrishna.s.vasudevan created HBASE-6050:
-

Summary: HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
Key: HBASE-6050
URL: https://issues.apache.org/jira/browse/HBASE-6050
Project: HBase
Issue Type: Bug
Reporter: ramkrishna.s.vasudevan

The scenario is like this:
- A region is being split.
- The master has not yet processed the split.
- The region server goes down.
- The split log manager starts splitting the logs and creates the recovered.edits in the splitlog path.
- CJ (CatalogJanitor) starts, deletes the entry from META, and completes the deletion of the region dir.
- In HLogSplitter, as the final step, we rename the recovered.edits to come under the regiondir.

There, if the regiondir does not exist, we create it and then add the recovered.edits. Because of this, HBCK treats it as an orphan region: the regiondir exists but has no regioninfo. The cluster is actually fine, but the report is misleading.

{code}
    } else {
      Path dstdir = dst.getParent();
      if (!fs.exists(dstdir)) {
        if (!fs.mkdirs(dstdir)) LOG.warn("mkdir failed on " + dstdir);
      }
    }
    fs.rename(src, dst);
    LOG.debug(" moved " + src + " => " + dst);
  } else {
    LOG.debug("Could not move recovered edits from " + src + " as it doesn't exist");
  }
}
archiveLogs(null, corruptedLogs, processedLogs, oldLogDir, fs, conf);
{code}
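A minimal sketch of the alternative behaviour the scenario above suggests: skip the move when the region directory is gone instead of recreating it. It uses java.nio.file as a stand-in for the Hadoop FileSystem API, and the class RecoveredEditsMoveSketch and its methods are hypothetical. It is not a proposed patch, and the existence check followed by the move is still not atomic.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RecoveredEditsMoveSketch {
    /**
     * Moves src to dst only if dst's parent directory (the region dir) still exists.
     * Returns true if the file was moved, false if the move was skipped.
     * The parent directory is never created here, so a deleted region is not resurrected.
     */
    static boolean moveIfRegionDirExists(Path src, Path dst) {
        try {
            Path regionDir = dst.getParent();
            if (regionDir == null || !Files.isDirectory(regionDir)) {
                return false; // caller decides what to do with the leftover edits
            }
            Files.move(src, dst);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sets up a temp layout and reports "moved/regionDirExistsAfterwards". */
    static String demo(boolean regionDirExists) {
        try {
            Path root = Files.createTempDirectory("hlog-sketch");
            Path src = Files.createFile(root.resolve("recovered.edits.tmp"));
            Path regionDir = root.resolve("region");
            if (regionDirExists) {
                Files.createDirectory(regionDir);
            }
            boolean moved = moveIfRegionDirExists(src, regionDir.resolve("recovered.edits"));
            return moved + "/" + Files.isDirectory(regionDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("region dir present: " + demo(true));
        System.out.println("region dir deleted: " + demo(false));
    }
}
```

Whether the leftover edits should then be deleted or archived is the open question raised in the comment on this issue; the sketch leaves that to the caller.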
[jira] [Updated] (HBASE-6044) copytable: remove rs.* parameters
[ https://issues.apache.org/jira/browse/HBASE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6044: -- Attachment: hbase-6044-v4.patch There was a conflict with recently committed patch. this one should apply. copytable: remove rs.* parameters - Key: HBASE-6044 URL: https://issues.apache.org/jira/browse/HBASE-6044 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-6044-v2.patch, hbase-6044-v3.patch, hbase-6044-v4.patch, hbase-6044.patch In discussion of HBASE-6013 it was suggested that we remove these arguments from 0.92+ (but keep in 0.90) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6049) Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject()
[ https://issues.apache.org/jira/browse/HBASE-6049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278905#comment-13278905 ] Hadoop QA commented on HBASE-6049: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528069/HBASE-6049.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 32 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1931//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1931//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1931//console This message is automatically generated. Serializing List containing null elements will cause NullPointerException in HbaseObjectWritable.writeObject() Key: HBASE-6049 URL: https://issues.apache.org/jira/browse/HBASE-6049 Project: HBase Issue Type: Bug Components: io Affects Versions: 0.94.0 Reporter: Maryann Xue Attachments: HBASE-6049.patch An error case could be in Coprocessor AggregationClient, the median() function handles an empty region and returns a List Object with the first element as a Null value. 
NPE occurs in the RPC response stage and the response never gets sent.
[jira] [Updated] (HBASE-6047) Put.has() can't determine result correctly
[ https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-6047:
--

Description:
the public method 'has(byte [] family, byte [] qualifier)' internally invoked the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], ignoreTS=true, ignoreValue=true', but there's a logical error in the body, it'll enter the block

{code}
else if (ignoreValue) {
  for (KeyValue kv: list) {
    if (Arrays.equals(kv.getFamily(), family) && Arrays.equals(kv.getQualifier(), qualifier)
        && kv.getTimestamp() == ts) {
      return true;
    }
  }
}
{code}

the expression 'kv.getTimestamp() == ts' in the if conditions should only exist when 'ignoreTS=false', otherwise, the following code will return false!

{code}
Put put = new Put(Bytes.toBytes("row-01"));
put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"), 1234567L, Bytes.toBytes("value-01"));
System.out.println(put.has(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01")));
{code}

was:
the public method 'has(byte [] family, byte [] qualifier)' internally invoked the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], ignoreTS=true, ignoreValue=true', but there's a logical error in the body, it'll enter the block

else if (ignoreValue) {
  for (KeyValue kv: list) {
    if (Arrays.equals(kv.getFamily(), family) && Arrays.equals(kv.getQualifier(), qualifier)
        && kv.getTimestamp() == ts) {
      return true;
    }
  }
}

the expression 'kv.getTimestamp() == ts' in the if conditions should only exist when 'ignoreTS=false', otherwise, the following code will return false!

Put put = new Put(Bytes.toBytes("row-01"));
put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"), 1234567L, Bytes.toBytes("value-01"));
System.out.println(put.has(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01")));

Put.has() can't determine result correctly
--
Key: HBASE-6047
URL: https://issues.apache.org/jira/browse/HBASE-6047
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.92.1
Reporter: Wang Qiang

the public method 'has(byte [] family, byte [] qualifier)' internally invoked the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], ignoreTS=true, ignoreValue=true', but there's a logical error in the body, it'll enter the block

{code}
else if (ignoreValue) {
  for (KeyValue kv: list) {
    if (Arrays.equals(kv.getFamily(), family) && Arrays.equals(kv.getQualifier(), qualifier)
        && kv.getTimestamp() == ts) {
      return true;
    }
  }
}
{code}

the expression 'kv.getTimestamp() == ts' in the if conditions should only exist when 'ignoreTS=false', otherwise, the following code will return false!

{code}
Put put = new Put(Bytes.toBytes("row-01"));
put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"), 1234567L, Bytes.toBytes("value-01"));
System.out.println(put.has(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01")));
{code}
[jira] [Commented] (HBASE-6047) Put.has() can't determine result correctly
[ https://issues.apache.org/jira/browse/HBASE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278916#comment-13278916 ]

Zhihong Yu commented on HBASE-6047:
---

Can you put the second code snippet into a test and provide a patch? Thanks

Put.has() can't determine result correctly
--
Key: HBASE-6047
URL: https://issues.apache.org/jira/browse/HBASE-6047
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.92.1
Reporter: Wang Qiang

the public method 'has(byte [] family, byte [] qualifier)' internally invoked the private method 'has(byte [] family, byte [] qualifier, long ts, byte [] value, boolean ignoreTS, boolean ignoreValue)' with 'value=new byte[0], ignoreTS=true, ignoreValue=true', but there's a logical error in the body, it'll enter the block

{code}
else if (ignoreValue) {
  for (KeyValue kv: list) {
    if (Arrays.equals(kv.getFamily(), family) && Arrays.equals(kv.getQualifier(), qualifier)
        && kv.getTimestamp() == ts) {
      return true;
    }
  }
}
{code}

the expression 'kv.getTimestamp() == ts' in the if conditions should only exist when 'ignoreTS=false', otherwise, the following code will return false!

{code}
Put put = new Put(Bytes.toBytes("row-01"));
put.add(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01"), 1234567L, Bytes.toBytes("value-01"));
System.out.println(put.has(Bytes.toBytes("family-01"), Bytes.toBytes("qualifier-01")));
{code}
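A minimal sketch of the check the report asks for, where the timestamp is compared only when ignoreTS is false and the value only when ignoreValue is false. The class PutHasSketch and its nested Cell are hypothetical stand-ins for Put and KeyValue, and the placeholder timestamp passed by the two-argument has() is an assumption. This is not the HBase source and not the eventual patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PutHasSketch {
    /** Hypothetical stand-in for KeyValue: only the fields the check needs. */
    static final class Cell {
        final byte[] family;
        final byte[] qualifier;
        final long ts;
        final byte[] value;

        Cell(byte[] family, byte[] qualifier, long ts, byte[] value) {
            this.family = family;
            this.qualifier = qualifier;
            this.ts = ts;
            this.value = value;
        }
    }

    private final List<Cell> cells = new ArrayList<>();

    void add(byte[] family, byte[] qualifier, long ts, byte[] value) {
        cells.add(new Cell(family, qualifier, ts, value));
    }

    /** Each optional criterion is skipped when its ignore flag is set. */
    boolean has(byte[] family, byte[] qualifier, long ts, byte[] value,
                boolean ignoreTS, boolean ignoreValue) {
        for (Cell c : cells) {
            if (Arrays.equals(c.family, family)
                    && Arrays.equals(c.qualifier, qualifier)
                    && (ignoreTS || c.ts == ts)
                    && (ignoreValue || Arrays.equals(c.value, value))) {
                return true;
            }
        }
        return false;
    }

    /** Family/qualifier-only lookup: the timestamp argument is a placeholder and is ignored. */
    boolean has(byte[] family, byte[] qualifier) {
        return has(family, qualifier, 0L, new byte[0], true, true);
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same data as the snippet in the report.
        PutHasSketch put = new PutHasSketch();
        put.add(bytes("family-01"), bytes("qualifier-01"), 1234567L, bytes("value-01"));
        System.out.println(put.has(bytes("family-01"), bytes("qualifier-01")));
    }
}
```

In this sketch the two-argument lookup finds the cell regardless of its timestamp, which is the behaviour the reporter expects from Put.has().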
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278918#comment-13278918 ] Andrew Purtell commented on HBASE-6011: --- Ah, I see, +1 on the addendum. Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278919#comment-13278919 ]

nkeywal commented on HBASE-5926:

I think it's ok, I don't have this locally...

Delete the master znode after a master crash

Key: HBASE-5926
URL: https://issues.apache.org/jira/browse/HBASE-5926
Project: HBase
Issue Type: Improvement
Components: master, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0
Attachments: 5926.v10.patch, 5926.v11.patch, 5926.v13.patch, 5926.v6.patch, 5926.v8.patch, 5926.v9.patch

This is the continuation of the work done in HBASE-5844. But we can't apply exactly the same strategy: for the region server, there is a znode per region server, while for the master and backup master there is a single znode for both. So if we apply the same strategy as for a regionserver, we may have this scenario:
1) Master starts
2) Backup master starts
3) Master dies
4) ZK detects it
5) Backup master receives the update from ZK
6) Backup master creates the new master node and becomes the main master
7) Previous master script continues
8) Previous master script deletes the master node in ZK
9) => issue: we deleted the node just created by the new master

This should not happen often (usually the znode will be deleted soon enough), but it can happen.
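A minimal sketch of one way to avoid step 9: the cleanup script deletes the master znode only if it still holds the crashed master's name. This is an illustration under stated assumptions, not necessarily what the attached patches do. An in-memory map stands in for ZooKeeper, and the class MasterZnodeSketch, its methods, and the path "/hbase/master" are all hypothetical here.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class MasterZnodeSketch {
    // Assumed path for the single master znode; used only as a map key in this sketch.
    static final String MASTER_PATH = "/hbase/master";

    // In-memory stand-in for ZooKeeper: path -> name of the server that created the node.
    private final ConcurrentMap<String, String> znodes = new ConcurrentHashMap<>();

    /** Steps 1 and 6: a server becomes main master if no master znode exists yet. */
    boolean becomeMaster(String serverName) {
        return znodes.putIfAbsent(MASTER_PATH, serverName) == null;
    }

    /** Step 4: the store drops the dead master's node once it notices the failure. */
    void sessionExpired(String serverName) {
        znodes.remove(MASTER_PATH, serverName);
    }

    /**
     * Step 8, made conditional: the crashed master's script removes the node
     * only if it still belongs to that master. remove(key, value) is atomic on the map.
     */
    boolean clearAfterCrash(String crashedServerName) {
        return znodes.remove(MASTER_PATH, crashedServerName);
    }

    String currentMaster() {
        return znodes.get(MASTER_PATH);
    }

    public static void main(String[] args) {
        MasterZnodeSketch zk = new MasterZnodeSketch();
        zk.becomeMaster("master-1");     // 1) master starts
        zk.sessionExpired("master-1");   // 3-4) master dies, failure detected
        zk.becomeMaster("backup-1");     // 5-6) backup takes over
        boolean deleted = zk.clearAfterCrash("master-1"); // 7-8) old script runs late
        System.out.println("old script deleted node: " + deleted);
        System.out.println("current master: " + zk.currentMaster());
    }
}
```

Against a real ZooKeeper ensemble, a comparable guard would need the read and the delete to be tied together, for example through a version-conditioned delete; the sketch only shows the ownership check itself.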
[jira] [Updated] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6011: -- Resolution: Fixed Status: Resolved (was: Patch Available) Committed all addendum patches. Sorry Ram. Unable to start master in local mode Key: HBASE-6011 URL: https://issues.apache.org/jira/browse/HBASE-6011 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too: {noformat} 12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5699) Run with 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278927#comment-13278927 ] Zhihong Yu commented on HBASE-5699: --- @Ramkrishna: Your numbers look better than mine though the mix in my case was 50% updates and 50% puts. Can you publish latency numbers as well ? Run with 1 WAL in HRegionServer - Key: HBASE-5699 URL: https://issues.apache.org/jira/browse/HBASE-5699 Project: HBase Issue Type: Improvement Reporter: binlijin Assignee: Li Pi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6044) copytable: remove rs.* parameters
[ https://issues.apache.org/jira/browse/HBASE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278948#comment-13278948 ] Hadoop QA commented on HBASE-6044: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528091/hbase-6044-v4.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop23. The patch compiles against the hadoop 0.23.x profile. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 32 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1932//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1932//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1932//console This message is automatically generated. copytable: remove rs.* parameters - Key: HBASE-6044 URL: https://issues.apache.org/jira/browse/HBASE-6044 Project: HBase Issue Type: New Feature Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-6044-v2.patch, hbase-6044-v3.patch, hbase-6044-v4.patch, hbase-6044.patch In discussion of HBASE-6013 it was suggested that we remove these arguments from 0.92+ (but keep in 0.90) -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278967#comment-13278967 ]
Zhihong Yu commented on HBASE-5926:
Minor comments:
{code}
+ * servers. It allows to delete immediately the znode when the master or the regions server crash.
{code}
'regions server crash' -> 'region server crashes'
{code}
+ * The region server / master write a specific file when they start / become main master. When they
{code}
'write a specific' -> 'writes a specific'. 'they start / become' -> 'it starts / becomes'
I think using 'they' is confusing because region server and master have different roles.
{code}
+ clear Delete the master znode in ZooKeeper after a master crash\n +
{code}
'master crash' -> 'master crashes'
Delete the master znode after a master crash
Key: HBASE-5926
URL: https://issues.apache.org/jira/browse/HBASE-5926
Project: HBase
Issue Type: Improvement
Components: master, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0
Attachments: 5926.v10.patch, 5926.v11.patch, 5926.v13.patch, 5926.v6.patch, 5926.v8.patch, 5926.v9.patch
This is the continuation of the work done in HBASE-5844. But we can't apply exactly the same strategy: for the region server, there is a znode per region server, while for the master & backup master there is a single znode for both. So if we apply the same strategy as for a regionserver, we may have this scenario:
1) Master starts
2) Backup master starts
3) Master dies
4) ZK detects it
5) Backup master receives the update from ZK
6) Backup master creates the new master node and becomes the main master
7) Previous master script continues
8) Previous master script deletes the master node in ZK
9) => issue: we deleted the node just created by the new master
This should not happen often (usually the znode will be deleted soon enough), but it can happen.
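The race in steps 7) to 9) of the HBASE-5926 description goes away if the cleanup only deletes the master znode while it still names the crashed master. The following is a minimal sketch of that idea, not the code from the attached patches: the class `MasterZnodeCleanup` and its methods are hypothetical, and a synchronized in-memory field stands in for ZooKeeper.

```java
/**
 * Hypothetical in-memory stand-in for the single master znode.
 * This is an illustration only; it does not use the ZooKeeper or HBase APIs.
 */
public class MasterZnodeCleanup {
    // Name of the server that currently owns the master znode, or null if the znode is absent.
    private String owner;

    /** A (backup) master creates the znode only if nobody currently holds it. */
    public synchronized boolean tryBecomeMaster(String serverName) {
        if (owner != null) {
            return false;
        }
        owner = serverName;
        return true;
    }

    /** Simulates ZK dropping the znode once it detects that the owner died. */
    public synchronized void expire() {
        owner = null;
    }

    /**
     * Cleanup run by the script of a crashed master: delete the znode only if
     * it still belongs to that master. Returns true if the znode was deleted.
     */
    public synchronized boolean deleteIfOwnedBy(String crashedServer) {
        if (owner != null && owner.equals(crashedServer)) {
            owner = null;
            return true;
        }
        return false;
    }

    public synchronized String currentMaster() {
        return owner;
    }

    public static void main(String[] args) {
        MasterZnodeCleanup zk = new MasterZnodeCleanup();
        zk.tryBecomeMaster("master-a");   // 1) master starts
        zk.expire();                      // 3) + 4) master dies, ZK detects it
        zk.tryBecomeMaster("master-b");   // 6) backup master becomes the main master
        // 7) + 8) the previous master's script runs late; the guard leaves the new znode alone
        boolean deleted = zk.deleteIfOwnedBy("master-a");
        System.out.println("deleted=" + deleted + ", master=" + zk.currentMaster());
    }
}
```

Against a real ZooKeeper ensemble the check and the delete would have to be made atomic by other means (for example a version-conditioned delete); the sketch only shows why an unconditional delete is unsafe when master and backup master share one znode.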
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278985#comment-13278985 ]
ramkrishna.s.vasudevan commented on HBASE-6011:
No problem. Thanks Andy for committing them.
Unable to start master in local mode
Key: HBASE-6011
URL: https://issues.apache.org/jira/browse/HBASE-6011
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch
Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too:
{noformat}
12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master
java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761)
{noformat}
[jira] [Updated] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5926: --- Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5926: --- Attachment: 5926.v14.patch
[jira] [Updated] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5926: --- Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5997) Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
[ https://issues.apache.org/jira/browse/HBASE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13278994#comment-13278994 ]
Zhihong Yu commented on HBASE-5997:
@Anoop: Is your patch supposed to fix the problem shown in the unit test published yesterday? With the patch, I still see the following test failure:
{code}
testHalfScannerSeekBefore(org.apache.hadoop.hbase.io.TestHBASE5997) Time elapsed: 3.193 sec ERROR!
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
Fri May 18 10:31:13 PDT 2012, org.apache.hadoop.hbase.client.HTable$4@605df3c5, org.apache.hadoop.hbase.regionserver.WrongRegionException: org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for Get on HRegion t1,r3,1337362272730.9f3ae0ff1963c95df2df0aa768e739fc., startKey='r3', getEndKey()='', row='r2'
{code}
Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
Key: HBASE-5997
URL: https://issues.apache.org/jira/browse/HBASE-5997
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
Reporter: ramkrishna.s.vasudevan
Assignee: Anoop Sam John
Attachments: HBASE-5997_0.94.patch, Testcase.patch.txt
Please refer to the comment https://issues.apache.org/jira/browse/HBASE-5922?focusedCommentId=13269346page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269346. Raised this issue to address that comment, just so we don't forget it.
[jira] [Updated] (HBASE-5453) Switch on-disk formats (reference files, HFile meta fields, etc) to PB
[ https://issues.apache.org/jira/browse/HBASE-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5453: - Attachment: 5453v12.txt Fixed nasty bug. Default value system and deserialization of HColumnDescriptor clashed. Resulting in cryptic odd failed scanner issue. Yuck. Switch on-disk formats (reference files, HFile meta fields, etc) to PB -- Key: HBASE-5453 URL: https://issues.apache.org/jira/browse/HBASE-5453 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: stack Fix For: 0.96.0 Attachments: 5453.txt, 5453v10.txt, 5453v11.txt, 5453v11.txt, 5453v12.txt, 5453v2.txt, 5453v3.txt, 5453v6.txt, 5453v9.txt
[jira] [Updated] (HBASE-5453) Switch on-disk formats (reference files, HFile meta fields, etc) to PB
[ https://issues.apache.org/jira/browse/HBASE-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5453: - Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-5453) Switch on-disk formats (reference files, HFile meta fields, etc) to PB
[ https://issues.apache.org/jira/browse/HBASE-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5453: - Status: Open (was: Patch Available)
[jira] [Commented] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279000#comment-13279000 ] stack commented on HBASE-5926: -- +1 on patch v14
[jira] [Commented] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279005#comment-13279005 ] stack commented on HBASE-5926: -- I'll fix Ted's comments on commit.
[jira] [Updated] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5926: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Looks like N already addressed Ted's review comments. Great. Applied to trunk. Thanks for the patch N.
[jira] [Commented] (HBASE-6002) Possible chance of resource leak in HlogSplitter
[ https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279012#comment-13279012 ]
Chinna Rao Lalam commented on HBASE-6002:
@Stack: Yes, the writersClosed flag is used so that the finally block does not attempt to close the writers again if they were all closed successfully.
@Ted: bq. close previously closed stream has no effect
I think that if the previous close was successful, closing again will have no effect (I am not fully sure); I need to check this.
Possible chance of resource leak in HlogSplitter
Key: HBASE-6002
URL: https://issues.apache.org/jira/browse/HBASE-6002
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 0.94.0, 0.96.0
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
Attachments: HBASE-6002.patch, HBASE-6002_0.94_1.patch, HBASE-6002_trunk.patch
In HLogSplitter.splitLogFileToTemp, the Reader (in) is not closed. Also, in the finally block, if any exception is thrown in the loop that closes the writers (wap.w), the remaining writers won't be closed.
[jira] [Commented] (HBASE-6044) copytable: remove rs.* parameters
[ https://issues.apache.org/jira/browse/HBASE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279013#comment-13279013 ] stack commented on HBASE-6044: -- +1 Thanks for updating docs.
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279011#comment-13279011 ] Hudson commented on HBASE-6011: --- Integrated in HBase-TRUNK #2898 (See [https://builds.apache.org/job/HBase-TRUNK/2898/]) HBASE-6011. Addendum to support master mocking (Ram) (Revision 1340155) Result = FAILURE apurtell : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java
[jira] [Commented] (HBASE-6002) Possible chance of resource leak in HlogSplitter
[ https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279015#comment-13279015 ] stack commented on HBASE-6002: -- I'm fine w/ commit. Was setting wap.w = null after close discussed? (could copy it before actual call to close and then close the copy after setting wap.w = null)
[jira] [Commented] (HBASE-6002) Possible chance of resource leak in HlogSplitter
[ https://issues.apache.org/jira/browse/HBASE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279016#comment-13279016 ] stack commented on HBASE-6002: -- Sorry. My above suggestion would do away w/ the need of the flag, and it would make the check more particular, being done per wap.w.
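The suggestion in this thread (copy the writer, set wap.w = null, then close the copy, so that no separate writersClosed flag is needed) can be sketched as below. This is only an illustration under assumptions: `WriterCloser`, `WriterAndPath` and `closeAll` are hypothetical names, `java.io.Closeable` stands in for the real writer type, and this is not the code from the attached patches.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of per-writer close tracking; not the HLogSplitter code. */
public class WriterCloser {

    /** Stand-in for the holder referred to as "wap" in the thread; w is its writer. */
    public static class WriterAndPath {
        public Closeable w;

        public WriterAndPath(Closeable w) {
            this.w = w;
        }
    }

    /**
     * Tries to close every writer. A failure on one writer is recorded and the
     * loop continues, so the remaining writers are still closed.
     */
    public static List<IOException> closeAll(List<WriterAndPath> waps) {
        List<IOException> failures = new ArrayList<>();
        for (WriterAndPath wap : waps) {
            Closeable copy = wap.w;
            if (copy == null) {
                continue; // close was already attempted on this writer
            }
            wap.w = null; // mark as handled before the call that may throw
            try {
                copy.close();
            } catch (IOException e) {
                failures.add(e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<WriterAndPath> waps = new ArrayList<>();
        waps.add(new WriterAndPath(() -> System.out.println("closed writer 1")));
        waps.add(new WriterAndPath(() -> { throw new IOException("close failed"); }));
        waps.add(new WriterAndPath(() -> System.out.println("closed writer 3")));
        System.out.println("failures: " + closeAll(waps).size());
        // A second call, for example from a finally block, finds nothing left to close.
        System.out.println("failures on second call: " + closeAll(waps).size());
    }
}
```

Because each wap.w is nulled before its close call, a later pass in a finally block skips writers that were already handled, which is the per-writer check the comment describes.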
[jira] [Commented] (HBASE-5546) Master assigns region in the original region server when opening region failed
[ https://issues.apache.org/jira/browse/HBASE-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13279017#comment-13279017 ]
ramkrishna.s.vasudevan commented on HBASE-5546:
Committed to 0.94 and trunk. Thanks for the patch Ashutosh. Thanks for the review Ted and Stack.
Master assigns region in the original region server when opening region failed
Key: HBASE-5546
URL: https://issues.apache.org/jira/browse/HBASE-5546
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: Ashutosh Jindal
Priority: Minor
Fix For: 0.96.0, 0.94.1
Attachments: hbase-5546.patch, hbase-5546_1.patch, hbase-5546_2.patch, hbase-5546_3.patch, hbase-5546_4.patch
Master assigns the region to the original region server again when an RS_ZK_REGION_FAILED_OPEN event arrives. Maybe we should choose another region server.
[2012-03-07 10:14:21,750] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:31,826] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:41,903] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:51,975] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:02,056] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:12,167] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:22,231] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:32,303] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:42,375] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:52,447] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:02,528] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:12,600] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:22,676] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
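The log above shows the same server receiving the region roughly every ten seconds. The idea proposed in the description, choosing another region server after RS_ZK_REGION_FAILED_OPEN, could look roughly like the sketch below. It rests on assumptions: `RetryAssignment` and `chooseDestination` are hypothetical names, the "first remaining server" policy is a placeholder for whatever balancing the master really applies, and none of this is taken from the committed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of excluding the failing server on retry; not AssignmentManager code. */
public class RetryAssignment {

    /**
     * Picks a destination for a region whose open just failed on failedServer.
     * The failed server is used again only if it is the sole online server.
     * Returns null when no server is online.
     */
    public static String chooseDestination(List<String> onlineServers, String failedServer) {
        List<String> candidates = new ArrayList<>(onlineServers);
        candidates.remove(failedServer); // drop the server that reported FAILED_OPEN
        if (!candidates.isEmpty()) {
            return candidates.get(0); // placeholder policy: first remaining server
        }
        return onlineServers.isEmpty() ? null : onlineServers.get(0);
    }

    public static void main(String[] args) {
        // The first name comes from the log above; the second is made up for the example.
        List<String> online = Arrays.asList(
                "158-1-130-11,20020,1331108408232",
                "example-rs-2,20020,1331108400000");
        System.out.println(chooseDestination(online, "158-1-130-11,20020,1331108408232"));
    }
}
```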
[jira] [Updated] (HBASE-5546) Master assigns region in the original region server when opening region failed
[ https://issues.apache.org/jira/browse/HBASE-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5546:

Resolution: Fixed
Fix Version/s: 0.94.1, 0.96.0
Status: Resolved (was: Patch Available)

Thanks for the review Gao.

Master assigns region in the original region server when opening region failed

Key: HBASE-5546
URL: https://issues.apache.org/jira/browse/HBASE-5546
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: Ashutosh Jindal
Priority: Minor
Fix For: 0.96.0, 0.94.1
Attachments: hbase-5546.patch, hbase-5546_1.patch, hbase-5546_2.patch, hbase-5546_3.patch, hbase-5546_4.patch

The master reassigns the region to the original region server when an RS_ZK_REGION_FAILED_OPEN event arrives. Maybe we should choose another region server.

[2012-03-07 10:14:21,750] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:31,826] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:41,903] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:14:51,975] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:02,056] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:12,167] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:22,231] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:32,303] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:42,375] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:15:52,447] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:02,528] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:12,600] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053
[2012-03-07 10:16:22,676] [DEBUG] [main-EventThread] [org.apache.hadoop.hbase.master.AssignmentManager 553] Handling transition=RS_ZK_REGION_FAILED_OPEN, server=158-1-130-11,20020,1331108408232, region=c70e98bdca98a0657a56436741523053

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5882) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279023#comment-13279023 ]

Zhihong Yu commented on HBASE-5882:

{code}
+} else if (wasOnDeadServer(sn, deadServers)){
{code}
Since regionInfo is not one of the parameters of wasOnDeadServer(), the method name still doesn't make sense. I think we can place the check ( deadServers.keySet().contains(sn) ) directly above. This way there is no need to introduce a new method.

Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor

Key: HBASE-5882
URL: https://issues.apache.org/jira/browse/HBASE-5882
Project: HBase
Issue Type: Improvement
Affects Versions: 0.90.6, 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Attachments: hbase_5882.patch, hbase_5882_V2.patch, hbase_5882_V3.patch

Currently, when the master restarts and runs processRIT, any region found on a dead server is not given a new assignment; it is left for the timeout monitor to take care of. This case is more prominent if the node is found in the RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan.
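The inline check suggested in the comment above could look roughly like the sketch below. It is only an illustration, not the patch: String stands in for HBase's ServerName, the Boolean map values are placeholders, and decide() is a hypothetical stand-in for the branch inside processRIT.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: String stands in for ServerName, and decide() is a
// hypothetical stand-in for the branch discussed in the review comment.
class DeadServerCheckSketch {
    static String decide(String sn, Map<String, Boolean> deadServers) {
        if (sn == null) {
            return "no-server";
        } else if (deadServers.keySet().contains(sn)) {
            // Inline membership check; no wasOnDeadServer() helper is needed.
            return "reassign";
        }
        return "wait";
    }

    public static void main(String[] args) {
        Map<String, Boolean> deadServers = new HashMap<>();
        deadServers.put("rs1,20020,1331108408232", Boolean.TRUE);
        System.out.println(decide("rs1,20020,1331108408232", deadServers));
        System.out.println(decide("rs2,20020,1331108408999", deadServers));
    }
}
```

Keeping the check inline avoids a helper whose name implies a region-level question while its parameters only describe a server.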
[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
[ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279029#comment-13279029 ]

stack commented on HBASE-6050:

Good one Ram. So, we are talking about the parent region? It does seem wrong that we would recreate a parent region dir in the distributed log splitter. How about we remove that dir creation code? I can see our making the recovered.edits dir because it won't always be there, but creating all of its parent dirs is not right. My guess is that the mkdirs was done because it was just easier than verifying that the parent dir is present. If the parent dir is not present, log the fact that there is no target region into which to put the edits and move on, I'd say.

HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Key: HBASE-6050
URL: https://issues.apache.org/jira/browse/HBASE-6050
Project: HBase
Issue Type: Bug
Reporter: ramkrishna.s.vasudevan

The scenario is like this:
- A region is being split.
- The master has not yet processed the split.
- The region server goes down.
- The split log manager starts splitting the logs and creates the recovered.edits in the splitlog path.
- CJ starts, deletes the entry from META, and completes the deletion of the region dir.
- In HLogSplitter, as the final step, we rename the recovered.edits to come under the regiondir.
There, if the regiondir does not exist, we create it and then add the recovered.edits. Because of this, HBCK thinks it is an orphan region, because we have the regiondir but no regioninfo. The cluster is actually fine, but the report is misleading.
{code}
          } else {
            Path dstdir = dst.getParent();
            if (!fs.exists(dstdir)) {
              if (!fs.mkdirs(dstdir)) LOG.warn("mkdir failed on " + dstdir);
            }
          }
          fs.rename(src, dst);
          LOG.debug(" moved " + src + " => " + dst);
        } else {
          LOG.debug("Could not move recovered edits from " + src +
            " as it doesn't exist");
        }
      }
      archiveLogs(null, corruptedLogs, processedLogs, oldLogDir, fs, conf);
{code}
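The alternative raised in the comment (create only the recovered.edits dir, and skip the move when the region dir itself is gone) might be sketched as follows. This is a sketch under assumptions, not HLogSplitter code: java.nio.file stands in for Hadoop's FileSystem, moveRecoveredEdits() is a hypothetical helper, and the layout <regiondir>/recovered.edits/<file> is assumed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: java.nio.file replaces Hadoop's FileSystem, and
// moveRecoveredEdits() is a hypothetical helper.
class RecoveredEditsMoveSketch {
    /** Returns true if the edits were moved, false if they were skipped. */
    static boolean moveRecoveredEdits(Path src, Path dst) throws IOException {
        if (!Files.exists(src)) {
            System.out.println("Could not move recovered edits from " + src
                + " as it doesn't exist");
            return false;
        }
        Path editsDir = dst.getParent();       // <regiondir>/recovered.edits
        Path regionDir = editsDir.getParent(); // <regiondir>
        if (!Files.isDirectory(regionDir)) {
            // Do not recreate the region dir; log the fact and move on.
            System.out.println("No target region dir " + regionDir
                + "; skipping " + src);
            return false;
        }
        if (!Files.exists(editsDir)) {
            // Creates exactly one level, never the missing parents.
            Files.createDirectory(editsDir);
        }
        Files.move(src, dst);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("sketch");
        Path src = Files.createFile(root.resolve("edits.tmp"));
        Path dst = root.resolve("region-a").resolve("recovered.edits").resolve("0001");
        // region-a was never created, so the move is skipped.
        System.out.println(moveRecoveredEdits(src, dst));
    }
}
```

Using a single-level directory creation is the point of the sketch: a deleted region dir can then never reappear as a side effect of the move. What happens to the skipped source file is left open here.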
[jira] [Updated] (HBASE-6000) Cleanup where we keep .proto files
[ https://issues.apache.org/jira/browse/HBASE-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6000:

Status: Patch Available (was: Open)

Cleanup where we keep .proto files

Key: HBASE-6000
URL: https://issues.apache.org/jira/browse/HBASE-6000
Project: HBase
Issue Type: Bug
Affects Versions: 0.96.0
Reporter: stack
Assignee: stack
Attachments: 6000.txt, 6000.txt

I see that Andrew, for his pb work over in rest, has .proto files under src/main/resources. We should unify where these files live. The recently added .proto files are placed under src/main/protobuf. It's confusing. The thrift idl files are under resources too. Seems like we should move src/main/protobuf under src/resources to be consistent.
[jira] [Commented] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279031#comment-13279031 ]

Hudson commented on HBASE-5926:

Integrated in HBase-TRUNK #2899 (See [https://builds.apache.org/job/HBase-TRUNK/2899/])
HBASE-5926 Delete the master znode after a master crash (Revision 1340185)

Result = FAILURE
stack :
Files :
* /hbase/trunk/bin/hbase-daemon.sh
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/MasterAddressTracker.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperNodeTracker.java

Delete the master znode after a master crash

Key: HBASE-5926
URL: https://issues.apache.org/jira/browse/HBASE-5926
Project: HBase
Issue Type: Improvement
Components: master, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0
Attachments: 5926.v10.patch, 5926.v11.patch, 5926.v13.patch, 5926.v14.patch, 5926.v6.patch, 5926.v8.patch, 5926.v9.patch

This is the continuation of the work done in HBASE-5844. But we can't apply exactly the same strategy: for the region server, there is a znode per region server, while for the master and backup master there is a single znode for both. So if we apply the same strategy as for a regionserver, we may have this scenario:
1) Master starts
2) Backup master starts
3) Master dies
4) ZK detects it
5) Backup master receives the update from ZK
6) Backup master creates the new master node and becomes the main master
7) Previous master script continues
8) Previous master script deletes the master node in ZK
9) => issue: we deleted the node just created by the new master
This should not happen often (usually the znode will be deleted soon enough), but it can happen.
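One possible guard against step 9, shown here only as an illustration and not taken from the patch, is to delete the master znode only when it still names the master that is cleaning up. In this sketch an in-memory map stands in for ZooKeeper, and the path and server names are made up.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch only: an in-memory map stands in for ZooKeeper; the znode path
// and the server names are hypothetical.
class MasterZNodeCleanupSketch {
    static final String MASTER_ZNODE = "/hbase/master";
    private final ConcurrentMap<String, String> znodes = new ConcurrentHashMap<>();

    void becomeMaster(String serverName) {
        znodes.put(MASTER_ZNODE, serverName);
    }

    /** Simulates ZK dropping the node when its owner's session dies. */
    void sessionExpired() {
        znodes.remove(MASTER_ZNODE);
    }

    /** Deletes the master znode only if it still names the given server. */
    boolean deleteIfOwner(String serverName) {
        // Atomic compare-and-remove: a node written by another master stays.
        return znodes.remove(MASTER_ZNODE, serverName);
    }

    String currentMaster() {
        return znodes.get(MASTER_ZNODE);
    }

    public static void main(String[] args) {
        MasterZNodeCleanupSketch zk = new MasterZNodeCleanupSketch();
        zk.becomeMaster("master-a");   // 1) master starts
        zk.sessionExpired();           // 3-4) master dies, ZK detects it
        zk.becomeMaster("master-b");   // 6) backup becomes the main master
        boolean deleted = zk.deleteIfOwner("master-a"); // 8) old script cleans up
        System.out.println(deleted + " " + zk.currentMaster());
    }
}
```

Against a real ZooKeeper ensemble the comparison and the delete are separate calls, so the content check would need to be paired with a versioned delete to stay race-free; that part is not modeled here.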
[jira] [Commented] (HBASE-5546) Master assigns region in the original region server when opening region failed
[ https://issues.apache.org/jira/browse/HBASE-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279032#comment-13279032 ]

Hudson commented on HBASE-5546:

Integrated in HBase-TRUNK #2899 (See [https://builds.apache.org/job/HBase-TRUNK/2899/])
HBASE-5546 Master assigns region in the original region server when opening region failed (Ashutosh) (Revision 1340187)

Result = FAILURE
ramkrishna :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
[ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279033#comment-13279033 ]

ramkrishna.s.vasudevan commented on HBASE-6050:

bq. So, we are talking about the parent region?
Yes, it is the parent region.
bq. If parent dir not present, log the fact that there is no target region into which to put the edits and move on I'd say
Yes, if the destination does not exist we can move on, and so we will consider the log splitting process successful. But I think the file created in the splitlog folder by the distributed log splitting will never be cleared. Maybe I need to check the code on that.
I will come up with a patch on this tomorrow.
[jira] [Commented] (HBASE-6011) Unable to start master in local mode
[ https://issues.apache.org/jira/browse/HBASE-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279035#comment-13279035 ]

Hudson commented on HBASE-6011:

Integrated in HBase-0.94 #200 (See [https://builds.apache.org/job/HBase-0.94/200/])
HBASE-6011. Addendum to support master mocking (Ram) (Revision 1340156)

Result = FAILURE
apurtell :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java

Unable to start master in local mode

Key: HBASE-6011
URL: https://issues.apache.org/jira/browse/HBASE-6011
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Fix For: 0.92.2, 0.96.0, 0.94.1
Attachments: 6011-v2.patch, 6011.patch, HBASE-6011_1.patch, HBASE-6011_addendum_0.92.patch, HBASE-6011_addendum_0.94.patch, HBASE-6011_addendum_trunk.patch

Got this trying to launch head of 0.94 branch in local mode from the build tree but it happens with trunk and 0.92 too:
{noformat}
12/05/15 19:35:45 ERROR master.HMasterCommandLine: Failed to start master
java.lang.ClassCastException: org.apache.hadoop.hbase.master.HMaster cannot be cast to org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster
	at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
	at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
	at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1761)
{noformat}
[jira] [Commented] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279042#comment-13279042 ]

Hadoop QA commented on HBASE-5926:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12528104/5926.v14.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 33 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1933//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1933//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1933//console

This message is automatically generated.
[jira] [Commented] (HBASE-5987) HFileBlockIndex improvement
[ https://issues.apache.org/jira/browse/HBASE-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279044#comment-13279044 ]

Mikhail Bautin commented on HBASE-5987:

@Ted: we put [89-fb] in the 89-fb versions of our code reviews for a particular JIRA, and omit it from trunk versions of code reviews for the same JIRA.

HFileBlockIndex improvement

Key: HBASE-5987
URL: https://issues.apache.org/jira/browse/HBASE-5987
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
Attachments: D3237.1.patch, D3237.2.patch, D3237.3.patch, D3237.4.patch, D3237.5.patch, D3237.6.patch, D3237.7.patch, D3237.8.patch, screen_shot_of_sequential_scan_profiling.png

Recently we found a performance problem: reads are quite slow when multiple requests read the same block of data or index. From the profiling, one of the causes is the IdLock contention, which has been addressed in HBASE-5898. Another issue is that the HFileScanner keeps asking the HFileBlockIndex for the data block location of each target key value during the scan process (reSeekTo), even though the target key value is already in the current data block. This makes certain index blocks very HOT, especially during a sequential scan.

To solve this issue, we propose the following solutions:

First, we propose to look ahead one more block index entry so that the HFileScanner knows the start key value of the next data block. If the target key value for the scan (reSeekTo) is smaller than the start kv of the next data block, the target key value is very likely in the current data block (if it is not in the current data block, then the start kv of the next data block should be returned. +Indexing on the start key has some defects here+), and the scanner shall NOT query the HFileBlockIndex in this case. On the contrary, if the target key value is bigger, then it shall query the HFileBlockIndex. This improvement should help to reduce the hotness of the HFileBlockIndex and avoid some unnecessary IdLock contention or index block cache lookups.

Second, we propose to push this idea a little further: the HFileBlockIndex shall index on the last key value of each data block instead of the start key value. The motivation is to solve the HBASE-4443 issue (avoid seeking to the previous block when the key you are interested in is the first one of a block) as well as +the defects mentioned above+. For example, if the target key value is smaller than the start key value of data block N, there is no way to know for sure whether the target key value is in data block N or N-1, so the scan has to seek from data block N-1. However, if the block index is based on the last key value of each data block and the target key value is between the last key value of data block N-1 and that of data block N, then the target key value is in data block N for sure. As long as HBase only supports forward scans, it makes more sense to index on the last key value than on the start key value.

Thanks Kannan and Mikhail for the insightful discussions and suggestions.
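The two proposals above can be illustrated with a toy index. This is a sketch under assumptions rather than HFile code: keys are plain long values instead of KeyValues, each array holds one entry per data block, and all method names are hypothetical.

```java
// Toy model: one long key per block boundary; all names are hypothetical.
class BlockIndexSketch {
    /** Proposal 1: stay in the current block while target < next block's first key. */
    static boolean canSkipIndexLookup(long target, long nextBlockFirstKey) {
        return target < nextBlockFirstKey;
    }

    /** Index on last keys: first block whose last key is >= target. */
    static int findByLastKey(long[] lastKeys, long target) {
        int lo = 0, hi = lastKeys.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (lastKeys[mid] < target) lo = mid + 1; else hi = mid;
        }
        return lo; // equals lastKeys.length when target is past every block
    }

    /** Index on first keys: last block whose first key is <= target, or -1. */
    static int findByFirstKey(long[] firstKeys, long target) {
        int lo = 0, hi = firstKeys.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (firstKeys[mid] <= target) lo = mid + 1; else hi = mid;
        }
        return lo - 1;
    }

    public static void main(String[] args) {
        // Three blocks holding keys 10..19, 30..39 and 50..59.
        long[] firstKeys = {10, 30, 50};
        long[] lastKeys = {19, 39, 59};
        long target = 25; // falls in the gap between block 0 and block 1
        System.out.println("first-key index starts at block " + findByFirstKey(firstKeys, target));
        System.out.println("last-key index starts at block " + findByLastKey(lastKeys, target));
        System.out.println("skip lookup while in block 0: " + canSkipIndexLookup(target, firstKeys[1]));
    }
}
```

With the first-key index, a forward seek for 25 has to start in block 0 and run off its end before reaching block 1; the last-key index lands on block 1 directly, which is the block holding the next key at or after the target.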
[jira] [Commented] (HBASE-6009) Changes for HBASE-5209 are technically incompatible
[ https://issues.apache.org/jira/browse/HBASE-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279043#comment-13279043 ]

stack commented on HBASE-6009:

Let's try to avoid serializing fat ClusterStatus objects twice. We have never made a guarantee that old clients could talk to new servers pre-0.96 (too hard in the Writables world). What is the scenario, David? You cannot update the clients? Wouldn't you have to update the clients anyway if you introduce a clusterstatus size? Pardon me if I'm not getting this.

Changes for HBASE-5209 are technically incompatible

Key: HBASE-6009
URL: https://issues.apache.org/jira/browse/HBASE-6009
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.1, 0.94.0
Reporter: David S. Wang

The additions to add backup masters to ClusterStatus are technically incompatible between clients and servers. Older clients will basically not read the extra bits that the newer server pushes for the backup masters, thus screwing up the serialization for the next blob in the pipe.

For the Writable, we can add a total size field for ClusterStatus at the beginning, or we can have start and end markers. I can make a patch for either approach; interested in whatever folks have to suggest. Would be good to get this in soon to limit the damage to 0.92.1 (don't know if we can get this in in time for 0.94.0).

Either change will make us forward-compatible starting with when the change goes in, but will not fix the backwards incompatibility, which we will have to mark with a release note as there have already been releases with this change.

Hopefully we can do this in a cleaner way when wire compat rolls around in 0.96.
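The "total size field at the beginning" option from the description can be sketched with plain java.io streams. This is a minimal illustration under assumptions, not the ClusterStatus Writable: the field names, the record layout, and the helper methods are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch only: a made-up record with a size prefix; not the real Writable.
class SizePrefixedRecordSketch {
    /** Newer writer: size prefix, two old fields, then the newer field. */
    static void writeNew(DataOutputStream out, String name, int liveServers,
                         String[] backupMasters) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream body = new DataOutputStream(buf);
        body.writeUTF(name);
        body.writeInt(liveServers);
        body.writeInt(backupMasters.length); // field unknown to old readers
        for (String m : backupMasters) {
            body.writeUTF(m);
        }
        body.flush();
        out.writeInt(buf.size());
        out.write(buf.toByteArray());
    }

    /** Older reader: knows only the first two fields and skips the rest. */
    static int readOld(DataInputStream in) throws IOException {
        int size = in.readInt();
        byte[] body = new byte[size];
        in.readFully(body); // consumes the whole record, known fields or not
        DataInputStream fields = new DataInputStream(new ByteArrayInputStream(body));
        fields.readUTF();
        return fields.readInt();
    }

    /** Writes a record followed by a marker and reads both back the old way. */
    static int[] demo() {
        try {
            ByteArrayOutputStream pipe = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(pipe);
            writeNew(out, "cluster", 3, new String[] {"backup-1", "backup-2"});
            out.writeInt(42); // the "next blob in the pipe"
            out.flush();
            DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(pipe.toByteArray()));
            int liveServers = readOld(in);
            int nextBlob = in.readInt();
            return new int[] {liveServers, nextBlob};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println(result[0] + " " + result[1]);
    }
}
```

Because the old reader consumes exactly the advertised number of bytes, the value written after the record is still read correctly. As the description notes, this only helps readers that already understand the size prefix.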
[jira] [Commented] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279040#comment-13279040 ]

Zhihong Yu commented on HBASE-5926:

ZNodeClearer.java isn't in source repo.
[jira] [Assigned] (HBASE-5882) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan reassigned HBASE-5882:

Assignee: Ashutosh Jindal (was: ramkrishna.s.vasudevan)
[jira] [Commented] (HBASE-5926) Delete the master znode after a master crash
[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279048#comment-13279048 ]

stack commented on HBASE-5926:

Fixed. Thanks Ted.
[jira] [Commented] (HBASE-5826) Improve sync of HLog edits
[ https://issues.apache.org/jira/browse/HBASE-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279049#comment-13279049 ]

Zhihong Yu commented on HBASE-5826:

@Todd: Do you have further comments?

Improve sync of HLog edits

Key: HBASE-5826
URL: https://issues.apache.org/jira/browse/HBASE-5826
Project: HBase
Issue Type: Improvement
Reporter: Zhihong Yu
Assignee: Todd Lipcon
Fix For: 0.96.0
Attachments: 5826-v2.txt, 5826-v3.txt, 5826-v4.txt, 5826.txt

HBASE-5782 solved the correctness issue for the sync of HLog edits. Todd provided a patch that would achieve higher throughput. This JIRA is a continuation of Todd's work submitted there.
[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.
[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279051#comment-13279051 ]

stack commented on HBASE-6046:

Nice work Ram.

Master retry on ZK session expiry causes inconsistent region assignments.

Key: HBASE-6046
URL: https://issues.apache.org/jira/browse/HBASE-6046
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Fix For: 0.92.2, 0.94.1

1) A ZK session timeout in the HMaster leads to bulk assignment even though all the RSs are online.
2) While doing bulk assignment, if the master goes down again and restarts (or a backup comes up), an attempt is made to reassign all the nodes created in ZK to the new RSs. This leads to double assignment.

We had 2800 regions; 1900 of them got double assignment, taking the region count to 4700.
[jira] [Commented] (HBASE-5453) Switch on-disk formats (reference files, HFile meta fields, etc) to PB
[ https://issues.apache.org/jira/browse/HBASE-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279053#comment-13279053 ]

Hadoop QA commented on HBASE-5453:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12528107/5453v12.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 30 new or modified tests.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 32 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
    org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
    org.apache.hadoop.hbase.replication.TestReplication
    org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
    org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
    org.apache.hadoop.hbase.coprocessor.TestClassLoading
    org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1934//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1934//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1934//console

This message is automatically generated.

Switch on-disk formats (reference files, HFile meta fields, etc) to PB

Key: HBASE-5453
URL: https://issues.apache.org/jira/browse/HBASE-5453
Project: HBase
Issue Type: Sub-task
Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: stack
Fix For: 0.96.0
Attachments: 5453.txt, 5453v10.txt, 5453v11.txt, 5453v11.txt, 5453v12.txt, 5453v2.txt, 5453v3.txt, 5453v6.txt, 5453v9.txt
[jira] [Commented] (HBASE-6048) Table Scan is failing if offheap cache enabled
[ https://issues.apache.org/jira/browse/HBASE-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279052#comment-13279052 ]

stack commented on HBASE-6048:

I don't think this is critical. The off-heap cache needs more work before it is usable. Until that is done, my guess is that few use this feature. It is still experimental (how do we make this clearer?).

Table Scan is failing if offheap cache enabled

Key: HBASE-6048
URL: https://issues.apache.org/jira/browse/HBASE-6048
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Critical

Table Scan is failing if offheap cache enabled.

{noformat}
2012-05-18 20:03:38,446 DEBUG org.apache.hadoop.hbase.io.hfile.HFileWriterV2: Initialized with CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false]
2012-05-18 20:03:38,446 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter type for hdfs://10.18.40.217:9000/hbase/ufdr/1d4656fd417a07c9171a38b8f4d08510/.tmp/03742024b28f443bb63cfc338d4ca422: CompoundBloomFilterWriter
2012-05-18 20:04:25,576 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 120.57 MB of total=1020.57 MB
2012-05-18 20:04:25,655 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=120.82 MB, total=907.89 MB, single=1012.11 MB, multi=6.12 MB, memory=0 KB
2012-05-18 20:04:25,733 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
java.lang.IllegalStateException: Schema metrics requested before table/CF name initialization: {tableName:null,cfName:null}
    at org.apache.hadoop.hbase.regionserver.metrics.SchemaConfigured.getSchemaMetrics(SchemaConfigured.java:182)
    at org.apache.hadoop.hbase.io.hfile.LruBlockCache.updateSizeMetrics(LruBlockCache.java:310)
    at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:274)
    at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:293)
    at org.apache.hadoop.hbase.io.hfile.DoubleBlockCache.getBlock(DoubleBlockCache.java:102)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:296)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:455)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:475)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:130)
    at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2001)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3274)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1604)
    at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1596)
    at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1572)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2310)
    at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
2012-05-18 20:04:25,828 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
{noformat}
[jira] [Updated] (HBASE-5882) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5882:

Attachment: hbase_5882_V4.patch

Updated patch addressing Ted's comments. I can commit this if the patch is ok.

Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor

Key: HBASE-5882
URL: https://issues.apache.org/jira/browse/HBASE-5882
Project: HBase
Issue Type: Improvement
Affects Versions: 0.90.6, 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: Ashutosh Jindal
Attachments: hbase_5882.patch, hbase_5882_V2.patch, hbase_5882_V3.patch, hbase_5882_V4.patch

Currently, when the master restarts and runs processRIT, any region found on a dead server is not given a new assignment; it is left for the timeout monitor to take care of. This case is more prominent if the node is found in RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan.
[jira] [Commented] (HBASE-5882) Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor
[ https://issues.apache.org/jira/browse/HBASE-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279060#comment-13279060 ]

Zhihong Yu commented on HBASE-5882:

I don't see what is different in patch v4 compared to patch v3.

Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor

Key: HBASE-5882
URL: https://issues.apache.org/jira/browse/HBASE-5882
Project: HBase
Issue Type: Improvement
Affects Versions: 0.90.6, 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: Ashutosh Jindal
Attachments: hbase_5882.patch, hbase_5882_V2.patch, hbase_5882_V3.patch, hbase_5882_V4.patch

Currently, when the master restarts and runs processRIT, any region found on a dead server is not given a new assignment; it is left for the timeout monitor to take care of. This case is more prominent if the node is found in RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan.