[jira] Commented: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods
[ https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978229#action_12978229 ] HBase Review Board commented on HBASE-3405: --- Message from: Gary Helmling ghelml...@gmail.com --- This is an automatically generated e-mail. To reply, visit: http://review.cloudera.org/r/1414/ --- Review request for hbase. Summary --- The patch allows alternate HRegionServer subclasses (and in the future coprocessors) to access HBaseRpcMetrics in order to register additional RPC methods into the metrics registry. The changes are: * add getRpcMetrics() to HBaseServer to allow access to the HBaseRpcMetrics instance * add getRpcMetrics() to RegionServerServices (and HRegionServer) to allow for future coprocessor accounting * add HBaseRpcMetrics.createMetrics(Class[], boolean) overload -- if boolean is true, the registered method names will be prefixed with the class name. This should help clarify origin for custom metrics and help prevent collisions. This addresses bug HBASE-3405. http://issues.apache.org/jira/browse/HBASE-3405 Diffs - src/main/java/org/apache/hadoop/hbase/ipc/HBaseRpcMetrics.java 19dbf2b src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 867a059 src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d7147b5 src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java 1309f93 src/test/java/org/apache/hadoop/hbase/regionserver/TestRpcMetrics.java PRE-CREATION Diff: http://review.cloudera.org/r/1414/diff Testing --- Created new org.apache.hadoop.hbase.regionserver.TestRpcMetrics test case to verify registration and incrementing of metrics from HRegionServer subclasses. Thanks, Gary Allow HBaseRpcMetrics to register custom interface methods -- Key: HBASE-3405 URL: https://issues.apache.org/jira/browse/HBASE-3405 Project: HBase Issue Type: Improvement Components: ipc Reporter: Gary Helmling Priority: Minor Opened from comments on HBASE-2997. 
James Kennedy notes: {quote} HBaseRpcMetrics is now logging a WARN message every time it encounters an unregistered RPC method. In my case I now get huge log files filled with these warnings because the hbase-trx transactional extension of HBase uses a subclass of HRegionServer that adds new interface methods. It's easy enough to tell log4j to ignore HBaseRpcMetrics output. However, it would be nice if the Server/HRegionServer HBaseRpcMetrics mechanism was more extensible so I could pass down new interfaces or grab the HBaseRpcMetrics object to add interfaces from up top... {quote} {{HBaseRpcMetrics}} already has a public method {{createMetrics(Class)}} to register method counters. We just need a way to expose the metrics class to allow the region server subclass to call it -- add a {{getMetrics()}} method to {{RpcServer}} and {{HBaseServer}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Gray updated HBASE-3417: - Attachment: HBASE-3417-v5.patch Latest version. Adds proper handling in the compaction path (previous patch only dealt with flush). Changes regex to use [0-9a-f]+ to be backwards compatible. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods
[ https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-3405: - Attachment: HBASE-3405_2_0.90.patch The patch allows alternate HRegionServer subclasses (and in the future coprocessors) to access HBaseRpcMetrics in order to register additional RPC methods into the metrics registry. The changes are: * add getRpcMetrics() to HBaseServer to allow access to the HBaseRpcMetrics instance * add getRpcMetrics() to RegionServerServices (and HRegionServer) to allow for future coprocessor accounting * add HBaseRpcMetrics.createMetrics(Class[], boolean) overload -- if boolean is true, the registered method names will be prefixed with the class name. This should help clarify origin for custom metrics and help prevent collisions. This version differs from the previous review.hbase.org version only by changing the delimiter character for class + method attribute names from '.' to '$'. According to JMX spec all attribute name chars must pass Character.isJavaIdentifierPart(). Should have checked that earlier... Allow HBaseRpcMetrics to register custom interface methods -- Key: HBASE-3405 URL: https://issues.apache.org/jira/browse/HBASE-3405 Project: HBase Issue Type: Improvement Components: ipc Reporter: Gary Helmling Priority: Minor Attachments: HBASE-3405_2_0.90.patch Opened from comments on HBASE-2997. James Kennedy notes: {quote} HBaseRpcMetrics is now logging a WARN message every time it encounters an unregistered RPC method. In my case I now get huge log files filled with these warnings because the hbase-trx transactional extension of HBase uses a subclass of HRegionServer that adds new interface methods. It's easy enough to tell log4j to ignore HBaseRpcMetrics output. However, it would be nice if the Server/HRegionServer HBaseRpcMetrics mechanism was more extensible so I could pass down new interfaces or grab the HBaseRpcMetrics object to add interfaces from up top... 
{quote} {{HBaseRpcMetrics}} already has a public method {{createMetrics(Class)}} to register method counters. We just need a way to expose the metrics class to allow the region server subclass to call it -- add a {{getMetrics()}} method to {{RpcServer}} and {{HBaseServer}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
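The naming rule described in the update above (class name prefix, '$' rather than '.' as delimiter) can be sketched roughly as follows. This is only an illustration, not the code from the patch: the class MetricNames, the interface TransactionalProtocol, and the use of the simple class name as prefix are all assumptions made for the example.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of HBase: shows how prefixed metric names could be built.
final class MetricNames {
  // Hypothetical custom RPC interface, standing in for something like the hbase-trx extension.
  interface TransactionalProtocol {
    void beginTransaction();
    void commit();
  }

  /** Returns one metric name per interface method, optionally prefixed with the class name. */
  static List<String> metricNames(Class<?> iface, boolean prefixWithClass) {
    List<String> names = new ArrayList<>();
    for (Method m : iface.getMethods()) {
      // '$' is used as the delimiter because '.' is not a Java identifier part.
      names.add(prefixWithClass ? iface.getSimpleName() + "$" + m.getName() : m.getName());
    }
    Collections.sort(names); // reflection order is unspecified, so sort for stable output
    return names;
  }

  /** Checks the constraint cited above: every character must be a Java identifier part. */
  static boolean isValidAttributeName(String name) {
    if (name.isEmpty()) {
      return false;
    }
    for (char c : name.toCharArray()) {
      if (!Character.isJavaIdentifierPart(c)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    for (String name : metricNames(TransactionalProtocol.class, true)) {
      System.out.println(name + " valid=" + isValidAttributeName(name));
    }
    System.out.println("with '.': valid=" + isValidAttributeName("TransactionalProtocol.commit"));
  }
}
```

With the '.' delimiter the check fails, which is the problem the second version of the patch works around.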
[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978427#action_12978427 ] stack commented on HBASE-3417: -- Should you add in A-Z in below just in case?
{code}
+Pattern.compile("^([0-9a-f]+)(?:\\.(.+))?$");
{code}
Yeah, don't replace the '-' I'd say:
{code}
+return new Path(dir, UUID.randomUUID().toString().replaceAll("-", "")
+    + ((suffix == null || suffix.length() <= 0) ? "" : suffix));
{code}
Then it's easy to go back to UUID. You might want to do that so you can use the 128 bits as key in LRU rather than String? Otherwise, +1 IFF verified backward compatible. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
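For readers following the naming discussion, here is a small standalone sketch of the scheme in the quoted snippets: a UUID with the dashes removed, an optional suffix, and the lowercase-hex pattern. The class StoreFileNames and its method names are made up for the example, and it returns a plain String instead of a Hadoop Path so that it runs without Hadoop on the classpath. It also shows that the 128-bit value can still be recovered from the dashless form.

```java
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not HBase code: illustrates the dashless-UUID naming scheme.
final class StoreFileNames {
  // Same expression as quoted in the comment: lowercase hex name, optional ".suffix".
  static final Pattern NAME = Pattern.compile("^([0-9a-f]+)(?:\\.(.+))?$");

  /** Builds a name from a UUID with the dashes removed, plus an optional suffix. */
  static String newName(UUID id, String suffix) {
    String hex = id.toString().replaceAll("-", "");
    return hex + ((suffix == null || suffix.isEmpty()) ? "" : suffix);
  }

  /** Rebuilds the UUID from its 32-character dashless hex form. */
  static UUID toUuid(String hex) {
    if (hex.length() != 32) {
      throw new IllegalArgumentException("expected 32 hex chars, got " + hex.length());
    }
    long mostSig = Long.parseUnsignedLong(hex.substring(0, 16), 16);
    long leastSig = Long.parseUnsignedLong(hex.substring(16), 16);
    return new UUID(mostSig, leastSig);
  }

  public static void main(String[] args) {
    UUID id = UUID.randomUUID();
    String name = newName(id, ".tmp");
    Matcher m = NAME.matcher(name);
    System.out.println(name + " matches=" + m.matches());
    System.out.println("round trip ok=" + toUuid(m.group(1)).equals(id));
  }
}
```

UUID.toString() only emits lowercase hex digits, so names generated this way match [0-9a-f]+ without needing A-F in the pattern.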
[jira] Commented: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods
[ https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978430#action_12978430 ] stack commented on HBASE-3405: -- +1 for commit to branch and trunk. Excellent. Allow HBaseRpcMetrics to register custom interface methods -- Key: HBASE-3405 URL: https://issues.apache.org/jira/browse/HBASE-3405 Project: HBase Issue Type: Improvement Components: ipc Reporter: Gary Helmling Priority: Minor Attachments: HBASE-3405_2_0.90.patch Opened from comments on HBASE-2997. James Kennedy notes: {quote} HBaseRpcMetrics is now logging a WARN message every time it encounters an unregistered RPC method. In my case I now get huge log files filled with these warnings because the hbase-trx transactional extension of HBase uses a subclass of HRegionServer that adds new interface methods. It's easy enough to tell log4j to ignore HBaseRpcMetrics output. However, it would be nice if the Server/HRegionServer HBaseRpcMetrics mechanism was more extensible so I could pass down new interfaces or grab the HBaseRpcMetrics object to add interfaces from up top... {quote} {{HBaseRpcMetrics}} already has a public method {{createMetrics(Class)}} to register method counters. We just need a way to expose the metrics class to allow the region server subclass to call it -- add a {{getMetrics()}} method to {{RpcServer}} and {{HBaseServer}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978431#action_12978431 ] Jonathan Gray commented on HBASE-3417: -- bq. Should you add in A-Z in below just in case? Could add A-F (uuid is hex chars only), but it's unnecessary. bq. Then its easy to go back to UUID. You might want to do that so you can use the 128 bits as key in LRU rather than String? LRU uses a String for block name. I think it looks much nicer with a consistent looking naming scheme for region directories and storefiles. And I don't think we need to be overly concerned about the size... If 64K block, in the LRU we're talking about 0.05% overhead (or like 0.02% over a more compact version). Also, traditional GUID format reminds me of Microsoft SQL Server :) This latest v5 patch is being deployed on a 100 node cluster with existing data tonight. Will commit once verified that it's working there. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. 
[jira] Commented: (HBASE-3379) Log splitting slowed by repeated attempts at connecting to downed datanode
[ https://issues.apache.org/jira/browse/HBASE-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978434#action_12978434 ] stack commented on HBASE-3379: -- @Hairong Is there a new API in branch-0.20-append that we should be calling? Thanks. Log splitting slowed by repeated attempts at connecting to downed datanode -- Key: HBASE-3379 URL: https://issues.apache.org/jira/browse/HBASE-3379 Project: HBase Issue Type: Bug Components: wal Reporter: stack Priority: Critical Testing if I kill RS and DN on a node, log splitting takes longer as we doggedly try connecting to the downed DN to get WAL blocks. Here's the cycle I see: {code} 2010-12-21 17:34:48,239 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_900551257176291912_1203821 failed because recovery from primary datanode 10.20.20.182:10010 failed 5 times.Pipeline was 10.20.20.184:10010, 10.20.20.186:10010, 10.20.20.182:10010. Will retry... 2010-12-21 17:34:50,240 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 0 time(s). 2010-12-21 17:34:51,241 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 1 time(s). 2010-12-21 17:34:52,241 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 2 time(s). 2010-12-21 17:34:53,242 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 3 time(s). 2010-12-21 17:34:54,243 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 4 time(s). 2010-12-21 17:34:55,243 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 5 time(s). 2010-12-21 17:34:56,244 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 6 time(s). 
2010-12-21 17:34:57,245 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 7 time(s). 2010-12-21 17:34:58,245 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 8 time(s). 2010-12-21 17:34:59,246 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 9 time(s). 2010-12-21 17:34:59,246 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #5 from primary datanode 10.20.20.182:10010 java.net.ConnectException: Call to /10.20.20.182:10020 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:767) at org.apache.hadoop.ipc.Client.call(Client.java:743) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy8.getProtocolVersion(Unknown Source) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383) ... {code} because recovery from primary datanode is done 5 times (hardcoded). Within these retries we'll do {code} this.maxRetries = conf.getInt("ipc.client.connect.max.retries", 10); {code} The hardcoding of 5 attempts we should get fixed and we should doc the ipc.client.connect.max.retries as important config. We should recommend bringing it down from default. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
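A back-of-the-envelope reading of the numbers in this report: 5 hardcoded recovery attempts, each making up to ipc.client.connect.max.retries (default 10) connect attempts, which the log timestamps show roughly one second apart. The sketch below only does that multiplication. The class name, the one-second spacing taken from the timestamps, and the lowered value of 3 are assumptions for illustration, not recommendations from the issue.

```java
// Hypothetical estimate, not Hadoop code: multiplies out the retry loops described above.
final class RecoveryWaitEstimate {
  /** Rough upper bound on seconds spent retrying connects to one downed datanode. */
  static long worstCaseSeconds(int recoveryAttempts, int connectRetries, long secondsPerRetry) {
    return (long) recoveryAttempts * connectRetries * secondsPerRetry;
  }

  public static void main(String[] args) {
    // 5 recovery attempts x 10 connect retries x ~1 second between retries
    System.out.println("default: " + worstCaseSeconds(5, 10, 1) + "s"); // prints "default: 50s"
    // Same loop with the connect retries lowered to an example value of 3
    System.out.println("lowered: " + worstCaseSeconds(5, 3, 1) + "s"); // prints "lowered: 15s"
  }
}
```

The estimate ignores the pause between recovery attempts and any connect timeout, so real delays may be longer.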
[jira] Updated: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods
[ https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-3405: - Attachment: HBASE-3405_trunk.patch Patch committed to trunk. Same as 0.90 patch with the addition of the getRpcMetrics() method to RpcServer interface. Allow HBaseRpcMetrics to register custom interface methods -- Key: HBASE-3405 URL: https://issues.apache.org/jira/browse/HBASE-3405 Project: HBase Issue Type: Improvement Components: ipc Reporter: Gary Helmling Priority: Minor Attachments: HBASE-3405_2_0.90.patch, HBASE-3405_trunk.patch Opened from comments on HBASE-2997. James Kennedy notes: {quote} HBaseRpcMetrics is now logging a WARN message every time it encounters an unregistered RPC method. In my case I now get huge log files filled with these warnings because the hbase-trx transactional extension of HBase uses a subclass of HRegionServer that adds new interface methods. It's easy enough to tell log4j to ignore HBaseRpcMetrics output. However, it would be nice if the Server/HRegionServer HBaseRpcMetrics mechanism was more extensible so I could pass down new interfaces or grab the HBaseRpcMetrics object to add interfaces from up top... {quote} {{HBaseRpcMetrics}} already has a public method {{createMetrics(Class)}} to register method counters. We just need a way to expose the metrics class to allow the region server subclass to call it -- add a {{getMetrics()}} method to {{RpcServer}} and {{HBaseServer}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods
[ https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling resolved HBASE-3405. -- Resolution: Fixed Fix Version/s: 0.90.0 Assignee: Gary Helmling Committed to 0.90 branch and trunk. Allow HBaseRpcMetrics to register custom interface methods -- Key: HBASE-3405 URL: https://issues.apache.org/jira/browse/HBASE-3405 Project: HBase Issue Type: Improvement Components: ipc Reporter: Gary Helmling Assignee: Gary Helmling Priority: Minor Fix For: 0.90.0 Attachments: HBASE-3405_2_0.90.patch, HBASE-3405_trunk.patch Opened from comments on HBASE-2997. James Kennedy notes: {quote} HBaseRpcMetrics is now logging a WARN message every time it encounters an unregistered RPC method. In my case I now get huge log files filled with these warnings because the hbase-trx transactional extension of HBase uses a subclass of HRegionServer that adds new interface methods. It's easy enough to tell log4j to ignore HBaseRpcMetrics output. However, it would be nice if the Server/HRegionServer HBaseRpcMetrics mechanism was more extensible so I could pass down new interfaces or grab the HBaseRpcMetrics object to add interfaces from up top... {quote} {{HBaseRpcMetrics}} already has a public method {{createMetrics(Class)}} to register method counters. We just need a way to expose the metrics class to allow the region server subclass to call it -- add a {{getMetrics()}} method to {{RpcServer}} and {{HBaseServer}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HBASE-3424) Add metrics for custom RPC methods called through HTable.coprocessorExec()
Add metrics for custom RPC methods called through HTable.coprocessorExec() -- Key: HBASE-3424 URL: https://issues.apache.org/jira/browse/HBASE-3424 Project: HBase Issue Type: Sub-task Reporter: Gary Helmling -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HBASE-3407) hbck should pause after fixing before re-checking state
[ https://issues.apache.org/jira/browse/HBASE-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HBASE-3407. Resolution: Fixed Fix Version/s: 0.90.0 Hadoop Flags: [Reviewed] hbck should pause after fixing before re-checking state --- Key: HBASE-3407 URL: https://issues.apache.org/jira/browse/HBASE-3407 Project: HBase Issue Type: Improvement Components: util Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.90.0 Attachments: hbase-3407.txt Right now when run with the -fix option, hbck tries to fix up the issue and then immediately re-runs itself to see if the fix worked. However most of the fixes require some other nodes in the cluster to take some action, which will take a couple of seconds (eg for them to notice a change in ZK and pick up the fixed region). So, hbck should pause for some amount of time in between fixing and re-running. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
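The behaviour asked for here, fix, then wait, then re-check, can be sketched as below. This is not the hbck code: FixThenRecheck, its parameters, and the simulated cluster in main are hypothetical and only illustrate why an immediate re-check tends to report a problem that is already on its way to being fixed.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not hbck itself: apply a fix, pause, then re-check consistency.
final class FixThenRecheck {
  /** Runs the fix, sleeps for pauseMillis, and returns the result of the re-check. */
  static boolean fixAndRecheck(Runnable fix, BooleanSupplier isConsistent, long pauseMillis) {
    fix.run();
    try {
      // Give other nodes time to notice the change (e.g. via ZK) and act on it.
      Thread.sleep(pauseMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt and re-check anyway
    }
    return isConsistent.getAsBoolean();
  }

  public static void main(String[] args) {
    // Simulated cluster: the fix only takes effect 100 ms after it is requested.
    long[] readyAt = {0L};
    Runnable fix = () -> readyAt[0] = System.nanoTime() + 100_000_000L;
    BooleanSupplier check = () -> System.nanoTime() - readyAt[0] >= 0;

    System.out.println("no pause:     consistent=" + fixAndRecheck(fix, check, 0L));
    System.out.println("300 ms pause: consistent=" + fixAndRecheck(fix, check, 300L));
  }
}
```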
[jira] Updated: (HBASE-3401) Region IPC operations should be high priority
[ https://issues.apache.org/jira/browse/HBASE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HBASE-3401: --- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Region IPC operations should be high priority - Key: HBASE-3401 URL: https://issues.apache.org/jira/browse/HBASE-3401 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.90.0 Attachments: hbase-3401.txt, hbase-3401.txt I manufactured an imbalanced cluster so one region server had 300 regions and the others had very few. I then ran balancer while hitting the high-load region server with YCSB. I observed that the rate of load shedding was VERY slow since the closeRegion IPCs were getting stuck at the back of the IPC queue. All of these important master-RS RPC calls should be set to high priority. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
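To illustrate the queueing problem described above, here is a toy call queue in which high-priority calls (closeRegion, say) jump ahead of ordinary client calls while FIFO order is kept within each priority level. It is a generic sketch using java.util.concurrent, not the mechanism from the attached patch. PriorityCallQueue and everything in it are hypothetical names.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the HBase IPC server: a two-level priority call queue.
final class PriorityCallQueue {
  private static final class Call implements Comparable<Call> {
    final String method;
    final boolean highPriority;
    final long seq; // arrival order, used to keep FIFO within a priority level

    Call(String method, boolean highPriority, long seq) {
      this.method = method;
      this.highPriority = highPriority;
      this.seq = seq;
    }

    @Override
    public int compareTo(Call other) {
      if (highPriority != other.highPriority) {
        return highPriority ? -1 : 1; // high-priority calls sort first
      }
      return Long.compare(seq, other.seq);
    }
  }

  private final PriorityBlockingQueue<Call> queue = new PriorityBlockingQueue<>();
  private final AtomicLong nextSeq = new AtomicLong();

  void submit(String method, boolean highPriority) {
    queue.add(new Call(method, highPriority, nextSeq.getAndIncrement()));
  }

  /** Returns the next method name to handle, or null if the queue is empty. */
  String next() {
    Call call = queue.poll();
    return call == null ? null : call.method;
  }

  public static void main(String[] args) {
    PriorityCallQueue q = new PriorityCallQueue();
    q.submit("get", false);
    q.submit("put", false);
    q.submit("closeRegion", true); // arrives last among these but is served first
    for (String m = q.next(); m != null; m = q.next()) {
      System.out.println(m);
    }
  }
}
```

A single sorted queue is the simplest way to show the effect. A server could equally reserve separate handlers for priority calls, which also protects them when every ordinary handler is busy.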
[jira] Commented: (HBASE-3379) Log splitting slowed by repeated attempts at connecting to downed datanode
[ https://issues.apache.org/jira/browse/HBASE-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978482#action_12978482 ] stack commented on HBASE-3379: -- bq. I will upload a patch in HBASE-3285 to make HBase to use the new API. Can I assume that HBase is bundled only with append 0.20? No. We have to work w/ CDH too. If you want, I can muck with it if you write outline of what to do. We already have some reflection going on that tests for presence of methods. I can do a bit more to find recoverLease. Thanks H. Log splitting slowed by repeated attempts at connecting to downed datanode -- Key: HBASE-3379 URL: https://issues.apache.org/jira/browse/HBASE-3379 Project: HBase Issue Type: Bug Components: wal Reporter: stack Priority: Critical Testing if I kill RS and DN on a node, log splitting takes longer as we doggedly try connecting to the downed DN to get WAL blocks. Here's the cycle I see: {code} 2010-12-21 17:34:48,239 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_900551257176291912_1203821 failed because recovery from primary datanode 10.20.20.182:10010 failed 5 times.Pipeline was 10.20.20.184:10010, 10.20.20.186:10010, 10.20.20.182:10010. Will retry... 2010-12-21 17:34:50,240 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 0 time(s). 2010-12-21 17:34:51,241 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 1 time(s). 2010-12-21 17:34:52,241 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 2 time(s). 2010-12-21 17:34:53,242 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 3 time(s). 2010-12-21 17:34:54,243 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 4 time(s). 
2010-12-21 17:34:55,243 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 5 time(s). 2010-12-21 17:34:56,244 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 6 time(s). 2010-12-21 17:34:57,245 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 7 time(s). 2010-12-21 17:34:58,245 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 8 time(s). 2010-12-21 17:34:59,246 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 9 time(s). 2010-12-21 17:34:59,246 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #5 from primary datanode 10.20.20.182:10010 java.net.ConnectException: Call to /10.20.20.182:10020 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:767) at org.apache.hadoop.ipc.Client.call(Client.java:743) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy8.getProtocolVersion(Unknown Source) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383) ... {code} because recovery from primary datanode is done 5 times (hardcoded). Within these retries we'll do {code} this.maxRetries = conf.getInt("ipc.client.connect.max.retries", 10); {code} The hardcoding of 5 attempts we should get fixed and we should doc the ipc.client.connect.max.retries as important config. We should recommend bringing it down from default. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3379) Log splitting slowed by repeated attempts at connecting to downed datanode
[ https://issues.apache.org/jira/browse/HBASE-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978496#action_12978496 ] Todd Lipcon commented on HBASE-3379: We can probably pull that new HDFS patch into CDH3 also. I'll put it on our list to evaluate. Log splitting slowed by repeated attempts at connecting to downed datanode -- Key: HBASE-3379 URL: https://issues.apache.org/jira/browse/HBASE-3379 Project: HBase Issue Type: Bug Components: wal Reporter: stack Priority: Critical Testing if I kill RS and DN on a node, log splitting takes longer as we doggedly try connecting to the downed DN to get WAL blocks. Here's the cycle I see: {code} 2010-12-21 17:34:48,239 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_900551257176291912_1203821 failed because recovery from primary datanode 10.20.20.182:10010 failed 5 times.Pipeline was 10.20.20.184:10010, 10.20.20.186:10010, 10.20.20.182:10010. Will retry... 2010-12-21 17:34:50,240 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 0 time(s). 2010-12-21 17:34:51,241 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 1 time(s). 2010-12-21 17:34:52,241 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 2 time(s). 2010-12-21 17:34:53,242 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 3 time(s). 2010-12-21 17:34:54,243 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 4 time(s). 2010-12-21 17:34:55,243 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 5 time(s). 2010-12-21 17:34:56,244 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 6 time(s). 
2010-12-21 17:34:57,245 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 7 time(s). 2010-12-21 17:34:58,245 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 8 time(s). 2010-12-21 17:34:59,246 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.20.20.182:10020. Already tried 9 time(s). 2010-12-21 17:34:59,246 WARN org.apache.hadoop.hdfs.DFSClient: Failed recovery attempt #5 from primary datanode 10.20.20.182:10010 java.net.ConnectException: Call to /10.20.20.182:10020 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:767) at org.apache.hadoop.ipc.Client.call(Client.java:743) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy8.getProtocolVersion(Unknown Source) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383) ... {code} because recovery from primary datanode is done 5 times (hardcoded). Within these retries we'll do {code} this.maxRetries = conf.getInt("ipc.client.connect.max.retries", 10); {code} The hardcoding of 5 attempts we should get fixed and we should doc the ipc.client.connect.max.retries as important config. We should recommend bringing it down from default. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-3403) Region orphaned after failure during split
[ https://issues.apache.org/jira/browse/HBASE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3403: - Attachment: 3403-v2.txt Made Todd suggested changes as well as handling of the possible scenario he describes. The fixup code was copied from 0.20. In 0.20, it was not possible to get into the state Todd postulates above but in 0.90, when fixup is done in shutdown handler and not in the 'catalogJanitor', it could happen. I added test that manufactures the postulated condition and we seem to be doing right thing. Region orphaned after failure during split -- Key: HBASE-3403 URL: https://issues.apache.org/jira/browse/HBASE-3403 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: Todd Lipcon Priority: Blocker Fix For: 0.90.0 Attachments: 3403-v2.txt, 3403.txt, broken-split.txt, hbck-fix-missing-in-meta.txt, master-logs.txt.gz ERROR: Region hdfs://haus01.sf.cloudera.com:11020/hbase-normal/usertable/2ad8df700eea55f70e02ea89178a65a2 on HDFS, but not listed in META or deployed on any region server. ERROR: Found inconsistency in table usertable Not sure how I got into this state, will look through logs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HBASE-3425) HMaster sends duplicate ports to regionserver in HServerAddress
HMaster sends duplicate ports to regionserver in HServerAddress

Key: HBASE-3425
URL: https://issues.apache.org/jira/browse/HBASE-3425
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.0
Reporter: Matt Corgan
Fix For: 0.90.0

On regionserver startup, the regionserver receives an HServerAddress from the master as a Writable. It's a string hostname and an integer port. Our master is also appending the port to the string, so when they are concatenated it becomes hadoopnode98:60020:60020 and the HServerAddress cannot be instantiated.

This should probably be fixed in the master as well, but I don't know where it happens. The attached patch handles it in the regionserver.

Regionserver startup log:

2011-01-06 15:55:48,813 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at hadoopmaster.hotpads.srv:6
2011-01-06 15:55:48,857 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at hadoopmaster.hotpads.srv:6 that we are up
2011-01-06 15:55:48,910 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.regionserver.address=HadoopNode98.hotpads.srv:60020
2011-01-06 15:55:48,910 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://hadoopmaster.hotpads.srv:54310/hbase
2011-01-06 15:55:48,910 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://hadoopmaster.hotpads.srv:54310/hbase
2011-01-06 15:55:48,945 ERROR org.apache.hadoop.hbase.HServerAddress: Could not resolve the DNS name of HadoopNode98.hotpads.srv:60020:60020
2011-01-06 15:55:48,945 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Failed initialization
2011-01-06 15:55:48,947 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed init
java.lang.IllegalArgumentException: Could not resolve the DNS name of HadoopNode98.hotpads.srv:60020:60020
    at org.apache.hadoop.hbase.HServerAddress.checkBindAddressCanBeResolved(HServerAddress.java:105)
    at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:76)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:798)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1394)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:522)
    at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
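The regionserver-side workaround described in this report (tolerating a hostname string that already carries the port) might look roughly like the following. This is a minimal sketch and not the attached patch: the class name HostPortNormalizer and the method stripDuplicatePort are hypothetical, and it assumes the duplicate always appears as a trailing ":port" that matches the separately supplied port.

```java
// Hypothetical helper, not part of HBase: illustrates one way a regionserver
// could tolerate a hostname that already has the port appended.
final class HostPortNormalizer {

    // Returns the hostname without a trailing ":<port>" when that suffix
    // equals the port that was passed separately; otherwise returns it unchanged.
    static String stripDuplicatePort(String hostname, int port) {
        String suffix = ":" + port;
        if (hostname.endsWith(suffix)) {
            return hostname.substring(0, hostname.length() - suffix.length());
        }
        return hostname;
    }
}
```

With this helper, the hostname "HadoopNode98.hotpads.srv:60020" and port 60020 would be combined as "HadoopNode98.hotpads.srv:60020" rather than with the port doubled. A hostname carrying a different port is deliberately left alone, since silently rewriting it could hide a real misconfiguration.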
[jira] Updated: (HBASE-3425) HMaster sends duplicate ports to regionserver in HServerAddress
[ https://issues.apache.org/jira/browse/HBASE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Corgan updated HBASE-3425:
Attachment: HBASE-3425[0.90.0].patch
[jira] Updated: (HBASE-3425) HMaster sends duplicate ports to regionserver in HServerAddress
[ https://issues.apache.org/jira/browse/HBASE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3425:
Fix Version/s: (was: 0.90.0) 0.90.1

Moving out of 0.90.0 for now.
[jira] Resolved: (HBASE-3418) Increment operations can break when qualifiers are split between memstore/snapshot and storefiles
[ https://issues.apache.org/jira/browse/HBASE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray resolved HBASE-3418.
Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed to branch and trunk. Thanks for the review, Stack.

Increment operations can break when qualifiers are split between memstore/snapshot and storefiles

Key: HBASE-3418
URL: https://issues.apache.org/jira/browse/HBASE-3418
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
Fix For: 0.90.0, 0.92.0
Attachments: HBASE-3418-v1.patch

This came out of investigating some observed counter resets. An optimization was added to check the memstore/snapshot first and to check storefiles only if not all counters were found. However, it looks like this introduced a bug when the columns for a given row/family in a single increment operation are spread across memstores and storefiles. The results of the get operations on memstores and storefiles are appended together, but the code that processes them expects them to be fully sorted. This can lead to invalid results.

The combined result of memstores + storefiles needs to be sorted.
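The fix described above (sort the appended memstore and storefile results before processing them) can be illustrated with a small sketch. It is not the committed patch: Cell is a hypothetical stand-in for an HBase KeyValue, CombinedResultSort is an invented name, and only the column qualifier is compared, using unsigned lexicographic byte order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for an HBase KeyValue: a column qualifier and a counter value.
final class Cell {
    final byte[] qualifier;
    final long value;

    Cell(byte[] qualifier, long value) {
        this.qualifier = qualifier;
        this.value = value;
    }
}

final class CombinedResultSort {

    // Orders cells by qualifier using unsigned lexicographic byte comparison.
    static final Comparator<Cell> BY_QUALIFIER =
        (a, b) -> compareBytes(a.qualifier, b.qualifier);

    static int compareBytes(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xff) - (y[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    // Appends the two partial results, then sorts the combined list so that
    // later processing can rely on a fully sorted input.
    static List<Cell> combine(List<Cell> fromMemstore, List<Cell> fromStorefiles) {
        List<Cell> all = new ArrayList<>(fromMemstore);
        all.addAll(fromStorefiles);
        all.sort(BY_QUALIFIER);
        return all;
    }
}
```

Without the sort, a memstore result for qualifiers a and c followed by a storefile result for b would be processed in the order a, c, b, which is the kind of ordering the description says breaks the increment logic.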
[jira] Commented: (HBASE-3419) If re-transition to OPENING during log replay fails, server aborts. Instead, should just cancel region open.
[ https://issues.apache.org/jira/browse/HBASE-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978553#action_12978553 ]

stack commented on HBASE-3419:

+1 on patch. Jon says he's running it too.

If re-transition to OPENING during log replay fails, server aborts. Instead, should just cancel region open.

Key: HBASE-3419
URL: https://issues.apache.org/jira/browse/HBASE-3419
Project: HBase
Issue Type: Bug
Components: regionserver, zookeeper
Affects Versions: 0.90.0, 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
Fix For: 0.90.1, 0.92.0
Attachments: HBASE-3419-v1.patch, HBASE-3419-v2.patch

On region open, a {{Progressable}} is used to tickle the ZK OPENING node so that the master does not time out the region open operation. Currently, if that tickle fails for some reason, the RegionServer aborts. However, it can be normal for an RS to have a region open operation aborted by the master, so the RS should handle this the way it does in other places: by reverting the open.

We had a cluster trip over some other issue (for some reason, the tickle was not happening within 30 seconds, so the master was timing out every time). Because of the abort on BadVersion, every single RS eventually aborted itself, taking down the cluster.
[jira] Created: (HBASE-3426) Allowing HRegionServer extensions to register interfaces for HBaseRPCMetrics
Allowing HRegionServer extensions to register interfaces for HBaseRPCMetrics

Key: HBASE-3426
URL: https://issues.apache.org/jira/browse/HBASE-3426
Project: HBase
Issue Type: Improvement
Components: master, regionserver
Affects Versions: 0.90.0
Reporter: James Kennedy
Fix For: 0.90.0

HBaseRpcMetrics is now logging a WARN message every time it encounters an unregistered RPC method. In my case I now get huge log files filled with these warnings because the hbase-trx transactional extension of HBase uses a subclass of HRegionServer that adds new interface methods.

It's easy enough to tell log4j to ignore HBaseRpcMetrics output. However, it would be nice if the Server/HRegionServer HBaseRpcMetrics mechanism was more extensible so I could pass down new interfaces or grab the HBaseRpcMetrics from the HBaseRPC object to add interfaces from up top...
[jira] Commented: (HBASE-3426) Allowing HRegionServer extensions to register interfaces for HBaseRPCMetrics
[ https://issues.apache.org/jira/browse/HBASE-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978584#action_12978584 ]

Gary Helmling commented on HBASE-3426:

Hi James, I already committed a change for this as HBASE-3405 to 0.90 and trunk. If there are any issues with the interface there, we can address them here. Otherwise we should close this as a dupe?
[jira] Created: (HBASE-3427) HBaseConfiguration problem ( config file location )
HBaseConfiguration problem ( config file location )

Key: HBASE-3427
URL: https://issues.apache.org/jira/browse/HBASE-3427
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.20.6
Environment: ubuntu, windows 7
Reporter: jo sung jun

Is this a bug? For example, my hbase-site.xml is at conf/hbase-site.xml, so I wrote the following:

    HBaseConfiguration config = new HBaseConfiguration();
    config.addResource(HBaseClient.class.getResource("/conf/hbase-default.xml"));
    config.addResource(HBaseClient.class.getResource("/conf/hbase-site.xml"));
    config.set("hbase.zookeeper.quorum", "192.168.0.203");
    config.set("hbase.zookeeper.property.clientPort", "2181");

But when I use IndexedTable, I cannot create a table (the call almost hangs), because IndexedTableAdmin.reIndexTable creates its HTable without the configuration.

IndexedTableAdmin.java:

    private void reIndexTable(byte[] baseTableName, IndexSpecification indexSpec) throws IOException {
        HTable baseTable = new HTable(baseTableName);
        .. ..
    }

I think it should use an HTable constructor that also takes the HBaseConfiguration.

Because of that, I moved the config files (***.xml) to src/ instead of src/conf/. This is confusing. We should either explain it better in the Javadoc or fix it.
[jira] Created: (HBASE-3428) HBase examples ( sources attachment )
HBase examples ( sources attachment )

Key: HBASE-3428
URL: https://issues.apache.org/jira/browse/HBASE-3428
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.20.6
Environment: Ubuntu, Windows 7 32bit
Reporter: jo sung jun

I am a new HBase user. Learning HBase was hard for me because I couldn't find well-made examples, so I wrote some examples for HBase beginners. I hope these examples help. In a few days I'll write more and better examples.

Sources: AbstractHBase.java, HBaseClient.java, StressTest.java

Thanks.
[jira] Updated: (HBASE-3428) HBase examples ( sources attachment )
[ https://issues.apache.org/jira/browse/HBASE-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jo sung jun updated HBASE-3428:
Attachment: StressTest.java, HBaseClient.java, AbstractHBase.java

Don't forget to add these properties to hbase-site.xml:

    <!-- Secondary Indexes in HBase. -->
    <property>
      <name>hbase.regionserver.class</name>
      <value>org.apache.hadoop.hbase.ipc.IndexedRegionInterface</value>
    </property>
    <property>
      <name>hbase.regionserver.impl</name>
      <value>org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegionServer</value>
    </property>

Thanks.
[jira] Resolved: (HBASE-3426) Allowing HRegionServer extensions to register interfaces for HBaseRPCMetrics
[ https://issues.apache.org/jira/browse/HBASE-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Kennedy resolved HBASE-3426.
Resolution: Duplicate

Yeah, duplicate of case 3405. Thanks for addressing this so quickly.
[jira] Commented: (HBASE-3425) HMaster sends duplicate ports to regionserver in HServerAddress
[ https://issues.apache.org/jira/browse/HBASE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978594#action_12978594 ]

Matt Corgan commented on HBASE-3425:

Hmm - it's not happening anymore. We had just changed the DNS entry that pointed to that IP address to give the regionserver a more friendly name.

While having the problem, the regionserver would log this line:

2011-01-06 15:55:48,910 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.regionserver.address=HadoopNode98.hotpads.srv:60020

But now that is gone and it logs this one:

2011-01-06 18:29:10,903 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us address to use. Was=HadoopNode98.hotpads.srv:60020, Now=HadoopNode98.hotpads.srv:60020

Looking through HMaster.regionServerStartup, it calls HBaseServer.getRemoteIp(). That asks the currently open socket for the InetAddress, and then things get hairy with PlainSocketImpl, InetAddress, etc. Something strange probably happened here due to the DNS modifications.

If for some reason any of those external networking classes returned the port number appended to the hostname, then HBase currently does nothing to catch it. HBase instantiates an InetSocketAddress, which doesn't validate the string, and that is passed to an HServerAddress, which also doesn't validate it.

Maybe the solution is not to handle the error, but at least to throw an exception earlier by validating the hostname string in HMaster.regionServerStartup.
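The fail-fast validation suggested in Matt Corgan's comment on HBASE-3425 could be sketched as follows. This is only an illustration under stated assumptions: HostnameCheck and requireBareHostname are hypothetical names, nothing here is taken from HMaster, and the check simply rejects any hostname containing a colon, which would also reject IPv6 literals.

```java
// Hypothetical validator, not part of HBase: rejects a hostname string that
// appears to carry a port, so the error surfaces where the string is produced.
final class HostnameCheck {

    // Returns the hostname unchanged if it looks like a bare hostname;
    // throws IllegalArgumentException if it is empty or contains a colon.
    static String requireBareHostname(String hostname) {
        if (hostname == null || hostname.isEmpty()) {
            throw new IllegalArgumentException("Hostname is empty");
        }
        if (hostname.indexOf(':') >= 0) {
            throw new IllegalArgumentException(
                "Hostname must not contain a port: " + hostname);
        }
        return hostname;
    }
}
```

Calling requireBareHostname("HadoopNode98.hotpads.srv:60020") throws immediately, which would point at the master-side source of the bad string instead of at a later DNS resolution failure on the regionserver.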
[jira] Commented: (HBASE-3425) HMaster sends duplicate ports to regionserver in HServerAddress
[ https://issues.apache.org/jira/browse/HBASE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978634#action_12978634 ]

stack commented on HBASE-3425:

It's just something I've never seen before. Probably no harm in your patch as is.
[jira] Commented: (HBASE-3428) HBase examples ( sources attachment )
[ https://issues.apache.org/jira/browse/HBASE-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978637#action_12978637 ]

stack commented on HBASE-3428:

Thank you for doing the fancy examples, Jo Sung Jun.