[jira] Commented: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods

2011-01-06 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978229#action_12978229
 ] 

HBase Review Board commented on HBASE-3405:
---

Message from: Gary Helmling ghelml...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1414/
---

Review request for hbase.


Summary
---

The patch allows alternate HRegionServer subclasses (and in the future 
coprocessors) to access HBaseRpcMetrics in order to register additional RPC 
methods into the metrics registry.

The changes are:
 * add getRpcMetrics() to HBaseServer to allow access to the HBaseRpcMetrics 
instance
 * add getRpcMetrics() to RegionServerServices (and HRegionServer) to allow for 
future coprocessor accounting
 * add HBaseRpcMetrics.createMetrics(Class[], boolean) overload -- if boolean 
is true, the registered method names will be prefixed with the class name.  
This should help clarify origin for custom metrics and help prevent collisions.
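
As a rough illustration of the prefixed-registration idea in the last bullet, here is a minimal sketch. It is not the patch itself: RpcMetricsSketch, TransactionalRegionInterface, and the use of the simple class name are assumptions made for the example, and the '$' delimiter follows the later revision posted on this issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the metrics registry; not the real HBaseRpcMetrics.
class RpcMetricsSketch {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Registers one counter per declared method of each interface.
    // When prefixWithClass is true, the counter name carries the class name.
    void createMetrics(Class<?>[] ifaces, boolean prefixWithClass) {
        for (Class<?> iface : ifaces) {
            for (java.lang.reflect.Method m : iface.getDeclaredMethods()) {
                counters.putIfAbsent(name(iface, m.getName(), prefixWithClass), new AtomicLong());
            }
        }
    }

    // Assumption: simple class name plus '$' plus method name.
    static String name(Class<?> iface, String method, boolean prefixWithClass) {
        return prefixWithClass ? iface.getSimpleName() + "$" + method : method;
    }

    // Returns false for an unregistered name (the case that triggers the WARN in the issue).
    boolean inc(String name) {
        AtomicLong c = counters.get(name);
        if (c == null) {
            return false;
        }
        c.incrementAndGet();
        return true;
    }

    // Returns the current count, or -1 when the name was never registered.
    long get(String name) {
        AtomicLong c = counters.get(name);
        return c == null ? -1L : c.get();
    }
}

// Hypothetical custom RPC interface, such as one added by an HRegionServer subclass.
interface TransactionalRegionInterface {
    void beginTransaction(long id);
    void commit(long id);
}
```

With the prefix enabled, a custom method such as commit is counted under a name that cannot collide with a built-in method of the same name.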


This addresses bug HBASE-3405.
http://issues.apache.org/jira/browse/HBASE-3405


Diffs
-

  src/main/java/org/apache/hadoop/hbase/ipc/HBaseRpcMetrics.java 19dbf2b 
  src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 867a059 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d7147b5 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java 
1309f93 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestRpcMetrics.java 
PRE-CREATION 

Diff: http://review.cloudera.org/r/1414/diff


Testing
---

Created new org.apache.hadoop.hbase.regionserver.TestRpcMetrics test case to 
verify registration and incrementing of metrics from HRegionServer subclasses.


Thanks,

Gary




 Allow HBaseRpcMetrics to register custom interface methods
 --

 Key: HBASE-3405
 URL: https://issues.apache.org/jira/browse/HBASE-3405
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: Gary Helmling
Priority: Minor

 Opened from comments on HBASE-2997.  James Kennedy notes:
 {quote}
 HBaseRpcMetrics is now logging a WARN message every time it encounters an 
 unregistered RPC method.
 In my case I now get huge log files filled with these warnings because the 
 hbase-trx transactional extension of HBase uses a subclass of HRegionServer 
 that adds new interface methods.
 It's easy enough to tell log4j to ignore HBaseRpcMetrics output.
 However, it would be nice if the Server/HRegionServer HBaseRpcMetrics 
 mechanism was more extensible so I could pass down new interfaces or grab the 
 HBaseRpcMetrics object to add interfaces from up top...
 {quote}
 {{HBaseRpcMetrics}} already has a public method {{createMetrics(Class)}} to 
 register method counters.  We just need a way to expose the metrics class to 
 allow the region server subclass to call it -- add a {{getMetrics()}} method 
 to {{RpcServer}} and {{HBaseServer}}.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-01-06 Thread Jonathan Gray (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Gray updated HBASE-3417:
-

Attachment: HBASE-3417-v5.patch

Latest version.  Adds proper handling in the compaction path (previous patch 
only dealt with flush).  Changes regex to use [0-9a-f]+ to be backwards 
compatible.

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, 
 HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).




[jira] Updated: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods

2011-01-06 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HBASE-3405:
-

Attachment: HBASE-3405_2_0.90.patch

The patch allows alternate HRegionServer subclasses (and in the future 
coprocessors) to access HBaseRpcMetrics in order to register additional RPC 
methods into the metrics registry.

The changes are:
 * add getRpcMetrics() to HBaseServer to allow access to the HBaseRpcMetrics 
instance
 * add getRpcMetrics() to RegionServerServices (and HRegionServer) to allow for 
future coprocessor accounting
 * add HBaseRpcMetrics.createMetrics(Class[], boolean) overload -- if boolean 
is true, the registered method names will be prefixed with the class name.  
This should help clarify origin for custom metrics and help prevent collisions.


This version differs from the previous review.hbase.org version only by 
changing the delimiter character for class + method attribute names from '.' to 
'$'.  According to JMX spec all attribute name chars must pass 
Character.isJavaIdentifierPart().  Should have checked that earlier...
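
The delimiter change can be checked directly against the constraint as stated in this message. A small sketch follows; AttributeNameCheck and the sample attribute names are made up for illustration.

```java
// Hypothetical helper: checks a candidate attribute name character by character.
class AttributeNameCheck {
    // True when the name is non-empty and every character passes
    // Character.isJavaIdentifierPart().
    static boolean allIdentifierParts(String name) {
        for (int i = 0; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return !name.isEmpty();
    }
}
```

A name joined with '.' fails this check, while the same name joined with '$' passes, because '$' counts as an identifier part in Java and '.' does not.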

 Allow HBaseRpcMetrics to register custom interface methods
 --

 Key: HBASE-3405
 URL: https://issues.apache.org/jira/browse/HBASE-3405
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: Gary Helmling
Priority: Minor
 Attachments: HBASE-3405_2_0.90.patch





[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-01-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978427#action_12978427
 ] 

stack commented on HBASE-3417:
--

Should you add in A-Z in below just in case?

{code}
+Pattern.compile("^([0-9a-f]+)(?:\\.(.+))?$");
{code}

Yeah, don't replace the '-' I'd say:

{code}
+return new Path(dir, UUID.randomUUID().toString().replaceAll("-", "")
++ ((suffix == null || suffix.length() <= 0) ? "" : suffix));
{code}

Then it's easy to go back to UUID.  You might want to do that so you can use the 
128 bits as key in LRU rather than String?

Otherwise, +1 IFF verified backward compatible.
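
For reference, a small sketch of the two points above: a dash-free UUID name matches the quoted pattern, and the 128-bit UUID can still be recovered from the 32 hex characters by re-inserting the dashes. StoreFileNameSketch and its method names are hypothetical and are not taken from the patch.

```java
import java.util.UUID;
import java.util.regex.Pattern;

// Hypothetical sketch; not the code under review.
class StoreFileNameSketch {
    // Same shape as the quoted pattern: hex name, then an optional ".suffix".
    static final Pattern NAME = Pattern.compile("^([0-9a-f]+)(?:\\.(.+))?$");

    // Builds a random name from a UUID with the dashes removed.
    static String randomName(String suffix) {
        String hex = UUID.randomUUID().toString().replaceAll("-", "");
        return hex + ((suffix == null || suffix.isEmpty()) ? "" : suffix);
    }

    // Re-inserts dashes in the 8-4-4-4-12 layout to recover the UUID.
    static UUID toUuid(String hex32) {
        if (hex32.length() != 32) {
            throw new IllegalArgumentException("expected 32 hex chars");
        }
        return UUID.fromString(hex32.substring(0, 8) + "-" + hex32.substring(8, 12) + "-"
            + hex32.substring(12, 16) + "-" + hex32.substring(16, 20) + "-"
            + hex32.substring(20));
    }
}
```

Keeping the dashes would skip the toUuid step; dropping them keeps the name purely hexadecimal, which is what the pattern expects.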




 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, 
 HBASE-3417-v5.patch





[jira] Commented: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods

2011-01-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978430#action_12978430
 ] 

stack commented on HBASE-3405:
--

+1 for commit to branch and trunk.  Excellent.

 Allow HBaseRpcMetrics to register custom interface methods
 --

 Key: HBASE-3405
 URL: https://issues.apache.org/jira/browse/HBASE-3405
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: Gary Helmling
Priority: Minor
 Attachments: HBASE-3405_2_0.90.patch





[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-01-06 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978431#action_12978431
 ] 

Jonathan Gray commented on HBASE-3417:
--

bq. Should you add in A-Z in below just in case?
Could add A-F (uuid is hex chars only), but it's unnecessary.

bq. Then it's easy to go back to UUID. You might want to do that so you can use 
the 128 bits as key in LRU rather than String?
LRU uses a String for block name.  I think it looks much nicer with a 
consistent looking naming scheme for region directories and storefiles.  And I 
don't think we need to be overly concerned about the size... If 64K block, in 
the LRU we're talking about 0.05% overhead (or like 0.02% over a more compact 
version).
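
The percentages can be reproduced with simple arithmetic, under assumptions that are not stated in the message: a 64K block is taken as 65536 bytes, the hex name as 32 bytes (one byte per character), and the compact form as the raw 16 bytes of a 128-bit UUID. BlockNameOverhead is a hypothetical helper.

```java
// Hypothetical helper for the overhead estimate; sizes are assumptions.
class BlockNameOverhead {
    // Name size as a percentage of the block size.
    static double percent(int nameBytes, int blockBytes) {
        return 100.0 * nameBytes / blockBytes;
    }
}
```

Under these assumptions, percent(32, 65536) is about 0.049, and the extra cost of the hex name over the compact form, percent(32 - 16, 65536), is about 0.024, which is in line with the rough figures quoted.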

Also, traditional GUID format reminds me of Microsoft SQL Server :)

This latest v5 patch is being deployed on a 100 node cluster with existing data 
tonight.  Will commit once verified that it's working there.

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, 
 HBASE-3417-v5.patch





[jira] Commented: (HBASE-3379) Log splitting slowed by repeated attempts at connecting to downed datanode

2011-01-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978434#action_12978434
 ] 

stack commented on HBASE-3379:
--

@Hairong Is there a new API in branch-0.20-append that we should be calling?  
Thanks.

 Log splitting slowed by repeated attempts at connecting to downed datanode
 --

 Key: HBASE-3379
 URL: https://issues.apache.org/jira/browse/HBASE-3379
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: stack
Priority: Critical

 Testing if I kill RS and DN on a node, log splitting takes longer as we 
 doggedly try connecting to the downed DN to get WAL blocks.  Here's the cycle 
 I see:
 {code}
 2010-12-21 17:34:48,239 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery 
 for block blk_900551257176291912_1203821 failed  because recovery from 
 primary datanode 10.20.20.182:10010 failed 5 times.Pipeline was 
 10.20.20.184:10010, 10.20.20.186:10010, 10.20.20.182:10010. Will retry...
 2010-12-21 17:34:50,240 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.20.20.182:10020. Already tried 0 time(s).
 2010-12-21 17:34:51,241 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.20.20.182:10020. Already tried 1 time(s).
 2010-12-21 17:34:52,241 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.20.20.182:10020. Already tried 2 time(s).
 2010-12-21 17:34:53,242 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.20.20.182:10020. Already tried 3 time(s).
 2010-12-21 17:34:54,243 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.20.20.182:10020. Already tried 4 time(s).
 2010-12-21 17:34:55,243 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.20.20.182:10020. Already tried 5 time(s).
 2010-12-21 17:34:56,244 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.20.20.182:10020. Already tried 6 time(s).
 2010-12-21 17:34:57,245 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.20.20.182:10020. Already tried 7 time(s).
 2010-12-21 17:34:58,245 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.20.20.182:10020. Already tried 8 time(s).
 2010-12-21 17:34:59,246 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: /10.20.20.182:10020. Already tried 9 time(s).
 2010-12-21 17:34:59,246 WARN org.apache.hadoop.hdfs.DFSClient: Failed 
 recovery attempt #5 from primary datanode 10.20.20.182:10010
 java.net.ConnectException: Call to /10.20.20.182:10020 failed on connection 
 exception: java.net.ConnectException: Connection refused
 at org.apache.hadoop.ipc.Client.wrapException(Client.java:767)
 at org.apache.hadoop.ipc.Client.call(Client.java:743)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
 at $Proxy8.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383)
 ...
 {code}
 because recovery from primary datanode is done 5 times (hardcoded).  Within 
 these retries we'll do
 {code}
 this.maxRetries = conf.getInt("ipc.client.connect.max.retries", 10);
 {code}
 We should get the hardcoded 5 attempts fixed, and we should doc 
 ipc.client.connect.max.retries as an important config.  We should recommend 
 bringing it down from the default.
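
 To see why lowering the setting helps, here is a back-of-the-envelope sketch. It assumes roughly one second between connect retries, as the timestamps in the log above suggest. RetryBudget is a hypothetical helper, not Hadoop code.

```java
// Hypothetical helper: estimates time spent retrying a single dead datanode.
class RetryBudget {
    // Total connect attempts: recovery attempts times connect retries per attempt.
    static int connectAttempts(int recoveryAttempts, int maxConnectRetries) {
        return recoveryAttempts * maxConnectRetries;
    }

    // Approximate wall-clock seconds, given an assumed per-retry delay.
    static long approxSeconds(int recoveryAttempts, int maxConnectRetries, long secondsPerRetry) {
        return (long) connectAttempts(recoveryAttempts, maxConnectRetries) * secondsPerRetry;
    }
}
```

 With 5 recovery attempts and the default of 10 connect retries, that is 50 connect attempts, or on the order of 50 seconds for one block. Bringing the retries down to 3 (an arbitrary example value) would cut that to about 15 seconds.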




[jira] Updated: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods

2011-01-06 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HBASE-3405:
-

Attachment: HBASE-3405_trunk.patch

Patch committed to trunk.  Same as 0.90 patch with the addition of the 
getRpcMetrics() method to RpcServer interface.

 Allow HBaseRpcMetrics to register custom interface methods
 --

 Key: HBASE-3405
 URL: https://issues.apache.org/jira/browse/HBASE-3405
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: Gary Helmling
Priority: Minor
 Attachments: HBASE-3405_2_0.90.patch, HBASE-3405_trunk.patch





[jira] Resolved: (HBASE-3405) Allow HBaseRpcMetrics to register custom interface methods

2011-01-06 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling resolved HBASE-3405.
--

   Resolution: Fixed
Fix Version/s: 0.90.0
 Assignee: Gary Helmling

Committed to 0.90 branch and trunk.

 Allow HBaseRpcMetrics to register custom interface methods
 --

 Key: HBASE-3405
 URL: https://issues.apache.org/jira/browse/HBASE-3405
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: Gary Helmling
Assignee: Gary Helmling
Priority: Minor
 Fix For: 0.90.0

 Attachments: HBASE-3405_2_0.90.patch, HBASE-3405_trunk.patch





[jira] Created: (HBASE-3424) Add metrics for custom RPC methods called through HTable.coprocessorExec()

2011-01-06 Thread Gary Helmling (JIRA)
Add metrics for custom RPC methods called through HTable.coprocessorExec()
--

 Key: HBASE-3424
 URL: https://issues.apache.org/jira/browse/HBASE-3424
 Project: HBase
  Issue Type: Sub-task
Reporter: Gary Helmling







[jira] Resolved: (HBASE-3407) hbck should pause after fixing before re-checking state

2011-01-06 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HBASE-3407.


   Resolution: Fixed
Fix Version/s: 0.90.0
 Hadoop Flags: [Reviewed]

 hbck should pause after fixing before re-checking state
 ---

 Key: HBASE-3407
 URL: https://issues.apache.org/jira/browse/HBASE-3407
 Project: HBase
  Issue Type: Improvement
  Components: util
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.90.0

 Attachments: hbase-3407.txt


 Right now when run with the -fix option, hbck tries to fix up the issue and 
 then immediately re-runs itself to see if the fix worked. However most of the 
 fixes require some other nodes in the cluster to take some action, which will 
 take a couple of seconds (eg for them to notice a change in ZK and pick up 
 the fixed region).
 So, hbck should pause for some amount of time in between fixing and 
 re-running.
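
 The proposed behaviour can be sketched as fix, pause, then re-check. FixThenRecheck and its parameters are hypothetical; the real hbck code and its pause length are not shown in this message.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of "pause after fixing before re-checking".
class FixThenRecheck {
    // Applies the fix only when needed, waits pauseMillis so other nodes can
    // react, then returns the result of the re-check.
    static boolean fixPauseRecheck(Runnable fix, BooleanSupplier isConsistent, long pauseMillis) {
        if (isConsistent.getAsBoolean()) {
            return true;
        }
        fix.run();
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
            // Preserve the interrupt status and fall through to the re-check.
            Thread.currentThread().interrupt();
        }
        return isConsistent.getAsBoolean();
    }
}
```

 A fixed sleep is the simplest option; polling the check until a deadline would be an alternative if a single pause proves too short or too long.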




[jira] Updated: (HBASE-3401) Region IPC operations should be high priority

2011-01-06 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-3401:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

 Region IPC operations should be high priority
 -

 Key: HBASE-3401
 URL: https://issues.apache.org/jira/browse/HBASE-3401
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.90.0

 Attachments: hbase-3401.txt, hbase-3401.txt


 I manufactured an imbalanced cluster so one region server had 300 regions and 
 the others had very few. I then ran balancer while hitting the high-load 
 region server with YCSB. I observed that the rate of load shedding was VERY 
 slow since the closeRegion IPCs were getting stuck at the back of the IPC 
 queue.
 All of these important master-RS RPC calls should be set to high priority.
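
 One way to picture the effect is a call queue ordered by priority first and arrival order second. The sketch below is only an illustration of that idea and is not how the attached patch implements it; PriorityCallQueueSketch and the priority constants are hypothetical.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: high-priority calls jump ahead of queued normal calls.
class PriorityCallQueueSketch {
    static final int HIGH = 0;
    static final int NORMAL = 1;

    static final class Call implements Comparable<Call> {
        private static final AtomicLong SEQ = new AtomicLong();
        final String method;
        final int priority;
        final long seq = SEQ.getAndIncrement(); // arrival order, used as tie-breaker

        Call(String method, int priority) {
            this.method = method;
            this.priority = priority;
        }

        @Override
        public int compareTo(Call o) {
            int c = Integer.compare(priority, o.priority);
            return c != 0 ? c : Long.compare(seq, o.seq);
        }
    }

    private final PriorityBlockingQueue<Call> queue = new PriorityBlockingQueue<>();

    void submit(String method, int priority) {
        queue.add(new Call(method, priority));
    }

    // Returns the next method name to handle, or null when the queue is empty.
    String next() {
        Call c = queue.poll();
        return c == null ? null : c.method;
    }
}
```

 In this model a closeRegion call submitted behind a backlog of client reads and writes is handled next rather than waiting at the back of the queue.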




[jira] Commented: (HBASE-3379) Log splitting slowed by repeated attempts at connecting to downed datanode

2011-01-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978482#action_12978482
 ] 

stack commented on HBASE-3379:
--

bq. I will upload a patch in HBASE-3285 to make HBase to use the new API. Can I 
assume that HBase is bundled only with append 0.20?

No.  We have to work w/ CDH too.  If you want, I can muck with it if you write 
outline of what to do.  We already have some reflection going on that tests for 
presence of methods.  I can do a bit more to find recoverLease.  Thanks  H.
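
A minimal sketch of the reflection approach mentioned above: probe for the method once and fall back when it is absent. OptionalMethodProbe, OldFs, and NewFs are hypothetical stand-ins, and the String parameter is an assumption; the real filesystem classes and the recoverLease signature are not shown in this thread.

```java
import java.lang.reflect.Method;

// Hypothetical helper: looks up an optional public method by name.
class OptionalMethodProbe {
    // Returns the public method if the class has it, otherwise null.
    static Method find(Class<?> cls, String name, Class<?>... paramTypes) {
        try {
            return cls.getMethod(name, paramTypes);
        } catch (NoSuchMethodException e) {
            return null;
        }
    }
}

// Hypothetical stand-ins for an older and a newer filesystem client.
class OldFs {
    public boolean append(String path) {
        return true;
    }
}

class NewFs extends OldFs {
    public boolean recoverLease(String path) {
        return true;
    }
}
```

The caller would check for a non-null result from find(fsClass, "recoverLease", String.class) and use the existing code path otherwise, so the same build works against either client.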

 Log splitting slowed by repeated attempts at connecting to downed datanode
 --

 Key: HBASE-3379
 URL: https://issues.apache.org/jira/browse/HBASE-3379
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: stack
Priority: Critical




[jira] Commented: (HBASE-3379) Log splitting slowed by repeated attempts at connecting to downed datanode

2011-01-06 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978496#action_12978496
 ] 

Todd Lipcon commented on HBASE-3379:


We can probably pull that new HDFS patch into CDH3 also. I'll put it on our 
list to evaluate.

 Log splitting slowed by repeated attempts at connecting to downed datanode
 --

 Key: HBASE-3379
 URL: https://issues.apache.org/jira/browse/HBASE-3379
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: stack
Priority: Critical




[jira] Updated: (HBASE-3403) Region orphaned after failure during split

2011-01-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3403:
-

Attachment: 3403-v2.txt

Made Todd's suggested changes and added handling for the possible scenario he 
describes.  The fixup code was copied from 0.20.  In 0.20, it was not possible 
to get into the state Todd postulates above, but in 0.90, where fixup is done in 
the shutdown handler and not in the 'catalogJanitor', it could happen.  I added a 
test that manufactures the postulated condition, and we seem to be doing the right 
thing.

 Region orphaned after failure during split
 --

 Key: HBASE-3403
 URL: https://issues.apache.org/jira/browse/HBASE-3403
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: Todd Lipcon
Priority: Blocker
 Fix For: 0.90.0

 Attachments: 3403-v2.txt, 3403.txt, broken-split.txt, 
 hbck-fix-missing-in-meta.txt, master-logs.txt.gz


 ERROR: Region 
 hdfs://haus01.sf.cloudera.com:11020/hbase-normal/usertable/2ad8df700eea55f70e02ea89178a65a2
  on HDFS, but not listed in META or deployed on any region server.
 ERROR: Found inconsistency in table usertable
 Not sure how I got into this state, will look through logs.




[jira] Created: (HBASE-3425) HMaster sends duplicate ports to regionserver in HServerAddress

2011-01-06 Thread Matt Corgan (JIRA)
HMaster sends duplicate ports to regionserver in HServerAddress
---

 Key: HBASE-3425
 URL: https://issues.apache.org/jira/browse/HBASE-3425
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.0
Reporter: Matt Corgan
 Fix For: 0.90.0


On regionserver startup, the regionserver receives an HServerAddress from the 
master as a Writable.  It's a string hostname and an integer port.  Our master 
is also appending the port to the string, so when they are concatenated it 
becomes hadoopnode98:60020:60020 and the HServerAddress cannot be instantiated. 
 

This should probably be fixed in the master as well, but I don't know where it 
happens.  The attached patch handles it in the regionserver.

Regionserver startup log:

2011-01-06 15:55:48,813 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at 
hadoopmaster.hotpads.srv:6
2011-01-06 15:55:48,857 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 
hadoopmaster.hotpads.srv:6 that we are up
2011-01-06 15:55:48,910 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: 
hbase.regionserver.address=HadoopNode98.hotpads.srv:60020
2011-01-06 15:55:48,910 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: 
fs.default.name=hdfs://hadoopmaster.hotpads.srv:54310/hbase
2011-01-06 15:55:48,910 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: 
hbase.rootdir=hdfs://hadoopmaster.hotpads.srv:54310/hbase
2011-01-06 15:55:48,945 ERROR org.apache.hadoop.hbase.HServerAddress: Could not 
resolve the DNS name of HadoopNode98.hotpads.srv:60020:60020
2011-01-06 15:55:48,945 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Failed 
initialization
2011-01-06 15:55:48,947 ERROR 
org.apache.hadoop.hbase.regionserver.HRegionServer: Failed init
java.lang.IllegalArgumentException: Could not resolve the DNS name of 
HadoopNode98.hotpads.srv:60020:60020
at 
org.apache.hadoop.hbase.HServerAddress.checkBindAddressCanBeResolved(HServerAddress.java:105)
at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:76)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:798)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1394)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:522)
at java.lang.Thread.run(Thread.java:619)
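
A regionserver-side workaround for the doubled port described above could look roughly like the following. This is a minimal sketch, not the attached patch: `HostnameFixup` and `stripDuplicatePort` are hypothetical names, and it assumes the only malformed case is a hostname string that already ends in the same `:port` that is about to be appended.

```java
/** Hypothetical helper; the attached patch may take a different approach. */
final class HostnameFixup {
    private HostnameFixup() {}

    /** Removes a trailing ":port" from the hostname when it repeats the given port. */
    static String stripDuplicatePort(String hostname, int port) {
        String suffix = ":" + port;
        return hostname.endsWith(suffix)
            ? hostname.substring(0, hostname.length() - suffix.length())
            : hostname;
    }

    public static void main(String[] args) {
        // The hostname as received from the master, already carrying the port.
        String host = stripDuplicatePort("HadoopNode98.hotpads.srv:60020", 60020);
        // Appending the port now yields a single ":60020".
        System.out.println(host + ":" + 60020);
    }
}
```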




[jira] Updated: (HBASE-3425) HMaster sends duplicate ports to regionserver in HServerAddress

2011-01-06 Thread Matt Corgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Corgan updated HBASE-3425:
---

Attachment: HBASE-3425[0.90.0].patch

 HMaster sends duplicate ports to regionserver in HServerAddress
 ---

 Key: HBASE-3425
 URL: https://issues.apache.org/jira/browse/HBASE-3425
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.0
Reporter: Matt Corgan
 Fix For: 0.90.0

 Attachments: HBASE-3425[0.90.0].patch






[jira] Updated: (HBASE-3425) HMaster sends duplicate ports to regionserver in HServerAddress

2011-01-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3425:
-

Fix Version/s: (was: 0.90.0)
   0.90.1

Moving out of 0.90.0 for now.

 HMaster sends duplicate ports to regionserver in HServerAddress
 ---

 Key: HBASE-3425
 URL: https://issues.apache.org/jira/browse/HBASE-3425
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.0
Reporter: Matt Corgan
 Fix For: 0.90.1

 Attachments: HBASE-3425[0.90.0].patch






[jira] Resolved: (HBASE-3418) Increment operations can break when qualifiers are split between memstore/snapshot and storefiles

2011-01-06 Thread Jonathan Gray (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Gray resolved HBASE-3418.
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed to branch and trunk.  Thanks for review Stack.

 Increment operations can break when qualifiers are split between 
 memstore/snapshot and storefiles
 -

 Key: HBASE-3418
 URL: https://issues.apache.org/jira/browse/HBASE-3418
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.90.0, 0.92.0

 Attachments: HBASE-3418-v1.patch


 Doing investigation around some observed resetting counter behavior.
 An optimization was added to check memstore/snapshots first and then check 
 storefiles if not all counters were found.  However it looks like this 
 introduced a bug when columns for a given row/family in a single increment 
 operation are spread across memstores and storefiles.
 The results from get operations on both memstores and storefiles are appended 
 together but when processed are expected to be fully sorted.  This can lead 
 to invalid results.
 Need to sort the combined result of memstores + storefiles.
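
The fix described above, sorting after appending, can be sketched as follows. This is an illustration only: `CombinedResults` and `mergeSorted` are hypothetical names, plain strings stand in for column qualifiers, and natural string order stands in for whatever comparator the real code uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; not the committed HBASE-3418 patch. */
final class CombinedResults {
    private CombinedResults() {}

    /** Appends both partial results, then sorts so later processing sees one ordered list. */
    static <T> List<T> mergeSorted(List<T> fromMemstore, List<T> fromStoreFiles,
                                   Comparator<? super T> order) {
        List<T> combined = new ArrayList<>(fromMemstore.size() + fromStoreFiles.size());
        combined.addAll(fromMemstore);
        combined.addAll(fromStoreFiles);
        // Two sorted lists appended together are not necessarily sorted overall.
        combined.sort(order);
        return combined;
    }

    public static void main(String[] args) {
        // Qualifiers "a" and "c" found in the memstore, "b" and "d" in storefiles.
        List<String> merged = mergeSorted(
            Arrays.asList("a", "c"), Arrays.asList("b", "d"),
            Comparator.<String>naturalOrder());
        System.out.println(merged);
    }
}
```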




[jira] Commented: (HBASE-3419) If re-transition to OPENING during log replay fails, server aborts. Instead, should just cancel region open.

2011-01-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978553#action_12978553
 ] 

stack commented on HBASE-3419:
--

+1 on patch.

Jon says he's running it too.

 If re-transition to OPENING during log replay fails, server aborts.  Instead, 
 should just cancel region open.
 -

 Key: HBASE-3419
 URL: https://issues.apache.org/jira/browse/HBASE-3419
 Project: HBase
  Issue Type: Bug
  Components: regionserver, zookeeper
Affects Versions: 0.90.0, 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.90.1, 0.92.0

 Attachments: HBASE-3419-v1.patch, HBASE-3419-v2.patch


 The {{Progressable}} used on region open to tickle the ZK OPENING node, which 
 prevents the master from timing out a region open operation, currently 
 aborts the RegionServer if the tickle fails for any reason.  However, it can be 
 normal for an RS to have a region open operation aborted by the master, so 
 we should handle this as we do in other places, by reverting the open.
 We had a cluster trip over some other issue (for some reason, the tickle was 
 not happening within 30 seconds, so the master was timing out every time).  
 Because of the abort on BadVersion, every single RS eventually aborted 
 itself, taking down the cluster.
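
The proposed behavior, cancelling only the region open when the tickle fails, can be sketched like this. None of the names below are HBase APIs: `RegionOpenSketch`, `Outcome`, and `openRegion` are hypothetical, and the tickle is modeled as a callback that returns false when the OPENING node could not be refreshed.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical model of a region open; not the HBASE-3419 patch. */
final class RegionOpenSketch {
    enum Outcome { OPENED, CANCELLED }

    private RegionOpenSketch() {}

    /**
     * Replays the given number of log edits, calling the tickle after each one.
     * A failed tickle reverts this open instead of aborting the whole server.
     */
    static Outcome openRegion(int editsToReplay, BooleanSupplier tickle) {
        for (int i = 0; i < editsToReplay; i++) {
            // ... replay one edit here ...
            if (!tickle.getAsBoolean()) {
                // The master may already have timed out this open; give up on it only.
                return Outcome.CANCELLED;
            }
        }
        return Outcome.OPENED;
    }

    public static void main(String[] args) {
        System.out.println(openRegion(3, () -> true));
        System.out.println(openRegion(3, () -> false));
    }
}
```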




[jira] Created: (HBASE-3426) Allowing HRegionServer extensions to register interfaces for HBaseRPCMetrics

2011-01-06 Thread James Kennedy (JIRA)
Allowing HRegionServer extensions to register interfaces for HBaseRPCMetrics


 Key: HBASE-3426
 URL: https://issues.apache.org/jira/browse/HBASE-3426
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.90.0
Reporter: James Kennedy
 Fix For: 0.90.0


HBaseRpcMetrics is now logging a WARN message every time it encounters an
unregistered RPC method.

In my case I now get huge log files filled with these warnings because the
hbase-trx transactional extension of HBase uses a subclass of HRegionServer
that adds new interface methods.

It's easy enough to tell log4j to ignore HBaseRpcMetrics output.

However, it would be nice if the Server/HRegionServer HBaseRpcMetrics
mechanism was more extensible so I could pass down new interfaces or grab
the HBaseRpcMetrics from the HBaseRPC object to add interfaces from up
top...
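
The kind of extensibility asked for here can be sketched as a registry that accepts extra interfaces and registers their method names, optionally prefixed with the interface name to avoid collisions (the idea described in the HBASE-3405 review). This is a sketch under assumptions: `RpcMethodRegistry` and `TransactionalOps` are hypothetical and only mimic the idea, they are not HBaseRpcMetrics or any hbase-trx interface.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical registry; it only mimics the idea, not HBaseRpcMetrics itself. */
final class RpcMethodRegistry {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Registers every method declared on the given interfaces. */
    void register(Class<?>[] interfaces, boolean prefixWithClass) {
        for (Class<?> iface : interfaces) {
            for (Method m : iface.getDeclaredMethods()) {
                String name = prefixWithClass
                    ? iface.getSimpleName() + "." + m.getName()
                    : m.getName();
                counters.putIfAbsent(name, new AtomicLong());
            }
        }
    }

    /** Returns false for an unregistered name, where a real server might log a WARN. */
    boolean increment(String name) {
        AtomicLong counter = counters.get(name);
        if (counter == null) {
            return false;
        }
        counter.incrementAndGet();
        return true;
    }

    long count(String name) {
        AtomicLong counter = counters.get(name);
        return counter == null ? 0L : counter.get();
    }
}

/** Stand-in for the extra methods an HRegionServer subclass might expose. */
interface TransactionalOps {
    void beginTransaction();
    void commit();
}
```

A subclass would call `register(new Class<?>[] { TransactionalOps.class }, true)` once at startup, after which `increment("TransactionalOps.commit")` succeeds instead of hitting the unregistered path.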




[jira] Commented: (HBASE-3426) Allowing HRegionServer extensions to register interfaces for HBaseRPCMetrics

2011-01-06 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978584#action_12978584
 ] 

Gary Helmling commented on HBASE-3426:
--

Hi James, I already committed a change for this as HBASE-3405 to 0.90 and 
trunk.  If there are any issues with the interface there, we can address them 
here.

Otherwise we should close this as a dupe?

 Allowing HRegionServer extensions to register interfaces for HBaseRPCMetrics
 

 Key: HBASE-3426
 URL: https://issues.apache.org/jira/browse/HBASE-3426
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.90.0
Reporter: James Kennedy
 Fix For: 0.90.0






[jira] Created: (HBASE-3427) HBaseConfiguration problem ( config file location )

2011-01-06 Thread jo sung jun (JIRA)
HBaseConfiguration problem ( config file location )
---

 Key: HBASE-3427
 URL: https://issues.apache.org/jira/browse/HBASE-3427
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.20.6
 Environment: ubuntu , windows 7 
Reporter: jo sung jun


Is this a bug?

For example, my hbase-site.xml is in conf/hbase-site.xml,
so I wrote the code below.

HBaseConfiguration config = new HBaseConfiguration();
config.addResource(HBaseClient.class.getResource("/conf/hbase-default.xml"));
config.addResource(HBaseClient.class.getResource("/conf/hbase-site.xml"));

config.set("hbase.zookeeper.quorum", "192.168.0.203");
config.set("hbase.zookeeper.property.clientPort", "2181");

But when I use IndexedTable, I cannot create a table (it almost locks up) 
because of IndexedTableAdmin.reIndexTable:

IndexedTableAdmin.java
 private void reIndexTable(byte[] baseTableName, IndexSpecification indexSpec) 
throws IOException {
HTable baseTable = new HTable(baseTableName);
   .. ..
}

We should use HTable(String, HBaseConfiguration), I think.

Because of that, I moved the config files to src/ instead of src/conf/***.xml.

This is confusing. We should explain it more in the Javadoc, or fix it.








[jira] Created: (HBASE-3428) HBase examples ( sources attachment )

2011-01-06 Thread jo sung jun (JIRA)
HBase examples ( sources attachment )
-

 Key: HBASE-3428
 URL: https://issues.apache.org/jira/browse/HBASE-3428
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.20.6
 Environment: Ubuntu, Windows 7 32bit
Reporter: jo sung jun


I am a new HBase user.
When I was learning HBase, I found it very hard.
I couldn't find well-made examples.

So I wrote some examples for HBase beginners.

I hope these examples can help you.

In a few days, I'll write more examples.

sources.

AbstractHBase.java
HBaseClient.java
StressTest.java

Thanks.




[jira] Updated: (HBASE-3428) HBase examples ( sources attachment )

2011-01-06 Thread jo sung jun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jo sung jun updated HBASE-3428:
---

Attachment: StressTest.java
HBaseClient.java
AbstractHBase.java

Don't forget to insert these properties in hbase-site.xml:

<!-- Secondary indexes in HBase. -->
<property>
<name>hbase.regionserver.class</name>
<value>org.apache.hadoop.hbase.ipc.IndexedRegionInterface</value>
</property>

<property>
<name>hbase.regionserver.impl</name>
<value>org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegionServer</value>
</property>

Thanks.

 HBase examples ( sources attachment )
 -

 Key: HBASE-3428
 URL: https://issues.apache.org/jira/browse/HBASE-3428
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.20.6
 Environment: Ubuntu, Windows 7 32bit
Reporter: jo sung jun
 Attachments: AbstractHBase.java, HBaseClient.java, StressTest.java






[jira] Resolved: (HBASE-3426) Allowing HRegionServer extensions to register interfaces for HBaseRPCMetrics

2011-01-06 Thread James Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Kennedy resolved HBASE-3426.
--

Resolution: Duplicate

Yeah duplicate of case 3405.
Thanks for addressing this so quickly.

 Allowing HRegionServer extensions to register interfaces for HBaseRPCMetrics
 

 Key: HBASE-3426
 URL: https://issues.apache.org/jira/browse/HBASE-3426
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.90.0
Reporter: James Kennedy
 Fix For: 0.90.0






[jira] Commented: (HBASE-3425) HMaster sends duplicate ports to regionserver in HServerAddress

2011-01-06 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978594#action_12978594
 ] 

Matt Corgan commented on HBASE-3425:


Hmm - it's not happening anymore.  We had just changed the DNS entry that 
pointed to that IP address to give the regionserver a more friendly name.  
While having the problem the regionserver would log this line:

2011-01-06 15:55:48,910 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: 
hbase.regionserver.address=HadoopNode98.hotpads.srv:60020

But now that is gone and it logs this one:

2011-01-06 18:29:10,903 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us address to 
use. Was=HadoopNode98.hotpads.srv:60020, Now=HadoopNode98.hotpads.srv:60020

Looking through HMaster.regionServerStartup, it calls 
HBaseServer.getRemoteIp().  That asks the currently open socket for the 
InetAddress, and then things get hairy with PlainSocketImpl, InetAddress, 
etc...  Something strange probably happened here due to the DNS modifications.

If for some reason any of those external networking classes returned the port 
number appended to the hostname, then HBase currently does nothing to catch it. 
 HBase instantiates an InetSocketAddress, which doesn't validate the string, and 
that is passed to an HServerAddress, which also doesn't validate it.  Maybe the 
solution is not to handle the error, but to at least throw an exception earlier 
by validating the hostname string in HMaster.regionServerStartup.
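
The fail-fast check suggested above might look something like this. It is a sketch only: `HostnameValidation` and `requireBareHostname` are hypothetical names, and the rule that a hostname must not contain a colon is an assumption that would also reject IPv6 literals.

```java
/** Hypothetical validation helper; not existing HBase code. */
final class HostnameValidation {
    private HostnameValidation() {}

    /**
     * Returns the hostname unchanged, or throws if it is empty or appears to
     * carry a port. Assumption: any ':' means a port was appended, so IPv6
     * literals would need separate handling.
     */
    static String requireBareHostname(String hostname) {
        if (hostname == null || hostname.isEmpty()) {
            throw new IllegalArgumentException("Hostname must not be empty");
        }
        if (hostname.indexOf(':') >= 0) {
            throw new IllegalArgumentException(
                "Hostname must not contain a port: " + hostname);
        }
        return hostname;
    }

    public static void main(String[] args) {
        System.out.println(requireBareHostname("HadoopNode98.hotpads.srv"));
        try {
            requireBareHostname("HadoopNode98.hotpads.srv:60020");
        } catch (IllegalArgumentException e) {
            // Fails here, before the string reaches address construction.
            System.out.println(e.getMessage());
        }
    }
}
```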

 HMaster sends duplicate ports to regionserver in HServerAddress
 ---

 Key: HBASE-3425
 URL: https://issues.apache.org/jira/browse/HBASE-3425
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.0
Reporter: Matt Corgan
 Fix For: 0.90.1

 Attachments: HBASE-3425[0.90.0].patch






[jira] Commented: (HBASE-3425) HMaster sends duplicate ports to regionserver in HServerAddress

2011-01-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978634#action_12978634
 ] 

stack commented on HBASE-3425:
--

It's just something I've never seen before.  Probably no harm in your patch as is.

 HMaster sends duplicate ports to regionserver in HServerAddress
 ---

 Key: HBASE-3425
 URL: https://issues.apache.org/jira/browse/HBASE-3425
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.0
Reporter: Matt Corgan
 Fix For: 0.90.1

 Attachments: HBASE-3425[0.90.0].patch






[jira] Commented: (HBASE-3428) HBase examples ( sources attachment )

2011-01-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978637#action_12978637
 ] 

stack commented on HBASE-3428:
--

Thank you for doing the fancy examples Jo Sung Jun.

 HBase examples ( sources attachment )
 -

 Key: HBASE-3428
 URL: https://issues.apache.org/jira/browse/HBASE-3428
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.20.6
 Environment: Ubuntu, Windows 7 32bit
Reporter: jo sung jun
 Attachments: AbstractHBase.java, HBaseClient.java, StressTest.java


