[jira] [Commented] (HBASE-5824) HRegion.incrementColumnValue is not used in trunk

2012-04-20 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258297#comment-13258297
 ] 

stack commented on HBASE-5824:
--

This patch only makes sense in trunk, not in 0.94.

Which exceptions are different now?

 HRegion.incrementColumnValue is not used in trunk
 -

 Key: HBASE-5824
 URL: https://issues.apache.org/jira/browse/HBASE-5824
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5824.patch, hbase-5824_v2.patch, 
 hbase_5824.addendum


 On 0.94 a call to client.HTable#incrementColumnValue ends up in 
 HRegion#incrementColumnValue.  On trunk all calls to 
 HTable.incrementColumnValue go to HRegion#increment.
 My guess is that HTable#incrementColumnValue and HTable#increment serialize 
 to the same thing over the wire so that the remote HRegionServer no longer 
 knows which htable method was called.
 To repro I checked out trunk and put a break point in 
 HRegion#incrementColumnValue and then ran TestFromClientSide.  The breakpoint 
 wasn't hit.
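
If the guess above is right, the effect can be shown with a toy model. This is only a sketch: SketchClient, SketchRegion and WireIncrement are hypothetical stand-ins, not HBase classes, and the real RPC layer is not modelled. Both client methods build the same message, so the server-side entry point reached is always the same one.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are real HBase classes.
final class WireIncrement {
    final String row;
    final String column;
    final long amount;

    WireIncrement(String row, String column, long amount) {
        this.row = row;
        this.column = column;
        this.amount = amount;
    }
}

// Stand-in for the server side; records which entry point was reached.
final class SketchRegion {
    final List<String> calls = new ArrayList<>();

    long increment(WireIncrement request) {
        calls.add("increment");
        return request.amount;
    }

    long incrementColumnValue(String row, String column, long amount) {
        calls.add("incrementColumnValue");
        return amount;
    }
}

// Stand-in for the client; both methods build the same wire message.
final class SketchClient {
    private final SketchRegion region;

    SketchClient(SketchRegion region) {
        this.region = region;
    }

    long incrementColumnValue(String row, String column, long amount) {
        return send(new WireIncrement(row, column, amount));
    }

    long increment(WireIncrement increment) {
        return send(increment);
    }

    // The "server" sees only the message, so it cannot tell the callers apart.
    private long send(WireIncrement request) {
        return region.increment(request);
    }
}
```

In this model SketchRegion.incrementColumnValue is never reached, which matches the breakpoint that was never hit.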

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5831) hadoopqa builds not completing

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257647#comment-13257647
 ] 

stack commented on HBASE-5831:
--

@Todd We can do that and then require it.  Nkeyway has a script for test 
categorizations.  I've been talking w/ him about adding it to general build.  
We could expand it to require tests have timeouts too.

Let me try this patch again.  I want another clean run w/o a hang to be 
convinced this is the problem test.

Also need to amend Ted's little script to look for tests that run 0 tests.
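
A check for test classes that run 0 tests could look roughly like the sketch below. It is not Ted's script (which is not shown here); ZeroTestFinder is a hypothetical name, and the sketch assumes console output where each "Running <class>" line is followed by that class's "Tests run: N, ..." summary, with no interleaving from parallel forks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans surefire-style console lines for classes that ran 0 tests.
final class ZeroTestFinder {
    private static final Pattern RUNNING = Pattern.compile("^Running (\\S+)$");
    private static final Pattern SUMMARY = Pattern.compile("^Tests run: (\\d+),.*");

    static List<String> classesWithZeroTests(List<String> lines) {
        List<String> result = new ArrayList<>();
        String current = null; // class named by the most recent "Running" line
        for (String raw : lines) {
            String line = raw.trim();
            Matcher running = RUNNING.matcher(line);
            if (running.matches()) {
                current = running.group(1);
                continue;
            }
            Matcher summary = SUMMARY.matcher(line);
            // A summary with no pending "Running" line is the overall total; skip it.
            if (current != null && summary.matches()) {
                if (Integer.parseInt(summary.group(1)) == 0) {
                    result.add(current);
                }
                current = null;
            }
        }
        return result;
    }
}
```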

 hadoopqa builds not completing
 --

 Key: HBASE-5831
 URL: https://issues.apache.org/jira/browse/HBASE-5831
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt


 No test failures but build complains it has failed.  trunk build seems to 
 have the same affliction:
 {code}
 Results :
 Tests run: 909, Failures: 0, Errors: 0, Skipped: 9
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 41:19.273s
 [INFO] Finished at: Wed Apr 18 21:54:31 UTC 2012
 [INFO] Final Memory: 59M/451M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-surefire-plugin:2.12-TRUNK-HBASE-2:test 
 (secondPartTestsExecution) on project hbase: Failure or timeout - [Help 1]
 [ERROR] 
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR] 
 [ERROR] For more information about the errors and possible solutions, please 
 read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 -1 overall.  Here are the results of testing the latest attachment 
   http://issues.apache.org/jira/secure/attachment/12523250/5811+%281%29.txt
   against trunk revision .
 +1 @author.  The patch does not contain any @author tags.
 +1 tests included.  The patch appears to include 3 new or modified tests.
 +1 javadoc.  The javadoc tool did not generate any warning messages.
 +1 javac.  The applied patch does not increase the total number of javac 
 compiler warnings.
 -1 findbugs.  The patch appears to introduce 6 new Findbugs (version 
 1.3.9) warnings.
 +1 release audit.  The applied patch does not increase the total number 
 of release audit warnings.
  -1 core tests.  The patch failed these unit tests:
 {code}
 It's not apparent that any particular test is not finishing.





[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257704#comment-13257704
 ] 

stack commented on HBASE-5829:
--

Please explain where the disparity between this.servers and this.regions is in 
the code, Maryann.

 Inconsistency between the regions map and the servers map in 
 AssignmentManager
 --

 Key: HBASE-5829
 URL: https://issues.apache.org/jira/browse/HBASE-5829
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1
Reporter: Maryann Xue

 There are occurrences in AM where this.servers is not kept consistent with 
 this.regions. This might cause balancer to offline a region from the RS that 
 already returned NotServingRegionException at a previous offline attempt.
 In AssignmentManager.unassign(HRegionInfo, boolean)
 try {
   // TODO: We should consider making this look more like it does for the
   // region open where we catch all throwables and never abort
   if (serverManager.sendRegionClose(server, state.getRegion(),
       versionOfClosingNode)) {
     LOG.debug("Sent CLOSE to " + server + " for region " +
       region.getRegionNameAsString());
     return;
   }
   // This never happens. Currently regionserver close always return true.
   LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
     region.getRegionNameAsString());
 } catch (NotServingRegionException nsre) {
   LOG.info("Server " + server + " returned " + nsre + " for " +
     region.getRegionNameAsString());
   // Presume that master has stale data.  Presume remote side just split.
   // Presume that the split message when it comes in will fix up the master's
   // in memory cluster state.
 } catch (Throwable t) {
   if (t instanceof RemoteException) {
     t = ((RemoteException)t).unwrapRemoteException();
     if (t instanceof NotServingRegionException) {
       if (checkIfRegionBelongsToDisabling(region)) {
         // Remove from the regionsinTransition map
         LOG.info("While trying to recover the table "
             + region.getTableNameAsString()
             + " to DISABLED state the region " + region
             + " was offlined but the table was in DISABLING state");
         synchronized (this.regionsInTransition) {
           this.regionsInTransition.remove(region.getEncodedName());
         }
         // Remove from the regionsMap
         synchronized (this.regions) {
           this.regions.remove(region);
         }
         deleteClosingOrClosedNode(region);
       }
     }
     // RS is already processing this region, only need to update the timestamp
     if (t instanceof RegionAlreadyInTransitionException) {
       LOG.debug("update " + state + " the timestamp.");
       state.update(state.getState());
     }
   }
 In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, 
 boolean)
   synchronized (this.regions) {
     this.regions.put(plan.getRegionInfo(), plan.getDestination());
   }
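
Both excerpts above update this.regions without touching this.servers. One way to rule out that kind of drift is to route every change through a single helper that updates the two maps together. The sketch below is only an illustration under that assumption: RegionBookkeeping and its methods are hypothetical, plain strings stand in for HRegionInfo and server names, and it is not the AssignmentManager code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical bookkeeping helper: every mutation updates both maps under one lock.
final class RegionBookkeeping {
    private final Map<String, String> regions = new HashMap<>();      // region -> server
    private final Map<String, Set<String>> servers = new HashMap<>(); // server -> regions

    synchronized void assign(String region, String server) {
        remove(region); // drop any previous placement from both maps first
        regions.put(region, server);
        servers.computeIfAbsent(server, s -> new HashSet<>()).add(region);
    }

    synchronized void remove(String region) {
        String server = regions.remove(region);
        if (server != null) {
            Set<String> hosted = servers.get(server);
            if (hosted != null) {
                hosted.remove(region);
            }
        }
    }

    synchronized Set<String> regionsOn(String server) {
        Set<String> hosted = servers.get(server);
        return hosted == null ? new HashSet<>() : new HashSet<>(hosted);
    }

    // True when each map agrees with the other in both directions.
    synchronized boolean isConsistent() {
        for (Map.Entry<String, String> e : regions.entrySet()) {
            Set<String> hosted = servers.get(e.getValue());
            if (hosted == null || !hosted.contains(e.getKey())) {
                return false;
            }
        }
        for (Map.Entry<String, Set<String>> e : servers.entrySet()) {
            for (String region : e.getValue()) {
                if (!e.getKey().equals(regions.get(region))) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

A check like isConsistent() could also be asserted in a unit test after each operation.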





[jira] [Commented] (HBASE-5816) Balancer and ServerShutdownHandler concurrently reassigning the same region

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257729#comment-13257729
 ] 

stack commented on HBASE-5816:
--

@Maryann Agree on your 1. and 2. above.  It's possible to make a standalone 
AssignmentManager using mocks -- see TestAssignmentManager.  Maybe we should 
try some of your suppositions over in unit tests, Maryann, and find holes in AM 
by writing unit tests?

 Balancer and ServerShutdownHandler concurrently reassigning the same region
 ---

 Key: HBASE-5816
 URL: https://issues.apache.org/jira/browse/HBASE-5816
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6
Reporter: Maryann Xue
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Attachments: HBASE-5816.patch


 The first assign thread exits with success after updating the RegionState to 
 PENDING_OPEN, while the second assign follows immediately into assign and 
 fails the RegionState check in setOfflineInZooKeeper(). This causes the 
 master to abort.
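
The failure described above reads like a check-then-act race on the region state. As a rough illustration only, and not the fix in the attached patch, the sketch below makes the transition atomic so that a second assigner sees that it lost the race instead of failing a later state check. AssignRaceSketch and its State enum are hypothetical and much simpler than the real RegionState.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of one region's state; not the real RegionState class.
final class AssignRaceSketch {
    enum State { OFFLINE, PENDING_OPEN, OPEN, CLOSED }

    private final AtomicReference<State> state = new AtomicReference<>(State.CLOSED);

    // Claims the region for assignment; returns false if another assigner got there first.
    boolean tryBeginAssign() {
        while (true) {
            State current = state.get();
            if (current != State.CLOSED && current != State.OFFLINE) {
                return false; // already being assigned or open: back off, do not abort
            }
            if (state.compareAndSet(current, State.PENDING_OPEN)) {
                return true;
            }
            // Lost a race between get() and compareAndSet(); re-read and retry.
        }
    }

    State current() {
        return state.get();
    }
}
```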
 In the below case, the two concurrent assigns occurred when AM tried to 
 assign a region to a dying/dead RS, and meanwhile the ServerShutdownHandler 
 tried to assign this region (from the region plan) spontaneously.
 2012-04-17 05:44:57,648 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b., 
 src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 (offlining)
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=hadoop05.sh.intel.com,60020,1334544902186, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0) for region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.
 2012-04-17 05:44:57,666 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase/unassigned/fe38fe31caf40b6e607a3e6bbed6404b 
 (region=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  server=hadoop05.sh.intel.com,60020,1334544902186, state=RS_ZK_REGION_CLOSING)
 2012-04-17 05:52:58,984 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=CLOSED, ts=1334612697672, 
 server=hadoop05.sh.intel.com,60020,1334544902186
 2012-04-17 05:52:58,984 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x236b912e9b3000e Creating (or updating) unassigned node for 
 fe38fe31caf40b6e607a3e6bbed6404b with OFFLINE state
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.; 
 plan=hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:19,159 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=PENDING_OPEN, ts=1334613179096, 
 server=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:59,033 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 serverName=xmlqa-clv16.sh.intel.com,60020,1334612497253, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0), trying to assign elsewhere instead; retry=0
 java.net.SocketTimeoutException: Call to /10.239.47.87:60020 failed on socket 
 timeout exception: java.net.SocketTimeoutException: 12 millis timeout 
 while waiting for channel to be ready for read. ch : 
 java.nio.channels.SocketChannel[connected local=/10.239.47.89:41302 
 remote=/10.239.47.87:60020]
 at 
 org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:805)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:778)
 at 
 org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:283)
 at $Proxy7.openRegion(Unknown Source)
 at 
 org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:573)
 at 
 

[jira] [Commented] (HBASE-5654) [findbugs] Address dodgy bugs

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257743#comment-13257743
 ] 

stack commented on HBASE-5654:
--

@Jon Our hadoopqa has been hanging for a while.  It's probably not this patch.  
Maybe compare to previous runs.  Meantime, I'm trying to figure out why it 
hangs.

 [findbugs] Address dodgy bugs
 -

 Key: HBASE-5654
 URL: https://issues.apache.org/jira/browse/HBASE-5654
 Project: HBase
  Issue Type: Sub-task
  Components: scripts
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Ashutosh Jindal
  Labels: patch
 Fix For: 0.96.0

 Attachments: Hbase 5654_v3.patch, Hbase-5654.patch, 
 Hbase_5654_V2.patch


 See 
 https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html#Warnings_STYLE
 This may be broken down further.





[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257760#comment-13257760
 ] 

stack commented on HBASE-5548:
--

Sorry Jesse for taking a while to get back to this.  Patch looks good.  I tried 
it some more and got this:

{code}
hbase(main):011:0> t.put 'x', 'y:x', 'x'
0 row(s) in 0.0110 seconds
hbase(main):012:0> t.get 'x'
COLUMN   CELL

ERROR: undefined method `get_internal' for Hbase::Table - y:Hbase::Table

Here is some help for this command:
Get row or cell contents; pass table name, row, and optionally
a dictionary of column(s), timestamp, timerange and versions. Examples:

  hbase> get 't1', 'r1'
  hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
  hbase> get 't1', 'r1', {COLUMN => 'c1'}
  hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
  hbase> get 't1', 'r1', 'c1'
  hbase> get 't1', 'r1', 'c1', 'c2'
  hbase> get 't1', 'r1', ['c1', 'c2']

The same commands also can be run on a table reference. Suppose you had a reference
t to table 't1', the corresponding commands would be:

  hbase> t.get 'r1'
  hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
  hbase> t.get 'r1', {COLUMN => 'c1'}
  hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
  hbase> t.get 'r1', 'c1'
  hbase> t.get 'r1', 'c1', 'c2'
  hbase> t.get 'r1', ['c1', 'c2']
{code}

Seems like an issue?

Also, the help talks about a table reference without explaining what it is 
(there is no mention of it in the general help either, it seems).  It 
could be confusing to talk about a 't' w/o saying where it came from.

I like the output of t.help.

This is odd though:

{code}
  hbase> t.put 'r', 'c', 'q', 'v'
 which puts a row 'r' with column family 'c', qualifier 'q' and value 'v' into 
table t.
{code}

In the rest of the shell columns are a combo of family and qualifier delimited 
by the ':'.  You are changing that w/ the above.
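
For reference, the family:qualifier convention can be sketched as below. ColumnSpec is a hypothetical helper, not shell or HBase code, and splitting at the first ':' (so that a qualifier may itself contain ':') is an assumption of this sketch.

```java
// Hypothetical helper illustrating the 'family:qualifier' column convention.
final class ColumnSpec {
    final String family;
    final String qualifier;

    private ColumnSpec(String family, String qualifier) {
        this.family = family;
        this.qualifier = qualifier;
    }

    // Splits at the first ':'; a spec without ':' is treated as family only.
    static ColumnSpec parse(String column) {
        int idx = column.indexOf(':');
        if (idx < 0) {
            return new ColumnSpec(column, "");
        }
        return new ColumnSpec(column.substring(0, idx), column.substring(idx + 1));
    }
}
```

With this convention, the put from the session above, t.put 'x', 'y:x', 'x', names family 'y' and qualifier 'x' in a single argument.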



 Add ability to get a table in the shell
 ---

 Key: HBASE-5548
 URL: https://issues.apache.org/jira/browse/HBASE-5548
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0, 0.94.1

 Attachments: ruby_HBASE-5528-v0.patch, ruby_HBASE-5548-v1.patch, 
 ruby_HBASE-5548-v2.patch, ruby_HBASE-5548-v3.patch


 Currently, all the commands that operate on a table in the shell first have 
 to take the table name as input. 
 There are two main considerations:
 * It is annoying to have to write the table name every time, when you should 
 just be able to get a reference to a table
 * the current implementation is very wasteful - it creates a new HTable for 
 each call (but reuses the connection since it uses the same configuration)
 We should be able to get a handle to a single HTable and then operate on that.





[jira] [Commented] (HBASE-5831) hadoopqa builds not completing

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257773#comment-13257773
 ] 

stack commented on HBASE-5831:
--

@Todd That'd be nice (smile)

This test run did 936 tests which is more than normal.  Let me try again.

 hadoopqa builds not completing
 --

 Key: HBASE-5831
 URL: https://issues.apache.org/jira/browse/HBASE-5831
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt






[jira] [Commented] (HBASE-3614) Expose per-region request rate metrics

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257787#comment-13257787
 ] 

stack commented on HBASE-3614:
--

Since you renamed RegionOperationMetrics, is this right now:

{code}
+  private final OperationMetrics regionMetrics;
{code}

Should it be named metrics or operationMetrics?

What's 'unknown' in the following? +  //null will be treated as unknown.

We are updating metrics w/o attributing them to a cf?

Fix the misspelling 'Inctement' in the hbase-site change.

Patch is good to go after addressing above.  Good stuff.







 Expose per-region request rate metrics
 --

 Key: HBASE-3614
 URL: https://issues.apache.org/jira/browse/HBASE-3614
 Project: HBase
  Issue Type: Improvement
  Components: metrics, regionserver
Reporter: Gary Helmling
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-3614-0.patch, HBASE-3614-1.patch, 
 HBASE-3614-2.patch, HBASE-3614-3.patch, HBASE-3614-4.patch, 
 HBASE-3614-5.patch, HBASE-3614-6.patch, HBASE-3614-7.patch, Screen Shot 
 2012-04-17 at 2.41.27 PM.png


 We currently export metrics on request rates for each region server, and this 
 can help with identifying uneven load at a high level. But once you see a 
 given server under high load, you're forced to extrapolate based on your 
 application patterns and the data it's serving what the likely culprit is.  
 This can and should be much easier if we just exported request rate metrics 
 per-region on each server.
 Dynamically updating the metrics keys based on assigned regions may pose some 
 minor challenges, but this seems a very valuable diagnostic tool to have 
 available.
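
The point about dynamically updating the metric keys can be illustrated with a small sketch. It is not the code in the attached patches: PerRegionRequestCounts and its method names are hypothetical, and region names are plain strings. Keys are added when a region is opened and dropped when it is closed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-region request counters keyed by region name.
final class PerRegionRequestCounts {
    private final ConcurrentMap<String, AtomicLong> counts = new ConcurrentHashMap<>();

    // Register a key when the region is assigned to this server.
    void regionOpened(String region) {
        counts.putIfAbsent(region, new AtomicLong());
    }

    // Drop the key when the region moves away, so stale keys do not pile up.
    void regionClosed(String region) {
        counts.remove(region);
    }

    // Requests for regions that are not registered are ignored in this sketch.
    void recordRequest(String region) {
        AtomicLong counter = counts.get(region);
        if (counter != null) {
            counter.incrementAndGet();
        }
    }

    long requests(String region) {
        AtomicLong counter = counts.get(region);
        return counter == null ? 0L : counter.get();
    }
}
```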





[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257810#comment-13257810
 ] 

stack commented on HBASE-5547:
--

bq. To get a guaranteed consistent snapshot the RegionServers need to check for 
the znode's value synchronously in the delete path (or at least I see no other 
way).  Otherwise there are times when the RegionServers do not agree and some 
files will be deleted and some will be backed up with no possibility for the 
client to know exactly as of when the backup would be consistent.

This would make for the narrowest possible window as regards whether backup is on 
or off.

Does it have to be a custom znode?  If we had a Configuration or Table znode, 
it could read the content?  Maybe checking existence is cheaper than reading 
znode content though?



 Don't delete HFiles when in backup mode
 -

 Key: HBASE-5547
 URL: https://issues.apache.org/jira/browse/HBASE-5547
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Jesse Yates

 This came up in a discussion I had with Stack.
 It would be nice if HBase could be notified that a backup is in progress (via 
 a znode for example) and in that case either:
 1. rename HFiles to be deleted to file.bck
 2. rename the HFiles into a special directory
 3. rename them to a general trash directory (which would not need to be tied 
 to backup mode).
 That way it should be possible to get a consistent backup based on HFiles (HDFS 
 snapshots or hard links would be better options here, but we do not have 
 those).
 #1 makes cleanup a bit harder.
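
Option 2 above (rename into a special directory while a backup is running) can be sketched as follows. This is a minimal illustration, not a proposed patch: BackupAwareDeleter is a hypothetical name, the local file system via java.nio stands in for HDFS, and the backup flag is a plain field rather than a znode.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical delete path that archives files instead of deleting them during a backup.
final class BackupAwareDeleter {
    private final Path archiveDir;
    private volatile boolean backupInProgress; // stand-in for the znode check

    BackupAwareDeleter(Path archiveDir) {
        this.archiveDir = archiveDir;
    }

    void setBackupInProgress(boolean inProgress) {
        this.backupInProgress = inProgress;
    }

    void delete(Path file) throws IOException {
        if (backupInProgress) {
            // Keep the file reachable for the backup by moving it aside.
            Files.createDirectories(archiveDir);
            Files.move(file, archiveDir.resolve(file.getFileName()));
        } else {
            Files.delete(file);
        }
    }
}
```

The flag is read once per delete here, so a delete that overlaps with the flag changing may see either value.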





[jira] [Commented] (HBASE-5824) HRegion.incrementColumnValue is not used in trunk

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257849#comment-13257849
 ] 

stack commented on HBASE-5824:
--

+1 on the Jimmy patch.

@Elliott At least add a deprecation pointing to the preferred code, I'd say?
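
A deprecation that points at the preferred code might look like the sketch below. DeprecationSketch is a hypothetical class with simplified signatures; it only shows the @Deprecated annotation plus a javadoc pointer, not the actual HRegion methods.

```java
// Hypothetical class; the method signatures are simplified for illustration.
final class DeprecationSketch {
    private long value;

    /**
     * @deprecated use {@link #increment(long)} instead; kept only for callers
     *             that have not migrated yet.
     */
    @Deprecated
    long incrementColumnValue(long amount) {
        return increment(amount); // delegate so both paths behave the same
    }

    long increment(long amount) {
        value += amount;
        return value;
    }
}
```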

 HRegion.incrementColumnValue is not used in trunk
 -

 Key: HBASE-5824
 URL: https://issues.apache.org/jira/browse/HBASE-5824
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Jimmy Xiang
 Attachments: hbase-5824.patch






[jira] [Commented] (HBASE-5836) Backport per region metrics from HBASE-3614 to 0.94.1

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257886#comment-13257886
 ] 

stack commented on HBASE-5836:
--

+1

 Backport per region metrics from HBASE-3614 to 0.94.1
 -

 Key: HBASE-5836
 URL: https://issues.apache.org/jira/browse/HBASE-5836
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: Elliott Clark
 Fix For: 0.94.1


 This would be good to have in 0.94.  Can go into 0.94.1.





[jira] [Commented] (HBASE-5621) Convert admin protocol of HRegionInterface to PB

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257914#comment-13257914
 ] 

stack commented on HBASE-5621:
--

Want to put your patch up here Jimmy and run it by hadoopqa? Thanks.

 Convert admin protocol of HRegionInterface to PB
 

 Key: HBASE-5621
 URL: https://issues.apache.org/jira/browse/HBASE-5621
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0








[jira] [Commented] (HBASE-5831) hadoopqa builds not completing

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257925#comment-13257925
 ] 

stack commented on HBASE-5831:
--

Thanks Jon.  I tried it over in HBASE-5794, setting it back down again, but it 
didn't seem to matter.  I committed the patch there which undoes the 100 
anyways since Mikhail said the change was good for some 0.89fb tests but he wasn't 
sure about trunk.

 hadoopqa builds not completing
 --

 Key: HBASE-5831
 URL: https://issues.apache.org/jira/browse/HBASE-5831
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.all.mapreduce.txt, 5831.remove.all.mapreduce.txt






[jira] [Commented] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258006#comment-13258006
 ] 

stack commented on HBASE-5833:
--

Eh, Ted, builds.apache.org is a public web site.  I do not need you echoing 
what's there in here.

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 locally, it seems to hang for me.





[jira] [Commented] (HBASE-5831) hadoopqa builds not completing

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258008#comment-13258008
 ] 

stack commented on HBASE-5831:
--

@Jon Hmm.. yes.  You are right.  Both times it passed.  It was worth committing 
hbase-5794 then.  Now to find the other hanging tests...

 hadoopqa builds not completing
 --

 Key: HBASE-5831
 URL: https://issues.apache.org/jira/browse/HBASE-5831
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.all.mapreduce.txt, 5831.remove.all.mapreduce.txt


 No test failures but build complains it has failed.  trunk build seems to 
 have the same affliction:
 {code}
 Results :
 Tests run: 909, Failures: 0, Errors: 0, Skipped: 9
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 41:19.273s
 [INFO] Finished at: Wed Apr 18 21:54:31 UTC 2012
 [INFO] Final Memory: 59M/451M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-surefire-plugin:2.12-TRUNK-HBASE-2:test 
 (secondPartTestsExecution) on project hbase: Failure or timeout - [Help 1]
 [ERROR] 
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR] 
 [ERROR] For more information about the errors and possible solutions, please 
 read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 -1 overall.  Here are the results of testing the latest attachment 
   http://issues.apache.org/jira/secure/attachment/12523250/5811+%281%29.txt
   against trunk revision .
 +1 @author.  The patch does not contain any @author tags.
 +1 tests included.  The patch appears to include 3 new or modified tests.
 +1 javadoc.  The javadoc tool did not generate any warning messages.
 +1 javac.  The applied patch does not increase the total number of javac 
 compiler warnings.
 -1 findbugs.  The patch appears to introduce 6 new Findbugs (version 
 1.3.9) warnings.
 +1 release audit.  The applied patch does not increase the total number 
 of release audit warnings.
  -1 core tests.  The patch failed these unit tests:
 {code}
 It's not apparent that any particular test is not finishing.





[jira] [Commented] (HBASE-3614) Expose per-region request rate metrics

2012-04-19 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258011#comment-13258011
 ] 

stack commented on HBASE-3614:
--

@Todd This issue just exposes metrics that were already being collected per 
region.  I believe it's over the metrics reporting period (5 seconds?).  Want 
that changed?  Metrics could do w/ a revamp/edit for sure.

 Expose per-region request rate metrics
 --

 Key: HBASE-3614
 URL: https://issues.apache.org/jira/browse/HBASE-3614
 Project: HBase
  Issue Type: Improvement
  Components: metrics, regionserver
Reporter: Gary Helmling
Assignee: Elliott Clark
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-3614-0.patch, HBASE-3614-1.patch, 
 HBASE-3614-2.patch, HBASE-3614-3.patch, HBASE-3614-4.patch, 
 HBASE-3614-5.patch, HBASE-3614-6.patch, HBASE-3614-7.patch, 
 HBASE-3614-8.patch, HBASE-3614-9.patch, Screen Shot 2012-04-17 at 2.41.27 
 PM.png


 We currently export metrics on request rates for each region server, and this 
 can help with identifying uneven load at a high level. But once you see a 
 given server under high load, you're forced to extrapolate based on your 
 application patterns and the data it's serving what the likely culprit is.  
 This can and should be much easier if we just exported request rate metrics 
 per-region on each server.
 Dynamically updating the metrics keys based on assigned regions may pose some 
 minor challenges, but this seems a very valuable diagnostic tool to have 
 available.





[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256788#comment-13256788
 ] 

stack commented on HBASE-5782:
--

You will need to pull in HLogPerformanceEvaluation.  Copy it whole (don't 
apply the hbase-5792 patch, because the file got mod'd a few times subsequent 
to commit).  You could also just commit the unit test to trunk and not to 
0.94; that should be fine as long as we hold to committing patches to trunk 
first.

 Edits can be appended out of seqid order since HBASE-4487
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: Lars Hofhansl
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782-v3.txt, 
 5782.txt, 5782.unfinished-stack.txt, 5782.unittest.txt, HBASE-5782.patch, 
 hbase-5782.txt


 Create a table with 1000 splits; after the region assignment, kill the 
 regionserver which contains the META table.
 A few regions are then missing after the log splitting and region assignment. 
 The HBCK report shows that multiple region holes were created.
 The same scenario was verified multiple times in 0.92.1, with no issues.





[jira] [Commented] (HBASE-5816) Two concurrent assign would cause master to abort with msg Unexpected state trying to OFFLINE;

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256799#comment-13256799
 ] 

stack commented on HBASE-5816:
--

Thanks for filing the issue Maryann.  I think we need to address the root 
problem of two threads in the master both at the same time trying to assign the 
same region rather than do as is done here where we just stop the abort.  The 
patch as is will only move the problem down the line (we'll likely end up w/ a 
single region double assigned?).  Let me update the issue title.  This log 
snippet is a really good find.

 Two concurrent assign would cause master to abort with msg Unexpected state 
 trying to OFFLINE; 
 -

 Key: HBASE-5816
 URL: https://issues.apache.org/jira/browse/HBASE-5816
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6
Reporter: Maryann Xue
 Attachments: HBASE-5816.patch


 The first assign thread exits with success after updating the RegionState to 
 PENDING_OPEN, while the second assign follows immediately into assign and 
 fails the RegionState check in setOfflineInZooKeeper(). This causes the 
 master to abort.
 In the below case, the two concurrent assigns occurred when AM tried to 
 assign a region to a dying/dead RS, and meanwhile the ShutdownServerHandler 
 tried to assign this region (from the region plan) spontaneously.
 2012-04-17 05:44:57,648 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b., 
 src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 (offlining)
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=hadoop05.sh.intel.com,60020,1334544902186, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0) for region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.
 2012-04-17 05:44:57,666 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase/unassigned/fe38fe31caf40b6e607a3e6bbed6404b 
 (region=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  server=hadoop05.sh.intel.com,60020,1334544902186, state=RS_ZK_REGION_CLOSING)
 2012-04-17 05:52:58,984 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=CLOSED, ts=1334612697672, 
 server=hadoop05.sh.intel.com,60020,1334544902186
 2012-04-17 05:52:58,984 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x236b912e9b3000e Creating (or updating) unassigned node for 
 fe38fe31caf40b6e607a3e6bbed6404b with OFFLINE state
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.; 
 plan=hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:19,159 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=PENDING_OPEN, ts=1334613179096, 
 server=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:59,033 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 serverName=xmlqa-clv16.sh.intel.com,60020,1334612497253, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0), trying to assign elsewhere instead; retry=0
 java.net.SocketTimeoutException: Call to /10.239.47.87:60020 failed on socket 
 timeout exception: java.net.SocketTimeoutException: 12 millis timeout 
 while waiting for channel to be ready for read. ch : 
 java.nio.channels.SocketChannel[connected local=/10.239.47.89:41302 
 remote=/10.239.47.87:60020]
 at 
 org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:805)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:778)
 at 
 org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:283)
 at $Proxy7.openRegion(Unknown Source)
 at 
 

[jira] [Commented] (HBASE-5737) Minor Improvements related to balancer.

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256827#comment-13256827
 ] 

stack commented on HBASE-5737:
--

I think this is weird '+this.balancer.setMasterServices(this);' but it's not 
your change.

+1 on commit.

 Minor Improvements related to balancer.
 ---

 Key: HBASE-5737
 URL: https://issues.apache.org/jira/browse/HBASE-5737
 Project: HBase
  Issue Type: Improvement
  Components: master
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Attachments: HBASE-5737.patch, HBASE-5737_1.patch, 
 HBASE-5737_2.patch, HBASE-5737_3.patch


 Currently in Am.getAssignmentByTable() we use a result map which is currently 
 a HashMap.  It could be better if we had a TreeMap.  Even in 
 MetaReader.fullScan we use a TreeMap so that the naming order is 
 maintained. I felt this change could be very useful in cases where we are 
 extending the DefaultLoadBalancer.  
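
 To illustrate the ordering difference the description relies on, here is a 
 minimal sketch. The class name, table names and region counts are made up 
 for the example and are not taken from the patch; only the HashMap and 
 TreeMap behavior is standard Java.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a "table name -> assignment info" result map.
class AssignmentOrderSketch {

    // Fills the given map with a few made-up table names and region counts.
    static Map<String, Integer> sample(Map<String, Integer> target) {
        target.put("usertable", 12);
        target.put("-ROOT-", 1);
        target.put(".META.", 1);
        target.put("orders", 40);
        return target;
    }

    public static void main(String[] args) {
        // HashMap iteration order depends on hashing and is not guaranteed.
        System.out.println("HashMap: " + sample(new HashMap<>()).keySet());
        // TreeMap always iterates in the natural (sorted) order of its keys.
        System.out.println("TreeMap: " + sample(new TreeMap<>()).keySet());
    }
}
```

 A balancer subclass that iterates the TreeMap sees tables in the same sorted 
 order on every call, which is the property the description is after.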





[jira] [Commented] (HBASE-5823) Hbck should be able to print help

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256835#comment-13256835
 ] 

stack commented on HBASE-5823:
--

+1 on patch.

 Hbck should be able to print help
 -

 Key: HBASE-5823
 URL: https://issues.apache.org/jira/browse/HBASE-5823
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Priority: Minor
 Attachments: hbase-hbck.patch


 bin/hbase hbck -h and -help should print the help message. It used to print 
 help when unrecognized options are passed. We can backport this to 0.92/0.94 
 branches as well. 





[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256861#comment-13256861
 ] 

stack commented on HBASE-5547:
--

Is the problem in #1 the client waiting on acks from all the regionservers?  
Does it need to do this?  Can it not just set the state up in zk and then just 
move on (You have this in your patch already if I remember rightly).  Do you 
want the RS's acknowledging that they have been set into backup mode?  They 
could set a flag up in zk but this gets torturous when say we add a new feature 
that wants to do some thing similar.

If we had a dynamic Configuration system, one that didn't require roll of table 
to set the table 'read-only' or 'in-back-up mode', would that help here?

On option #2, yeah, it's a pain going to zk for each WAL when there is this 
callback mechanism that all RSs are subscribed to anyway.  For sure we could 
poll zk the first time, but should then cache the setting and only drop it 
later if a callback says it changed.
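
A rough sketch of that poll-once-then-cache idea follows. Everything in it is 
hypothetical (the class name, the BooleanSupplier standing in for the zk 
read, the onChanged hook standing in for the watcher callback); it is not 
code from the patch, only an illustration of the caching pattern.

```java
import java.util.function.BooleanSupplier;

// Caches a remotely stored flag and re-reads it only after a change callback.
class BackupModeCache {
    // Hypothetical stand-in for one round trip to zk.
    private final BooleanSupplier remoteRead;
    // null means "unknown": the next query has to poll the remote store.
    private Boolean cached;

    BackupModeCache(BooleanSupplier remoteRead) {
        this.remoteRead = remoteRead;
    }

    // Polls the remote store on first use, then answers from the cache.
    synchronized boolean inBackupMode() {
        if (cached == null) {
            cached = remoteRead.getAsBoolean();
        }
        return cached;
    }

    // To be invoked from the change callback: drop the cached value.
    synchronized void onChanged() {
        cached = null;
    }

    public static void main(String[] args) {
        BackupModeCache cache = new BackupModeCache(() -> true);
        System.out.println(cache.inBackupMode()); // first call polls
        System.out.println(cache.inBackupMode()); // second call hits the cache
    }
}
```

With this shape the per-WAL check is a local read; the remote read happens 
only on first use and after each change notification.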

Agree roll of table to set the backup flag is much too heavyweight.

 Don't delete HFiles when in backup mode
 -

 Key: HBASE-5547
 URL: https://issues.apache.org/jira/browse/HBASE-5547
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Jesse Yates

 This came up in a discussion I had with Stack.
 It would be nice if HBase could be notified that a backup is in progress (via 
 a znode for example) and in that case either:
 1. rename HFiles to be deleted to file.bck
 2. rename the HFiles into a special directory
 3. rename them to a general trash directory (which would not need to be tied 
 to backup mode).
 That way it should be able to get a consistent backup based on HFiles (HDFS 
 snapshots or hard links would be better options here, but we do not have 
 those).
 #1 makes cleanup a bit harder.





[jira] [Commented] (HBASE-5823) Hbck should be able to print help

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256880#comment-13256880
 ] 

stack commented on HBASE-5823:
--

I tried it.  Seems to work.

 Hbck should be able to print help
 -

 Key: HBASE-5823
 URL: https://issues.apache.org/jira/browse/HBASE-5823
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Priority: Minor
 Fix For: 0.92.2, 0.94.0

 Attachments: hbase-hbck.patch


 bin/hbase hbck -h and -help should print the help message. It used to print 
 help when unrecognized options are passed. We can backport this to 0.92/0.94 
 branches as well. 





[jira] [Commented] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256889#comment-13256889
 ] 

stack commented on HBASE-5790:
--

@Ted It's not a runtime requirement that client or ensemble be 3.4.x.  A 3.4.x 
client and ensemble is required if you run secure hbase; else it's not 
necessary and we should be wary of requiring it; e.g. our ops didn't want to 
upgrade to a 3.4.x ensemble just yet and so we run w/ a 3.4.x client against 
a 3.3.x ensemble.

@Jesse Sounds fine requiring 3.4.x in 0.96.  Want to raise a conversation out 
on mailing list?

 ZKUtil deleteRecursively should be a recoverable operation
 --

 Key: HBASE-5790
 URL: https://issues.apache.org/jira/browse/HBASE-5790
 Project: HBase
  Issue Type: Improvement
Reporter: Jesse Yates
Assignee: Jesse Yates
  Labels: zookeeper
 Fix For: 0.96.0, 0.94.1

 Attachments: java_HBASE-5790-v1.patch, java_HBASE-5790.patch


 As of 3.4.3 Zookeeper now has full, multi-operation transactions. This means 
 we can wholesale delete chunks of the zk tree and ensure that we don't have 
 any pesky recursive delete issues where we delete the children of a node, but 
 then a child joins before deletion of the parent. Even without transactions, 
 this should be the behavior, but it is possible to make it much cleaner now 
 that we have this new feature in zk.
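
 As a sketch of what such a transactional delete needs, the paths have to be 
 ordered so that every child comes before its parent; that ordered list could 
 then be turned into delete operations and submitted through ZooKeeper's 
 multi call. The class below is hypothetical and uses an in-memory map in 
 place of listing znodes, so it shows the ordering only, not the patch's 
 actual code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Computes a children-before-parent delete order for a subtree.
class RecursiveDeletePlan {

    // 'children' maps a path to its child names; a stand-in for listing znodes.
    static List<String> deleteOrder(Map<String, List<String>> children, String root) {
        List<String> order = new ArrayList<>();
        visit(children, root, order);
        return order;
    }

    // Post-order walk: emit the whole subtree below 'path', then 'path' itself.
    private static void visit(Map<String, List<String>> children, String path,
                              List<String> order) {
        List<String> kids = children.getOrDefault(path, Collections.<String>emptyList());
        for (String child : kids) {
            visit(children, path + "/" + child, order);
        }
        order.add(path);
    }
}
```

 If a child is added after the listing but before the transaction runs, the 
 expectation is that the whole batch fails and can be retried, rather than 
 leaving the subtree half deleted.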





[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256921#comment-13256921
 ] 

stack commented on HBASE-5677:
--

Xufeng So we should close this issue and backport hbase-5454 to 0.90 and to 
0.92.2?   Or would you rather make a new issue that adds check initialized to 
createTable for trunk and 0.94 and that has a new version of hbase-5454 that 
includes checkinitialized in the patch we put on 0.90 and 0.92?

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2

 Attachments: 5677-proposal.txt, 5677-proposal.txt, 
 Backport-HBASE-5454-to-90.patch, Backport-HBASE-5454-to-92.patch, 
 HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, 
 surefire-report_patched_v1.html


 If a region is assigned while the master is still initializing (before 
 processFailover runs), the region will be handled twice by the open handler, 
 because the unassigned node in zookeeper is handled again in 
 AssignmentManager#processFailover().
 This leaves the region in RIT, so the master never runs the balancer.





[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256950#comment-13256950
 ] 

stack commented on HBASE-5545:
--

The additions to FSUtils are over the top but +1 on patch -- deleting tmp 
content on open seems useful.

 region can't be opened for a long time. Because the creating File failed.
 -

 Key: HBASE-5545
 URL: https://issues.apache.org/jira/browse/HBASE-5545
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.6
Reporter: gaojinchao
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.7, 0.92.2, 0.94.0

 Attachments: HBASE-5545.patch, HBASE-5545.patch


 Scenario:
 
 1. File is created 
 2. But while writing data, all datanodes might have crashed. So writing data 
 will fail.
 3. Now even if close is called in finally block, close also will fail and 
 throw the Exception because writing data failed.
 4. After this, if the RS tries to create the same file again, an 
 AlreadyBeingCreatedException is thrown.
 Suggestion to handle this scenario.
 ---
 1. Check for the existence of the file, if exists delete the file and create 
 new file. 
 Here delete call for the file will not check whether the file is open or 
 closed.
 Overwrite Option:
 
 1. Overwrite option will be applicable if you are trying to overwrite a 
 closed file.
 2. If the file is not closed, then even with the overwrite option the same 
 AlreadyBeingCreatedException will be thrown.
 This is the expected behaviour, to avoid multiple clients writing to the same 
 file.
 Region server logs:
 org.apache.hadoop.ipc.RemoteException: 
 org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
 create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo 
 for 
 DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 
 on client 158.1.132.19 because current leaseholder is trying to recreate file.
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
 at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
 at org.apache.hadoop.ipc.Client.call(Client.java:961)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
 at $Proxy6.create(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at $Proxy6.create(Unknown Source)
 at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643)
 at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
 at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
 at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
 at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
 at 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
 at 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
 at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 [2012-03-07 20:51:45,858] [WARN ] 
 [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] 
 [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] 

[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256956#comment-13256956
 ] 

stack commented on HBASE-5547:
--

bq. ...but is definitely a concern and something that I've seen take up to a 
few seconds to propagate.

Yeah.  If you don't want a window, query the regionservers (you'll need to add 
something to query but...)

bq. ... Are you basically talking about doing per-table configuration storage 
in the table znode?

I was stating then that we already are doing a per table attribute up in zk -- 
whether enabled or disabled -- and that rather than do up new nodes for a new 
attribute that instead we should add to the table znode the new attribute.  
That was then.  Now I'm suggesting we put all config up there.  We could start 
w/ HTD if we want to keep it table scoped (we'd have another tier in front of 
the one Nicolas added, a dynamic one).

If the above is too ambitious, we should at least generalize the table znode 
so we can add attributes, and we might as well pb serialize the HTD as 
anything else?

bq. ...If they are disabled, they need to check everytime to see if it has been 
enabled

Or just watch the table znode and if it changes, check if backup has been 
flipped on.

 Don't delete HFiles when in backup mode
 -

 Key: HBASE-5547
 URL: https://issues.apache.org/jira/browse/HBASE-5547
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Jesse Yates

 This came up in a discussion I had with Stack.
 It would be nice if HBase could be notified that a backup is in progress (via 
 a znode for example) and in that case either:
 1. rename HFiles to be deleted to file.bck
 2. rename the HFiles into a special directory
 3. rename them to a general trash directory (which would not need to be tied 
 to backup mode).
 That way it should be able to get a consistent backup based on HFiles (HDFS 
 snapshots or hard links would be better options here, but we do not have 
 those).
 #1 makes cleanup a bit harder.





[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256968#comment-13256968
 ] 

stack commented on HBASE-5782:
--

Sorry. Dumb. The tool calls system.exit.  Let me fix in another issue.

 Edits can be appended out of seqid order since HBASE-4487
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: Lars Hofhansl
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782-v3.txt, 
 5782.txt, 5782.unfinished-stack.txt, 5782.unittest.txt, HBASE-5782.patch, 
 hbase-5782.txt


 Create a table with 1000 splits; after the region assignment, kill the 
 regionserver which contains the META table.
 A few regions are then missing after the log splitting and region assignment. 
 The HBCK report shows that multiple region holes were created.
 The same scenario was verified multiple times in 0.92.1, with no issues.





[jira] [Commented] (HBASE-5825) TestHLog not running any tests; fix

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256973#comment-13256973
 ] 

stack commented on HBASE-5825:
--

The commit on HBASE-5782 broke TestHLog (It included a unit test of mine that 
calls HLogPerformanceEvaluation -- it calls System.exit when done).
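
One common way to keep a command-line tool callable from unit tests is to 
have the work return a status code and leave System.exit to main only. The 
sketch below is hypothetical (class name, argument handling and workload are 
invented for illustration) and is not the actual HLogPerformanceEvaluation 
code or its eventual fix.

```java
// A tool whose work is safe to call from a test: only main() exits the JVM.
class PerfToolSketch {

    // Does the work and returns a status code instead of calling System.exit.
    static int run(String[] args) {
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += i; // placeholder for the real workload
        }
        return sum >= 0 ? 0 : 1;
    }

    // Command-line entry point; the only place that terminates the JVM.
    public static void main(String[] args) {
        System.exit(run(args));
    }
}
```

A test would then call run(...) and assert on the returned code, so the JVM 
running the rest of the test class stays alive.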

 TestHLog not running any tests; fix
 ---

 Key: HBASE-5825
 URL: https://issues.apache.org/jira/browse/HBASE-5825
 Project: HBase
  Issue Type: Bug
Reporter: stack







[jira] [Commented] (HBASE-5799) [89-fb] Multiget API may return incomplete results

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257036#comment-13257036
 ] 

stack commented on HBASE-5799:
--

@Liyin Do we need this out on trunk?  What commit on 0.89fb was this fix?  
Thanks.

 [89-fb] Multiget API may return incomplete results
 --

 Key: HBASE-5799
 URL: https://issues.apache.org/jira/browse/HBASE-5799
 Project: HBase
  Issue Type: Bug
Reporter: Liyin Tang
Assignee: Liyin Tang

 There is a serious bug in the multiget which causes the multiget function to 
 return only part of the results.
 In the process function: 
 The initial region is set before sorting the input list.
 So after the input list has been sorted, the initial region may no longer be 
 the correct region for the first row in the sorted list.
 So the first row in the sorted list may be sent to the wrong region server 
 which has no result for this row.
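
 The ordering problem can be sketched as follows. This is not the 0.89-fb 
 code: the class, the region lookup via a TreeMap of start keys, and the 
 server names are all assumptions made for illustration. The point is only 
 that no region should be resolved until the input list has been sorted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Groups rows into per-server batches, resolving regions only after sorting.
class MultigetGroupingSketch {

    // Hypothetical lookup: 'regions' maps a region start key to its server.
    // The first region is assumed to start at "" so every row finds an entry.
    static String serverFor(TreeMap<String, String> regions, String row) {
        return regions.floorEntry(row).getValue();
    }

    static Map<String, List<String>> groupByServer(TreeMap<String, String> regions,
                                                   List<String> rows) {
        // Sort FIRST. Resolving the region of rows.get(0) before this line
        // would reproduce the bug: that row may not be first after sorting.
        List<String> sorted = new ArrayList<>(rows);
        Collections.sort(sorted);

        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (String row : sorted) {
            String server = serverFor(regions, row);
            batches.computeIfAbsent(server, s -> new ArrayList<>()).add(row);
        }
        return batches;
    }
}
```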





[jira] [Commented] (HBASE-5737) Minor Improvements related to balancer.

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257207#comment-13257207
 ] 

stack commented on HBASE-5737:
--

@Ram I do not follow.  Please rephrase.

 Minor Improvements related to balancer.
 ---

 Key: HBASE-5737
 URL: https://issues.apache.org/jira/browse/HBASE-5737
 Project: HBase
  Issue Type: Improvement
  Components: master
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-5737.patch, HBASE-5737_1.patch, 
 HBASE-5737_2.patch, HBASE-5737_3.patch


 Currently Am.getAssignmentByTable() uses a HashMap as its result map.  A 
 TreeMap would be better: MetaReader.fullScan already uses a TreeMap so that 
 the naming order is maintained.  This change could be very useful when 
 extending the DefaultLoadBalancer.
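
A minimal sketch of the difference, assuming a simplified table-to-regions map (the class and method names are hypothetical, not the AssignmentManager API): copying the map into a `TreeMap` gives a balancer a deterministic, name-ordered iteration, which a `HashMap` does not guarantee.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only; not the AssignmentManager code. */
final class AssignmentOrderSketch {
  /** Copies a table -> regions map into a TreeMap so tables iterate in name order. */
  static List<String> tablesInNameOrder(Map<String, List<String>> assignmentsByTable) {
    return new ArrayList<>(new TreeMap<>(assignmentsByTable).keySet());
  }
}
```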





[jira] [Commented] (HBASE-5751) hbase master stop does not bring down backup masters

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257216#comment-13257216
 ] 

stack commented on HBASE-5751:
--

When was it reverted Gregory?  There was a long run of fails after its commit.  
Thanks.

 hbase master stop does not bring down backup masters
 --

 Key: HBASE-5751
 URL: https://issues.apache.org/jira/browse/HBASE-5751
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Gregory Chanan
 Fix For: 0.90.7


 Carry forward the discussion from parent for 0.90





[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257219#comment-13257219
 ] 

stack commented on HBASE-5349:
--

I wouldn't mind more detail.

Can our LRU be resized?

The memstore upper bound can vary, but there are interesting effects: if it's 
too big, flushing can take so long that the memstore fills before we get around 
to flushing it again, so we block.

Nit: 10 minutes seems like too coarse a granularity?

Good stuff Enis.

 Automagically tweak global memstore and block cache sizes based on workload
 ---

 Key: HBASE-5349
 URL: https://issues.apache.org/jira/browse/HBASE-5349
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
 Fix For: 0.96.0


 Hypertable does a neat thing where it changes the size given to the CellCache 
 (our MemStores) and Block Cache based on the workload. If you need an image, 
 scroll down at the bottom of this link: 
 http://www.hypertable.com/documentation/architecture/
 That'd be one less thing to configure.





[jira] [Commented] (HBASE-3614) Expose per-region request rate metrics

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257226#comment-13257226
 ] 

stack commented on HBASE-3614:
--

This could be final: '+  private RegionOperationMetrics regionMetrics;'?

100 chars per line.

Just pass HRegionInfo altogether to the below?

{code}
+this.regionMetrics = new RegionOperationMetrics(conf, 
this.regionInfo.getTableNameAsString(), this.regionInfo.getEncodedName());
{code}

Err... your replacement is better than what was there previously in the below:

{code}
-final String metricPrefix = SchemaMetrics.generateSchemaMetricsPrefix(
-getTableDesc().getNameAsString(), familyMap.keySet());
-if (!metricPrefix.isEmpty()) {
-  RegionMetricsStorage.incrTimeVaryingMetric(metricPrefix + "delete_", after - now);
-}
+this.regionMetrics.updateDeleteMetrics(familyMap.keySet(), after-now);
{code}

What's happening here?

{code}
+if (cfSet == null) {
+  cfSet = put.getFamilyMap().keySet();
+} else {
+  cfSetConsistent = cfSetConsistent && put.equals(cfSet);
{code}

Do we have to get the column family set each time through?  It never changes 
(currently) while the region is open.

What's a cfSetConsistent?  A comment would help?

Yeah, I don't follow this stuff:

{code}
+  //See if the column families were consistent through the whole thing.
+  //if they were then keep them.  If they were not then pass a null.
+  //null will be treated as unknown.
{code}
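
For readers trying to follow the logic being questioned, here is one reading of it as a standalone sketch. It assumes each operation in a batch is reduced to its set of column-family names; the class and method names are hypothetical and this is not the patch's code.

```java
import java.util.List;
import java.util.Set;

/** Illustrative only; one interpretation of the cfSetConsistent bookkeeping. */
final class CfSetConsistencySketch {
  /** Returns the family set shared by every operation, or null ("unknown") if they differ. */
  static Set<String> commonFamilies(List<Set<String>> familySetsPerOperation) {
    Set<String> cfSet = null;
    boolean cfSetConsistent = true;
    for (Set<String> families : familySetsPerOperation) {
      if (cfSet == null) {
        cfSet = families; // first operation defines the reference set
      } else {
        cfSetConsistent = cfSetConsistent && families.equals(cfSet);
      }
    }
    // Keep the set only if every operation touched the same families.
    return cfSetConsistent ? cfSet : null;
  }
}
```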

Should this be hbase.metrics.region.exposeOperationTimes instead of 
hbase.metrics.exposeOperationTimes, to convey that it is the on/off switch for 
per-region metrics?

This patch is great.

 Expose per-region request rate metrics
 --

 Key: HBASE-3614
 URL: https://issues.apache.org/jira/browse/HBASE-3614
 Project: HBase
  Issue Type: Improvement
  Components: metrics, regionserver
Reporter: Gary Helmling
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-3614-0.patch, HBASE-3614-1.patch, 
 HBASE-3614-2.patch, HBASE-3614-3.patch, HBASE-3614-4.patch, Screen Shot 
 2012-04-17 at 2.41.27 PM.png


 We currently export metrics on request rates for each region server, and this 
 can help with identifying uneven load at a high level. But once you see a 
 given server under high load, you're forced to extrapolate based on your 
 application patterns and the data it's serving what the likely culprit is.  
 This can and should be much easier if we just exported request rate metrics 
 per-region on each server.
 Dynamically updating the metrics keys based on assigned regions may pose some 
 minor challenges, but this seems a very valuable diagnostic tool to have 
 available.





[jira] [Commented] (HBASE-5751) hbase master stop does not bring down backup masters

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257230#comment-13257230
 ] 

stack commented on HBASE-5751:
--

https://builds.apache.org/view/G-L/view/HBase/job/hbase-0.90/456/console looks 
like a similar hang though not on same test; the tests are aborted midway 
through.

I think your argument that it's unrelated holds, given that builds 471-473 fail 
TestLogRolling in the same manner as when the patch was in place.  Let's commit 
hbase-5213 and figure out this failing TestLogRolling in a new issue.





[jira] [Commented] (HBASE-5737) Minor Improvements related to balancer.

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257241#comment-13257241
 ] 

stack commented on HBASE-5737:
--

The above is from AM?  If so, I'm not sure it's a bug.  At the time, my sense is 
that the balancer ran w/o keeping context.  What's changed is that you seem to 
have a LB that is doing this now.

As to whether it's a bug or an improvement, it's your call boss.





[jira] [Commented] (HBASE-5827) [Coprocessors] Observer notifications on exceptions

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257244#comment-13257244
 ] 

stack commented on HBASE-5827:
--

This seems like something we need if cps are to be able to keep a running 
context.

 [Coprocessors] Observer notifications on exceptions
 ---

 Key: HBASE-5827
 URL: https://issues.apache.org/jira/browse/HBASE-5827
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Andrew Purtell

 Benjamin Busjaeger wrote on dev@:
 {quote}
 Is there a reason that RegionObservers are not notified when a get/put/delete 
 fails? Suppose I maintain some (transient) state in my Coprocessor that is 
 created during preGet and discarded during postGet. If the get fails, postGet 
 is not invoked, so I cannot remove the state.
 If there is a good reason, is there any other way to achieve the same thing? 
 If not, would it be possible to add something like the snippet below to the 
 code base?
 {code}
 // pre-get CP hook
 if (withCoprocessor && (coprocessorHost != null)) {
   if (coprocessorHost.preGet(get, results)) {
     return results;
   }
 }
 +try {
 ...
 +} catch (Throwable t) {
 +  // failed-get CP hook
 +  if (withCoprocessor && (coprocessorHost != null)) {
 +    coprocessorHost.failedGet(get, results);
 +  }
 +  throw t;
 +}
 // post-get CP hook
 if (withCoprocessor && (coprocessorHost != null)) {
   coprocessorHost.postGet(get, results);
 }
 {code}
 {quote}
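
The proposal above can be tried out in isolation with a small sketch. Everything here is hypothetical: `GetObserver`, `failedGet` and `getWithHooks` are stand-ins for illustration, not existing coprocessor APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical hook interface; failedGet is the proposed addition, not an existing HBase API. */
final class FailedHookSketch {
  interface GetObserver {
    void preGet();
    void postGet();
    void failedGet(Throwable cause);
  }

  /** Records hook invocations so the call order can be inspected. */
  static final class RecordingObserver implements GetObserver {
    final List<String> events = new ArrayList<>();
    @Override public void preGet() { events.add("pre"); }
    @Override public void postGet() { events.add("post"); }
    @Override public void failedGet(Throwable cause) { events.add("failed"); }
  }

  /** Runs the operation between the hooks; on failure, notifies the observer and rethrows. */
  static <T> T getWithHooks(GetObserver observer, Supplier<T> operation) {
    observer.preGet();
    T result;
    try {
      result = operation.get();
    } catch (RuntimeException | Error e) {
      observer.failedGet(e); // lets the observer discard state created in preGet
      throw e;
    }
    observer.postGet();
    return result;
  }
}
```

With this shape an observer sees either pre followed by post, or pre followed by failed, so transient state created in the pre hook always has a matching cleanup call.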





[jira] [Commented] (HBASE-5816) Balancer and ServerShutdownHandler concurrently reassigning the same region

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257252#comment-13257252
 ] 

stack commented on HBASE-5816:
--

Great stuff Maryann.  Where is the above bit of code from?  I don't find it in 
trunk (could be me).

bq. It should be safe for the later thread just return or get an exception if 
the region has already been assigned by an earlier thread.

What are you thinking?  When we go into the assign, we check if the region is 
in transition and, unless it's a force assign, just return?  Or would you do 
this earlier?  Maybe the balancer should be more deferential?  It could check if 
the regionserver it's been asked to move a region from is on the deadservers 
list.  This would still be racy though.  Would doing the check in the assign 
method be enough?  (I've not looked at the code).
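
A rough sketch of the "check inside assign" idea, under assumptions: the class below is hypothetical and uses an atomic `putIfAbsent` on a regions-in-transition map so that only one caller wins; it is not the AssignmentManager code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Illustrative guard; names are hypothetical, not AssignmentManager's. */
final class AssignGuardSketch {
  // region name -> who is currently assigning it
  private final ConcurrentMap<String, String> regionsInTransition = new ConcurrentHashMap<>();

  /** Returns true if this caller may assign; false if another assign is already in flight. */
  boolean tryAssign(String regionName, String requester, boolean force) {
    if (force) {
      regionsInTransition.put(regionName, requester);
      return true;
    }
    // Atomic check-and-claim: only the first caller sees null.
    return regionsInTransition.putIfAbsent(regionName, requester) == null;
  }

  /** Clears the in-transition marker once the region is open (or the attempt is abandoned). */
  void assignmentDone(String regionName) {
    regionsInTransition.remove(regionName);
  }
}
```

Because the check and the claim are one atomic step, the later thread simply returns instead of racing into the ZooKeeper state check.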

Thanks for the help on this stuff.


 Balancer and ServerShutdownHandler concurrently reassigning the same region
 ---

 Key: HBASE-5816
 URL: https://issues.apache.org/jira/browse/HBASE-5816
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6
Reporter: Maryann Xue
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Attachments: HBASE-5816.patch


 The first assign thread exits with success after updating the RegionState to 
 PENDING_OPEN, while the second assign follows immediately into assign and 
 fails the RegionState check in setOfflineInZooKeeper(). This causes the 
 master to abort.
 In the below case, the two concurrent assigns occurred when AM tried to 
 assign a region to a dying/dead RS, and meanwhile the ShutdownServerHandler 
 tried to assign this region (from the region plan) spontaneously.
 2012-04-17 05:44:57,648 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b., 
 src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 (offlining)
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=hadoop05.sh.intel.com,60020,1334544902186, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0) for region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.
 2012-04-17 05:44:57,666 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase/unassigned/fe38fe31caf40b6e607a3e6bbed6404b 
 (region=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  server=hadoop05.sh.intel.com,60020,1334544902186, state=RS_ZK_REGION_CLOSING)
 2012-04-17 05:52:58,984 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=CLOSED, ts=1334612697672, 
 server=hadoop05.sh.intel.com,60020,1334544902186
 2012-04-17 05:52:58,984 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x236b912e9b3000e Creating (or updating) unassigned node for 
 fe38fe31caf40b6e607a3e6bbed6404b with OFFLINE state
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.; 
 plan=hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:19,159 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=PENDING_OPEN, ts=1334613179096, 
 server=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:59,033 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 serverName=xmlqa-clv16.sh.intel.com,60020,1334612497253, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0), trying to assign elsewhere instead; retry=0
 java.net.SocketTimeoutException: Call to /10.239.47.87:60020 failed on socket 
 timeout exception: java.net.SocketTimeoutException: 12 millis timeout 
 while waiting for channel to be ready for read. ch : 
 java.nio.channels.SocketChannel[connected local=/10.239.47.89:41302 
 

[jira] [Commented] (HBASE-5816) Balancer and ServerShutdownHandler concurrently reassigning the same region

2012-04-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257255#comment-13257255
 ] 

stack commented on HBASE-5816:
--

Should we have the servershutdownhandler and the balancer feed a single queue 
that assignment manager pulls from?  If the region is already in the queue then 
we'd favor the deliberate assignment (the balancer's?) rather than the random 
one?
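
The single-queue idea could look roughly like the sketch below. It is only an illustration under assumptions: the class is hypothetical, a `null` destination stands for "no plan, pick a random server", and a planned destination is allowed to replace an unplanned entry for the same region.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical single queue fed by both the balancer and the shutdown handler. */
final class AssignmentQueueSketch {
  // region name -> planned destination; null means "no plan, pick any server".
  private final Map<String, String> pending = new LinkedHashMap<>();

  /** Enqueues a region at most once; a planned destination replaces an unplanned entry. */
  synchronized void offer(String region, String plannedDestination) {
    if (!pending.containsKey(region)) {
      pending.put(region, plannedDestination);
    } else if (pending.get(region) == null && plannedDestination != null) {
      pending.put(region, plannedDestination);
    }
  }

  synchronized String destinationFor(String region) {
    return pending.get(region);
  }

  synchronized int size() {
    return pending.size();
  }
}
```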

 Balancer and ServerShutdownHandler concurrently reassigning the same region
 ---

 Key: HBASE-5816
 URL: https://issues.apache.org/jira/browse/HBASE-5816
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6
Reporter: Maryann Xue
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Attachments: HBASE-5816.patch


 The first assign thread exits with success after updating the RegionState to 
 PENDING_OPEN, while the second assign follows immediately into assign and 
 fails the RegionState check in setOfflineInZooKeeper(). This causes the 
 master to abort.
 In the below case, the two concurrent assigns occurred when AM tried to 
 assign a region to a dying/dead RS, and meanwhile the ShutdownServerHandler 
 tried to assign this region (from the region plan) spontaneously.
 2012-04-17 05:44:57,648 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b., 
 src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 (offlining)
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=hadoop05.sh.intel.com,60020,1334544902186, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0) for region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.
 2012-04-17 05:44:57,666 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase/unassigned/fe38fe31caf40b6e607a3e6bbed6404b 
 (region=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  server=hadoop05.sh.intel.com,60020,1334544902186, state=RS_ZK_REGION_CLOSING)
 2012-04-17 05:52:58,984 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=CLOSED, ts=1334612697672, 
 server=hadoop05.sh.intel.com,60020,1334544902186
 2012-04-17 05:52:58,984 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x236b912e9b3000e Creating (or updating) unassigned node for 
 fe38fe31caf40b6e607a3e6bbed6404b with OFFLINE state
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.; 
 plan=hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:19,159 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=PENDING_OPEN, ts=1334613179096, 
 server=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:59,033 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 serverName=xmlqa-clv16.sh.intel.com,60020,1334612497253, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0), trying to assign elsewhere instead; retry=0
 java.net.SocketTimeoutException: Call to /10.239.47.87:60020 failed on socket 
 timeout exception: java.net.SocketTimeoutException: 12 millis timeout 
 while waiting for channel to be ready for read. ch : 
 java.nio.channels.SocketChannel[connected local=/10.239.47.89:41302 
 remote=/10.239.47.87:60020]
 at 
 org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:805)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:778)
 at 
 org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:283)
 at $Proxy7.openRegion(Unknown Source)
 at 
 org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:573)
 at 
 

[jira] [Commented] (HBASE-5792) HLog Performance Evaluation Tool

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255644#comment-13255644
 ] 

stack commented on HBASE-5792:
--

@Todd Thanks.  I removed TestHLogBench over in HBASE-5808.  The new test 
verifies and actually writes a log, which TestHLogBench does not.

 HLog Performance Evaluation Tool
 

 Key: HBASE-5792
 URL: https://issues.apache.org/jira/browse/HBASE-5792
 Project: HBase
  Issue Type: Test
  Components: wal
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
  Labels: performance, wal
 Fix For: 0.96.0

 Attachments: HBASE-5792-v0.patch, HBASE-5792-v1.patch, 
 HBASE-5792-v2.patch, verify.txt, verify.txt


 Related to HDFS-3280 and the HBase WAL slowdown on 0.23+
 It would be nice to have a simple tool like HFilePerformanceEvaluation, ...
 to be able to check easily the HLog performance.





[jira] [Commented] (HBASE-5788) Move Dynamic Metrics storage off of HRegion.

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255645#comment-13255645
 ] 

stack commented on HBASE-5788:
--

bq. TestRegionServerMetrics covers most of the functionality of the new class 
but I can create a new set of more explicit tests if you think that is needed.

Probably no need if we have some coverage already.  Just want to make sure the 
class does its basic contract.  Easier figuring this stuff in a unit test than 
up on a cluster, yadda, yadda, you know what I'm at.

 Move Dynamic Metrics storage off of HRegion.
 

 Key: HBASE-5788
 URL: https://issues.apache.org/jira/browse/HBASE-5788
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5788-0.patch, HBASE-5788-1.patch, 
 HBASE-5788-2.patch


 HRegion right now has the responsibility of storing static counts and latency 
 numbers for use by the metrics package.  Since these maps are incremented and 
 set from lots of places it makes adding functionality hard.
  
 So move the metrics functionality into SchemaMetrics making it more than just 
 a class for naming.  The next step will be to simplify the api exposed so 
 that using it will be easier.





[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255667#comment-13255667
 ] 

stack commented on HBASE-5620:
--

@Jimmy So every invocation will throw an exception?

{code}
+// For protobuf protocols, ServiceException is expected
{code}

What's the Set in Invocation doing?  You add it but don't seem to access it?

I like the removal of a call method down through the rpc stack.

 Convert the client protocol of HRegionInterface to PB
 -

 Key: HBASE-5620
 URL: https://issues.apache.org/jira/browse/HBASE-5620
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5620-sec.patch, hbase-5620_v3.patch, 
 hbase-5620_v4.patch, hbase-5620_v4.patch








[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255778#comment-13255778
 ] 

stack commented on HBASE-5620:
--

I made HBASE-5810 to apply this Jimmy.   Good stuff.





[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255877#comment-13255877
 ] 

stack commented on HBASE-5782:
--

Looking at Lars' patch.

On your 1. and 2. above, apparently the append is also expensive according to 
Dhruba.  Just saying.

Also on "...might lead to sync be issued multiple time when only one was 
necessary (it seems the same race condition existed before)":

Yes, this we have always had.

I'd say kill this stuff... it looks like rubbish to me:

{code}
+  syncBatchSize.addAndGet(doneUpto - this.syncedTillHere);
{code}

It's not read by anyone, looks like the math can go wonky, and when it is read, 
it's set back to zero, which is probably unexpected.  Kill it I'd say.

I think this is ok:

{code}
+  this.syncedTillHere = Math.max(this.syncedTillHere, doneUpto);
{code}

but this is racy

{code}
   long doneUpto = this.unflushedEntries.get();
{code}

It could be low in number; i.e. we could be putting into hdfs more edits than 
the current value of unflushedEntries if we read after an edit has been added 
to the queue but before the above is updated.  Is that ok?  It's ok if this is a 
little sloppy, especially if it underreports?
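
For comparison, a monotonic high-water mark can also be kept without a lock. This is a sketch of a different technique from the patch (which assigns a plain field with `Math.max`); the class name is hypothetical and `AtomicLong.accumulateAndGet` requires Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only; not the HLog code. */
final class SyncHighWaterMarkSketch {
  private final AtomicLong syncedTillHere = new AtomicLong();

  /** Advances the mark to doneUpto if that is higher; never moves it backwards. */
  long markSynced(long doneUpto) {
    return syncedTillHere.accumulateAndGet(doneUpto, Math::max);
  }

  long get() {
    return syncedTillHere.get();
  }
}
```

A stale (too low) `doneUpto` then only causes an underreport, which matches the "a little sloppy is ok" reading above.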

As a tactic for 0.94, sure, do this for 0.94, though I like Todd's fix better.  
The verification tool will help you figure if this slows stuff much and if we 
are writing out of order.  Let me know if you want me to run it for you.  Let 
me add in log rolling too as per Todd's suggestion.

 Edits can be appended out of seqid order since HBASE-4487
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt


 Create a table with 1000 splits; after the region assignment, kill the 
 regionserver which contains the META table.
 A few regions are missing after the log splitting and region assignment. 
 The HBCK report shows that multiple region holes were created.
 The same scenario was verified multiple times in 0.92.1 with no issues.





[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255885#comment-13255885
 ] 

stack commented on HBASE-5782:
--

Can we try and make Todd's work?  It does some nice cleanup.





[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255937#comment-13255937
 ] 

stack commented on HBASE-5782:
--

bq. We won't write more into the log (once we take the pendingWrites they are 
gone

Is that so?  We don't get the pendingWrites until we are under the flush lock 
but we've taken doneUpTo before we go under the lock.

 Edits can be appended out of seqid order since HBASE-4487
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt


 Create a table with 1000 splits; after the region assignment, kill the 
 regionserver which contains the META table.
 A few regions are missing after the log splitting and region assignment. 
 The HBCK report shows that multiple region holes were created.
 The same scenario was verified multiple times in 0.92.1 with no issues.





[jira] [Commented] (HBASE-5812) Add log rolling to HLogPerformanceEvaluation

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255946#comment-13255946
 ] 

stack commented on HBASE-5812:
--

Verify can deal w/ multiple logs and verify all logs were written in sequence 
id order.

 Add log rolling to HLogPerformanceEvaluation
 

 Key: HBASE-5812
 URL: https://issues.apache.org/jira/browse/HBASE-5812
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: 5812.txt


 Add the ability to ask HLogPerformanceEvaluation to roll logs while it's 
 running.





[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255959#comment-13255959
 ] 

stack commented on HBASE-5782:
--

@Lars But it won't matter, right, since the map we are getting from is not under 
our new flush lock?  I think it's harmless.  We will undercount what's been 
flushed I believe; we'll not overcount (and so possibly lose data)?

I added log rolling and tested your patch using HLogPerformanceEvaluation.  It 
'works' at least.  If you want me to compare before and after, just say.






[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256043#comment-13256043
 ] 

stack commented on HBASE-5782:
--

I made the hlog perf tool work on hdfs and ran some basic tests.  Both Todd's 
and Lars' patches seem faster than what we have currently.

Running w/o a fix on hdfs w/ current trunk, I have to disable verify because it 
fails (verify happens after we print out test timings).

$ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.HLogPerformanceEvaluation -conf /home/stack/hadoop-conf/core-site.xml -path hdfs://sv4r11s38:7000/tmp -threads 100 -roll 1

12/04/17 22:58:28 INFO wal.HLogPerformanceEvaluation: Summary: threads=100, iterations=1 took 100.630s 9937.395ops/s
12/04/17 23:00:33 INFO wal.HLogPerformanceEvaluation: Summary: threads=100, iterations=1 took 94.945s 10532.413ops/s

Todd's patch on hdfs:

$ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.HLogPerformanceEvaluation -conf /home/stack/hadoop-conf/core-site.xml -path hdfs://sv4r11s38:7000/tmp -threads 100 -roll 1 -verify

12/04/17 22:53:35 INFO wal.HLogPerformanceEvaluation: Summary: threads=100, iterations=1 took 81.202s 12314.967ops/s

Lars' patch:

12/04/17 23:07:08 INFO wal.HLogPerformanceEvaluation: Summary: threads=100, iterations=1 took 76.800s 13020.833ops/s

For Todd and Lars, both pass verify, which checks that seqids are ordered and 
that we wrote as much as we think we did.
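
The two checks that verify is described as making (seqids in order, and as 
many edits as expected) can be pictured with a small sketch.  This is not the 
tool's code: SeqIdVerifier and its method are hypothetical names, and the log 
is modeled as a plain array of sequence ids.

```java
/** Hypothetical sketch of a WAL verify pass; not the HLogPerformanceEvaluation code. */
final class SeqIdVerifier {
  /**
   * Returns the number of edits read if every sequence id is strictly greater
   * than the one before it; throws as soon as an id is out of order.
   */
  static long verify(long[] seqIds) {
    long previous = Long.MIN_VALUE; // assumes real sequence ids are larger than this
    long count = 0;
    for (long seqId : seqIds) {
      if (seqId <= previous) {
        throw new IllegalStateException("out of order: " + seqId + " after " + previous);
      }
      previous = seqId;
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    long expected = 5; // what the writers believe they wrote
    long actual = verify(new long[] {1, 2, 3, 4, 5});
    System.out.println(actual == expected ? "verified " + actual + " edits" : "count mismatch");
  }
}
```

Comparing the returned count with what the writers believe they wrote covers 
the second half of the check.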






[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256044#comment-13256044
 ] 

stack commented on HBASE-5782:
--

OK on putting Lars' patch into 0.94.






[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256085#comment-13256085
 ] 

stack commented on HBASE-5782:
--

I tried to reproduce what JD is seeing on cluster using same sized keys and 
values but Lars' patch completes before Todd's.  My test run may be too small.  I 
did thread dumps during Lars' and Todd's runs.  Both seem to be down in sync 
mostly, down here 
'org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.waitForAckedSeqno(DFSClient.java:3789)',
 otherwise hung up on sync points around wal append/sync.

Let's go w/ the Lars patch because it makes minimal changes.  As per Todd, let's 
file an issue to clean up this stuff with his patch as seed.  From J-D's work, any 
greased lightning we can apply around hlog append makes for a big difference in 
overall write throughput.






[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256087#comment-13256087
 ] 

stack commented on HBASE-5782:
--

@Lars As to your patch being 'slower' with fewer threads, I think you can't do 
such a compare.  W/o your patch, we are broken.






[jira] [Commented] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256177#comment-13256177
 ] 

stack commented on HBASE-5790:
--

This patch requires zk 3.4.x, right?  But it doesn't check which version is running 
before it goes and uses this new Transaction feature (I'm not sure you can even 
ask zk for its ensemble version from the client).  If a user puts 3.3.x under 
hbase, will we hang doing this call?
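
One way to picture the missing guard is a plain version comparison.  The 
sketch below is only an illustration: ZkVersionGate and its methods are 
hypothetical, the 3.4.0 threshold is an assumption taken from the "3.4.x" 
wording above, and how the server version string would be obtained is left 
open, as in the comment.

```java
/** Hypothetical helper: decides from a version string whether transactions may be used. */
final class ZkVersionGate {
  /** Compares dotted numeric versions such as "3.3.5" and "3.4.0". */
  static int compare(String a, String b) {
    String[] left = a.split("\\.");
    String[] right = b.split("\\.");
    int n = Math.max(left.length, right.length);
    for (int i = 0; i < n; i++) {
      // A missing component counts as zero, so "3.4" equals "3.4.0".
      int l = i < left.length ? Integer.parseInt(left[i]) : 0;
      int r = i < right.length ? Integer.parseInt(right[i]) : 0;
      if (l != r) {
        return Integer.compare(l, r);
      }
    }
    return 0;
  }

  /** Assumption: transactions need a 3.4.x or later server. */
  static boolean supportsMulti(String serverVersion) {
    return compare(serverVersion, "3.4.0") >= 0;
  }

  public static void main(String[] args) {
    System.out.println("3.3.5 -> " + supportsMulti("3.3.5"));
    System.out.println("3.4.3 -> " + supportsMulti("3.4.3"));
  }
}
```

A caller could fall back to the old recursive delete when the gate returns 
false rather than risk a hang.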

 ZKUtil deleteRecursively should be a recoverable operation
 --

 Key: HBASE-5790
 URL: https://issues.apache.org/jira/browse/HBASE-5790
 Project: HBase
  Issue Type: Improvement
Reporter: Jesse Yates
Assignee: Jesse Yates
  Labels: zookeeper
 Fix For: 0.96.0, 0.94.1

 Attachments: java_HBASE-5790-v1.patch, java_HBASE-5790.patch


 As of 3.4.3, ZooKeeper has full multi-operation transactions. This means 
 we can wholesale delete chunks of the zk tree and ensure that we don't have 
 any pesky recursive delete issues where we delete the children of a node, but 
 then a child joins before deletion of the parent. Even without transactions, 
 this should be the behavior, but it is possible to make it much cleaner now 
 that we have this new feature in zk.
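
 As a rough illustration of the ordering involved (not the attached patch), 
 the sketch below computes the list of paths to delete with every child listed 
 before its parent.  The in-memory map stands in for the zk tree, and 
 DeleteOrderSketch is a hypothetical name.  With a 3.4 client, such a list 
 would presumably be turned into Op.delete entries and submitted through 
 ZooKeeper.multi so that the whole batch succeeds or fails together.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: orders a subtree for deletion, children before parents. */
final class DeleteOrderSketch {
  /** The map goes from a full path to the names of its direct children. */
  static List<String> deleteOrder(Map<String, List<String>> children, String root) {
    List<String> ordered = new ArrayList<>();
    collect(children, root, ordered);
    return ordered;
  }

  private static void collect(Map<String, List<String>> children, String path, List<String> out) {
    List<String> names = children.get(path);
    if (names == null) {
      names = Collections.emptyList(); // leaf node: nothing beneath it
    }
    for (String name : names) {
      collect(children, path + "/" + name, out);
    }
    out.add(path); // the parent goes last, after everything beneath it
  }

  public static void main(String[] args) {
    Map<String, List<String>> tree = new HashMap<>();
    tree.put("/hbase/table", Arrays.asList("a", "b"));
    tree.put("/hbase/table/a", Arrays.asList("x"));
    System.out.println(deleteOrder(tree, "/hbase/table"));
  }
}
```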





[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256185#comment-13256185
 ] 

stack commented on HBASE-5782:
--

Want me to make a test that does simple three threads with just a few edits ... 
say 1k... and then verifies all in order and all edits written so we notice 
regression?
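
The shape of such a test might look like the following sketch.  It is a toy 
model, not HLog: OrderedAppendSketch is a hypothetical class in which the 
sequence id is assigned and the edit appended under the same lock, which is 
the property a regression test would be guarding.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a log where id assignment and append share one lock (hypothetical). */
final class OrderedAppendSketch {
  private final Object lock = new Object();
  private final List<Long> log = new ArrayList<>();
  private long nextSeqId = 1;

  /** Assigns the next sequence id and appends it while holding the same lock. */
  void append() {
    synchronized (lock) {
      log.add(nextSeqId++);
    }
  }

  /** Runs the writer threads to completion and returns a snapshot of the log. */
  static List<Long> run(int threads, int editsPerThread) {
    OrderedAppendSketch wal = new OrderedAppendSketch();
    List<Thread> writers = new ArrayList<>();
    for (int t = 0; t < threads; t++) {
      Thread writer = new Thread(() -> {
        for (int i = 0; i < editsPerThread; i++) {
          wal.append();
        }
      });
      writers.add(writer);
      writer.start();
    }
    for (Thread writer : writers) {
      try {
        writer.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for writers", e);
      }
    }
    synchronized (wal.lock) {
      return new ArrayList<>(wal.log);
    }
  }

  /** True if the ids are exactly 1, 2, 3, ... with no gaps or reordering. */
  static boolean inOrder(List<Long> ids) {
    long expected = 1;
    for (long id : ids) {
      if (id != expected++) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Long> ids = run(3, 1000);
    System.out.println("edits=" + ids.size() + " inOrder=" + inOrder(ids));
  }
}
```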






[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254794#comment-13254794
 ] 

stack commented on HBASE-5795:
--

v2 works out on a cluster for me

 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: 5795-v2.txt, 5795.unittest.txt


 This commit broke our 0.92/0.94 compatibility:
 {code}
 
 r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line
 HBASE-3927 display total uncompressed byte size of a region in web UI
 {code}
 I just tried the new RC for 0.94.  I brought up a 0.94 master on a 0.92 
 cluster and rather than just digest version 1 of the HServerLoad, I get this:
 {code}
 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to read call parameters for client 10.4.14.38
 java.io.IOException: Error in readFields
     at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684)
     at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)
 Caused by: A record version mismatch occured. Expecting v2, found v1
     at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
     at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379)
     at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686)
     at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681)
     ... 9 more
 {code}
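
 The trace shows a v2 reader rejecting a v1 record outright.  A reader that 
 "digests" both versions could look roughly like the sketch below.  This is 
 not the attached 5795-v2 patch: TolerantRegionLoad and its field names are 
 hypothetical, and which field is new in v2 is an assumption made for the 
 example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical record whose reader accepts both v1 and v2 encodings. */
final class TolerantRegionLoad {
  static final byte CURRENT_VERSION = 2;

  int storefileSizeMB;    // present in v1 and v2
  int uncompressedSizeMB; // assumed to exist only from v2 on

  /** Reads the leading version byte, then only the fields that version carries. */
  void readFields(DataInput in) throws IOException {
    byte version = in.readByte();
    if (version < 1 || version > CURRENT_VERSION) {
      throw new IOException("Unsupported version " + version);
    }
    storefileSizeMB = in.readInt();
    // Older writers never sent this field, so default it instead of failing.
    uncompressedSizeMB = version >= 2 ? in.readInt() : 0;
  }

  /** Test helper: serializes a record the way a writer of the given version would. */
  static byte[] encode(int version, int storefileSizeMB, int uncompressedSizeMB) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeByte(version);
      out.writeInt(storefileSizeMB);
      if (version >= 2) {
        out.writeInt(uncompressedSizeMB);
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Test helper: deserializes a record from bytes. */
  static TolerantRegionLoad decode(byte[] data) {
    try {
      TolerantRegionLoad load = new TolerantRegionLoad();
      load.readFields(new DataInputStream(new ByteArrayInputStream(data)));
      return load;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    TolerantRegionLoad old = decode(encode(1, 10, 0));
    System.out.println("v1 read ok, storefileSizeMB=" + old.storefileSizeMB);
  }
}
```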





[jira] [Commented] (HBASE-5798) NPE running hbck on 0.94 out of reportTablesInFlux

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254811#comment-13254811
 ] 

stack commented on HBASE-5798:
--

Error is transient.  Subsequent runs worked.

 NPE running hbck on 0.94 out of reportTablesInFlux
 --

 Key: HBASE-5798
 URL: https://issues.apache.org/jira/browse/HBASE-5798
 Project: HBase
  Issue Type: Bug
Reporter: stack

 Got this playing w/ hbck going against the 0.94RC:
 {code}
 12/04/16 17:03:14 INFO util.HBaseFsck: getHTableDescriptors == tableNames = []
 Exception in thread main java.lang.NullPointerException
     at org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:553)
     at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:344)
     at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:380)
     at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3033)
 {code}





[jira] [Commented] (HBASE-5792) HLog Performance Evaluation Tool

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254827#comment-13254827
 ] 

stack commented on HBASE-5792:
--

This is great Matteo.  We need this.  Yeah, agree, this tool will have most 
value if it puts nothing but a lone region (and WAL).

A few minor points below:

Missing audience annotations.

Do you need these?  IIRC, the default exists w/o need of definition:

{code}
+  public HLogPerformanceEvaluation() {
+  }
{code}

You do it in another place at least too.
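
For reference, Java supplies a no-argument constructor automatically when a 
class declares no constructor at all, which is why the empty one can be 
dropped.  A minimal sketch, with ImplicitCtorSketch as a hypothetical 
stand-in for the tool class:

```java
/** Hypothetical class that declares no constructor of its own. */
final class ImplicitCtorSketch {
  int threads = 3;

  public static void main(String[] args) {
    // Compiles because the compiler supplies ImplicitCtorSketch() automatically.
    ImplicitCtorSketch sketch = new ImplicitCtorSketch();
    System.out.println(sketch.threads);
  }
}
```

The implicit constructor disappears as soon as any other constructor is 
declared, so the explicit empty one is only needed in that case.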

No harm adding a bit of class doc on HLogPutBenchmark

You don't want to use a command parser?



 HLog Performance Evaluation Tool
 

 Key: HBASE-5792
 URL: https://issues.apache.org/jira/browse/HBASE-5792
 Project: HBase
  Issue Type: Test
  Components: wal
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
  Labels: performance, wal
 Attachments: HBASE-5792-v0.patch, HBASE-5792-v1.patch


 Related to HDFS-3280 and the HBase WAL slowdown on 0.23+
 It would be nice to have a simple tool like HFilePerformanceEvaluation, ...
 to be able to check easily the HLog performance.





[jira] [Commented] (HBASE-5792) HLog Performance Evaluation Tool

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254869#comment-13254869
 ] 

stack commented on HBASE-5792:
--

@Matteo NVM.  I want to use this tool now so I'll take care of the above.  Good 
stuff.

 HLog Performance Evaluation Tool
 

 Key: HBASE-5792
 URL: https://issues.apache.org/jira/browse/HBASE-5792
 Project: HBase
  Issue Type: Test
  Components: wal
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
  Labels: performance, wal
 Attachments: HBASE-5792-v0.patch, HBASE-5792-v1.patch, 
 HBASE-5792-v2.patch


 Related to HDFS-3280 and the HBase WAL slowdown on 0.23+
 It would be nice to have a simple tool like HFilePerformanceEvaluation, ...
 to be able to check easily the HLog performance.





[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254928#comment-13254928
 ] 

stack commented on HBASE-5795:
--

No.  Please include the unit test on commit.

 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.94.0, 0.96.0

 Attachments: 5795-v2.txt, 5795.unittest.txt







[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255034#comment-13255034
 ] 

stack commented on HBASE-5782:
--

I just committed a tool over on HBASE-5792.  It tests WALs. If you pass the 
-verify flag, you'll see that even w/ just three threads, sequence ids are out 
of order.  Could be useful verifying whatever fix we have here.

 Not all the regions are getting assigned after the log splitting.
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5782.patch


 Create a table with 1000 splits; after the region assignment, kill the 
 regionserver which contains the META table.
 A few regions are missing after the log splitting and region assignment. 
 The HBCK report shows that multiple region holes were created.
 The same scenario was verified multiple times in 0.92.1 with no issues.





[jira] [Commented] (HBASE-5634) document how to use uberhbck

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255089#comment-13255089
 ] 

stack commented on HBASE-5634:
--

+1

Fix "The using the -details option will report"

I'm glad you don't call it uberhbck in the doc (well, you joke about it -- 
that's OK)





 document how to use uberhbck
 

 Key: HBASE-5634
 URL: https://issues.apache.org/jira/browse/HBASE-5634
 Project: HBase
  Issue Type: Improvement
  Components: documentation, hbck
Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: uber hbck docs.pdf


 The updated hbck from HBASE-5128 introduces many new repair options and, as a 
 side effect, offers many new opportunities to durably shoot oneself in the 
 foot.  Docs need to be written and added to the ref guide to explain its 
 usage and ramifications and discuss repair strategies.





[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255201#comment-13255201
 ] 

stack commented on HBASE-5782:
--

Not sure I follow, but I do know this patch is more ambitious than what I was at.

+ You remove the 'other' sequence numbering system, unflushedEntries?  That 
looks good.
+ Are asserts on by default?  We disabled them a while back I believe?  You run 
w/ asserts? (Yeah, that's a good thing to test -- should you use your guava test 
instead?)
+ It's ugly that we call it hlogFlush but internally we do appends (that's not 
your change).
+ I agree that the reset of the pending writes linked list needs to be done 
under the synchronization held by hlogFlush.
+ I like how you do pushback of edits if we fail in hlogFlush.
+ On this thing: 

{code}
+  // TODO: restore metric syncBatchSize.addAndGet(doneUpto - this.syncedTillHere);
{code}

It's not used anywhere and the math looked dodgy... then when you read it, it 
gets set to zero, so I'm not so sure it is of any use.
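
The two points about resetting the pending writes under the lock and pushing 
edits back on failure can be pictured with a toy model.  This is not the code 
in the patch: PendingWritesSketch, Sink, and the method names are 
hypothetical, and edits are plain strings.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Toy model of buffered WAL edits; names are hypothetical, not HLog's. */
final class PendingWritesSketch {
  /** Stand-in for the underlying log writer. */
  interface Sink {
    void write(List<String> edits) throws IOException;
  }

  private final Object lock = new Object();
  private List<String> pending = new LinkedList<>();

  void append(String edit) {
    synchronized (lock) {
      pending.add(edit);
    }
  }

  /** Takes the pending list and writes it under one lock; restores it on failure. */
  boolean flush(Sink sink) {
    synchronized (lock) {
      List<String> batch = pending;
      pending = new LinkedList<>(); // the reset happens while the lock is held
      try {
        sink.write(batch);
        return true;
      } catch (IOException e) {
        // Push the unwritten edits back, ahead of anything in the new list.
        // The new list is still empty here because the lock has not been released.
        batch.addAll(pending);
        pending = batch;
        return false;
      }
    }
  }

  List<String> snapshot() {
    synchronized (lock) {
      return new ArrayList<>(pending);
    }
  }

  public static void main(String[] args) {
    PendingWritesSketch wal = new PendingWritesSketch();
    wal.append("a");
    wal.append("b");
    boolean ok = wal.flush(edits -> { throw new IOException("simulated failure"); });
    System.out.println("flushed=" + ok + " stillPending=" + wal.snapshot());
  }
}
```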


 Edits can be appended out of seqid order since HBASE-4487
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 5782-sketch.txt, 5782.txt, HBASE-5782.patch


 Create a table with 1000 splits; after the region assignment, kill the 
 regionserver which contains the META table.
 A few regions are missing after the log splitting and region assignment. 
 The HBCK report shows that multiple region holes were created.
 The same scenario was verified multiple times in 0.92.1 with no issues.





[jira] [Commented] (HBASE-5802) Change the default metrics class to NullContextWithUpdateThread

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255295#comment-13255295
 ] 

stack commented on HBASE-5802:
--

So, enable jmx emissions?  Would suggest that your patch include a comment 
explaining what the exotic-sounding NullContextWithUpdateThread does.  Maybe 
copy the class comment into your patch somewhere:

{code}
 * A null context which has a thread calling
 * periodically when monitoring is started. This keeps the data sampled
 * correctly.
 * In all other respects, this is like the NULL context: No data is emitted.
 * This is suitable for Monitoring systems like JMX which reads the metrics
 *  when someone reads the data from JMX.
 *
 * The default impl of start and stop monitoring:
 *  is the AbstractMetricsContext is good enough.
{code}

Maybe update the reference guide too especially if you are changing default.

Good stuff E.

 Change the default metrics class to NullContextWithUpdateThread
 ---

 Key: HBASE-5802
 URL: https://issues.apache.org/jira/browse/HBASE-5802
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5802-0.patch


 Since lots more metrics are being placed into the Dynamic metrics bucket 
 changing the default class to NullContextWithUpdateThread seems like it might 
 be worth it.





[jira] [Commented] (HBASE-3614) Expose per-region request rate metrics

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255304#comment-13255304
 ] 

stack commented on HBASE-3614:
--

FYI 100 chars per line max and spaces around operators (this won't fly: 
cfSetConsistent?cfSet:null)

I like how you are moving metrics stuff from HRegion out to a region-scoped 
metrics class.

'+public class RegionMetrics {' needs a class comment saying what it's all 
about.   Does the class need to be public?  Can it be scoped to this package 
only?

Collect all the data members at the top of the class. That's what's usually done 
in this code base.

So put the tablename etc. in RegionMetrics before the constructor etc. rather 
than after.

Does generateRegionMetricsPrefix need to be public?

What do these new metrics look like?  Is this all it takes to expose them?

Some regionnames are going to be really long.  Should you use the region 
encoded name instead of the full name?  Do you think we even need the table 
name as prefix?
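
To make the length concern concrete, here is a sketch of a prefix built from 
a fixed-length hash of the region name instead of the full name.  Everything 
in it is an assumption for illustration: RegionMetricsPrefixSketch is a 
hypothetical class, the "region.<encoded>." layout is made up, and MD5 is 
used only as an example of a fixed-length encoding.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of a short, fixed-length metrics key prefix. */
final class RegionMetricsPrefixSketch {
  /** Hashes the full region name to 32 hex characters (MD5 is an assumption). */
  static String encodedName(String fullRegionName) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      byte[] digest = md5.digest(fullRegionName.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  /** Builds the key prefix; the layout is illustrative only. */
  static String prefix(String fullRegionName) {
    return "region." + encodedName(fullRegionName) + ".";
  }

  public static void main(String[] args) {
    String name = "usertable,some-very-long-start-key-0000000000000000,1334600000000";
    System.out.println(prefix(name).length() + " chars instead of " + name.length());
  }
}
```

The prefix length stays constant however long the start key in the region 
name grows.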

Good stuff Elliott.

 Expose per-region request rate metrics
 --

 Key: HBASE-3614
 URL: https://issues.apache.org/jira/browse/HBASE-3614
 Project: HBase
  Issue Type: Improvement
  Components: metrics, regionserver
Reporter: Gary Helmling
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-3614-0.patch, HBASE-3614-1.patch


 We currently export metrics on request rates for each region server, and this 
 can help with identifying uneven load at a high level. But once you see a 
 given server under high load, you're forced to extrapolate based on your 
 application patterns and the data it's serving what the likely culprit is.  
 This can and should be much easier if we just exported request rate metrics 
 per-region on each server.
 Dynamically updating the metrics keys based on assigned regions may pose some 
 minor challenges, but this seems a very valuable diagnostic tool to have 
 available.





[jira] [Commented] (HBASE-5733) AssignmentManager#processDeadServersAndRegionsInTransition can fail with NPE.

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255313#comment-13255313
 ] 

stack commented on HBASE-5733:
--

Patch looks good to me.  I like the test.  The LOG.fatal is redundant since
the master abort already does a log fatal.  Otherwise the patch is good.

 AssignmentManager#processDeadServersAndRegionsInTransition can fail with NPE.
 -

 Key: HBASE-5733
 URL: https://issues.apache.org/jira/browse/HBASE-5733
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Attachments: HBASE-5733.patch, HBASE-5733.patch


 Found while going through the code...
 AssignmentManager#processDeadServersAndRegionsInTransition can fail with an
 NPE because it iterates directly over the nodes returned by
 listChildrenAndWatchForNewChildren without checking for null.
 We need a null check here too, as in other places.
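
 A rough sketch of the kind of guard being described. The `listChildren` helper below is a hypothetical stand-in for a child listing that may return null; it is not the real ZooKeeper utility and says nothing about how the attached patch handles it:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NullSafeChildren {
  /**
   * Stand-in for a child listing that may return null, for example
   * when the parent node does not exist. Hypothetical helper.
   */
  static List<String> listChildren(boolean parentExists) {
    if (!parentExists) {
      return null;
    }
    List<String> nodes = new ArrayList<>();
    nodes.add("region-a");
    nodes.add("region-b");
    return nodes;
  }

  /** Treats a null listing as "no children" so callers can iterate safely. */
  static List<String> orEmpty(List<String> nodes) {
    return nodes == null ? Collections.<String>emptyList() : nodes;
  }

  /** Iterates the listing without risking a NullPointerException. */
  static int countNodes(boolean parentExists) {
    int count = 0;
    for (String node : orEmpty(listChildren(parentExists))) {
      count++;
    }
    return count;
  }
}
```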





[jira] [Commented] (HBASE-5788) Move Dynamic Metrics storage off of HRegion.

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255323#comment-13255323
 ] 

stack commented on HBASE-5788:
--

Is MetricsStorage for Region metrics only?  If so, call it RegionMetrics?  Or
maybe it's generic metrics storage for this package?  If so, the name is right.
Should it be down in the metrics package?  Regardless, the new class needs a
class comment explaining class scope.  Does it have to be public?  Can it be
private to the package at least?  Lines > 100 chars.  Why are data members in
this new class public rather than private?  Even if they are static.  And
static data members probably ain't a good idea because then there is only one
per JVM and there can be many regionservers in the one JVM; e.g. in testing.
Yeah, do its methods need to be public?  Can these be package private?
Hmm... maybe they need to be public because they're called from the metrics
subpackage?  I like all the code that comes out of HRegion.  That's good.  And
no harm in a basic unit test that your new class is basically working.  Any
worries w/ concurrent access?   Good stuff Elliott.
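
To make the static-versus-instance and concurrency points concrete, here is a sketch under assumptions. `MetricsStorageSketch` and its methods are hypothetical names, not the class in the patch; it keeps counters in an instance field so that two regionservers in one JVM do not share state, and it relies on `ConcurrentHashMap` plus `AtomicLong` for concurrent updates:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Per-instance counter storage; hypothetical sketch, not the class from the patch. */
class MetricsStorageSketch {
  // Instance field rather than static, so each owner gets its own map.
  private final ConcurrentMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

  /** Adds delta to the named counter, creating it on first use. */
  long increment(String key, long delta) {
    AtomicLong counter = counters.get(key);
    if (counter == null) {
      AtomicLong fresh = new AtomicLong();
      AtomicLong existing = counters.putIfAbsent(key, fresh);
      counter = existing == null ? fresh : existing;
    }
    return counter.addAndGet(delta);
  }

  /** Returns the current value, or 0 if the counter was never incremented. */
  long get(String key) {
    AtomicLong counter = counters.get(key);
    return counter == null ? 0L : counter.get();
  }
}
```

A basic unit test of the sort suggested above could simply create two instances and check that incrementing one leaves the other at zero.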

 Move Dynamic Metrics storage off of HRegion.
 

 Key: HBASE-5788
 URL: https://issues.apache.org/jira/browse/HBASE-5788
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5788-0.patch, HBASE-5788-1.patch, 
 HBASE-5788-2.patch


 HRegion right now has the responsibility of storing static counts and latency 
 numbers for use by the metrics package.  Since these maps are incremented and 
 set from lots of places it makes adding functionality hard.
  
 So move the metrics functionality into SchemaMetrics making it more than just 
 a class for naming.  The next step will be to simplify the api exposed so 
 that using it will be easier.





[jira] [Commented] (HBASE-3585) isLegalFamilyName() can throw ArrayOutOfBoundException

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255325#comment-13255325
 ] 

stack commented on HBASE-3585:
--

Have a patch Uma?

 isLegalFamilyName() can throw ArrayOutOfBoundException
 --

 Key: HBASE-3585
 URL: https://issues.apache.org/jira/browse/HBASE-3585
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1
Reporter: Prakash Khemani
Priority: Minor

 org.apache.hadoop.hbase.HColumnDescriptor.isLegalFamilyName(byte[]) accesses 
 byte[0] w/o first checking the array length.
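
 A sketch of the missing guard, assuming a standalone helper rather than the real HColumnDescriptor method. The specific validity rules below (no leading '.', no ':' or control characters) are assumptions for illustration; the point is only that the length check comes before any read of the first byte:

```java
class FamilyNameCheck {
  /**
   * Returns the name if it is acceptable, otherwise throws.
   * The validity rules are assumptions for illustration.
   */
  static byte[] checkFamilyName(byte[] name) {
    // Guard first: reading name[0] on an empty array would throw
    // ArrayIndexOutOfBoundsException.
    if (name == null || name.length == 0) {
      throw new IllegalArgumentException("Family name can not be empty");
    }
    if (name[0] == '.') {
      throw new IllegalArgumentException("Family name can not start with '.'");
    }
    for (byte b : name) {
      if (Character.isISOControl(b) || b == ':') {
        throw new IllegalArgumentException("Illegal character in family name");
      }
    }
    return name;
  }
}
```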





[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-16 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255344#comment-13255344
 ] 

stack commented on HBASE-5782:
--

@Ram Read over the HLog comments.  It's got stuff on why we want sequenceids in
order and where we have a dependency on their being ordered; at least they are
notes on how we used to think.  I was wondering too about ordering today.  If
we didn't have to have order, then it would make stuff like running a
regionserver with N WALs a bit easier, and we don't try to guarantee sequence
order when replicating.  But I'm wary of undoing order without our giving
the issue a bunch of thought first (your patch above makes me nervous).

On the patch, Todd's seems way superior to me. His is more radical, removing
what seems to be a confusing sequenceid double, and it's clearer what's going
on.

Oh, and thanks to you fellas for finding this one.  It's a good one.

 Edits can be appended out of seqid order since HBASE-4487
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, 
 HBASE-5782.patch


 Create a table with 1000 splits. After the region assignment, kill the
 regionserver which hosts the META table.
 A few regions are missing after the log splitting and region assignment.
 The HBCK report shows that multiple region holes got created.
 The same scenario was verified multiple times on 0.92.1 with no issues.





[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-15 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254441#comment-13254441
 ] 

stack commented on HBASE-5795:
--

I looked at Ted's patch.  That should do it.  See if it makes the unit test 
pass I'd say.  I can test on cluster tomorrow morning (will also finish my 
rolling restart and kill of meta on a cluster w/ 1k regions too...)

 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: 5795-v1.txt, 5795.unittest.txt


 This commit broke our 0.92/0.94 compatibility:
 {code}
 
 r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line
 HBASE-3927 display total uncompressed byte size of a region in web UI
 {code}
 I just tried the new RC for 0.94.  I brought up a 0.94 master on a 0.92 
 cluster and rather than just digest version 1 of the HServerLoad, I get this:
 {code}
 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to read call parameters for client 10.4.14.38
 java.io.IOException: Error in readFields
     at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684)
     at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513)
     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)
 Caused by: A record version mismatch occured. Expecting v2, found v1
     at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
     at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379)
     at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686)
     at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681)
     ... 9 more
 {code}





[jira] [Commented] (HBASE-5747) Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a subdirectory of target/test

2012-04-15 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254520#comment-13254520
 ] 

stack commented on HBASE-5747:
--

But I didn't change anything!  Does that mean Jon fixed it w/ his hbck commit?

@Jon Let me ask @Mikhail why he went to 100 retries...

 Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a 
 subdirectory of target/test
 

 Key: HBASE-5747
 URL: https://issues.apache.org/jira/browse/HBASE-5747
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5474.txt, 5474v2.txt, 5474v3 (1).txt, 5474v3.txt, 
 5708v4.txt, 5708v4.txt


 Forward port as much as we can of Mikhail's hard-won test cleanups over on
 0.89 branch.  Will improve our being able to run unit tests in parallel.  He
 also found a few bugs.





[jira] [Commented] (HBASE-5747) Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a subdirectory of target/test

2012-04-14 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254213#comment-13254213
 ] 

stack commented on HBASE-5747:
--

@Jon I can put it back.  I pulled in that from original patch.  Let me try 
setting it back.  See if that helps w/ test hangs.

I ran TestSchemaMetrics locally and it runs fine.  It also does not seem to be 
responsible for the test 'timeouts' that are subsequent to 2757.

 Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a 
 subdirectory of target/test
 

 Key: HBASE-5747
 URL: https://issues.apache.org/jira/browse/HBASE-5747
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5474.txt, 5474v2.txt, 5474v3 (1).txt, 5474v3.txt, 
 5708v4.txt, 5708v4.txt


 Forward port as much as we can of Mikhail's hard-won test cleanups over on
 0.89 branch.  Will improve our being able to run unit tests in parallel.  He
 also found a few bugs.





[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-14 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254224#comment-13254224
 ] 

stack commented on HBASE-5795:
--

Hmm... It's not hbase-3927 that broke compatibility; it seems rather to be this
one that changes the RegionLoad VERSION:

{code}
r1238873 | tedyu | 2012-01-31 16:12:36 -0800 (Tue, 31 Jan 2012) | 2 lines

HBASE-5256 Use WritableUtils.readVInt() in RegionLoad.readFields() (Mubarak)
{code}

Looking at the patch, it breaks compatibility in a pretty radical way, changing
ints to vints on all RegionLoad members.
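
To illustrate why switching fixed-width ints to variable-length ints changes the wire layout, and how a reader could stay compatible by branching on a leading version byte, here is a self-contained sketch. Everything in it is hypothetical: the class, the single `stores` field, and the simplified base-128 varint, which is not Hadoop's vint encoding. It is not the fix applied in this issue:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class VersionedLoadSketch {
  static final byte VERSION_FIXED = 1;   // field written as a 4-byte int
  static final byte VERSION_VARINT = 2;  // field written as a variable-length int

  /** Simplified base-128 varint; an assumption, not Hadoop's vint format. */
  static void writeVarInt(DataOutput out, int value) throws IOException {
    while ((value & ~0x7F) != 0) {
      out.writeByte((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.writeByte(value);
  }

  static int readVarInt(DataInput in) throws IOException {
    int result = 0;
    int shift = 0;
    while (true) {
      int b = in.readUnsignedByte();
      result |= (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        return result;
      }
      shift += 7;
    }
  }

  /** Serializes one field in the layout that the given version uses. */
  static byte[] write(byte version, int stores) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeByte(version);
      if (version == VERSION_FIXED) {
        out.writeInt(stores);
      } else {
        writeVarInt(out, stores);
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads either layout by branching on the leading version byte. */
  static int read(byte[] data) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
      byte version = in.readByte();
      switch (version) {
        case VERSION_FIXED:
          return in.readInt();
        case VERSION_VARINT:
          return readVarInt(in);
        default:
          throw new IOException("Unsupported version " + version);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The two layouts have different lengths for the same value, which is why a reader that assumes one layout cannot parse the other without some version-aware branching.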

 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack






[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-14 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254227#comment-13254227
 ] 

stack commented on HBASE-5795:
--

I'd suggest backing out HBASE-5256.  It's a little weird in that it ups the
VERSION on the inner class but not on the outer class.  It's not a critical fix
either so we could probably do w/o it in 0.94.  Let me try removing it.

 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack






[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-14 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254229#comment-13254229
 ] 

stack commented on HBASE-5795:
--

Hmmm... not that easy.  This one messes us up too...

{code}

r1239157 | tedyu | 2012-02-01 06:56:20 -0800 (Wed, 01 Feb 2012) | 2 lines

HBASE-5283 Request counters may become negative for heavily loaded regions 
(Mubarak)
{code}

The above commit depends on hbase-5256.  If hbase-5256 were not in place, this 
would not break compatibility but since we have to back out hbase-5256, it 
does.  Looking..

 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack






[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253454#comment-13253454
 ] 

stack commented on HBASE-5778:
--

I backed it out of 0.94 and trunk.

 Turn on WAL compression by default
 --

 Key: HBASE-5778
 URL: https://issues.apache.org/jira/browse/HBASE-5778
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Lars Hofhansl
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch


 I ran some tests to verify if WAL compression should be turned on by default.
 For a use case where it's not very useful (values two orders of magnitude
 bigger than the keys), the insert time wasn't different and the CPU usage was
 15% higher (150% CPU usage vs. 130% when not compressing the WAL).
 When values are smaller than the keys, I saw a 38% improvement for the insert 
 run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
 WAL compression accounts for all the additional CPU usage, it might just be 
 that we're able to insert faster and we spend more time in the MemStore per 
 second (because our MemStores are bad when they contain tens of thousands of 
 values).
 Those are two extremes, but it shows that for the price of some CPU we can 
 save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
 CPUs.





[jira] [Commented] (HBASE-5784) Enable mvn deploy of website

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253691#comment-13253691
 ] 

stack commented on HBASE-5784:
--

Committed to trunk

 Enable mvn deploy of website
 

 Key: HBASE-5784
 URL: https://issues.apache.org/jira/browse/HBASE-5784
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 5784.txt


 Up to now, deploying the website has meant building locally, copying the
 result up to apache, and putting it into place under /www/hbase.apache.org.
 Change it so we can have maven deploy the site.  The good thing about having
 maven do it is that it's regular; permissions will always be the same so Doug
 and I won't be fighting each other when we stick stuff up there.  Also, it's
 a one-step process rather than multiple steps.





[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253708#comment-13253708
 ] 

stack commented on HBASE-5620:
--

It passed for me.  Let me commit this monster.

 Convert the client protocol of HRegionInterface to PB
 -

 Key: HBASE-5620
 URL: https://issues.apache.org/jira/browse/HBASE-5620
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5620_v3.patch, hbase-5620_v4.patch, 
 hbase-5620_v4.patch








[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253712#comment-13253712
 ] 

stack commented on HBASE-5620:
--

I think src/main/java/org/apache/hadoop/hbase/protobuf/ClientProtocol.java is
in the wrong package.  Ditto for AdminProtocol.  What do you think, Jimmy?
Should we move them?  Where should they go?  At top level?  Or into the client
package?


 Convert the client protocol of HRegionInterface to PB
 -

 Key: HBASE-5620
 URL: https://issues.apache.org/jira/browse/HBASE-5620
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5620_v3.patch, hbase-5620_v4.patch, 
 hbase-5620_v4.patch








[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253714#comment-13253714
 ] 

stack commented on HBASE-5620:
--

Mind opening new issues Jimmy to do outstanding work like unit tests?

 Convert the client protocol of HRegionInterface to PB
 -

 Key: HBASE-5620
 URL: https://issues.apache.org/jira/browse/HBASE-5620
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5620_v3.patch, hbase-5620_v4.patch, 
 hbase-5620_v4.patch








[jira] [Commented] (HBASE-4336) Convert source tree into maven modules

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253734#comment-13253734
 ] 

stack commented on HBASE-4336:
--

+1

Tell us more about the issue, Jesse.  When I do mvn compile on a project of
many modules, it's fine except for the case where tests depend on the product
of an earlier module?

 Convert source tree into maven modules
 --

 Key: HBASE-4336
 URL: https://issues.apache.org/jira/browse/HBASE-4336
 Project: HBase
  Issue Type: Task
  Components: build
Reporter: Gary Helmling
Priority: Critical
 Fix For: 0.96.0


 When we originally converted the build to maven we had a single core module 
 defined, but later reverted this to a module-less build for the sake of 
 simplicity.
 It now looks like it's time to re-address this, as we have an actual need for 
 modules to:
 * provide a trimmed down client library that applications can make use of
 * more cleanly support building against different versions of Hadoop, in 
 place of some of the reflection machinations currently required
 * incorporate the secure RPC engine that depends on some secure Hadoop classes
 I propose we start simply by refactoring into two initial modules:
 * core - common classes and utilities, and client-side code and interfaces
 * server - master and region server implementations and supporting code
 This would also lay the groundwork for incorporating the HBase security 
 features that have been developed.  Once the module structure is in place, 
 security-related features could then be incorporated into a third module -- 
 security -- after normal review and approval.  The security module could 
 then depend on secure Hadoop, without modifying the dependencies of the rest 
 of the HBase code.





[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253778#comment-13253778
 ] 

stack commented on HBASE-5604:
--

Remove the Date stuff.  Just do basic ms.

 M/R tool to replay WAL files
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0, 0.96.0

 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt


 Just an idea I had. Might be useful for restore of a backup using the HLogs.
 This could be an M/R job (with a mapper per HLog file).
 The tool would get a timerange and a (set of) table(s). We'd pick the right
 HLogs based on time before the M/R job is started and then have a mapper per
 HLog file.
 The mapper would then go through the HLog, filter out all WALEdits that don't
 fit into the time range or don't belong to any of the tables, and then use
 HFileOutputFormat to generate HFiles.
 Would need to indicate the splits we want, probably from a live table.
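
 The filtering step can be sketched with plain millisecond timestamps, in line with the suggestion to drop the Date handling. The `Entry` type and `filter` method are hypothetical stand-ins, not the HLog or WALEdit API, and the half-open range [startMs, endMs) is an assumption:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class WalEditFilterSketch {
  /** Minimal stand-in for a WAL entry: table name plus write time in epoch ms. */
  static final class Entry {
    final String table;
    final long writeTimeMs;

    Entry(String table, long writeTimeMs) {
      this.table = table;
      this.writeTimeMs = writeTimeMs;
    }
  }

  /** Keeps entries for the wanted tables whose time falls in [startMs, endMs). */
  static List<Entry> filter(
      List<Entry> entries, Set<String> tables, long startMs, long endMs) {
    List<Entry> kept = new ArrayList<>();
    for (Entry e : entries) {
      boolean inRange = e.writeTimeMs >= startMs && e.writeTimeMs < endMs;
      if (inRange && tables.contains(e.table)) {
        kept.add(e);
      }
    }
    return kept;
  }
}
```

 In the tool described above, each mapper would apply a check like this to the edits of one HLog file before handing the survivors to the output format.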





[jira] [Commented] (HBASE-5747) Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a subdirectory of target/test

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253832#comment-13253832
 ] 

stack commented on HBASE-5747:
--

The TestWALPlayer failure is not caused by this, and TestServerCustomProtocol passes 
locally.  Going to commit this v4.

 Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a 
 subdirectory of target/test
 

 Key: HBASE-5747
 URL: https://issues.apache.org/jira/browse/HBASE-5747
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: 5474.txt, 5474v2.txt, 5474v3 (1).txt, 5474v3.txt, 
 5708v4.txt, 5708v4.txt


 Forward port as much as we can of Mikhail's hard-won test cleanups over on 
 0.89 branch.  Will improve our being able to run unit tests in //.  He also 
 found a few bugs.





[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253843#comment-13253843
 ] 

stack commented on HBASE-5620:
--

I did too and trunk is also no longer complaining.  The rat test is a PITA.  
There was probably some detritus lying around that it picked up.  I modified 
the trunk build to keep the rat.txt report next time.

@Jimmy I think top-level is better than where it currently is.  What other 
Protocols would go up to the top level?  None I suppose.  I suppose they should 
be in client package but it's a little perverse having the client stuff 
reaching into zk and util and protobuf... 

 Convert the client protocol of HRegionInterface to PB
 -

 Key: HBASE-5620
 URL: https://issues.apache.org/jira/browse/HBASE-5620
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5620_v3.patch, hbase-5620_v4.patch, 
 hbase-5620_v4.patch








[jira] [Commented] (HBASE-5747) Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a subdirectory of target/test

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253894#comment-13253894
 ] 

stack commented on HBASE-5747:
--

@Mikhail I think it's fair in cases like this, where a bunch of the code base is 
touched, that us frontier folk more familiar w/ trunk pitch in.  We probably 
know more about what's portable and what to drop.

 Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a 
 subdirectory of target/test
 

 Key: HBASE-5747
 URL: https://issues.apache.org/jira/browse/HBASE-5747
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 5474.txt, 5474v2.txt, 5474v3 (1).txt, 5474v3.txt, 
 5708v4.txt, 5708v4.txt


 Forward port as much as we can of Mikhail's hard-won test cleanups over on 
 0.89 branch.  Will improve our being able to run unit tests in //.  He also 
 found a few bugs.





[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253974#comment-13253974
 ] 

stack commented on HBASE-5620:
--

@Jimmy I think this is the biggest patch ever applied to HBase.  Congrats!

 Convert the client protocol of HRegionInterface to PB
 -

 Key: HBASE-5620
 URL: https://issues.apache.org/jira/browse/HBASE-5620
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5620_v3.patch, hbase-5620_v4.patch, 
 hbase-5620_v4.patch








[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253975#comment-13253975
 ] 

stack commented on HBASE-5620:
--

@Jimmy I think this is the biggest patch ever applied to HBase.  Congrats!

 Convert the client protocol of HRegionInterface to PB
 -

 Key: HBASE-5620
 URL: https://issues.apache.org/jira/browse/HBASE-5620
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5620_v3.patch, hbase-5620_v4.patch, 
 hbase-5620_v4.patch








[jira] [Commented] (HBASE-4336) Convert source tree into maven modules

2012-04-13 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253985#comment-13253985
 ] 

stack commented on HBASE-4336:
--

So, running tests in a module scope, you cannot have dependencies outside of the 
module (You can depend on 3rd party jars but not ones made by this maven build 
-- or is it just test stuff?  Could security depend on hbase-common.jar in its 
tests?)

 Convert source tree into maven modules
 --

 Key: HBASE-4336
 URL: https://issues.apache.org/jira/browse/HBASE-4336
 Project: HBase
  Issue Type: Task
  Components: build
Reporter: Gary Helmling
Priority: Critical
 Fix For: 0.96.0


 When we originally converted the build to maven we had a single core module 
 defined, but later reverted this to a module-less build for the sake of 
 simplicity.
 It now looks like it's time to re-address this, as we have an actual need for 
 modules to:
 * provide a trimmed down client library that applications can make use of
 * more cleanly support building against different versions of Hadoop, in 
 place of some of the reflection machinations currently required
 * incorporate the secure RPC engine that depends on some secure Hadoop classes
 I propose we start simply by refactoring into two initial modules:
 * core - common classes and utilities, and client-side code and interfaces
 * server - master and region server implementations and supporting code
 This would also lay the groundwork for incorporating the HBase security 
 features that have been developed.  Once the module structure is in place, 
 security-related features could then be incorporated into a third module -- 
 security -- after normal review and approval.  The security module could 
 then depend on secure Hadoop, without modifying the dependencies of the rest 
 of the HBase code.





[jira] [Commented] (HBASE-5747) Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a subdirectory of target/test

2012-04-12 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252228#comment-13252228
 ] 

stack commented on HBASE-5747:
--

Not sure why tests are not completing.  Running on a mac I see a problem in this 
test:

{code}
Running org.apache.hadoop.hbase.zookeeper.TestZKLeaderManager
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.415 sec  
FAILURE!

Results :

Failed tests:   
testLeaderSelection(org.apache.hadoop.hbase.zookeeper.TestZKLeaderManager): New 
leader should exist
{code}

 Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a 
 subdirectory of target/test
 

 Key: HBASE-5747
 URL: https://issues.apache.org/jira/browse/HBASE-5747
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: 5474.txt, 5474v2.txt, 5474v3 (1).txt, 5474v3.txt


 Forward port as much as we can of Mikhail's hard-won test cleanups over on 
 0.89 branch.  Will improve our being able to run unit tests in //.  He also 
 found a few bugs.





[jira] [Commented] (HBASE-5754) data lost with gora continuous ingest test (goraci)

2012-04-12 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252504#comment-13252504
 ] 

stack commented on HBASE-5754:
--

Let me do the same.  I did not match generator map tasks to verify reducers.  
Then let me recreate the split issue Eric describes above.  Thanks lads.

 data lost with gora continuous ingest test (goraci)
 ---

 Key: HBASE-5754
 URL: https://issues.apache.org/jira/browse/HBASE-5754
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1
 Environment: 10 node test cluster
Reporter: Eric Newton
Assignee: stack

 Keith Turner re-wrote the accumulo continuous ingest test using gora, which 
 has both hbase and accumulo back-ends.
 I put a billion entries into HBase, and ran the Verify map/reduce job.  The 
 verification failed because about 21K entries were missing.  The goraci 
 [README|https://github.com/keith-turner/goraci] explains the test, and how it 
 detects missing data.
 I re-ran the test with 100 million entries, and it verified successfully.  
 Both of the times I tested using a billion entries, the verification failed.
 If I run the verification step twice, the results are consistent, so the 
 problem is
 probably not on the verify step.
 Here's the versions of the various packages:
 ||package||version||
 |hadoop|0.20.205.0|
 |hbase|0.92.1|
 |gora|http://svn.apache.org/repos/asf/gora/trunk r1311277|
 |goraci|https://github.com/ericnewton/goraci  tagged 2012-04-08|
 The change I made to goraci was to configure it for hbase and to allow it to 
 build properly.





[jira] [Commented] (HBASE-5773) HtablePool constructor not reading config files in certain cases

2012-04-12 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252586#comment-13252586
 ] 

stack commented on HBASE-5773:
--

It doesn't apply to 0.90 branch.

 HtablePool constructor not reading config files in certain cases
 

 Key: HBASE-5773
 URL: https://issues.apache.org/jira/browse/HBASE-5773
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.6, 0.92.1, 0.94.1
Reporter: Ioan Eugen Stan
Priority: Minor
 Fix For: 0.92.2, 0.94.0

 Attachments: different-config-behaviour.patch


 Creating an HTablePool can result in two different behaviours depending on the 
 constructor called. 
 Case 1: loads the configs from hbase-site
   public HTablePool() {
     this(HBaseConfiguration.create(), Integer.MAX_VALUE);
   }
 Case 2: calling this with a null value for Configuration: 
   public HTablePool(final Configuration config, final int maxSize) {
     this(config, maxSize, null, null);
   }
 will invoke:
   public HTablePool(final Configuration config, final int maxSize,
       final HTableInterfaceFactory tableFactory, PoolType poolType) {
     // Make a new configuration instance so I can safely cleanup when
     // done with the pool.
     this.config = config == null ? new Configuration() : config;
 which does not read the hbase-site config files as 
 HBaseConfiguration.create() does. 
 I've tracked this problem to all versions of hbase. 





[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix

2012-04-12 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252709#comment-13252709
 ] 

stack commented on HBASE-3443:
--

0.94?

 ICV optimization to look in memstore first and then store files (HBASE-3082) 
 does not work when deletes are in the mix
 --

 Key: HBASE-3443
 URL: https://issues.apache.org/jira/browse/HBASE-3443
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 
 0.92.0, 0.92.1
Reporter: Kannan Muthukkaruppan
Assignee: Lars Hofhansl
Priority: Critical
  Labels: corruption
 Fix For: 0.96.0

 Attachments: 3443.txt


 For incrementColumnValue() HBASE-3082 adds an optimization to check memstores 
 first, and only if not present in the memstore then check the store files. In 
 the presence of deletes, the above optimization is not reliable.
 If the column is marked as deleted in the memstore, one should not look 
 further into the store files. But currently, the code does so.
 Sample test code outline:
 {code}
 admin.createTable(desc)
 table = HTable.new(conf, tableName)
 table.incrementColumnValue(Bytes.toBytes(row), cf1name, 
 Bytes.toBytes(column), 5);
 admin.flush(tableName)
 sleep(2)
 del = Delete.new(Bytes.toBytes(row))
 table.delete(del)
 table.incrementColumnValue(Bytes.toBytes(row), cf1name, 
 Bytes.toBytes(column), 5);
 get = Get.new(Bytes.toBytes(row))
 keyValues = table.get(get).raw()
 keyValues.each do |keyValue|
   puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}";
 end
 {code}
 The above prints:
 {code}
 Expect 5; Got Value=10
 {code}
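
To make the failure mode above concrete, here is a toy model of the memstore-first lookup. It is a sketch only: `IcvLookupSketch` and its methods are invented for illustration and are not HBase code. Timestamps and versions are ignored, and the memstore and store files are reduced to two in-memory maps.

```java
import java.util.HashMap;
import java.util.Map;

public class IcvLookupSketch {
    /** Simplified cell: either a counter value or a delete marker. */
    private static final class Cell {
        final long value;
        final boolean deleted;

        Cell(long value, boolean deleted) {
            this.value = value;
            this.deleted = deleted;
        }
    }

    private final Map<String, Cell> memstore = new HashMap<>();
    private final Map<String, Long> storeFiles = new HashMap<>();

    /** Memstore-first read that falls through to store files even past a delete marker. */
    private long readIgnoringDeletes(String column) {
        Cell c = memstore.get(column);
        if (c != null && !c.deleted) {
            return c.value;
        }
        return storeFiles.getOrDefault(column, 0L);
    }

    /** Memstore-first read that stops as soon as the memstore holds a delete marker. */
    private long readHonoringDeletes(String column) {
        Cell c = memstore.get(column);
        if (c != null) {
            return c.deleted ? 0L : c.value;
        }
        return storeFiles.getOrDefault(column, 0L);
    }

    public long increment(String column, long amount, boolean honorDeletes) {
        long current = honorDeletes ? readHonoringDeletes(column) : readIgnoringDeletes(column);
        long updated = current + amount;
        memstore.put(column, new Cell(updated, false));
        return updated;
    }

    public void delete(String column) {
        memstore.put(column, new Cell(0L, true));
    }

    /** Toy flush: applies memstore contents to the store-file map, then empties the memstore. */
    public void flush() {
        for (Map.Entry<String, Cell> e : memstore.entrySet()) {
            if (e.getValue().deleted) {
                storeFiles.remove(e.getKey());
            } else {
                storeFiles.put(e.getKey(), e.getValue().value);
            }
        }
        memstore.clear();
    }

    /** Replays the outline above: increment, flush, delete, increment. */
    public static long scenario(boolean honorDeletes) {
        IcvLookupSketch region = new IcvLookupSketch();
        region.increment("c", 5L, honorDeletes);
        region.flush();
        region.delete("c");
        return region.increment("c", 5L, honorDeletes);
    }

    public static void main(String[] args) {
        System.out.println("ignoring deletes: " + scenario(false));
        System.out.println("honoring deletes: " + scenario(true));
    }
}
```

In this model the read that ignores the delete marker picks up the flushed 5 and returns 10, while the read that stops at the marker returns 5. That mirrors the reported output, though the actual fix in the attached patch may be structured quite differently.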





[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix

2012-04-12 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252744#comment-13252744
 ] 

stack commented on HBASE-3443:
--

Oh, and if you don't fix it, you'll have to explain why you didn't to Benoît.

 ICV optimization to look in memstore first and then store files (HBASE-3082) 
 does not work when deletes are in the mix
 --

 Key: HBASE-3443
 URL: https://issues.apache.org/jira/browse/HBASE-3443
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 
 0.92.0, 0.92.1
Reporter: Kannan Muthukkaruppan
Assignee: Lars Hofhansl
Priority: Critical
  Labels: corruption
 Fix For: 0.96.0

 Attachments: 3443.txt


 For incrementColumnValue() HBASE-3082 adds an optimization to check memstores 
 first, and only if not present in the memstore then check the store files. In 
 the presence of deletes, the above optimization is not reliable.
 If the column is marked as deleted in the memstore, one should not look 
 further into the store files. But currently, the code does so.
 Sample test code outline:
 {code}
 admin.createTable(desc)
 table = HTable.new(conf, tableName)
 table.incrementColumnValue(Bytes.toBytes(row), cf1name, 
 Bytes.toBytes(column), 5);
 admin.flush(tableName)
 sleep(2)
 del = Delete.new(Bytes.toBytes(row))
 table.delete(del)
 table.incrementColumnValue(Bytes.toBytes(row), cf1name, 
 Bytes.toBytes(column), 5);
 get = Get.new(Bytes.toBytes(row))
 keyValues = table.get(get).raw()
 keyValues.each do |keyValue|
   puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}";
 end
 {code}
 The above prints:
 {code}
 Expect 5; Got Value=10
 {code}





[jira] [Commented] (HBASE-5777) MiniHBaseCluster cannot start multiple region servers

2012-04-12 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252881#comment-13252881
 ] 

stack commented on HBASE-5777:
--

We have an hbase-site.xml at src/test that is used when we run tests.  It 
disables the UI.  You think we should apply this patch too Jimmy?

 MiniHBaseCluster cannot start multiple region servers
 -

 Key: HBASE-5777
 URL: https://issues.apache.org/jira/browse/HBASE-5777
 Project: HBase
  Issue Type: Test
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: hbase-5777.patch


 MiniHBaseCluster can try to start multiple region servers.  But all of them 
 except one will die in putting up the web UI
 because of BindException since HConstants.REGIONSERVER_INFO_PORT_AUTO is set 
 to false by default.
 This issue will make many unit tests depending on multiple region servers 
 flaky, such as TestAdmin.





[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252947#comment-13252947
 ] 

stack commented on HBASE-5778:
--

+1  Add release note w/ how to turn it off

 Turn on WAL compression by default
 --

 Key: HBASE-5778
 URL: https://issues.apache.org/jira/browse/HBASE-5778
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5778.patch


 I ran some tests to verify if WAL compression should be turned on by default.
 For a use case where it's not very useful (values two orders of magnitude 
 bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
 higher (150% CPU usage VS 130% when not compressing the WAL).
 When values are smaller than the keys, I saw a 38% improvement for the insert 
 run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
 WAL compression accounts for all the additional CPU usage, it might just be 
 that we're able to insert faster and we spend more time in the MemStore per 
 second (because our MemStores are bad when they contain tens of thousands of 
 values).
 Those are two extremes, but it shows that for the price of some CPU we can 
 save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
 CPUs.





[jira] [Commented] (HBASE-5754) data lost with gora continuous ingest test (goraci)

2012-04-12 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253135#comment-13253135
 ] 

stack commented on HBASE-5754:
--

I ran w/ 10 generators and 10 slots for the verify step and got the below, which 
prints out only a REFERENCED count.

Running these recent tests I let it do its natural splitting so it grew from 
zero to 260-odd regions so maybe the issue you see Eric comes of manual splits 
coming out of the UI.  Let me try that next.

Thanks lads.

{code}
12/04/13 05:16:23 INFO mapred.JobClient:  map 100% reduce 99%
12/04/13 05:16:54 INFO mapred.JobClient:  map 100% reduce 100%
12/04/13 05:16:59 INFO mapred.JobClient: Job complete: job_201204092039_0046
12/04/13 05:16:59 INFO mapred.JobClient: Counters: 30
12/04/13 05:16:59 INFO mapred.JobClient:   Job Counters
12/04/13 05:16:59 INFO mapred.JobClient: Launched reduce tasks=10
12/04/13 05:16:59 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=30125694
12/04/13 05:16:59 INFO mapred.JobClient: Total time spent by all reduces 
waiting after reserving slots (ms)=0
12/04/13 05:16:59 INFO mapred.JobClient: Total time spent by all maps 
waiting after reserving slots (ms)=0
12/04/13 05:16:59 INFO mapred.JobClient: Rack-local map tasks=6
12/04/13 05:16:59 INFO mapred.JobClient: Launched map tasks=256
12/04/13 05:16:59 INFO mapred.JobClient: Data-local map tasks=250
12/04/13 05:16:59 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=5832198
12/04/13 05:16:59 INFO mapred.JobClient:   goraci.Verify$Counts
12/04/13 05:16:59 INFO mapred.JobClient: REFERENCED=10
12/04/13 05:16:59 INFO mapred.JobClient:   File Output Format Counters
12/04/13 05:16:59 INFO mapred.JobClient: Bytes Written=0
12/04/13 05:16:59 INFO mapred.JobClient:   FileSystemCounters
12/04/13 05:16:59 INFO mapred.JobClient: FILE_BYTES_READ=83022967343
12/04/13 05:16:59 INFO mapred.JobClient: HDFS_BYTES_READ=156414
12/04/13 05:16:59 INFO mapred.JobClient: FILE_BYTES_WRITTEN=112881560332
12/04/13 05:16:59 INFO mapred.JobClient:   File Input Format Counters
12/04/13 05:16:59 INFO mapred.JobClient: Bytes Read=0
12/04/13 05:16:59 INFO mapred.JobClient:   Map-Reduce Framework
12/04/13 05:16:59 INFO mapred.JobClient: Map output materialized 
bytes=29992170602
12/04/13 05:16:59 INFO mapred.JobClient: Map input records=10
12/04/13 05:16:59 INFO mapred.JobClient: Reduce shuffle bytes=29874879887
12/04/13 05:16:59 INFO mapred.JobClient: Spilled Records=7527086436
12/04/13 05:16:59 INFO mapred.JobClient: Map output bytes=25992155242
12/04/13 05:16:59 INFO mapred.JobClient: CPU time spent (ms)=20182570
12/04/13 05:16:59 INFO mapred.JobClient: Total committed heap usage 
(bytes)=99953082368
12/04/13 05:16:59 INFO mapred.JobClient: Combine input records=0
12/04/13 05:16:59 INFO mapred.JobClient: SPLIT_RAW_BYTES=156414
12/04/13 05:16:59 INFO mapred.JobClient: Reduce input records=20
12/04/13 05:16:59 INFO mapred.JobClient: Reduce input groups=10
12/04/13 05:16:59 INFO mapred.JobClient: Combine output records=0
12/04/13 05:16:59 INFO mapred.JobClient: Physical memory (bytes) 
snapshot=91762372608
12/04/13 05:16:59 INFO mapred.JobClient: Reduce output records=0
12/04/13 05:16:59 INFO mapred.JobClient: Virtual memory (bytes) 
snapshot=391126540288
12/04/13 05:16:59 INFO mapred.JobClient: Map output records=20
{code}

 data lost with gora continuous ingest test (goraci)
 ---

 Key: HBASE-5754
 URL: https://issues.apache.org/jira/browse/HBASE-5754
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1
 Environment: 10 node test cluster
Reporter: Eric Newton
Assignee: stack

 Keith Turner re-wrote the accumulo continuous ingest test using gora, which 
 has both hbase and accumulo back-ends.
 I put a billion entries into HBase, and ran the Verify map/reduce job.  The 
 verification failed because about 21K entries were missing.  The goraci 
 [README|https://github.com/keith-turner/goraci] explains the test, and how it 
 detects missing data.
 I re-ran the test with 100 million entries, and it verified successfully.  
 Both of the times I tested using a billion entries, the verification failed.
 If I run the verification step twice, the results are consistent, so the 
 problem is
 probably not on the verify step.
 Here's the versions of the various packages:
 ||package||version||
 |hadoop|0.20.205.0|
 |hbase|0.92.1|
 |gora|http://svn.apache.org/repos/asf/gora/trunk r1311277|
 |goraci|https://github.com/ericnewton/goraci  tagged 2012-04-08|
 The change I made to goraci was to configure it for hbase and to allow it to 
 build properly.


[jira] [Commented] (HBASE-5756) we can change defalult File Appender to RFA instead of DRFA.

2012-04-11 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251950#comment-13251950
 ] 

stack commented on HBASE-5756:
--

Default is DRFA in 0.94 and before.  RFA after (0.96)

 we can change defalult File Appender to RFA instead of DRFA.
 

 Key: HBASE-5756
 URL: https://issues.apache.org/jira/browse/HBASE-5756
 Project: HBase
  Issue Type: Bug
Reporter: rohithsharma
Priority: Minor

 This can be a point of concern when, on a certain day, more logging happens 
 because of increased activity. In that case the log file for that day can 
 grow huge. Such logs cannot be opened for analysis because they are too large.





[jira] [Commented] (HBASE-5737) Minor Improvements related to balancer.

2012-04-11 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252207#comment-13252207
 ] 

stack commented on HBASE-5737:
--

Ram, the AM#setBalancer is not right?  Doesn't AM make a balancer instance of 
its own up in its constructor?  We should at least remove that.  Could we pass 
in the load balancer to use into the AM's constructor rather than call a 
setBalancer method?

 Minor Improvements related to balancer.
 ---

 Key: HBASE-5737
 URL: https://issues.apache.org/jira/browse/HBASE-5737
 Project: HBase
  Issue Type: Improvement
  Components: master
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Attachments: HBASE-5737.patch, HBASE-5737_1.patch, HBASE-5737_2.patch


 Currently in Am.getAssignmentByTable() we use a result map which is currently 
 a HashMap.  It could be better if we used a TreeMap.  Even in 
 MetaReader.fullScan we use a TreeMap so that the naming order 
 is maintained. I felt this change could be very useful in cases where we are 
 extending the DefaultLoadBalancer.  
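
The ordering argument can be illustrated with plain `java.util` collections. The sketch below assumes a hypothetical helper, `AssignmentOrderSketch.byTable`, that groups region names by table. It is not the AssignmentManager code, only a demonstration that a TreeMap hands its keys back in sorted order regardless of insertion order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class AssignmentOrderSketch {
    /** Groups {table, region} pairs by table name in a sorted map. */
    public static Map<String, List<String>> byTable(List<String[]> tableRegionPairs) {
        // TreeMap iterates keys in natural (lexicographic) order; HashMap gives no such guarantee.
        Map<String, List<String>> result = new TreeMap<>();
        for (String[] pair : tableRegionPairs) {
            result.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> pairs = Arrays.asList(
            new String[] {"usertable", "region-b"},
            new String[] {"accounts", "region-a"},
            new String[] {"usertable", "region-a"});
        System.out.println(byTable(pairs).keySet());
    }
}
```

A balancer subclass iterating over such a map would see tables in a stable, name-sorted order, which is presumably the property being asked for here.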





[jira] [Commented] (HBASE-4109) Hostname returned via reverse dns lookup contains trailing period if configured interface is not default

2012-04-10 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250729#comment-13250729
 ] 

stack commented on HBASE-4109:
--

@Adrian Forward port is over in hbase-5758.  I will commit later today.

 Hostname returned via reverse dns lookup contains trailing period if 
 configured interface is not default
 --

 Key: HBASE-4109
 URL: https://issues.apache.org/jira/browse/HBASE-4109
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.90.3
Reporter: Shrijeet Paliwal
Assignee: Shrijeet Paliwal
 Fix For: 0.90.4

 Attachments: 
 0001-HBASE-4109-Sanitize-hostname-returned-from-DNS-class.patch


 If you are using any interface other than 'default' (literally that 
 keyword), DNS.java's getDefaultHost will return a string which will 
 have a trailing period at the end. It seems the javadoc of reverseDns in DNS.java 
 (see below) conflicts with what that function actually does. 
 It returns a PTR record while claiming to return a hostname. The PTR 
 record always has a period at the end, RFC:  
 http://irbs.net/bog-4.9.5/bog47.html 
 We call DNS.getDefaultHost in more than one place and treat the result as 
 the actual hostname.
 Quoting HRegionServer for example
 {code}
 String machineName = DNS.getDefaultHost(conf.get(
 "hbase.regionserver.dns.interface", "default"), conf.get(
 "hbase.regionserver.dns.nameserver", "default"));
 {code}
 This causes inconsistencies. An example of such inconsistency was observed 
 while debugging the issue "Regions not getting reassigned if RS is brought 
 down". More here 
 http://search-hadoop.com/m/CANUA1qRCkQ1 
 We may want to sanitize the string returned from DNS class. Or better we can 
 take a path of overhauling the way we do DNS name matching all over.
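
A minimal sketch of the "sanitize the string" option, assuming a hypothetical helper named `HostnameSanitizer.stripTrailingDot` (this is not an existing HBase or Hadoop API, and the attached patch may do it differently):

```java
public class HostnameSanitizer {
    /**
     * Removes a single trailing period, as found on a fully qualified PTR name,
     * and returns any other input unchanged.
     */
    public static String stripTrailingDot(String host) {
        if (host != null && host.endsWith(".")) {
            return host.substring(0, host.length() - 1);
        }
        return host;
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingDot("rs1.example.com."));
    }
}
```

Applying this at the point where the name comes back from the DNS lookup would leave every later comparison working on the same form of the hostname.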





[jira] [Commented] (HBASE-5728) Methods Missing in HTableInterface

2012-04-10 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250740#comment-13250740
 ] 

stack commented on HBASE-5728:
--

@Bing Yes.  If you are up for it.

@Lars Thanks for doing the research.  

 Methods Missing in HTableInterface
 --

 Key: HBASE-5728
 URL: https://issues.apache.org/jira/browse/HBASE-5728
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Bing Li

 Dear all,
 I found that some methods that exist in HTable are not in HTableInterface.
setAutoFlush
setWriteBufferSize
...
 In most cases, I manipulate HBase through HTableInterface from HTablePool. If 
 I need to use the above methods, how do I do that?
 I am considering writing my own table pool if there is no proper way. Is that fine?
 Thanks so much!
 Best regards,
 Bing





[jira] [Commented] (HBASE-5756) we can change defalult File Appender to RFA instead of DRFA.

2012-04-10 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250796#comment-13250796
 ] 

stack commented on HBASE-5756:
--

It's not clear what you are asking for, Rohit.  Does this recent commit to TRUNK 
give you what you want? HBASE-5655

 we can change defalult File Appender to RFA instead of DRFA.
 

 Key: HBASE-5756
 URL: https://issues.apache.org/jira/browse/HBASE-5756
 Project: HBase
  Issue Type: Bug
Reporter: rohithsharma
Priority: Minor

 This can be a concern when, on a given day, logging is heavier than usual 
 because of increased activity. In that case the log file for that day can 
 grow huge. Such logs cannot be opened for analysis because they are too large.





[jira] [Commented] (HBASE-4336) Convert source tree into maven modules

2012-04-10 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250808#comment-13250808
 ] 

stack commented on HBASE-4336:
--

Can we just have hbase-common?  No hbase-core.  No hbase-security (looks like 
security might be getting smashed into hbase-common).  Why do we need 
hbase-assemble?   It takes all that has gone before to package?  Is this a 
common pattern?  What about the profiles we currently have?  Like -Phadoop 
0.23.   Will those go away?

Thanks for doing this, Jesse.  I think we should commit the refactor as long as 
it's basically working.  We can fine-tune later as we go.



 Convert source tree into maven modules
 --

 Key: HBASE-4336
 URL: https://issues.apache.org/jira/browse/HBASE-4336
 Project: HBase
  Issue Type: Task
  Components: build
Reporter: Gary Helmling
Priority: Critical
 Fix For: 0.96.0


 When we originally converted the build to maven we had a single core module 
 defined, but later reverted this to a module-less build for the sake of 
 simplicity.
 It now looks like it's time to re-address this, as we have an actual need for 
 modules to:
 * provide a trimmed down client library that applications can make use of
 * more cleanly support building against different versions of Hadoop, in 
 place of some of the reflection machinations currently required
 * incorporate the secure RPC engine that depends on some secure Hadoop classes
 I propose we start simply by refactoring into two initial modules:
 * core - common classes and utilities, and client-side code and interfaces
 * server - master and region server implementations and supporting code
 This would also lay the groundwork for incorporating the HBase security 
 features that have been developed.  Once the module structure is in place, 
 security-related features could then be incorporated into a third module -- 
 security -- after normal review and approval.  The security module could 
 then depend on secure Hadoop, without modifying the dependencies of the rest 
 of the HBase code.




