[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190945#comment-13190945
 ] 

gaojinchao commented on HBASE-5231:
---

I think we can do that.
Regarding the log message "Done. Calculated a load balance in ", we can move it
out of balanceCluster.
Move it to the code below?

+    for (Map<ServerName, List<HRegionInfo>> assignments :
+        assignmentsByTable.values()) {
+      List<RegionPlan> partialPlans =
+          this.balancer.balanceCluster(assignments);
+      if (partialPlans != null) plans.addAll(partialPlans);
     }

 Backport HBASE-3373 (per-table load balancing) to 0.92
 --

 Key: HBASE-5231
 URL: https://issues.apache.org/jira/browse/HBASE-5231
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
 Fix For: 0.92.1

 Attachments: 5231.txt


 This JIRA backports per-table load balancing to 0.90

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5179:
-

Fix Version/s: (was: 0.92.0)
   0.92.1

 Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
 region to be assigned before log splitting is completed, causing data loss
 

 Key: HBASE-5179
 URL: https://issues.apache.org/jira/browse/HBASE-5179
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.94.0, 0.92.1, 0.90.6

 Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
 5179-90v16.patch, 5179-90v17.txt, 5179-90v2.patch, 5179-90v3.patch, 
 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 5179-90v7.patch, 
 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 5179-v11-92.txt, 
 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, Errorlog, 
 hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
 hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
 hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch


 If the master's failover processing and ServerShutdownHandler's processing
 happen concurrently, the following case may occur:
 1. The master completes splitLogAfterStartup().
 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
 3. The master starts to rebuildUserRegions, and RegionserverA is considered a
 dead server.
 4. The master starts to assign the regions of RegionserverA because it was
 marked as a dead server in step 3.
 However, while step 4 (region assignment) runs, ServerShutdownHandler may
 still be splitting logs; therefore, data loss may occur.
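
The ordering constraint described above can be sketched with a latch: assignment must block until log splitting for the dead server finishes. This is a minimal, self-contained illustration of the constraint only; the class and method names are hypothetical, not the actual HBase master code.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

public class SplitBeforeAssignSketch {
    // Runs a toy "assignment" thread that must wait for "log splitting",
    // then returns the observed order of the two events.
    public static String run() throws InterruptedException {
        CountDownLatch logSplitDone = new CountDownLatch(1);
        List<String> order = new CopyOnWriteArrayList<>();
        Thread assigner = new Thread(() -> {
            try {
                // Assigning regions before the dead server's logs are split
                // risks losing edits, so block on the latch first.
                logSplitDone.await();
                order.add("assign");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        assigner.start();
        Thread.sleep(100);          // simulate log splitting taking a while
        order.add("split");         // log splitting finished
        logSplitDone.countDown();   // now assignment may proceed
        assigner.join();
        return String.join(",", order);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // split,assign
    }
}
```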





[jira] [Updated] (HBASE-5237) Addendum for HBASE-5160 and HBASE-4397

2012-01-23 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5237:
-

Fix Version/s: (was: 0.92.0)
   0.92.1

 Addendum for HBASE-5160 and HBASE-4397
 --

 Key: HBASE-5237
 URL: https://issues.apache.org/jira/browse/HBASE-5237
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: HBASE-5237_0.90.patch, HBASE-5237_trunk.patch


 As part of HBASE-4397 there is one more scenario where the patch has to be 
 applied.
 {code}
 RegionPlan plan = getRegionPlan(state, forceNewPlan);
 if (plan == null) {
   debugLog(state.getRegion(),
       "Unable to determine a plan to assign " + state);
   return; // Should get reassigned later when RIT times out.
 }
 {code}
 I think in this scenario we should also call:
 {code}
 this.timeoutMonitor.setAllRegionServersOffline(true);
 {code}
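
The proposed placement can be sketched as follows. The classes here are simplified stand-ins for AssignmentManager and its timeout monitor, not the actual HBase code:

```java
public class AssignRetrySketch {
    // Simplified stand-in for the assignment timeout monitor.
    static class TimeoutMonitor {
        private boolean allRegionServersOffline = false;
        void setAllRegionServersOffline(boolean offline) {
            this.allRegionServersOffline = offline;
        }
        boolean isAllRegionServersOffline() {
            return allRegionServersOffline;
        }
    }

    static final TimeoutMonitor timeoutMonitor = new TimeoutMonitor();

    // Simulates getRegionPlan() finding no plan.
    static Object getRegionPlan() {
        return null;
    }

    static void assign() {
        Object plan = getRegionPlan();
        if (plan == null) {
            // Proposed addendum: flag the monitor so the region in
            // transition is reassigned when it times out.
            timeoutMonitor.setAllRegionServersOffline(true);
            return; // Should get reassigned later when RIT times out.
        }
        // ... otherwise proceed with the plan ...
    }

    public static void main(String[] args) {
        assign();
        System.out.println(timeoutMonitor.isAllRegionServersOffline()); // true
    }
}
```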





[jira] [Updated] (HBASE-3796) Per-Store Entries in Compaction Queue

2012-01-23 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3796:
-

Fix Version/s: (was: 0.92.0)
   0.92.1

 Per-Store Entries in Compaction Queue
 -

 Key: HBASE-3796
 URL: https://issues.apache.org/jira/browse/HBASE-3796
 Project: HBase
  Issue Type: Bug
Reporter: Nicolas Spiegelberg
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.92.1

 Attachments: HBASE-3796-fixed.patch, HBASE-3796.patch


 Although compaction is decided on a per-store basis, right now the 
 CompactSplitThread only deals at the Region level for queueing.  Store-level 
 compaction queue entries will give us more visibility into compaction 
 workload + allow us to stop summarizing priorities.





[jira] [Commented] (HBASE-5139) Compute (weighted) median using AggregateProtocol

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191261#comment-13191261
 ] 

Hadoop QA commented on HBASE-5139:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511514/5139.addendum
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/836//console

This message is automatically generated.

 Compute (weighted) median using AggregateProtocol
 -

 Key: HBASE-5139
 URL: https://issues.apache.org/jira/browse/HBASE-5139
 Project: HBase
  Issue Type: Sub-task
Reporter: Zhihong Yu
Assignee: Zhihong Yu
 Attachments: 5139-v2.txt, 5139.addendum


 Suppose cf:cq1 stores numeric values and optionally cf:cq2 stores weights. 
 This task finds out the median value among the values of cf:cq1 (See 
 http://www.stat.ucl.ac.be/ISdidactique/Rhelp/library/R.basic/html/weighted.median.html)
 This can be done in two passes.
 The first pass utilizes AggregateProtocol where the following tuple is 
 returned from each region:
 (partial-sum-of-values, partial-sum-of-weights)
 The start rowkey (supplied by coprocessor framework) would be used to sort 
 the tuples. This way we can determine which region (called R) contains the 
 (weighted) median. partial-sum-of-weights can be 0 if the unweighted median
 is sought.
 The second pass involves scanning the table, beginning with the startrow of
 region R, and computing the partial (weighted) sum until the threshold of S/2
 (where S is the total sum of weights) is crossed. The (weighted) median is
 returned.
 However, this approach wouldn't work if there is mutation in the underlying
 table between pass one and pass two.
 In that case, sequential scanning seems to be the solution, though it is
 slower than the above approach.
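
The second-pass threshold logic can be sketched in isolation. This is an illustrative, self-contained computation over an in-memory array standing in for the scanned column values; it is not AggregateProtocol code.

```java
public class WeightedMedianSketch {
    // Returns the first value whose running weight sum crosses S/2,
    // where S is the total sum of weights. Values are assumed to be
    // in sorted (row key) order, mirroring the table scan.
    static double weightedMedian(double[] values, double[] weights) {
        double total = 0;
        for (double w : weights) {
            total += w;
        }
        double partial = 0;
        for (int i = 0; i < values.length; i++) {
            partial += weights[i];
            if (partial >= total / 2) {
                return values[i];
            }
        }
        throw new IllegalStateException("empty input");
    }

    public static void main(String[] args) {
        // Unweighted case: every weight is 1, so this is the plain median.
        System.out.println(weightedMedian(
                new double[]{1, 2, 3, 4, 5},
                new double[]{1, 1, 1, 1, 1})); // 3.0
    }
}
```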





[jira] [Issue Comment Edited] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread Zhihong Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190945#comment-13190945
 ] 

Zhihong Yu edited comment on HBASE-5231 at 1/23/12 5:16 PM:


I think we can do that.
Regarding the log message "Done. Calculated a load balance in ", we can move it
out of balanceCluster.
Move it to the code below?
{code}
+    for (Map<ServerName, List<HRegionInfo>> assignments :
+        assignmentsByTable.values()) {
+      List<RegionPlan> partialPlans =
+          this.balancer.balanceCluster(assignments);
+      if (partialPlans != null) plans.addAll(partialPlans);
     }
{code}

  was (Author: sunnygao):
I think we can do that.
Regarding the log message "Done. Calculated a load balance in ", we can move it
out of balanceCluster.
Move it to the code below?

+    for (Map<ServerName, List<HRegionInfo>> assignments :
+        assignmentsByTable.values()) {
+      List<RegionPlan> partialPlans =
+          this.balancer.balanceCluster(assignments);
+      if (partialPlans != null) plans.addAll(partialPlans);
     }
  
 Backport HBASE-3373 (per-table load balancing) to 0.92
 --

 Key: HBASE-5231
 URL: https://issues.apache.org/jira/browse/HBASE-5231
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
 Fix For: 0.92.1

 Attachments: 5231.txt


 This JIRA backports per-table load balancing to 0.90





[jira] [Created] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Zhihong Yu (Created) (JIRA)
Move coprocessors set out of RegionLoad
---

 Key: HBASE-5258
 URL: https://issues.apache.org/jira/browse/HBASE-5258
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu


When I worked on HBASE-5256, I revisited the code related to Ser/De of 
coprocessors set in RegionLoad.

I think the rationale for embedding the coprocessors set is maximum
flexibility, where each region can load different coprocessors.
However, this flexibility adds extra cost to region server to Master
communication and increases the footprint of the Master heap.

Would HServerLoad be a better place for this set?





[jira] [Updated] (HBASE-5255) Use singletons for OperationStatus to save memory

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5255:
--

Attachment: 5255-v2.txt

Patch v2 makes exceptionMsg and code fields final.

 Use singletons for OperationStatus to save memory
 -

 Key: HBASE-5255
 URL: https://issues.apache.org/jira/browse/HBASE-5255
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0, 0.90.5
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
  Labels: performance
 Fix For: 0.94.0, 0.92.1

 Attachments: 5255-v2.txt, 
 HBASE-5255-0.92-Use-singletons-to-remove-unnecessary-memory-allocati.patch, 
 HBASE-5255-trunk-Use-singletons-to-remove-unnecessary-memory-allocati.patch


 Every single {{Put}} causes the allocation of at least one 
 {{OperationStatus}}, yet {{OperationStatus}} is almost always stateless, so 
 these allocations are unnecessary and could be avoided.  Attached patch adds 
 a few singletons and uses them, with no public API change.  I didn't test the 
 patches, but you get the idea.
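
The singleton idea can be sketched like this. The class below is a simplified stand-in for OperationStatus (with final fields, as in patch v2), not the actual HBase type:

```java
public class StatusSingletonSketch {
    enum Code { SUCCESS, FAILURE }

    // Simplified stand-in for OperationStatus.
    static final class OpStatus {
        final Code code;
        final String exceptionMsg;
        OpStatus(Code code, String exceptionMsg) {
            this.code = code;
            this.exceptionMsg = exceptionMsg;
        }
    }

    // One shared instance instead of one allocation per Put.
    static final OpStatus SUCCESS = new OpStatus(Code.SUCCESS, "");

    static OpStatus statusFor(boolean ok, String errorMsg) {
        // Only failures carry per-operation state; successes are stateless
        // and can reuse the singleton.
        return ok ? SUCCESS : new OpStatus(Code.FAILURE, errorMsg);
    }

    public static void main(String[] args) {
        // Both successful operations share the exact same object.
        System.out.println(statusFor(true, "") == statusFor(true, "")); // true
    }
}
```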





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191281#comment-13191281
 ] 

Zhihong Yu commented on HBASE-5231:
---

The above-mentioned log marks the completion of balancing for each table (or
the whole cluster), where the actual region movement is scheduled.
I feel we can leave it there for now.

 Backport HBASE-3373 (per-table load balancing) to 0.92
 --

 Key: HBASE-5231
 URL: https://issues.apache.org/jira/browse/HBASE-5231
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
 Fix For: 0.92.1

 Attachments: 5231.txt


 This JIRA backports per-table load balancing to 0.90





[jira] [Updated] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5231:
--

Attachment: 5231-v2.txt

Patch v2 which I am going to integrate later today.

 Backport HBASE-3373 (per-table load balancing) to 0.92
 --

 Key: HBASE-5231
 URL: https://issues.apache.org/jira/browse/HBASE-5231
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
 Fix For: 0.92.1

 Attachments: 5231-v2.txt, 5231.txt


 This JIRA backports per-table load balancing to 0.90





[jira] [Updated] (HBASE-5240) HBase internalscanner.next javadoc doesn't imply whether or not results are appended or not

2012-01-23 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-5240:
---

Status: Patch Available  (was: Open)

 HBase internalscanner.next javadoc doesn't imply whether or not results are 
 appended or not
 ---

 Key: HBASE-5240
 URL: https://issues.apache.org/jira/browse/HBASE-5240
 Project: HBase
  Issue Type: Bug
Reporter: Alex Newman
Assignee: Alex Newman
 Attachments: 
 0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch


 Just looking at 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/InternalScanner.html.
  We don't know whether or not the results are appended to results list, or if 
 we always clear it first.
 boolean   next(List<KeyValue> results)
   Grab the next row's worth of values.
  boolean  next(List<KeyValue> result, int limit)
   Grab the next row's worth of values with a limit on the number of 
 values to return.
  
 Method Detail
 next
 boolean next(List<KeyValue> results)
  throws IOException
 Grab the next row's worth of values.
 Parameters:
 results - return output array 
 Returns:
 true if more rows exist after this one, false if scanner is done 
 Throws:
 IOException - e
 next
 boolean next(List<KeyValue> result,
  int limit)
  throws IOException
 Grab the next row's worth of values with a limit on the number of values 
 to return.
 Parameters:
 result - return output array
 limit - limit on row count to get 
 Returns:
 true if more rows exist after this one, false if scanner is done 
 Throws:
 IOException - e
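
Until the javadoc pins down the contract, a caller has to be defensive. The toy scanner below appends and never clears, which is exactly the behavior the javadoc leaves unspecified; the names are illustrative stand-ins, not the real InternalScanner API.

```java
import java.util.ArrayList;
import java.util.List;

public class ScannerContractSketch {
    // Toy scanner that appends to the results list and never clears it.
    static class ToyScanner {
        private int row = 0;
        boolean next(List<String> results) {
            results.add("row-" + row++);
            return row < 3; // true while more rows remain
        }
    }

    // Counts rows, clearing the list before each call rather than
    // relying on the scanner to do so.
    static int countRows() {
        ToyScanner scanner = new ToyScanner();
        List<String> results = new ArrayList<>();
        int rows = 0;
        boolean more = true;
        while (more) {
            results.clear(); // defensive: the contract is unspecified
            more = scanner.next(results);
            rows += results.size();
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(countRows()); // 3
    }
}
```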





[jira] [Updated] (HBASE-5240) HBase internalscanner.next javadoc doesn't imply whether or not results are appended or not

2012-01-23 Thread Alex Newman (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Newman updated HBASE-5240:
---

Attachment: 0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch

 HBase internalscanner.next javadoc doesn't imply whether or not results are 
 appended or not
 ---

 Key: HBASE-5240
 URL: https://issues.apache.org/jira/browse/HBASE-5240
 Project: HBase
  Issue Type: Bug
Reporter: Alex Newman
Assignee: Alex Newman
 Attachments: 
 0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch


 Just looking at 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/InternalScanner.html.
  We don't know whether or not the results are appended to results list, or if 
 we always clear it first.
 boolean   next(List<KeyValue> results)
   Grab the next row's worth of values.
  boolean  next(List<KeyValue> result, int limit)
   Grab the next row's worth of values with a limit on the number of 
 values to return.
  
 Method Detail
 next
 boolean next(List<KeyValue> results)
  throws IOException
 Grab the next row's worth of values.
 Parameters:
 results - return output array 
 Returns:
 true if more rows exist after this one, false if scanner is done 
 Throws:
 IOException - e
 next
 boolean next(List<KeyValue> result,
  int limit)
  throws IOException
 Grab the next row's worth of values with a limit on the number of values 
 to return.
 Parameters:
 result - return output array
 limit - limit on row count to get 
 Returns:
 true if more rows exist after this one, false if scanner is done 
 Throws:
 IOException - e





[jira] [Updated] (HBASE-3796) Per-Store Entries in Compaction Queue

2012-01-23 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-3796:
--

Release Note:   (was: Sorry, it seems like I re-opened the wrong patch 
instead of HBASE-3976. Restoring the Fixed status.)

 Per-Store Entries in Compaction Queue
 -

 Key: HBASE-3796
 URL: https://issues.apache.org/jira/browse/HBASE-3796
 Project: HBase
  Issue Type: Bug
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor
 Fix For: 0.92.1

 Attachments: HBASE-3796-fixed.patch, HBASE-3796.patch


 Although compaction is decided on a per-store basis, right now the 
 CompactSplitThread only deals at the Region level for queueing.  Store-level 
 compaction queue entries will give us more visibility into compaction 
 workload + allow us to stop summarizing priorities.





[jira] [Resolved] (HBASE-3796) Per-Store Entries in Compaction Queue

2012-01-23 Thread Mikhail Bautin (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin resolved HBASE-3796.
---

  Resolution: Fixed
Assignee: Nicolas Spiegelberg  (was: Mikhail Bautin)
Release Note: Sorry, it seems like I re-opened the wrong patch instead of 
HBASE-3976. Restoring the Fixed status.

 Per-Store Entries in Compaction Queue
 -

 Key: HBASE-3796
 URL: https://issues.apache.org/jira/browse/HBASE-3796
 Project: HBase
  Issue Type: Bug
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor
 Fix For: 0.92.1

 Attachments: HBASE-3796-fixed.patch, HBASE-3796.patch


 Although compaction is decided on a per-store basis, right now the 
 CompactSplitThread only deals at the Region level for queueing.  Store-level 
 compaction queue entries will give us more visibility into compaction 
 workload + allow us to stop summarizing priorities.





[jira] [Reopened] (HBASE-3976) Disable Block Cache On Compactions

2012-01-23 Thread Mikhail Bautin (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin reopened HBASE-3976:
---

  Assignee: Mikhail Bautin  (was: Nicolas Spiegelberg)

Re-opening until we add a unit test and implement a proper fix.

 Disable Block Cache On Compactions
 --

 Key: HBASE-3976
 URL: https://issues.apache.org/jira/browse/HBASE-3976
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.90.3
Reporter: Karthick Sankarachary
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: HBASE-3976-V3.patch, HBASE-3976-unconditional.patch, 
 HBASE-3976.patch


 Is there a good reason to believe that caching blocks during compactions is 
 beneficial? Currently, if block cache is enabled on a certain family, then 
 every time it's compacted, we load all of its blocks into the (LRU) cache, at 
 the expense of the legitimately hot ones.
 As a matter of fact, this concern was raised earlier in HBASE-1597, which
 rightly points out that we should not bog down the LRU with unnecessary
 blocks during compaction. Even though that issue has been marked as fixed,
 it looks like it ought to be reopened.
 Should we err on the side of caution and not cache blocks during compactions,
 period (as illustrated in the attached patch)? Or can we be selectively
 aggressive about which blocks get cached during compaction (e.g., only
 cache blocks from the recent files)?





[jira] [Commented] (HBASE-3796) Per-Store Entries in Compaction Queue

2012-01-23 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191303#comment-13191303
 ] 

Mikhail Bautin commented on HBASE-3796:
---

Sorry, it seems like I re-opened the wrong patch instead of HBASE-3976. 
Restoring the Fixed status.

 Per-Store Entries in Compaction Queue
 -

 Key: HBASE-3796
 URL: https://issues.apache.org/jira/browse/HBASE-3796
 Project: HBase
  Issue Type: Bug
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor
 Fix For: 0.92.1

 Attachments: HBASE-3796-fixed.patch, HBASE-3796.patch


 Although compaction is decided on a per-store basis, right now the 
 CompactSplitThread only deals at the Region level for queueing.  Store-level 
 compaction queue entries will give us more visibility into compaction 
 workload + allow us to stop summarizing priorities.





[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Marcy Davis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191306#comment-13191306
 ] 

Marcy Davis commented on HBASE-4920:


I will have my friend play around with the Orca image some more based on 
everyone's comments. @Lars Hofhansl, do you have an image of an octopus you 
want to suggest?

 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
 2011-11-30 at 4.06.17 PM.png, photo (2).JPG


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).





[jira] [Commented] (HBASE-5255) Use singletons for OperationStatus to save memory

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191307#comment-13191307
 ] 

Hadoop QA commented on HBASE-5255:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511518/5255-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 84 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/837//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/837//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/837//console

This message is automatically generated.

 Use singletons for OperationStatus to save memory
 -

 Key: HBASE-5255
 URL: https://issues.apache.org/jira/browse/HBASE-5255
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0, 0.90.5
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
  Labels: performance
 Fix For: 0.94.0, 0.92.1

 Attachments: 5255-v2.txt, 
 HBASE-5255-0.92-Use-singletons-to-remove-unnecessary-memory-allocati.patch, 
 HBASE-5255-trunk-Use-singletons-to-remove-unnecessary-memory-allocati.patch


 Every single {{Put}} causes the allocation of at least one 
 {{OperationStatus}}, yet {{OperationStatus}} is almost always stateless, so 
 these allocations are unnecessary and could be avoided.  Attached patch adds 
 a few singletons and uses them, with no public API change.  I didn't test the 
 patches, but you get the idea.





[jira] [Updated] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Marcy Davis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcy Davis updated HBASE-4920:
---

Attachment: apache hbase orca logo_Proof 3.pdf

Here are a few other Orca design options (2 in black and white). 

 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
 2011-11-30 at 4.06.17 PM.png, apache hbase orca logo_Proof 3.pdf, photo 
 (2).JPG


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).





[jira] [Commented] (HBASE-5210) HFiles are missing from an incremental load

2012-01-23 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191310#comment-13191310
 ] 

Jimmy Xiang commented on HBASE-5210:


Any fix in getRandomFilename will just reduce the chance of file name 
collision.  Since this is a rare case, I think it may be better to just fail 
the task if it fails to commit the files in moveTaskOutputs(), without 
overwriting the existing files.  In HDFS 0.23, rename() takes an option not to 
overwrite.  With Hadoop 0.20, we can just do our best to check for conflicts 
before committing the files.

 HFiles are missing from an incremental load
 ---

 Key: HBASE-5210
 URL: https://issues.apache.org/jira/browse/HBASE-5210
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.2
 Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
 RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
Reporter: Lawrence Simpson
 Attachments: HBASE-5210-crazy-new-getRandomFilename.patch


 We run an overnight map/reduce job that loads data from an external source 
 and adds that data to an existing HBase table.  The input files have been 
 loaded into hdfs.  The map/reduce job uses the HFileOutputFormat (and the 
 TotalOrderPartitioner) to create HFiles which are subsequently added to the 
 HBase table.  On at least two separate occasions (that we know of), a range 
 of output would be missing for a given day.  The range of keys for the 
 missing values corresponded to those of a particular region.  This implied 
 that a complete HFile somehow went missing from the job.  Further 
 investigation revealed the following:
  * Two different reducers (running in separate JVMs and thus separate class 
 loaders)
  * in the same server can end up using the same file names for their
  * HFiles.  The scenario is as follows:
  *1.  Both reducers start near the same time.
  *2.  The first reducer reaches the point where it wants to write its 
 first file.
  *3.  It uses the StoreFile class which contains a static Random 
 object 
  *which is initialized by default using a timestamp.
  *4.  The file name is generated using the random number generator.
  *5.  The file name is checked against other existing files.
  *6.  The file is written into temporary files in a directory named
  *after the reducer attempt.
  *7.  The second reduce task reaches the same point, but its 
 StoreClass
  *(which is now in the file system's cache) gets loaded within the
  *time resolution of the OS and thus initializes its Random()
  *object with the same seed as the first task.
  *8.  The second task also checks for an existing file with the name
  *generated by the random number generator and finds no conflict
  *because each task is writing files in its own temporary folder.
  *9.  The first task finishes and gets its temporary files committed
  *to the real folder specified for output of the HFiles.
  * 10.The second task then reaches its own conclusion and commits its
  *files (moveTaskOutputs).  The released Hadoop code just 
 overwrites
  *any files with the same name.  No warning messages or anything.
  *The first task's HFiles just go missing.
  * 
  *  Note:  The reducers here are NOT different attempts at the same 
  *reduce task.  They are different reduce tasks so data is
  *really lost.
 I am currently testing a fix in which I have added code to the Hadoop 
 FileOutputCommitter.moveTaskOutputs method to check for a conflict with
 an existing file in the final output folder and to rename the HFile if
 needed.  This may not be appropriate for all uses of FileOutputFormat.
 So I have put this into a new class which is then used by a subclass of
 HFileOutputFormat.  Subclassing of FileOutputCommitter itself was a bit 
 more of a problem due to private declarations.
 I don't know if my approach is the best fix for the problem.  If someone
 more knowledgeable than myself deems that it is, I will be happy to share
 what I have done and by that time I may have some information on the
 results.
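The seed collision at the heart of steps 3 and 7 above is easy to reproduce: two `Random` instances constructed from the same millisecond timestamp emit identical sequences. A minimal sketch (the filename scheme is simplified, not the real StoreFile code):

```java
import java.util.Random;

public class SeedCollisionDemo {
    // Simplified stand-in for StoreFile's random filename generation.
    static String randomFilename(Random rand) {
        return Integer.toHexString(rand.nextInt());
    }

    public static void main(String[] args) {
        // Two JVMs whose StoreFile classes load within the same millisecond
        // effectively both do new Random(System.currentTimeMillis()) with
        // the same timestamp...
        long sharedSeed = System.currentTimeMillis();
        Random reducerA = new Random(sharedSeed);
        Random reducerB = new Random(sharedSeed);
        // ...so their "random" filename sequences are identical, and the
        // second task's commit silently overwrites the first task's HFile.
        System.out.println(randomFilename(reducerA).equals(randomFilename(reducerB))); // true
    }
}
```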





[jira] [Updated] (HBASE-5255) Use singletons for OperationStatus to save memory

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5255:
--

Attachment: 5255-92.txt

Patch for 0.92 branch

 Use singletons for OperationStatus to save memory
 -

 Key: HBASE-5255
 URL: https://issues.apache.org/jira/browse/HBASE-5255
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.92.0, 0.90.5
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
  Labels: performance
 Fix For: 0.94.0, 0.92.1

 Attachments: 5255-92.txt, 5255-v2.txt, 
 HBASE-5255-0.92-Use-singletons-to-remove-unnecessary-memory-allocati.patch, 
 HBASE-5255-trunk-Use-singletons-to-remove-unnecessary-memory-allocati.patch


 Every single {{Put}} causes the allocation of at least one 
 {{OperationStatus}}, yet {{OperationStatus}} is almost always stateless, so 
 these allocations are unnecessary and could be avoided.  Attached patch adds 
 a few singletons and uses them, with no public API change.  I didn't test the 
 patches, but you get the idea.
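The shape of the change can be sketched as follows (class and member names are illustrative; the real patch touches OperationStatus in the regionserver package):

```java
public final class OperationStatusSketch {
    public enum Code { SUCCESS, FAILURE, BAD_FAMILY }

    private final Code code;
    private final String exceptionMsg;

    private OperationStatusSketch(Code code, String exceptionMsg) {
        this.code = code;
        this.exceptionMsg = exceptionMsg;
    }

    // Stateless outcomes are shared singletons: a batch of Puts reuses
    // these instead of allocating one OperationStatus per operation.
    public static final OperationStatusSketch SUCCESS =
            new OperationStatusSketch(Code.SUCCESS, "");
    public static final OperationStatusSketch FAILURE =
            new OperationStatusSketch(Code.FAILURE, "");

    // Only a status carrying a specific error message needs a fresh object.
    public static OperationStatusSketch failure(String msg) {
        return new OperationStatusSketch(Code.FAILURE, msg);
    }

    public Code getCode() { return code; }
    public String getExceptionMsg() { return exceptionMsg; }
}
```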





[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling

2012-01-23 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191313#comment-13191313
 ] 

Lars Hofhansl commented on HBASE-5257:
--

@Ted: Linked the issues instead.

As for this issue... For maximum flexibility, and to avoid introducing wire 
incompatibility, I propose a small code change in ScanQueryMatcher and a new 
VersionFilterWrapper that takes two Filters (both of which can of course be 
FilterLists): the first is evaluated before the column tracker, the second is 
run after it.
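The proposed wrapper does not exist yet; the following sketch models the idea with plain strings standing in for KeyValues and Predicates standing in for Filters, to show how the two filters would straddle the column tracker's version counting:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch (not HBase API): a ScanQueryMatcher-style pipeline
// where the wrapper's first filter runs before version counting and the
// second after it, as proposed above.
public class VersionFilterWrapperSketch {
    final Predicate<String> preFilter;   // checked before the column tracker
    final Predicate<String> postFilter;  // checked after version handling

    VersionFilterWrapperSketch(Predicate<String> pre, Predicate<String> post) {
        this.preFilter = pre;
        this.postFilter = post;
    }

    // Keep at most maxVersions KVs: pre-filter rejections never consume a
    // version slot, while post-filter rejections do.
    List<String> scan(List<String> kvs, int maxVersions) {
        List<String> out = new ArrayList<>();
        int versions = 0;
        for (String kv : kvs) {
            if (!preFilter.test(kv)) continue;      // skipped before tracking
            if (versions++ >= maxVersions) break;   // column tracker
            if (postFilter.test(kv)) out.add(kv);   // post-version filtering
        }
        return out;
    }
}
```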


 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl

 There are various use cases and filter types where evaluating the filter 
 before versions are handled either does not make sense or makes filter 
 handling more complicated.
 Also see this comment in ScanQueryMatcher:
 {code}
 /**
  * Filters should be checked before checking column trackers. If we do
  * otherwise, as was previously being done, ColumnTracker may increment 
 its
  * counter for even that KV which may be discarded later on by Filter. 
 This
  * would lead to incorrect results in certain cases.
  */
 {code}
 So we had Filters after the column trackers (which do the version checking), 
 and then moved them.
 This should be at the discretion of the Filter.
 We could either add a new method to FilterBase (maybe excludeVersions() or 
 something), or have a new Filter wrapper (like WhileMatchFilter) that should 
 only be used as the outermost filter and indicates the same (maybe 
 ExcludeVersionsFilter).
 See latest comments on HBASE-5229 for motivation.





[jira] [Commented] (HBASE-5210) HFiles are missing from an incremental load

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191314#comment-13191314
 ] 

Zhihong Yu commented on HBASE-5210:
---

I prefer Lawrence's approach.
The only consideration is that it takes relatively long for the proposed change 
in FileOutputCommitter.moveTaskOutputs() to be published, reviewed and pushed 
upstream.

 HFiles are missing from an incremental load
 ---

 Key: HBASE-5210
 URL: https://issues.apache.org/jira/browse/HBASE-5210
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.2
 Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
 RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
Reporter: Lawrence Simpson
 Attachments: HBASE-5210-crazy-new-getRandomFilename.patch


 We run an overnight map/reduce job that loads data from an external source 
 and adds that data to an existing HBase table.  The input files have been 
 loaded into hdfs.  The map/reduce job uses the HFileOutputFormat (and the 
 TotalOrderPartitioner) to create HFiles which are subsequently added to the 
 HBase table.  On at least two separate occasions (that we know of), a range 
 of output would be missing for a given day.  The range of keys for the 
 missing values corresponded to those of a particular region.  This implied 
 that a complete HFile somehow went missing from the job.  Further 
 investigation revealed the following:
  * Two different reducers (running in separate JVMs and thus separate class 
 loaders)
  * in the same server can end up using the same file names for their
  * HFiles.  The scenario is as follows:
  *1.  Both reducers start near the same time.
  *2.  The first reducer reaches the point where it wants to write its 
 first file.
  *3.  It uses the StoreFile class which contains a static Random 
 object 
  *which is initialized by default using a timestamp.
  *4.  The file name is generated using the random number generator.
  *5.  The file name is checked against other existing files.
  *6.  The file is written into temporary files in a directory named
  *after the reducer attempt.
  *7.  The second reduce task reaches the same point, but its 
 StoreClass
  *(which is now in the file system's cache) gets loaded within the
  *time resolution of the OS and thus initializes its Random()
  *object with the same seed as the first task.
  *8.  The second task also checks for an existing file with the name
  *generated by the random number generator and finds no conflict
  *because each task is writing files in its own temporary folder.
  *9.  The first task finishes and gets its temporary files committed
  *to the real folder specified for output of the HFiles.
  * 10.The second task then reaches its own conclusion and commits its
  *files (moveTaskOutputs).  The released Hadoop code just 
 overwrites
  *any files with the same name.  No warning messages or anything.
  *The first task's HFiles just go missing.
  * 
  *  Note:  The reducers here are NOT different attempts at the same 
  *reduce task.  They are different reduce tasks so data is
  *really lost.
 I am currently testing a fix in which I have added code to the Hadoop 
 FileOutputCommitter.moveTaskOutputs method to check for a conflict with
 an existing file in the final output folder and to rename the HFile if
 needed.  This may not be appropriate for all uses of FileOutputFormat.
 So I have put this into a new class which is then used by a subclass of
 HFileOutputFormat.  Subclassing of FileOutputCommitter itself was a bit 
 more of a problem due to private declarations.
 I don't know if my approach is the best fix for the problem.  If someone
 more knowledgeable than myself deems that it is, I will be happy to share
 what I have done and by that time I may have some information on the
 results.





[jira] [Updated] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5230:
---

Attachment: D1353.3.patch

mbautin updated the revision [jira] [HBASE-5230] Extend TestCacheOnWrite to 
ensure we don't cache data blocks on compaction.
Reviewers: nspiegelberg, tedyu, Liyin, stack, JIRA

  Rebasing on trunk changes.

REVISION DETAIL
  https://reviews.facebook.net/D1353

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java


 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally 
 enabled). This is because we have very different implementations of 
 HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
 and with CacheConfig (presumably it's there but not sure if it even works, 
 since the patch in HBASE-3976 may not have been committed). We need to create 
 a unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Enis Soztutar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191317#comment-13191317
 ] 

Enis Soztutar commented on HBASE-4920:
--

Orca +1. Hadoop has the elephant, so HBase as the Orca makes sense in my view. 
I liked design option 2 as well. Can your friend put the logo together with 
the HBase text, so we can see how they look together?

 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
 2011-11-30 at 4.06.17 PM.png, apache hbase orca logo_Proof 3.pdf, photo 
 (2).JPG


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).





[jira] [Updated] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5230:
--

Attachment: Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch

Attaching the most recent patch (rebased on trunk changes -- maybe even 
identical).

 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally 
 enabled). This is because we have very different implementations of 
 HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
 and with CacheConfig (presumably it's there but not sure if it even works, 
 since the patch in HBASE-3976 may not have been committed). We need to create 
 a unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191320#comment-13191320
 ] 

Mikhail Bautin commented on HBASE-5230:
---

@Ted: the unit test failure at 
https://builds.apache.org/job/PreCommit-HBASE-Build/824//testReport/org.apache.hadoop.hbase.regionserver/TestAtomicOperation/testRowMutationMultiThreads/
 seems unrelated. Is this patch OK to be committed? (We can wait for another 
run of unit tests if necessary, I've just re-uploaded the patch.)

 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally 
 enabled). This is because we have very different implementations of 
 HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
 and with CacheConfig (presumably it's there but not sure if it even works, 
 since the patch in HBASE-3976 may not have been committed). We need to create 
 a unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5240) HBase internalscanner.next javadoc doesn't imply whether or not results are appended or not

2012-01-23 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191328#comment-13191328
 ] 

jirapos...@reviews.apache.org commented on HBASE-5240:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3594/
---

Review request for hbase.


Summary
---

Just looking at 
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/InternalScanner.html.
 We don't know whether the results are appended to the results list, or if it 
is always cleared first.

boolean next(List<KeyValue> results)
Grab the next row's worth of values.
boolean next(List<KeyValue> result, int limit)
Grab the next row's worth of values with a limit on the number of values to 
return.

Method Detail
next

boolean next(List<KeyValue> results)
throws IOException

Grab the next row's worth of values.

Parameters:
results - return output array 
Returns:
true if more rows exist after this one, false if scanner is done 
Throws:
IOException - e

next

boolean next(List<KeyValue> result,
int limit)
throws IOException

Grab the next row's worth of values with a limit on the number of values to 
return.

Parameters:
result - return output array
limit - limit on row count to get 
Returns:
true if more rows exist after this one, false if scanner is done 
Throws:
IOException - e


This addresses bug HBASE-5240.
https://issues.apache.org/jira/browse/HBASE-5240


Diffs
-

  src/main/java/org/apache/hadoop/hbase/regionserver/InternalScanner.java 
0f5f36c 

Diff: https://reviews.apache.org/r/3594/diff


Testing
---


Thanks,

Alex



 HBase internalscanner.next javadoc doesn't imply whether or not results are 
 appended or not
 ---

 Key: HBASE-5240
 URL: https://issues.apache.org/jira/browse/HBASE-5240
 Project: HBase
  Issue Type: Bug
Reporter: Alex Newman
Assignee: Alex Newman
 Attachments: 
 0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch


 Just looking at 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/InternalScanner.html.
  We don't know whether the results are appended to the results list, or if 
 it is always cleared first.
  boolean next(List<KeyValue> results)
    Grab the next row's worth of values.
  boolean next(List<KeyValue> result, int limit)
    Grab the next row's worth of values with a limit on the number of 
  values to return.
  
 Method Detail
 next
 boolean next(List<KeyValue> results)
  throws IOException
 Grab the next row's worth of values.
 Parameters:
 results - return output array 
 Returns:
 true if more rows exist after this one, false if scanner is done 
 Throws:
 IOException - e
 next
 boolean next(List<KeyValue> result,
  int limit)
  throws IOException
 Grab the next row's worth of values with a limit on the number of values 
 to return.
 Parameters:
 result - return output array
 limit - limit on row count to get 
 Returns:
 true if more rows exist after this one, false if scanner is done 
 Throws:
 IOException - e





[jira] [Commented] (HBASE-5210) HFiles are missing from an incremental load

2012-01-23 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191327#comment-13191327
 ] 

Todd Lipcon commented on HBASE-5210:


Why not change the output file name to be based on the task attempt ID? There 
is already a unique id for each task available...
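Todd's alternative can be sketched in a couple of lines; the name format below is illustrative, not actual HFileOutputFormat code:

```java
public class AttemptBasedNames {
    // Derive the HFile name from the task attempt ID, which MapReduce
    // already guarantees is unique per task, plus a per-task counter.
    // No randomness, so no cross-reducer collisions are possible.
    static String hfileName(String taskAttemptId, int fileIndex) {
        return taskAttemptId + "_" + fileIndex;
    }

    public static void main(String[] args) {
        System.out.println(hfileName("attempt_201201230000_0001_r_000002_0", 0));
    }
}
```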

 HFiles are missing from an incremental load
 ---

 Key: HBASE-5210
 URL: https://issues.apache.org/jira/browse/HBASE-5210
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.2
 Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
 RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
Reporter: Lawrence Simpson
 Attachments: HBASE-5210-crazy-new-getRandomFilename.patch


 We run an overnight map/reduce job that loads data from an external source 
 and adds that data to an existing HBase table.  The input files have been 
 loaded into hdfs.  The map/reduce job uses the HFileOutputFormat (and the 
 TotalOrderPartitioner) to create HFiles which are subsequently added to the 
 HBase table.  On at least two separate occasions (that we know of), a range 
 of output would be missing for a given day.  The range of keys for the 
 missing values corresponded to those of a particular region.  This implied 
 that a complete HFile somehow went missing from the job.  Further 
 investigation revealed the following:
  * Two different reducers (running in separate JVMs and thus separate class 
 loaders)
  * in the same server can end up using the same file names for their
  * HFiles.  The scenario is as follows:
  *1.  Both reducers start near the same time.
  *2.  The first reducer reaches the point where it wants to write its 
 first file.
  *3.  It uses the StoreFile class which contains a static Random 
 object 
  *which is initialized by default using a timestamp.
  *4.  The file name is generated using the random number generator.
  *5.  The file name is checked against other existing files.
  *6.  The file is written into temporary files in a directory named
  *after the reducer attempt.
  *7.  The second reduce task reaches the same point, but its 
 StoreClass
  *(which is now in the file system's cache) gets loaded within the
  *time resolution of the OS and thus initializes its Random()
  *object with the same seed as the first task.
  *8.  The second task also checks for an existing file with the name
  *generated by the random number generator and finds no conflict
  *because each task is writing files in its own temporary folder.
  *9.  The first task finishes and gets its temporary files committed
  *to the real folder specified for output of the HFiles.
  * 10.The second task then reaches its own conclusion and commits its
  *files (moveTaskOutputs).  The released Hadoop code just 
 overwrites
  *any files with the same name.  No warning messages or anything.
  *The first task's HFiles just go missing.
  * 
  *  Note:  The reducers here are NOT different attempts at the same 
  *reduce task.  They are different reduce tasks so data is
  *really lost.
 I am currently testing a fix in which I have added code to the Hadoop 
 FileOutputCommitter.moveTaskOutputs method to check for a conflict with
 an existing file in the final output folder and to rename the HFile if
 needed.  This may not be appropriate for all uses of FileOutputFormat.
 So I have put this into a new class which is then used by a subclass of
 HFileOutputFormat.  Subclassing of FileOutputCommitter itself was a bit 
 more of a problem due to private declarations.
 I don't know if my approach is the best fix for the problem.  If someone
 more knowledgeable than myself deems that it is, I will be happy to share
 what I have done and by that time I may have some information on the
 results.





[jira] [Created] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-23 Thread Liyin Tang (Created) (JIRA)
Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
---

 Key: HBASE-5259
 URL: https://issues.apache.org/jira/browse/HBASE-5259
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang


Assuming HBase and MapReduce run in the same cluster, TableInputFormat 
overrides the split function, which divides all the regions of a particular 
table into a series of mapper tasks, so each mapper task can process a region 
or one part of a region. Ideally, the mapper task should run on the same 
machine on which the region server hosts the corresponding region. That is why 
TableInputFormat sets the RegionLocation: so that the MapReduce framework can 
respect node locality. 

The code simply sets the host name of the region server as the HRegionLocation. 
However, the host name of the region server may be in a different format than 
the host name of the task tracker (mapper task). The task tracker always gets 
its host name by reverse DNS lookup, and the DNS service may return the host 
name in a different format. For example, the host name of the region server is 
correctly set as a.b.c.d while the reverse DNS lookup may return a.b.c.d. 
(with an additional dot at the end).

So the solution is to set the RegionLocation by reverse DNS lookup as well. No 
matter what host name format the DNS system is using, TableInputFormat is 
responsible for keeping the host name format consistent with the MapReduce 
framework.
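A sketch of the proposed normalization follows; the trailing-dot handling covers the concrete case described above, and the method names are illustrative (the real change would live in TableInputFormat's split code):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LocationNormalizer {
    // Resolve the region server's host and take the reverse-DNS canonical
    // name, so the split location uses the same format the task tracker's
    // own reverse lookup would produce.
    static String normalizeHost(String regionServerHost) throws UnknownHostException {
        InetAddress addr = InetAddress.getByName(regionServerHost);
        return stripTrailingDot(addr.getCanonicalHostName());
    }

    // Some DNS setups return fully-qualified names with a trailing dot
    // (the "a.b.c.d." case); trim it so comparisons match.
    static String stripTrailingDot(String host) {
        return host.endsWith(".") ? host.substring(0, host.length() - 1) : host;
    }
}
```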











[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191339#comment-13191339
 ] 

Jonathan Hsieh commented on HBASE-4920:
---

I feel the cyber look and the hard edges of the wordmark don't quite fit 
with the roundness of the image, but I like the general idea (maybe a shaper 
style for the same idea).

@Stack http://en.wikipedia.org/wiki/Vancouver_Canucks

@Lars Actual pacific northwest native american totem poles with octopus/squid
http://users.imag.net/~sry.jkramer/nativetotems/common.htm
http://www.flickr.com/photos/lostviking/3419653151/

Giant squids are pretty close to the International Orange (Engineering) color.
http://blogs.sfweekly.com/thesnitch/2008/01/breaking_giant_cartoon_squid_a.php

 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
 2011-11-30 at 4.06.17 PM.png, apache hbase orca logo_Proof 3.pdf, photo 
 (2).JPG


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).





[jira] [Issue Comment Edited] (HBASE-4920) We need a mascot, a totem

2012-01-23 Thread Jonathan Hsieh (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191339#comment-13191339
 ] 

Jonathan Hsieh edited comment on HBASE-4920 at 1/23/12 6:42 PM:


I feel the cyber look and the hard edges of the wordmark don't quite fit 
with the roundness of the image, but I like the general idea (maybe a sharper 
style for the same idea).

@Stack http://en.wikipedia.org/wiki/Vancouver_Canucks

@Lars Actual pacific northwest native american totem poles with octopus/squid
http://users.imag.net/~sry.jkramer/nativetotems/common.htm
http://www.flickr.com/photos/lostviking/3419653151/

Giant squids are pretty close to the International Orange (Engineering) color.
http://blogs.sfweekly.com/thesnitch/2008/01/breaking_giant_cartoon_squid_a.php

  was (Author: jmhsieh):
I feel the cyber look and the hard edges of the wordmark doesn't quite 
fit with the roundness of the image but like the general idea (maybe a shaper 
style for the same idea).

@Stack http://en.wikipedia.org/wiki/Vancouver_Canucks

@Lars Actual pacific northwest native american totem poles with octopus/squid
http://users.imag.net/~sry.jkramer/nativetotems/common.htm
http://www.flickr.com/photos/lostviking/3419653151/

Giant squids are pretty close to the International Orange (Engineering) color.
http://blogs.sfweekly.com/thesnitch/2008/01/breaking_giant_cartoon_squid_a.php
  
 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 
 2011-11-30 at 4.06.17 PM.png, apache hbase orca logo_Proof 3.pdf, photo 
 (2).JPG


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).





[jira] [Commented] (HBASE-5240) HBase internalscanner.next javadoc doesn't imply whether or not results are appended or not

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191341#comment-13191341
 ] 

Hadoop QA commented on HBASE-5240:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511522/0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 156 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestAtomicOperation
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/838//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/838//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/838//console

This message is automatically generated.

 HBase internalscanner.next javadoc doesn't imply whether or not results are 
 appended or not
 ---

 Key: HBASE-5240
 URL: https://issues.apache.org/jira/browse/HBASE-5240
 Project: HBase
  Issue Type: Bug
Reporter: Alex Newman
Assignee: Alex Newman
 Attachments: 
 0001-HBASE-5240.-HBase-internalscanner.next-javadoc-doesn.patch


 Just looking at 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/InternalScanner.html.
  We don't know whether or not the results are appended to the results list, or if 
 we always clear it first.
 boolean   next(List<KeyValue> results)
   Grab the next row's worth of values.
  boolean  next(List<KeyValue> result, int limit)
   Grab the next row's worth of values with a limit on the number of 
 values to return.
  
 Method Detail
 next
 boolean next(List<KeyValue> results)
  throws IOException
 Grab the next row's worth of values.
 Parameters:
 results - return output array 
 Returns:
 true if more rows exist after this one, false if scanner is done 
 Throws:
 IOException - e
 next
 boolean next(List<KeyValue> result,
  int limit)
  throws IOException
 Grab the next row's worth of values with a limit on the number of values 
 to return.
 Parameters:
 result - return output array
 limit - limit on row count to get 
 Returns:
 true if more rows exist after this one, false if scanner is done 
 Throws:
 IOException - e
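Until the javadoc is clarified, a caller can sidestep the ambiguity by clearing the list itself on every iteration. A minimal sketch, where the RowSource interface below is a hypothetical stand-in for InternalScanner (using String instead of KeyValue), not the real API:

```java
import java.util.ArrayList;
import java.util.List;

public class ScanLoopSketch {
    // Stand-in for InternalScanner.next(List<KeyValue>): appends to the
    // list and returns true while more rows remain.
    interface RowSource {
        boolean next(List<String> results);
    }

    public static int countRows(RowSource scanner) {
        List<String> results = new ArrayList<>();
        int rows = 0;
        boolean more;
        do {
            results.clear();           // defensive: don't rely on the scanner clearing it
            more = scanner.next(results);
            if (!results.isEmpty()) {
                rows++;
            }
        } while (more);
        return rows;
    }
}
```

If the contract is eventually documented as "always cleared first", the explicit clear() becomes redundant but stays harmless.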





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191345#comment-13191345
 ] 

Zhihong Yu commented on HBASE-5230:
---

TestAtomicOperation passed locally.
The patch should be good to go.

 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is enabled generally 
 enabled). This is because we have very different implementations of 
 HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
 and with CacheConfig (presumably it's there but not sure if it even works, 
 since the patch in HBASE-3976 may not have been committed). We need to create 
 a unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Updated] (HBASE-5222) Stopping replication via the stop_replication command in hbase shell on a slave cluster isn't acknowledged in the replication sink

2012-01-23 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5222:
--

Component/s: shell
 replication

 Stopping replication via the stop_replication command in hbase shell on a 
 slave cluster isn't acknowledged in the replication sink
 

 Key: HBASE-5222
 URL: https://issues.apache.org/jira/browse/HBASE-5222
 Project: HBase
  Issue Type: Bug
  Components: replication, shell
Affects Versions: 0.90.4
Reporter: Josh Wymer

 After running stop_replication in the hbase shell on our slave cluster we 
 saw replication continue for weeks. Turns out that the replication sink is 
 missing a check to get the replication state and therefore continued to write.
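The missing guard could look roughly like this; the class and method names are hypothetical and only illustrate the state check the sink should perform, not the actual ReplicationSink code:

```java
public class SinkGuardSketch {
    // Mirrors the replication state toggled by start/stop_replication.
    private volatile boolean replicationEnabled = true;

    public void setReplicationEnabled(boolean enabled) {
        replicationEnabled = enabled;
    }

    // Returns how many edits were actually applied.
    public int replicate(int batchSize) {
        if (!replicationEnabled) {
            return 0;  // stop_replication was issued; refuse the batch
        }
        return batchSize;
    }
}
```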





[jira] [Commented] (HBASE-5210) HFiles are missing from an incremental load

2012-01-23 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191351#comment-13191351
 ] 

Jimmy Xiang commented on HBASE-5210:


I like this one.  It's really simple and clean.

 HFiles are missing from an incremental load
 ---

 Key: HBASE-5210
 URL: https://issues.apache.org/jira/browse/HBASE-5210
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.2
 Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
 RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
Reporter: Lawrence Simpson
 Attachments: HBASE-5210-crazy-new-getRandomFilename.patch


 We run an overnight map/reduce job that loads data from an external source 
 and adds that data to an existing HBase table.  The input files have been 
 loaded into hdfs.  The map/reduce job uses the HFileOutputFormat (and the 
 TotalOrderPartitioner) to create HFiles which are subsequently added to the 
 HBase table.  On at least two separate occasions (that we know of), a range 
 of output would be missing for a given day.  The range of keys for the 
 missing values corresponded to those of a particular region.  This implied 
 that a complete HFile somehow went missing from the job.  Further 
 investigation revealed the following:
  * Two different reducers (running in separate JVMs and thus separate class 
 loaders)
  * in the same server can end up using the same file names for their
  * HFiles.  The scenario is as follows:
  *1.  Both reducers start near the same time.
  *2.  The first reducer reaches the point where it wants to write its 
 first file.
  *3.  It uses the StoreFile class which contains a static Random 
 object 
  *which is initialized by default using a timestamp.
  *4.  The file name is generated using the random number generator.
  *5.  The file name is checked against other existing files.
  *6.  The file is written into temporary files in a directory named
  *after the reducer attempt.
  *7.  The second reduce task reaches the same point, but its 
 StoreClass
  *(which is now in the file system's cache) gets loaded within the
  *time resolution of the OS and thus initializes its Random()
  *object with the same seed as the first task.
  *8.  The second task also checks for an existing file with the name
  *generated by the random number generator and finds no conflict
  *because each task is writing files in its own temporary folder.
  *9.  The first task finishes and gets its temporary files committed
  *to the real folder specified for output of the HFiles.
  * 10.The second task then reaches its own conclusion and commits its
  *files (moveTaskOutputs).  The released Hadoop code just 
 overwrites
  *any files with the same name.  No warning messages or anything.
  *The first task's HFiles just go missing.
  * 
  *  Note:  The reducers here are NOT different attempts at the same 
  *reduce task.  They are different reduce tasks so data is
  *really lost.
 I am currently testing a fix in which I have added code to the Hadoop 
 FileOutputCommitter.moveTaskOutputs method to check for a conflict with
 an existing file in the final output folder and to rename the HFile if
 needed.  This may not be appropriate for all uses of FileOutputFormat.
 So I have put this into a new class which is then used by a subclass of
 HFileOutputFormat.  Subclassing of FileOutputCommitter itself was a bit 
 more of a problem due to private declarations.
 I don't know if my approach is the best fix for the problem.  If someone
 more knowledgeable than myself deems that it is, I will be happy to share
 what I have done and by that time I may have some information on the
 results.
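The rename-on-conflict idea in the description can be sketched abstractly. The helper below is hypothetical: it models the committed output folder as a set of names and shows only the probe-and-suffix loop, not the actual FileOutputCommitter.moveTaskOutputs change:

```java
import java.util.Set;

public class CommitRenameSketch {
    // Before committing a task's file, probe the destination for a name
    // collision and pick a fresh suffixed name instead of overwriting.
    public static String resolveName(String name, Set<String> committed) {
        String candidate = name;
        int attempt = 0;
        while (committed.contains(candidate)) {
            attempt++;
            candidate = name + "_" + attempt;  // e.g. "3f2a" becomes "3f2a_1"
        }
        return candidate;
    }
}
```

The same effect could be had by seeding each task's Random from the task attempt ID, but the explicit check also covers any other source of duplicate names.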





[jira] [Created] (HBASE-5260) [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using incorrect XML element to config entry

2012-01-23 Thread Doug Meil (Created) (JIRA)
[book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using 
incorrect XML element to config entry


 Key: HBASE-5260
 URL: https://issues.apache.org/jira/browse/HBASE-5260
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial


troubleshooting.xml
* the Troubleshooting/Network/Loopback IP entry is using the incorrect XML 
element to link to the Config section.  It's using link instead of an xref, 
so the description is ???   Oddly enough, though, the link actually works.





[jira] [Updated] (HBASE-5260) [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using incorrect XML element to config entry

2012-01-23 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5260:
-

Status: Patch Available  (was: Open)

 [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using 
 incorrect XML element to config entry
 

 Key: HBASE-5260
 URL: https://issues.apache.org/jira/browse/HBASE-5260
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: troubleshooting_hbase_5260.xml.patch


 troubleshooting.xml
 * the Troubleshooting/Network/Loopback IP entry is using the incorrect XML 
 element to link to the Config section.  It's using link instead of an 
 xref, so the description is ???   Oddly enough, though, the link actually 
 works.





[jira] [Updated] (HBASE-5260) [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using incorrect XML element to config entry

2012-01-23 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5260:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using 
 incorrect XML element to config entry
 

 Key: HBASE-5260
 URL: https://issues.apache.org/jira/browse/HBASE-5260
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: troubleshooting_hbase_5260.xml.patch


 troubleshooting.xml
 * the Troubleshooting/Network/Loopback IP entry is using the incorrect XML 
 element to link to the Config section.  It's using link instead of an 
 xref, so the description is ???   Oddly enough, though, the link actually 
 works.





[jira] [Updated] (HBASE-5260) [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using incorrect XML element to config entry

2012-01-23 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-5260:
-

Attachment: troubleshooting_hbase_5260.xml.patch

 [book] troubleshooting.xml - Troubleshooting/Network/Loopback IP using 
 incorrect XML element to config entry
 

 Key: HBASE-5260
 URL: https://issues.apache.org/jira/browse/HBASE-5260
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Trivial
 Attachments: troubleshooting_hbase_5260.xml.patch


 troubleshooting.xml
 * the Troubleshooting/Network/Loopback IP entry is using the incorrect XML 
 element to link to the Config section.  It's using link instead of an 
 xref, so the description is ???   Oddly enough, though, the link actually 
 works.





[jira] [Commented] (HBASE-5255) Use singletons for OperationStatus to save memory

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191356#comment-13191356
 ] 

Hadoop QA commented on HBASE-5255:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511526/5255-92.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 84 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/839//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/839//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/839//console

This message is automatically generated.

 Use singletons for OperationStatus to save memory
 -

 Key: HBASE-5255
 URL: https://issues.apache.org/jira/browse/HBASE-5255
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.90.5, 0.92.0
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
  Labels: performance
 Fix For: 0.94.0, 0.92.1

 Attachments: 5255-92.txt, 5255-v2.txt, 
 HBASE-5255-0.92-Use-singletons-to-remove-unnecessary-memory-allocati.patch, 
 HBASE-5255-trunk-Use-singletons-to-remove-unnecessary-memory-allocati.patch


 Every single {{Put}} causes the allocation of at least one 
 {{OperationStatus}}, yet {{OperationStatus}} is almost always stateless, so 
 these allocations are unnecessary and could be avoided.  Attached patch adds 
 a few singletons and uses them, with no public API change.  I didn't test the 
 patches, but you get the idea.
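A minimal sketch of the singleton idea, with hypothetical names (the real OperationStatus API may differ): stateless outcomes share one instance, and an object is allocated only when a status actually carries a per-operation message.

```java
public class StatusSketch {
    enum Code { SUCCESS, FAILURE }

    static final class Status {
        final Code code;
        final String message;
        Status(Code code, String message) { this.code = code; this.message = message; }
    }

    // One shared, immutable instance for the common stateless outcome.
    static final Status SUCCESS = new Status(Code.SUCCESS, "");

    static Status success() { return SUCCESS; }       // no allocation per Put

    static Status failure(String msg) {               // stateful: allocate
        return new Status(Code.FAILURE, msg);
    }
}
```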





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191357#comment-13191357
 ] 

Hadoop QA commented on HBASE-5230:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511528/Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 84 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.coprocessor.TestClassLoading
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/840//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/840//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/840//console

This message is automatically generated.

 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is enabled generally 
 enabled). This is because we have very different implementations of 
 HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
 and with CacheConfig (presumably it's there but not sure if it even works, 
 since the patch in HBASE-3976 may not have been committed). We need to create 
 a unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Updated] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-23 Thread Mubarak Seyed (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubarak Seyed updated HBASE-4720:
-

Attachment: HBASE-4720.trunk.v6.patch

The attached file (HBASE-4720.trunk.v6.patch) is updated patch file. Thanks.

 Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
 client/server 
 

 Key: HBASE-4720
 URL: https://issues.apache.org/jira/browse/HBASE-4720
 Project: HBase
  Issue Type: Improvement
Reporter: Daniel Lord
Assignee: Mubarak Seyed
 Fix For: 0.94.0

 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
 HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
 HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
 HBASE-4720.v3.patch


 I have several large application/HBase clusters where an application node 
 will occasionally need to talk to HBase from a different cluster.  In order 
 to help ensure some of my consistency guarantees I have a sentinel table that 
 is updated atomically as users interact with the system.  This works quite 
 well for the regular hbase client but the REST client does not implement 
 the checkAndPut and checkAndDelete operations.  This exposes the application 
 to some race conditions that have to be worked around.  It would be ideal if 
 the same checkAndPut/checkAndDelete operations could be supported by the REST 
 client.





[jira] [Commented] (HBASE-4397) -ROOT-, .META. tables stay offline for too long in recovery phase after all RSs are shutdown at the same time

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191363#comment-13191363
 ] 

Hudson commented on HBASE-4397:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5237 Addendum for HBASE-5160 and HBASE-4397(Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


 -ROOT-, .META. tables stay offline for too long in recovery phase after all 
 RSs are shutdown at the same time
 -

 Key: HBASE-4397
 URL: https://issues.apache.org/jira/browse/HBASE-4397
 Project: HBase
  Issue Type: Bug
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.94.0, 0.92.0

 Attachments: HBASE-4397-0.92.patch


 1. Shutdown all RSs.
 2. Bring all RS back online.
 The -ROOT- and .META. tables stay in the offline state until the timeout monitor forces 
 assignment 30 minutes later. That is because HMaster can't find a RS to 
 assign the tables to in the assign operation.
 2011-09-13 13:25:52,743 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
 Failed assignment of -ROOT-,,0.70236052 to sea-lab-4,60020,1315870341387, 
 trying to assign elsewhere instead; retry=0
 java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
 at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:373)
 at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:345)
 at 
 org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1002)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:854)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:148)
 at $Proxy9.openRegion(Unknown Source)
 at 
 org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:407)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1408)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1153)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1128)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1123)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:1788)
 at 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.verifyAndAssignRoot(ServerShutdownHandler.java:100)
 at 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.verifyAndAssignRootWithRetries(ServerShutdownHandler.java:118)
 at 
 org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:181)
 at 
 org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:167)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 2011-09-13 13:25:52,743 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable 
 location to assign region -ROOT-,,0.70236052
 Possible fixes:
 1. Have serverManager handle server online event similar to how 
 RegionServerTracker.java calls servermanager.expireServer in the case server 
 goes down.
 2. Make timeoutMonitor handle the situation better. This is a special 
 situation in the cluster. 30 minutes timeout can be skipped.





[jira] [Commented] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191360#comment-13191360
 ] 

Hudson commented on HBASE-5243:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5243 LogSyncerThread not getting shutdown waiting for the interrupted 
flag(Ram).

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java


 LogSyncerThread not getting shutdown waiting for the interrupted flag
 -

 Key: HBASE-5243
 URL: https://issues.apache.org/jira/browse/HBASE-5243
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: HBASE-5243_0.90.patch, HBASE-5243_0.90_1.patch, 
 HBASE-5243_trunk.patch


 In the LogSyncer run() we keep looping till the isInterrupted flag is set.
 But in some cases the DFSClient consumes the InterruptedException, so
 we run into an infinite loop in some shutdown cases.
 I would suggest that, as we are the ones who try to close down the
 LogSyncerThread, we can introduce a variable like
 close or shutdown and, based on the state of this flag along with
 isInterrupted(), make the thread stop.
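The suggested close/shutdown flag can be sketched like this (names are hypothetical; this is not the actual HLog code). The point is that a volatile flag survives even when a lower layer swallows the interrupt:

```java
public class SyncerLoopSketch extends Thread {
    private volatile boolean closing = false;

    public void shutdown() {
        closing = true;   // stops the loop even if the interrupt is consumed
        interrupt();
    }

    @Override
    public void run() {
        while (!closing && !isInterrupted()) {
            try {
                Thread.sleep(10);   // stand-in for the sync/flush wait
            } catch (InterruptedException e) {
                // a DFSClient-like layer may swallow this; the flag still
                // terminates the loop on the next check
            }
        }
    }
}
```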





[jira] [Commented] (HBASE-5160) Backport HBASE-4397 - -ROOT-, .META. tables stay offline for too long in recovery phase after all RSs are shutdown at the same time

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191362#comment-13191362
 ] 

Hudson commented on HBASE-5160:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5237 Addendum for HBASE-5160 and HBASE-4397(Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


 Backport HBASE-4397 - -ROOT-, .META. tables stay offline for too long in 
 recovery phase after all RSs are shutdown at the same time
 ---

 Key: HBASE-5160
 URL: https://issues.apache.org/jira/browse/HBASE-5160
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-5160-AssignmentManager.patch, HBASE-5160_2.patch


 Backporting to 0.90.6 considering the importance of the issue.





[jira] [Commented] (HBASE-5235) HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions.

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191359#comment-13191359
 ] 

Hudson commented on HBASE-5235:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5235 HLogSplitter writer thread's streams not getting closed when any 
of the writer threads has exceptions. (Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java


 HLogSplitter writer thread's streams not getting closed when any of the 
 writer threads has exceptions.
 --

 Key: HBASE-5235
 URL: https://issues.apache.org/jira/browse/HBASE-5235
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5, 0.92.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: HBASE-5235_0.90.patch, HBASE-5235_0.90_1.patch, 
 HBASE-5235_0.90_2.patch, HBASE-5235_trunk.patch


 Please find the analysis below. Correct me if I am wrong.
 {code}
 2012-01-15 05:14:02,374 FATAL 
 org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-9 Got 
 while writing log entry to log
 java.io.IOException: All datanodes 10.18.40.200:50010 are bad. Aborting...
   at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3373)
   at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2811)
   at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3026)
 {code}
 Here we have an exception in one of the writer threads. If any exception we 
 try to hold it in an Atomic variable 
 {code}
   private void writerThreadError(Throwable t) {
 thrown.compareAndSet(null, t);
   }
 {code}
 In the finally block of splitLog we try to close the streams.
 {code}
   for (WriterThread t : writerThreads) {
     try {
       t.join();
     } catch (InterruptedException ie) {
       throw new IOException(ie);
     }
     checkForErrors();
   }
   LOG.info("Split writers finished");

   return closeStreams();
 {code}
 Inside checkForErrors
 {code}
   private void checkForErrors() throws IOException {
     Throwable thrown = this.thrown.get();
     if (thrown == null) return;
     if (thrown instanceof IOException) {
       throw (IOException) thrown;
     } else {
       throw new RuntimeException(thrown);
     }
   }
 {code}
 So once we throw the exception, the DFSStreamer threads are not getting closed.
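The failure mode above can be sketched outside HBase: once checkForErrors() throws, the return closeStreams() line is never reached and the writer streams leak. A minimal illustration of the fix pattern (hypothetical class and method names, not the actual HLogSplitter code) is to close the streams in a finally block so they are released on both paths:

```java
// Sketch of the stream-leak fix discussed above. SplitLogSketch and its
// members are hypothetical stand-ins for the HLogSplitter internals.
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class SplitLogSketch {
  // First error recorded by any writer thread, as in writerThreadError().
  private final AtomicReference<Throwable> thrown = new AtomicReference<>();
  private final List<Closeable> outputStreams = new ArrayList<>();

  void writerThreadError(Throwable t) {
    thrown.compareAndSet(null, t); // keep only the first error
  }

  private void checkForErrors() throws IOException {
    Throwable t = thrown.get();
    if (t == null) return;
    if (t instanceof IOException) throw (IOException) t;
    throw new RuntimeException(t);
  }

  List<String> finishWriting() throws IOException {
    try {
      checkForErrors();         // may throw: a writer thread failed
      return new ArrayList<>(); // stand-in for the closeStreams() result
    } finally {
      // Close streams whether or not checkForErrors() threw.
      for (Closeable c : outputStreams) {
        try { c.close(); } catch (IOException e) { /* log and continue */ }
      }
    }
  }

  public static void main(String[] args) {
    SplitLogSketch s = new SplitLogSketch();
    s.writerThreadError(new IOException("simulated writer failure"));
    try {
      s.finishWriting();
    } catch (IOException expected) {
      System.out.println("error surfaced; streams were still closed");
    }
  }
}
```

The key point is that the close loop runs even on the exception path, which is what the original control flow misses.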





[jira] [Commented] (HBASE-5237) Addendum for HBASE-5160 and HBASE-4397

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191361#comment-13191361
 ] 

Hudson commented on HBASE-5237:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5237 Addendum for HBASE-5160 and HBASE-4397(Ram)

ramkrishna : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


 Addendum for HBASE-5160 and HBASE-4397
 --

 Key: HBASE-5237
 URL: https://issues.apache.org/jira/browse/HBASE-5237
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: HBASE-5237_0.90.patch, HBASE-5237_trunk.patch


 As part of HBASE-4397 there is one more scenario where the patch has to be 
 applied.
 {code}
  RegionPlan plan = getRegionPlan(state, forceNewPlan);
  if (plan == null) {
    debugLog(state.getRegion(),
        "Unable to determine a plan to assign " + state);
    return; // Should get reassigned later when RIT times out.
  }
 {code}
 I think in this scenario also 
 {code}
 this.timeoutMonitor.setAllRegionServersOffline(true);
 {code}
 this should be done.





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191365#comment-13191365
 ] 

Hudson commented on HBASE-5231:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5231  Backport HBASE-3373 (per-table load balancing) to 0.92

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java


 Backport HBASE-3373 (per-table load balancing) to 0.92
 --

 Key: HBASE-5231
 URL: https://issues.apache.org/jira/browse/HBASE-5231
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
 Fix For: 0.92.1

 Attachments: 5231-v2.txt, 5231.txt


 This JIRA backports per-table load balancing to 0.90





[jira] [Commented] (HBASE-3373) Allow regions to be load-balanced by table

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191364#comment-13191364
 ] 

Hudson commented on HBASE-3373:
---

Integrated in HBase-0.92 #257 (See 
[https://builds.apache.org/job/HBase-0.92/257/])
HBASE-5231  Backport HBASE-3373 (per-table load balancing) to 0.92

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java


 Allow regions to be load-balanced by table
 --

 Key: HBASE-3373
 URL: https://issues.apache.org/jira/browse/HBASE-3373
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.20.6
Reporter: Ted Yu
Assignee: Zhihong Yu
 Fix For: 0.94.0

 Attachments: 3373.txt, HbaseBalancerTest2.java


 From our experience, cluster can be well balanced and yet, one table's 
 regions may be badly concentrated on few region servers.
 For example, one table has 839 regions (380 regions at time of table 
 creation) out of which 202 are on one server.
 It would be desirable for load balancer to distribute regions for specified 
 tables evenly across the cluster. Each of such tables has number of regions 
 many times the cluster size.





[jira] [Created] (HBASE-5261) Update HBase for Java 7

2012-01-23 Thread Mikhail Bautin (Created) (JIRA)
Update HBase for Java 7
---

 Key: HBASE-5261
 URL: https://issues.apache.org/jira/browse/HBASE-5261
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin


We need to make sure that HBase compiles and works with JDK 7. Once we verify 
it is reasonably stable, we can explore utilizing the G1 garbage collector. 
When all deployments are ready to move to JDK 7, we can start using new 
language features, but in the transition period we will need to maintain a 
codebase that compiles both with JDK 6 and JDK 7.





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191372#comment-13191372
 ] 

Mikhail Bautin commented on HBASE-5230:
---

The above failed tests passed locally:

Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 216.187 sec
Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 78.841 sec
Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 97.529 sec
Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 64.111 sec
Running org.apache.hadoop.hbase.coprocessor.TestClassLoading
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.787 sec

Results :

Tests run: 24, Failures: 0, Errors: 0, Skipped: 0


 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally
 enabled). This is because we have very different implementations of 
 HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
 and with CacheConfig (presumably it's there but not sure if it even works, 
 since the patch in HBASE-3976 may not have been committed). We need to create 
 a unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191377#comment-13191377
 ] 

stack commented on HBASE-5231:
--

It looks like a method named getAssignmentsByTable will only do this if a 
particular configuration is set; else it will do assignments the old way.  
Seems like an odd name for this method.  I'd have thought it would have 
remained getAssignments and then in getAssignments we'd switch on whether to do 
by table or not.

Does this change the default? I can't tell.


 Backport HBASE-3373 (per-table load balancing) to 0.92
 --

 Key: HBASE-5231
 URL: https://issues.apache.org/jira/browse/HBASE-5231
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
 Fix For: 0.92.1

 Attachments: 5231-v2.txt, 5231.txt


 This JIRA backports per-table load balancing to 0.90





[jira] [Updated] (HBASE-4141) Fix LRU stats message

2012-01-23 Thread Vikram Srivastava (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Srivastava updated HBASE-4141:
-

Attachment: LruBlockCache_HBASE_4141.patch

Fixed the brackets. Currently the comma would not be printed if the value is 
zero.

 Fix LRU stats message
 -

 Key: HBASE-4141
 URL: https://issues.apache.org/jira/browse/HBASE-4141
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Lars George
Priority: Trivial
  Labels: newbie
 Attachments: LruBlockCache_HBASE_4141.patch


 Currently the DEBUG message looks like this:
 {noformat}
 2011-07-26 04:21:52,344 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
 LRU Stats: total=3.24 MB, free=391.76 MB, max=395 MB, blocks=0, 
 accesses=118458, hits=0, hitRatio=0.00%%, cachingAccesses=0, cachingHits=0, 
 cachingHitsRatio=�%, evictions=0, evicted=0, evictedPerRun=NaN
 {noformat}
 Note the double percent on hitRatio, and the stray character at 
 cachingHitsRatio.
 The former is a added by the code in LruBlockCache.java:
 {code}
 ...
  "hitRatio=" +
    (stats.getHitCount() == 0 ? "0" :
      (StringUtils.formatPercent(stats.getHitRatio(), 2) + "%, ")) +
 ...
 {code}
 The StringUtils already adds a percent sign, so the trailing one here can be 
 dropped.
 The latter, I presume, is caused by the value not being between 0.0 and 1.0. 
 This should be checked, and NaN or similar displayed instead, as is done for 
 other values.
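The doubled percent sign can be reproduced with a plain formatter. The sketch below uses String.format rather than Hadoop's StringUtils.formatPercent, so the helper here is a stand-in for the real API, but the shape of the bug is the same:

```java
// Minimal illustration of the doubled '%' described above. formatPercent here
// is a hypothetical stand-in for Hadoop's StringUtils.formatPercent, which
// also appends a '%' to the value it formats.
import java.util.Locale;

public class PercentDemo {
  // Formats a ratio as a percentage, already including the trailing '%'.
  static String formatPercent(double ratio, int decimals) {
    return String.format(Locale.ROOT, "%." + decimals + "f%%", ratio * 100);
  }

  // Buggy: appends a second literal '%' on top of the one the formatter adds.
  static String buggy(double ratio) {
    return formatPercent(ratio, 2) + "%";
  }

  // Fixed: the formatter's own '%' is enough.
  static String fixed(double ratio) {
    return formatPercent(ratio, 2);
  }

  public static void main(String[] args) {
    System.out.println("hitRatio=" + buggy(0.0)); // hitRatio=0.00%%
    System.out.println("hitRatio=" + fixed(0.0)); // hitRatio=0.00%
  }
}
```

Dropping the appended literal, as the report suggests, removes the extra sign.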





[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191388#comment-13191388
 ] 

Hadoop QA commented on HBASE-4720:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511535/HBASE-4720.trunk.v6.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 85 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/841//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/841//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/841//console

This message is automatically generated.

 Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
 client/server 
 

 Key: HBASE-4720
 URL: https://issues.apache.org/jira/browse/HBASE-4720
 Project: HBase
  Issue Type: Improvement
Reporter: Daniel Lord
Assignee: Mubarak Seyed
 Fix For: 0.94.0

 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
 HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
 HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
 HBASE-4720.v3.patch


 I have several large application/HBase clusters where an application node 
 will occasionally need to talk to HBase from a different cluster.  In order 
 to help ensure some of my consistency guarantees I have a sentinel table that 
 is updated atomically as users interact with the system.  This works quite 
 well for the regular hbase client but the REST client does not implement 
 the checkAndPut and checkAndDelete operations.  This exposes the application 
 to some race conditions that have to be worked around.  It would be ideal if 
 the same checkAndPut/checkAndDelete operations could be supported by the REST 
 client.





[jira] [Assigned] (HBASE-5261) Update HBase for Java 7

2012-01-23 Thread Mikhail Bautin (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin reassigned HBASE-5261:
-

Assignee: Mikhail Bautin

 Update HBase for Java 7
 ---

 Key: HBASE-5261
 URL: https://issues.apache.org/jira/browse/HBASE-5261
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin

 We need to make sure that HBase compiles and works with JDK 7. Once we verify 
 it is reasonably stable, we can explore utilizing the G1 garbage collector. 
 When all deployments are ready to move to JDK 7, we can start using new 
 language features, but in the transition period we will need to maintain a 
 codebase that compiles both with JDK 6 and JDK 7.





[jira] [Created] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-23 Thread Mikhail Bautin (Created) (JIRA)
Structured event log for HBase for monitoring and auto-tuning performance
-

 Key: HBASE-5262
 URL: https://issues.apache.org/jira/browse/HBASE-5262
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin


Creating this JIRA to open a discussion about a structured (machine-readable) 
log that will record events such as compaction start/end times, compaction 
input/output files, their sizes, the same for flushes, etc. This can be stored 
e.g. in a new system table in HBase itself. The data from this log can then be 
analyzed and used to optimize compactions at run time, or otherwise auto-tune 
HBase configuration to reduce the number of knobs the user has to configure.





[jira] [Commented] (HBASE-5257) Allow filter to be evaluated after version handling

2012-01-23 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191397#comment-13191397
 ] 

Lars Hofhansl commented on HBASE-5257:
--

Running filters after the column trackers only works for Filters that do 
nothing in filterRowKey and filterRow.

 Allow filter to be evaluated after version handling
 ---

 Key: HBASE-5257
 URL: https://issues.apache.org/jira/browse/HBASE-5257
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl

 There are various use cases and filter types where evaluating the filter 
 before versions are handled either does not make sense or makes filter 
 handling more complicated.
 Also see this comment in ScanQueryMatcher:
 {code}
  /**
   * Filters should be checked before checking column trackers. If we do
   * otherwise, as was previously being done, ColumnTracker may increment its
   * counter for even that KV which may be discarded later on by Filter. This
   * would lead to incorrect results in certain cases.
   */
 {code}
 So we had Filters after the column trackers (which do the version checking), 
 and then moved it.
 Should be at the discretion of the Filter.
 We could either add a new method to FilterBase (maybe excludeVersions() or 
 something), or have a new Filter wrapper (like WhileMatchFilter) that should 
 only be used as the outermost filter and indicates the same (maybe 
 ExcludeVersionsFilter).
 See latest comments on HBASE-5229 for motivation.





[jira] [Updated] (HBASE-5189) Add metrics to keep track of region-splits in RS

2012-01-23 Thread Mubarak Seyed (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubarak Seyed updated HBASE-5189:
-

Attachment: HBASE-5189.trunk.v2.patch

If we move getMetrics().incrementSplitFailureCount() before rollback(), and 
rollback() returns false or throws a RuntimeException, then we don't need to 
increment the split failure count, as the RS is going to abort itself.

The one place which needs to call getMetrics().incrementSplitFailureCount() is 
the catch block
{code}
} catch (IOException ex) {
  LOG.error("Split failed " + this, RemoteExceptionHandler
      .checkIOException(ex));
  this.server.getMetrics().incrementSplitFailureCount();
  server.checkFileSystem();
{code}

as rollback() throws IOException.

The attached patch (HBASE-5189.trunk.v2.patch) updates the patch.
Thanks.

 Add metrics to keep track of region-splits in RS
 

 Key: HBASE-5189
 URL: https://issues.apache.org/jira/browse/HBASE-5189
 Project: HBase
  Issue Type: Improvement
  Components: metrics, regionserver
Affects Versions: 0.90.5, 0.92.0
Reporter: Mubarak Seyed
Assignee: Mubarak Seyed
Priority: Minor
  Labels: noob
 Attachments: HBASE-5189.trunk.v1.patch, HBASE-5189.trunk.v2.patch


 For write-heavy workload with region-size 1 GB, region-split is considerably 
 high. We normally grep the NN log (grep "mkdir*.split" NN.log | sort | 
 uniq -c) to get the count.
 I would like to have a counter incremented each time region-split execution 
 succeeds and this counter exposed via the metrics stuff in HBase.
 - regionSplitSuccessCount
 - regionSplitFailureCount (will help us to correlate the timestamp range in 
 RS logs across all RS)





[jira] [Commented] (HBASE-4131) Make the Replication Service pluggable via a standard interface definition

2012-01-23 Thread Jeff Whiting (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191408#comment-13191408
 ] 

Jeff Whiting commented on HBASE-4131:
-

This work is great.  However we need this in 0.92 (and maybe 0.90).  I'm 
thinking it shouldn't be too big of a deal to backport this as it doesn't 
change any replication functionality but just makes it pluggable. 

I'll do the footwork of making the patches for the older versions and creating 
a new jira for the backport. Do you think it is feasible to get this 
backported?

 Make the Replication Service pluggable via a standard interface definition
 --

 Key: HBASE-4131
 URL: https://issues.apache.org/jira/browse/HBASE-4131
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.94.0

 Attachments: 4131-backedout.txt, replicationInterface1.txt, 
 replicationInterface2.txt, replicationInterface3.txt, 
 replicationInterface4.txt


 The current HBase code supports a replication service that can be used to 
 sync data from from one hbase cluster to another. It would be nice to make it 
 a pluggable interface so that other cross-data-center replication services 
 can be used in conjuction with HBase.





[jira] [Commented] (HBASE-5189) Add metrics to keep track of region-splits in RS

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191421#comment-13191421
 ] 

Zhihong Yu commented on HBASE-5189:
---

Patch v2 makes sense.

 Add metrics to keep track of region-splits in RS
 

 Key: HBASE-5189
 URL: https://issues.apache.org/jira/browse/HBASE-5189
 Project: HBase
  Issue Type: Improvement
  Components: metrics, regionserver
Affects Versions: 0.90.5, 0.92.0
Reporter: Mubarak Seyed
Assignee: Mubarak Seyed
Priority: Minor
  Labels: noob
 Attachments: HBASE-5189.trunk.v1.patch, HBASE-5189.trunk.v2.patch


 For write-heavy workload with region-size 1 GB, region-split is considerably 
 high. We normally grep the NN log (grep "mkdir*.split" NN.log | sort | 
 uniq -c) to get the count.
 I would like to have a counter incremented each time region-split execution 
 succeeds and this counter exposed via the metrics stuff in HBase.
 - regionSplitSuccessCount
 - regionSplitFailureCount (will help us to correlate the timestamp range in 
 RS logs across all RS)





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191432#comment-13191432
 ] 

Zhihong Yu commented on HBASE-5231:
---

hbase.master.loadbalance.bytable controls whether per-table assignment is 
used.
If per-table assignment is off, the original getAssignments() would be called.

Both getAssignmentsByTable() and getAssignments() are package private.
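The switch between the two balancing modes can be sketched roughly as follows. All names here are hypothetical stand-ins: the real code balances Map&lt;ServerName, List&lt;HRegionInfo&gt;&gt; maps and produces RegionPlan objects.

```java
// Rough sketch of the per-table vs. whole-cluster balancing switch described
// above. Types and names are simplified stand-ins, not the HMaster API.
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

public class BalanceSwitchSketch {
  // Stand-in for LoadBalancer.balanceCluster(assignments): emits one dummy
  // "plan" per server in the assignment map.
  static List<String> balanceCluster(Map<String, List<String>> assignments) {
    List<String> plans = new ArrayList<>();
    for (String server : assignments.keySet()) {
      plans.add("rebalance-" + server);
    }
    return plans;
  }

  // With hbase.master.loadbalance.bytable on, each table's assignment map is
  // balanced separately and the partial plans concatenated; otherwise the
  // single whole-cluster map is balanced, as before.
  static List<String> balance(boolean byTable,
      Collection<Map<String, List<String>>> assignmentsByTable,
      Map<String, List<String>> wholeCluster) {
    List<String> plans = new ArrayList<>();
    if (byTable) {
      for (Map<String, List<String>> assignments : assignmentsByTable) {
        List<String> partial = balanceCluster(assignments);
        if (partial != null) plans.addAll(partial);
      }
    } else {
      plans.addAll(balanceCluster(wholeCluster));
    }
    return plans;
  }
}
```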

 Backport HBASE-3373 (per-table load balancing) to 0.92
 --

 Key: HBASE-5231
 URL: https://issues.apache.org/jira/browse/HBASE-5231
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu
 Fix For: 0.92.1

 Attachments: 5231-v2.txt, 5231.txt


 This JIRA backports per-table load balancing to 0.90





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Eugene Koontz (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191443#comment-13191443
 ] 

Eugene Koontz commented on HBASE-5258:
--

Hi Ted,
Do you have an estimate of how much network traffic or heap footprint this 
would save?
Just curious, not an objection.
-Eugene

 Move coprocessors set out of RegionLoad
 ---

 Key: HBASE-5258
 URL: https://issues.apache.org/jira/browse/HBASE-5258
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu

 When I worked on HBASE-5256, I revisited the code related to Ser/De of 
 coprocessors set in RegionLoad.
 I think the rationale for embedding coprocessors set is for maximum 
 flexibility where each region can load different coprocessors.
 This flexibility is causing extra cost in the region server to Master 
 communication and increasing the footprint of Master heap.
 Would HServerLoad be a better place for this set ?





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191461#comment-13191461
 ] 

Zhihong Yu commented on HBASE-5258:
---

Since each coprocessor is represented by a string, the potential savings can be 
considerable, especially if many regions are hosted on each region server.


 Move coprocessors set out of RegionLoad
 ---

 Key: HBASE-5258
 URL: https://issues.apache.org/jira/browse/HBASE-5258
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu

 When I worked on HBASE-5256, I revisited the code related to Ser/De of 
 coprocessors set in RegionLoad.
 I think the rationale for embedding coprocessors set is for maximum 
 flexibility where each region can load different coprocessors.
 This flexibility is causing extra cost in the region server to Master 
 communication and increasing the footprint of Master heap.
 Would HServerLoad be a better place for this set ?





[jira] [Updated] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5243:
--

Attachment: 5243-92.addendum

The addendum fixes broken 0.92 build

 LogSyncerThread not getting shutdown waiting for the interrupted flag
 -

 Key: HBASE-5243
 URL: https://issues.apache.org/jira/browse/HBASE-5243
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: 5243-92.addendum, HBASE-5243_0.90.patch, 
 HBASE-5243_0.90_1.patch, HBASE-5243_trunk.patch


 In the LogSyncer run() we keep looping till this.isInterrupted flag is set.
 But in some cases the DFSclient is consuming the Interrupted exception.  So
 we are running into infinite loop in some shutdown cases.
 I would suggest that as we are the ones who tries to close down the
 LogSyncerThread we can introduce a variable like
 Close or shutdown and based on the state of this flag along with
 isInterrupted() we can make the thread stop.
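The suggested fix can be sketched as follows (a hypothetical class, not the actual HLog LogSyncer code): pair a volatile close flag with isInterrupted(), so the loop terminates even when a lower layer such as DFSClient consumes the interrupt.

```java
// Sketch of the shutdown-flag pattern proposed above. LogSyncerSketch and
// requestClose() are hypothetical names.
public class LogSyncerSketch extends Thread {
  private volatile boolean closing = false; // set by the shutdown path

  public void requestClose() {
    closing = true;
    this.interrupt(); // still interrupt any blocked call
  }

  @Override
  public void run() {
    // Exits when either signal is seen, even if the interrupt status was
    // consumed (e.g. by DFSClient) before this thread observed it.
    while (!closing && !isInterrupted()) {
      try {
        Thread.sleep(100); // stand-in for the periodic sync work
      } catch (InterruptedException e) {
        // Interrupt was consumed here; 'closing' still stops the loop.
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    LogSyncerSketch t = new LogSyncerSketch();
    t.start();
    t.requestClose();
    t.join();
    System.out.println("syncer stopped cleanly");
  }
}
```

Relying on the flag rather than the interrupt status alone is what makes shutdown robust against swallowed InterruptedExceptions.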





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191488#comment-13191488
 ] 

Phabricator commented on HBASE-5230:


nspiegelberg has commented on the revision [jira] [HBASE-5230] Extend 
TestCacheOnWrite to ensure we don't cache data blocks on compaction.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:764-766 
currently, there is no intelligence to estimate the resulting compacted 
filesize and cache compactions up to a max size, correct?
  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java:879-880
 use this static function to write a toString method?
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4502 
this comment should be changed

  // read the row, this should be a cache miss because we don't cache on 
compaction
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java:254 add

 // TODO: need to change this test if we add a cache size threshold for 
compactions
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java:295 
Assert.assertNull() is nice for clarity

REVISION DETAIL
  https://reviews.facebook.net/D1353


 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally enabled). This 
 is because we have very different implementations of HBASE-3976: without 
 HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with 
 CacheConfig (presumably it's there, but we're not sure it even works, since 
 the patch in HBASE-3976 may not have been committed). We need to create a 
 unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191486#comment-13191486
 ] 

Zhihong Yu commented on HBASE-5243:
---

Applied addendum to 0.92 branch

 LogSyncerThread not getting shutdown waiting for the interrupted flag
 -

 Key: HBASE-5243
 URL: https://issues.apache.org/jira/browse/HBASE-5243
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: 5243-92.addendum, HBASE-5243_0.90.patch, 
 HBASE-5243_0.90_1.patch, HBASE-5243_trunk.patch


 In the LogSyncer run() we keep looping until the this.isInterrupted flag is 
 set. But in some cases the DFSClient swallows the InterruptedException, so we 
 can run into an infinite loop in some shutdown cases.
 Since we are the ones who close down the LogSyncerThread, I suggest 
 introducing a flag such as close or shutdown; based on the state of this 
 flag, together with isInterrupted(), we can make the thread stop.





[jira] [Commented] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191494#comment-13191494
 ] 

Hadoop QA commented on HBASE-5243:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511556/5243-92.addendum
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause mvn compile goal to fail.

-1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/842//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/842//console

This message is automatically generated.

 LogSyncerThread not getting shutdown waiting for the interrupted flag
 -

 Key: HBASE-5243
 URL: https://issues.apache.org/jira/browse/HBASE-5243
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: 5243-92.addendum, HBASE-5243_0.90.patch, 
 HBASE-5243_0.90_1.patch, HBASE-5243_trunk.patch


 In the LogSyncer run() we keep looping until the this.isInterrupted flag is 
 set. But in some cases the DFSClient swallows the InterruptedException, so we 
 can run into an infinite loop in some shutdown cases.
 Since we are the ones who close down the LogSyncerThread, I suggest 
 introducing a flag such as close or shutdown; based on the state of this 
 flag, together with isInterrupted(), we can make the thread stop.





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191497#comment-13191497
 ] 

Zhihong Yu commented on HBASE-5230:
---

This patch should be applied to 0.92, right?
A patch for 0.92 would be desirable.

 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally enabled). This 
 is because we have very different implementations of HBASE-3976: without 
 HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with 
 CacheConfig (presumably it's there, but we're not sure it even works, since 
 the patch in HBASE-3976 may not have been committed). We need to create a 
 unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5255) Use singletons for OperationStatus to save memory

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191522#comment-13191522
 ] 

Zhihong Yu commented on HBASE-5255:
---

Integrated to 0.92 and TRUNK.

 Use singletons for OperationStatus to save memory
 -

 Key: HBASE-5255
 URL: https://issues.apache.org/jira/browse/HBASE-5255
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.90.5, 0.92.0
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
  Labels: performance
 Fix For: 0.94.0, 0.92.1

 Attachments: 5255-92.txt, 5255-v2.txt, 
 HBASE-5255-0.92-Use-singletons-to-remove-unnecessary-memory-allocati.patch, 
 HBASE-5255-trunk-Use-singletons-to-remove-unnecessary-memory-allocati.patch


 Every single {{Put}} causes the allocation of at least one 
 {{OperationStatus}}, yet {{OperationStatus}} is almost always stateless, so 
 these allocations are unnecessary and could be avoided.  Attached patch adds 
 a few singletons and uses them, with no public API change.  I didn't test the 
 patches, but you get the idea.
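The singleton idea is plain object reuse. A hedged sketch (the class below is a stand-in, not HBase's actual OperationStatus): the common stateless outcomes are pre-allocated constants, and only statuses that carry an exception message need a fresh allocation.

```java
public class OperationStatusSketch {

  enum Code { SUCCESS, FAILURE, NOT_RUN }

  static final class OperationStatus {
    // Shared stateless instances: no per-Put allocation for the common case.
    static final OperationStatus SUCCESS = new OperationStatus(Code.SUCCESS, "");
    static final OperationStatus FAILURE = new OperationStatus(Code.FAILURE, "");
    static final OperationStatus NOT_RUN = new OperationStatus(Code.NOT_RUN, "");

    private final Code code;
    private final String exceptionMsg;

    private OperationStatus(Code code, String exceptionMsg) {
      this.code = code;
      this.exceptionMsg = exceptionMsg;
    }

    // Only a failure carrying a specific message needs its own instance.
    static OperationStatus failure(String msg) {
      return new OperationStatus(Code.FAILURE, msg);
    }

    Code getCode() { return code; }
  }

  /** What a batch-mutate loop would hand back per operation. */
  static OperationStatus statusFor(boolean succeeded) {
    return succeeded ? OperationStatus.SUCCESS : OperationStatus.FAILURE;
  }

  public static void main(String[] args) {
    // Identity check: two successful Puts share one status object.
    System.out.println(statusFor(true) == statusFor(true)); // prints: true
  }
}
```

Since the singletons are immutable and package-internal, reusing them changes no public API, which matches the patch's claim.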





[jira] [Assigned] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup

2012-01-23 Thread David S. Wang (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David S. Wang reassigned HBASE-5209:


Assignee: David S. Wang

 HConnection/HMasterInterface should allow for way to get hostname of 
 currently active master in multi-master HBase setup
 

 Key: HBASE-5209
 URL: https://issues.apache.org/jira/browse/HBASE-5209
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.94.0, 0.90.5, 0.92.0
Reporter: Aditya Acharya
Assignee: David S. Wang

 I have a multi-master HBase set up, and I'm trying to programmatically 
 determine which of the masters is currently active. But the API does not 
 allow me to do this. There is a getMaster() method in the HConnection class, 
 but it returns an HMasterInterface, whose methods do not allow me to find out 
 which master won the last race. The API should have a 
 getActiveMasterHostname() or something to that effect.





[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191564#comment-13191564
 ] 

Zhihong Yu commented on HBASE-4720:
---

Patch v6 looks good.
Will integrate if Andrew doesn't have further comment.

 Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
 client/server 
 

 Key: HBASE-4720
 URL: https://issues.apache.org/jira/browse/HBASE-4720
 Project: HBase
  Issue Type: Improvement
Reporter: Daniel Lord
Assignee: Mubarak Seyed
 Fix For: 0.94.0

 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
 HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
 HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
 HBASE-4720.v3.patch


 I have several large application/HBase clusters where an application node 
 will occasionally need to talk to HBase from a different cluster.  In order 
 to help ensure some of my consistency guarantees, I have a sentinel table 
 that is updated atomically as users interact with the system.  This works 
 quite well with the regular HBase client, but the REST client does not 
 implement the checkAndPut and checkAndDelete operations.  This exposes the 
 application to some race conditions that have to be worked around.  It would 
 be ideal if the same checkAndPut/checkAndDelete operations could be supported 
 by the REST client.





[jira] [Created] (HBASE-5263) Preserving cached data on compactions through cache-on-write

2012-01-23 Thread Mikhail Bautin (Created) (JIRA)
Preserving cached data on compactions through cache-on-write


 Key: HBASE-5263
 URL: https://issues.apache.org/jira/browse/HBASE-5263
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor


We are tackling HBASE-3976 and HBASE-5230 to make sure we don't trash the block 
cache on compactions if cache-on-write is enabled. However, it would be ideal 
to reduce the effect compactions have on the cached data. For every block we 
are writing for a compacted file, we can decide whether it needs to be cached 
based on whether the original blocks containing the same data were already in 
cache. More precisely, for every HFile reader in a compaction we can maintain a 
boolean flag saying whether the current key-value came from a disk IO or the 
block cache. In the HFile writer for the compaction's output we can maintain a 
flag that is set if any of the key-values in the block being written came from 
a cached block, use that flag at the end of a block to decide whether to 
cache-on-write the block, and reset the flag to false on each block boundary. 
If such an inclusive approach would still trash the cache, we could restrict 
the total number of blocks to be cached per output HFile, switch to an "and" 
logic instead of "or" logic for deciding whether to cache an output file block, 
or only cache a certain percentage of output-file blocks that contain some of 
the previously cached data. 

Thanks to Nicolas for this elegant online algorithm idea!
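The online algorithm described above can be sketched abstractly. This is an illustrative model only (no HBase types; the method name is hypothetical): each key-value carries a came-from-cache bit, the writer ORs the bits within each output block, and the accumulated flag decides cache-on-write at the block boundary, where it is reset.

```java
public class CompactionCacheFlagSketch {

  /**
   * For a stream of key-values (true = the KV's source block was cached),
   * decide per output block of size blockSize whether to cache-on-write it:
   * cache the block if ANY contributing KV came from the block cache.
   */
  static boolean[] decideBlockCaching(boolean[] kvFromCache, int blockSize) {
    int nBlocks = (kvFromCache.length + blockSize - 1) / blockSize;
    boolean[] cacheBlock = new boolean[nBlocks];
    boolean anyCached = false; // reset to false on every block boundary
    for (int i = 0; i < kvFromCache.length; i++) {
      anyCached |= kvFromCache[i]; // the "or" logic over KVs in this block
      boolean blockEnds = (i + 1) % blockSize == 0 || i == kvFromCache.length - 1;
      if (blockEnds) {
        cacheBlock[i / blockSize] = anyCached;
        anyCached = false;
      }
    }
    return cacheBlock;
  }

  public static void main(String[] args) {
    // 6 KVs, block size 3: block 0 contains one cached KV, block 1 none.
    boolean[] d = decideBlockCaching(
        new boolean[] {false, true, false, false, false, false}, 3);
    System.out.println(d[0] + " " + d[1]); // prints: true false
  }
}
```

Switching to the "and" logic mentioned in the description would simply replace the OR accumulation with an AND over the block's key-values.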






[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191589#comment-13191589
 ] 

Andrew Purtell commented on HBASE-5258:
---

bq. This flexibility is causing extra cost in the region server to Master 
communication and increasing the footprint of Master heap.

No doubt it is redundant to have each region report a coprocessor given how the 
framework currently works: All regions for a table will have an identical set 
of coprocessors loaded, or there is something bad happening.

bq. Would HServerLoad be a better place for this set ?

I have no major objection.

However, maybe we want a way to know if something bad happened on a region and 
a coprocessor on it went away? One could comb logs but that is hardly a 
convenient way to get online state.



 Move coprocessors set out of RegionLoad
 ---

 Key: HBASE-5258
 URL: https://issues.apache.org/jira/browse/HBASE-5258
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu

 When I worked on HBASE-5256, I revisited the code related to Ser/De of the 
 coprocessors set in RegionLoad.
 I think the rationale for embedding the coprocessors set is maximum 
 flexibility, where each region can load different coprocessors.
 This flexibility adds cost to region server to Master communication and 
 increases the footprint of the Master heap.
 Would HServerLoad be a better place for this set?





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191594#comment-13191594
 ] 

Zhihong Yu commented on HBASE-5258:
---

To my knowledge, for a mis-behaving coprocessor we either remove the buggy 
coprocessor or abort.
I wonder what scenario would lead to imbalanced coprocessors on a region.

 Move coprocessors set out of RegionLoad
 ---

 Key: HBASE-5258
 URL: https://issues.apache.org/jira/browse/HBASE-5258
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu

 When I worked on HBASE-5256, I revisited the code related to Ser/De of the 
 coprocessors set in RegionLoad.
 I think the rationale for embedding the coprocessors set is maximum 
 flexibility, where each region can load different coprocessors.
 This flexibility adds cost to region server to Master communication and 
 increases the footprint of the Master heap.
 Would HServerLoad be a better place for this set?





[jira] [Updated] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5230:
---

Attachment: D1353.4.patch

mbautin updated the revision [jira] [HBASE-5230] Extend TestCacheOnWrite to 
ensure we don't cache data blocks on compaction.
Reviewers: nspiegelberg, tedyu, Liyin, stack, JIRA

  Addressing Nicolas's comments. Re-running all unit tests.

REVISION DETAIL
  https://reviews.facebook.net/D1353

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java


 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 D1353.4.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally enabled). This 
 is because we have very different implementations of HBASE-3976: without 
 HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with 
 CacheConfig (presumably it's there, but we're not sure it even works, since 
 the patch in HBASE-3976 may not have been committed). We need to create a 
 unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191606#comment-13191606
 ] 

Phabricator commented on HBASE-5230:


mbautin has commented on the revision [jira] [HBASE-5230] Extend 
TestCacheOnWrite to ensure we don't cache data blocks on compaction.

  Responses to comments inline.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java:746 Done.
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:756 Done.
  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java:880
 Replaced this with a method that prints out the metrics into a StringBuilder 
and returns a string.
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java:253 
Since this is for data blocks, I renamed this to 
testNotCachingDataBlocksDuringCompaction.
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:764-766 That is 
correct, to my best knowledge. Eventually, I think we would like to 
intelligently decide whether to cache-on-write a block based on whether the 
data in question is already in the block cache as part of uncompacted files: 
https://issues.apache.org/jira/browse/HBASE-5263

  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java:879-880
 In my debugging I have not yet come across a case when I would have found a 
SchemaMetrics.toString method useful. Also, adapting this static method to 
implement toString would be tricky, since it relies on getMetricsSnapshot() 
that takes a snapshot of _all_ metrics, not just those for a particular 
table/CF combination corresponding to one SchemaMetrics instance. Therefore, I 
would prefer to leave SchemaMetrics.toString() out for now.
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java:4502 
Done.
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java:254 
Added.
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java:295 Done.

REVISION DETAIL
  https://reviews.facebook.net/D1353


 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 D1353.4.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally enabled). This 
 is because we have very different implementations of HBASE-3976: without 
 HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with 
 CacheConfig (presumably it's there, but we're not sure it even works, since 
 the patch in HBASE-3976 may not have been committed). We need to create a 
 unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Created] (HBASE-5264) Add 0.92.0 upgrade guide

2012-01-23 Thread stack (Created) (JIRA)
Add 0.92.0 upgrade guide


 Key: HBASE-5264
 URL: https://issues.apache.org/jira/browse/HBASE-5264
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: 5264.txt

Add an upgrade guide for going from 0.90 to 0.92.





[jira] [Updated] (HBASE-5130) A map-reduce wrapper for HBase test suite (mr-test-runner)

2012-01-23 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5130:
--

Description: We have a tool we call mrunit (but will call 
mr-test-runner in the open-source version) that runs HBase unit tests on a 
map-reduce cluster. We need to modify it to use the distributed cache to 
deploy the code on the cluster instead of our internal deployment tool, and 
open-source it.  (was: We have a tool we call mrunit that runs HBase unit 
tests on a map-reduce cluster. We need to modify it to use the distributed 
cache to deploy the code on the cluster instead of our internal deployment 
tool, and open-source it.)
Summary: A map-reduce wrapper for HBase test suite (mr-test-runner)  
(was: A map-reduce wrapper for HBase test suite (mrunit))

 A map-reduce wrapper for HBase test suite (mr-test-runner)
 

 Key: HBASE-5130
 URL: https://issues.apache.org/jira/browse/HBASE-5130
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin

 We have a tool we call mrunit (but will call mr-test-runner in the 
 open-source version) that runs HBase unit tests on a map-reduce cluster. We 
 need to modify it to use the distributed cache to deploy the code on the 
 cluster instead of our internal deployment tool, and open-source it.





[jira] [Resolved] (HBASE-5264) Add 0.92.0 upgrade guide

2012-01-23 Thread stack (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-5264.
--

   Resolution: Fixed
Fix Version/s: 0.94.0

Committed TRUNK

 Add 0.92.0 upgrade guide
 

 Key: HBASE-5264
 URL: https://issues.apache.org/jira/browse/HBASE-5264
 Project: HBase
  Issue Type: Task
Reporter: stack
 Fix For: 0.94.0

 Attachments: 5264.txt


 Add an upgrade guide for going from 0.90 to 0.92.





[jira] [Updated] (HBASE-5264) Add 0.92.0 upgrade guide

2012-01-23 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5264:
-

Attachment: 5264.txt

A patch J-D and I hacked up.

 Add 0.92.0 upgrade guide
 

 Key: HBASE-5264
 URL: https://issues.apache.org/jira/browse/HBASE-5264
 Project: HBase
  Issue Type: Task
Reporter: stack
 Fix For: 0.94.0

 Attachments: 5264.txt


 Add an upgrade guide for going from 0.90 to 0.92.





[jira] [Updated] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5230:
--

Attachment: Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch

A new patch addressing Nicolas's comments.

 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 D1353.4.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally enabled). This 
 is because we have very different implementations of HBASE-3976: without 
 HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with 
 CacheConfig (presumably it's there, but we're not sure it even works, since 
 the patch in HBASE-3976 may not have been committed). We need to create a 
 unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5179:
--

Attachment: 5179-90v18.txt

Patch v18 addresses Stack's comments.

The sleep() isn't for unit test. I lowered wait interval to 500ms.

I created waitUntilNoLogDir(HServerAddress serverAddress) so that -ROOT- and 
.META. servers can reuse the logic.

Renamed logDirExists() to getLogDirIfExists()

 Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
 region to be assigned before log splitting is completed, causing data loss
 

 Key: HBASE-5179
 URL: https://issues.apache.org/jira/browse/HBASE-5179
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.94.0, 0.90.6, 0.92.1

 Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
 Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
 hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
 hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch


 If the master's failover processing and ServerShutdownHandler's processing
 happen concurrently, the following case may occur:
 1. The master completes splitLogAfterStartup().
 2. RegionServerA restarts, and ServerShutdownHandler starts processing it.
 3. The master starts rebuildUserRegions, and RegionServerA is considered a
 dead server.
 4. The master starts to assign RegionServerA's regions because step 3 marked
 it as a dead server.
 However, while step 4 (assigning regions) is running, ServerShutdownHandler
 may still be splitting the log; therefore, data loss is possible.





[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191630#comment-13191630
 ] 

Hadoop QA commented on HBASE-5179:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511589/5179-90v18.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/844//console

This message is automatically generated.

 Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
 region to be assigned before log splitting is completed, causing data loss
 

 Key: HBASE-5179
 URL: https://issues.apache.org/jira/browse/HBASE-5179
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.94.0, 0.90.6, 0.92.1

 Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
 Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
 hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
 hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch


 If the master's failover processing and ServerShutdownHandler's processing
 happen concurrently, the following case may occur:
 1. The master completes splitLogAfterStartup().
 2. RegionServerA restarts, and ServerShutdownHandler starts processing it.
 3. The master starts rebuildUserRegions, and RegionServerA is considered a
 dead server.
 4. The master starts to assign RegionServerA's regions because step 3 marked
 it as a dead server.
 However, while step 4 (assigning regions) is running, ServerShutdownHandler
 may still be splitting the log; therefore, data loss is possible.





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191639#comment-13191639
 ] 

Andrew Purtell commented on HBASE-5258:
---

bq. for a mis-behaving coprocessor we either remove the buggy coprocessor

... from the coprocessor host for the given region (in the case of 
RegionCoprocessorHost) only...


 Move coprocessors set out of RegionLoad
 ---

 Key: HBASE-5258
 URL: https://issues.apache.org/jira/browse/HBASE-5258
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu

 When I worked on HBASE-5256, I revisited the code related to Ser/De of 
 coprocessors set in RegionLoad.
 I think the rationale for embedding coprocessors set is for maximum 
 flexibility where each region can load different coprocessors.
 This flexibility is causing extra cost in the region server to Master 
 communication and increasing the footprint of Master heap.
 Would HServerLoad be a better place for this set ?





[jira] [Commented] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-23 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191638#comment-13191638
 ] 

Hudson commented on HBASE-5243:
---

Integrated in HBase-0.92 #258 (See 
[https://builds.apache.org/job/HBase-0.92/258/])
HBASE-5243 Addendum moves the close() method to right place

tedyu : 
Files : 
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java


 LogSyncerThread not getting shutdown waiting for the interrupted flag
 -

 Key: HBASE-5243
 URL: https://issues.apache.org/jira/browse/HBASE-5243
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: 5243-92.addendum, HBASE-5243_0.90.patch, 
 HBASE-5243_0.90_1.patch, HBASE-5243_trunk.patch


 In LogSyncer's run() we keep looping until the this.isInterrupted flag is set.
 But in some cases the DFSClient consumes the InterruptedException, so
 we run into an infinite loop in some shutdown cases.
 I would suggest that, since we are the ones who try to close down the
 LogSyncerThread, we can introduce a variable like
 close or shutdown and, based on the state of this flag along with
 isInterrupted(), make the thread stop.
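The proposed fix can be sketched as below. This is an illustrative, assumed shape (the names closeRequested and close() are hypothetical, not the actual HLog.LogSyncer code): a volatile shutdown flag is checked alongside the interrupt status, so the loop still exits even if a lower layer such as the DFSClient swallows the InterruptedException.

```java
// Hedged sketch of the proposed fix: an explicit volatile close flag is
// checked in addition to the interrupt status, guaranteeing shutdown even
// when an interrupt is consumed elsewhere. Names are illustrative.
class LogSyncer extends Thread {
  private volatile boolean closeRequested = false;

  @Override
  public void run() {
    // Exit when either the interrupt flag or the explicit close flag is set.
    while (!isInterrupted() && !closeRequested) {
      try {
        Thread.sleep(10); // stand-in for the periodic sync work
      } catch (InterruptedException e) {
        // A swallowed interrupt alone would loop forever; closeRequested
        // ends the loop regardless. Restore the interrupt status here.
        interrupt();
      }
    }
  }

  void close() {
    closeRequested = true;
    interrupt();
  }
}
```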





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191651#comment-13191651
 ] 

Zhihong Yu commented on HBASE-5258:
---

Since the combination of coprocessors on a region server is limited, I was 
suggesting that the report of uneven coprocessor presence be embedded in 
HServerLoad.

 Move coprocessors set out of RegionLoad
 ---

 Key: HBASE-5258
 URL: https://issues.apache.org/jira/browse/HBASE-5258
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu

 When I worked on HBASE-5256, I revisited the code related to Ser/De of 
 coprocessors set in RegionLoad.
 I think the rationale for embedding coprocessors set is for maximum 
 flexibility where each region can load different coprocessors.
 This flexibility is causing extra cost in the region server to Master 
 communication and increasing the footprint of Master heap.
 Would HServerLoad be a better place for this set ?





[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191646#comment-13191646
 ] 

Zhihong Yu commented on HBASE-5258:
---

I agree with the last comment @ 23/Jan/12 22:58.
My understanding of the feature is that the user should validate coprocessors by
choosing the Abort policy for buggy coprocessors in the pre-deployment stage. In
production, the chance of a buggy coprocessor being dropped from individual
region(s) should be low.

The ability to query imbalanced coprocessors on a region server should be an
on-demand feature.

 Move coprocessors set out of RegionLoad
 ---

 Key: HBASE-5258
 URL: https://issues.apache.org/jira/browse/HBASE-5258
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu

 When I worked on HBASE-5256, I revisited the code related to Ser/De of 
 coprocessors set in RegionLoad.
 I think the rationale for embedding coprocessors set is for maximum 
 flexibility where each region can load different coprocessors.
 This flexibility is causing extra cost in the region server to Master 
 communication and increasing the footprint of Master heap.
 Would HServerLoad be a better place for this set ?





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-23 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191657#comment-13191657
 ] 

Hadoop QA commented on HBASE-5230:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511585/Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 84 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/843//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/843//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/843//console

This message is automatically generated.

 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 D1353.4.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally
 enabled). This is because we have very different implementations of 
 HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
 and with CacheConfig (presumably it's there but not sure if it even works, 
 since the patch in HBASE-3976 may not have been committed). We need to create 
 a unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Created] (HBASE-5265) Fix 'revoke' shell command

2012-01-23 Thread Andrew Purtell (Created) (JIRA)
Fix 'revoke' shell command
--

 Key: HBASE-5265
 URL: https://issues.apache.org/jira/browse/HBASE-5265
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.94.0, 0.92.1


The 'revoke' shell command needs to be reworked for the AccessControlProtocol 
implementation that was finalized for 0.92. The permissions being removed must 
exactly match what was previously granted. No wildcard matching is done server 
side.

Allow two forms of the command in the shell for convenience:

Revocation of a specific grant:
{code}
revoke user, table, column family [ , column_qualifier ]
{code}

Have the shell automatically do so for all permissions on a table for a given 
user:
{code}
revoke user, table
{code}






[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191678#comment-13191678
 ] 

Zhihong Yu commented on HBASE-5179:
---

What do we do with 5179-92v17.patch?
The test harness in 0.92 may not be ready for the (future) trunk patch to be
applied.

 Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
 region to be assigned before log splitting is completed, causing data loss
 

 Key: HBASE-5179
 URL: https://issues.apache.org/jira/browse/HBASE-5179
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.94.0, 0.90.6, 0.92.1

 Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
 Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
 hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
 hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch


 If the master's failover processing and ServerShutdownHandler's processing
 happen concurrently, the following case may occur:
 1. The master completes splitLogAfterStartup().
 2. RegionServerA restarts, and ServerShutdownHandler starts processing it.
 3. The master starts rebuildUserRegions, and RegionServerA is considered a
 dead server.
 4. The master starts to assign RegionServerA's regions because step 3 marked
 it as a dead server.
 However, while step 4 (assigning regions) is running, ServerShutdownHandler
 may still be splitting the log; therefore, data loss is possible.





[jira] [Updated] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-23 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5179:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511589/5179-90v18.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/844//console

This message is automatically generated.)

 Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
 region to be assigned before log splitting is completed, causing data loss
 

 Key: HBASE-5179
 URL: https://issues.apache.org/jira/browse/HBASE-5179
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.94.0, 0.90.6, 0.92.1

 Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
 Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
 hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
 hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch


 If the master's failover processing and ServerShutdownHandler's processing
 happen concurrently, the following case may occur:
 1. The master completes splitLogAfterStartup().
 2. RegionServerA restarts, and ServerShutdownHandler starts processing it.
 3. The master starts rebuildUserRegions, and RegionServerA is considered a
 dead server.
 4. The master starts to assign RegionServerA's regions because step 3 marked
 it as a dead server.
 However, while step 4 (assigning regions) is running, ServerShutdownHandler
 may still be splitting the log; therefore, data loss is possible.





[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-23 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191682#comment-13191682
 ] 

Lars Hofhansl commented on HBASE-5262:
--

You thinking JSON or something?

 Structured event log for HBase for monitoring and auto-tuning performance
 -

 Key: HBASE-5262
 URL: https://issues.apache.org/jira/browse/HBASE-5262
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin

 Creating this JIRA to open a discussion about a structured (machine-readable) 
 log that will record events such as compaction start/end times, compaction 
 input/output files, their sizes, the same for flushes, etc. This can be 
 stored e.g. in a new system table in HBase itself. The data from this log can 
 then be analyzed and used to optimize compactions at run time, or otherwise 
 auto-tune HBase configuration to reduce the number of knobs the user has to 
 configure.





[jira] [Commented] (HBASE-5256) Use WritableUtils.readVInt() in RegionLoad.readFields()

2012-01-23 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191683#comment-13191683
 ] 

Zhihong Yu commented on HBASE-5256:
---

Since the version of RegionLoad would be bumped, I think this change should be 
applied to all integer/long metrics.

 Use WritableUtils.readVInt() in RegionLoad.readFields()
 ---

 Key: HBASE-5256
 URL: https://issues.apache.org/jira/browse/HBASE-5256
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu
 Fix For: 0.94.0


 Currently in.readInt() is used in RegionLoad.readFields().
 More metrics will be added to RegionLoad in the future, so we should use
 WritableUtils.readVInt() to reduce the amount of data exchanged between the
 Master and region servers.
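To illustrate why a variable-length encoding helps, here is a generic base-128 varint sketch. Note this is not Hadoop's WritableUtils.writeVInt wire format (which is its own encoding); it only demonstrates the underlying idea that small values serialize to one byte instead of a fixed four.

```java
import java.io.ByteArrayOutputStream;

// Generic little-endian base-128 varint, for illustration only; Hadoop's
// WritableUtils uses a different variable-length format.
final class VarInt {
  static byte[] encode(int value) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Emit 7 bits per byte; the high bit marks "more bytes follow".
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.write(value);
    return out.toByteArray();
  }
}
```

Under this scheme a small counter like 42 takes 1 byte rather than the 4 of in.readInt(); 300 takes 2; only values above 2^28 need all 5.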





[jira] [Commented] (HBASE-5261) Update HBase for Java 7

2012-01-23 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191686#comment-13191686
 ] 

Lars Hofhansl commented on HBASE-5261:
--

I think we need to be careful to maintain compatibility with JDK 6 for a
long time. For many enterprises, switching to JDK 7 is a major effort.


 Update HBase for Java 7
 ---

 Key: HBASE-5261
 URL: https://issues.apache.org/jira/browse/HBASE-5261
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin

 We need to make sure that HBase compiles and works with JDK 7. Once we verify 
 it is reasonably stable, we can explore utilizing the G1 garbage collector. 
 When all deployments are ready to move to JDK 7, we can start using new 
 language features, but in the transition period we will need to maintain a 
 codebase that compiles both with JDK 6 and JDK 7.





[jira] [Created] (HBASE-5266) Add documentation for ColumnRangeFilter

2012-01-23 Thread Lars Hofhansl (Created) (JIRA)
Add documentation for ColumnRangeFilter
---

 Key: HBASE-5266
 URL: https://issues.apache.org/jira/browse/HBASE-5266
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.0


There are only a few lines of documentation for ColumnRangeFilter.
Given the usefulness of this filter for efficient intra-row scanning (see 
HBASE-5229 and HBASE-4256), we should make this filter more prominent in the 
documentation.






[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-23 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191689#comment-13191689
 ] 

Mikhail Bautin commented on HBASE-5262:
---

JSON could be the encoding for the value part of each log entry. However, if 
we decide to store this type of information in HBase itself, we will need to 
think through the schema from the point of view of at least a couple of different 
use cases, e.g. analyzing compaction performance, auto-tuning the compaction 
algorithm, maybe auto-tuning some block cache settings, etc.
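As a purely hypothetical illustration of such a log entry with JSON as the value encoding (one option raised in this thread), a compaction event might look like the following; the field names and schema are invented for the example, not an HBase format.

```java
// Hypothetical machine-readable compaction event, serialized as JSON.
// Schema and field names are illustrative only.
final class CompactionEvent {
  static String toJson(String region, long startMs, long endMs,
                       int inputFiles, long inputBytes, long outputBytes) {
    return String.format(
        "{\"event\":\"compaction\",\"region\":\"%s\",\"startMs\":%d,"
            + "\"endMs\":%d,\"inputFiles\":%d,\"inputBytes\":%d,\"outputBytes\":%d}",
        region, startMs, endMs, inputFiles, inputBytes, outputBytes);
  }
}
```

Keying such rows by (region, event type, timestamp) would support both the compaction-analysis and auto-tuning reads mentioned above, but the schema would need to be designed against those concrete use cases.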

 Structured event log for HBase for monitoring and auto-tuning performance
 -

 Key: HBASE-5262
 URL: https://issues.apache.org/jira/browse/HBASE-5262
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin

 Creating this JIRA to open a discussion about a structured (machine-readable) 
 log that will record events such as compaction start/end times, compaction 
 input/output files, their sizes, the same for flushes, etc. This can be 
 stored e.g. in a new system table in HBase itself. The data from this log can 
 then be analyzed and used to optimize compactions at run time, or otherwise 
 auto-tune HBase configuration to reduce the number of knobs the user has to 
 configure.





[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-23 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191690#comment-13191690
 ] 

Lars Hofhansl commented on HBASE-4720:
--

Should we add another jira for supporting the new RowMutations? (HBASE-3584 and 
HBASE-5203)

 Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
 client/server 
 

 Key: HBASE-4720
 URL: https://issues.apache.org/jira/browse/HBASE-4720
 Project: HBase
  Issue Type: Improvement
Reporter: Daniel Lord
Assignee: Mubarak Seyed
 Fix For: 0.94.0

 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
 HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
 HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
 HBASE-4720.v3.patch


 I have several large application/HBase clusters where an application node 
 will occasionally need to talk to HBase from a different cluster.  In order 
 to help ensure some of my consistency guarantees I have a sentinel table that 
 is updated atomically as users interact with the system.  This works quite 
 well for the regular HBase client, but the REST client does not implement 
 the checkAndPut and checkAndDelete operations.  This exposes the application 
 to some race conditions that have to be worked around.  It would be ideal if 
 the same checkAndPut/checkAndDelete operations could be supported by the REST 
 client.





[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-23 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191692#comment-13191692
 ] 

Todd Lipcon commented on HBASE-5262:


I have some preference against JSON -- IMO JSON isn't strict enough, so 
people tend to break the format over time without a centralized schema 
definition to remind them that this is a real interface. I would be happier to 
see either avro or protobuf used. But definitely +1 for something that's 
consumable by non-Java programs and won't need to be actively upgraded when we 
add or remove fields.

 Structured event log for HBase for monitoring and auto-tuning performance
 -

 Key: HBASE-5262
 URL: https://issues.apache.org/jira/browse/HBASE-5262
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin

 Creating this JIRA to open a discussion about a structured (machine-readable) 
 log that will record events such as compaction start/end times, compaction 
 input/output files, their sizes, the same for flushes, etc. This can be 
 stored e.g. in a new system table in HBase itself. The data from this log can 
 then be analyzed and used to optimize compactions at run time, or otherwise 
 auto-tune HBase configuration to reduce the number of knobs the user has to 
 configure.





[jira] [Commented] (HBASE-5266) Add documentation for ColumnRangeFilter

2012-01-23 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191699#comment-13191699
 ] 

Andrew Purtell commented on HBASE-5266:
---

Maybe you could work up an example of that for the docs, Lars?

 Add documentation for ColumnRangeFilter
 ---

 Key: HBASE-5266
 URL: https://issues.apache.org/jira/browse/HBASE-5266
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.0


 There are only a few lines of documentation for ColumnRangeFilter.
 Given the usefulness of this filter for efficient intra-row scanning (see 
 HBASE-5229 and HBASE-4256), we should make this filter more prominent in the 
 documentation.




