[jira] [Commented] (HBASE-4296) Deprecate HTable[Interface].getRowOrBefore(...)
[ https://issues.apache.org/jira/browse/HBASE-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214435#comment-13214435 ]

dhruba borthakur commented on HBASE-4296:
-----------------------------------------

The ThriftServer uses HTable.getRowOrBefore() to find an entry in the .META. table. This used to work with hbase-0.92 but returns null for hbase-0.94. Did something change here?

Deprecate HTable[Interface].getRowOrBefore(...)
-----------------------------------------------
        Key: HBASE-4296
        URL: https://issues.apache.org/jira/browse/HBASE-4296
    Project: HBase
 Issue Type: Bug
 Components: client
 Affects Versions: 0.92.0
   Reporter: Lars Hofhansl
   Assignee: Lars Hofhansl
   Priority: Trivial
    Fix For: 0.92.0
Attachments: 4296.txt

HTable's getRowOrBefore(...) internally calls into Store.getRowKeyAtOrBefore(...). That method was created to allow scanning of .META. (see HBASE-2600). Store.getRowKeyAtOrBefore(...) lists a bunch of requirements for it to be performant that a user of HTable will not be aware of. I propose deprecating this in the public interface in 0.92 and removing it from the public interface in 0.94. If we don't get to HBASE-2600 in 0.94 it will still remain as an internal interface for scanning meta. Comments?
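For reference, a minimal sketch of the client call being deprecated, assuming the 0.92-era HTable API; the table name suffix and key below are illustrative, not taken from the issue:

{code:java}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class RowOrBeforeExample {
  public static void main(String[] args) throws Exception {
    HTable meta = new HTable(HBaseConfiguration.create(), ".META.");
    // Returns the row whose key is at or immediately before the given key --
    // the closest-before semantics that meta lookups rely on.
    Result r = meta.getRowOrBefore(Bytes.toBytes("usertable,somekey,99999"),
        Bytes.toBytes("info"));
    System.out.println(r == null ? "no row found" : Bytes.toStringBinary(r.getRow()));
    meta.close();
  }
}
{code}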
[jira] [Commented] (HBASE-5437) HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy
[ https://issues.apache.org/jira/browse/HBASE-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214447#comment-13214447 ]

Hadoop QA commented on HBASE-5437:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12515720/HBASE-5437.D1887.2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -136 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1022//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1022//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1022//console

This message is automatically generated.

HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy
--------------------------------------------------------------------------------
        Key: HBASE-5437
        URL: https://issues.apache.org/jira/browse/HBASE-5437
    Project: HBase
 Issue Type: Bug
 Components: metrics, thrift
   Reporter: Scott Chen
   Assignee: Scott Chen
    Fix For: 0.94.0
Attachments: HBASE-5437.D1857.1.patch, HBASE-5437.D1887.1.patch, HBASE-5437.D1887.2.patch

{code}
3.facebook.com,60020,1329865516120: Initialization of RS failed. Hence aborting RS.
java.lang.ClassCastException: $Proxy9 cannot be cast to org.apache.hadoop.hbase.thrift.generated.Hbase$Iface
        at org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.newInstance(HbaseHandlerMetricsProxy.java:47)
        at org.apache.hadoop.hbase.thrift.ThriftServerRunner.<init>(ThriftServerRunner.java:239)
        at org.apache.hadoop.hbase.regionserver.HRegionThriftServer.<init>(HRegionThriftServer.java:74)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeThreads(HRegionServer.java:646)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:546)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:658)
        at java.lang.Thread.run(Thread.java:662)
2012-02-21 15:05:18,749 FATAL org.apache.hadoop.h
{code}
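The "$ProxyN cannot be cast to ..." error in the trace is the standard symptom of a java.lang.reflect.Proxy being created without the target interface in its interface list. A self-contained, JDK-only reproduction of that pattern follows; the interface and handler are stand-ins, not the actual HBase code:

{code:java}
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ProxyCastDemo {
  interface Iface { void ping(); }

  public static void main(String[] args) {
    InvocationHandler handler = new InvocationHandler() {
      public Object invoke(Object proxy, Method method, Object[] margs) {
        System.out.println("invoked: " + method.getName());
        return null;
      }
    };

    // Proxy created WITHOUT Iface in its interface list: the cast below
    // fails with "$ProxyN cannot be cast to ..." just like the RS abort.
    Object wrong = Proxy.newProxyInstance(ProxyCastDemo.class.getClassLoader(),
        new Class<?>[] { Runnable.class }, handler);
    try {
      Iface bad = (Iface) wrong;
    } catch (ClassCastException e) {
      System.out.println("cast failed: " + e);
    }

    // Listing the target interface when creating the proxy makes the cast legal.
    Iface ok = (Iface) Proxy.newProxyInstance(ProxyCastDemo.class.getClassLoader(),
        new Class<?>[] { Iface.class }, handler);
    ok.ping();
  }
}
{code}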
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
--------------------------------

Attachment: hbase-5270v5.patch

Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
------------------------------------------------------------------------------------------------------
        Key: HBASE-5270
        URL: https://issues.apache.org/jira/browse/HBASE-5270
    Project: HBase
 Issue Type: Sub-task
 Components: master
   Reporter: Zhihong Yu
   Assignee: chunhui shen
    Fix For: 0.92.1, 0.94.0
Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch, 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch, 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch, hbase-5270v4.patch, hbase-5270v5.patch, sampletest.txt

This JIRA continues the effort from HBASE-5179. Starting with Stack's comments about patches for 0.92 and TRUNK.

Reviewing 0.92v17:
- isDeadServerInProgress is a new public method in ServerManager, but it does not seem to be used anywhere.
- Does isDeadRootServerInProgress need to be public? Ditto for the meta version.
- The method param name 'definitiveRootServer' is not right; what is meant by "definitive"? Does it need this qualifier?
- Is there anything in place to stop us expiring a server twice if it's carrying root and meta?
- What is the difference between asking the assignment manager isCarryingRoot and this variable that is passed in? It should be doc'd at least. Ditto for meta.
- I think I've asked for this a few times: onlineServers needs to be explained, either in javadoc or in a comment. This is the param passed into joinCluster. How does it arise? I think I know but am unsure. God love the poor noob that comes a-wandering through this code trying to make sense of it all. It looks like we get the list by trawling zk for regionserver znodes that have not checked in. Don't we do this operation earlier in master setup? Are we doing it again here?
- Though distributed log splitting is configured, with this patch we will do single-process splitting in the master under some conditions. It's not explained in the code why we would do this. Why do we think master log splitting is "high priority" when it could very well be slower? Should we only go this route if distributed splitting is not going on? Do we know whether concurrent distributed log splitting and master splitting work together?
- Why would we have dead servers in progress here in master startup? Because a ServerShutdownHandler fired?

This patch is different from the patch for 0.90. It should go into trunk first with tests, then 0.92. Should it be in this issue? This issue is really hard to follow now. Maybe this issue is for 0.90.x, with a new issue for more work on this trunk patch? This patch needs to have the v18 differences applied.
[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214475#comment-13214475 ]

chunhui shen commented on HBASE-5270:
-------------------------------------

@Ted
I have submitted patch v5.
bq. So a server could be in both deadNotExpiredServers and deadservers? I don't see a return statement in the if block.
Sorry, I made a mistake: the return statement was missing from the if block. Also, we check that we're not in safe mode in expireDelayedServers(), and the master is now in safe mode only while it is initializing.

Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
------------------------------------------------------------------------------------------------------
        Key: HBASE-5270
        URL: https://issues.apache.org/jira/browse/HBASE-5270
    Project: HBase
 Issue Type: Sub-task
 Components: master
   Reporter: Zhihong Yu
   Assignee: chunhui shen
    Fix For: 0.92.1, 0.94.0
[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214476#comment-13214476 ]

chunhui shen commented on HBASE-5270:
-------------------------------------

I can't add a review request; it throws this error:
The file 'https://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java' (r1292711) could not be found in the repository
Why?

Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
------------------------------------------------------------------------------------------------------
        Key: HBASE-5270
        URL: https://issues.apache.org/jira/browse/HBASE-5270
    Project: HBase
 Issue Type: Sub-task
 Components: master
   Reporter: Zhihong Yu
   Assignee: chunhui shen
    Fix For: 0.92.1, 0.94.0
[jira] [Created] (HBASE-5462) [monitor] Ganglia metric hbase.master.cluster_requests should exclude the scan meta request generated by master, or create a new metric which could show the real request
[monitor] Ganglia metric hbase.master.cluster_requests should exclude the scan meta request generated by master, or create a new metric which could show the real request from client
--------------------------------------------------------------------------------------------------
        Key: HBASE-5462
        URL: https://issues.apache.org/jira/browse/HBASE-5462
    Project: HBase
 Issue Type: Bug
 Components: monitoring
Environment: hbase 0.90.5
   Reporter: johnyang

We have a big table which has 30K regions, but the request rate is not very high (about 50K per day). We use the hbase.master.cluster_requests metric to monitor cluster requests, but we find that many of the requests are generated by the master itself, which scans the meta table at regular intervals. This makes it hard for us to monitor the real requests from clients. Would it be possible to filter out the meta-table scans, or to create a new metric that shows only the real requests from clients? Thank you.
[jira] [Commented] (HBASE-4491) HBase Locality Checker
[ https://issues.apache.org/jira/browse/HBASE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214617#comment-13214617 ]

Anoop Sam John commented on HBASE-4491:
---------------------------------------

@Liyin: Looks very useful. Any update on this new feature?

HBase Locality Checker
----------------------
        Key: HBASE-4491
        URL: https://issues.apache.org/jira/browse/HBASE-4491
    Project: HBase
 Issue Type: New Feature
   Reporter: Liyin Tang
   Assignee: Liyin Tang

If we run a data node and a region server on the same physical machine, the region server will benefit if the store files for its serving regions have a local replica in the data node process. So for each region there exists a best-locality region server, which has the most local blocks for that region. The HBase Locality Checker will show how many regions are running on their best-locality region server. The higher the number, the more performance benefit HBase can get from data locality. There would also be a follow-up task to use this region locality information for region assignment: the assignment manager will prefer to assign regions to their best-locality region server.
[jira] [Commented] (HBASE-5455) Add test to avoid unintentional reordering of items in HbaseObjectWritable
[ https://issues.apache.org/jira/browse/HBASE-5455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214678#comment-13214678 ]

Michael Drzal commented on HBASE-5455:
--------------------------------------

Lars, sure, let me throw something together and I'll send it your way.

Add test to avoid unintentional reordering of items in HbaseObjectWritable
--------------------------------------------------------------------------
        Key: HBASE-5455
        URL: https://issues.apache.org/jira/browse/HBASE-5455
    Project: HBase
 Issue Type: Test
   Reporter: Michael Drzal
   Priority: Minor
    Fix For: 0.94.0

HbaseObjectWritable has a static initialization block that assigns ints to various classes. The int is assigned using a local variable that is incremented after each use. If someone adds a line in the middle of the block, this throws off everything after the change and can break client compatibility. There is already a comment saying not to add/remove lines at the beginning of this block. It might make sense to have a test against a static set of ids. If something gets changed unintentionally, it would at least fail the tests. If the change was intentional, at the very least the test would need to get updated, and it would be a conscious decision. https://issues.apache.org/jira/browse/HBASE-5204 contains the fix for one issue of this type.
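A sketch of what such a golden-ids test could look like. Note that the accessor getClassCode(...) is hypothetical -- HbaseObjectWritable keeps its CLASS_TO_CODE map private, so the real patch would need to expose the mapping in some equivalent way -- and the numeric ids below are illustrative, not the actual codes:

{code:java}
import static org.junit.Assert.assertEquals;

import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.hbase.io.HbaseObjectWritable;
import org.junit.Test;

public class TestHbaseObjectWritableCodes {
  // Golden class->code pairs, recorded once and never reordered.
  // The values here are ILLUSTRATIVE only.
  private static final Map<String, Integer> EXPECTED = new HashMap<String, Integer>();
  static {
    EXPECTED.put("org.apache.hadoop.hbase.client.Put", 35);
    EXPECTED.put("org.apache.hadoop.hbase.client.Get", 32);
  }

  @Test
  public void testCodesAreFrozen() throws Exception {
    for (Map.Entry<String, Integer> e : EXPECTED.entrySet()) {
      // HYPOTHETICAL accessor; fails loudly if the static init block was
      // reordered and a class silently picked up a new code.
      int actual = HbaseObjectWritable.getClassCode(Class.forName(e.getKey()));
      assertEquals(e.getKey(), e.getValue().intValue(), actual);
    }
  }
}
{code}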
[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214793#comment-13214793 ]

Zhihong Yu commented on HBASE-5270:
-----------------------------------

I was able to create a new review request:
- Select hbase for Repository.
- Enter '/' for Base Directory.
- Leave the Bugs field blank.
- Enter hbase in the Groups field.

Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
------------------------------------------------------------------------------------------------------
        Key: HBASE-5270
        URL: https://issues.apache.org/jira/browse/HBASE-5270
    Project: HBase
 Issue Type: Sub-task
 Components: master
   Reporter: Zhihong Yu
   Assignee: chunhui shen
    Fix For: 0.92.1, 0.94.0
[jira] [Commented] (HBASE-5461) Set hbase.hstore.compaction.min.size way down to 4MB
[ https://issues.apache.org/jira/browse/HBASE-5461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214838#comment-13214838 ]

Nicolas Spiegelberg commented on HBASE-5461:
--------------------------------------------

bq. I wonder if a good default would be some fraction of the flushsize. Maybe 1/4*flushsize, or something.
Note that we raised the min size so that we'd aggressively compact the .META. table, which is normally pretty small.

Set hbase.hstore.compaction.min.size way down to 4MB
----------------------------------------------------
        Key: HBASE-5461
        URL: https://issues.apache.org/jira/browse/HBASE-5461
    Project: HBase
 Issue Type: Task
 Affects Versions: 0.92.1
   Reporter: stack
   Priority: Critical

See the discussion over in HBASE-3149. Nicolas suggests setting this setting way down, to the below:
{code}
<property>
  <name>hbase.hstore.compaction.min.size</name>
  <value>4194304</value>
  <description>
    The minimum compaction size. All files below this size are always
    included into a compaction, even if outside compaction ratio times
    the total size of all files added to compaction so far.
  </description>
</property>
{code}
Let's try it.
[jira] [Commented] (HBASE-5437) HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy
[ https://issues.apache.org/jira/browse/HBASE-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214857#comment-13214857 ]

stack commented on HBASE-5437:
------------------------------

+1

HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy
--------------------------------------------------------------------------------
        Key: HBASE-5437
        URL: https://issues.apache.org/jira/browse/HBASE-5437
    Project: HBase
 Issue Type: Bug
 Components: metrics, thrift
   Reporter: Scott Chen
   Assignee: Scott Chen
    Fix For: 0.94.0
[jira] [Commented] (HBASE-5423) Regionserver may block forever on waitOnAllRegionsToClose when aborting
[ https://issues.apache.org/jira/browse/HBASE-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214865#comment-13214865 ]

Zhihong Yu commented on HBASE-5423:
-----------------------------------

{code}
+    if (this.regionsInTransitionInRS.isEmpty()) {
+      if (!isOnlineRegionsEmpty()) {
+        LOG.info("We were exiting though online regions are not empty, because some regions failed closing");
{code}
I think regionsInTransitionOnRS is a better name for the new Map. The log line above exceeds 80 chars.

Regionserver may block forever on waitOnAllRegionsToClose when aborting
------------------------------------------------------------------------
        Key: HBASE-5423
        URL: https://issues.apache.org/jira/browse/HBASE-5423
    Project: HBase
 Issue Type: Bug
 Components: regionserver
   Reporter: chunhui shen
   Assignee: chunhui shen
    Fix For: 0.94.0
Attachments: hbase-5423.patch, hbase-5423v2.patch

If closeRegion throws any exception (e.g. one caused by the FS) while the RS is aborting, the RS will block forever on waitOnAllRegionsToClose().
[jira] [Updated] (HBASE-4403) Adopt interface stability/audience classifications from Hadoop
[ https://issues.apache.org/jira/browse/HBASE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-4403:
-------------------------------

Status: Patch Available (was: Open)

Adopt interface stability/audience classifications from Hadoop
--------------------------------------------------------------
        Key: HBASE-4403
        URL: https://issues.apache.org/jira/browse/HBASE-4403
    Project: HBase
 Issue Type: Task
 Affects Versions: 0.92.0, 0.90.5
   Reporter: Todd Lipcon
   Assignee: Jimmy Xiang
    Fix For: 0.94.0
Attachments: hbase-4403-interface.txt, hbase-4403-interface_v2.txt, hbase-4403-interface_v3.txt, hbase-4403-nowhere-near-done.txt, hbase-4403.patch, hbase-4403.patch

As HBase gets more widely used, we need to be more explicit about which APIs are stable and not expected to break between versions, which APIs are still evolving, etc. We also have many public classes that are really internal to the RS or Master and not meant to be used by users. Hadoop has adopted a classification scheme for audience (public, private, or limited-private) as well as stability (stable, evolving, unstable). I think we should copy these annotations to HBase and start to classify our public classes.
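For reference, this is roughly what the Hadoop-style classifications look like when applied. The class below is invented for the sketch, and which annotation package HBase would ultimately use (Hadoop's org.apache.hadoop.classification or an HBase-local copy) is an assumption here:

{code:java}
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

// Public + Evolving: clients may use this class, but the API can still
// change between minor releases.
@InterfaceAudience.Public
@InterfaceStability.Evolving
public class ExampleClientFacingClass {
  // Classes internal to the RS or Master would instead carry
  // @InterfaceAudience.Private (or LimitedPrivate for named consumers),
  // signalling that users should not depend on them.
}
{code}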
[jira] [Commented] (HBASE-5442) Use builder pattern in StoreFile and HFile
[ https://issues.apache.org/jira/browse/HBASE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214868#comment-13214868 ]

stack commented on HBASE-5442:
------------------------------

@Mikhail That's the usual set of three that fail on hadoopqa, FYI.

Use builder pattern in StoreFile and HFile
------------------------------------------
        Key: HBASE-5442
        URL: https://issues.apache.org/jira/browse/HBASE-5442
    Project: HBase
 Issue Type: Improvement
   Reporter: Mikhail Bautin
   Assignee: Mikhail Bautin
Attachments: D1893.1.patch, D1893.2.patch, HFile-StoreFile-builder-2012-02-22_22_49_00.patch

We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g.
{code:java}
HFileWriter w = HFile.getWriterBuilder(conf, <some common args>)
    .setParameter1(value1)
    .setParameter2(value2)
    ...
    .build();
{code}
Each parameter setter being on its own line will make merges/cherry-picks work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses StoreFile and HFile refactoring. For HColumnDescriptor refactoring see HBASE-5357.
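A self-contained illustration of the pattern being proposed; the class and parameter names are invented for the sketch and do not correspond to the eventual HFile/StoreFile builder API:

{code:java}
public class StoreWriterBuilder {
  private int blockSize = 65536;       // defaults live in exactly one place
  private String compression = "none";

  public StoreWriterBuilder withBlockSize(int blockSize) {
    this.blockSize = blockSize;
    return this;                       // each setter on its own line merges cleanly
  }

  public StoreWriterBuilder withCompression(String compression) {
    this.compression = compression;
    return this;
  }

  public String build() {
    return "writer(blockSize=" + blockSize + ", compression=" + compression + ")";
  }

  public static void main(String[] args) {
    String w = new StoreWriterBuilder()
        .withBlockSize(131072)
        .withCompression("lzo")
        .build();
    System.out.println(w);
  }
}
{code}

Because callers only name the parameters they override, adding a new optional parameter never breaks existing call sites, which is exactly the cross-branch porting pain the issue describes.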
[jira] [Commented] (HBASE-5454) Refuse operations from Admin before master is initialized
[ https://issues.apache.org/jira/browse/HBASE-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214870#comment-13214870 ]

stack commented on HBASE-5454:
------------------------------

So, you want to mash this patch into HBASE-5270? If so, close this one as Won't Fix?

Refuse operations from Admin before master is initialized
---------------------------------------------------------
        Key: HBASE-5454
        URL: https://issues.apache.org/jira/browse/HBASE-5454
    Project: HBase
 Issue Type: Improvement
   Reporter: chunhui shen
Attachments: hbase-5454.patch

In our testing environment, when the master was initializing we found conflicts between master#assignAllUserRegions and the EnableTable event, causing region assignment to throw an exception so that the master aborted itself. We think we had better refuse operations from Admin, such as CreateTable, EnableTable, etc., while initializing; it would reduce errors.
[jira] [Updated] (HBASE-5166) MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
[ https://issues.apache.org/jira/browse/HBASE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5166:
-------------------------

Attachment: 5166-v9.txt

Same as 0008. Uploading again to rerun hadoopqa. It shouldn't be failing that many tests with this patch.

MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
-----------------------------------------------------------------------
        Key: HBASE-5166
        URL: https://issues.apache.org/jira/browse/HBASE-5166
    Project: HBase
 Issue Type: Improvement
   Reporter: Jai Kumar Singh
   Priority: Minor
     Labels: multithreaded, tablemapper
Attachments: 0001-Added-MultithreadedTableMapper-HBASE-5166.patch, 0003-Added-MultithreadedTableMapper-HBASE-5166.patch, 0005-HBASE-5166-Added-MultithreadedTableMapper.patch, 0006-HBASE-5166-Added-MultithreadedTableMapper.patch, 0008-HBASE-5166-Added-MultithreadedTableMapper.patch, 5166-v9.txt
Original Estimate: 0.5h
Remaining Estimate: 0.5h

There is currently no MultiThreadedTableMapper in HBase, analogous to the MultiThreadedMapper Hadoop provides for IO-bound jobs. Use case, a web crawler: take input (URLs) from an HBase table and put the content (URL, content) back into HBase. Running this kind of HBase MapReduce job with the normal table mapper is quite slow, as we are not utilizing the CPU fully (the job is network-IO bound). Moreover, I would like to know whether it is a good or bad idea to use HBase for this kind of use case.
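For context, this is how the plain-Hadoop analogue (MultithreadedMapper) is wired up; the patch adds an equivalent for TableMapper. The job name and the trivial CrawlMapper below are placeholders, assuming the standard org.apache.hadoop.mapreduce API:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;

public class MultithreadedJobSketch {
  public static class CrawlMapper
      extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx) {
      // Network-bound work (e.g. fetching a URL) would go here.
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "io-bound-crawl");
    // Run CrawlMapper instances on a thread pool inside each map task, so
    // the task keeps the CPU busy while individual fetches block on IO.
    job.setMapperClass(MultithreadedMapper.class);
    MultithreadedMapper.setMapperClass(job, CrawlMapper.class);
    MultithreadedMapper.setNumberOfThreads(job, 10);
    // ... input/output formats and job submission omitted.
  }
}
{code}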
[jira] [Updated] (HBASE-5166) MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
[ https://issues.apache.org/jira/browse/HBASE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5166:
-------------------------

Status: Open (was: Patch Available)

MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
-----------------------------------------------------------------------
        Key: HBASE-5166
        URL: https://issues.apache.org/jira/browse/HBASE-5166
    Project: HBase
 Issue Type: Improvement
   Reporter: Jai Kumar Singh
   Priority: Minor
[jira] [Updated] (HBASE-5166) MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
[ https://issues.apache.org/jira/browse/HBASE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5166:
-------------------------

Status: Patch Available (was: Open)

MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
-----------------------------------------------------------------------
        Key: HBASE-5166
        URL: https://issues.apache.org/jira/browse/HBASE-5166
    Project: HBase
 Issue Type: Improvement
   Reporter: Jai Kumar Singh
   Priority: Minor
[jira] [Commented] (HBASE-5166) MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
[ https://issues.apache.org/jira/browse/HBASE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214912#comment-13214912 ]

Hadoop QA commented on HBASE-5166:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12515764/5166-v9.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -134 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 153 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1024//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1024//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1024//console

This message is automatically generated.

MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
-----------------------------------------------------------------------
        Key: HBASE-5166
        URL: https://issues.apache.org/jira/browse/HBASE-5166
    Project: HBase
 Issue Type: Improvement
   Reporter: Jai Kumar Singh
   Priority: Minor
[jira] [Commented] (HBASE-5457) add inline index in data block for data which are not clustered together
[ https://issues.apache.org/jira/browse/HBASE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214915#comment-13214915 ]

He Yongqiang commented on HBASE-5457:
-------------------------------------

@stack, we haven't thought about that in much detail, but we can start the discussion with an example. Let's say there is one column family, and it only contains one type of column whose name is a combination of a string and a ts. So the data is sorted by the string first. But one query wants the data sorted by ts instead.

add inline index in data block for data which are not clustered together
-------------------------------------------------------------------------
        Key: HBASE-5457
        URL: https://issues.apache.org/jira/browse/HBASE-5457
    Project: HBase
 Issue Type: New Feature
   Reporter: He Yongqiang

Going through our data schema, we found we have one large column family that just duplicates data from another column family; it is a re-org of the data that clusters it in a different way than the original column family, in order to serve another type of query efficiently. Comparing this second column family with the similar situation in MySQL, it is like an index in MySQL. So if we could add an inline block index on the required columns, the second column family would no longer be needed.
[jira] [Commented] (HBASE-5457) add inline index in data block for data which are not clustered together
[ https://issues.apache.org/jira/browse/HBASE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214922#comment-13214922 ]

Lars Hofhansl commented on HBASE-5457:
--------------------------------------

@He. So you find the row and then you search inside the row with a ColumnRange or ColumnPrefix filter?

add inline index in data block for data which are not clustered together
-------------------------------------------------------------------------
        Key: HBASE-5457
        URL: https://issues.apache.org/jira/browse/HBASE-5457
    Project: HBase
 Issue Type: New Feature
   Reporter: He Yongqiang
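For context, a minimal sketch of the within-row search Lars is describing, using the existing ColumnRangeFilter; the table name, family, and qualifier bounds are illustrative, assuming column names of the form "string:ts":

{code:java}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.filter.ColumnRangeFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class ColumnRangeExample {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(HBaseConfiguration.create(), "events");
    Get get = new Get(Bytes.toBytes("user123"));
    // Only return qualifiers in ["click:1330000000", "click:1330086400"):
    // the ts part of the combined column name selects a time window, but
    // only within columns that share the same string prefix.
    get.setFilter(new ColumnRangeFilter(
        Bytes.toBytes("click:1330000000"), true,
        Bytes.toBytes("click:1330086400"), false));
    Result r = table.get(get);
    System.out.println("matching columns: " + r.size());
    table.close();
  }
}
{code}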
[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload
[ https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214944#comment-13214944 ]

Jean-Daniel Cryans commented on HBASE-5349:
-------------------------------------------

@Mubarak We already have that through HeapSize; it's really just a matter of knowing what to auto-tune and when.

Automagically tweak global memstore and block cache sizes based on workload
---------------------------------------------------------------------------
        Key: HBASE-5349
        URL: https://issues.apache.org/jira/browse/HBASE-5349
    Project: HBase
 Issue Type: Improvement
 Affects Versions: 0.92.0
   Reporter: Jean-Daniel Cryans
    Fix For: 0.94.0

Hypertable does a neat thing where it changes the size given to the CellCache (our MemStores) and the block cache based on the workload. If you need an image, scroll to the bottom of this link: http://www.hypertable.com/documentation/architecture/ That'd be one less thing to configure.
[jira] [Commented] (HBASE-5434) [REST] Include more metrics in cluster status request
[ https://issues.apache.org/jira/browse/HBASE-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214945#comment-13214945 ]

Mubarak Seyed commented on HBASE-5434:
--------------------------------------

@Stack When I tested with -P runMediumTests it went well, but the code is annotated as smallTests; will update the patch soon. Sorry for the inconvenience. Thanks.

[REST] Include more metrics in cluster status request
-----------------------------------------------------
        Key: HBASE-5434
        URL: https://issues.apache.org/jira/browse/HBASE-5434
    Project: HBase
 Issue Type: Improvement
 Components: metrics, rest
 Affects Versions: 0.94.0
   Reporter: Mubarak Seyed
   Assignee: Mubarak Seyed
   Priority: Minor
     Labels: noob
    Fix For: 0.94.0
Attachments: HBASE-5434.trunk.v1.patch

/status/cluster shows only
{code}
stores=2
storefiless=0
storefileSizeMB=0
memstoreSizeMB=0
storefileIndexSizeMB=0
{code}
for a region, but the master web-ui shows
{code}
stores=1,
storefiles=0,
storefileUncompressedSizeMB=0
storefileSizeMB=0
memstoreSizeMB=0
storefileIndexSizeMB=0
readRequestsCount=0
writeRequestsCount=0
rootIndexSizeKB=0
totalStaticIndexSizeKB=0
totalStaticBloomSizeKB=0
totalCompactingKVs=0
currentCompactedKVs=0
compactionProgressPct=NaN
{code}
In a write-heavy, REST-gateway-based production environment, the ops team needs to verify that write counters are getting incremented per region (they run /status/cluster on each REST server). We can get the same values from *rpc.metrics.put_num_ops* and *hbase.regionserver.writeRequestsCount*, but some home-grown tools need to parse the output of /status/cluster and update the dashboard.
[jira] [Created] (HBASE-5463) Why is my upload to mvn spread across multiple repositories?
Why is my upload to mvn spread across multiple repositories?
------------------------------------------------------------
        Key: HBASE-5463
        URL: https://issues.apache.org/jira/browse/HBASE-5463
    Project: HBase
 Issue Type: Task
   Reporter: stack

I've been struggling publishing a release to repository.apache.org. It's worked for me in the past. If you look at https://repository.apache.org/index.html#stagingRepositories (you need to be logged in), you will see that I somehow made twelve repositories when I did my mvn release:perform, each artifact element in its own repo. Any idea how that happens? (I'll attach a png that shows similar.) How do I prevent it?

I have another issue where the upload to apache fails with a 400 Bad Request very frequently when uploading one of my artifact items -- usually maven-metadata.xml -- but then, just now, it went through fine. Pointers appreciated on this little nugget too.

I'm using mvn 3.0.4 and 2.2.2 of the maven-release plugin. Otherwise, my settings.xml is one that has worked for me in the past:
{code}
<settings>
  <servers>
    <!-- To publish a snapshot of some part of Maven -->
    <server>
      <id>apache.snapshots.https</id>
      <username>stack</username>
      <password> </password>
    </server>
    <!-- To publish a website using Maven -->
    <!-- To stage a release of some part of Maven -->
    <server>
      <id>apache.releases.https</id>
      <username>stack</username>
      <password> </password>
    </server>
  </servers>
  <profiles>
    <profile>
      <id>apache-release</id>
      <properties>
        <gpg.keyname>00A5F21E</gpg.keyname>
        <gpg.passphrase> </gpg.passphrase>
      </properties>
    </profile>
  </profiles>
</settings>
{code}
My pom is here: http://svn.apache.org/viewvc/hbase/tags/0.92.0mvn/pom.xml?view=markup

Thanks for any pointers.
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-5357:
-------------------------------

Attachment: D1851.3.patch

mbautin updated the revision "[jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor".
Reviewers: JIRA, todd, stack, tedyu, Kannan, Karthik, Liyin

Rebasing on trunk changes.

REVISION DETAIL
  https://reviews.facebook.net/D1851

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
  src/main/java/org/apache/hadoop/hbase/client/UnmodifyableHColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftUtilities.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/TestSerialization.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksRead.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestColumnSeeking.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestScanner.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java
  src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java

Use builder pattern in HColumnDescriptor
----------------------------------------
        Key: HBASE-5357
        URL: https://issues.apache.org/jira/browse/HBASE-5357
    Project: HBase
 Issue Type: Improvement
   Reporter: Mikhail Bautin
   Assignee: Mikhail Bautin
Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch

We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g.
{code:java}
HFileWriter w = HFile.getWriterBuilder(conf, <some common args>)
    .setParameter1(value1)
    .setParameter2(value2)
    ...
    .build();
{code}
Each parameter setter being on its own line will make merges/cherry-picks work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442.
[jira] [Resolved] (HBASE-5463) Why is my upload to mvn spread across multiple repositories?
[ https://issues.apache.org/jira/browse/HBASE-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-5463.
--------------------------

Resolution: Won't Fix

Resolving. Meant to file this against infra: https://issues.apache.org/jira/browse/INFRA-4482

Why is my upload to mvn spread across multiple repositories?
------------------------------------------------------------
        Key: HBASE-5463
        URL: https://issues.apache.org/jira/browse/HBASE-5463
    Project: HBase
 Issue Type: Task
   Reporter: stack
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214964#comment-13214964 ]

Hadoop QA commented on HBASE-5357:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12515781/D1851.3.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 60 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -136 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.regionserver.TestWideScanner

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1025//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1025//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1025//console

This message is automatically generated.

Use builder pattern in HColumnDescriptor
----------------------------------------
        Key: HBASE-5357
        URL: https://issues.apache.org/jira/browse/HBASE-5357
    Project: HBase
 Issue Type: Improvement
   Reporter: Mikhail Bautin
   Assignee: Mikhail Bautin
Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-5357:
-------------------------------

Attachment: D1851.4.patch

mbautin updated the revision "[jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor".
Reviewers: JIRA, todd, stack, tedyu, Kannan, Karthik, Liyin

Fix TestWideScanner failure.

REVISION DETAIL
  https://reviews.facebook.net/D1851

AFFECTED FILES
  Same file list as the D1851.3 update above.

Use builder pattern in HColumnDescriptor
----------------------------------------
        Key: HBASE-5357
        URL: https://issues.apache.org/jira/browse/HBASE-5357
    Project: HBase
 Issue Type: Improvement
   Reporter: Mikhail Bautin
   Assignee: Mikhail Bautin
Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5357: -- Attachment: Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214974#comment-13214974 ] Phabricator commented on HBASE-5357: mbautin has commented on the revision [jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor. This should pass all unit tests now (re-running internally as well as on Hadoop QA). Someone please accept this patch if there are no additional comments. Thanks! REVISION DETAIL https://reviews.facebook.net/D1851 Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
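For readers skimming the thread: applied to HColumnDescriptor, the builder idea in the description amounts to having each setter return the descriptor itself. A minimal sketch of the resulting fluent usage, assuming the chainable setters this patch introduces (the parameter values are arbitrary examples):
{code:java}
// Sketch only, assuming HBASE-5357's chainable setters.
// Uses org.apache.hadoop.hbase.HColumnDescriptor, HConstants, and util.Bytes.
HColumnDescriptor hcd = new HColumnDescriptor(Bytes.toBytes("cf"))
    .setMaxVersions(3)                  // one parameter per line merges/cherry-picks cleanly
    .setBlocksize(64 * 1024)
    .setInMemory(false)
    .setTimeToLive(HConstants.FOREVER); // unset parameters keep their defaults
{code}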
[jira] [Updated] (HBASE-5464) Log warning message when thrift calls throw exceptions
[ https://issues.apache.org/jira/browse/HBASE-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5464: --- Attachment: HBASE-5464.D1899.1.patch sc requested code review of HBASE-5464 [jira] Log warning message when thrift calls throw exceptions. Reviewers: tedyu, dhruba, JIRA Log warning when thrift calls throw exceptions Task ID: # Blame Rev: Currently there is no logging message when client calls throw exceptions. It will be easier to debug if we log them. TEST PLAN none Revert Plan: Tags: REVISION DETAIL https://reviews.facebook.net/D1899 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java MANAGE HERALD DIFFERENTIAL RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/4047/ Tip: use the X-Herald-Rules header to filter Herald messages in your client. Log warning message when thrift calls throw exceptions -- Key: HBASE-5464 URL: https://issues.apache.org/jira/browse/HBASE-5464 Project: HBase Issue Type: Improvement Components: thrift Reporter: Scott Chen Assignee: Scott Chen Priority: Trivial Attachments: HBASE-5464.D1899.1.patch Currently there is no logging message when client calls throw exceptions. It will be easier to debug if we log them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5464) Log warning message when thrift calls throw exceptions
Log warning message when thrift calls throw exceptions -- Key: HBASE-5464 URL: https://issues.apache.org/jira/browse/HBASE-5464 Project: HBase Issue Type: Improvement Components: thrift Reporter: Scott Chen Assignee: Scott Chen Priority: Trivial Attachments: HBASE-5464.D1899.1.patch Currently there is no logging message when client calls throw exceptions. It will be easier to debug if we have them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
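A hypothetical sketch of the kind of warning logging this issue asks for, wrapping the body of a Thrift handler call (illustrative only, not the attached patch; the actual change lives in HRegionThriftServer and ThriftServerRunner):
{code:java}
// Illustrative only -- not the attached HBASE-5464.D1899.1.patch.
// The idea: catch, log a warning with the failure, and rethrow as the
// Thrift-declared IOError so the client still sees the exception.
try {
  // ... perform the HBase operation backing the Thrift call ...
} catch (IOException e) {
  LOG.warn("Thrift call failed", e);
  throw new IOError(e.getMessage());
}
{code}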
[jira] [Updated] (HBASE-5465) [book] chaning book to reference guide (content only, not filenames)
[ https://issues.apache.org/jira/browse/HBASE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5465: - Attachment: docbkx_hbase_5465.patch [book] chaning book to reference guide (content only, not filenames) Key: HBASE-5465 URL: https://issues.apache.org/jira/browse/HBASE-5465 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: docbkx_hbase_5465.patch book.xml preface.xml Changing book to reference guide Note: the filenames are still the same. This is only a change to the way the document refers to itself. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5465) [book] chaning book to reference guide (content only, not filenames)
[ https://issues.apache.org/jira/browse/HBASE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5465: - Status: Patch Available (was: Open) [book] chaning book to reference guide (content only, not filenames) Key: HBASE-5465 URL: https://issues.apache.org/jira/browse/HBASE-5465 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: docbkx_hbase_5465.patch book.xml preface.xml Changing book to reference guide Note: the filenames are still the same. This is only a change to the way the document refers to itself. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5465) [book] changing book to reference guide (content only, not filenames)
[ https://issues.apache.org/jira/browse/HBASE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5465: - Summary: [book] changing book to reference guide (content only, not filenames) (was: [book] chaning book to reference guide (content only, not filenames)) [book] changing book to reference guide (content only, not filenames) - Key: HBASE-5465 URL: https://issues.apache.org/jira/browse/HBASE-5465 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: docbkx_hbase_5465.patch book.xml preface.xml Changing book to reference guide Note: the filenames are still the same. This is only a change to the way the document refers to itself. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5465) [book] changing book to reference guide (content only, not filenames)
[ https://issues.apache.org/jira/browse/HBASE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5465: - Resolution: Fixed Status: Resolved (was: Patch Available) [book] changing book to reference guide (content only, not filenames) - Key: HBASE-5465 URL: https://issues.apache.org/jira/browse/HBASE-5465 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: docbkx_hbase_5465.patch book.xml preface.xml Changing book to reference guide Note: the filenames are still the same. This is only a change to the way the document refers to itself. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214984#comment-13214984 ] Jean-Daniel Cryans commented on HBASE-3134: --- Patch looks good. Was it tested outside of unit tests? [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: 3134-v2.txt, 3134-v3.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and we want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214988#comment-13214988 ] Zhihong Yu commented on HBASE-5357: --- @Mikhail: Can you attach patch generated with --no-prefix ? Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
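For context, --no-prefix controls how the diff paths are emitted; assuming the patch comes from a git checkout, the request amounts to regenerating it along the lines of:
{code}
# Regenerate the diff without the a/ and b/ path prefixes (output file name illustrative):
git diff --no-prefix > HBASE-5357.patch
{code}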
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214996#comment-13214996 ] Phabricator commented on HBASE-5357: Kannan has commented on the revision [jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor. nice change-- the unit tests are a lot more readable now! One inlined comment... INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:438 changing the return type can break some existing apps out there, no? REVISION DETAIL https://reviews.facebook.net/D1851 Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214998#comment-13214998 ] Gregory Chanan commented on HBASE-5317: --- Okay, this looks like an mvn issue. On Linux, if JAVA_HOME is not set, mvn sets JAVA_HOME to be whatever shows up in mvn --version. On Mac, mvn appears not to set JAVA_HOME if it is not already set. So if you set JAVA_HOME to /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home the test should pass. The new version of mapreduce requires JAVA_HOME to be set and tries to execute $JAVA_HOME + bin/java -- they should probably report an error if JAVA_HOME is not set. Let me know if you want me to do anything else wrt this issue. Fix TestHFileOutputFormat to work against hadoop 0.23 - Key: HBASE-5317 URL: https://issues.apache.org/jira/browse/HBASE-5317 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0, 0.94.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Attachments: HBASE-5317-v0.patch, HBASE-5317-v1.patch, HBASE-5317-v3.patch, HBASE-5317-v4.patch, HBASE-5317-v5.patch, HBASE-5317-v6.patch, TEST-org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.xml Running mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat yields this on 0.92: Failed tests: testColumnFamilyCompression(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): HFile for column family info-A not found Tests in error: test_TIMERANGE(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): /home/gchanan/workspace/apache92/target/test-data/276cbd0c-c771-4f81-9ba8-c464c9dd7486/test_TIMERANGE_present/_temporary/0/_temporary/_attempt_200707121733_0001_m_00_0 (Is a directory) testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable It looks like on trunk, this also results in an error: testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable I have a patch that fixes testColumnFamilyCompression and test_TIMERANGE, but haven't fixed the other 3 yet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
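Putting the two observations together, the suggested workaround on a Mac is to export JAVA_HOME explicitly before running the test (path as given in the comment above; adjust for other JDK installs):
{code}
export JAVA_HOME=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
{code}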
[jira] [Commented] (HBASE-5457) add inline index in data block for data which are not clustered together
[ https://issues.apache.org/jira/browse/HBASE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215002#comment-13215002 ] He Yongqiang commented on HBASE-5457: - @lars, in today's implementation we actually create another column family and reorganize the column name to be 'ts and string', so the data is sorted by ts in this new column family, and we redirect the query to use the second column family. But this approach duplicates data. Without the second column family, we could do a search once we find the row, but that requires scanning all data under the target row key, which hurts CPU. add inline index in data block for data which are not clustered together Key: HBASE-5457 URL: https://issues.apache.org/jira/browse/HBASE-5457 Project: HBase Issue Type: New Feature Reporter: He Yongqiang As we go through our data schema, we found we have one large column family that just duplicates data from another column family, re-organizing the data to cluster it in a different way than the original column family in order to serve another type of query efficiently. If we compare this second column family with the similar situation in MySQL, it is like an index in MySQL. So if we could add an inline block index on the required columns, the second column family would not be needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
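To make the trade-off concrete, here is a hedged sketch of the duplicate-column-family workaround described in the comment: every cell is written a second time into an index-like family whose qualifier leads with the timestamp, so cells there sort by ts (family and variable names are hypothetical; rowKey, columnName, ts, value and table are assumed in scope):
{code:java}
// Hypothetical sketch of the workaround this issue wants to make unnecessary.
byte[] dataFamily = Bytes.toBytes("data");        // hypothetical family names
byte[] indexFamily = Bytes.toBytes("data_by_ts");

Put put = new Put(rowKey);
// primary copy, keyed by column name
put.add(dataFamily, Bytes.toBytes(columnName), ts, value);
// duplicate copy, keyed by ts + column name so it clusters by timestamp
put.add(indexFamily, Bytes.add(Bytes.toBytes(ts), Bytes.toBytes(columnName)), value);
table.put(put);  // the data is stored twice -- the cost an inline index would avoid
{code}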
[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215017#comment-13215017 ] Zhihong Yu commented on HBASE-5317: --- {code} echo ${JAVA_HOME} /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home mvn -Dhadoop.profile=23 clean test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat {code} Same test failures, e.g.: {code} testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat) Time elapsed: 60.857 sec FAILURE! java.lang.AssertionError at org.junit.Assert.fail(Assert.java:92) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertTrue(Assert.java:54) at org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.runIncrementalPELoad(TestHFileOutputFormat.java:478) at org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.doIncrementalLoadTest(TestHFileOutputFormat.java:390) at org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.testMRIncrementalLoadWithSplit(TestHFileOutputFormat.java:369) {code} Fix TestHFileOutputFormat to work against hadoop 0.23 - Key: HBASE-5317 URL: https://issues.apache.org/jira/browse/HBASE-5317 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0, 0.94.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Attachments: HBASE-5317-v0.patch, HBASE-5317-v1.patch, HBASE-5317-v3.patch, HBASE-5317-v4.patch, HBASE-5317-v5.patch, HBASE-5317-v6.patch, TEST-org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.xml Running mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat yields this on 0.92: Failed tests: testColumnFamilyCompression(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): HFile for column family info-A not found Tests in error: test_TIMERANGE(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): /home/gchanan/workspace/apache92/target/test-data/276cbd0c-c771-4f81-9ba8-c464c9dd7486/test_TIMERANGE_present/_temporary/0/_temporary/_attempt_200707121733_0001_m_00_0 (Is a directory) testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable It looks like on trunk, this also results in an error: testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable I have a patch that fixes testColumnFamilyCompression and test_TIMERANGE, but haven't fixed the other 3 yet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215024#comment-13215024 ] Zhihong Yu commented on HBASE-5317: --- {code} LM-SJN-00713032:org.apache.hadoop.mapred.MiniMRCluster zhihyu$ find . -name stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_0/application_1330028372796_0001/container_1330028372796_0001_01_07/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_2/application_1330028372796_0001/container_1330028372796_0001_01_01/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_2/application_1330028372796_0001/container_1330028372796_0001_01_06/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1330028372796_0001/container_1330028372796_0001_01_04/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1330028372796_0001/container_1330028372796_0001_01_05/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-1_2/application_1330028372796_0001/container_1330028372796_0001_01_02/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-1_3/application_1330028372796_0001/container_1330028372796_0001_01_03/stderr LM-SJN-00713032:org.apache.hadoop.mapred.MiniMRCluster zhihyu$ cat org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_0/application_1330028372796_0001/container_1330028372796_0001_01_07/stderr WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. LM-SJN-00713032:org.apache.hadoop.mapred.MiniMRCluster zhihyu$ cat org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-1_3/application_1330028372796_0001/container_1330028372796_0001_01_03/stderr WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. {code} From org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_0/application_1330028372796_0001/container_1330028372796_0001_01_07/syslog: {code} 2012-02-23 12:22:34,208 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: Assiging lm-sjn-00713032.corp.ebay.com:61263 with 1 to fetcher#2 2012-02-23 12:22:34,208 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: assigned 1 of 1 to lm-sjn-00713032.corp.ebay.com:61263 to fetcher#2 2012-02-23 12:22:34,469 WARN [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to lm-sjn-00713032.corp.ebay.com:61263 with 1 map outputs java.io.IOException: Server returned HTTP response code: 503 for URL: http://lm-sjn-00713032.corp.ebay.com:61263/mapOutput?job=job_1330028372796_0001reduce=0map=attempt_1330028372796_0001_m_00_1 at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:220) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:152) 2012-02-23 12:22:34,471 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: Reporting fetch failure for attempt_1330028372796_0001_m_00_1 to jobtracker. 2012-02-23 12:22:34,471 FATAL [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: Shuffle failed with too many fetch failures and insufficient progress! 
2012-02-23 12:22:34,472 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: lm-sjn-00713032.corp.ebay.com:61263 freed by fetcher#2 in 264s 2012-02-23 12:22:34,472 ERROR [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:zhihyu (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2 2012-02-23 12:22:34,472 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:147) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:142) Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out. at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253) at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:247) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:152) 2012-02-23 12:22:34,476 INFO
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5357: -- Attachment: Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_25.patch Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_25.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215030#comment-13215030 ] Mikhail Bautin commented on HBASE-5357: --- @Ted: I thought that was what I did. Attaching again. Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_25.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5357: -- Attachment: (was: Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_25.patch) Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5357: -- Attachment: Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5440) Allow import to optionally use HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5440: - Attachment: 5440.txt First cut: * a new import mapper that writes KeyValues * uses KeyValueSortReducer Only used when -Dimport.bulk.output=path/to/output is set. I did experiment with a Reducer that accepts Mutation (common super class of Put and Delete), but that caused more problems than it solved, hence the KeyValueImporter. Allow import to optionally use HFileOutputFormat Key: HBASE-5440 URL: https://issues.apache.org/jira/browse/HBASE-5440 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 5440.txt importtsv supports importing into a live table or generating HFiles for bulk load. Import should allow the same. Could even consider merging these tools into one (in principle the only difference is the parsing part - although that is maybe for a different jira). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
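As a rough illustration of the first bullet (a sketch under the assumptions stated in the comment, not the attached 5440.txt): a mapper that re-emits every KeyValue of each imported Result unchanged, so KeyValueSortReducer and HFileOutputFormat can sort the cells and write HFiles:
{code:java}
// Sketch only -- the real KeyValueImporter is in the attached 5440.txt.
// Imports assumed: org.apache.hadoop.hbase.KeyValue, client.Result,
// io.ImmutableBytesWritable, org.apache.hadoop.mapreduce.Mapper.
public static class KeyValueImporter
    extends Mapper<ImmutableBytesWritable, Result, ImmutableBytesWritable, KeyValue> {
  @Override
  public void map(ImmutableBytesWritable row, Result value, Context context)
      throws IOException, InterruptedException {
    for (KeyValue kv : value.raw()) {  // re-emit each cell as-is
      context.write(row, kv);
    }
  }
}
{code}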
[jira] [Updated] (HBASE-5440) Allow import to optionally use HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5440: - Status: Patch Available (was: Open) Allow import to optionally use HFileOutputFormat Key: HBASE-5440 URL: https://issues.apache.org/jira/browse/HBASE-5440 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 5440.txt importtsv supports importing into a live table or generating HFiles for bulk load. Import should allow the same. Could even consider merging these tools into one (in principle the only difference is the parsing part - although that is maybe for a different jira). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215038#comment-13215038 ] Phabricator commented on HBASE-5357: mbautin has commented on the revision [jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor. The patch passed all unit tests in my map-reduce run, except TestAtomicOperation, which passed locally. Still waiting for the Hadoop QA run. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:438 I don't think this can break existing code. The return value will most likely be simply ignored. From http://docs.oracle.com/javase/tutorial/java/javaOO/methods.html: You cannot declare more than one method with the same name and the same number and type of arguments, because the compiler cannot tell them apart. The compiler does not consider return type when differentiating methods, so you cannot declare two methods with the same signature even if they have a different return type. REVISION DETAIL https://reviews.facebook.net/D1851 Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
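A small illustration of the point (the method shown is illustrative; line 438 of the patch is the real case): the return type is not part of the signature used for overload resolution, so turning a void setter into one that returns the descriptor leaves existing call sites compiling unchanged:
{code:java}
// Before (illustrative): public void setMaxVersions(int maxVersions)
// After:                  public HColumnDescriptor setMaxVersions(int maxVersions) // returns this
hcd.setMaxVersions(3);                    // old call sites simply ignore the return value
hcd.setMaxVersions(3).setInMemory(true);  // new call sites can chain
{code}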
[jira] [Created] (HBASE-5466) Opening a table also opens the metatable and never closes it.
Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Reporter: Ashley Taylor Having upgraded to the CDH3U3 version of HBase we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close even when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
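The fix pattern (it matches the snippet quoted in the review comment further down) is to close the meta HTable in a finally block before it goes out of scope; a minimal sketch, assuming MetaScanner opens the handle itself:
{code:java}
// Minimal sketch of the leak fix: always close the meta table handle so
// HConnection's usage count can drop back to zero and ZooKeeper can disconnect.
HTable metaTable = null;
try {
  metaTable = new HTable(configuration, HConstants.META_TABLE_NAME);
  // ... scan .META. ...
} finally {
  if (metaTable != null) {
    metaTable.close();
  }
}
{code}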
[jira] [Updated] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashley Taylor updated HBASE-5466: - Affects Version/s: 0.90.5 0.92.0 Status: Patch Available (was: Open) Patch to make sure the meta table, which is opened whenever a table is opened, gets closed before it falls out of scope Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.0, 0.90.5 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close even when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5467) changing homepage to refer to 'reference guide'
changing homepage to refer to 'reference guide' --- Key: HBASE-5467 URL: https://issues.apache.org/jira/browse/HBASE-5467 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: site_hbase_5467.patch index.xml site.xml Changing reference from book to reference guide on home page and left-nav. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashley Taylor updated HBASE-5466: - Attachment: MetaScanner_HBASE_5466.patch Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close even when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5467) changing homepage to refer to 'reference guide'
[ https://issues.apache.org/jira/browse/HBASE-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5467: - Attachment: site_hbase_5467.patch changing homepage to refer to 'reference guide' --- Key: HBASE-5467 URL: https://issues.apache.org/jira/browse/HBASE-5467 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: site_hbase_5467.patch index.xml site.xml Changing reference from book to reference guide on home page and left-nav. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5467) changing homepage to refer to 'reference guide'
[ https://issues.apache.org/jira/browse/HBASE-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5467: - Resolution: Fixed Status: Resolved (was: Patch Available) changing homepage to refer to 'reference guide' --- Key: HBASE-5467 URL: https://issues.apache.org/jira/browse/HBASE-5467 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: site_hbase_5467.patch index.xml site.xml Changing reference from book to reference guide on home page and left-nav. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5467) changing homepage to refer to 'reference guide'
[ https://issues.apache.org/jira/browse/HBASE-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5467: - Status: Patch Available (was: Open) changing homepage to refer to 'reference guide' --- Key: HBASE-5467 URL: https://issues.apache.org/jira/browse/HBASE-5467 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: site_hbase_5467.patch index.xml site.xml Changing reference from book to reference guide on home page and left-nav. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215066#comment-13215066 ] Gregory Chanan commented on HBASE-5317: --- I am unable to reproduce either on my machine or on a coworker's Mac. Is it possible for you to try it on another machine? Perhaps some (security?) setting on your machine makes mapreduce unable to open the required port or something? Fix TestHFileOutputFormat to work against hadoop 0.23 - Key: HBASE-5317 URL: https://issues.apache.org/jira/browse/HBASE-5317 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0, 0.94.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Attachments: HBASE-5317-v0.patch, HBASE-5317-v1.patch, HBASE-5317-v3.patch, HBASE-5317-v4.patch, HBASE-5317-v5.patch, HBASE-5317-v6.patch, TEST-org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.xml Running mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat yields this on 0.92: Failed tests: testColumnFamilyCompression(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): HFile for column family info-A not found Tests in error: test_TIMERANGE(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): /home/gchanan/workspace/apache92/target/test-data/276cbd0c-c771-4f81-9ba8-c464c9dd7486/test_TIMERANGE_present/_temporary/0/_temporary/_attempt_200707121733_0001_m_00_0 (Is a directory) testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable It looks like on trunk, this also results in an error: testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable I have a patch that fixes testColumnFamilyCompression and test_TIMERANGE, but haven't fixed the other 3 yet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215068#comment-13215068 ] Zhihong Yu commented on HBASE-5466: --- Thanks for the finding. {code} + }finally{ + if(metaTable!=null){ + metaTable.close(); + } {code} We use two spaces for indentation. Can you regenerate the patch? Refer to HBASE-3678. Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close even when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
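For reference, the quoted block reformatted to the requested two-space, spaced-keyword style:
{code}
} finally {
  if (metaTable != null) {
    metaTable.close();
  }
}
{code}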
[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215069#comment-13215069 ] Zhihong Yu commented on HBASE-5317: --- I will try again when I get home. I noticed weird connection issues at work - inability to use Colloquy, etc Fix TestHFileOutputFormat to work against hadoop 0.23 - Key: HBASE-5317 URL: https://issues.apache.org/jira/browse/HBASE-5317 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0, 0.94.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Attachments: HBASE-5317-v0.patch, HBASE-5317-v1.patch, HBASE-5317-v3.patch, HBASE-5317-v4.patch, HBASE-5317-v5.patch, HBASE-5317-v6.patch, TEST-org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.xml Running mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat yields this on 0.92: Failed tests: testColumnFamilyCompression(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): HFile for column family info-A not found Tests in error: test_TIMERANGE(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): /home/gchanan/workspace/apache92/target/test-data/276cbd0c-c771-4f81-9ba8-c464c9dd7486/test_TIMERANGE_present/_temporary/0/_temporary/_attempt_200707121733_0001_m_00_0 (Is a directory) testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable It looks like on trunk, this also results in an error: testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable I have a patch that fixes testColumnFamilyCompression and test_TIMERANGE, but haven't fixed the other 3 yet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5440) Allow import to optionally use HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215071#comment-13215071 ] Hadoop QA commented on HBASE-5440: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515813/5440.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -136 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestAtomicOperation org.apache.hadoop.hbase.coprocessor.TestClassLoading org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1027//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1027//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1027//console This message is automatically generated. Allow import to optionally use HFileOutputFormat Key: HBASE-5440 URL: https://issues.apache.org/jira/browse/HBASE-5440 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 5440.txt importtsv supports importing into a live table or generating HFiles for bulk load. Import should allow the same. Could even consider merging these tools into one (in principle the only difference is the parsing part - although that is maybe for a different jira). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215078#comment-13215078 ] Hadoop QA commented on HBASE-5357: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515810/Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 78 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -136 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestImportTsv Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1028//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1028//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1028//console This message is automatically generated. Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
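To make the proposal concrete for this issue, a hypothetical HColumnDescriptor builder usage in the same style as the HFile sketch above; the builder entry point and setter names below are illustrative, not the committed API:

{code:java}
// Hypothetical usage of the proposed builder; names are assumptions.
HColumnDescriptor family =
    HColumnDescriptor.getBuilder("info")   // assumed entry point
        .setMaxVersions(3)
        .setBlocksize(64 * 1024)
        .setBlockCacheEnabled(true)
        .build();
{code}

As the description argues, with each setter on its own line a cherry-pick that adds one parameter touches one line instead of rewriting a many-argument constructor call.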
[jira] [Updated] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashley Taylor updated HBASE-5466: - Attachment: MetaScanner_HBASE_5466(2).patch Patch regenerated. Thanks for the link to that JIRA task. Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466(2).patch, MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened, it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
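The shape of the fix, judging from the patch fragments quoted later in this thread, is to close the meta table in a finally block so the HConnection use count can drop back to zero. A sketch of that pattern (a fragment, not the exact committed code; the variable "configuration" is assumed to be the caller's Configuration):

{code:java}
HTable metaTable = null;
try {
  metaTable = new HTable(configuration, HConstants.META_TABLE_NAME);
  // ... scan .META. as MetaScanner does today ...
} finally {
  if (metaTable != null) {
    metaTable.close();  // releases the HConnection reference it took
  }
}
{code}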
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215083#comment-13215083 ] Phabricator commented on HBASE-5074: dhruba has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:73 Actually, a new checksum object is created by every invocation of ChecksumType.getChecksumObject(), so it should be thread-safe src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:120 doing it src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:164 will restructure the comment, this feature is switched on by default. REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
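For readers new to this thread, the gist of inline checksums is that data and checksum travel in the same block, so a single disk read fetches and verifies both. A toy illustration with java.util.zip.CRC32 (illustrative only, not the patch's actual checksum machinery):

{code:java}
import java.util.zip.CRC32;

public class InlineChecksumExample {
  // Store a CRC32 per chunk alongside the data itself, so the block and its
  // checksums live in the same file and cost a single seek to read.
  public static long checksumOf(byte[] data, int off, int len) {
    CRC32 crc = new CRC32();
    crc.update(data, off, len);
    return crc.getValue();
  }

  public static boolean verify(byte[] data, int off, int len, long expected) {
    return checksumOf(data, off, len) == expected;
  }
}
{code}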
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215080#comment-13215080 ] Jean-Daniel Cryans commented on HBASE-4365: --- Conclusion for the 1TB upload: Flush size: 512MB Split size: 20GB Without patch: 18012s With patch: 12505s It's 1.44x better, so a huge improvement. The difference here is due to the fact that it takes an awfully long time to split the first few regions without the patch. In the past I was starting the test with a smaller split size and then once I got a good distribution I was doing an online alter to set it to 20GB. Not anymore with this patch :) Another observation: the upload in general is slowed down by too many store files blocking. I could trace this to compactions taking a long time to get rid of reference files (3.5GB taking more than 10 minutes) and during that time you can hit the block multiple times. We really ought to see how we can optimize the compactions, consider compacting those big files in many threads instead of only one, and enable referencing reference files to skip some compactions altogether. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt A few of us were brainstorming this morning about what the default region size should be. There were a few general points made: - in some ways it's better to be too-large than too-small, since you can always split a table further, but you can't merge regions currently - with HFile v2 and multithreaded compactions there are fewer reasons to avoid very-large regions (10GB+) - for small tables you may want a small region size just so you can distribute load better across a cluster - for big tables, multi-GB is probably best -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
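For reference, the per-table settings J-D describes (20GB split size, 512MB flush size) can also be set at table-creation time rather than via an online alter. A sketch against the 0.92-era client API; the table and family names are examples:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateBigRegionTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor desc = new HTableDescriptor("TestTable");
    desc.addFamily(new HColumnDescriptor("info"));
    desc.setMaxFileSize(20L * 1024 * 1024 * 1024);   // 20GB split size
    desc.setMemStoreFlushSize(512L * 1024 * 1024);   // 512MB flush size
    admin.createTable(desc);
  }
}
{code}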
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215085#comment-13215085 ] Prakash Khemani commented on HBASE-5347: Lars, you are right. I have been trying to induce a full GC but without any success. (I can induce a full GC if I artificially hold some key-values in a queue and force them to be tenured.) On 89-fb, my test case is doing random increments on a space of slightly more than 40GB worth of key-value data. The heap is set to 36GB. The LRU cache has high and low watermarks of .98 and .85. The region server spawns 1000 threads that continuously do the increments. The eviction thread manages to keep the block cache at about 85% always. Cache-on-write is turned on to induce more cache churn. All 12 disks are pegged at close to 100% read. GC takes 60% of the CPU (sum of user times in 1000 lines of GC log / (elapsed time * #cpus)). Compactions that get started never complete while the load is on. I guess I have to change the dynamics of the test case to induce GC pauses. GC free memory management in Level-1 Block Cache Key: HBASE-5347 URL: https://issues.apache.org/jira/browse/HBASE-5347 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Assignee: Prakash Khemani Attachments: D1635.5.patch On eviction of a block from the block cache, instead of waiting for the garbage collector to reuse its memory, reuse the block right away. This will require us to keep reference counts on the HFile blocks. Once we have the reference counts in place we can do our own simple blocks-out-of-slab allocation for the block cache. This will help us with * reducing GC pressure, especially in the old generation * making it possible to have non-Java-heap memory backing the HFile blocks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
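To make the mechanism under discussion concrete, a minimal sketch of the reference-counting scheme the issue describes (illustrative only, not the D1635 patch): a cached block is recycled into the slab allocator only once every reader has released it.

{code:java}
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountedBlock {
  private final byte[] buf;
  private final AtomicInteger refCount = new AtomicInteger(1); // the cache's ref
  private final ConcurrentLinkedQueue<byte[]> slab;            // free list

  public RefCountedBlock(byte[] buf, ConcurrentLinkedQueue<byte[]> slab) {
    this.buf = buf;
    this.slab = slab;
  }

  public byte[] retain() {           // a reader takes a reference
    refCount.incrementAndGet();
    return buf;
  }

  public void release() {           // a reader finishes, or the block is evicted
    if (refCount.decrementAndGet() == 0) {
      slab.offer(buf);              // reuse the memory right away, no GC needed
    }
  }
}
{code}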
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215088#comment-13215088 ] Todd Lipcon commented on HBASE-4365: Great results! Very cool. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt A few of us were brainstorming this morning about what the default region size should be. There were a few general points made: - in some ways it's better to be too-large than too-small, since you can always split a table further, but you can't merge regions currently - with HFile v2 and multithreaded compactions there are fewer reasons to avoid very-large regions (10GB+) - for small tables you may want a small region size just so you can distribute load better across a cluster - for big tables, multi-GB is probably best -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5442) Use builder pattern in StoreFile and HFile
[ https://issues.apache.org/jira/browse/HBASE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215094#comment-13215094 ] Hadoop QA commented on HBASE-5442: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515717/HFile-StoreFile-builder-2012-02-22_22_49_00.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 73 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -135 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 153 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplicationPeer Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1029//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1029//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1029//console This message is automatically generated. Use builder pattern in StoreFile and HFile -- Key: HBASE-5442 URL: https://issues.apache.org/jira/browse/HBASE-5442 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1893.1.patch, D1893.2.patch, HFile-StoreFile-builder-2012-02-22_22_49_00.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses StoreFile and HFile refactoring. For HColumnDescriptor refactoring see HBASE-5357. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5407) Show the per-region level request/sec count in the web ui
[ https://issues.apache.org/jira/browse/HBASE-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215099#comment-13215099 ] Phabricator commented on HBASE-5407: Kannan has commented on the revision [jira][HBASE-5407][89-fb] Show the per-region level request/sec count in the web ui. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/HServerLoad.java:279 I think these fields need not/shouldn't be part of the serialize/deserialize -- otherwise, we'll break client-server interop for the getClusterStatus call. We temporarily ran into a similar problem before, and the relevant stack is: at java.io.DataInputStream.readFully(DataInputStream.java:152) at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:211) at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:512) at org.apache.hadoop.hbase.HServerInfo.readFields(HServerInfo.java:228) at org.apache.hadoop.hbase.ClusterStatus.readFields(ClusterStatus.java:226) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:489) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readFields(HbaseObjectWritable.java:230) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:534) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:459) java.lang.reflect.UndeclaredThrowableException at $Proxy0.getClusterStatus(Unknown Source) at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:1076) We should remove these from serialize/deserialize, but the metrics should still be available for viewing in the web UI. REVISION DETAIL https://reviews.facebook.net/D1779 BRANCH regionRequest Show the per-region level request/sec count in the web ui - Key: HBASE-5407 URL: https://issues.apache.org/jira/browse/HBASE-5407 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D1779.1.patch, D1779.1.patch, D1779.1.patch It would be nice to show the per-region level request/sec count in the web ui, especially when debugging the hot region problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
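A sketch of the interop-safe approach Kannan describes: the new counters stay out of the Writable wire format, so older clients can still deserialize the payload, but remain readable for the web UI through a plain getter. The field names here are illustrative, not the actual HServerLoad members:

{code:java}
public class RegionLoadExample implements org.apache.hadoop.io.Writable {
  private int storefiles;                 // pre-existing, serialized field
  private transient long requestsPerSec;  // new metric, NOT serialized

  public long getRequestsPerSecond() {    // still available to the web UI
    return requestsPerSec;
  }

  @Override
  public void write(java.io.DataOutput out) throws java.io.IOException {
    out.writeInt(storefiles);             // wire format unchanged
    // requestsPerSec deliberately omitted to preserve interop
  }

  @Override
  public void readFields(java.io.DataInput in) throws java.io.IOException {
    storefiles = in.readInt();
  }
}
{code}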
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215108#comment-13215108 ] Phabricator commented on HBASE-5074: dhruba has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 This constructor is used only for V2, hence the major number is not a parameter. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1235 I think there won't be any changes to the number of threads in the datanode. A datanode thread is not tied up with a client FileSystem object. Instead, a global pool of threads in the datanode is free to serve any read requests from any client src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1244 The minor version indicates disk-format changes inside an HFileBlock. The major version indicates disk-format changes within an entire HFile. Since the AbstractFSReader only reads HFileBlocks, it is logical that it contains the minorVersion, is it not? But I can put the majorVersion in it as well, if you so desire. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1584 Yes, the default is to enable hbase-checksum verification. And you are right that if the hfile is of the older type, then we will quickly flip this back to false (in the next line) src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1630 I think we should keep both streams active till the HFile itself is closed. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1646 done src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:961 Yes, precisely. Going forward, I would like to see if we can make HLogs go to a filesystem object that is different from the filesystem used for hfiles. src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java:86 I agree with you completely. This is an interface that should not change often. REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
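The version-gating behavior discussed around HFileBlock.java:1584 amounts to something like the following sketch; the constant name and value are assumptions, not the patch's actual code:

{code:java}
public class ChecksumGate {
  static final int MINOR_VERSION_WITH_CHECKSUM = 1; // assumed constant

  // Checksums default to on, but blocks written before the new minor
  // version fall back to HDFS-level checksum verification.
  static boolean useHBaseChecksum(int minorVersion, boolean enabledByConf) {
    return enabledByConf && minorVersion >= MINOR_VERSION_WITH_CHECKSUM;
  }
}
{code}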
[jira] [Updated] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5074: --- Attachment: D1521.9.patch dhruba updated the revision [jira] [HBASE-5074] Support checksums in HBase block cache. Reviewers: mbautin Pulled in review comments from Stack and Ted. REVISION DETAIL https://reviews.facebook.net/D1521 AFFECTED FILES src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java src/main/java/org/apache/hadoop/hbase/HConstants.java src/main/java/org/apache/hadoop/hbase/fs src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/main/java/org/apache/hadoop/hbase/regionserver/Store.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 
Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215115#comment-13215115 ] Phabricator commented on HBASE-5074: mbautin has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. @dhruba: going through the diff once again. Since you've updated the revision, submitting existing comments against the previous version, and continuing with the new version. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:131 Misspelling: Minimun - Minimum src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:44-45 Can these two be made final too? src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:145 s/chuck/chunk/ src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:48 Fix javadoc: do do - do src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:38 Make this final, rename to DUMMY_VALUE, because this is a constant, and make the length a factor of 16 to take advantage of alignment. src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java:532 s/manor/major/ src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:157 This comment is misleading. This is not something that defaults to the 16 K, but the default value itself. I think this should say something about how a non-default value is specified. src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:265-271 The additional constructor should not be needed when https://issues.apache.org/jira/browse/HBASE-5442 goes in. src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:409 Is it possible to obtain the filesystem from the input stream rather than pass it as an additional parameter? Or is the underlying filesystem of the input stream a regular one, as opposed to an HFileSystem? REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215127#comment-13215127 ] Hadoop QA commented on HBASE-5466: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515823/MetaScanner_HBASE_5466%282%29.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -136 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestImportTsv Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1030//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1030//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1030//console This message is automatically generated. Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466(2).patch, MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened, it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215129#comment-13215129 ] Hadoop QA commented on HBASE-5466: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515823/MetaScanner_HBASE_5466%282%29.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -136 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.TestZooKeeper Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1031//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1031//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1031//console This message is automatically generated. Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466(2).patch, MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened, it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5468) [book] updating copyright in Reference Guide
[book] updating copyright in Reference Guide Key: HBASE-5468 URL: https://issues.apache.org/jira/browse/HBASE-5468 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5468.xml.patch book.xml updating copyright to 2012 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5468) [book] updating copyright in Reference Guide
[ https://issues.apache.org/jira/browse/HBASE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5468: - Status: Patch Available (was: Open) [book] updating copyright in Reference Guide Key: HBASE-5468 URL: https://issues.apache.org/jira/browse/HBASE-5468 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5468.xml.patch book.xml updating copyright to 2012 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5468) [book] updating copyright in Reference Guide
[ https://issues.apache.org/jira/browse/HBASE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5468: - Attachment: book_hbase_5468.xml.patch [book] updating copyright in Reference Guide Key: HBASE-5468 URL: https://issues.apache.org/jira/browse/HBASE-5468 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5468.xml.patch book.xml updating copyright to 2012 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5468) [book] updating copyright in Reference Guide
[ https://issues.apache.org/jira/browse/HBASE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5468: - Resolution: Fixed Status: Resolved (was: Patch Available) [book] updating copyright in Reference Guide Key: HBASE-5468 URL: https://issues.apache.org/jira/browse/HBASE-5468 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5468.xml.patch book.xml updating copyright to 2012 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215137#comment-13215137 ] stack commented on HBASE-5466: -- +1 on patch (except for the spacing that is not like the rest of the file) Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466(2).patch, MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection only closes the ZooKeeper connection once every caller that obtained the connection has closed it; the HConnection class tracks this with incCount and decCount. When a table is opened, it makes a call to the MetaScanner class, which opens a connection to the meta table; this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close when we close all the tables or call HConnectionManager.deleteConnection(config, true). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215140#comment-13215140 ] Zhihong Yu commented on HBASE-5466: --- TestZooKeeper passed locally with patch v2.
{code}
+ }finally{
+if(metaTable!=null){
{code}
There should be a space between } and finally, between finally and {, between if and (, and between ) and {. Overall, +1 on patch v2. Please fix the formatting in v3. Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466(2).patch, MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection only closes the ZooKeeper connection once every caller that obtained the connection has closed it; the HConnection class tracks this with incCount and decCount. When a table is opened, it makes a call to the MetaScanner class, which opens a connection to the meta table; this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close when we close all the tables or call HConnectionManager.deleteConnection(config, true). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
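For reference, the whitespace being asked for would make the quoted lines read as follows (formatting only; the logic from the patch excerpt is unchanged):
{code}
} finally {
  if (metaTable != null) {
{code}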
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215149#comment-13215149 ] Phabricator commented on HBASE-5074: tedyu has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 What will happen after HFileV3 is introduced? I would expect HFileV3 to start with a minorVersion of 0.
src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:961 HLog goes to fs on SSD? Nice.
REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
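The question matters because HBASE-5074 gates checksum support on a block-level minor version rather than a new major HFile version. A hedged sketch of that gating (the constant name and value here are assumptions, not quoted from the patch):
{code:java}
// Blocks written before this minor version carry no HBase-level checksums.
static final int MINOR_VERSION_WITH_CHECKSUM = 1;  // assumed value

boolean blockHasChecksums(int minorVersion) {
  // Whether a future HFileV3 restarts minor versions at 0, as suggested
  // above, or keeps counting is exactly the ambiguity being raised.
  return minorVersion >= MINOR_VERSION_WITH_CHECKSUM;
}
{code}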
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215150#comment-13215150 ] Phabricator commented on HBASE-5074: stack has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. Good w/ your comebacks Dhruba... just a minor one below for your next rev. Let us know how the cluster testing goes. This patch applies fine. Might try it out over here too... INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 I don't understand. I think this means the fact that we have a minor version unaccompanied by a major version needs documenting here in a comment? No hurry.
REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215153#comment-13215153 ] Phabricator commented on HBASE-5074: mbautin has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. @dhruba: some more comments inline. INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:451-452 Assign headerSize() to a local variable instead of calling it twice.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:529-530 Call headerSize() once and store it in a local variable.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1232 "do do" -> "do"
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1642-1644 Store and reuse part of the previous error message.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1636 Check if WARN level messages are enabled and only generate the message string in that case.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1848 double semicolon (does not matter)
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:424 What if istream != istreamNoFsChecksum but istreamNoFsChecksum == null?
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:3610-3612 Not sure how this is related to HBase-level checksum checking.
src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:265 Make this conf key a constant in HConstants.
src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:275 conf key -> HConstants
src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:40-43 This is unnecessary because the default toString would do the same.
src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:57-60 This is unnecessary because the default toString would do the same.
src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:103-106 This is unnecessary because the default toString would do the same.
src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:143-144 It looks like toString would do this.
src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:179 Would not the built-in enum method valueOf do what this function is doing?
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:1 This file still seems to contain a lot of copy-and-paste from TestHFileBlock. Are you planning to address that?
REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers. -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
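Several of these are recurring review patterns rather than HFile specifics. Hedged fragments illustrating the three most general ones (variable names are illustrative, not the actual HFileBlock/ChecksumType code):
{code:java}
// 1. Call headerSize() once and reuse the result instead of invoking it twice.
int hdrSize = headerSize();
buffer.position(hdrSize);
int dataSize = buffer.limit() - hdrSize;

// 2. Guard expensive log-message construction behind the level check.
if (LOG.isWarnEnabled()) {
  LOG.warn("Checksum failure reading " + path + " at offset " + offset);
}

// 3. Enums already provide toString() and valueOf(); hand-rolled versions
//    that just return or look up the constant's name are redundant.
ChecksumType type = ChecksumType.valueOf("CRC32");
{code}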
[jira] [Updated] (HBASE-5415) FSTableDescriptors should handle random folders in hbase.root.dir better
[ https://issues.apache.org/jira/browse/HBASE-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-5415: -- Attachment: HBASE-5415.patch This patch handles it by just printing a WARN; the side effect is that this method doesn't throw TableExistsException anymore (which didn't make sense anyway), so I cleaned up a bunch of code. FSTableDescriptors should handle random folders in hbase.root.dir better Key: HBASE-5415 URL: https://issues.apache.org/jira/browse/HBASE-5415 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.1, 0.94.0 Attachments: HBASE-5415.patch I faked an upgrade on a test cluster using our dev data, so I had to distcp the data between the two clusters, but after starting up and doing the migration and whatnot the web UI didn't show any table. The reason was in the master's log:
{quote}
org.apache.hadoop.hbase.TableExistsException: No descriptor for _distcp_logs_e0ehek
at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:164)
at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:182)
at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
{quote}
I don't think we need to show a full stack trace (just a WARN, maybe); this shouldn't kill the request (we should still see tables in the web UI); and why is that a TableExistsException? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
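A sketch of the behavior change described above, assuming the rough shape of FSTableDescriptors.getAll() (names and helpers here are illustrative, not the literal patch):
{code:java}
// Junk directories under hbase.rootdir (e.g. distcp's _distcp_logs_*) now
// produce a WARN and are skipped instead of aborting the whole listing
// with a TableExistsException.
Map<String, HTableDescriptor> htds = new TreeMap<String, HTableDescriptor>();
for (FileStatus dir : fs.listStatus(rootdir)) {
  HTableDescriptor htd = get(dir.getPath().getName());
  if (htd == null) {
    LOG.warn("Skipping " + dir.getPath() + "; no .tableinfo found, "
        + "probably not a table directory");
    continue;
  }
  htds.put(htd.getNameAsString(), htd);
}
{code}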
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215158#comment-13215158 ] Phabricator commented on HBASE-5357: tedyu has accepted the revision [jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor. I see 'This diff has Lint Problems.' because of Lint being skipped. REVISION DETAIL https://reviews.facebook.net/D1851 BRANCH hcd_builder2 Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g.
{code:java}
HFileWriter w = HFile.getWriterBuilder(conf, some common args)
    .setParameter1(value1)
    .setParameter2(value2)
    ...
    .build();
{code}
Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
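Applied to HColumnDescriptor itself, usage might look like the following; the builder entry point and setter names here are hypothetical, sketched in the same spirit as the HFile example in the description:
{code:java}
HColumnDescriptor family =
    new HColumnDescriptor.Builder(Bytes.toBytes("cf"))  // hypothetical Builder
        .setMaxVersions(3)
        .setBlockCacheEnabled(true)
        .setCompressionType(Compression.Algorithm.LZO)
        .build();
{code}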
[jira] [Commented] (HBASE-5437) HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy
[ https://issues.apache.org/jira/browse/HBASE-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215161#comment-13215161 ] Zhihong Yu commented on HBASE-5437: --- Integrated to TRUNK. Thanks for the patch, Scott. Thanks for the review, Stack. HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy --- Key: HBASE-5437 URL: https://issues.apache.org/jira/browse/HBASE-5437 Project: HBase Issue Type: Bug Components: metrics, thrift Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.94.0 Attachments: HBASE-5437.D1857.1.patch, HBASE-5437.D1887.1.patch, HBASE-5437.D1887.2.patch 3.facebook.com,60020,1329865516120: Initialization of RS failed. Hence aborting RS.
java.lang.ClassCastException: $Proxy9 cannot be cast to org.apache.hadoop.hbase.thrift.generated.Hbase$Iface
at org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.newInstance(HbaseHandlerMetricsProxy.java:47)
at org.apache.hadoop.hbase.thrift.ThriftServerRunner.init(ThriftServerRunner.java:239)
at org.apache.hadoop.hbase.regionserver.HRegionThriftServer.init(HRegionThriftServer.java:74)
at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeThreads(HRegionServer.java:646)
at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:546)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:658)
at java.lang.Thread.run(Thread.java:662)
2012-02-21 15:05:18,749 FATAL org.apache.hadoop.h -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
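The underlying cause of the ClassCastException is a property of java.lang.reflect.Proxy: the generated proxy implements only the interfaces passed to newProxyInstance, so the later cast to Hbase.Iface fails if that interface is not in the list. An illustrative fragment (the invocation-handler argument is assumed, not the actual HbaseHandlerMetricsProxy code):
{code:java}
// The proxy must be created against Hbase.Iface for the cast to succeed;
// a wrong or missing interface list yields exactly the CCE in the log above.
Hbase.Iface proxy = (Hbase.Iface) Proxy.newProxyInstance(
    handler.getClass().getClassLoader(),
    new Class<?>[] { Hbase.Iface.class },
    invocationHandler);
{code}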
[jira] [Commented] (HBASE-5415) FSTableDescriptors should handle random folders in hbase.root.dir better
[ https://issues.apache.org/jira/browse/HBASE-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215164#comment-13215164 ] stack commented on HBASE-5415: -- What's the difference between miscellaneous dirs under hbase.rootdir and an actual table directory that is missing its .tableinfo file? Are we changing our API when we remove TEE from public methods? FSTableDescriptors should handle random folders in hbase.root.dir better Key: HBASE-5415 URL: https://issues.apache.org/jira/browse/HBASE-5415 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.1, 0.94.0 Attachments: HBASE-5415.patch I faked an upgrade on a test cluster using our dev data, so I had to distcp the data between the two clusters, but after starting up and doing the migration and whatnot the web UI didn't show any table. The reason was in the master's log:
{quote}
org.apache.hadoop.hbase.TableExistsException: No descriptor for _distcp_logs_e0ehek
at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:164)
at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:182)
at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
{quote}
I don't think we need to show a full stack trace (just a WARN, maybe); this shouldn't kill the request (we should still see tables in the web UI); and why is that a TableExistsException? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215166#comment-13215166 ] Mikhail Bautin commented on HBASE-5357: --- Re-ran failed unit tests locally:
Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 211.925 sec
Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 81.352 sec
Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 105.09 sec
Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 68.055 sec
Results : Tests run: 19, Failures: 0, Errors: 0, Skipped: 0
Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g.
{code:java}
HFileWriter w = HFile.getWriterBuilder(conf, some common args)
    .setParameter1(value1)
    .setParameter2(value2)
    ...
    .build();
{code}
Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool
Add baseline compression efficiency to DataBlockEncodingTool Key: HBASE-5469 URL: https://issues.apache.org/jira/browse/HBASE-5469 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor DataBlockEncodingTool currently does not provide baseline compression efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if we are using LZO to compress blocks, we would like to have the following columns in the report (possibly as percentages of raw data size). Baseline K+V in blockcache | Baseline K + V on disk (LZO compressed) | K + V DataBlockEncoded in block cache | K + V DataBlockEncoded + LZOCompressed (on disk) Background: we never store compressed blocks in cache, but we always store encoded data blocks in cache if data block encoding is enabled for the column family. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5415) FSTableDescriptors should handle random folders in hbase.root.dir better
[ https://issues.apache.org/jira/browse/HBASE-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215175#comment-13215175 ] Jean-Daniel Cryans commented on HBASE-5415: ---
bq. What's the difference between miscellaneous dirs under hbase.rootdir and an actual table directory that is missing its .tableinfo file?
The former's HTD is null; the latter gets an FNFE.
bq. Are we changing our API when we remove TEE from public methods?
Technically no; TEE (and FNFE, FWIW) are both IOEs, so there's no change there. I removed TEE specifically because it isn't thrown anymore.
FSTableDescriptors should handle random folders in hbase.root.dir better Key: HBASE-5415 URL: https://issues.apache.org/jira/browse/HBASE-5415 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.1, 0.94.0 Attachments: HBASE-5415.patch I faked an upgrade on a test cluster using our dev data, so I had to distcp the data between the two clusters, but after starting up and doing the migration and whatnot the web UI didn't show any table. The reason was in the master's log:
{quote}
org.apache.hadoop.hbase.TableExistsException: No descriptor for _distcp_logs_e0ehek
at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:164)
at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:182)
at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
{quote}
I don't think we need to show a full stack trace (just a WARN, maybe); this shouldn't kill the request (we should still see tables in the web UI); and why is that a TableExistsException? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215176#comment-13215176 ] stack commented on HBASE-4365: -- @Lars You want to put an upper bound on the number of regions? I think if we do power of three, we'll lose some of the benefit J-D sees above; we'll fan out the regions slower. Do you want to put an upper bound on the number of regions per regionserver for a table? Say, three? As in, when we get to three regions on a server, just scoot the split size up to the maximum. So, given a power of two, we'd split on first flush, then the next split would happen at 2*2*128M = 512M, then at 3*3*128M ≈ 1.1G, and thereafter we'd split at the max, say 10G? Or should we just commit this for now and do more in another patch? Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt A few of us were brainstorming this morning about what the default region size should be. There were a few general points made:
- in some ways it's better to be too large than too small, since you can always split a table further, but you can't merge regions currently
- with HFile v2 and multithreaded compactions there are fewer reasons to avoid very large regions (10GB+)
- for small tables you may want a small region size just so you can distribute load better across a cluster
- for big tables, multi-GB is probably best
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
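The arithmetic being walked through is a squared-count heuristic: the split size grows with the square of the number of the table's regions on the server, times the flush size, capped at the configured maximum. A minimal sketch under those assumptions (names are illustrative, not the committed code):
{code:java}
// 1 region  -> 1*1*128M = 128M (split roughly on first flush)
// 2 regions -> 2*2*128M = 512M
// 3 regions -> 3*3*128M = 1152M (~1.1G)
// ... thereafter capped at maxFileSize, e.g. 10G.
long splitSizeFor(int regionCount, long flushSize, long maxFileSize) {
  long size = (long) regionCount * regionCount * flushSize;
  return Math.min(size, maxFileSize);
}
{code}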