[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4218:
--

Attachment: Delta-encoding-2012-01-25_00_45_29.patch

Submitting for Jenkins testing. This corresponds to the latest patch on 
Phabricator: https://reviews.facebook.net/D447?vs=&id=4407&whitespace=ignore-all


 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, 4218-2012-01-14.txt, 4218-v16.txt, 4218.txt, 
 D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, 
 D447.14.patch, D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, 
 D447.19.patch, D447.2.patch, D447.20.patch, D447.21.patch, D447.22.patch, 
 D447.23.patch, D447.24.patch, D447.25.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding-2012-01-17_11_09_09.patch, 
 Delta-encoding-2012-01-25_00_45_29.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta-encoding.patch-2012-01-05_15_16_43.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
 Delta-encoding.patch-2012-01-05_18_50_47.patch, 
 Delta-encoding.patch-2012-01-07_14_12_48.patch, 
 Delta-encoding.patch-2012-01-13_12_20_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in the HFile and they are usually very 
 similar, so it is possible to design a better compression than general-purpose 
 algorithms.
 It is an additional step designed to be used in memory. It aims to save 
 memory in the cache as well as to speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) 
 show that a decent level of compression can be achieved:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while offering much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs, and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to the HFileBlock / HFileReader scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or that some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression
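 The core prefix-compression idea can be sketched in a few lines of plain Java. 
 This is only a toy illustration (class and method names are made up for the 
 example), not the actual HBase encoder, whose block format also handles 
 timestamps, types, and memstore timestamps:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of prefix (delta) encoding over sorted keys: each key is
// stored as (length of prefix shared with the previous key, remaining suffix).
public class PrefixEncodingSketch {

    static List<String[]> encode(List<String> sortedKeys) {
        List<String[]> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            // Find how many leading characters this key shares with the previous one.
            int shared = 0;
            int max = Math.min(prev.length(), key.length());
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            out.add(new String[] { Integer.toString(shared), key.substring(shared) });
            prev = key;
        }
        return out;
    }

    static List<String> decode(List<String[]> encoded) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String[] entry : encoded) {
            // Rebuild the key from the previous decoded key plus the stored suffix.
            String key = prev.substring(0, Integer.parseInt(entry[0])) + entry[1];
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("row0001/cf:a", "row0001/cf:b", "row0002/cf:a");
        // "row0001/cf:b" shares 11 characters with its predecessor, so only "b" is stored.
        if (!decode(encode(keys)).equals(keys)) throw new AssertionError("round trip");
    }
}
```

 Because sorted neighbors share long prefixes, most of each key collapses to a 
 small length field plus a short suffix, which is where the ~92% key 
 compression quoted above plausibly comes from.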

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192959#comment-13192959
 ] 

Hadoop QA commented on HBASE-4218:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511817/Delta-encoding-2012-01-25_00_45_29.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 189 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 161 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.io.hfile.TestHFileBlock
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/851//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/851//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/851//console

This message is automatically generated.

[jira] [Commented] (HBASE-5276) PerformanceEvaluation does not set the correct classpath for MR because it lives in the test jar

2012-01-25 Thread Tim Robertson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193021#comment-13193021
 ] 

Tim Robertson commented on HBASE-5276:
--

Lars G. points out that Stack fixed this before in 
https://github.com/apache/hbase/commit/e3f165f8f7327af53427a35f74a450b4df179ccc, 
but seemingly it didn't make it into CDH3u2.

 PerformanceEvaluation does not set the correct classpath for MR because it 
 lives in the test jar
 

 Key: HBASE-5276
 URL: https://issues.apache.org/jira/browse/HBASE-5276
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.90.4
Reporter: Tim Robertson
Priority: Minor

 Note: This was discovered running the CDH version hbase-0.90.4-cdh3u2
 Running the PerformanceEvaluation as follows:
   $HADOOP_HOME/bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation scan 5
 fails because the MR tasks do not get the HBase jar on the CP, and thus hit 
 ClassNotFoundExceptions.
 The job gets the following only:
   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/hbase-0.90.4-cdh3u2-tests.jar
   
 file:/Users/tim/dev/hadoop/hadoop-0.20.2-cdh3u2/hadoop-core-0.20.2-cdh3u2.jar
   
 file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/zookeeper-3.3.3-cdh3u2.jar
 The RowCounter etc all work because they live in the HBase jar, not the test 
 jar, and they get the following 
   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/guava-r06.jar
   
 file:/Users/tim/dev/hadoop/hadoop-0.20.2-cdh3u2/hadoop-core-0.20.2-cdh3u2.jar
   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/hbase-0.90.4-cdh3u2.jar
   
 file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/zookeeper-3.3.3-cdh3u2.jar
 Presumably this relates to 
   job.setJarByClass(PerformanceEvaluation.class);
   ...
   TableMapReduceUtil.addDependencyJars(job);
 A (cowboy) workaround to run PE is to unpack the jars and copy the 
 PerformanceEvaluation* classes over, building a patched jar.
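 The classpath symptom follows from how job.setJarByClass(...) works: Hadoop 
 ships whichever jar physically contains the given class, so a driver class 
 living in the tests jar drags only the tests jar (not the main HBase jar) 
 onto the task classpath. A dependency-free sketch of that lookup (the class 
 name here is illustrative, not Hadoop API):

```java
// Illustrates the jar-resolution mechanism behind job.setJarByClass(...):
// ask the classloader which code source (jar or directory) a class came from.
public class JarLocationSketch {
    static String locationOf(Class<?> clazz) {
        java.security.CodeSource src = clazz.getProtectionDomain().getCodeSource();
        // JDK core classes are loaded by the bootstrap loader and have no code source.
        return src == null ? "(bootstrap)" : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Prints the jar or directory this class was loaded from; for
        // PerformanceEvaluation this would be the *-tests.jar.
        System.out.println(locationOf(JarLocationSketch.class));
    }
}
```

 This is why RowCounter and friends work: resolving their location yields the 
 main HBase jar, which then reaches the MR tasks.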





[jira] [Commented] (HBASE-2965) Implement MultipleTableInputs which is analogous to MultipleInputs in Hadoop

2012-01-25 Thread Alexey Romanenko (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193027#comment-13193027
 ] 

Alexey Romanenko commented on HBASE-2965:
-

It seems this is not implemented yet, isn't it?

 Implement MultipleTableInputs which is analogous to MultipleInputs in Hadoop
 

 Key: HBASE-2965
 URL: https://issues.apache.org/jira/browse/HBASE-2965
 Project: HBase
  Issue Type: New Feature
  Components: mapred, mapreduce
Reporter: Adam Warrington
Assignee: Ophir Cohen
Priority: Minor

 This feature would be helpful for doing reduce side joins, or even passing 
 similarly structured data from multiple tables through map reduce. The API I 
 envision would be very similar to the already existent MultipleInputs, parts 
 of which could be reused.
 MultipleTableInputs would have a public api like:
 class MultipleTableInputs {
   public static void addInputTable(Job job, Table table, Scan scan,
       Class<? extends TableInputFormatBase> inputFormatClass,
       Class<? extends Mapper> mapperClass);
 };
 MultipleTableInputs would build a mapping of Tables to configured 
 TableInputFormats the same way MultipleInputs builds a mapping between Paths 
 and InputFormats. Since most people will probably use TableInputFormat.class 
 as the input format class, the MultipleTableInput implementation will have to 
 replace the TableInputFormatBase's private scan and table members that are 
 configured when an instance of TableInputFormat is created (from within its 
 setConf() method) by calling setScan and setHTable with the table and scan 
 that are passed into addInputTable above. MultipleTableInputFormat's 
 addInputTable() member function would also set the input format for the job 
 to DelegatingTableInputFormat, described below.
 A new class called DelegatingTableInputFormat would be analogous to 
 DelegatingInputFormat, where getSplits() would return TaggedInputSplits (same 
 TaggedInputSplit object that the Hadoop DelegatingInputFormat uses), which 
 tag the split with its InputFormat and Mapper. These are created by looping 
 through the HTable to InputFormat mappings, and calling getSplits on each 
 input format, and using the split, the input format, and mapper as 
 constructor args to TaggedInputSplits.
 The createRecordReader() function in DelegatingTableInputFormat could have 
 the same implementation as the Hadoop DelegatingInputFormat.
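 The delegation scheme proposed above (a registry of inputs, with every split 
 tagged by the format and mapper that should process it) can be sketched 
 without any Hadoop dependencies. All names below are illustrative stand-ins, 
 not actual HBase or Hadoop API:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy sketch of the delegating pattern: register (source -> format, mapper)
// pairs, then tag each split produced by a source with its handler, the way
// DelegatingInputFormat wraps splits in TaggedInputSplit.
public class DelegationSketch {
    static class TaggedSplit {
        final String split, inputFormat, mapper;
        TaggedSplit(String split, String inputFormat, String mapper) {
            this.split = split; this.inputFormat = inputFormat; this.mapper = mapper;
        }
    }

    private final Map<String, String[]> registry = new LinkedHashMap<>();

    void addInput(String table, String inputFormat, String mapper) {
        registry.put(table, new String[] { inputFormat, mapper });
    }

    // Analogous to getSplits(): loop over the registered sources and tag
    // each source's splits with the format and mapper that own them.
    List<TaggedSplit> getSplits() {
        List<TaggedSplit> splits = new ArrayList<>();
        for (Map.Entry<String, String[]> e : registry.entrySet()) {
            splits.add(new TaggedSplit(e.getKey() + "-split-0",
                    e.getValue()[0], e.getValue()[1]));
        }
        return splits;
    }

    public static void main(String[] args) {
        DelegationSketch d = new DelegationSketch();
        d.addInput("users", "TableInputFormat", "UserMapper");
        d.addInput("events", "TableInputFormat", "EventMapper");
        System.out.println(d.getSplits().size());
    }
}
```

 At task time, the record reader would inspect the tag on its split and 
 instantiate the matching format and mapper, exactly as the Hadoop 
 DelegatingInputFormat/DelegatingMapper pair does for file inputs.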





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-25 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193119#comment-13193119
 ] 

jirapos...@reviews.apache.org commented on HBASE-5128:
--



bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 91
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line91
bq.  
bq.   I think '.META.' should be used.

ok


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 118
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line118
bq.  
bq.   Should read 'that it was assigned to'

ok


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 154
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line154
bq.  
bq.   This is about fixing region assignment, right ?
bq.   Better include that in javadoc.

done


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 121
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line121
bq.  
bq.   Should read 'repairs require hbase ...'
bq.   
bq.   'to' at the end is not needed.

done


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 172
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line172
bq.  
bq.   Should read ' and correct '

done


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 174
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line174
bq.  
bq.   Would regionInfoMap be a better name ?

done


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 270
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line270
bq.  
bq.   Please correct this sentence's syntax.

sure


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 280
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line280
bq.  
bq.   We should impose maximum number of iterations for the loop, right ?

good point.


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 287
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line287
bq.  
bq.   Should read 'method requires cluster to be online ...'

done.


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 289
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line289
bq.  
bq.   Should read ' to be consistent'

reworded


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 337
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line337
bq.  
bq.   Should be called checkAndFixIntegrity()

ok.


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 334
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line334
bq.  
bq.   Should be called checkAndFixConsistency()

ok


bq.  On 2012-01-14 00:15:01, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 343
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line343
bq.  
bq.   This sentence can be omitted.
bq.   If you keep it, please move it after the @return line.

removed


- jmhsieh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3435/#review4379
---


On 2012-01-13 22:49:33, jmhsieh wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3435/
bq.  ---
bq.  
bq.  (Updated 2012-01-13 22:49:33)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, and 
Jean-Daniel Cryans.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  I'm posting a preliminary version that I'm currently testing on real 
clusters. The tests are flakey on the 0.90 branch (so there is something async 
that I didn't synchronize properly), and there are a few more TODO's I want to 
knock out before this is ready for full review to be considered for committing. 
It's got some problems I need some advice figuring out.
bq.  
bq.  Problem 1:
bq.  
bq.  In the unit tests, I have a few cases where I fabricate new regions and 
try to 

[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-25 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193121#comment-13193121
 ] 

jirapos...@reviews.apache.org commented on HBASE-5128:
--



bq.  On 2012-01-11 21:15:13, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 1586
bq.   https://reviews.apache.org/r/3435/diff/1/?file=67172#file67172line1586
bq.  
bq.   Should be 'to end key'.

update this and handful of other comments.


bq.  On 2012-01-11 21:15:13, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 1594
bq.   https://reviews.apache.org/r/3435/diff/1/?file=67172#file67172line1594
bq.  
bq.   Should insert some text between newRegion and region.

updated


bq.  On 2012-01-11 21:15:13, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 1600
bq.   https://reviews.apache.org/r/3435/diff/1/?file=67172#file67172line1600
bq.  
bq.   This should be outside the for loop.

done


bq.  On 2012-01-11 21:15:13, Ted Yu wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 1602
bq.   https://reviews.apache.org/r/3435/diff/1/?file=67172#file67172line1602
bq.  
bq.   Space between  and 0.

done


- jmhsieh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3435/#review4317
---


On 2012-01-13 22:49:33, jmhsieh wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3435/
bq.  ---
bq.  
bq.  (Updated 2012-01-13 22:49:33)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, and 
Jean-Daniel Cryans.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  I'm posting a preliminary version that I'm currently testing on real 
clusters. The tests are flakey on the 0.90 branch (so there is something async 
that I didn't synchronize properly), and there are a few more TODO's I want to 
knock out before this is ready for full review to be considered for committing. 
It's got some problems I need some advice figuring out.
bq.  
bq.  Problem 1:
bq.  
bq.  In the unit tests, I have a few cases where I fabricate new regions and 
try to force the overlapping regions to be closed. For some of these, I cannot 
delete a table after it is repaired without causing subsequent tests to fail. I 
think this is due to a few things:
bq.  
bq.  1) The disable table handler uses in-memory assignment manager state while 
delete uses in META assignment information.
bq.  2) Currently I'm using the sneaky closeRegion that purposely doesn't go 
through the master and in turn doesn't modify in-memory state -- disable uses 
out-of-date in-memory region assignments. If I use the unassign method, it 
sends RIT transitions to the master, but that ends up attempting to assign the 
region again, causing timing/transient states.
bq.  
bq.  What is a good way to clear the HMaster's assignment manager's assignment 
data for particular regions or to force it to re-read from META? (without 
modifying the 0.90 HBase's it is meant to repair).
bq.  
bq.  Problem 2:
bq.  
bq.  Sometimes tests fail reporting HOLE_IN_REGION_CHAIN and 
SERVER_DOES_NOT_MATCH_META. This means the old and new regions are confused 
with each other and basically something is still happening asynchronously. I 
think this is because the new region is being assigned and is still 
transitioning. Sound about right? To make the unit test deterministic, should 
hbck wait for these to settle or should just the unit test wait?
bq.  
bq.  
bq.  This addresses bug HBASE-5128.
bq.  https://issues.apache.org/jira/browse/HBASE-5128
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c56b3a6 
bq.src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 
330a7cc 
bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java 3c7b68d 
bq.src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 6d3401d 
bq.src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java a3d8b8b 
bq.src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java 
29e8bb2 
bq.
src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java 
PRE-CREATION 
bq.src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java a640d57 
bq.src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java 
dbb97f8 
bq.
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 
3e8729d 
bq.
src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java 
11a1151 
bq.

[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-25 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193120#comment-13193120
 ] 

jirapos...@reviews.apache.org commented on HBASE-5128:
--



bq.  On 2012-01-14 05:43:38, Lars Hofhansl wrote:
bq.   src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java, 
line 586
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68919#file68919line586
bq.  
bq.   I liked this better before :)

I probably broke this out to be easier to step debug.   I can restore.


bq.  On 2012-01-14 05:43:38, Lars Hofhansl wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java, line 154
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68922#file68922line154
bq.  
bq.   No wait in case of exception. Is that by design?

nice catch. 


bq.  On 2012-01-14 05:43:38, Lars Hofhansl wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 1083
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line1083
bq.  
bq.   I think you said in the intro, that you need to check the 
availability of this rpc.

done in next version.


bq.  On 2012-01-14 05:43:38, Lars Hofhansl wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 1072
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line1072
bq.  
bq.   0.90.6?

updated to 0.90.6, with the assumption that this feature will not make it 
there (but hopefully into 0.90.7)


bq.  On 2012-01-14 05:43:38, Lars Hofhansl wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 2275
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line2275
bq.  
bq.   I know this is not new, but this ErrorReporter is used for status 
messages as well as error reporting. Should maybe have a different name.
bq.   
bq.   Also should messages go to STDOUT (out) and error go to STDERR (err)?

TODO -- I'll follow up on this after the next round.


bq.  On 2012-01-14 05:43:38, Lars Hofhansl wrote:
bq.   src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 1053
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68920#file68920line1053
bq.  
bq.   Should we add a double check here that the region is in fact offline 
(by checking .META.) or is that too expensive/not-needed?
bq.   
bq.   I'm thinking that once this method exists, folks will eventually call 
it for other reasons.

Currently, we needed this method to explicitly remove information from the 
Master's memory.  In the cases where this is used, I've directly removed data 
from meta (Delete into .META.) and closed the regions on region servers 
directly (HRegionInterface#closeRegion).

I haven't worked it out completely yet, but it probably makes sense to fix 
closeRegion to properly add a param that will remove this in-memory master 
state as well. I was under the gun to get something working, and now, having 
accomplished that, I'm definitely open to refactoring this to make it saner 
and to clean it up more.


bq.  On 2012-01-14 05:43:38, Lars Hofhansl wrote:
bq.   src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 90
bq.   https://reviews.apache.org/r/3435/diff/2/?file=68921#file68921line90
bq.  
bq.   Nice documentation. This tool is awesome.

thanks!


- jmhsieh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3435/#review4384
---


On 2012-01-13 22:49:33, jmhsieh wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3435/
bq.  ---
bq.  
bq.  (Updated 2012-01-13 22:49:33)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, and 
Jean-Daniel Cryans.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  I'm posting a preliminary version that I'm currently testing on real 
clusters. The tests are flakey on the 0.90 branch (so there is something async 
that I didn't synchronize properly), and there are a few more TODO's I want to 
knock out before this is ready for full review to be considered for committing. 
It's got some problems I need some advice figuring out.
bq.  
bq.  Problem 1:
bq.  
bq.  In the unit tests, I have a few cases where I fabricate new regions and 
try to force the overlapping regions to be closed. For some of these, I cannot 
delete a table after it is repaired without causing subsequent tests to fail. I 
think this is due to a few things:
bq.  
bq.  1) The disable table handler uses in-memory assignment manager state while 
delete uses in META assignment information.
bq.  2) Currently I'm using the sneaky closeRegion that purposely doesn't go 

[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193130#comment-13193130
 ] 

Phabricator commented on HBASE-4218:


gqchen has commented on the revision [jira] [HBASE-4218] HFile data block 
encoding framework and delta encoding implementation.

  Looks really good to me!

  I haven't finished reviewing DiffKeyDeltaEncoding (another day or so) and 
might have a few minor comments about cosmetic things. But definitely 
no need to wait for that.

INLINE COMMENTS
  
src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java:198 
I think the logic is the following:
  1. if the value is not the same, copy the whole value.
  2. however, if the type is also not the same, take advantage of the fact that 
the type field is right ahead of the value, and copy both type and value in one 
shot.

  So the code would be like:

  if ((flag & FLAG_SAME_VALUE) == 0) {
    if ((flag & FLAG_SAME_TYPE) == 0) {
      valueOffset -= ...
      valueLength += ...
    }
    ByteBufferUtils.copy...
  }

  The headache is that if we decide to add one more field between type and 
value in the future, this code will be silently broken.

REVISION DETAIL
  https://reviews.facebook.net/D447


 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, 4218-2012-01-14.txt, 4218-v16.txt, 4218.txt, 
 D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, 
 D447.14.patch, D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, 
 D447.19.patch, D447.2.patch, D447.20.patch, D447.21.patch, D447.22.patch, 
 D447.23.patch, D447.24.patch, D447.25.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding-2012-01-17_11_09_09.patch, 
 Delta-encoding-2012-01-25_00_45_29.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta-encoding.patch-2012-01-05_15_16_43.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
 Delta-encoding.patch-2012-01-05_18_50_47.patch, 
 Delta-encoding.patch-2012-01-07_14_12_48.patch, 
 Delta-encoding.patch-2012-01-13_12_20_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general purpose algorithms,
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as speeding seeks within HFileBlocks. It should 
 improve performance a lot, if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when value is a counter.
 Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while having much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
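The prefix-compression part of the scheme can be sketched as follows. This is an illustrative toy with invented names (`PrefixCompressionSketch`, `Entry`), not the actual HBase encoder code: each key is stored as the length of the prefix it shares with the previous key plus the remaining suffix bytes, so sorted, similar keys collapse to short suffixes.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of prefix compression over sorted keys -- hypothetical
// helper names, not the actual HBase data block encoder API. Each key becomes
// (length of prefix shared with the previous key, remaining suffix bytes).
public final class PrefixCompressionSketch {

  private static int commonPrefix(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    int i = 0;
    while (i < n && a[i] == b[i]) {
      i++;
    }
    return i;
  }

  /** One encoded entry: how many leading bytes to reuse, plus the new tail. */
  public static final class Entry {
    final int shared;
    final byte[] suffix;
    Entry(int shared, byte[] suffix) { this.shared = shared; this.suffix = suffix; }
  }

  /** Encodes sorted keys; similar neighbours collapse to short suffixes. */
  public static List<Entry> encode(List<byte[]> sortedKeys) {
    List<Entry> out = new ArrayList<>();
    byte[] prev = new byte[0];
    for (byte[] key : sortedKeys) {
      int shared = commonPrefix(prev, key);
      byte[] suffix = java.util.Arrays.copyOfRange(key, shared, key.length);
      out.add(new Entry(shared, suffix));
      prev = key;
    }
    return out;
  }

  /** Rebuilds the original keys by replaying shared prefixes. */
  public static List<byte[]> decode(List<Entry> encoded) {
    List<byte[]> out = new ArrayList<>();
    byte[] prev = new byte[0];
    for (Entry e : encoded) {
      byte[] key = new byte[e.shared + e.suffix.length];
      System.arraycopy(prev, 0, key, 0, e.shared);
      System.arraycopy(e.suffix, 0, key, e.shared, e.suffix.length);
      out.add(key);
      prev = key;
    }
    return out;
  }
}
```

With ~90-byte keys that differ only in a short tail, each entry stores a few suffix bytes instead of the whole key, which is where the quoted compression ratios would come from.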
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to the uncompressed buffer in HFileBlock will have bad 
 performance
 - extend comparators to support comparison assuming that the first N bytes are 
 equal (or some fields are equal)
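The second change, a comparator that can skip a known-equal prefix, might look roughly like this. The class and method names are invented for illustration; the real change would extend HBase's own KeyValue comparators:

```java
// Hypothetical sketch: compare two keys as unsigned bytes, assuming the first
// commonLength bytes are already known to be equal (so the loop skips them).
public final class PrefixAwareComparator {
  public static int compareSkippingPrefix(byte[] a, byte[] b, int commonLength) {
    int n = Math.min(a.length, b.length);
    for (int i = commonLength; i < n; i++) {
      // Mask to 0xff so bytes compare as unsigned, matching lexicographic order.
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    // All compared bytes equal: the shorter key sorts first.
    return a.length - b.length;
  }
}
```

During a seek within a prefix-compressed block, the shared-prefix length stored with each entry tells the caller how many leading bytes are provably equal, so comparisons start directly at the first possibly-different byte.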
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5277) Put AssignmentManager on a diet

2012-01-25 Thread stack (Created) (JIRA)
Put AssignmentManager on a diet
---

 Key: HBASE-5277
 URL: https://issues.apache.org/jira/browse/HBASE-5277
 Project: HBase
  Issue Type: Task
Reporter: stack


AM has five or six inner classes, it takes zk callbacks, it has distinct Maps 
that are supposed to be edited together (regions and servers) and that then 
have some interaction w/ notion of regionsInTransition, etc.  It could do w/ a 
tune-up so it comes back into the realm of the penetrable (currently I'd not be 
surprised if its workings were beyond the ken of all who here code). 





[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-25 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193139#comment-13193139
 ] 

jirapos...@reviews.apache.org commented on HBASE-5128:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3435/
---

(Updated 2012-01-25 17:24:41.277326)


Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, and Jean-Daniel 
Cryans.


Changes
---

This version includes updates after testing against real online but idle 
clusters with real induced corruptions.  This hbck was tested successfully 
against region servers running apache/0.90 plus this patch, and against 
regionservers on cdh3u2 (an 0.90.4-based hbase without the new offline method).

I'm going to post usage description and images I've created to explain this 
better on the JIRA.

High-level changes in this rev:
- hbck now wraps calls to the offline method and will use unassign if the 
target region server does not support offline.
- restructured hdfs integrity repairs into more phases -- when compound 
problems were present we'd get into a loop where orphan repair would cause new 
overlaps on a subsequent integrity repair iteration.  This new approach should 
be deterministic. The new phases are 1) find hdfs holes and patch them (post 
condition: no more holes), 2) adopt orphan hdfs regions (post condition: no 
orphan data in hdfs), 3) reload and fix overlaps (precondition: no holes but 
overlaps possible; post condition: no overlaps).  Previously, integrity repairs 
would iterate doing all three until they converged (but this didn't always 
happen in practice!). 
- Added more command line options that allow this hbck to only attempt certain 
repairs (which is necessary to get overlap repairs to work more 
deterministically, and needed to get hbases that don't support offline to 
converge).
- Added a few more test cases for new corruptions.
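The phase ordering described above could be sketched like this. Class and method names are invented for illustration, not the actual HBaseFsck code; the point is that each phase runs once, in a fixed order, with its post-condition established before the next phase starts, instead of looping all three repairs until convergence:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the three-phase hdfs integrity repair ordering.
// Names are invented; this is not the actual HBaseFsck code.
public final class RepairPhasesSketch {
  final List<String> completed = new ArrayList<>();

  void patchHdfsHoles()       { completed.add("holes"); }    // post: no holes
  void adoptHdfsOrphans()     { completed.add("orphans"); }  // post: no orphan data
  void reloadAndFixOverlaps() { completed.add("overlaps"); } // post: no overlaps

  // Runs each phase exactly once, in order; deterministic by construction,
  // unlike iterating all repairs and hoping the state converges.
  List<String> repairHdfsIntegrity() {
    patchHdfsHoles();
    adoptHdfsOrphans();
    reloadAndFixOverlaps();
    return completed;
  }
}
```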

One big caveat with this rev is that the hbase was online but idle (no writes 
happening).   It was also suggested that I need to worry about compactions when 
I close regions during overlap merging (JD -- I didn't see anything in 
OnlineMerge -- why wasn't this a concern there?).  If so, I'd like advice on 
how to add guards to protect the user (is a glaring warning message or 
requiring confirmation sufficient?).  I'm going to do some initial testing on 
online and active cases -- but ideally would like this to come in follow on 
jiras.  


Summary
---

I'm posting a preliminary version that I'm currently testing on real clusters. 
The tests are flaky on the 0.90 branch (so there is something async that I 
didn't synchronize properly), and there are a few more TODOs I want to knock 
out before this is ready for full review to be considered for committing. It's 
got some problems I need some advice figuring out.

Problem 1:

In the unit tests, I have a few cases where I fabricate new regions and try to 
force the overlapping regions to be closed. For some of these, I cannot delete 
a table after it is repaired without causing subsequent tests to fail. I think 
this is due to a few things:

1) The disable table handler uses in-memory assignment manager state while 
delete uses in META assignment information.
2) Currently I'm using the sneaky closeRegion that purposely doesn't go through 
the master and in turn doesn't modify in-memory state, so disable uses 
out-of-date in-memory region assignments. If I use the unassign method, it 
sends RIT transitions to the master, which ends up attempting to assign the 
region again, causing timing/transient states.

What is a good way to clear the HMaster's assignment manager's assignment data 
for particular regions or to force it to re-read from META? (without modifying 
the 0.90 HBase's it is meant to repair).

Problem 2:

Sometimes tests fail reporting HOLE_IN_REGION_CHAIN and 
SERVER_DOES_NOT_MATCH_META. This means the old and new regions are confused 
with each other and basically something is still happening asynchronously. I 
think this is because the new region is being assigned and is still 
transitioning. Sound about right? To make the unit test deterministic, should 
hbck wait for these to settle, or should just the unit test wait?


This addresses bug HBASE-5128.
https://issues.apache.org/jira/browse/HBASE-5128


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c56b3a6 
  src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 9520b95 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java f7ad064 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 6d3401d 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java a3d8b8b 
  src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java 
29e8bb2 
  

[jira] [Updated] (HBASE-5277) Put AssignmentManager on a diet

2012-01-25 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5277:
-

Attachment: 5277v1.txt

Move RegionState out into its own package-private class.  Make a new 
datastructure, ServersAndRegions, that manages the servers-to-regions and 
regions-to-servers Maps.

Not done yet.  Cuts AM by 1k lines or about 8%.  More to do.






[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193148#comment-13193148
 ] 

Mikhail Bautin commented on HBASE-4218:
---

Re-running unit tests that failed on Jenkins:

Running org.apache.hadoop.hbase.client.TestFromClientSide
Tests run: 52, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 181.919 sec
Running org.apache.hadoop.hbase.client.TestAdmin
Tests run: 35, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 195.194 sec
Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 223.405 sec
Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 78.48 sec
Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 97.561 sec
Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.289 sec
Running org.apache.hadoop.hbase.io.hfile.TestHFileBlock
Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.362 sec

Results :

Tests run: 122, Failures: 0, Errors: 0, Skipped: 3







[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-25 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193154#comment-13193154
 ] 

jirapos...@reviews.apache.org commented on HBASE-5128:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3435/#review4591
---


We should deprecate clearRegionFromTransition().


src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/3435/#comment10238

I think a boolean return value would help determine the outcome of the 
action.



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/3435/#comment10237

This sentence should be moved before ' from ...'



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/3435/#comment10234

We should handle potential exception from this method.

Maybe we should check the availability of this rpc outside the loop and set 
a flag indicating whether Master supports this RPC.



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/3435/#comment10240

I would expect a boolean return value since we may return without throwing an 
exception (line 1125)



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/3435/#comment10239

How about naming this method hasHdfsOnlyEdits() ?



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/3435/#comment10233

This TODO has been implemented, so we can remove it.



src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
https://reviews.apache.org/r/3435/#comment10232

More action is needed beyond a WARN message, right ?



src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java
https://reviews.apache.org/r/3435/#comment10235

success is a local variable.
Why don't we change the return type to boolean and return its value ?



src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java
https://reviews.apache.org/r/3435/#comment10236

We should set interrupt flag.


- Ted


On 2012-01-25 17:24:41, jmhsieh wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3435/
bq.  ---
bq.  
bq.  (Updated 2012-01-25 17:24:41)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, and 
Jean-Daniel Cryans.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  I'm posting a preliminary version that I'm currently testing on real 
clusters. The tests are flakey on the 0.90 branch (so there is something async 
that I didn't synchronize properly), and there are a few more TODO's I want to 
knock out before this is ready for full review to be considered for committing. 
It's got some problems I need some advice figuring out.
bq.  
bq.  Problem 1:
bq.  
bq.  In the unit tests, I have a few cases where I fabricate new regions and 
try to force the overlapping regions to be closed. For some of these, I cannot 
delete a table after it is repaired without causing subsequent tests to fail. I 
think this is due to a few things:
bq.  
bq.  1) The disable table handler uses in-memory assignment manager state while 
delete uses in META assignment information.
bq.  2) Currently I'm using the sneaky closeRegion that purposely doesn't go 
through the master and in turn doesn't modify in-memory state – disable uses 
out of date in-memory region assignments. If I use the unassign method sends 
RIT transitions to the master, but which ends up attempting to assign it again, 
causing timing/transient states.
bq.  
bq.  What is a good way to clear the HMaster's assignment manager's assignment 
data for particular regions or to force it to re-read from META? (without 
modifying the 0.90 HBase's it is meant to repair).
bq.  
bq.  Problem 2:
bq.  
bq.  Sometimes test fail reporting HOLE_IN_REGION_CHAIN and 
SERVER_DOES_NOT_MATCH_META. This means the old and new regions are confiused 
with each other and basically something is still happening asynchronously. I 
think this is the new region is being assigned and is still transitioning. 
Sound about right? To make the unit test deterministic, should hbck wait for 
these to settle or should just the unit test wait?
bq.  
bq.  
bq.  This addresses bug HBASE-5128.
bq.  https://issues.apache.org/jira/browse/HBASE-5128
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java c56b3a6 
bq.src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 
9520b95 
bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java f7ad064 
bq.

[jira] [Updated] (HBASE-5258) Move coprocessors set out of RegionLoad, region server should calculate disparity of loaded coprocessors among regions and send report through HServerLoad

2012-01-25 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5258:
--

Priority: Critical  (was: Major)

Raising priority as this task is about making the correct design choices.

 Move coprocessors set out of RegionLoad, region server should calculate 
 disparity of loaded coprocessors among regions and send report through 
 HServerLoad
 --

 Key: HBASE-5258
 URL: https://issues.apache.org/jira/browse/HBASE-5258
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu
Priority: Critical

 When I worked on HBASE-5256, I revisited the code related to Ser/De of 
 coprocessors set in RegionLoad.
 I think the rationale for embedding coprocessors set is for maximum 
 flexibility where each region can load different coprocessors.
 This flexibility is causing extra cost in the region server to Master 
 communication and increasing the footprint of Master heap.
 Would HServerLoad be a better place for this set ?





[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler

2012-01-25 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193187#comment-13193187
 ] 

stack commented on HBASE-5270:
--

bq. So, the param 'definitiveRootServer' is used in this case to ensure the 
dead root server is carryingRoot when it is being expired.

What's 'definitive' about it?  Is it that we know for sure the server was 
carrying root or meta?  How?

bq. Is there any possible to expire a server if its carrying root and meta now? 
I don't think so.

You are saying that this patch does nothing new here?  We COULD expire the 
server that was carrying root, wait on its log split, then expire the server 
carrying meta (though it may have been the same server)... it might be ok but 
we might kill a server that has just started. I'm ok if fixing this is outside 
scope of this patch.

bq. I don't find this operation earlier in master setup, and this operation is 
not introduced by this issue. And I only introduce this logic for 90 from trunk.

So, you copied this to 0.90 from TRUNK (so my notion that we already had this 
is my remembering how things work on TRUNK.. that would make sense).

bq. I think we need explain it, But whether we shouldn't use distributed split 
log, I'm not very sure.

If we are not sure, we shouldn't do it.

bq. When master is initializing, if one RS is killed and restarted, then dead 
server is in progress while master startup

This seems like a small window.  Or do you think it could happen frequently?  
Could we hold up shutdownserverhandler until master is up?





 Handle potential data loss due to concurrent processing of processFaileOver 
 and ServerShutdownHandler
 -

 Key: HBASE-5270
 URL: https://issues.apache.org/jira/browse/HBASE-5270
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Zhihong Yu
 Fix For: 0.94.0, 0.92.1


 This JIRA continues the effort from HBASE-5179. Starting with Stack's 
 comments about patches for 0.92 and TRUNK:
 Reviewing 0.92v17
 isDeadServerInProgress is a new public method in ServerManager but it does 
 not seem to be used anywhere.
 Does isDeadRootServerInProgress need to be public? Ditto for meta version.
 This method param names are not right 'definitiveRootServer'; what is meant 
 by definitive? Do they need this qualifier?
 Is there anything in place to stop us expiring a server twice if its carrying 
 root and meta?
 What is difference between asking assignment manager isCarryingRoot and this 
 variable that is passed in? Should be doc'd at least. Ditto for meta.
 I think I've asked for this a few times - onlineServers needs to be 
 explained... either in javadoc or in comment. This is the param passed into 
 joinCluster. How does it arise? I think I know but am unsure. God love the 
 poor noob that comes awandering this code trying to make sense of it all.
 It looks like we get the list by trawling zk for regionserver znodes that 
 have not checked in. Don't we do this operation earlier in master setup? Are 
 we doing it again here?
 Though distributed split log is configured, we will do in master single 
 process splitting under some conditions with this patch. Its not explained in 
 code why we would do this. Why do we think master log splitting 'high 
 priority' when it could very well be slower. Should we only go this route if 
 distributed splitting is not going on. Do we know if concurrent distributed 
 log splitting and master splitting works?
 Why would we have dead servers in progress here in master startup? Because a 
 servershutdownhandler fired?
 This patch is different to the patch for 0.90. Should go into trunk first 
 with tests, then 0.92. Should it be in this issue? This issue is really hard 
 to follow now. Maybe this issue is for 0.90.x and new issue for more work on 
 this trunk patch?
 This patch needs to have the v18 differences applied.





[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler

2012-01-25 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193197#comment-13193197
 ] 

Zhihong Yu commented on HBASE-5270:
---

bq. though it may have been the same server
This has been handled in patch on reviewboard, line 628:
{code}
  !currentMetaServer.equals(currentRootServer) 
{code}
bq. Could we hold up shutdownserverhandler until master is up?
What if the region server hosting .META. went down ?






[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler

2012-01-25 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193209#comment-13193209
 ] 

stack commented on HBASE-5270:
--

bq. What if the region server hosting .META. went down ?

Yes... was just thinking about that.  In this case we'd run the splitter 
in-line, in SSH, not via an executor... let me look at the code.  I'm trying to 
write tests and catch up on all the stuff that was done over on the previous 
issue.






[jira] [Updated] (HBASE-5278) HBase shell script refers to removed migrate functionality

2012-01-25 Thread Shaneal Manek (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaneal Manek updated HBASE-5278:
-

Attachment: hbase-5278.patch

 HBase shell script refers to removed migrate functionality
 

 Key: HBASE-5278
 URL: https://issues.apache.org/jira/browse/HBASE-5278
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.90.5, 0.92.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Trivial
 Attachments: hbase-5278.patch


 $ hbase migrate
 Exception in thread main java.lang.NoClassDefFoundError: 
 org/apache/hadoop/hbase/util/Migrate
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hadoop.hbase.util.Migrate
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hadoop.hbase.util.Migrate. Program 
 will exit.
 The 'hbase' shell script has docs referring to a 'migrate' command which no 
 longer exists.





[jira] [Created] (HBASE-5278) HBase shell script refers to removed migrate functionality

2012-01-25 Thread Shaneal Manek (Created) (JIRA)
HBase shell script refers to removed migrate functionality


 Key: HBASE-5278
 URL: https://issues.apache.org/jira/browse/HBASE-5278
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.92.0, 0.90.5, 0.94.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Trivial
 Attachments: hbase-5278.patch

$ hbase migrate
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/hadoop/hbase/util/Migrate
Caused by: java.lang.ClassNotFoundException: 
org.apache.hadoop.hbase.util.Migrate
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.hadoop.hbase.util.Migrate. Program 
will exit.


The 'hbase' shell script has docs referring to a 'migrate' command which no 
longer exists.





[jira] [Commented] (HBASE-5278) HBase shell script refers to removed migrate functionality

2012-01-25 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193248#comment-13193248
 ] 

Jonathan Hsieh commented on HBASE-5278:
---

+1. lgtm. 

 HBase shell script refers to removed migrate functionality
 

 Key: HBASE-5278
 URL: https://issues.apache.org/jira/browse/HBASE-5278
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.90.5, 0.92.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Trivial
 Attachments: hbase-5278.patch


 $ hbase migrate
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org/apache/hadoop/hbase/util/Migrate
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hadoop.hbase.util.Migrate
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hadoop.hbase.util.Migrate. Program 
 will exit.
 The 'hbase' shell script has docs referring to a 'migrate' command which no 
 longer exists.





[jira] [Updated] (HBASE-5278) HBase shell script refers to removed migrate functionality

2012-01-25 Thread Shaneal Manek (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaneal Manek updated HBASE-5278:
-

Status: Patch Available  (was: Open)

 HBase shell script refers to removed migrate functionality
 

 Key: HBASE-5278
 URL: https://issues.apache.org/jira/browse/HBASE-5278
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.92.0, 0.90.5, 0.94.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Trivial
 Attachments: hbase-5278.patch


 $ hbase migrate
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org/apache/hadoop/hbase/util/Migrate
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hadoop.hbase.util.Migrate
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hadoop.hbase.util.Migrate. Program 
 will exit.
 The 'hbase' shell script has docs referring to a 'migrate' command which no 
 longer exists.





[jira] [Updated] (HBASE-5278) HBase shell script refers to removed migrate functionality

2012-01-25 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5278:
-

   Resolution: Fixed
Fix Version/s: 0.92.1
   0.94.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch.  Thanks for the patch, Shaneal.

 HBase shell script refers to removed migrate functionality
 

 Key: HBASE-5278
 URL: https://issues.apache.org/jira/browse/HBASE-5278
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.90.5, 0.92.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Trivial
 Fix For: 0.94.0, 0.92.1

 Attachments: hbase-5278.patch


 $ hbase migrate
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org/apache/hadoop/hbase/util/Migrate
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hadoop.hbase.util.Migrate
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hadoop.hbase.util.Migrate. Program 
 will exit.
 The 'hbase' shell script has docs referring to a 'migrate' command which no 
 longer exists.





[jira] [Commented] (HBASE-5276) PerformanceEvaluation does not set the correct classpath for MR because it lives in the test jar

2012-01-25 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193255#comment-13193255
 ] 

stack commented on HBASE-5276:
--

@Tim Maybe open an issue against CDH and close this one?

 PerformanceEvaluation does not set the correct classpath for MR because it 
 lives in the test jar
 

 Key: HBASE-5276
 URL: https://issues.apache.org/jira/browse/HBASE-5276
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.90.4
Reporter: Tim Robertson
Priority: Minor

 Note: This was discovered running the CDH version hbase-0.90.4-cdh3u2
 Running the PerformanceEvaluation as follows:
   $HADOOP_HOME/bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation scan 5
 fails because the MR tasks do not get the HBase jar on the CP, and thus hit 
 ClassNotFoundExceptions.
 The job gets the following only:
   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/hbase-0.90.4-cdh3u2-tests.jar
   
 file:/Users/tim/dev/hadoop/hadoop-0.20.2-cdh3u2/hadoop-core-0.20.2-cdh3u2.jar
   
 file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/zookeeper-3.3.3-cdh3u2.jar
 The RowCounter etc all work because they live in the HBase jar, not the test 
 jar, and they get the following 
   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/guava-r06.jar
   
 file:/Users/tim/dev/hadoop/hadoop-0.20.2-cdh3u2/hadoop-core-0.20.2-cdh3u2.jar
   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/hbase-0.90.4-cdh3u2.jar
   
 file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/zookeeper-3.3.3-cdh3u2.jar
 Presumably this relates to 
   job.setJarByClass(PerformanceEvaluation.class);
   ...
   TableMapReduceUtil.addDependencyJars(job);
 A (cowboy) workaround to run PE is to unpack the jars and copy the 
 PerformanceEvaluation* classes into a patched jar.
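The classpath gap described above comes from how Hadoop ships code to tasks: it locates the jar (or class directory) a given job class was loaded from and distributes that artifact, so a job class living in the tests jar drags along only the tests jar. The sketch below is illustrative, not HBase code: `JarLocator` is a hypothetical helper showing the JDK mechanism (`ProtectionDomain`/`CodeSource`) that `job.setJarByClass()` relies on.

```java
import java.security.CodeSource;

public class JarLocator {
    /**
     * Returns the jar or directory a class was loaded from, or null for
     * classes without a code source (e.g. bootstrap/JDK classes).
     */
    static String locationOf(Class<?> clazz) {
        CodeSource src = clazz.getProtectionDomain().getCodeSource();
        return src == null ? null : src.getLocation().toString();
    }

    public static void main(String[] args) {
        // This class resolves to wherever it was compiled or packaged, just
        // as PerformanceEvaluation resolves to the *tests* jar -- which is
        // why the main HBase jar never reaches the MR task classpath.
        System.out.println(locationOf(JarLocator.class));
    }
}
```

This is why RowCounter and friends work: they live in the main HBase jar, so that jar is the one shipped.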





[jira] [Commented] (HBASE-4917) CRUD Verify Utility

2012-01-25 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193259#comment-13193259
 ] 

Mubarak Seyed commented on HBASE-4917:
--

@Nicolas,

How is this LoadTest tool different from PerformanceEvaluation?

{code}
hbase org.apache.hadoop.hbase.PerformanceEvaluation
Usage: java org.apache.hadoop.hbase.PerformanceEvaluation \
  [--miniCluster] [--nomapred] [--rows=ROWS] command nclients
{code}

I believe LoadTester.java generates load for multiple column families (provided 
there is an external properties file defining the CFs and their definitions, 
read/write threads, and regions/server), whereas PerformanceEvaluation uses only 
one CF (TestTable:info).

How does LoadTester differ from YCSB? I believe YCSB supports only one CF as 
well.

I think LoadTester can be used for burn-in test (when we provision a new 
cluster and sniff the cluster). 

If no one is working on this issue, I can help port the load test to 
src/test/java/org/apache/hadoop/hbase/loadtest.

Thanks.



 CRUD Verify Utility
 ---

 Key: HBASE-4917
 URL: https://issues.apache.org/jira/browse/HBASE-4917
 Project: HBase
  Issue Type: Sub-task
  Components: client, regionserver
Reporter: Nicolas Spiegelberg
 Fix For: 0.94.0


 Add a verify utility to run basic CRUD tests against hbase in various common 
 use cases.  This is great for sanity checking a cluster setup because it can 
 be run as a one line shell command with no required params.  Multiple column 
 families for different use-cases can be tested together.  Currently provided 
 use-cases are 'action log', 'snapshot' and 'search'. The interface is 
 developed such that it can be easily extended to cover more use-cases.





[jira] [Commented] (HBASE-5278) HBase shell script refers to removed migrate functionality

2012-01-25 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193260#comment-13193260
 ] 

Jonathan Hsieh commented on HBASE-5278:
---

Wow, you are fast, Stack.  I was trying to commit. :)

 HBase shell script refers to removed migrate functionality
 

 Key: HBASE-5278
 URL: https://issues.apache.org/jira/browse/HBASE-5278
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.90.5, 0.92.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Trivial
 Fix For: 0.94.0, 0.92.1

 Attachments: hbase-5278.patch


 $ hbase migrate
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org/apache/hadoop/hbase/util/Migrate
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hadoop.hbase.util.Migrate
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hadoop.hbase.util.Migrate. Program 
 will exit.
 The 'hbase' shell script has docs referring to a 'migrate' command which no 
 longer exists.





[jira] [Updated] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-25 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5259:
---

Attachment: D1413.2.patch

Liyin updated the revision [jira][HBASE-5259] Normalize the RegionLocation in 
TableInputFormat by the reverse DNS lookup..
Reviewers: Kannan, Karthik, mbautin

  Address Ted's comments.

REVISION DETAIL
  https://reviews.facebook.net/D1413

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java


 Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
 ---

 Key: HBASE-5259
 URL: https://issues.apache.org/jira/browse/HBASE-5259
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
 D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch


 Assuming HBase and MapReduce run in the same cluster, TableInputFormat 
 overrides the split function, which divides all the regions from one 
 particular table into a series of mapper tasks, so each mapper task can 
 process a region or one part of a region. Ideally, the mapper task should 
 run on the same machine on which the region server hosts the corresponding 
 region. That is the motivation for TableInputFormat setting the 
 RegionLocation: so that the MapReduce framework can respect node locality. 
 The code simply sets the host name of the region server as the 
 HRegionLocation. However, the host name of the region server may have a 
 different format from the host name of the task tracker (mapper task). The 
 task tracker always gets its hostname by a reverse DNS lookup, and the DNS 
 service may return a different host name format. For example, the host name 
 of the region server is correctly set as a.b.c.d while the reverse DNS 
 lookup may return "a.b.c.d." (with an additional dot at the end). 
 So the solution is to set the RegionLocation by the same reverse DNS lookup 
 as well. No matter what host name format the DNS system uses, 
 TableInputFormat is responsible for keeping the host name format consistent 
 with the MapReduce framework.
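The trailing-dot mismatch described above can be made concrete with a small normalization helper. This is a minimal sketch of the idea, not the actual HBASE-5259 patch: `HostNameNormalizer` is a hypothetical class that strips the trailing dot a reverse DNS lookup may append to a fully-qualified name and lower-cases the result, so the split location compares equal to the task tracker's hostname.

```java
public class HostNameNormalizer {
    /**
     * Normalize a hostname for locality comparison: trim, lower-case, and
     * strip the trailing dot that a reverse DNS lookup may append
     * (e.g. "a.b.c.d." vs the region server's configured "a.b.c.d").
     */
    static String normalize(String host) {
        String h = host.trim().toLowerCase();
        if (h.endsWith(".")) {
            h = h.substring(0, h.length() - 1);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(normalize("A.b.c.d."));  // prints "a.b.c.d"
    }
}
```

Running both the region server name and the reverse-DNS result through the same normalization is what keeps the two sides consistent.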





[jira] [Commented] (HBASE-5274) Filter out the expired store file scanner during the compaction

2012-01-25 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193274#comment-13193274
 ] 

Phabricator commented on HBASE-5274:


Liyin has abandoned the revision [jira][HBASE-5274] Filter out the expired 
store file scanner during the compaction.

  This could be part of [HBASE-5010] Filter HFiles based on TTL.  Mikhail will 
help follow up on fixing this issue.

REVISION DETAIL
  https://reviews.facebook.net/D1407


 Filter out the expired store file scanner during the compaction
 ---

 Key: HBASE-5274
 URL: https://issues.apache.org/jira/browse/HBASE-5274
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D1407.1.patch, D1407.1.patch, D1407.1.patch, 
 D1407.1.patch, D1407.1.patch


 During compaction, HBase generates a store scanner that scans a list of 
 store files. It would be more efficient to filter out the expired store 
 files, since there is no need to read any key values from them. 
 This optimization has already been implemented on 89-fb and is the building 
 block for HBASE-5199 as well. Compacting expired store files should be a 
 no-op.
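The core observation can be sketched in a few lines. This is a hedged illustration of the idea, not the 89-fb implementation: `StoreFileInfo` is a hypothetical stand-in for the real store file metadata, and the filter drops any file whose newest cell is already older than `now - ttl`, since such a file cannot contribute live cells to the compaction output.

```java
import java.util.ArrayList;
import java.util.List;

public class ExpiredFileFilter {
    /** Hypothetical stand-in for real store file metadata. */
    static class StoreFileInfo {
        final String name;
        final long maxTimestampMs; // newest cell timestamp in the file
        StoreFileInfo(String name, long maxTimestampMs) {
            this.name = name;
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    /** Keep only files that may still contain live cells under the given TTL. */
    static List<StoreFileInfo> liveFiles(List<StoreFileInfo> files, long ttlMs, long nowMs) {
        long oldestLive = nowMs - ttlMs;
        List<StoreFileInfo> live = new ArrayList<>();
        for (StoreFileInfo f : files) {
            // A file whose newest cell predates the TTL horizon is fully
            // expired: no scanner needs to be opened for it at all.
            if (f.maxTimestampMs >= oldestLive) {
                live.add(f);
            }
        }
        return live;
    }
}
```

Skipping the scanner entirely (rather than reading and discarding every expired cell) is what makes compacting expired files effectively a no-op.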





[jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.

2012-01-25 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193287#comment-13193287
 ] 

Phabricator commented on HBASE-5259:


tedyu has commented on the revision [jira][HBASE-5259] Normalize the 
RegionLocation in TableInputFormat by the reverse DNS lookup..

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:89 
I think reverseDNSCache is a good enough name.
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:202 
Should NamingException be handled here ?
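The two inline comments above (the `reverseDNSCache` name and NamingException handling) suggest a shape like the following. This is a sketch built from the review discussion, not the committed code: the cache name comes from the comment, while the lookup-function parameter and the fall-back-to-input behavior on failure are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class ReverseDnsCache {
    // Name taken from the review comment; the rest is illustrative.
    private final Map<String, String> reverseDNSCache = new ConcurrentHashMap<>();

    /**
     * Resolve via the supplied lookup, memoizing results. On failure
     * (e.g. a wrapped NamingException from a JNDI reverse lookup), fall
     * back to the input host: locality is a best-effort hint, so a DNS
     * hiccup should not fail split calculation.
     */
    String resolve(String host, Function<String, String> lookup) {
        return reverseDNSCache.computeIfAbsent(host, h -> {
            try {
                return lookup.apply(h);
            } catch (RuntimeException e) {
                return h; // degrade gracefully instead of propagating
            }
        });
    }
}
```

Note that this choice also caches the fallback value, so a transient DNS failure pins the unresolved name for the life of the cache; whether that is acceptable is exactly the kind of question the review comment raises.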

REVISION DETAIL
  https://reviews.facebook.net/D1413


 Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
 ---

 Key: HBASE-5259
 URL: https://issues.apache.org/jira/browse/HBASE-5259
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, 
 D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch


 Assuming HBase and MapReduce run in the same cluster, TableInputFormat 
 overrides the split function, which divides all the regions from one 
 particular table into a series of mapper tasks, so each mapper task can 
 process a region or part of a region. Ideally, a mapper task should run on 
 the same machine on which the region server hosts the corresponding region. 
 That is the motivation for TableInputFormat setting the RegionLocation: so 
 that the MapReduce framework can respect node locality. 
 The code simply sets the host name of the region server as the 
 HRegionLocation. However, the host name of the region server may have a 
 different format from the host name of the task tracker (mapper task). The 
 task tracker always gets its host name by reverse DNS lookup, and the DNS 
 service may return a different host name format. For example, the host name 
 of the region server is correctly set as a.b.c.d while the reverse DNS lookup 
 may return a.b.c.d. (with an additional dot at the end).
 So the solution is to set the RegionLocation by reverse DNS lookup as well. 
 No matter what host name format the DNS system uses, TableInputFormat is 
 responsible for keeping the host name format consistent with the MapReduce 
 framework.
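The mismatch described above comes down to normalization: a reverse DNS resolver may return a fully qualified name with a trailing dot, so both sides must be normalized the same way before comparison. A sketch of that normalization step (the helper name is hypothetical; the actual patch resolves host names via reverse DNS with a cache):

```java
// Hypothetical host name normalization: reverse DNS may return
// "a.b.c.d." (with a trailing dot); strip it and lowercase so the
// region server's name and the task tracker's name compare equal.
public class HostNameNormalizer {
    public static String normalize(String host) {
        String h = host.toLowerCase();
        if (h.endsWith(".")) {
            h = h.substring(0, h.length() - 1);
        }
        return h;
    }
}
```

With both the RegionLocation and the task tracker name passed through the same function, the MapReduce scheduler can match them for node locality.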

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5279) NPE in Master after upgrading to 0.92.0

2012-01-25 Thread Tobias Herbert (Created) (JIRA)
NPE in Master after upgrading to 0.92.0
---

 Key: HBASE-5279
 URL: https://issues.apache.org/jira/browse/HBASE-5279
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: Tobias Herbert
Priority: Critical


I have upgraded my environment from 0.90.4 to 0.92.0.

After the table migration I get the following error in the master (permanently):

{noformat}
2012-01-25 18:23:48,648 FATAL master-namenode,6,1327512209588 
org.apache.hadoop.hbase.master.HMaster - Unhandled exception. Starting shutdown.
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:2190)
at 
org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:323)
at 
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:501)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326)
at java.lang.Thread.run(Thread.java:662)
2012-01-25 18:23:48,650 INFO namenode,6,1327512209588 
org.apache.hadoop.hbase.master.HMaster - Aborting
{noformat}

I think that's because I had a hard crash in the cluster a while ago, and I 
have seen the following WARN ever since:

{noformat}
2012-01-25 21:20:47,121 WARN namenode,6,1327513078123-CatalogJanitor 
org.apache.hadoop.hbase.master.CatalogJanitor - REGIONINFO_QUALIFIER is empty 
in keyvalues={emails,,xxx./info:server/1314336400471/Put/vlen=38, 
emails,,1314189353300.xxx./info:serverstartcode/1314336400471/Put/vlen=8}
{noformat}

my patch simply works around the NPE (like the other code around those 
lines), but I don't know whether that's correct
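The workaround in the attached patch follows the pattern already used nearby: skip catalog rows whose region info cell is missing instead of dereferencing a null HRegionInfo. A minimal sketch of that guard (the names are illustrative, not the actual AssignmentManager code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: rebuild the region list while skipping catalog
// rows with no REGIONINFO_QUALIFIER, which would otherwise cause the NPE.
public class RegionRebuildSketch {
    public static List<String> rebuild(List<String> regionInfos) {
        List<String> out = new ArrayList<>();
        for (String info : regionInfos) {
            if (info == null) {
                // Damaged row left over from a hard crash: skip it, matching
                // the empty-REGIONINFO_QUALIFIER case CatalogJanitor warns about.
                continue;
            }
            out.add(info);
        }
        return out;
    }
}
```

Whether such rows should additionally be repaired or deleted (rather than just skipped) is the open question in this issue.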






[jira] [Updated] (HBASE-5279) NPE in Master after upgrading to 0.92.0

2012-01-25 Thread Tobias Herbert (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tobias Herbert updated HBASE-5279:
--

Attachment: HBASE-5279.patch

 NPE in Master after upgrading to 0.92.0
 ---

 Key: HBASE-5279
 URL: https://issues.apache.org/jira/browse/HBASE-5279
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: Tobias Herbert
Priority: Critical
 Attachments: HBASE-5279.patch


 I have upgraded my environment from 0.90.4 to 0.92.0.
 After the table migration I get the following error in the master (permanently):
 {noformat}
 2012-01-25 18:23:48,648 FATAL master-namenode,6,1327512209588 
 org.apache.hadoop.hbase.master.HMaster - Unhandled exception. Starting 
 shutdown.
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:2190)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:323)
 at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:501)
 at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326)
 at java.lang.Thread.run(Thread.java:662)
 2012-01-25 18:23:48,650 INFO namenode,6,1327512209588 
 org.apache.hadoop.hbase.master.HMaster - Aborting
 {noformat}
 I think that's because I had a hard crash in the cluster a while ago, and I 
 have seen the following WARN ever since:
 {noformat}
 2012-01-25 21:20:47,121 WARN namenode,6,1327513078123-CatalogJanitor 
 org.apache.hadoop.hbase.master.CatalogJanitor - REGIONINFO_QUALIFIER is empty 
 in keyvalues={emails,,xxx./info:server/1314336400471/Put/vlen=38, 
 emails,,1314189353300.xxx./info:serverstartcode/1314336400471/Put/vlen=8}
 {noformat}
 my patch simply works around the NPE (like the other code around those 
 lines), but I don't know whether that's correct





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-25 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193307#comment-13193307
 ] 

Phabricator commented on HBASE-5230:


mbautin has abandoned the revision [jira] [HBASE-5230] Extend TestCacheOnWrite 
to ensure we don't cache data blocks on compaction.

  This has been committed to HBase trunk. Abandoning the diff since I forgot to 
include the differential revision in the commit message.


REVISION DETAIL
  https://reviews.facebook.net/D1353


 Unit test to ensure compactions don't cache data on write
 -

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Test
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 D1353.4.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally 
 enabled). This is because we have very different implementations of 
 HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
 and with CacheConfig (presumably it's there but not sure if it even works, 
 since the patch in HBASE-3976 may not have been committed). We need to create 
 a unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.
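The behavior under test reduces to a single predicate: a data block is cached on write only when cache-on-write is enabled and the writer is not a compaction. A hedged sketch of that predicate (CacheConfig's real API differs; these names are illustrative):

```java
// Hypothetical predicate for cache-on-write during flushes vs. compactions.
public class CacheOnWritePolicy {
    public static boolean shouldCacheDataOnWrite(boolean cacheOnWriteEnabled,
                                                 boolean isCompaction) {
        // Compactions rewrite large amounts of mostly-cold data; caching
        // their output would evict hot blocks, so they are excluded even
        // when cache-on-write is enabled in general.
        return cacheOnWriteEnabled && !isCompaction;
    }
}
```

The unit test then only needs to run a compaction with cache-on-write enabled and assert that no new data blocks appear in the block cache.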





[jira] [Commented] (HBASE-5278) HBase shell script refers to removed migrate functionality

2012-01-25 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193365#comment-13193365
 ] 

stack commented on HBASE-5278:
--

@Jon I've a bit of practise

 HBase shell script refers to removed migrate functionality
 

 Key: HBASE-5278
 URL: https://issues.apache.org/jira/browse/HBASE-5278
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.90.5, 0.92.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Trivial
 Fix For: 0.94.0, 0.92.1

 Attachments: hbase-5278.patch


 $ hbase migrate
 Exception in thread main java.lang.NoClassDefFoundError: 
 org/apache/hadoop/hbase/util/Migrate
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hadoop.hbase.util.Migrate
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hadoop.hbase.util.Migrate. Program 
 will exit.
 The 'hbase' shell script has docs referring to a 'migrate' command which no 
 longer exists.





[jira] [Commented] (HBASE-5279) NPE in Master after upgrading to 0.92.0

2012-01-25 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193363#comment-13193363
 ] 

stack commented on HBASE-5279:
--

Skipping should be fine.  Do you have a scan of .META. from before the upgrade?

Are you up now?

 NPE in Master after upgrading to 0.92.0
 ---

 Key: HBASE-5279
 URL: https://issues.apache.org/jira/browse/HBASE-5279
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: Tobias Herbert
Priority: Critical
 Attachments: HBASE-5279.patch


 I have upgraded my environment from 0.90.4 to 0.92.0.
 After the table migration I get the following error in the master (permanently):
 {noformat}
 2012-01-25 18:23:48,648 FATAL master-namenode,6,1327512209588 
 org.apache.hadoop.hbase.master.HMaster - Unhandled exception. Starting 
 shutdown.
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:2190)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:323)
 at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:501)
 at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326)
 at java.lang.Thread.run(Thread.java:662)
 2012-01-25 18:23:48,650 INFO namenode,6,1327512209588 
 org.apache.hadoop.hbase.master.HMaster - Aborting
 {noformat}
 I think that's because I had a hard crash in the cluster a while ago, and I 
 have seen the following WARN ever since:
 {noformat}
 2012-01-25 21:20:47,121 WARN namenode,6,1327513078123-CatalogJanitor 
 org.apache.hadoop.hbase.master.CatalogJanitor - REGIONINFO_QUALIFIER is empty 
 in keyvalues={emails,,xxx./info:server/1314336400471/Put/vlen=38, 
 emails,,1314189353300.xxx./info:serverstartcode/1314336400471/Put/vlen=8}
 {noformat}
 my patch simply works around the NPE (like the other code around those 
 lines), but I don't know whether that's correct





[jira] [Commented] (HBASE-5279) NPE in Master after upgrading to 0.92.0

2012-01-25 Thread Tobias Herbert (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193393#comment-13193393
 ] 

Tobias Herbert commented on HBASE-5279:
---

Unfortunately I have no scan of .META. from before the upgrade,
but with this patch I am up now :-)

 NPE in Master after upgrading to 0.92.0
 ---

 Key: HBASE-5279
 URL: https://issues.apache.org/jira/browse/HBASE-5279
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: Tobias Herbert
Priority: Critical
 Attachments: HBASE-5279.patch


 I have upgraded my environment from 0.90.4 to 0.92.0.
 After the table migration I get the following error in the master (permanently):
 {noformat}
 2012-01-25 18:23:48,648 FATAL master-namenode,6,1327512209588 
 org.apache.hadoop.hbase.master.HMaster - Unhandled exception. Starting 
 shutdown.
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:2190)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:323)
 at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:501)
 at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326)
 at java.lang.Thread.run(Thread.java:662)
 2012-01-25 18:23:48,650 INFO namenode,6,1327512209588 
 org.apache.hadoop.hbase.master.HMaster - Aborting
 {noformat}
 I think that's because I had a hard crash in the cluster a while ago, and I 
 have seen the following WARN ever since:
 {noformat}
 2012-01-25 21:20:47,121 WARN namenode,6,1327513078123-CatalogJanitor 
 org.apache.hadoop.hbase.master.CatalogJanitor - REGIONINFO_QUALIFIER is empty 
 in keyvalues={emails,,xxx./info:server/1314336400471/Put/vlen=38, 
 emails,,1314189353300.xxx./info:serverstartcode/1314336400471/Put/vlen=8}
 {noformat}
 my patch simply works around the NPE (like the other code around those 
 lines), but I don't know whether that's correct





[jira] [Commented] (HBASE-5278) HBase shell script refers to removed migrate functionality

2012-01-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193396#comment-13193396
 ] 

Hudson commented on HBASE-5278:
---

Integrated in HBase-0.92 #262 (See 
[https://builds.apache.org/job/HBase-0.92/262/])
HBASE-5278 HBase shell script refers to removed 'migrate' functionality

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/bin/hbase


 HBase shell script refers to removed migrate functionality
 

 Key: HBASE-5278
 URL: https://issues.apache.org/jira/browse/HBASE-5278
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.90.5, 0.92.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Trivial
 Fix For: 0.94.0, 0.92.1

 Attachments: hbase-5278.patch


 $ hbase migrate
 Exception in thread main java.lang.NoClassDefFoundError: 
 org/apache/hadoop/hbase/util/Migrate
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hadoop.hbase.util.Migrate
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hadoop.hbase.util.Migrate. Program 
 will exit.
 The 'hbase' shell script has docs referring to a 'migrate' command which no 
 longer exists.





[jira] [Commented] (HBASE-3025) Coprocessor based simple access control

2012-01-25 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193415#comment-13193415
 ] 

Andrew Purtell commented on HBASE-3025:
---

See HBASE-4990. Destined for the site manual. The piece I have left to do is a 
capture of an example shell session. I have such a capture but it's led to 
follow on jiras that need to be resolved for 0.92.1

 Coprocessor based simple access control
 ---

 Key: HBASE-3025
 URL: https://issues.apache.org/jira/browse/HBASE-3025
 Project: HBase
  Issue Type: Sub-task
  Components: coprocessors
Reporter: Andrew Purtell
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3025.1.patch, HBASE-3025_5.patch, 
 HBASE-3025_6.patch


 Thanks for the clarification, Jeff, which reminds me to edit this issue.
 Goals of this issue:
 # Client access to HBase is authenticated
 # User data is private unless access has been granted
 # Access to data can be granted on a table or per-column-family basis
 Non-Goals of this issue:
 The following items will be left out of the initial implementation for 
 simplicity:
 # Row-level or per-value (cell) ACLs: this would require broader changes for 
 storing the ACLs inline with rows. It's still a future goal, but would slow 
 down the initial implementation considerably.
 # Push-down of file ownership to HDFS: while table ownership seems like a 
 useful construct to start with (at least to lay the groundwork for future 
 changes), making HBase act as the table owner when interacting with HDFS 
 would require more changes. In addition, while HDFS file ownership would make 
 applying quotas easy, and possibly make bulk imports more straightforward, 
 it's not clear it would offer a more secure setup. We'll leave this to 
 evaluate in a later phase.
 # HBase-managed roles as collections of permissions: we will not model roles 
 internally in HBase to begin with. We will instead allow group names to be 
 granted permissions, which allows some external modeling of roles via group 
 memberships. Groups will be created and manipulated externally to HBase. 
 While the assignment of permissions to roles and roles to users (or other 
 roles) allows a great deal of flexibility in security policy, it would add 
 complexity to the initial implementation. 
 After the initial implementation, which will appear on this issue, we will 
 evaluate the addition of role definitions internal to HBase in a new JIRA. In 
 this scheme, administrators could assign permissions specifying HDFS groups, 
 and additionally HBase roles. HBase roles would be created and manipulated 
 internally to HBase, and would appear distinct from HDFS groups via some 
 syntactic sugar. HBase role definitions will be allowed to reference other 
 HBase role definitions. 
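The table-or-column-family grant model in the goals above can be sketched as a two-level lookup: a table-level grant covers every column family, otherwise the specific per-family grant is consulted. This is purely illustrative; the real coprocessor-based AccessController stores and checks ACLs differently:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical two-level ACL check: a table-wide grant covers all
// column families; otherwise a per-family grant is required.
public class SimpleAclSketch {
    private final Set<String> tableGrants = new HashSet<>();   // "user:table"
    private final Set<String> familyGrants = new HashSet<>();  // "user:table:cf"

    public void grantTable(String user, String table) {
        tableGrants.add(user + ":" + table);
    }

    public void grantFamily(String user, String table, String cf) {
        familyGrants.add(user + ":" + table + ":" + cf);
    }

    public boolean permitted(String user, String table, String cf) {
        return tableGrants.contains(user + ":" + table)
            || familyGrants.contains(user + ":" + table + ":" + cf);
    }
}
```

Group names could be checked the same way as user names here, which is how the external role modeling described above would plug in.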





[jira] [Created] (HBASE-5280) Remove AssignmentManager#clearRegionFromTransition and replace with assignmentManager#regionOffline

2012-01-25 Thread Jonathan Hsieh (Created) (JIRA)
Remove AssignmentManager#clearRegionFromTransition and replace with 
assignmentManager#regionOffline
---

 Key: HBASE-5280
 URL: https://issues.apache.org/jira/browse/HBASE-5280
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.92.0, 0.90.5, 0.94.0
Reporter: Jonathan Hsieh


These two methods are essentially the same, and both are present in the code 
base. It was suggested in the review for HBASE-5128 to remove 
#clearRegionFromTransition in favor of #regionOffline (HBASE-5128 deprecates 
the former, but it is internal to the HMaster, so it should be safely 
removable from 0.92 and 0.94).





[jira] [Updated] (HBASE-5230) Ensure compactions do not cache-on-write data blocks

2012-01-25 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5230:
--

Issue Type: Improvement  (was: Test)
   Summary: Ensure compactions do not cache-on-write data blocks  (was: 
Unit test to ensure compactions don't cache data on write)

 Ensure compactions do not cache-on-write data blocks
 

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 D1353.4.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally 
 enabled). This is because we have very different implementations of 
 HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
 and with CacheConfig (presumably it's there but not sure if it even works, 
 since the patch in HBASE-3976 may not have been committed). We need to create 
 a unit test to verify that we don't cache data blocks on write during 
 compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-3466) runtime exception -- cached an already cached block -- during compaction

2012-01-25 Thread Simon Dircks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193444#comment-13193444
 ] 

Simon Dircks commented on HBASE-3466:
-

I just reproduced this with hadoop-1.0 and hbase-0.92 using YCSB. 


2012-01-25 23:23:51,556 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x134f70a343101a0 Successfully transitioned node 162702503c650e551130e5fb588b3ec2 from RS_ZK_REGION_SPLIT to RS_ZK_REGION_SPLIT
2012-01-25 23:23:51,616 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.lang.RuntimeException: Cached an already cached block
    at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:268)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:276)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:487)
    at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekTo(HalfStoreFileReader.java:168)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:181)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:111)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:83)
    at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1721)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:2861)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1432)
    at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1424)
    at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1400)
    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:3688)
    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:3581)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1771)
    at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1325)
2012-01-25 23:23:51,656 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x134f70a343101a0 Attempting to transition node 162702503c650e551130e5fb588b3ec2 from RS_ZK_REGION_SPLIT to RS_ZK_REGION_SPLIT






18-node cluster, with a dedicated namenode, zookeeper, hbasemaster, and YCSB 
client machine. 


/usr/local/bin/java -cp build/ycsb.jar:db/hbase/lib/*:db/hbase/conf/ 
com.yahoo.ycsb.Client -load -db com.yahoo.ycsb.db.HBaseClient -P 
workloads/workloada -p columnfamily=family1 -p recordcount=500 -s > load.dat

Loaded 5 million records, which created 8 regions (all balanced onto the same RS).


/usr/local/bin/java -cp build/ycsb.jar:db/hbase/lib/*:db/hbase/conf/ 
com.yahoo.ycsb.Client -t -db com.yahoo.ycsb.db.HBaseClient -P 
workloads/workloada -p columnfamily=family1 -p operationcount=500 -threads 
10 -s > transaction.dat


I was also able to reproduce the error found in 
https://issues.apache.org/jira/browse/HBASE-4890:

2/01/25 15:19:24 WARN client.HConnectionManager$HConnectionImplementation: 
Failed all from 
region=usertable,user3076346045817661344,1327530607222.bab55fba6adb17bc8757eb6cdee99a91.,
 hostname=datatask6.hadoop.telescope.tv, port=60020
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
java.lang.NullPointerException




 runtime exception -- cached an already cached block -- during compaction
 

 Key: HBASE-3466
 URL: https://issues.apache.org/jira/browse/HBASE-3466
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
 Environment: ubuntu 9.10, kernel 2.6.31-14-generic SMP 8-core with 
 hyperthreading
Reporter: M. C. Srivas
Priority: Critical

 Happened while running ycsb against a single RS.  BlockSize was set to 64M to 
 tickle more splits. No compression, and replication factor set to 1.
  
 I noticed that  https://issues.apache.org/jira/browse/HBASE-2455 applied to 
 0.20.4, so I opened this new one (I didn't check whether the code is the same 
 in 0.20.4 and 0.90.0)
 YCSB was run as follows:
 java -mx3000m -cp conf/:build/ycsb.jar:db/hbase/lib/* com.yahoo.ycsb.Client 
 -t -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p 
 columnfamily=family -p operationcount=1000 -s -threads 30 -target 3
 workloada was modified to do 1 billion records:
 --
 recordcount=10
 operationcount=1000
 

[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4218:
---

Attachment: D447.26.patch

mbautin updated the revision [jira] [HBASE-4218] HFile data block encoding 
framework and delta encoding implementation.
Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

  Addressing Jerry's comments and rebasing on HBASE-5230 (ensuring that 
compactions do not cache data blocks on write). All unit tests pass.

  If there are no objections, I will commit this after final cluster testing.

REVISION DETAIL
  https://reviews.facebook.net/D447

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/KeyValue.java
  src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
  src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
  src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
  src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java
  src/main/ruby/hbase/admin.rb
  src/test/java/org/apache/hadoop/hbase/BROKE_TODO_FIX_TestAcidGuarantees.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
  src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
  src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
  src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestChangingEncoding.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestLoadAndSwitchEncodeOnDisk.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestUpgradeFromHFileV1ToEncoding.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
  

[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4218:
--

Attachment: Delta-encoding-2012-01-25_16_32_14.patch

Attaching a patch rebased on HBASE-5230 and addressing Jerry's new comment.

 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, 4218-2012-01-14.txt, 4218-v16.txt, 4218.txt, 
 D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, 
 D447.14.patch, D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, 
 D447.19.patch, D447.2.patch, D447.20.patch, D447.21.patch, D447.22.patch, 
 D447.23.patch, D447.24.patch, D447.25.patch, D447.26.patch, D447.3.patch, 
 D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, 
 D447.9.patch, Data-block-encoding-2011-12-23.patch, 
 Delta-encoding-2012-01-17_11_09_09.patch, 
 Delta-encoding-2012-01-25_00_45_29.patch, 
 Delta-encoding-2012-01-25_16_32_14.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta-encoding.patch-2012-01-05_15_16_43.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
 Delta-encoding.patch-2012-01-05_18_50_47.patch, 
 Delta-encoding.patch-2012-01-07_14_12_48.patch, 
 Delta-encoding.patch-2012-01-13_12_20_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general-purpose algorithms.
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as to speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while having much better performance (20-80% faster decompression than LZO). 
 Moreover, it should allow far more efficient seeking, which should improve 
 performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression
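The prefix-compression idea described above can be sketched as follows. This is an illustrative toy, not the actual DataBlockEncoder API from the patch: because keys in an HFile block are sorted, each key can be stored as the length of the prefix it shares with the previous key plus only the differing suffix.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch of prefix (delta) key compression for a block of
 * sorted keys. NOT the real HBase encoder; names are hypothetical.
 */
public class PrefixKeySketch {

  /** Length of the common prefix of a and b. */
  static int commonPrefix(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    int i = 0;
    while (i < n && a[i] == b[i]) {
      i++;
    }
    return i;
  }

  /** Encode sorted keys as (sharedPrefixLen, suffixLen, suffixBytes) records. */
  static byte[] encode(List<byte[]> sortedKeys) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    byte[] prev = new byte[0];
    for (byte[] key : sortedKeys) {
      int shared = commonPrefix(prev, key);
      out.writeShort(shared);                      // bytes shared with previous key
      out.writeShort(key.length - shared);         // length of the new suffix
      out.write(key, shared, key.length - shared); // only the suffix is stored
      prev = key;
    }
    return buf.toByteArray();
  }

  /** Decode back into full keys by re-materializing each shared prefix. */
  static List<byte[]> decode(byte[] encoded) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded));
    List<byte[]> keys = new ArrayList<>();
    byte[] prev = new byte[0];
    while (in.available() > 0) {
      int shared = in.readUnsignedShort();
      int suffixLen = in.readUnsignedShort();
      byte[] key = new byte[shared + suffixLen];
      System.arraycopy(prev, 0, key, 0, shared); // copy shared prefix from prev
      in.readFully(key, shared, suffixLen);      // then append the new suffix
      keys.add(key);
      prev = key;
    }
    return keys;
  }
}
```

With ~90-byte keys that differ only near the end, most of each key collapses into the two length shorts, which is where the reported 92% key compression comes from.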





[jira] [Commented] (HBASE-3466) runtime exception -- cached an already cached block -- during compaction

2012-01-25 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193491#comment-13193491
 ] 

Zhihong Yu commented on HBASE-3466:
---

Since the issue can be reproduced, can you include cacheKey (and cb) in the 
exception message?
{code}
CachedBlock cb = map.get(cacheKey);
if(cb != null) {
  throw new RuntimeException("Cached an already cached block");
}
{code}
Thanks
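A minimal sketch of the suggested change, using a simplified stand-in for LruBlockCache (the names cacheKey and cb come from the snippet above; everything else here is hypothetical): naming the offending key and the existing entry in the exception message lets duplicates be diagnosed from the server log.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Simplified stand-in for a block cache (not the real LruBlockCache),
 * illustrating the proposed richer "already cached" exception message.
 */
public class BlockCacheSketch {
  private final Map<String, Object> map = new HashMap<>();

  public void cacheBlock(String cacheKey, Object block) {
    Object cb = map.get(cacheKey);
    if (cb != null) {
      // Proposed: include the key and the existing entry in the message.
      throw new RuntimeException("Cached an already cached block: key="
          + cacheKey + ", existing=" + cb);
    }
    map.put(cacheKey, block);
  }
}
```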

 runtime exception -- cached an already cached block -- during compaction
 

 Key: HBASE-3466
 URL: https://issues.apache.org/jira/browse/HBASE-3466
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
 Environment: ubuntu 9.10, kernel 2.6.31-14-generic SMP 8-core with 
 hyperthreading
Reporter: M. C. Srivas
Priority: Critical

 Happened while running ycsb against a single RS.  BlockSize was set to 64M to 
 tickle more splits. No compression, and replication factor set to 1.
  
 I noticed that  https://issues.apache.org/jira/browse/HBASE-2455 applied to 
 0.20.4, so I opened this new one (I didn't check whether the code is the same 
 in 0.20.4 and 0.90.0)
 YCSB was run as follows:
 java -mx3000m -cp conf/:build/ycsb.jar:db/hbase/lib/* com.yahoo.ycsb.Client 
 -t -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p 
 columnfamily=family -p operationcount=1000 -s -threads 30 -target 3
 workloada was modified to do 1 billion records:
 --
 recordcount=10
 operationcount=1000
 workload=com.yahoo.ycsb.workloads.CoreWorkload
 readallfields=true
 readproportion=0.5
 updateproportion=0.4
 scanproportion=0
 insertproportion=0.1
 requestdistribution=zipfian
 ---
 Relevant portions from the RS's log:
 2011-01-23 10:48:20,719 INFO  
 org.apache.hadoop.hbase.regionserver.SplitTransaction 
 [regionserver60020.compactor]: Starting split of region 
 usertable,,1295808232386.44386ab6079bd5b497a6de3ab95e850c.
 2011-01-23 10:48:20,788 INFO  org.apache.hadoop.hbase.regionserver.Store 
 [regionserver60020.compactor]: Renaming flushed file at 
 maprfs:/hbase/usertable/44386ab6079bd5b497a6de3ab95e850c/.tmp/3202441284831392385
  to 
 maprfs:/hbase/usertable/44386ab6079bd5b497a6de3ab95e850c/family/1800354539520698957
 2011-01-23 10:48:20,791 INFO  org.apache.hadoop.hbase.regionserver.Store 
 [regionserver60020.compactor]: Added 
 maprfs:/hbase/usertable/44386ab6079bd5b497a6de3ab95e850c/family/1800354539520698957,
  entries=10943, sequenceid=128924, memsize=3.4m, filesize=1.5m
 2011-01-23 10:48:20,792 INFO  org.apache.hadoop.hbase.regionserver.HRegion 
 [regionserver60020.compactor]: Closed 
 usertable,,1295808232386.44386ab6079bd5b497a6de3ab95e850c.
 2011-01-23 10:48:20,828 INFO  org.apache.hadoop.hbase.catalog.MetaEditor 
 [regionserver60020.compactor]: Offlined parent region 
 usertable,,1295808232386.44386ab6079bd5b497a6de3ab95e850c. in META
 2011-01-23 10:48:20,856 INFO  org.apache.hadoop.hbase.regionserver.HRegion 
 [perfnode15.perf.lab,60020,1295807975391-daughterOpener=89e0f70da1e5ce2d5c4024ca6cc1addb]:
  Onlined usertable,,1295808500713.89e0f70da1e5ce2d5c4024ca6cc1addb.; next 
 sequenceid=128925
 2011-01-23 10:48:20,863 INFO  org.apache.hadoop.hbase.catalog.MetaEditor 
 [perfnode15.perf.lab,60020,1295807975391-daughterOpener=89e0f70da1e5ce2d5c4024ca6cc1addb]:
  Added daughter usertable,,1295808500713.89e0f70da1e5ce2d5c4024ca6cc1addb. in 
 region .META.,,1, serverInfo=perfnode15.perf.lab,60020,1295807975391
 2011-01-23 10:48:20,868 INFO  org.apache.hadoop.hbase.regionserver.HRegion 
 [perfnode15.perf.lab,60020,1295807975391-daughterOpener=fd1d4e71c9a7e262a6e26adc0742414e]:
  Onlined 
 usertable,user1907848630,1295808500713.fd1d4e71c9a7e262a6e26adc0742414e.; 
 next sequenceid=128926
 2011-01-23 10:48:20,869 INFO  org.apache.hadoop.hbase.catalog.MetaEditor 
 

[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

2012-01-25 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193499#comment-13193499
 ] 

Jonathan Hsieh commented on HBASE-5128:
---

It was also suggested that I need to worry about compactions due to an HRegion 
flush when I close regions during overlap merging. At least in 0.90, this is 
not actually necessary -- the HMaster-side closeRegion actually flushes but 
ignores the internalFlushcache return flag that specifies whether a region 
needs to be compacted.


 [uber hbck] Enable hbck to automatically repair table integrity problems as 
 well as region consistency problems while online.
 -

 Key: HBASE-5128
 URL: https://issues.apache.org/jira/browse/HBASE-5128
 Project: HBase
  Issue Type: New Feature
  Components: hbck
Affects Versions: 0.90.5, 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh

 The current (0.90.5, 0.92.0rc2) versions of hbck detect most region 
 consistency and table integrity invariant violations.  However, with '-fix' it 
 can only automatically repair region consistency cases having to do with 
 deployment problems.  This updated version should be able to handle all cases 
 (including a new orphan regiondir case).  When complete, it will likely 
 deprecate the OfflineMetaRepair tool and subsume several open META-hole 
 related issues.
 Here's the approach (from the comment at the top of the new version of the 
 file).
 {code}
 /**
 * HBaseFsck (hbck) is a tool for checking and repairing region consistency
 * and table integrity.
  * 
  * Region consistency checks verify that META, region deployment on
  * region servers and the state of data in HDFS (.regioninfo files) all are in
  * accordance. 
  * 
 * Table integrity checks verify that all possible row keys can resolve to
 * exactly one region of a table.  This means there are no individual
 * degenerate or backwards regions; no holes between regions; and that there
 * are no overlapping regions.
  * 
  * The general repair strategy works in these steps.
  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
  * 2) Repair Region Consistency with META and assignments
  * 
 * For table integrity repairs, the tables' region directories are scanned
 * for .regioninfo files.  Each table's integrity is then verified.  If there
  * are any orphan regions (regions with no .regioninfo files), or holes, new 
  * regions are fabricated.  Backwards regions are sidelined as well as empty
 * degenerate (endkey==startkey) regions.  If there are any overlapping
 * regions, a new region is created and all data is merged into the new region.
  * 
 * Table integrity repairs deal solely with HDFS and can be done offline --
 * the hbase region servers or master do not need to be running.  This phase
 * can be used to completely reconstruct the META table in an offline fashion.
  * 
 * Region consistency requires three conditions -- 1) a valid .regioninfo file
 * present in an hdfs region dir, 2) a valid row with .regioninfo data in META,
 * and 3) a region deployed only at the regionserver that it was assigned to.
  * 
  * Region consistency requires hbck to contact the HBase master and region
  * servers, so the connect() must first be called successfully.  Much of the
  * region consistency information is transient and less risky to repair.
  */
 {code}
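The "every row key resolves to exactly one region" invariant above can be illustrated with a toy check. This is a hypothetical sketch, not hbck code: a region is modeled as a {startKey, endKey} pair sorted by startKey, with "" standing for the open table start/end boundary, and the check reports holes, overlaps, degenerate and backwards regions.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy illustration of the table-integrity invariant: every possible row key
 * must resolve to exactly one region. Not hbck code; names are hypothetical.
 */
public class TableIntegritySketch {

  /** Report holes, overlaps, and degenerate/backwards regions. */
  public static List<String> findProblems(List<String[]> regionsSortedByStart) {
    List<String> problems = new ArrayList<>();
    String cursor = ""; // startKey we expect next; "" = start of table
    for (String[] region : regionsSortedByStart) {
      String start = region[0];
      String end = region[1]; // "" means this region runs to the end of table
      if (!end.isEmpty() && end.compareTo(start) <= 0) {
        problems.add(end.equals(start)
            ? "degenerate region at " + start
            : "backwards region " + start + ".." + end);
        continue; // sideline the bad region instead of advancing the cursor
      }
      int cmp = start.compareTo(cursor);
      if (cmp > 0) {
        problems.add("hole " + cursor + ".." + start); // keys with no region
      } else if (cmp < 0) {
        problems.add("overlap at " + start); // keys covered by two regions
      }
      cursor = end;
    }
    if (!cursor.isEmpty()) {
      problems.add("hole " + cursor + "..<end of table>");
    }
    return problems;
  }
}
```

In hbck's terms, a reported hole would be repaired by fabricating a region and an overlap by merging the overlapping regions into a new one.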





[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193514#comment-13193514
 ] 

Phabricator commented on HBASE-4218:


Kannan has accepted the revision [jira] [HBASE-4218] HFile data block encoding 
framework and delta encoding implementation.

  excellent!!

REVISION DETAIL
  https://reviews.facebook.net/D447


 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, 4218-2012-01-14.txt, 4218-v16.txt, 4218.txt, 
 D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, 
 D447.14.patch, D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, 
 D447.19.patch, D447.2.patch, D447.20.patch, D447.21.patch, D447.22.patch, 
 D447.23.patch, D447.24.patch, D447.25.patch, D447.26.patch, D447.3.patch, 
 D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, 
 D447.9.patch, Data-block-encoding-2011-12-23.patch, 
 Delta-encoding-2012-01-17_11_09_09.patch, 
 Delta-encoding-2012-01-25_00_45_29.patch, 
 Delta-encoding-2012-01-25_16_32_14.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta-encoding.patch-2012-01-05_15_16_43.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
 Delta-encoding.patch-2012-01-05_18_50_47.patch, 
 Delta-encoding.patch-2012-01-07_14_12_48.patch, 
 Delta-encoding.patch-2012-01-13_12_20_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general-purpose algorithms.
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as to speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while having much better performance (20-80% faster decompression than LZO). 
 Moreover, it should allow far more efficient seeking, which should improve 
 performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression





[jira] [Commented] (HBASE-4917) CRUD Verify Utility

2012-01-25 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193532#comment-13193532
 ] 

Mubarak Seyed commented on HBASE-4917:
--

Working on this port; I will attach the patch once I get corporate approval. 
Thanks.

 CRUD Verify Utility
 ---

 Key: HBASE-4917
 URL: https://issues.apache.org/jira/browse/HBASE-4917
 Project: HBase
  Issue Type: Sub-task
  Components: client, regionserver
Reporter: Nicolas Spiegelberg
 Fix For: 0.94.0


 Add a verify utility to run basic CRUD tests against hbase in various common 
 use cases.  This is great for sanity checking a cluster setup because it can 
 be run as a one line shell command with no required params.  Multiple column 
 families for different use-cases can be tested together.  Currently provided 
 use-cases are 'action log', 'snapshot' and 'search'. The interface is 
 developed such that it can be easily extended to cover more use-cases.





[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193534#comment-13193534
 ] 

Hadoop QA commented on HBASE-4218:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511917/Delta-encoding-2012-01-25_16_32_14.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 189 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 88 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.io.hfile.TestHFileBlock

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/852//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/852//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/852//console

This message is automatically generated.

 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, 4218-2012-01-14.txt, 4218-v16.txt, 4218.txt, 
 D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, 
 D447.14.patch, D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, 
 D447.19.patch, D447.2.patch, D447.20.patch, D447.21.patch, D447.22.patch, 
 D447.23.patch, D447.24.patch, D447.25.patch, D447.26.patch, D447.3.patch, 
 D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, 
 D447.9.patch, Data-block-encoding-2011-12-23.patch, 
 Delta-encoding-2012-01-17_11_09_09.patch, 
 Delta-encoding-2012-01-25_00_45_29.patch, 
 Delta-encoding-2012-01-25_16_32_14.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta-encoding.patch-2012-01-05_15_16_43.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
 Delta-encoding.patch-2012-01-05_18_50_47.patch, 
 Delta-encoding.patch-2012-01-07_14_12_48.patch, 
 Delta-encoding.patch-2012-01-13_12_20_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general-purpose algorithms.
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as to speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while having much better performance (20-80% faster decompression than LZO). 
 Moreover, it should allow far more efficient seeking, which should improve 
 performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression


[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193535#comment-13193535
 ] 

Phabricator commented on HBASE-4218:


mbautin has committed the revision [jira] [HBASE-4218] HFile data block 
encoding framework and delta encoding implementation.

REVISION DETAIL
  https://reviews.facebook.net/D447

COMMIT
  https://reviews.facebook.net/rHBASE1236031


 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, 4218-2012-01-14.txt, 4218-v16.txt, 4218.txt, 
 D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, 
 D447.14.patch, D447.15.patch, D447.16.patch, D447.17.patch, D447.18.patch, 
 D447.19.patch, D447.2.patch, D447.20.patch, D447.21.patch, D447.22.patch, 
 D447.23.patch, D447.24.patch, D447.25.patch, D447.26.patch, D447.3.patch, 
 D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, 
 D447.9.patch, Data-block-encoding-2011-12-23.patch, 
 Delta-encoding-2012-01-17_11_09_09.patch, 
 Delta-encoding-2012-01-25_00_45_29.patch, 
 Delta-encoding-2012-01-25_16_32_14.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta-encoding.patch-2012-01-05_15_16_43.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44.patch, 
 Delta-encoding.patch-2012-01-05_16_31_44_copy.patch, 
 Delta-encoding.patch-2012-01-05_18_50_47.patch, 
 Delta-encoding.patch-2012-01-07_14_12_48.patch, 
 Delta-encoding.patch-2012-01-13_12_20_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general-purpose algorithms.
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as to speed up seeks within HFileBlocks. It should 
 improve performance a lot if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when the value is a counter.
 Initial tests on real data (key length ≈ 90 bytes, value length = 8 bytes) 
 show that I could achieve a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 while offering much better performance (20-80% faster decompression than 
 LZO). Moreover, it should allow far more efficient seeking, which should 
 improve performance a bit.
 It seems that simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs, and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide 
 seeking and iterating; access to the uncompressed buffer in HFileBlock will 
 have bad performance
 - extend comparators to support comparison assuming that the first N bytes 
 are equal (or some fields are equal)
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression





[jira] [Commented] (HBASE-5276) PerformanceEvaluation does not set the correct classpath for MR because it lives in the test jar

2012-01-25 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193536#comment-13193536
 ] 

Jonathan Hsieh commented on HBASE-5276:
---

Hi Tim, 

From the HBASE-4688 issue, it looks like this didn't make it into Apache HBase 
until 0.92.0.  If you would like this in a future CDH3 release, please file an 
issue here:

https://issues.cloudera.org/browse/DISTRO

Since CDH4 is based on Apache HBase 0.92, it will be in the CDH4 HBase.  

Thanks,
Jon.

 PerformanceEvaluation does not set the correct classpath for MR because it 
 lives in the test jar
 

 Key: HBASE-5276
 URL: https://issues.apache.org/jira/browse/HBASE-5276
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.90.4
Reporter: Tim Robertson
Priority: Minor

 Note: This was discovered running the CDH version hbase-0.90.4-cdh3u2
 Running the PerformanceEvaluation as follows:
   $HADOOP_HOME/bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation scan 5
 fails because the MR tasks do not get the HBase jar on the CP, and thus hit 
 ClassNotFoundExceptions.
 The job gets the following only:
   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/hbase-0.90.4-cdh3u2-tests.jar
   
 file:/Users/tim/dev/hadoop/hadoop-0.20.2-cdh3u2/hadoop-core-0.20.2-cdh3u2.jar
   
 file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/zookeeper-3.3.3-cdh3u2.jar
 The RowCounter etc. all work because they live in the HBase jar, not the test 
 jar, and they get the following:
   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/guava-r06.jar
   
 file:/Users/tim/dev/hadoop/hadoop-0.20.2-cdh3u2/hadoop-core-0.20.2-cdh3u2.jar
   file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/hbase-0.90.4-cdh3u2.jar
   
 file:/Users/tim/dev/hadoop/hbase-0.90.4-cdh3u2/lib/zookeeper-3.3.3-cdh3u2.jar
 Presumably this relates to 
   job.setJarByClass(PerformanceEvaluation.class);
   ...
   TableMapReduceUtil.addDependencyJars(job);
 A (cowboy) workaround to run PE is to unpack the jars and copy the 
 PerformanceEvaluation* classes, building a patched jar.
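The root cause is that `job.setJarByClass` ships whichever jar contains the named class — here the tests jar, which does not pull in the main HBase jar. The jar is chosen by resolving the class's `.class` resource through its class loader; a minimal sketch of that lookup (the `JarLocator` class is hypothetical, for illustration only):

```java
// Demonstrates how a class's originating jar/classpath entry is found:
// resolve the .class resource via the class loader, which is effectively
// what job.setJarByClass(...) does to decide which jar to ship to MR tasks.
public class JarLocator {
    public static String locate(Class<?> clazz) {
        String resource = clazz.getName().replace('.', '/') + ".class";
        java.net.URL url = clazz.getClassLoader().getResource(resource);
        // For a class packaged in a jar this is a jar:file:...!/... URL;
        // only that one jar is shipped unless dependency jars are added too.
        return url == null ? null : url.toString();
    }
}
```

So because PerformanceEvaluation resolves to the tests jar, the tasks never see the main HBase jar on their classpath.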





[jira] [Updated] (HBASE-5186) Add metrics to ThriftServer

2012-01-25 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5186:
---

Attachment: HBASE-5186.D1461.1.patch

sc requested code review of HBASE-5186 [jira] Add metrics to ThriftServer.
Reviewers: dhruba, tedyu, JIRA

  Add metrics to ThriftServer

  It will be useful to have some metrics (queue length, waiting time, 
  processing time, ...) similar to the Hadoop RPC server. This allows us to 
  monitor system health and also provides a tool to diagnose problems where 
  thrift calls are slow.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D1461

AFFECTED FILES
  pom.xml
  src/main/java/org/apache/hadoop/hbase/thrift/CallQueue.java
  src/main/java/org/apache/hadoop/hbase/thrift/HbaseHandlerMetricsProxy.java
  src/main/java/org/apache/hadoop/hbase/thrift/TBoundedThreadPoolServer.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftMetrics.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/3021/

Tip: use the X-Herald-Rules header to filter Herald messages in your client.


 Add metrics to ThriftServer
 ---

 Key: HBASE-5186
 URL: https://issues.apache.org/jira/browse/HBASE-5186
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5186.D1461.1.patch


 It will be useful to have some metrics (queue length, waiting time, 
 processing time, ...) similar to the Hadoop RPC server. This allows us to 
 monitor system health and also provides a tool to diagnose problems where 
 thrift calls are slow.





[jira] [Updated] (HBASE-5186) Add metrics to ThriftServer

2012-01-25 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5186:
---

Attachment: HBASE-5186.D1461.2.patch

sc updated the revision HBASE-5186 [jira] Add metrics to ThriftServer.
Reviewers: dhruba, tedyu, JIRA

  Remove the debug change

REVISION DETAIL
  https://reviews.facebook.net/D1461

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/thrift/CallQueue.java
  src/main/java/org/apache/hadoop/hbase/thrift/HbaseHandlerMetricsProxy.java
  src/main/java/org/apache/hadoop/hbase/thrift/TBoundedThreadPoolServer.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftMetrics.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java


 Add metrics to ThriftServer
 ---

 Key: HBASE-5186
 URL: https://issues.apache.org/jira/browse/HBASE-5186
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5186.D1461.1.patch, HBASE-5186.D1461.2.patch


 It will be useful to have some metrics (queue length, waiting time, 
 processing time, ...) similar to the Hadoop RPC server. This allows us to 
 monitor system health and also provides a tool to diagnose problems where 
 thrift calls are slow.





[jira] [Commented] (HBASE-5230) Ensure compactions do not cache-on-write data blocks

2012-01-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193545#comment-13193545
 ] 

Hudson commented on HBASE-5230:
---

Integrated in HBase-TRUNK #2646 (See 
[https://builds.apache.org/job/HBase-TRUNK/2646/])
HBASE-5230 : ensure that compactions do not cache-on-write data blocks

mbautin : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java


 Ensure compactions do not cache-on-write data blocks
 

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 D1353.4.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally enabled). This 
 is because we have very different implementations of HBASE-3976 without 
 HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with 
 CacheConfig (presumably it's there, but not sure if it even works, since the 
 patch in HBASE-3976 may not have been committed). We need to create a unit 
 test to verify that we don't cache data blocks on write during compactions, 
 and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193546#comment-13193546
 ] 

Hudson commented on HBASE-4720:
---

Integrated in HBase-TRUNK #2646 (See 
[https://builds.apache.org/job/HBase-TRUNK/2646/])
HBASE-4720 revert until agreement is reached on solution
HBASE-4720 Implement atomic update operations (checkAndPut, checkAndDelete) for 
REST client/server (Mubarak)

tedyu : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/CheckAndDeleteRowResource.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/CheckAndDeleteTableResource.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/CheckAndPutRowResource.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/CheckAndPutTableResource.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/RootResource.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/rest/TestRowResource.java

tedyu : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/CheckAndDeleteRowResource.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/CheckAndDeleteTableResource.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/CheckAndPutRowResource.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/CheckAndPutTableResource.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/RootResource.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/rest/TestRowResource.java


 Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
 client/server 
 

 Key: HBASE-4720
 URL: https://issues.apache.org/jira/browse/HBASE-4720
 Project: HBase
  Issue Type: Improvement
Reporter: Daniel Lord
Assignee: Mubarak Seyed
 Fix For: 0.94.0

 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
 HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
 HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
 HBASE-4720.v3.patch


 I have several large application/HBase clusters where an application node 
 will occasionally need to talk to HBase from a different cluster.  In order 
 to help ensure some of my consistency guarantees I have a sentinel table that 
 is updated atomically as users interact with the system.  This works quite 
 well for the regular HBase client, but the REST client does not implement 
 the checkAndPut and checkAndDelete operations.  This exposes the application 
 to some race conditions that have to be worked around.  It would be ideal if 
 the same checkAndPut/checkAndDelete operations could be supported by the REST 
 client.





[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193543#comment-13193543
 ] 

Hudson commented on HBASE-4218:
---

Integrated in HBase-TRUNK #2646 (See 
[https://builds.apache.org/job/HBase-TRUNK/2646/])
[jira] [HBASE-4218] HFile data block encoding framework and delta encoding
implementation (Jacek Migdal, Mikhail Bautin)

Summary:

Adding a framework that allows encoding keys in an HFile data block. We
support two modes of encoding: (1) both on disk and in cache, and (2) in cache
only. This is distinct from the compression that is already done in HBase,
e.g. GZ or LZO. When data block encoding is enabled, we store blocks in the
cache in an uncompressed but encoded form. This allows fitting more blocks in
the cache and reduces the number of disk reads.

The most common example of data block encoding is delta encoding, where we take
advantage of the fact that HFile keys are sorted and share a lot of common
prefixes, and only store the delta between each pair of consecutive keys.
Initial encoding algorithms implemented are DIFF, FAST_DIFF, and PREFIX.
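Seeking inside such an encoded block benefits from comparators that can skip bytes already known to be equal (the shared prefix recorded for each entry). A toy sketch of that idea, with hypothetical names — this is not the actual HBase comparator API:

```java
// Toy comparator for delta-encoded scans: when the scanner already knows the
// two keys share `commonPrefix` leading bytes, the comparison can start at
// that offset instead of byte 0, as unsigned bytes (like a KeyValue compare).
public class PrefixAwareComparator {
    public static int compareFrom(byte[] left, byte[] right, int commonPrefix) {
        int min = Math.min(left.length, right.length);
        for (int i = commonPrefix; i < min; i++) {
            int diff = (left[i] & 0xff) - (right[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return left.length - right.length;
    }
}
```

With long shared prefixes, most of each comparison is skipped, which is where the seek speedup comes from.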

This is based on the delta encoding patch developed by Jacek Migdal during his
2011 summer internship at Facebook. The original patch is available here:
https://reviews.apache.org/r/2308/diff/.

Test Plan: Unit tests. Distributed load test on a five-node cluster.

Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

Reviewed By: Kannan

CC: tedyu, todd, mbautin, stack, Kannan, mcorgan, gqchen

Differential Revision: https://reviews.facebook.net/D447

mbautin : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java

[jira] [Commented] (HBASE-5278) HBase shell script refers to removed migrate functionality

2012-01-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193544#comment-13193544
 ] 

Hudson commented on HBASE-5278:
---

Integrated in HBase-TRUNK #2646 (See 
[https://builds.apache.org/job/HBase-TRUNK/2646/])
HBASE-5278 HBase shell script refers to removed 'migrate' functionality

stack : 
Files : 
* /hbase/trunk/bin/hbase


 HBase shell script refers to removed migrate functionality
 

 Key: HBASE-5278
 URL: https://issues.apache.org/jira/browse/HBASE-5278
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.0, 0.90.5, 0.92.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
Priority: Trivial
 Fix For: 0.94.0, 0.92.1

 Attachments: hbase-5278.patch


 $ hbase migrate
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org/apache/hadoop/hbase/util/Migrate
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hadoop.hbase.util.Migrate
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hadoop.hbase.util.Migrate. Program 
 will exit.
 The 'hbase' shell script has docs referring to a 'migrate' command which no 
 longer exists.





[jira] [Commented] (HBASE-5229) Explore building blocks for multi-row local transactions.

2012-01-25 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193551#comment-13193551
 ] 

Lars Hofhansl commented on HBASE-5229:
--

Is anybody interested in me exploring the split prefix idea described above?

Basically, a table would declare a prefix of N bytes, and during splitting we 
would make sure we don't split rows with the same prefix (which essentially 
just means that we calculate the midKey as we do now, and just take its first 
N bytes to perform the actual split; hence the actual split point would always 
be aligned with the prefixes).
That way we have defined a grouping of rows that could participate in local 
transactions.
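The proposal above can be sketched in a few lines (hypothetical names; the real change would live in the region split logic): compute the midKey as today, then truncate it to the declared N-byte prefix so rows sharing a prefix always stay in one region.

```java
import java.util.Arrays;

// Toy version of the proposed prefix-aligned split: truncate the computed
// midKey to the table's declared grouping-prefix length so the split point
// never falls inside a prefix group.
public class PrefixSplit {
    public static byte[] alignSplitPoint(byte[] midKey, int prefixLength) {
        if (midKey.length <= prefixLength) {
            return midKey;  // midKey already fits within the prefix
        }
        return Arrays.copyOf(midKey, prefixLength);
    }
}
```

All rows whose keys start with the truncated prefix then land on the same side of the split, giving the grouping needed for local transactions.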


 Explore building blocks for multi-row local transactions.
 ---

 Key: HBASE-5229
 URL: https://issues.apache.org/jira/browse/HBASE-5229
 Project: HBase
  Issue Type: New Feature
  Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0

 Attachments: 5229-seekto-v2.txt, 5229-seekto.txt, 5229.txt


 HBase should provide basic building blocks for multi-row local transactions. 
 Local means that we do this by co-locating the data. Global (cross region) 
 transactions are not discussed here.
 After a bit of discussion two solutions have emerged:
 1. Keep the row-key for determining grouping and location and allow efficient 
 intra-row scanning. A client application would then model tables as 
 HBase-rows.
 2. Define a prefix-length in HTableDescriptor that defines a grouping of 
 rows. Regions will then never be split inside a grouping prefix.
 #1 is true to the current storage paradigm of HBase.
 #2 is true to the current client side API.
 I will explore these two with sample patches here.
 
 Was:
 As discussed (at length) on the dev mailing list with the HBASE-3584 and 
 HBASE-5203 committed, supporting atomic cross row transactions within a 
 region becomes simple.
 I am aware of the hesitation about the usefulness of this feature, but we 
 have to start somewhere.
 Let's use this jira for discussion, I'll attach a patch (with tests) 
 momentarily to make this concrete.





[jira] [Updated] (HBASE-5186) Add metrics to ThriftServer

2012-01-25 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5186:
---

Attachment: HBASE-5186.D1461.3.patch

sc updated the revision HBASE-5186 [jira] Add metrics to ThriftServer.
Reviewers: dhruba, tedyu, JIRA

  Add TestCallQueue

REVISION DETAIL
  https://reviews.facebook.net/D1461

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/thrift/CallQueue.java
  src/main/java/org/apache/hadoop/hbase/thrift/HbaseHandlerMetricsProxy.java
  src/main/java/org/apache/hadoop/hbase/thrift/TBoundedThreadPoolServer.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftMetrics.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestCallQueue.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java


 Add metrics to ThriftServer
 ---

 Key: HBASE-5186
 URL: https://issues.apache.org/jira/browse/HBASE-5186
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5186.D1461.1.patch, HBASE-5186.D1461.2.patch, 
 HBASE-5186.D1461.3.patch


 It will be useful to have some metrics (queue length, waiting time, 
 processing time, ...) similar to the Hadoop RPC server. This allows us to 
 monitor system health and also provides a tool to diagnose problems where 
 thrift calls are slow.





[jira] [Commented] (HBASE-5230) Ensure compactions do not cache-on-write data blocks

2012-01-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193566#comment-13193566
 ] 

Hudson commented on HBASE-5230:
---

Integrated in HBase-TRUNK-security #90 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/90/])
HBASE-5230 : ensure that compactions do not cache-on-write data blocks

mbautin : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java


 Ensure compactions do not cache-on-write data blocks
 

 Key: HBASE-5230
 URL: https://issues.apache.org/jira/browse/HBASE-5230
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
 D1353.4.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch, 
 Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch


 Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
 write during compactions even if cache-on-write is generally enabled). This 
 is because we have very different implementations of HBASE-3976 without 
 HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) and with 
 CacheConfig (presumably it's there, but not sure if it even works, since the 
 patch in HBASE-3976 may not have been committed). We need to create a unit 
 test to verify that we don't cache data blocks on write during compactions, 
 and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2012-01-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193564#comment-13193564
 ] 

Hudson commented on HBASE-4218:
---

Integrated in HBase-TRUNK-security #90 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/90/])
[jira] [HBASE-4218] HFile data block encoding framework and delta encoding
implementation (Jacek Migdal, Mikhail Bautin)

Summary:

Adding a framework that allows encoding keys in an HFile data block. We
support two modes of encoding: (1) both on disk and in cache, and (2) in cache
only. This is distinct from the compression that is already done in HBase,
e.g. GZ or LZO. When data block encoding is enabled, we store blocks in the
cache in an uncompressed but encoded form. This allows fitting more blocks in
the cache and reduces the number of disk reads.

The most common example of data block encoding is delta encoding, where we take
advantage of the fact that HFile keys are sorted and share a lot of common
prefixes, and only store the delta between each pair of consecutive keys.
Initial encoding algorithms implemented are DIFF, FAST_DIFF, and PREFIX.

This is based on the delta encoding patch developed by Jacek Migdal during his
2011 summer internship at Facebook. The original patch is available here:
https://reviews.apache.org/r/2308/diff/.

Test Plan: Unit tests. Distributed load test on a five-node cluster.

Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

Reviewed By: Kannan

CC: tedyu, todd, mbautin, stack, Kannan, mcorgan, gqchen

Differential Revision: https://reviews.facebook.net/D447

mbautin : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HConstants.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
* 

[jira] [Updated] (HBASE-5266) Add documentation for ColumnRangeFilter

2012-01-25 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5266:
-

Attachment: 5266-v2.txt

How's this?

 Add documentation for ColumnRangeFilter
 ---

 Key: HBASE-5266
 URL: https://issues.apache.org/jira/browse/HBASE-5266
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.0

 Attachments: 5266-v2.txt, 5266.txt


 There are only a few lines of documentation for ColumnRangeFilter.
 Given the usefulness of this filter for efficient intra-row scanning (see 
 HBASE-5229 and HBASE-4256), we should make this filter more prominent in the 
 documentation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5186) Add metrics to ThriftServer

2012-01-25 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5186:
---

Attachment: HBASE-5186.D1461.4.patch

sc updated the revision HBASE-5186 [jira] Add metrics to ThriftServer.
Reviewers: dhruba, tedyu, JIRA, heyongqiang

  Remove some dead variables

REVISION DETAIL
  https://reviews.facebook.net/D1461

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java
  src/main/java/org/apache/hadoop/hbase/thrift/CallQueue.java
  src/main/java/org/apache/hadoop/hbase/thrift/HbaseHandlerMetricsProxy.java
  src/main/java/org/apache/hadoop/hbase/thrift/TBoundedThreadPoolServer.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftMetrics.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestCallQueue.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java


 Add metrics to ThriftServer
 ---

 Key: HBASE-5186
 URL: https://issues.apache.org/jira/browse/HBASE-5186
 Project: HBase
  Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
 Attachments: HBASE-5186.D1461.1.patch, HBASE-5186.D1461.2.patch, 
 HBASE-5186.D1461.3.patch, HBASE-5186.D1461.4.patch


 It will be useful to have some metrics (queue length, waiting time, 
 processing time ...) similar to the Hadoop RPC server. This allows us to monitor 
 system health and also provides a tool to diagnose problems where thrift calls 
 are slow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5179:
--

Fix Version/s: (was: 0.90.6)
   0.90.7

Updating the fix version to 0.90.7.  This issue will go into 0.90.7 after 
sufficient testing.

 Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
 region to be assigned before log splitting is completed, causing data loss
 

 Key: HBASE-5179
 URL: https://issues.apache.org/jira/browse/HBASE-5179
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.94.0, 0.90.7, 0.92.1

 Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
 Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
 hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
 hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch


 If the master's failover processing and ServerShutdownHandler's processing 
 happen concurrently, the following case may occur:
 1. The master completes splitLogAfterStartup().
 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
 3. The master starts rebuildUserRegions, and RegionserverA is considered a 
 dead server.
 4. The master starts to assign the regions of RegionserverA because it is a 
 dead server by step 3.
 However, while step 4 (assigning regions) runs, ServerShutdownHandler may still 
 be splitting the log. Therefore, it may cause data loss.
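The violated invariant can be sketched as a simple guard. This is a hypothetical illustration of the fix direction only; the class and method names are made up and are not the actual HBase master internals.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a dead server's regions must not be assigned until its
// WAL has been split, otherwise edits still sitting in the log are lost.
class SplitBeforeAssignSketch {
    private final Set<String> logsBeingSplit = ConcurrentHashMap.newKeySet();

    void startLogSplit(String deadServer)  { logsBeingSplit.add(deadServer); }
    void finishLogSplit(String deadServer) { logsBeingSplit.remove(deadServer); }

    // Step 4 above is only safe once the split started in step 2 has finished.
    boolean safeToAssign(String deadServer) {
        return !logsBeingSplit.contains(deadServer);
    }
}
```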

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5271:
--

Fix Version/s: (was: 0.90.6)
   0.90.7

Updated fix version to 0.90.7

 Result.getValue and Result.getColumnLatest return the wrong column.
 ---

 Key: HBASE-5271
 URL: https://issues.apache.org/jira/browse/HBASE-5271
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.5
Reporter: Ghais Issa
Assignee: Ghais Issa
 Fix For: 0.94.0, 0.90.7, 0.92.1

 Attachments: 5271-90.txt, 5271-v2.txt, 
 fixKeyValueMatchingColumn.diff, testGetValue.diff


 In the following example result.getValue returns the wrong column:
 KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
 Bytes.toBytes("2"), Bytes.toBytes(7L));
 Result result = new Result(new KeyValue[] { kv });
 System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
 Bytes.toBytes("2")))); // prints 7.
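A safe column comparison treats the family and the qualifier as separately delimited byte ranges. The sketch below shows that direction with illustrative names; it is not the actual KeyValue.matchingColumn implementation.

```java
import java.util.Arrays;

// Sketch: compare family and qualifier as two independent byte ranges instead
// of one concatenated sequence, so a (family "24", qualifier "2") cell can
// never satisfy a lookup for (family "2", qualifier "2").
class ColumnMatchSketch {
    static boolean matchingColumn(byte[] kvFamily, byte[] kvQualifier,
                                  byte[] family, byte[] qualifier) {
        return Arrays.equals(kvFamily, family) && Arrays.equals(kvQualifier, qualifier);
    }
}
```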

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HBASE-4893) HConnectionImplementation is closed but not deleted

2012-01-25 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-4893.
---

Resolution: Fixed
  Assignee: Mubarak Seyed

Resolving the issue

 HConnectionImplementation is closed but not deleted
 ---

 Key: HBASE-4893
 URL: https://issues.apache.org/jira/browse/HBASE-4893
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1
 Environment: Linux 2.6, HBase-0.90.1
Reporter: Mubarak Seyed
Assignee: Mubarak Seyed
  Labels: noob
 Fix For: 0.90.6

 Attachments: HBASE-4893.v1.patch, HBASE-4893.v2.patch


 In abort() of HConnectionManager$HConnectionImplementation, instance of 
 HConnectionImplementation is marked as this.closed=true.
 There is no way for client application to check the hbase client connection 
 whether it is still opened/good (this.closed=false) or not. We need a method 
 to validate the state of a connection like isClosed().
 {code}
 public boolean isClosed() {
   return this.closed;
 }
 {code}
 Once the connection is closed, it should get deleted. The client application 
 still gets a connection from HConnectionManager.getConnection(Configuration) 
 and tries to make an RPC call to the RS; since the connection is already closed, 
 HConnectionImplementation.getRegionServerWithRetries throws 
 RetriesExhaustedException with the error message
 {code}
 Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying 
 to contact region server null for region , row 
 '----xxx', but failed after 10 attempts.
 Exceptions:
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
 java.io.IOException: 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7eab48a7
  closed
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1008)
   at org.apache.hadoop.hbase.client.HTable.get(HTable.java:546)
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HBASE-5235) HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions.

2012-01-25 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5235.
---

Resolution: Fixed

Committed to 0.92, trunk and 0.90

 HLogSplitter writer thread's streams not getting closed when any of the 
 writer threads has exceptions.
 --

 Key: HBASE-5235
 URL: https://issues.apache.org/jira/browse/HBASE-5235
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5, 0.92.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: HBASE-5235_0.90.patch, HBASE-5235_0.90_1.patch, 
 HBASE-5235_0.90_2.patch, HBASE-5235_trunk.patch


 Please find the analysis below. Correct me if I am wrong.
 {code}
 2012-01-15 05:14:02,374 FATAL 
 org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-9 Got 
 while writing log entry to log
 java.io.IOException: All datanodes 10.18.40.200:50010 are bad. Aborting...
   at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3373)
   at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2811)
   at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3026)
 {code}
 Here we have an exception in one of the writer threads. If any exception 
 occurs, we try to hold it in an atomic variable 
 {code}
 private void writerThreadError(Throwable t) {
   thrown.compareAndSet(null, t);
 }
 {code}
 In the finally block of splitLog we try to close the streams.
 {code}
 for (WriterThread t : writerThreads) {
   try {
     t.join();
   } catch (InterruptedException ie) {
     throw new IOException(ie);
   }
   checkForErrors();
 }
 LOG.info("Split writers finished");

 return closeStreams();
 {code}
 Inside checkForErrors:
 {code}
 private void checkForErrors() throws IOException {
   Throwable thrown = this.thrown.get();
   if (thrown == null) return;
   if (thrown instanceof IOException) {
     throw (IOException) thrown;
   } else {
     throw new RuntimeException(thrown);
   }
 }
 {code}
 So once we throw the exception, the DFSStreamer threads are not getting closed.
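One fix direction, sketched with illustrative names (not the committed patch): close every stream from a finally path so an exception surfaced by checkForErrors() cannot skip the close calls, and rethrow the first close failure afterwards.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

// Sketch: close all writer streams even if some close() calls fail, then
// rethrow the first failure. Running this from a finally block guarantees the
// DFS output streams are released even when a writer thread had an error.
class CloseStreamsSketch {
    static void closeAll(List<? extends Closeable> streams) throws IOException {
        IOException first = null;
        for (Closeable c : streams) {
            try {
                c.close();
            } catch (IOException e) {
                if (first == null) first = e; // remember, but keep closing the rest
            }
        }
        if (first != null) throw first;
    }
}
```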

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5243) LogSyncerThread not getting shutdown waiting for the interrupted flag

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5243:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to 0.92, trunk and 0.90

 LogSyncerThread not getting shutdown waiting for the interrupted flag
 -

 Key: HBASE-5243
 URL: https://issues.apache.org/jira/browse/HBASE-5243
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: 5243-92.addendum, HBASE-5243_0.90.patch, 
 HBASE-5243_0.90_1.patch, HBASE-5243_trunk.patch


 In the LogSyncer run() we keep looping till the this.isInterrupted flag is set.
 But in some cases the DFSClient is consuming the InterruptedException, so we
 run into an infinite loop in some shutdown cases.
 Since we are the ones who try to close down the LogSyncerThread, I would
 suggest introducing a close or shutdown flag; based on the state of this flag,
 along with isInterrupted(), we can make the thread stop.
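The suggested flag can be sketched as follows (illustrative names, not the actual HLog.LogSyncer code): a volatile shutdown flag paired with the interrupt, so the loop terminates even if a lower layer swallows the InterruptedException.

```java
// Sketch: the close request sets a volatile flag AND interrupts the thread.
// Even if DFSClient (or any callee) consumes the InterruptedException, the
// 'closing' flag still terminates the loop on the next iteration.
class SyncerLoopSketch extends Thread {
    private volatile boolean closing = false;

    void requestClose() {
        closing = true;
        interrupt();
    }

    @Override
    public void run() {
        while (!closing && !isInterrupted()) {
            try {
                Thread.sleep(10); // stand-in for the periodic sync work
            } catch (InterruptedException e) {
                // Swallowed on purpose to mimic the problem; 'closing'
                // still stops the loop.
            }
        }
    }
}
```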

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5237) Addendum for HBASE-5160 and HBASE-4397

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5237:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to 0.90, trunk and 0.92

 Addendum for HBASE-5160 and HBASE-4397
 --

 Key: HBASE-5237
 URL: https://issues.apache.org/jira/browse/HBASE-5237
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6, 0.92.1

 Attachments: HBASE-5237_0.90.patch, HBASE-5237_trunk.patch


 As part of HBASE-4397 there is one more scenario where the patch has to be 
 applied.
 {code}
 RegionPlan plan = getRegionPlan(state, forceNewPlan);
   if (plan == null) {
 debugLog(state.getRegion(),
     "Unable to determine a plan to assign " + state);
 return; // Should get reassigned later when RIT times out.
   }
 {code}
 I think in this scenario the following should also be done:
 {code}
 this.timeoutMonitor.setAllRegionServersOffline(true);
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4951) master process can not be stopped when it is initializing

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4951:
--

Fix Version/s: (was: 0.90.6)
   0.90.7

Updated the fix version to 0.90.7.

 master process can not be stopped when it is initializing
 -

 Key: HBASE-4951
 URL: https://issues.apache.org/jira/browse/HBASE-4951
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.3
Reporter: xufeng
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.90.7

 Attachments: HBASE-4951.patch, HBASE-4951_branch.patch


 It is easy to reproduce with the following steps:
 step 1: Start the master process (do not start a regionserver process in the 
 cluster). The master will wait for regionservers to check in:
 org.apache.hadoop.hbase.master.ServerManager: Waiting on regionserver(s) to 
 checkin
 step 2: Stop the master with the shell command bin/hbase master stop.
 result: the master process will never die because the catalogTracker.waitForRoot() 
 method blocks until the root region is assigned.
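A shutdown-aware wait illustrates the fix direction; the names below are hypothetical, not the actual CatalogTracker API.

```java
import java.util.function.BooleanSupplier;

// Sketch: instead of blocking indefinitely on the root region assignment, the
// startup wait polls and re-checks a stop flag, so "bin/hbase master stop"
// can take effect while the master is still initializing.
class StoppableWaitSketch {
    interface Stoppable { boolean isStopped(); }

    /** Returns true if the condition became true, false if stopped first. */
    static boolean waitFor(BooleanSupplier condition, Stoppable stopper,
                           long pollMillis) throws InterruptedException {
        while (!condition.getAsBoolean()) {
            if (stopper.isStopped()) return false;
            Thread.sleep(pollMillis);
        }
        return true;
    }
}
```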

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3855) Performance degradation of memstore because reseek is linear

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3855:
--

Fix Version/s: (was: 0.90.6)
   0.90.7

Not fixed in 0.90.6.  Hence moving it to 0.90.7.

 Performance degradation of memstore because reseek is linear
 

 Key: HBASE-3855
 URL: https://issues.apache.org/jira/browse/HBASE-3855
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Priority: Blocker
 Fix For: 0.90.7

 Attachments: memstoreReseek.txt, memstoreReseek2.txt


 The scanner uses reseek to find the next row (or next column) as part of a 
 scan. The reseek code iterates over a Set to position itself at the right 
 place. If there are many thousands of kvs that need to be skipped over, the 
 time cost is very high. In this case, a seek would be far less costly than a 
 reseek.
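The cost difference can be illustrated with a skip list, which is what the memstore uses internally; the code below is a sketch with made-up names, not the memstore scanner implementation.

```java
import java.util.Iterator;
import java.util.NavigableSet;

// Sketch: a "reseek" that walks the existing iterator forward costs O(k) in
// the number of skipped entries, while a fresh "seek" via tailSet() is an
// O(log n) skip-list lookup, independent of how many entries are skipped.
class ReseekSketch {
    /** Linear reseek: advance until we reach target; returns entries skipped. */
    static int linearReseek(Iterator<String> it, String target) {
        int skipped = 0;
        while (it.hasNext()) {
            if (it.next().compareTo(target) >= 0) break;
            skipped++;
        }
        return skipped;
    }

    /** Seek: jump straight to the first entry >= target. */
    static String seek(NavigableSet<String> memstore, String target) {
        NavigableSet<String> tail = memstore.tailSet(target, true);
        return tail.isEmpty() ? null : tail.first();
    }
}
```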

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5003) If the master is started with a wrong root dir, it gets stuck and can't be killed

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5003:
--

Fix Version/s: (was: 0.90.6)
   0.90.7

Updated fix version to 0.90.7

 If the master is started with a wrong root dir, it gets stuck and can't be 
 killed
 -

 Key: HBASE-5003
 URL: https://issues.apache.org/jira/browse/HBASE-5003
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Priority: Critical
  Labels: noob
 Fix For: 0.94.0, 0.90.7, 0.92.1


 Reported by a new user on IRC who tried to set hbase.rootdir to 
 file:///~/hbase; the master gets stuck and cannot be killed. I tried 
 something similar on my machine and it spins while logging:
 {quote}
 2011-12-09 16:11:17,002 WARN org.apache.hadoop.hbase.util.FSUtils: Unable to 
 create version file at file:/bin/hbase, retrying: Mkdirs failed to create 
 file:/bin/hbase
 2011-12-09 16:11:27,002 WARN org.apache.hadoop.hbase.util.FSUtils: Unable to 
 create version file at file:/bin/hbase, retrying: Mkdirs failed to create 
 file:/bin/hbase
 2011-12-09 16:11:37,003 WARN org.apache.hadoop.hbase.util.FSUtils: Unable to 
 create version file at file:/bin/hbase, retrying: Mkdirs failed to create 
 file:/bin/hbase
 {quote}
 The reason it cannot be stopped is that the master's main thread is stuck in 
 there and will never be notified:
 {quote}
 Master:0;su-jdcryans-01.local,51116,1323475535684 prio=5 tid=7f92b7a3c000 
 nid=0x1137ba000 waiting on condition [1137b9000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
   at java.lang.Thread.sleep(Native Method)
   at org.apache.hadoop.hbase.util.FSUtils.setVersion(FSUtils.java:297)
   at org.apache.hadoop.hbase.util.FSUtils.setVersion(FSUtils.java:268)
   at 
 org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:339)
   at 
 org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:128)
   at 
 org.apache.hadoop.hbase.master.MasterFileSystem.init(MasterFileSystem.java:113)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:435)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:314)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.run(HMasterCommandLine.java:218)
   at java.lang.Thread.run(Thread.java:680)
 {quote}
 It seems we should handle the exceptions we get in there better, and die if 
 we need to. It would make for a better user experience.
 Maybe also check hbase.rootdir before even starting the master.
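The "die if we need to" suggestion amounts to bounding the retries; a hypothetical sketch (not the actual FSUtils.setVersion code):

```java
// Sketch: bound the retry loop and abort startup with a clear error instead
// of retrying Mkdirs forever. The Attempt interface is illustrative.
class BoundedRetrySketch {
    interface Attempt { boolean tryOnce(); }

    /** Returns true on success; throws after maxRetries failed attempts. */
    static boolean retryOrDie(Attempt attempt, int maxRetries) {
        for (int i = 0; i < maxRetries; i++) {
            if (attempt.tryOnce()) return true;
        }
        throw new IllegalStateException(
            "Unable to create version file after " + maxRetries
            + " attempts; aborting master startup");
    }
}
```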

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3834) Store ignores checksum errors when opening files

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3834:
--


Moving to 0.90.7

 Store ignores checksum errors when opening files
 

 Key: HBASE-3834
 URL: https://issues.apache.org/jira/browse/HBASE-3834
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.2
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.90.6


 If you corrupt one of the storefiles in a region (eg using vim to muck up 
 some bytes), the region will still open, but that storefile will just be 
 ignored with a log message. We should probably not do this in general - 
 better to keep that region unassigned and force an admin to make a decision 
 to remove the bad storefile.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4470:
--

Fix Version/s: (was: 0.90.6)
   0.90.7

Moving into 0.90.7

 ServerNotRunningException coming out of assignRootAndMeta kills the Master
 --

 Key: HBASE-4470
 URL: https://issues.apache.org/jira/browse/HBASE-4470
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.90.7


 I'm surprised we still have issues like that and I didn't get a hit while 
 googling so forgive me if there's already a jira about it.
 When the master starts it verifies the locations of root and meta before 
 assigning them, if the server is started but not running you'll get this:
 {quote}
 2011-09-23 04:47:44,859 WARN 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
 RemoteException connecting to RS
 org.apache.hadoop.ipc.RemoteException: 
 org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running 
 yet
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
 at 
 org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
 at $Proxy6.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
 at 
 org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969)
 at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388)
 at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287)
 at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484)
 at 
 org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441)
 at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388)
 at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)
 {quote}
 I hit that 3-4 times this week while debugging something else. The worst is 
 that when you restart the master it sees that as a failover, but none of the 
 regions are assigned so it takes an eternity to get back fully online.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4298) Support to drain RS nodes through ZK

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4298:
--

Fix Version/s: (was: 0.90.6)
   0.90.7

Moving into 0.90.7

 Support to drain RS nodes through ZK
 

 Key: HBASE-4298
 URL: https://issues.apache.org/jira/browse/HBASE-4298
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
 Environment: all
Reporter: Aravind Gottipati
Priority: Critical
  Labels: patch
 Fix For: 0.90.7

 Attachments: 4298-trunk-v2.txt, 4298-trunk-v3.txt, 90_hbase.patch, 
 drainingservertest-v2.txt, drainingservertest.txt, trunk_hbase.patch, 
 trunk_with_test.txt


 HDFS currently has a way to exclude certain datanodes and prevent them from 
 getting new blocks.  HDFS goes one step further and even drains these nodes 
 for you.  This enhancement is a step in that direction.
 The idea is that we mark nodes in zookeeper as draining nodes.  This means 
 that they don't get any more new regions.  These draining nodes look exactly 
 the same as the corresponding nodes in /rs, except they live under /draining.
 Eventually, support for draining them can be added.  I am submitting two 
 patches for review - one for the 0.90 branch and one for trunk (in git).
 Here are the two patches
 0.90 - 
 https://github.com/aravind/hbase/commit/181041e72e7ffe6a4da6d82b431ef7f8c99e62d2
 trunk - 
 https://github.com/aravind/hbase/commit/e127b25ae3b4034103b185d8380f3b7267bc67d5
 I have tested both these patches and they work as advertised.
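The effect of a /draining znode on assignment can be sketched as a filter over the live-server list; the names below are hypothetical, not the actual patch.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Sketch: servers listed under the /draining znode are excluded from the
// candidates for new region assignments, while they keep serving the regions
// they already host.
class DrainingFilterSketch {
    static List<String> assignableServers(List<String> liveServers,
                                          Set<String> draining) {
        return liveServers.stream()
                .filter(s -> !draining.contains(s))
                .collect(Collectors.toList());
    }
}
```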

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4288) Server not running exception during meta verification causes RS abort

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4288:
--

Fix Version/s: (was: 0.90.6)
   0.90.7

Moving into 0.90.7

 Server not running exception during meta verification causes RS abort
 ---

 Key: HBASE-4288
 URL: https://issues.apache.org/jira/browse/HBASE-4288
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.4
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.90.7

 Attachments: 4288-v2.txt, 4288.txt


 The master tried to verify the META location just as that server was shutting 
 down due to an abort. This caused the Server not running exception to get 
 thrown, which wasn't handled properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4298) Support to drain RS nodes through ZK

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4298:
--

Fix Version/s: 0.92.0

@Stack
This issue has gone into 0.92 and trunk. Since it is a new feature, do you want 
it to go into future 0.90 releases? If not, we can remove 0.90 from the fix versions.

 Support to drain RS nodes through ZK
 

 Key: HBASE-4298
 URL: https://issues.apache.org/jira/browse/HBASE-4298
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.90.4
 Environment: all
Reporter: Aravind Gottipati
Priority: Critical
  Labels: patch
 Fix For: 0.90.7, 0.92.0

 Attachments: 4298-trunk-v2.txt, 4298-trunk-v3.txt, 90_hbase.patch, 
 drainingservertest-v2.txt, drainingservertest.txt, trunk_hbase.patch, 
 trunk_with_test.txt


 HDFS currently has a way to exclude certain datanodes and prevent them from 
 getting new blocks.  HDFS goes one step further and even drains these nodes 
 for you.  This enhancement is a step in that direction.
 The idea is that we mark nodes in zookeeper as draining nodes.  This means 
 that they don't get any more new regions.  These draining nodes look exactly 
 the same as the corresponding nodes in /rs, except they live under /draining.
 Eventually, support for draining them can be added.  I am submitting two 
 patches for review - one for the 0.90 branch and one for trunk (in git).
 Here are the two patches:
 0.90 - 
 https://github.com/aravind/hbase/commit/181041e72e7ffe6a4da6d82b431ef7f8c99e62d2
 trunk - 
 https://github.com/aravind/hbase/commit/e127b25ae3b4034103b185d8380f3b7267bc67d5
 I have tested both these patches and they work as advertised.
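The /draining mechanism described above can be sketched as a toy model (plain Python, not HBase's actual Java code; the function name `choose_assignment_targets` is hypothetical, invented here for illustration):

```python
# Toy model of the /draining znode idea: servers listed under /draining
# mirror the entries under /rs but are excluded from new region assignment.

def choose_assignment_targets(rs_nodes, draining_nodes):
    """Return servers eligible for new regions: online but not draining."""
    draining = set(draining_nodes)
    return [s for s in rs_nodes if s not in draining]

# /rs lists all live regionservers; /draining lists the ones being drained.
rs = ["host1,60020,1", "host2,60020,1", "host3,60020,1"]
draining = ["host2,60020,1"]

eligible = choose_assignment_targets(rs, draining)
# host2 keeps its existing regions but receives no new ones.
```

In the real patch the two lists come from ZooKeeper children of /rs and /draining; the filtering step is the essence of the feature.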





[jira] [Updated] (HBASE-4288) Server not running exception during meta verification causes RS abort

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4288:
--

Fix Version/s: 0.92.0

Already committed to 0.92 and trunk, but not to 0.90. Hence not resolving, 
just updating the fix version.

 Server not running exception during meta verification causes RS abort
 ---

 Key: HBASE-4288
 URL: https://issues.apache.org/jira/browse/HBASE-4288
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.4
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.90.7, 0.92.0

 Attachments: 4288-v2.txt, 4288.txt


 The master tried to verify the META location just as that server was shutting 
 down due to an abort. This caused the Server not running exception to get 
 thrown, which wasn't handled properly in the master, causing it to abort.





[jira] [Updated] (HBASE-4550) When the master passes the regionserver a different address, the regionserver doesn't create a new zookeeper znode, and as a result stop-hbase.sh hangs

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4550:
--

Fix Version/s: (was: 0.90.6)
   0.90.7

@Wanbin
If you provide a new patch, it can be integrated into 0.90.7.
Moving to 0.90.7.

 When the master passes the regionserver a different address, the regionserver 
 doesn't create a new zookeeper znode, and as a result stop-hbase.sh hangs
 ---

 Key: HBASE-4550
 URL: https://issues.apache.org/jira/browse/HBASE-4550
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.3
Reporter: wanbin
Assignee: wanbin
 Fix For: 0.90.7

 Attachments: hbase-0.90.3.patch, patch, patch.txt

   Original Estimate: 2h
  Remaining Estimate: 2h

 When the master passes the regionserver a different address, the regionserver 
 does not create a new zookeeper znode; the master stores the new address in 
 ServerManager. When stop-hbase.sh is called, RegionServerTracker.nodeDeleted 
 receives the path for the old address, so serverManager.expireServer is never 
 called and stop-hbase.sh hangs.
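The mismatch can be sketched as a toy model (plain Python, not HBase's actual Java code; `handle_node_deleted` is a hypothetical stand-in for RegionServerTracker.nodeDeleted):

```python
# Toy model of the HBASE-4550 hang: the master tracks the server under a
# rewritten address, but the regionserver's znode still carries the old
# one, so the delete notification never matches a tracked server.

def handle_node_deleted(online_servers, deleted_znode_path):
    """Expire the server only when the znode path matches a tracked name."""
    if deleted_znode_path in online_servers:
        online_servers.remove(deleted_znode_path)   # expireServer equivalent
    return online_servers

# Master recorded the server under the NEW address...
online = {"host1,60020,NEW"}
# ...but the znode was created with (and deleted under) the OLD address.
online = handle_node_deleted(online, "host1,60020,OLD")
# The server is still "online", so shutdown waits on it forever.
```

The fix direction implied by the report is to make the regionserver register its znode under the address the master actually hands back, so the two names agree.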





[jira] [Updated] (HBASE-3134) [replication] Add the ability to enable/disable streams

2012-01-25 Thread Teruyoshi Zenmyo (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teruyoshi Zenmyo updated HBASE-3134:


Attachment: HBASE-3134.patch

The patch introduces a znode which indicates that replication to a peer is 
disabled. ReplicationSource skips sending entries if the znode exists.
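That check can be sketched as a toy model (plain Python, not the patch's Java code; `entries_to_ship` and `peer_disabled` are hypothetical names, with the boolean standing in for a ZooKeeper exists() check on the disable znode):

```python
# Toy model of the patch's idea: ReplicationSource consults a per-peer
# "disabled" znode before shipping WAL entries to that peer.

def entries_to_ship(entries, peer_disabled):
    """Ship nothing while the peer's disable znode exists."""
    return [] if peer_disabled else list(entries)

# Stream disabled: entries are skipped, not shipped.
assert entries_to_ship(["edit1", "edit2"], peer_disabled=True) == []
```

Deleting the znode re-enables the stream on the next pass, which is what makes the enable/disable toggle deterministic.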

 [replication] Add the ability to enable/disable streams
 ---

 Key: HBASE-3134
 URL: https://issues.apache.org/jira/browse/HBASE-3134
 Project: HBase
  Issue Type: New Feature
  Components: replication
Reporter: Jean-Daniel Cryans
Priority: Minor
 Attachments: HBASE-3134.patch


 This jira was initially in the scope of HBASE-2201, but was pushed out since 
 it has low value compared to the required effort (and we wanted to ship 
 0.90.0 rather soonish).
 We need to design a way to enable/disable replication streams in a 
 determinate fashion.





[jira] [Updated] (HBASE-4848) TestScanner failing because hostname can't be null

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4848:
--

Fix Version/s: (was: 0.90.6)
   0.90.5
   0.92.0

Already committed; hence closing the issue as resolved.

 TestScanner failing because hostname can't be null
 --

 Key: HBASE-4848
 URL: https://issues.apache.org/jira/browse/HBASE-4848
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: stack
Assignee: stack
 Fix For: 0.90.5, 0.92.0

 Attachments: 4848-092.txt, 4848.txt



