[jira] [Commented] (HBASE-9086) Add some options to improve count performance
[ https://issues.apache.org/jira/browse/HBASE-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726081#comment-13726081 ]

Cheney Sun commented on HBASE-9086:
-----------------------------------

I have attached a patch to enhance the shell count command.

Add some options to improve count performance
---------------------------------------------
Key: HBASE-9086
URL: https://issues.apache.org/jira/browse/HBASE-9086
Project: HBase
Issue Type: Wish
Components: shell
Affects Versions: 0.94.2
Reporter: Cheney Sun
Attachments: HBase-9086.patch

The current count command in the HBase shell is quite slow when rows are very large (100+kB each). It would be helpful to provide an option to specify the column to count, which would give the user a chance to reduce the data volume scanned. IMHO, counting only the row keys would be the ideal solution. Not sure how difficult it is to implement.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9086) Add some options to improve count performance
[ https://issues.apache.org/jira/browse/HBASE-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheney Sun updated HBASE-9086:
------------------------------
Attachment: HBase-9086.patch
[jira] [Commented] (HBASE-9086) Add some options to improve count performance
[ https://issues.apache.org/jira/browse/HBASE-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726082#comment-13726082 ]

Cheney Sun commented on HBASE-9086:
-----------------------------------

Jean-Marc, can you review it? Thanks.
[jira] [Updated] (HBASE-8224) Add '-hadoop1' or '-hadoop2' to our version string
[ https://issues.apache.org/jira/browse/HBASE-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8224:
-------------------------
Attachment: 8224.gen.script.txt

This patch includes the latest from over on HBASE-8488, so ignore the changes to do w/ dependency inclusion and the purge of slf4j for now (I'll fix up the patch later).

The patch adds a script into dev-support called generate-hadoopX-poms.sh. Run the script to generate a hadoop1 or hadoop2 pom from the original pom; the new pom shows up beside the original. Next, build, telling mvn to use this new pom. This seems to generate artifacts that are w/o pollution: i.e. no need for a downstreamer to add a -Dhadoop.profile=2.0, etc.

I used this script to deploy new hbase-hadoop1 and hbase-hadoop2 snapshots up at https://repository.apache.org/content/repositories/snapshots/org/apache/hbase/hbase/ ([~brocknoland] -- have you had a chance to take a look at them?).

Here is roughly what you do to build artifacts to publish:

$ bash -x ./dev-support/generate-hadoopX-poms.sh 0.95.2-SNAPSHOT 0.95.2-hadoop1-SNAPSHOT
$ mvn clean install deploy -DskipTests -Pgpg -f pom.xml.hadoop1

The head of the script has more on how it works. I'll write it up better in the manual after I've played some more (need to look at assemblies and at maven release).

Add '-hadoop1' or '-hadoop2' to our version string
--------------------------------------------------
Key: HBASE-8224
URL: https://issues.apache.org/jira/browse/HBASE-8224
Project: HBase
Issue Type: Task
Reporter: stack
Assignee: stack
Priority: Blocker
Fix For: 0.95.2
Attachments: 8224-adding.classifiers.txt, 8224.gen.script.txt, hbase-8224-proto1.patch

So we can publish both the hadoop1 and the hadoop2 jars to a maven repository, and so we can publish two packages, one for hadoop1 and one for hadoop2, given how maven works, our only alternative (to the best of my knowledge and after consulting others) is to amend the version string to include hadoop1 or hadoop2.
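[Editor's note] The script itself is not included in this thread, but its core transform can be sketched. This is a hypothetical illustration only; the real generate-hadoopX-poms.sh may work quite differently. It assumes the essential step is substituting a hadoop-suffixed version string into each module's pom and writing the result to a sibling file such as pom.xml.hadoop1:

```python
# Hypothetical sketch (NOT the actual generate-hadoopX-poms.sh) of the
# version rewrite such a script performs: replace the current project
# version with a hadoop-suffixed one, producing a pom that mvn can be
# pointed at with -f pom.xml.hadoop1.

def rewrite_pom(pom_text: str, old_version: str, new_version: str) -> str:
    """Return pom content with the project version string replaced."""
    return pom_text.replace(f"<version>{old_version}</version>",
                            f"<version>{new_version}</version>")

pom = "<project><version>0.95.2-SNAPSHOT</version></project>"
hadoop1_pom = rewrite_pom(pom, "0.95.2-SNAPSHOT", "0.95.2-hadoop1-SNAPSHOT")
```

With a rewrite like this, downstream consumers simply depend on the 0.95.2-hadoop1-SNAPSHOT coordinate and never need to set a hadoop profile themselves.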
[jira] [Updated] (HBASE-8741) Mutations on Regions in recovery mode might have same sequenceIDs
[ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Himanshu Vashishtha updated HBASE-8741:
---------------------------------------
Attachment: HBASE-8741-v4.patch

Patch that contains the latest review comments. Attaching here for the qabot. Ran jenkins + mvn test -Dtest=TestHLog* multiple times. Also ran a patched version locally.

Mutations on Regions in recovery mode might have same sequenceIDs
-----------------------------------------------------------------
Key: HBASE-8741
URL: https://issues.apache.org/jira/browse/HBASE-8741
Project: HBase
Issue Type: Bug
Components: MTTR
Affects Versions: 0.95.1
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Attachments: HBASE-8741-v0.patch, HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4.patch

Currently, when opening a region, we find the maximum sequence ID from all its HFiles and then set the LogSequenceId of the log (in case the latter is at a smaller value). This works well in the recovered.edits case, as we are not writing to the region until we have replayed all of its previous edits. With distributed log replay, if we want to enable writes while a region is under recovery, we need to make sure that the logSequenceId > maximum logSequenceId of the old regionserver. Otherwise, we might have a situation where new edits have the same (or smaller) sequenceIds. If we store region-level information in the WALTrailer, then this scenario could be avoided by: a) reading the trailer of the last completed file, i.e., the last wal file which has a trailer, and b) completely reading the last wal file (this file would not have the trailer, so it needs to be read completely). In future, if we switch to multiple wal files, we could read the trailer of all completed WAL files and read the remaining incomplete files completely.
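[Editor's note] The collision the issue describes can be shown with a toy model (plain Python, not HBase code): if a recovering region hands out new sequence IDs from its own counter rather than from above the old server's maximum, replayed edits and new writes can carry identical IDs.

```python
# Toy model of the sequence-ID collision described in HBASE-8741.
# Not HBase code; it only illustrates why new writes during recovery
# must start above the old regionserver's maximum logSequenceId.

def next_ids(start, n):
    """IDs a server would assign to n consecutive edits."""
    return list(range(start, start + n))

old_server_ids = next_ids(100, 5)   # edits 100..104 still pending replay

# Naive: the recovering server starts from its own (stale) counter.
naive_new_ids = next_ids(100, 3)    # reuses 100..102 -> collides with replay

# Safe: start strictly above the old server's maximum sequence ID.
safe_new_ids = next_ids(max(old_server_ids) + 1, 3)  # 105..107, no overlap
```

The overlap in the naive case is exactly the "same (or smaller) sequenceIds" situation the issue warns about.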
[jira] [Commented] (HBASE-9106) Do not fail TestAcidGuarantees for exceptions on table flush
[ https://issues.apache.org/jira/browse/HBASE-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726097#comment-13726097 ]

stack commented on HBASE-9106:
------------------------------

+1

Do not fail TestAcidGuarantees for exceptions on table flush
------------------------------------------------------------
Key: HBASE-9106
URL: https://issues.apache.org/jira/browse/HBASE-9106
Project: HBase
Issue Type: Test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.98.0, 0.95.2, 0.94.11
Attachments: hbase-9106-0.94_v1.patch, hbase-9106_v1.patch

TestAcidGuarantees failed in one run due to a flush taking longer than 60 sec, with:
{code}
HBaseClient$CallTimeoutException: Call id=152, waitTime=60007, rpcTimetout=6
{code}
We should ignore the exceptions coming from table flushes, since they are not essential to the test.
[jira] [Commented] (HBASE-8960) TestDistributedLogSplitting.testLogReplayForDisablingTable fails sometimes
[ https://issues.apache.org/jira/browse/HBASE-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726103#comment-13726103 ]

stack commented on HBASE-8960:
------------------------------

[~jeffreyz] This TDLS seems to be the only one that hangs on occasion now. Let's see if I can get more info on the fails. BTW, where is that fancy jenkins test checker of yours! We need it here!

TestDistributedLogSplitting.testLogReplayForDisablingTable fails sometimes
--------------------------------------------------------------------------
Key: HBASE-8960
URL: https://issues.apache.org/jira/browse/HBASE-8960
Project: HBase
Issue Type: Task
Components: test
Reporter: Jimmy Xiang
Assignee: Jeffrey Zhong
Priority: Minor
Fix For: 0.95.2
Attachments: hbase-8960-addendum-2.patch, hbase-8960-addendum.patch, hbase-8960.patch

http://54.241.6.143/job/HBase-0.95-Hadoop-2/org.apache.hbase$hbase-server/634/testReport/junit/org.apache.hadoop.hbase.master/TestDistributedLogSplitting/testLogReplayForDisablingTable/

{noformat}
java.lang.AssertionError: expected:1000 but was:0
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:743)
	at org.junit.Assert.assertEquals(Assert.java:118)
	at org.junit.Assert.assertEquals(Assert.java:555)
	at org.junit.Assert.assertEquals(Assert.java:542)
	at org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testLogReplayForDisablingTable(TestDistributedLogSplitting.java:797)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{noformat}
[jira] [Commented] (HBASE-8224) Add '-hadoop1' or '-hadoop2' to our version string
[ https://issues.apache.org/jira/browse/HBASE-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726115#comment-13726115 ]

stack commented on HBASE-8224:
------------------------------

I updated my downstreamer w/ notes on dependencies. Some should be coming in transitively, and actually are if I poke w/ dependency:tree (e.g. hbase-hadoop-compat), but building there is a strange metrics fail if I don't include them explicitly. The poms have notes on the listed dependencies. See https://github.com/saintstack/hbase-downstreamer
[jira] [Commented] (HBASE-9086) Add some options to improve count performance
[ https://issues.apache.org/jira/browse/HBASE-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726154#comment-13726154 ]

Lars Hofhansl commented on HBASE-9086:
--------------------------------------

Patch looks good. Some indentation issues; can be fixed upon commit. Nit:
{code}
+ filter = org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter.new
+ if key_only == 'yes'
+   filter = org.apache.hadoop.hbase.filter.KeyOnlyFilter.new
+ else
{code}
Could move creation of the FirstKeyOnlyFilter into the else branch.
[jira] [Commented] (HBASE-9086) Add some options to improve count performance
[ https://issues.apache.org/jira/browse/HBASE-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726161#comment-13726161 ]

Lars Hofhansl commented on HBASE-9086:
--------------------------------------

Wait, is that doing what you expect? The KeyOnlyFilter will emit a KV for each KV it encounters, whereas the FirstKeyOnly filter will only emit a KV once for each row. So if all your rows had 3 columns, you'd get 3x the row count.
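[Editor's note] Lars's distinction can be illustrated with a small simulation (plain Python, not the HBase filter API): KeyOnlyFilter keeps every cell (it only strips the values), so counting emitted cells counts rows times columns, while FirstKeyOnlyFilter emits just the first cell of each row, so the count equals the true row count.

```python
# Simulation of the KeyOnlyFilter vs FirstKeyOnlyFilter counting
# behavior discussed above (plain Python, not HBase code).
# Model: a table mapping row key -> list of cells (column names).

rows = {f"row{i}": ["cf:a", "cf:b", "cf:c"] for i in range(10)}  # 3 cols/row

# KeyOnlyFilter-like count: one emitted KV per cell -> rows * columns.
key_only_count = sum(len(cells) for cells in rows.values())

# FirstKeyOnlyFilter-like count: one emitted KV per row -> true row count.
first_key_only_count = sum(1 for _ in rows)
```

With 10 rows of 3 columns each, the KeyOnly-style count is 30, exactly the 3x overcount Lars warns about, while the FirstKeyOnly-style count is the correct 10.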
[jira] [Commented] (HBASE-8741) Mutations on Regions in recovery mode might have same sequenceIDs
[ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726177#comment-13726177 ]

Hadoop QA commented on HBASE-8741:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12595359/HBASE-8741-v4.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 36 new or modified tests.

{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestAdmin

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6553//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6553//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6553//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6553//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6553//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6553//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6553//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6553//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6553//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6553//console

This message is automatically generated.
[jira] [Commented] (HBASE-8768) Improve bulk load performance by moving key value construction from map phase to reduce phase.
[ https://issues.apache.org/jira/browse/HBASE-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726189#comment-13726189 ]

Hudson commented on HBASE-8768:
-------------------------------

SUCCESS: Integrated in hbase-0.95 #392 (See [https://builds.apache.org/job/hbase-0.95/392/])
HBASE-8768 Improve bulk load performance by moving key value construction from map phase to reduce phase (Rajshbabu) (tedyu: rev 1509079)
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TextSortReducer.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterTextMapper.java
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsvParser.java

Improve bulk load performance by moving key value construction from map phase to reduce phase.
----------------------------------------------------------------------------------------------
Key: HBASE-8768
URL: https://issues.apache.org/jira/browse/HBASE-8768
Project: HBase
Issue Type: Improvement
Components: mapreduce, Performance
Reporter: rajeshbabu
Assignee: rajeshbabu
Fix For: 0.98.0, 0.95.2
Attachments: HBASE-8768_v2.patch, HBASE-8768_v3.patch, HBASE-8768_v4.patch, HBase_Bulkload_Performance_Improvement.pdf

ImportTSV bulk loading uses the MapReduce framework. The existing mapper and reducer classes used by ImportTSV are TsvImporterMapper.java and PutSortReducer.java. The ImportTSV tool parses the tab-separated (by default) values from the input files, and the mapper class generates the Put objects for each row using the key-value pairs created from the parsed text. PutSortReducer then uses the partitions based on the regions and sorts the Put objects for each region.

Overheads we can see in the above approach:

1) Key-value construction for each parsed value in the line adds extra data, like the rowkey, column family, and qualifier, which adds around 5x extra data to be shuffled in the reduce phase. We can calculate the data size to be shuffled as below:
{code}
Data to be shuffled = nl*nt*(rl+cfl+cql+vall+tsl+30)
{code}
If we move key-value construction to the reduce phase, the data size to be shuffled will be as below, which is much smaller than the above:
{code}
Data to be shuffled = nl*nt*vall
{code}
nl - number of lines in the raw file
nt - number of tabs or columns, including the row key
rl - row key length, which will be different for each line
cfl - column family length, which will be different for each family
cql - qualifier length
tsl - timestamp length
vall - each parsed value's length
30 bytes for kv size, number of families, etc.

2) On the mapper side we create Put objects by adding all the key-values constructed for each line, and in the reducer we again collect the key-values from the Put and sort them. Instead, we can directly create and sort the key-values in the reducer.

Solution: We can improve bulk load performance by moving the key-value construction from the mapper to the reducer, so that the mapper just sends the raw text for each row to the reducer. The reducer then parses the records for rows and creates and sorts the key-value pairs before writing to HFiles.

Conclusion: The above suggestions improve map-phase performance by avoiding key-value construction, and reduce-phase performance by avoiding excess data to be shuffled.
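[Editor's note] The two shuffle-size formulas above can be made concrete with a worked example. Only the formulas come from the issue description; the parameter values below are made up purely for illustration:

```python
# Worked example of the two shuffle-size formulas from HBASE-8768.
# All numeric values here are assumed for illustration only.

nl   = 1_000_000  # lines in the raw input file
nt   = 5          # columns per line, including the row key
rl   = 16         # row key length (bytes)
cfl  = 2          # column family length
cql  = 8          # qualifier length
tsl  = 8          # timestamp length
vall = 20         # average parsed value length
overhead = 30     # kv size, number of families, etc.

# Map-side KV construction: every shuffled value carries full KV framing.
map_side = nl * nt * (rl + cfl + cql + vall + tsl + overhead)

# Reduce-side KV construction: only the raw parsed values are shuffled.
reduce_side = nl * nt * vall

ratio = map_side / reduce_side  # how much extra data the old approach ships
```

With these assumed lengths, the map-side approach shuffles 420 MB versus 100 MB for the reduce-side approach, a 4.2x inflation, which is in line with the "around 5x extra data" the description estimates.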
[jira] [Commented] (HBASE-7266) [89-fb] Using pread for non-compaction read request
[ https://issues.apache.org/jira/browse/HBASE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726207#comment-13726207 ]

Chao Shi commented on HBASE-7266:
---------------------------------

Lars, yes, we're using an early 0.94 version. Thanks for the fix; we will try to benchmark a newer version. I think the fix is too tricky and I would prefer the idea Liyin proposed. Let's discuss in the new issue.

[89-fb] Using pread for non-compaction read request
---------------------------------------------------
Key: HBASE-7266
URL: https://issues.apache.org/jira/browse/HBASE-7266
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang

There are 2 kinds of read operations in HBase: pread and seek+read. Pread, positional read, is stateless and creates a new connection between the DFSClient and DataNode for each operation, while seek+read seeks to a specific position and prefetches blocks from the data nodes. The benefit of seek+read is that it caches the prefetch result, but the downside is that it is stateful and needs to be synchronized. So far, both compaction and scan use seek+read, which has caused some resource contention. Using pread for scan requests can avoid that resource contention. In addition, the region server is able to do the prefetch for the scan request (HBASE-6874), so it won't be necessary to let the DFSClient prefetch the data any more. I will run through the scan benchmark (with no block cache) to verify the performance.
[jira] [Commented] (HBASE-4633) Potential memory leak in client RPC timeout mechanism
[ https://issues.apache.org/jira/browse/HBASE-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726222#comment-13726222 ]

Asaf Mesika commented on HBASE-4633:
------------------------------------

So eventually, this issue wasn't solved? We're experiencing this brutally in production with 0.94.7.

Potential memory leak in client RPC timeout mechanism
-----------------------------------------------------
Key: HBASE-4633
URL: https://issues.apache.org/jira/browse/HBASE-4633
Project: HBase
Issue Type: Bug
Components: Client
Affects Versions: 0.90.3
Environment: HBase version: 0.90.3 + Patches, Hadoop version: CDH3u0
Reporter: Shrijeet Paliwal
Attachments: HBaseclientstack.png

Relevant Jiras: https://issues.apache.org/jira/browse/HBASE-2937, https://issues.apache.org/jira/browse/HBASE-4003

We have been using the 'hbase.client.operation.timeout' knob introduced in HBASE-2937 for quite some time now. It helps us enforce SLAs. We have two HBase clusters and two HBase client clusters; one of them is much busier than the other. We have seen deterministic behavior from clients running in the busy cluster: their memory footprint increases consistently after they have been up for roughly 24 hours. This memory footprint almost doubles from its usual value (usual case == RPC timeout disabled). After much investigation nothing concrete came out, and we had to put in a hack which keeps the heap size in control even when the RPC timeout is enabled. Also note, the same behavior is not observed in the 'not so busy' cluster. The patch is here: https://gist.github.com/1288023
[jira] [Commented] (HBASE-9102) HFile block pre-loading for large sequential scan
[ https://issues.apache.org/jira/browse/HBASE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726224#comment-13726224 ]

Chao Shi commented on HBASE-9102:
---------------------------------

I don't think the block cache should be used for such prefetching, as a large sequential scan will swap out blocks used by random reads. If we use the hdfs client for prefetching, we also need to implement a scanner-sticky DFSInputStream, as a seek called by another scanner would clear all the prefetch work. Another question is how we decide whether a scan is sequential or random. The current implementation (before Lars's patch HBASE-7336) only treats Get as random and thus uses pread. In our scenario, there are two kinds of scans: a) from the online system and b) MR. Most of a) do not scan more than 1 block and are expected to return within tens of milliseconds.

HFile block pre-loading for large sequential scan
-------------------------------------------------
Key: HBASE-9102
URL: https://issues.apache.org/jira/browse/HBASE-9102
Project: HBase
Issue Type: Improvement
Affects Versions: 0.89-fb
Reporter: Liyin Tang
Assignee: Liyin Tang

The current HBase scan model cannot take full advantage of the aggregate disk throughput, especially for large sequential scans. For a large sequential scan, it is easy to predict the next block to read in advance, so those data blocks can be pre-loaded and decompressed/decoded from HDFS into the block cache right before the current read point. This jira is therefore to optimize large sequential scan performance by pre-loading the HFile blocks into the block cache in a streaming fashion so that the scan query can read from the cache directly.
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726297#comment-13726297 ]

gautam commented on HBASE-9108:
-------------------------------

I will shortly post an update on the change I plan to make.

LoadTestTool need to have a way to ignore keys which were failed during write.
------------------------------------------------------------------------------
Key: HBASE-9108
URL: https://issues.apache.org/jira/browse/HBASE-9108
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10
Reporter: gautam
Assignee: gautam
Priority: Critical
Original Estimate: 48h
Remaining Estimate: 48h

While running the chaosmonkey integration tests, we found that writes sometimes fail when the cluster components are restarted/stopped/killed, etc. The data key being put by the LoadTestTool is added to a failed-key set, and at the end of the test this failed-key set is checked for any entries to assert failures. When doing fail-over testing, it is expected that some keys may go unwritten. The point here is to validate that whatever gets into hbase on an unstable cluster really goes in, and hence reads should be 100% for whatever keys went in successfully. Currently LoadTestTool has strict checks that validate every key being written; if any key is not written, it fails. I want to loosen this constraint by allowing users to pass in a set of exceptions they expect when doing put/write operations against hbase. If one of these expected exceptions is thrown while writing a key, the failed key is ignored, and hence won't be considered again for subsequent writes or reads.

This can be passed to the load test tool as the csv-list parameter -allowed_write_exceptions, or through hbase-site.xml by setting a value for test.ignore.exceptions.during.write. Here is the usage:

-allowed_write_exceptions java.io.EOFException,org.apache.hadoop.hbase.NotServingRegionException,org.apache.hadoop.hbase.client.NoServerForRegionException,org.apache.hadoop.hbase.ipc.ServerNotRunningYetException

Hence, the existing integration tests can also make use of this change by passing it as a property in hbase-site.xml.
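[Editor's note] The allow-list check the proposal describes can be sketched as follows. This is a hypothetical illustration, not the actual LoadTestTool patch; the function names are invented, and it assumes matching is done on the fully qualified class name of the thrown exception:

```python
# Hypothetical sketch (invented names, not the LoadTestTool code) of
# checking a thrown exception's class name against the CSV allow-list
# passed via -allowed_write_exceptions.

def parse_allowed(csv: str) -> set:
    """Split the CSV parameter into a set of fully qualified class names."""
    return {name.strip() for name in csv.split(",") if name.strip()}

def is_ignorable(exc_class_name: str, allowed: set) -> bool:
    """A failed write is ignored iff its exception class is allow-listed."""
    return exc_class_name in allowed

allowed = parse_allowed(
    "java.io.EOFException,"
    "org.apache.hadoop.hbase.NotServingRegionException")
```

Under this sketch, a write failing with an allow-listed exception is dropped from the failed-key set, while any other exception still fails the test, preserving the strict check for genuinely unexpected errors.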
[jira] [Created] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
gautam created HBASE-9108:
--------------------------

Summary: LoadTestTool need to have a way to ignore keys which were failed during write.
Key: HBASE-9108
URL: https://issues.apache.org/jira/browse/HBASE-9108
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.10, 0.94.9, 0.95.1, 0.95.0
Reporter: gautam
Assignee: gautam
Priority: Critical
[jira] [Updated] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gautam updated HBASE-9108: -- Attachment: HBASE-9108.patch._0.94 Here is the change for 0.94 branch. LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Priority: Critical Attachments: HBASE-9108.patch._0.94 Original Estimate: 48h Remaining Estimate: 48h
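The allowed-exception check described above can be sketched in plain Java. This is an illustrative reconstruction, not the code from the attached patch: the class name `AllowedWriteExceptions` and its methods are hypothetical; only the CSV parameter format comes from the issue.

```java
import java.io.EOFException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of matching a thrown write exception against the
// -allowed_write_exceptions CSV list from the issue description.
public class AllowedWriteExceptions {
    private final Set<String> allowedClassNames;

    public AllowedWriteExceptions(String csv) {
        // Parse the CSV list of fully-qualified exception class names.
        this.allowedClassNames = new HashSet<>(Arrays.asList(csv.split(",")));
    }

    /** Returns true if the thrown exception (or a superclass) is allowed. */
    public boolean isAllowed(Throwable t) {
        for (Class<?> c = t.getClass(); c != null; c = c.getSuperclass()) {
            if (allowedClassNames.contains(c.getName())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AllowedWriteExceptions allowed =
            new AllowedWriteExceptions("java.io.EOFException,java.net.ConnectException");
        // An allowed exception means the key is dropped from both the write
        // and the subsequent read verification; anything else still fails.
        System.out.println(allowed.isAllowed(new EOFException()));
        System.out.println(allowed.isAllowed(new RuntimeException()));
    }
}
```

Walking the superclass chain means a listed exception type also covers its subclasses, which matches how one would typically interpret an "expected exception set".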
[jira] [Commented] (HBASE-8768) Improve bulk load performance by moving key value construction from map phase to reduce phase.
[ https://issues.apache.org/jira/browse/HBASE-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726365#comment-13726365 ] Hudson commented on HBASE-8768: --- SUCCESS: Integrated in hbase-0.95-on-hadoop2 #212 (See [https://builds.apache.org/job/hbase-0.95-on-hadoop2/212/]) HBASE-8768 Improve bulk load performance by moving key value construction from map phase to reduce phase (Rajshbabu) (tedyu: rev 1509079) * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TextSortReducer.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterTextMapper.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsvParser.java Improve bulk load performance by moving key value construction from map phase to reduce phase. -- Key: HBASE-8768 URL: https://issues.apache.org/jira/browse/HBASE-8768 Project: HBase Issue Type: Improvement Components: mapreduce, Performance Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2 Attachments: HBASE-8768_v2.patch, HBASE-8768_v3.patch, HBASE-8768_v4.patch, HBase_Bulkload_Performance_Improvement.pdf The ImportTSV bulk-loading approach uses the MapReduce framework. The existing mapper and reducer classes used by ImportTSV are TsvImporterMapper.java and PutSortReducer.java. The ImportTSV tool parses the tab-separated (by default) values from the input files, and the Mapper class generates the Put objects for each row using the key value pairs created from the parsed text. PutSortReducer then partitions the Put objects by region and sorts them within each region.
Overheads we can see in the above approach: == 1) KeyValue construction for each parsed value in the line adds extra data (rowkey, columnfamily, qualifier), which results in roughly 5x extra data to be shuffled in the reduce phase. We can calculate the data size to be shuffled as below {code} Data to be shuffled = nl*nt*(rl+cfl+cql+vall+tsl+30) {code} If we move KeyValue construction to the reduce phase, the data size to be shuffled will be {code} Data to be shuffled = nl*nt*vall {code} which is much less than the above. nl - number of lines in the raw file. nt - number of tabs or columns, including the row key. rl - row length, which will be different for each line. cfl - column family length, which will be different for each family. cql - qualifier length. tsl - timestamp length. vall - each parsed value's length. 30 bytes for the kv size, number of families, etc. 2) On the mapper side we create Put objects by adding all KeyValues constructed for each line, and in the reducer we again collect the KeyValues from the Puts and sort them. Instead, we can directly create and sort the KeyValues in the reducer. Solution: We can improve bulk load performance by moving the key value construction from mapper to reducer, so that the Mapper just sends the raw text for each row to the Reducer. The Reducer then parses the records, creates the key value pairs, and sorts them before writing to HFiles. Conclusion: === The above suggestions improve map phase performance by avoiding KeyValue construction, and reduce phase performance by avoiding excess data to be shuffled.
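Plugging illustrative numbers into the two formulas above shows the size of the saving. All the concrete values below (row length, value length, etc.) are made-up examples for the sake of the arithmetic, not measurements from the patch:

```java
// Compares shuffle volume for the two approaches from the issue:
// KeyValues built in the map phase vs. raw parsed values only.
public class ShuffleSizeEstimate {
    /** Data shuffled when KeyValues are constructed in the map phase. */
    public static long withKeyValues(long nl, long nt, long rl, long cfl,
                                     long cql, long vall, long tsl) {
        return nl * nt * (rl + cfl + cql + vall + tsl + 30);
    }

    /** Data shuffled when only the parsed values are sent to the reducer. */
    public static long rawTextOnly(long nl, long nt, long vall) {
        return nl * nt * vall;
    }

    public static void main(String[] args) {
        // Example: 1M lines, 10 columns, 16-byte rows, 20-byte values.
        long nl = 1_000_000, nt = 10, rl = 16, cfl = 2, cql = 8, vall = 20, tsl = 8;
        long before = withKeyValues(nl, nt, rl, cfl, cql, vall, tsl);
        long after = rawTextOnly(nl, nt, vall);
        System.out.printf("before=%d after=%d ratio=%.1fx%n",
            before, after, (double) before / after);
    }
}
```

With these example numbers the map-phase-KeyValue approach shuffles 840 MB versus 200 MB for raw values, a 4.2x reduction, in line with the "around 5x" estimate in the description (the exact factor depends on the row/value sizes).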
[jira] [Commented] (HBASE-9108) LoadTestTool need to have a way to ignore keys which were failed during write.
[ https://issues.apache.org/jira/browse/HBASE-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726379#comment-13726379 ] Jean-Marc Spaggiari commented on HBASE-9108: Hi @gautam, Can you please do your patch against trunk so we can trigger Hadoop QA? LoadTestTool need to have a way to ignore keys which were failed during write. --- Key: HBASE-9108 URL: https://issues.apache.org/jira/browse/HBASE-9108 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.95.0, 0.95.1, 0.94.9, 0.94.10 Reporter: gautam Assignee: gautam Priority: Critical Attachments: HBASE-9108.patch._0.94 Original Estimate: 48h Remaining Estimate: 48h
[jira] [Commented] (HBASE-8220) can we record the count opened HTable for HTablePool
[ https://issues.apache.org/jira/browse/HBASE-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726383#comment-13726383 ] Jean-Marc Spaggiari commented on HBASE-8220: [~cuijianwei] can you rebase your patch on the latest trunk version? Then we will see if anyone can commit it. can we record the count opened HTable for HTablePool Key: HBASE-8220 URL: https://issues.apache.org/jira/browse/HBASE-8220 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 0.94.3 Reporter: cuijianwei Assignee: cuijianwei Attachments: 8220-trunk-v1.txt, 8220-trunk-v2.txt, 8220-trunk-v3-reattached.txt, 8220-trunk-v3.txt, 8220-trunk-v4.txt, HBASE-8220-0.94.3.txt, HBASE-8220-0.94.3.txt, HBASE-8220-0.94.3.txt-v2, HBASE-8220-0.94.3-v2.txt, HBASE-8220-0.94.3-v3.txt, HBASE-8220-0.94.3-v4.txt, HBASE-8220-0.94.3-v5.txt In HTablePool, we have a method getCurrentPoolSize(...) to get how many opened HTables have been pooled. However, we don't know the ConcurrentOpenedHTable count, meaning the number of HTables obtained from HTablePool.getTable(...) and not yet returned to the pool by PooledTable.close(). The ConcurrentOpenedHTable count may be meaningful because it indicates how many HTables the application actually needs open, which may help us set an appropriate MaxSize for the HTablePool. Therefore, we can add a ConcurrentOpenedHTable counter to HTablePool.
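The proposed counter amounts to an atomic increment on checkout and decrement on return. A minimal sketch, assuming hypothetical hook names (`onGetTable`/`onClose` are illustrative; HTablePool's real methods are `getTable(...)` and the pooled table's `close()`):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the ConcurrentOpenedHTable counter proposed in HBASE-8220:
// how many pooled tables are currently checked out and not yet closed.
public class CheckedOutCounter {
    private final AtomicInteger concurrentOpened = new AtomicInteger();

    /** Called when HTablePool.getTable(...) hands out a table. */
    public int onGetTable() {
        return concurrentOpened.incrementAndGet();
    }

    /** Called when PooledTable.close() returns the table to the pool. */
    public int onClose() {
        return concurrentOpened.decrementAndGet();
    }

    /** Current number of checked-out tables; a high-water mark of this
     *  value suggests an appropriate MaxSize for the pool. */
    public int current() {
        return concurrentOpened.get();
    }
}
```

An `AtomicInteger` keeps the counter accurate under concurrent checkouts without adding a lock to the pool's hot path.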
[jira] [Commented] (HBASE-9086) Add some options to improve count performance
[ https://issues.apache.org/jira/browse/HBASE-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726390#comment-13726390 ] Jean-Marc Spaggiari commented on HBASE-9086: Then we should combine the two filters in a FilterList, right? First FirstKeyOnlyFilter to make sure we don't look at the other columns, then KeyOnlyFilter to make sure we don't transfer the value. Add some options to improve count performance - Key: HBASE-9086 URL: https://issues.apache.org/jira/browse/HBASE-9086 Project: HBase Issue Type: Wish Components: shell Affects Versions: 0.94.2 Reporter: Cheney Sun Attachments: HBase-9086.patch
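In HBase client code the combination would look roughly like `scan.setFilter(new FilterList(new FirstKeyOnlyFilter(), new KeyOnlyFilter()))`. The runnable sketch below is NOT the HBase API; it just mimics the two filters' semantics on in-memory cells to show why composing them makes a count scan cheap: one value-less cell survives per row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of FirstKeyOnlyFilter + KeyOnlyFilter semantics.
public class CountFilterSketch {
    /** A toy cell: row key, qualifier, value. */
    static final class Cell {
        final String row, qualifier, value;
        Cell(String row, String qualifier, String value) {
            this.row = row; this.qualifier = qualifier; this.value = value;
        }
    }

    /** Applies first-key-only, then key-only semantics to one row's cells. */
    static List<Cell> filterRow(List<Cell> rowCells) {
        List<Cell> out = new ArrayList<>();
        if (!rowCells.isEmpty()) {
            // FirstKeyOnlyFilter: keep only the row's first cell.
            Cell first = rowCells.get(0);
            // KeyOnlyFilter: strip the value so only the key is transferred.
            out.add(new Cell(first.row, first.qualifier, ""));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> row = Arrays.asList(
            new Cell("r1", "cf:a", "a-very-large-value"),
            new Cell("r1", "cf:b", "another-large-value"));
        // For counting, the client only needs to know the row exists.
        System.out.println(filterRow(row).size());
    }
}
```

This is exactly the scenario from the issue: with 100+kB rows, shipping one keyed, value-less cell per row instead of the whole row removes nearly all of the scan volume.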
[jira] [Commented] (HBASE-9087) Handlers being blocked during reads
[ https://issues.apache.org/jira/browse/HBASE-9087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726416#comment-13726416 ] Pablo Medina commented on HBASE-9087: - I ran into this issue when concurrently requesting the same keys. Looking at the stack trace, it turns out that the bottleneck is at the Store level. So I guess you should run some kind of test that retrieves, under concurrency, a set of keys belonging to the same Store. Does that make sense? Handlers being blocked during reads --- Key: HBASE-9087 URL: https://issues.apache.org/jira/browse/HBASE-9087 Project: HBase Issue Type: Bug Components: Performance Affects Versions: 0.94.7, 0.95.1 Reporter: Pablo Medina Assignee: Elliott Clark Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.11 Attachments: HBASE-9087-0.patch, HBASE-9087-1.patch I'm having a lot of handlers (90 - 300 approx.) being blocked when reading rows. They are blocked during changedReaderObserver registration. Lars Hofhansl suggests changing the implementation of changedReaderObserver from CopyOnWriteList to ConcurrentHashMap.
Here is a stack trace: IPC Server handler 99 on 60020 daemon prio=10 tid=0x41c84000 nid=0x2244 waiting on condition [0x7ff51fefd000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for 0xc5c13ae8 (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262) at java.util.concurrent.CopyOnWriteArrayList.addIfAbsent(CopyOnWriteArrayList.java:553) at java.util.concurrent.CopyOnWriteArraySet.add(CopyOnWriteArraySet.java:221) at org.apache.hadoop.hbase.regionserver.Store.addChangedReaderObserver(Store.java:1085) at org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:138) at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2077) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3755) at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1804) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1796) at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1771) at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4776) at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4750) at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2152) at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3700) at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
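The stack trace shows why the suggestion helps: `CopyOnWriteArraySet.add()` (via `addIfAbsent`) takes a single `ReentrantLock` and copies the backing array, so every scanner registering a changedReaderObserver serializes on that lock. A concurrent-map-backed set keeps the same `Set` interface without a global lock. A minimal sketch of the alternative (not the actual HBASE-9087 patch):

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArraySet;

// Compares the two observer-set choices discussed in the issue.
public class ObserverSetChoice {
    // Current behavior: every add() copies the array under one lock,
    // which is what the blocked handlers are waiting on.
    static final Set<Object> copyOnWrite = new CopyOnWriteArraySet<>();

    // Suggested alternative: add/remove without a global lock,
    // backed by ConcurrentHashMap.
    static final Set<Object> concurrent =
        Collections.newSetFromMap(new ConcurrentHashMap<Object, Boolean>());

    public static void main(String[] args) {
        Object observer = new Object();
        concurrent.add(observer);     // cheap registration on scanner open
        concurrent.remove(observer);  // cheap deregistration on scanner close
        System.out.println(concurrent.isEmpty());
    }
}
```

The trade-off is the usual one: copy-on-write favors read-heavy iteration (notifying observers), while the concurrent set favors frequent add/remove, which is the pattern here since observers are registered per scanner open.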
[jira] [Commented] (HBASE-5954) Allow proper fsync support for HBase
[ https://issues.apache.org/jira/browse/HBASE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726428#comment-13726428 ] Dave Latham commented on HBASE-5954: Should we update the recommended HDFS configuration in the book then? I think losing a region of data after a compaction and power failure should be prevented by default. Allow proper fsync support for HBase Key: HBASE-5954 URL: https://issues.apache.org/jira/browse/HBASE-5954 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.98.0, 0.95.2 Attachments: 5954-trunk-hdfs-trunk.txt, 5954-trunk-hdfs-trunk-v2.txt, 5954-trunk-hdfs-trunk-v3.txt, 5954-trunk-hdfs-trunk-v4.txt, 5954-trunk-hdfs-trunk-v5.txt, 5954-trunk-hdfs-trunk-v6.txt, hbase-hdfs-744.txt At least get the recommendation into the 0.96 doc, along with some numbers from running with this hdfs feature enabled.
[jira] [Commented] (HBASE-8741) Mutations on Regions in recovery mode might have same sequenceIDs
[ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726438#comment-13726438 ] Himanshu Vashishtha commented on HBASE-8741: Skimmed the console; the failure looks unrelated (create table fails). That test has passed in my testing, and I re-ran it just now too. I will put this patch on RB. Mutations on Regions in recovery mode might have same sequenceIDs - Key: HBASE-8741 URL: https://issues.apache.org/jira/browse/HBASE-8741 Project: HBase Issue Type: Bug Components: MTTR Affects Versions: 0.95.1 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Attachments: HBASE-8741-v0.patch, HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4.patch Currently, when opening a region, we find the maximum sequence ID from all its HFiles and then set the LogSequenceId of the log (in case the latter is at a smaller value). This works well in the recovered.edits case, as we are not writing to the region until we have replayed all of its previous edits. With distributed log replay, if we want to enable writes while a region is under recovery, we need to make sure that the logSequenceId exceeds the maximum logSequenceId of the old regionserver. Otherwise, we might have a situation where new edits have the same (or smaller) sequenceIds. If we store region level information in the WALTrailer, then this scenario could be avoided by: a) reading the trailer of the last completed file, i.e., the last wal file which has a trailer, and b) completely reading the last wal file (this file would not have the trailer, so it needs to be read completely). In the future, if we switch to multiple wal files, we could read the trailer for all completed WAL files, and read the remaining incomplete files completely.
[jira] [Created] (HBASE-9109) Null pointer exception while invoking coprocessor.
Mayur created HBASE-9109: Summary: Null pointer exception while invoking coprocessor. Key: HBASE-9109 URL: https://issues.apache.org/jira/browse/HBASE-9109 Project: HBase Issue Type: Bug Components: Client, Coprocessors Affects Versions: 0.98.0 Environment: OS - CentOS release 6.2 (Final) JAVA - java version 1.6.0_22 OpenJDK Runtime Environment (IcedTea6 1.10.4) (rhel-1.41.1.10.4.el6-x86_64) OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode) Configuration: 3 node cluster with Hadoop-3.0 Reporter: Mayur Fix For: 0.98.0 This problem is observed when region server dies while an endpoint coprocessor is executing. On the client side channel.getLastRegion() returns null and we get null pointer exception while updating result map. Following stack-trace is seen on client: Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.util.Bytes.compareTo(Bytes.java:981) at org.apache.hadoop.hbase.util.Bytes$ByteArrayComparator.compare(Bytes.java:128) at org.apache.hadoop.hbase.util.Bytes$ByteArrayComparator.compare(Bytes.java:119) at java.util.TreeMap.put(TreeMap.java:530) at java.util.Collections$SynchronizedMap.put(Collections.java:1979) at org.apache.hadoop.hbase.client.HTable$17.update(HTable.java:1372) at org.apache.hadoop.hbase.client.HTable$18.call(HTable.java:1401) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662)
[jira] [Updated] (HBASE-8408) Implement namespace
[ https://issues.apache.org/jira/browse/HBASE-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu updated HBASE-8408: --- Attachment: HBASE-8015_8.patch Implement namespace --- Key: HBASE-8408 URL: https://issues.apache.org/jira/browse/HBASE-8408 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_1.patch, HBASE-8015_2.patch, HBASE-8015_3.patch, HBASE-8015_4.patch, HBASE-8015_5.patch, HBASE-8015_6.patch, HBASE-8015_7.patch, HBASE-8015_8.patch, HBASE-8015.patch, TestNamespaceMigration.tgz, TestNamespaceUpgrade.tgz
[jira] [Commented] (HBASE-9109) Null pointer exception while invoking coprocessor.
[ https://issues.apache.org/jira/browse/HBASE-9109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726546#comment-13726546 ] Ted Yu commented on HBASE-9109: --- Why did the region server crash, do you know ? Null pointer exception while invoking coprocessor. -- Key: HBASE-9109 URL: https://issues.apache.org/jira/browse/HBASE-9109 Project: HBase Issue Type: Bug Components: Client, Coprocessors Affects Versions: 0.98.0 Reporter: Mayur Fix For: 0.98.0
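The stack trace in HBASE-9109 points at `TreeMap.put` inside the client callback: when `channel.getLastRegion()` returns null (the regionserver died mid-call), the null key reaches the synchronized TreeMap's comparator and throws NPE. A minimal sketch of the missing guard, using illustrative names rather than HTable's actual code (the real map is keyed by the region start key as `byte[]`):

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Sketch of a null-safe result-map update for the coprocessor callback.
public class ResultUpdateGuard {
    // Results keyed by region start key, as in the HTable callback.
    private final Map<String, Object> results =
        Collections.synchronizedMap(new TreeMap<String, Object>());

    public void update(String regionStartKey, Object result) {
        if (regionStartKey == null) {
            // Region lookup failed (e.g. the regionserver died mid-call);
            // skip or retry instead of letting TreeMap's comparator NPE.
            return;
        }
        results.put(regionStartKey, result);
    }

    public int size() {
        return results.size();
    }
}
```

Whether the right behavior is to skip, retry, or surface a retriable exception is exactly the question the issue raises; the sketch only shows where the null check is missing.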
[jira] [Commented] (HBASE-8408) Implement namespace
[ https://issues.apache.org/jira/browse/HBASE-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726550#comment-13726550 ] Hadoop QA commented on HBASE-8408: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12595434/HBASE-8015_8.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 653 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6554//console This message is automatically generated. Implement namespace --- Key: HBASE-8408 URL: https://issues.apache.org/jira/browse/HBASE-8408 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_1.patch, HBASE-8015_2.patch, HBASE-8015_3.patch, HBASE-8015_4.patch, HBASE-8015_5.patch, HBASE-8015_6.patch, HBASE-8015_7.patch, HBASE-8015_8.patch, HBASE-8015.patch, TestNamespaceMigration.tgz, TestNamespaceUpgrade.tgz
[jira] [Commented] (HBASE-8015) Support for Namespaces
[ https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726552#comment-13726552 ] Francis Liu commented on HBASE-8015: {quote} The replaceAll() call only appears in this test. Can you tell me the reason ? {quote} It was needed to test snapshots across namespaces when we had '.' as the delimiter. I've removed replaceAll() in the new patch. {quote} I think the above class was removed by HBASE-8764. Please refresh your workspace and update your patch. {quote} Re-removed it and a number of other files. I think they stayed during the merge because I had modified those files. {quote} TestRestoreFlushSnapshotFromClient seems to hang: {quote} Fixed; this seems to have been a typo during cleanup. Support for Namespaces -- Key: HBASE-8015 URL: https://issues.apache.org/jira/browse/HBASE-8015 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf
[jira] [Commented] (HBASE-9109) Null pointer exception while invoking coprocessor.
[ https://issues.apache.org/jira/browse/HBASE-9109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726553#comment-13726553 ] Mayur commented on HBASE-9109: -- Initially it died because of a wrong heap size; we corrected the size later, but this issue surfaced. The issue is reproducible by killing a region server manually. Thanks, Mayur Null pointer exception while invoking coprocessor. -- Key: HBASE-9109 URL: https://issues.apache.org/jira/browse/HBASE-9109 Project: HBase Issue Type: Bug Components: Client, Coprocessors Affects Versions: 0.98.0 Reporter: Mayur Fix For: 0.98.0
[jira] [Commented] (HBASE-8408) Implement namespace
[ https://issues.apache.org/jira/browse/HBASE-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726554#comment-13726554 ] Ted Yu commented on HBASE-8408: --- Mind rebasing ? {code} -rw-r--r-- 1 tyu users 690 Aug 1 15:56 ./hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestRestoreFlushSnapshotFromClient.java.rej -rw-r--r-- 1 tyu users 497 Aug 1 15:56 ./hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/SnapshotTestingUtils.java.rej -rw-r--r-- 1 tyu users 3216 Aug 1 15:56 ./hbase-server/src/test/java/org/apache/hadoop/hbase/snapshot/TestFlushSnapshotFromClient.java.rej -rw-r--r-- 1 tyu users 305 Aug 1 15:56 ./hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestSnapshotFromClient.java.rej -rw-r--r-- 1 tyu users 1207 Aug 1 15:56 ./hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestRestoreSnapshotFromClient.java.rej -rw-r--r-- 1 tyu users 672 Aug 1 15:56 ./hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestCloneSnapshotFromClient.java.rej -rw-r--r-- 1 tyu users 701 Aug 1 15:56 ./hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RegionCoprocessorRpcChannel.java.rej -rw-r--r-- 1 tyu users 692 Aug 1 15:56 ./hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java.rej -rw-r--r-- 1 tyu users 501 Aug 1 15:56 ./hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java.rej -rw-r--r-- 1 tyu users 1359 Aug 1 15:56 ./hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java.rej -rw-r--r-- 1 tyu users 1246 Aug 1 15:56 ./hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java.rej {code} Implement namespace --- Key: HBASE-8408 URL: https://issues.apache.org/jira/browse/HBASE-8408 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_1.patch, HBASE-8015_2.patch, HBASE-8015_3.patch, HBASE-8015_4.patch, HBASE-8015_5.patch, HBASE-8015_6.patch, 
HBASE-8015_7.patch, HBASE-8015_8.patch, HBASE-8015.patch, TestNamespaceMigration.tgz, TestNamespaceUpgrade.tgz
[jira] [Commented] (HBASE-8224) Add '-hadoop1' or '-hadoop2' to our version string
[ https://issues.apache.org/jira/browse/HBASE-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726558#comment-13726558 ] Brock Noland commented on HBASE-8224: - Ignore that last comment. Wrong JIRA. Add '-hadoop1' or '-hadoop2' to our version string -- Key: HBASE-8224 URL: https://issues.apache.org/jira/browse/HBASE-8224 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.95.2 Attachments: 8224-adding.classifiers.txt, 8224.gen.script.txt, hbase-8224-proto1.patch So we can publish both the hadoop1 and the hadoop2 jars to a maven repository, and so we can publish two packages, one for hadoop1 and one for hadoop2, given how maven works, our only alternative (to the best of my knowledge and after consulting others) is amending the version string to include hadoop1 or hadoop2.
[jira] [Created] (HBASE-9110) Meta region edits not recovered while migrating to 0.96.0
Himanshu Vashishtha created HBASE-9110: -- Summary: Meta region edits not recovered while migrating to 0.96.0 Key: HBASE-9110 URL: https://issues.apache.org/jira/browse/HBASE-9110 Project: HBase Issue Type: Bug Components: migration Affects Versions: 0.94.10, 0.95.2 Reporter: Himanshu Vashishtha I was doing the migration testing from 0.94.11-snapshot to 0.95.0, and faced this issue. 1) Do some edits in meta table (for eg, create a table). 2) Kill the cluster. (I used kill because we would be doing log splitting when upgrading anyway). 3) There is some dependency on WALs. Upgrade the bits to 0.95.2-snapshot. Start the cluster. Everything comes up. I see log splitting happening as expected. But, the WAL-data for meta table is missing. I could see recovered.edits file for meta created, and placed at the right location. It is just that the new HMaster code tries to recover meta by looking at meta prefix in the log name, and if it didn't find one, just opens the meta region. So, the recovered.edits file, created afterwards, is not honored. Opening this jira to let folks give their opinions about how to tackle this migration issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9110) Meta region edits not recovered while migrating to 0.96.0
[ https://issues.apache.org/jira/browse/HBASE-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha updated HBASE-9110: --- Description: I was doing the migration testing from 0.94.11-snapshot to 0.95.0, and faced this issue. 1) Do some edits in meta table (for eg, create a table). 2) Kill the cluster. (I used kill because we would be doing log splitting when upgrading anyway). 3) There is some dependency on WALs. Upgrade the bits to 0.95.2-snapshot. Start the cluster. Every thing comes up. I see log splitting happening as expected. But, the WAL-data for meta table is missing. I could see recovered.edits file for meta created, and placed at the right location. It is just that the new HMaster code tries to recover meta by looking at meta prefix in the log name, and if it didn't find one, just opens the meta region. So, the recovered.edits file, created afterwards, is not honored. Opening this jira to let folks give their opinions about how to tackle this migration issue. was: I was doing the the migration testing from 0.94.11-snapshot to 0.95.0, and faced this issue. 1) Do some edits in meta table (for eg, create a table). 2) Kill the cluster. (I used kill because we would be doing log splitting when upgrading anyway). 3) There is some dependency on WALs. Upgrade the bits to 0.95.2-snapshot. Start the cluster. Every thing comes up. I see log splitting happening as expected. But, the WAL-data for meta table is missing. I could see recovered.edits file for meta created, and placed at the right location. It is just that the new HMaster code tries to recover meta by looking at meta prefix in the log name, and if it didn't find one, just opens the meta region. So, the recovered.edits file, created afterwards, is not honored. Opening this jira to let folks give their opinions about how to tackle this migration issue. 
Meta region edits not recovered while migrating to 0.96.0 - Key: HBASE-9110 URL: https://issues.apache.org/jira/browse/HBASE-9110 Project: HBase Issue Type: Bug Components: migration Affects Versions: 0.95.2, 0.94.10 Reporter: Himanshu Vashishtha I was doing the migration testing from 0.94.11-snapshot to 0.95.0, and faced this issue. 1) Do some edits in meta table (for eg, create a table). 2) Kill the cluster. (I used kill because we would be doing log splitting when upgrading anyway). 3) There is some dependency on WALs. Upgrade the bits to 0.95.2-snapshot. Start the cluster. Every thing comes up. I see log splitting happening as expected. But, the WAL-data for meta table is missing. I could see recovered.edits file for meta created, and placed at the right location. It is just that the new HMaster code tries to recover meta by looking at meta prefix in the log name, and if it didn't find one, just opens the meta region. So, the recovered.edits file, created afterwards, is not honored. Opening this jira to let folks give their opinions about how to tackle this migration issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8768) Improve bulk load performance by moving key value construction from map phase to reduce phase.
[ https://issues.apache.org/jira/browse/HBASE-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8768: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Improve bulk load performance by moving key value construction from map phase to reduce phase. -- Key: HBASE-8768 URL: https://issues.apache.org/jira/browse/HBASE-8768 Project: HBase Issue Type: Improvement Components: mapreduce, Performance Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2 Attachments: HBASE-8768_v2.patch, HBASE-8768_v3.patch, HBASE-8768_v4.patch, HBase_Bulkload_Performance_Improvement.pdf The ImportTSV bulkloading approach uses the MapReduce framework. The existing mapper and reducer classes used by ImportTSV are TsvImporterMapper.java and PutSortReducer.java. The ImportTSV tool parses the tab-separated (by default) values from the input files, and the Mapper class generates the Put objects for each row using the key value pairs created from the parsed text. PutSortReducer then uses the partitions based on the regions and sorts the Put objects for each region. Overheads we can see in the above approach: == 1) keyvalue construction for each parsed value in the line adds extra data like rowkey, columnfamily and qualifier, which increases the data to be shuffled in the reduce phase by around 5x. We can calculate the data size to be shuffled as below {code} Data to be shuffled = nl*nt*(rl+cfl+cql+vall+tsl+30) {code} If we move keyvalue construction to the reduce phase, the data size to be shuffled will be the following, which is much less than the above. {code} Data to be shuffled = nl*nt*vall {code} nl - Number of lines in the raw file nt - Number of tabs or columns including row key. rl - row length which will be different for each line. cfl - column family length which will be different for each family cql - qualifier length tsl - timestamp length. vall - each parsed value length. 30 bytes for kv size, number of families etc. 
2) On the mapper side we are creating Put objects by adding all keyvalues constructed for each line, and in the reducer we again collect keyvalues from the Put and sort them. Instead we can directly create and sort keyvalues in the reducer. Solution: We can improve bulk load performance by moving the key value construction from mapper to reducer so that the Mapper just sends the raw text for each row to the Reducer. The Reducer then parses the records for rows, and creates and sorts the key value pairs before writing to HFiles. Conclusion: === The above suggestions will improve map phase performance by avoiding keyvalue construction and reduce phase performance by avoiding excess data to be shuffled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
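The two shuffle-size formulas in the description can be sanity-checked with a small, dependency-free Java sketch. The variable names follow the legend above; the sample sizes in `main` are made-up illustrative numbers, not measurements from the attached PDF:

```java
// Sketch of the shuffle-size estimates from HBASE-8768.
// nl = lines, nt = columns incl. row key; rl/cfl/cql/tsl/vall = row, family,
// qualifier, timestamp and value lengths; 30 bytes of per-KeyValue overhead.
public class ShuffleSizeEstimate {

    // Data shuffled when the mapper emits full KeyValues (current approach).
    static long withKeyValues(long nl, long nt, long rl, long cfl,
                              long cql, long vall, long tsl) {
        return nl * nt * (rl + cfl + cql + vall + tsl + 30);
    }

    // Data shuffled when only raw parsed values are sent (proposed approach).
    static long rawValuesOnly(long nl, long nt, long vall) {
        return nl * nt * vall;
    }

    public static void main(String[] args) {
        // Hypothetical sample: 1M lines, 10 columns, 20-byte rows, 10-byte values.
        long current = withKeyValues(1_000_000, 10, 20, 2, 8, 10, 8);
        long proposed = rawValuesOnly(1_000_000, 10, 10);
        // 780000000 vs 100000000 bytes with these sample numbers
        System.out.println("current=" + current + " proposed=" + proposed);
    }
}
```

With these sample numbers the proposed scheme shuffles a small fraction of the data, which is the entire point of moving keyvalue construction to the reduce phase.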
[jira] [Commented] (HBASE-9110) Meta region edits not recovered while migrating to 0.96.0
[ https://issues.apache.org/jira/browse/HBASE-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726592#comment-13726592 ] Ted Yu commented on HBASE-9110: --- Can you tell me the value for configuration hbase.regionserver.separate.hlog.for.meta ? See HBASE-8081 Meta region edits not recovered while migrating to 0.96.0 - Key: HBASE-9110 URL: https://issues.apache.org/jira/browse/HBASE-9110 Project: HBase Issue Type: Bug Components: migration Affects Versions: 0.95.2, 0.94.10 Reporter: Himanshu Vashishtha I was doing the migration testing from 0.94.11-snapshot to 0.95.0, and faced this issue. 1) Do some edits in meta table (for eg, create a table). 2) Kill the cluster. (I used kill because we would be doing log splitting when upgrading anyway). 3) There is some dependency on WALs. Upgrade the bits to 0.95.2-snapshot. Start the cluster. Every thing comes up. I see log splitting happening as expected. But, the WAL-data for meta table is missing. I could see recovered.edits file for meta created, and placed at the right location. It is just that the new HMaster code tries to recover meta by looking at meta prefix in the log name, and if it didn't find one, just opens the meta region. So, the recovered.edits file, created afterwards, is not honored. Opening this jira to let folks give their opinions about how to tackle this migration issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9110) Meta region edits not recovered while migrating to 0.96.0
[ https://issues.apache.org/jira/browse/HBASE-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726599#comment-13726599 ] Himanshu Vashishtha commented on HBASE-9110: Yes, thanks for reminding Ted. I was using the default option for separate meta log property, i.e., it was OFF. Meta region edits not recovered while migrating to 0.96.0 - Key: HBASE-9110 URL: https://issues.apache.org/jira/browse/HBASE-9110 Project: HBase Issue Type: Bug Components: migration Affects Versions: 0.95.2, 0.94.10 Reporter: Himanshu Vashishtha I was doing the migration testing from 0.94.11-snapshot to 0.95.0, and faced this issue. 1) Do some edits in meta table (for eg, create a table). 2) Kill the cluster. (I used kill because we would be doing log splitting when upgrading anyway). 3) There is some dependency on WALs. Upgrade the bits to 0.95.2-snapshot. Start the cluster. Every thing comes up. I see log splitting happening as expected. But, the WAL-data for meta table is missing. I could see recovered.edits file for meta created, and placed at the right location. It is just that the new HMaster code tries to recover meta by looking at meta prefix in the log name, and if it didn't find one, just opens the meta region. So, the recovered.edits file, created afterwards, is not honored. Opening this jira to let folks give their opinions about how to tackle this migration issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7634) Replication handling of changes to peer clusters is inefficient
[ https://issues.apache.org/jira/browse/HBASE-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-7634: -- Resolution: Fixed Fix Version/s: 0.95.2 0.98.0 Assignee: Gabriel Reid Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to branch and trunk, thanks for the good work Gabriel. Replication handling of changes to peer clusters is inefficient --- Key: HBASE-7634 URL: https://issues.apache.org/jira/browse/HBASE-7634 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.95.2 Reporter: Gabriel Reid Assignee: Gabriel Reid Fix For: 0.98.0, 0.95.2 Attachments: HBASE-7634.patch, HBASE-7634.v2.patch, HBASE-7634.v3.patch, HBASE-7634.v4.patch, HBASE-7634.v5.patch, HBASE-7634.v6.patch The current handling of changes to the region servers in a replication peer cluster is quite inefficient. The list of region servers that are being replicated to is only updated if there are a large number of issues encountered while replicating. This can cause it to take quite a while to recognize that a number of the regionservers in a peer cluster are no longer available. A potentially bigger problem is that if a replication peer cluster is started with a small number of regionservers, and then more region servers are added after replication has started, the additional region servers will never be used for replication (unless there are failures on the in-use regionservers). Part of the current issue is that the retry code in ReplicationSource#shipEdits checks a randomly-chosen replication peer regionserver (in ReplicationSource#isSlaveDown) to see if it is up after a replication write has failed on a different randomly-chosen replication peer. If the peer is seen as not down, another randomly-chosen peer is used for writing. 
A second part of the issue is that changes to the list of region servers in a peer cluster are not detected at all, and are only picked up if a certain number of failures have occurred when trying to ship edits. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
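The failure-driven refresh behavior described above can be modeled in a few lines of plain Java. This is a toy sketch, not the actual ReplicationSource code; all names and the threshold value are hypothetical. It shows why regionservers added to a healthy peer cluster are never picked up: the sink list is only re-read after enough shipping failures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Toy model of failure-driven peer refresh: the list of replication sinks
// is only re-read from the peer cluster after a threshold of failures, so
// servers added to a healthy peer are never chosen for shipping edits.
public class SinkSelector {
    static final int REFRESH_THRESHOLD = 3; // hypothetical value

    private final Random rnd = new Random(42);
    private List<String> sinks;
    private int failures = 0;

    SinkSelector(List<String> initial) {
        this.sinks = new ArrayList<>(initial);
    }

    // Randomly pick a sink from the (possibly stale) cached list.
    String pick() {
        return sinks.get(rnd.nextInt(sinks.size()));
    }

    // Called after a failed shipment; refreshes the cached list only once
    // the failure count crosses the threshold.
    void onFailure(List<String> currentPeerServers) {
        if (++failures >= REFRESH_THRESHOLD) {
            sinks = new ArrayList<>(currentPeerServers);
            failures = 0;
        }
    }

    int knownSinks() {
        return sinks.size();
    }
}
```

As long as shipping succeeds, `onFailure` never fires and the cached list stays frozen at whatever the peer looked like at startup, which is the second part of the issue described above.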
[jira] [Commented] (HBASE-8771) ensure replication_scope's value is either local(0) or global(1)
[ https://issues.apache.org/jira/browse/HBASE-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726622#comment-13726622 ] Jean-Daniel Cryans commented on HBASE-8771: --- [~nidmhbase] See my other comment: https://issues.apache.org/jira/browse/HBASE-8663?focusedCommentId=13724453page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13724453 ensure replication_scope's value is either local(0) or global(1) Key: HBASE-8771 URL: https://issues.apache.org/jira/browse/HBASE-8771 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.8 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.94.11 Attachments: HBASE-8771-0.94.8-v0.patch For replication_scope, only two values are meaningful: {code} public static final int REPLICATION_SCOPE_LOCAL = 0; public static final int REPLICATION_SCOPE_GLOBAL = 1; {code} However, there is no checking for that, so currently a user can set it to any integer value, and all non-zero values will be treated as 1 (GLOBAL). This jira is to add a check in HColumnDescriptor#setScope() so that only 0 and 1 will be accepted during create_table or alter_table. In the future, we can leverage replication_scope to store more info. For example: -1: A columnfam is replicated from another cluster in MASTER_SLAVE setup (i.e. readonly) 2 : A columnfam is set MASTER_MASTER Probably a major improvement JIRA is needed for the future usage. It will be good to ensure the scope value at this moment. {code:title=Testing|borderStyle=solid} hbase(main):002:0> create 't1_dn', {NAME => 'cf1', REPLICATION_SCOPE => 2} ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global) ... hbase(main):004:0> alter 't1_dn', {NAME => 'cf1', REPLICATION_SCOPE => -1} ERROR: java.lang.IllegalArgumentException: Replication Scope must be either 0(local) or 1(global) ... {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
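The check HBASE-8771 proposes can be sketched in a few lines of plain Java. This is a minimal model of the validation added to HColumnDescriptor#setScope, mirroring the error message in the shell transcript above; it is not the actual HBase class.

```java
// Minimal model of the replication-scope validation from HBASE-8771:
// only 0 (local) and 1 (global) are accepted; anything else throws.
public class ScopeCheck {
    public static final int REPLICATION_SCOPE_LOCAL = 0;
    public static final int REPLICATION_SCOPE_GLOBAL = 1;

    // Mirrors the error message shown in the shell transcript above.
    static int validateScope(int scope) {
        if (scope != REPLICATION_SCOPE_LOCAL && scope != REPLICATION_SCOPE_GLOBAL) {
            throw new IllegalArgumentException(
                "Replication Scope must be either 0(local) or 1(global)");
        }
        return scope;
    }

    public static void main(String[] args) {
        validateScope(0); // accepted
        validateScope(1); // accepted
        try {
            validateScope(2); // rejected, as in the 't1_dn' example above
        } catch (IllegalArgumentException e) {
            System.out.println("ERROR: " + e);
        }
    }
}
```

Rejecting out-of-range values early, at descriptor-construction time, keeps the door open for the future use of -1 and 2 sketched in the description without silently treating them as GLOBAL today.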
[jira] [Updated] (HBASE-9098) During recovery use ZK as the source of truth for region state
[ https://issues.apache.org/jira/browse/HBASE-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-9098: --- Priority: Blocker (was: Major) During recovery use ZK as the source of truth for region state --- Key: HBASE-9098 URL: https://issues.apache.org/jira/browse/HBASE-9098 Project: HBase Issue Type: Bug Reporter: Devaraj Das Assignee: Jeffrey Zhong Priority: Blocker In HLogSplitter:locateRegionAndRefreshLastFlushedSequenceId(HConnection, byte[], byte[], String), we talk to the replayee regionserver to figure out whether a region is in recovery or not. We should look at ZK only for this piece of information (since that is the source of truth for recovery otherwise). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9098) During recovery use ZK as the source of truth for region state
[ https://issues.apache.org/jira/browse/HBASE-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-9098: --- Component/s: regionserver Affects Version/s: 0.95.0 Fix Version/s: 0.95.2 Making it a blocker for 0.95.2 (feel free to move it to 0.95.3 if the patch doesn't get in time). This bug actually causes dataloss as far as I can tell in the meta region - meta will never be recovered (it turns out that isRecovering() is called on a constant FIRST_META_REGIONINFO and that will always return false). During recovery use ZK as the source of truth for region state --- Key: HBASE-9098 URL: https://issues.apache.org/jira/browse/HBASE-9098 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.95.0 Reporter: Devaraj Das Assignee: Jeffrey Zhong Priority: Blocker Fix For: 0.95.2 In HLogSplitter:locateRegionAndRefreshLastFlushedSequenceId(HConnection, byte[], byte[], String), we talk to the replayee regionserver to figure out whether a region is in recovery or not. We should look at ZK only for this piece of information (since that is the source of truth for recovery otherwise). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9102) HFile block pre-loading for large sequential scan
[ https://issues.apache.org/jira/browse/HBASE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726635#comment-13726635 ] Liyin Tang commented on HBASE-9102: --- Chao, You are right that the pre-load will run in a rate-limited fashion to make sure it won't pollute the block cache substantially. The pre-loading targets the large sequential scan case. The client is able to enable/disable it on a per-request basis. HFile block pre-loading for large sequential scan - Key: HBASE-9102 URL: https://issues.apache.org/jira/browse/HBASE-9102 Project: HBase Issue Type: Improvement Affects Versions: 0.89-fb Reporter: Liyin Tang Assignee: Liyin Tang The current HBase scan model cannot take full advantage of the aggregate disk throughput, especially for large sequential scans. For a large sequential scan, it is easy to predict which block to read next, so the upcoming data blocks can be pre-loaded and decompressed/decoded from HDFS into the block cache right before the current read point. Therefore, this jira is to optimize large sequential scan performance by pre-loading the HFile blocks into the block cache in a streaming fashion so that the scan query can read from the cache directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
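A minimal, dependency-free sketch of the rate-limited read-ahead idea discussed above. All names and the fixed-window policy are illustrative assumptions, not the 0.89-fb implementation: preload at most a small window of upcoming blocks, skipping ones already cached, so the block cache is not flooded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model of rate-limited HFile block read-ahead: given the current read
// point, compute at most `window` upcoming block indices to preload,
// skipping blocks that are already in the cache.
public class BlockPreloader {

    // Returns the block indices to preload next (policy is illustrative:
    // a fixed look-ahead window acts as the rate limit).
    static List<Integer> blocksToPreload(int currentBlock, int totalBlocks,
                                         int window, Set<Integer> cached) {
        List<Integer> toLoad = new ArrayList<>();
        for (int b = currentBlock + 1;
             b <= currentBlock + window && b < totalBlocks; b++) {
            if (!cached.contains(b)) {
                toLoad.add(b);
            }
        }
        return toLoad;
    }
}
```

Capping the look-ahead window is one simple way to keep the prefetcher from evicting hot entries, which matches the "won't pollute the block cache substantially" constraint in the comment.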
[jira] [Updated] (HBASE-8741) Mutations on Regions in recovery mode might have same sequenceIDs
[ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Himanshu Vashishtha updated HBASE-8741: --- Attachment: HBASE-8741-v4-again.patch Attaching the v4 patch again (along with minor refactoring in some test methods) to let qa run again. The prior testing still holds. Mutations on Regions in recovery mode might have same sequenceIDs - Key: HBASE-8741 URL: https://issues.apache.org/jira/browse/HBASE-8741 Project: HBase Issue Type: Bug Components: MTTR Affects Versions: 0.95.1 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Attachments: HBASE-8741-v0.patch, HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4-again.patch, HBASE-8741-v4.patch Currently, when opening a region, we find the maximum sequence ID from all its HFiles and then set the LogSequenceId of the log (in case the latter is at a smaller value). This works well in the recovered.edits case as we are not writing to the region until we have replayed all of its previous edits. With distributed log replay, if we want to enable writes while a region is under recovery, we need to make sure that the logSequenceId is greater than the maximum logSequenceId of the old regionserver. Otherwise, we might have a situation where new edits have same (or smaller) sequenceIds. If we store region level information in the WALTrailer, then this scenario could be avoided by: a) reading the trailer of the last completed file, i.e., the last wal file which has a trailer and, b) completely reading the last wal file (this file would not have the trailer, so it needs to be read completely). In future, if we switch to multiple wal files, we could read the trailer for all completed WAL files, and read the remaining incomplete files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
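The invariant the description asks for reduces to a single line. The sketch below is a toy model of the sequencing rule, not HBase code: when a region opens for writes during distributed log replay, the next write's sequence id must exceed both the log's own counter and the maximum sequence id recovered from the old regionserver, or new edits could reuse ids.

```java
// Toy model of the sequence-id invariant from HBASE-8741: a new write's
// sequence id must be strictly greater than both the current log counter
// and the maximum id the old regionserver ever assigned.
public class SeqIdAdvance {
    static long nextWriteSeqId(long currentLogSeqId, long maxSeqIdFromOldServer) {
        // Advance past both counters; equality would mean a duplicated id.
        return Math.max(currentLogSeqId, maxSeqIdFromOldServer) + 1;
    }
}
```

The hard part the issue discusses is not this arithmetic but obtaining `maxSeqIdFromOldServer` cheaply, hence the proposal to read it from WALTrailers rather than scanning whole WAL files.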
[jira] [Updated] (HBASE-9061) Put back TestReplicationKillMasterRSCompressed when fixed over in HBASE-8615
[ https://issues.apache.org/jira/browse/HBASE-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-9061: -- Summary: Put back TestReplicationKillMasterRSCompressed when fixed over in HBASE-8615 (was: Put back TestReplicationKillRs* when fixed over in HBASE-8615) Put back TestReplicationKillMasterRSCompressed when fixed over in HBASE-8615 --- Key: HBASE-9061 URL: https://issues.apache.org/jira/browse/HBASE-9061 Project: HBase Issue Type: Task Components: test Reporter: stack Priority: Critical Fix For: 0.95.2 The suite of TestReplicationKillRs* tests were removed temporarily. Put them back after they've been fixed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-9111) Put back TestReplicationKill* except for the MasterRSCompressed one
Jean-Daniel Cryans created HBASE-9111: - Summary: Put back TestReplicationKill* except for the MasterRSCompressed one Key: HBASE-9111 URL: https://issues.apache.org/jira/browse/HBASE-9111 Project: HBase Issue Type: Task Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.0, 0.95.2 TestReplicationKillMasterRSCompressed was the only one affected in HBASE-8615 so it would be good to keep the others around. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9098) During recovery use ZK as the source of truth for region state
[ https://issues.apache.org/jira/browse/HBASE-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726639#comment-13726639 ] Jeffrey Zhong commented on HBASE-9098: -- Thanks for the very good catch. Just providing more background details. In file HRegionInfo.java as following: {code} public static HRegionInfo convert(final RegionInfo proto) { if (proto == null) return null; byte [] tableName = proto.getTableName().toByteArray(); if (Bytes.equals(tableName, HConstants.META_TABLE_NAME)) { return FIRST_META_REGIONINFO; } long regionId = proto.getRegionId(); {code} For META region recovery, we always return the constant FIRST_META_REGIONINFO whose recovering state is always false. The consequence is that we'll skip META region recovery. I'll use ZK as the source of truth in the fix so the problematic area should be fixed. During recovery use ZK as the source of truth for region state --- Key: HBASE-9098 URL: https://issues.apache.org/jira/browse/HBASE-9098 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.95.0 Reporter: Devaraj Das Assignee: Jeffrey Zhong Priority: Blocker Fix For: 0.95.2 In HLogSplitter:locateRegionAndRefreshLastFlushedSequenceId(HConnection, byte[], byte[], String), we talk to the replayee regionserver to figure out whether a region is in recovery or not. We should look at ZK only for this piece of information (since that is the source of truth for recovery otherwise). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
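The bug Jeffrey describes can be reproduced in miniature with plain Java. The class below is a toy model, not HRegionInfo, and the table name literal is illustrative: because convert() hands back one shared constant for the META table, a recovering flag set on the instance that recovery bookkeeping tracks is never visible on the instance the recovery check actually reads.

```java
// Toy model of the HBASE-9098 bug: convert() returns a shared constant for
// META, so per-region recovery state set elsewhere is silently ignored.
public class MetaRegionInfoBug {

    static class RegionInfo {
        final String name;
        boolean recovering = false; // default: not recovering
        RegionInfo(String name) { this.name = name; }
    }

    // Analogue of HRegionInfo.FIRST_META_REGIONINFO: one shared instance
    // whose recovering flag is always false.
    static final RegionInfo FIRST_META_REGIONINFO = new RegionInfo("meta");

    // Mirrors the convert() shortcut quoted above: META always maps to the
    // constant instead of a deserialized, state-carrying instance.
    static RegionInfo convert(String tableName) {
        if ("meta".equals(tableName)) {
            return FIRST_META_REGIONINFO;
        }
        return new RegionInfo(tableName);
    }
}
```

Even if some other component marks a META RegionInfo as recovering, the recovery path that calls convert() sees only the constant's `false` flag, so META recovery is skipped, which is why the fix consults ZK rather than the in-memory flag.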
[jira] [Updated] (HBASE-9112) Custom TableInputFormat in initTableMapperJob throws ClassNoFoundException on TableMapper
[ https://issues.apache.org/jira/browse/HBASE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Debanjan Bhattacharyya updated HBASE-9112: -- Description: When using custom TableInputFormat in TableMapReduceUtil.initTableMapperJob in the following way TableMapReduceUtil.initTableMapperJob(mytable, MyScan, MyMapper.class, MyKey.class, MyValue.class, myJob,true, MyTableInputFormat.class); I get error: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableMapper at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631) at java.lang.ClassLoader.defineClass(ClassLoader.java:615) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283) at java.net.URLClassLoader.access$000(URLClassLoader.java:58) at java.net.URLClassLoader$1.run(URLClassLoader.java:197) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) If I do not use the last two parameters, there is no error. What is going wrong here? 
Thanks Regards was: When using custom TableInputFormat in TableMapReduceUtil.initTableMapperJob in the following way TableMapReduceUtil.initTableMapperJob(mytable, MyScan, MyMapper.class, MyKey.class, MyValue.class, myJob,true, MyTableInputFormat.class); I get error: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableMapper at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631) at java.lang.ClassLoader.defineClass(ClassLoader.java:615) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283) at java.net.URLClassLoader.access$000(URLClassLoader.java:58) at java.net.URLClassLoader$1.run(URLClassLoader.java:197) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) The if I do not use the last two parameters, there is no error. What is going wrong here? 
Thanks Regards Custom TableInputFormat in initTableMapperJob throws ClassNoFoundException on TableMapper - Key: HBASE-9112 URL: https://issues.apache.org/jira/browse/HBASE-9112 Project: HBase Issue Type: Bug Components: hadoop2 Affects Versions: 0.2.0 Environment: CDH-4.3.0-1.cdh4.3.0.p0.22 Reporter: Debanjan Bhattacharyya When using custom TableInputFormat in TableMapReduceUtil.initTableMapperJob in the following way TableMapReduceUtil.initTableMapperJob(mytable, MyScan, MyMapper.class, MyKey.class, MyValue.class, myJob,true, MyTableInputFormat.class); I get error: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableMapper at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at
[jira] [Created] (HBASE-9112) Custom TableInputFormat in initTableMapperJob throws ClassNoFoundException on TableMapper
Debanjan Bhattacharyya created HBASE-9112: - Summary: Custom TableInputFormat in initTableMapperJob throws ClassNoFoundException on TableMapper Key: HBASE-9112 URL: https://issues.apache.org/jira/browse/HBASE-9112 Project: HBase Issue Type: Bug Components: hadoop2 Affects Versions: 0.2.0 Environment: CDH-4.3.0-1.cdh4.3.0.p0.22 Reporter: Debanjan Bhattacharyya When using custom TableInputFormat in TableMapReduceUtil.initTableMapperJob in the following way TableMapReduceUtil.initTableMapperJob(mytable, MyScan, MyMapper.class, MyKey.class, MyValue.class, myJob,true, MyTableInputFormat.class); I get error: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableMapper at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631) at java.lang.ClassLoader.defineClass(ClassLoader.java:615) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283) at java.net.URLClassLoader.access$000(URLClassLoader.java:58) at java.net.URLClassLoader$1.run(URLClassLoader.java:197) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) If I do not use the last two parameters, there is no error. What is going wrong here? Thanks Regards -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9029) Backport HBASE-8706 Some improvement in snapshot to 0.94
[ https://issues.apache.org/jira/browse/HBASE-9029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726651#comment-13726651 ] Lars Hofhansl commented on HBASE-9029: -- Skimmed patch. Looks good. I assume all snapshot related tests pass? Backport HBASE-8706 Some improvement in snapshot to 0.94 Key: HBASE-9029 URL: https://issues.apache.org/jira/browse/HBASE-9029 Project: HBase Issue Type: Improvement Components: snapshots Affects Versions: 0.94.9 Reporter: Jerry He Assignee: Jerry He Priority: Minor Fix For: 0.94.11 Attachments: HBase-9029-0.94.patch 'HBASE-8706 Some improvement in snapshot' has some good parameter tuning and improvement for snapshot handling, making snapshots more robust. It will be nice to put it in 0.94. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8960) TestDistributedLogSplitting.testLogReplayForDisablingTable fails sometimes
[ https://issues.apache.org/jira/browse/HBASE-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726661#comment-13726661 ] Jeffrey Zhong commented on HBASE-8960: -- The flaky test detector is tracked in HBASE-8018. I ran the tool (which doesn't work on http://54.241.6.143 yet because its Jenkins setup is different from builds.apache.org) against 0.95, and the failed test case summary of the past 30 runs is shown below (where 1 means pass, 0 means didn't run, -1 means fail). You can see TestDistributedLogSplitting is stable there. {code} java -jar buildstats-1.0-jar-with-dependencies.jar https://builds.apache.org hbase-0.95 30 {code} Failed Test Cases (builds 366 367 368 369 370 371 372 373 374 375 376 377 378 379 381 382 383 384 385 386 387 388 389 390 391 392):
org.apache.hadoop.hbase.coprocessor.testmastercoprocessorexceptionwithremove.testexceptionfromcoprocessorwhencreatingtable 111 -101111111111 11111111111
org.apache.hadoop.hbase.mapreduce.testhfileoutputformat.testmrincrementalloadwithsplit 111 -101111111111 11111111111
org.apache.hadoop.hbase.master.cleaner.testsnapshotfrommaster.testsnapshothfilearchiving 11111111111 -1011 11111111111
org.apache.hadoop.hbase.master.testsplitlogmanager.testmultipleresubmits1 1111011111111 -101 111111111
org.apache.hadoop.hbase.regionserver.testregionmergetransactiononcluster.testwholesomemerge 111 -101111111111 11111111111
org.apache.hadoop.hbase.regionserver.wal.testlogrolling.testlogrollonpipelinerestart 11111111111 -1011 11111111111
org.apache.hadoop.hbase.snapshot.testflushsnapshotfromclient.testconcurrentsnapshottingattempts 111111 -101111111 11111111111
org.apache.hadoop.hbase.testiofencing.testfencingaroundcompactionafterwalsync 1111111111111111 111 -1011111
org.apache.hadoop.hbase.testzookeeper.testlogsplittingaftermasterrecoveryduetozkexpiry 111111111111111 11111111 -101
org.apache.hadoop.hbase.testzookeeper.testregionserversessionexpired 11 1111 -101111111111 11111111
org.apache.hadoop.hbase.thrift.testthriftserver.testall 111 -1 0111111111111111 111111
org.apache.hadoop.hbase.zookeeper.lock.testzkinterprocessreadwritelock.testreadlockexcludeswriters 111111 -101111111 11111111111
TestDistributedLogSplitting.testLogReplayForDisablingTable fails sometimes -- Key: HBASE-8960 URL: https://issues.apache.org/jira/browse/HBASE-8960 Project: HBase Issue Type: Task Components: test Reporter: Jimmy Xiang Assignee: Jeffrey Zhong Priority: Minor Fix For: 0.95.2 Attachments: hbase-8960-addendum-2.patch, hbase-8960-addendum.patch, hbase-8960.patch http://54.241.6.143/job/HBase-0.95-Hadoop-2/org.apache.hbase$hbase-server/634/testReport/junit/org.apache.hadoop.hbase.master/TestDistributedLogSplitting/testLogReplayForDisablingTable/ {noformat} java.lang.AssertionError: expected:1000 but was:0
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testLogReplayForDisablingTable(TestDistributedLogSplitting.java:797)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
[jira] [Commented] (HBASE-9087) Handlers being blocked during reads
[ https://issues.apache.org/jira/browse/HBASE-9087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1372#comment-1372 ] Lars Hofhansl commented on HBASE-9087: -- Yeah it's per store, so you'd see this contention if you read a lot of KVs from the same Region and ColumnFamily. Now thinking about how this is used a bit more... We're using this to notify the scanners that they have to reset their KVHeap stack. In that case we absolutely have to make sure that all currently open scanners do this. ConcurrentHashMap does not actually guarantee this upon iterating, but CopyOnWriteArraySet does. So maybe we're opening ourselves up to concurrency issues. An alternative would be to use a HashSet and synchronize on it. Handlers being blocked during reads --- Key: HBASE-9087 URL: https://issues.apache.org/jira/browse/HBASE-9087 Project: HBase Issue Type: Bug Components: Performance Affects Versions: 0.94.7, 0.95.1 Reporter: Pablo Medina Assignee: Elliott Clark Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.11 Attachments: HBASE-9087-0.patch, HBASE-9087-1.patch I'm having a lot of handlers (90 - 300 approx.) being blocked when reading rows. They are blocked during changedReaderObserver registration. Lars Hofhansl suggests changing the implementation of changedReaderObserver from CopyOnWriteList to ConcurrentHashMap. 
Here is a stack trace:
IPC Server handler 99 on 60020 daemon prio=10 tid=0x41c84000 nid=0x2244 waiting on condition [0x7ff51fefd000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for 0xc5c13ae8 (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
at java.util.concurrent.CopyOnWriteArrayList.addIfAbsent(CopyOnWriteArrayList.java:553)
at java.util.concurrent.CopyOnWriteArraySet.add(CopyOnWriteArraySet.java:221)
at org.apache.hadoop.hbase.regionserver.Store.addChangedReaderObserver(Store.java:1085)
at org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:138)
at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2077)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3755)
at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1804)
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1796)
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1771)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4776)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4750)
at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2152)
at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3700)
at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
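The trade-off Lars describes can be sketched in plain JDK terms. The following is a hypothetical illustration only (class and method names are invented, this is not HBase code): registration and removal synchronize briefly on a plain HashSet instead of locking and copying an array, while notification iterates a snapshot taken under the same lock, so every scanner registered before the notification is guaranteed to be seen.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the "HashSet and synchronize on it" alternative
// to CopyOnWriteArraySet for changed-reader observers. Not HBase code.
class ChangedReaderNotifier {
  interface ChangedReadersObserver {
    void updateReaders();
  }

  private final Set<ChangedReadersObserver> observers = new HashSet<>();

  // add/remove hold the monitor only briefly, instead of copying the
  // whole backing array under a ReentrantLock as CopyOnWriteArraySet does.
  void addObserver(ChangedReadersObserver o) {
    synchronized (observers) {
      observers.add(o);
    }
  }

  void removeObserver(ChangedReadersObserver o) {
    synchronized (observers) {
      observers.remove(o);
    }
  }

  // Notification snapshots the set under the lock, so every observer
  // registered before this call is guaranteed to be notified -- the
  // property ConcurrentHashMap iteration would not give us.
  void notifyChangedReaders() {
    List<ChangedReadersObserver> snapshot;
    synchronized (observers) {
      snapshot = new ArrayList<>(observers);
    }
    for (ChangedReadersObserver o : snapshot) {
      o.updateReaders();
    }
  }
}
```

Whether this beats CopyOnWriteArraySet depends on the registration-to-notification ratio; with many short-lived scanners per notification, the brief synchronized block avoids the array-copy contention visible in the stack trace.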
[jira] [Updated] (HBASE-9111) Put back TestReplicationKill* except for the MasterRSCompressed one
[ https://issues.apache.org/jira/browse/HBASE-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-9111: -- Attachment: HBASE-9111.patch What I'm about to commit back. It's pretty much what it was, except I had to change the import for one exception because it moved. Put back TestReplicationKill* except for the MasterRSCompressed one --- Key: HBASE-9111 URL: https://issues.apache.org/jira/browse/HBASE-9111 Project: HBase Issue Type: Task Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.0, 0.95.2 Attachments: HBASE-9111.patch TestReplicationKillMasterRSCompressed was the only one affected in HBASE-8615, so it would be good to keep the others around. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9111) Put back TestReplicationKill* except for the MasterRSCompressed one
[ https://issues.apache.org/jira/browse/HBASE-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-9111: -- Status: Patch Available (was: Open) Getting a HadoopQA run just to be clean. Put back TestReplicationKill* except for the MasterRSCompressed one --- Key: HBASE-9111 URL: https://issues.apache.org/jira/browse/HBASE-9111 Project: HBase Issue Type: Task Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.0, 0.95.2 Attachments: HBASE-9111.patch TestReplicationKillMasterRSCompressed was the only one affected in HBASE-8615, so it would be good to keep the others around. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9110) Meta region edits not recovered while migrating to 0.96.0
[ https://issues.apache.org/jira/browse/HBASE-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726696#comment-13726696 ] Devaraj Das commented on HBASE-9110: Hmm.. so maybe for the migration, we should have the cluster enter into a quiet period, and flush the meta just before the cluster shutdown. Meta region edits not recovered while migrating to 0.96.0 - Key: HBASE-9110 URL: https://issues.apache.org/jira/browse/HBASE-9110 Project: HBase Issue Type: Bug Components: migration Affects Versions: 0.95.2, 0.94.10 Reporter: Himanshu Vashishtha I was doing the migration testing from 0.94.11-snapshot to 0.95.0, and faced this issue. 1) Do some edits in meta table (e.g., create a table). 2) Kill the cluster. (I used kill because we would be doing log splitting when upgrading anyway). 3) There is some dependency on WALs. Upgrade the bits to 0.95.2-snapshot. Start the cluster. Everything comes up. I see log splitting happening as expected. But, the WAL-data for meta table is missing. I could see the recovered.edits file for meta created, and placed at the right location. It is just that the new HMaster code tries to recover meta by looking for the meta prefix in the log name, and if it doesn't find one, it just opens the meta region. So, the recovered.edits file, created afterwards, is not honored. Opening this jira to let folks give their opinions about how to tackle this migration issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8408) Implement namespace
[ https://issues.apache.org/jira/browse/HBASE-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu updated HBASE-8408: --- Attachment: HBASE-8015_9.patch rebased patch. Implement namespace --- Key: HBASE-8408 URL: https://issues.apache.org/jira/browse/HBASE-8408 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_1.patch, HBASE-8015_2.patch, HBASE-8015_3.patch, HBASE-8015_4.patch, HBASE-8015_5.patch, HBASE-8015_6.patch, HBASE-8015_7.patch, HBASE-8015_8.patch, HBASE-8015_9.patch, HBASE-8015.patch, TestNamespaceMigration.tgz, TestNamespaceUpgrade.tgz -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9092) OpenRegion could be ignored by mistake
[ https://issues.apache.org/jira/browse/HBASE-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726716#comment-13726716 ] stack commented on HBASE-9092: -- +1 I like the call to unassign if FAILED_OPEN before moving on. OpenRegion could be ignored by mistake -- Key: HBASE-9092 URL: https://issues.apache.org/jira/browse/HBASE-9092 Project: HBase Issue Type: Bug Components: Region Assignment Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: trunk-9092.patch Looked into failed test: http://54.241.6.143/job/HBase-0.95/org.apache.hbase$hbase-server/721/testReport/ In this test run, several tests in TestAssignmentManagerOnCluster failed. Most of them timed out because the first failure, testOpenFailedUnrecoverable, used too much resource in deleting the table. http://54.241.6.143/job/HBase-0.95/org.apache.hbase$hbase-server/721/testReport/org.apache.hadoop.hbase.master/TestAssignmentManagerOnCluster/testOpenFailedUnrecoverable/ The reason testOpenFailedUnrecoverable failed is that the second openRegion call was ignored since the previous open call was still going on and stayed in OpenRegionHandler#doCleanUpOnFailedOpen for too long (perhaps a thread scheduling issue). The second openRegion call was skipped since the region was still in the middle of opening. However, the failed_open event was already processed by the master. Therefore the region was stuck in transition and the delete table went nowhere. It is similar to an issue we ran into before, except that time the region was closing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8615) HLog Compression fails in mysterious ways (working title)
[ https://issues.apache.org/jira/browse/HBASE-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-8615: -- Priority: Critical (was: Major) Fix Version/s: 0.95.2 0.98.0 Summary: HLog Compression fails in mysterious ways (working title) (was: TestReplicationQueueFailoverCompressed#queueFailover fails on hadoop 2.0 due to IndexOutOfBoundsException) Here's what I know about the different problems. The first one is that we find data in the compressed HLog that's unexpected. It happens easily on Hadoop 2 and takes more data to hit on Hadoop 1. It manifests itself as shown in the jira's description or like this: {noformat}
2013-07-27 15:17:54,789 ERROR [RS:1;vesta:34230.replicationSource,2] wal.ProtobufLogReader(236): Error while reading 4 WAL KVs; started reading at 65475 and read up to 65541
2013-07-27 15:17:54,790 WARN [RS:1;vesta:34230.replicationSource,2] regionserver.ReplicationSource(323): 2 Got: java.io.IOException: Error while reading 4 WAL KVs; started reading at 65475 and read up to 65541
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:237)
at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:96)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.readNextAndSetPosition(ReplicationHLogReaderManager.java:89)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:407)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:319)
Caused by: java.lang.IllegalArgumentException
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$StreamUtils.toShort(WALCellCodec.java:353)
at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$CompressedKvDecoder.readIntoArray(WALCellCodec.java:237)
at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec$CompressedKvDecoder.parseCell(WALCellCodec.java:206)
at org.apache.hadoop.hbase.codec.BaseDecoder.advance(BaseDecoder.java:46)
at org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFromCells(WALEdit.java:213)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:217)
... 4 more
{noformat} One thing I saw is that it always happens when we're close to a multiple of Short.MAX_VALUE. In the stack trace I just pasted you can see it started reading at 65475, and in the jira's description it was ending at 65538. I'm able to recreate the problem with a patch to TestReplicationHLogReaderManager that I'm going to attach later. I was also able to recreate the problem on a single-node cluster and grab a corrupted HLog that will also be attached. The other problem I found is that when appending WALEdits with only 1 KV to a compressed HLog, it hits an invalid PB: {noformat}
2013-07-31 11:38:52,156 ERROR [main] wal.ProtobufLogReader(199): Invalid PB while reading WAL, probably an unexpected EOF, ignoring
com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
at com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:68)
at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:1120)
at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:885)
at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:212)
at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:197)
at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:96)
{noformat} Printing the position when it fails, I can see it's still around a multiple of Short.MAX_VALUE, and using the unit test I attached you can reliably get the issue after reading the same number of edits. I wasn't able to trigger the issue in Hadoop 1
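The Short.MAX_VALUE observation is easier to see with a toy example. The sketch below is illustrative only (it is not the actual WALCellCodec/StreamUtils code): if a length is ever packed into two bytes as a signed short, it silently goes negative once the value passes 32767, and a decoder that guards with a checkArgument-style "length must be non-negative" test throws IllegalArgumentException, much like the Preconditions failure in the first stack trace.

```java
// Illustrative sketch (NOT the real codec) of why failures cluster near
// multiples of Short.MAX_VALUE: two-byte signed packing wraps negative.
class ShortPacking {
  // Pack the low 16 bits of n into two bytes, big-endian.
  static byte[] pack(int n) {
    return new byte[] { (byte) (n >> 8), (byte) n };
  }

  // Unpack two bytes back into a signed short and reject negatives,
  // mirroring an IllegalArgumentException from checkArgument(n >= 0).
  static short unpack(byte[] b) {
    short n = (short) ((b[0] << 8) | (b[1] & 0xff));
    if (n < 0) {
      throw new IllegalArgumentException("negative length: " + n);
    }
    return n;
  }
}
```

Values up to 32767 round-trip cleanly; 32768 (Short.MAX_VALUE + 1) comes back as -32768 and is rejected, which is the same shape of failure as the IllegalArgumentException out of StreamUtils.toShort above.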
[jira] [Commented] (HBASE-9023) TestIOFencing.testFencingAroundCompactionAfterWALSync
[ https://issues.apache.org/jira/browse/HBASE-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726721#comment-13726721 ] stack commented on HBASE-9023: -- Failed this morning: https://builds.apache.org/job/HBase-TRUNK/4328/testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompactionAfterWALSync/ TestIOFencing.testFencingAroundCompactionAfterWALSync - Key: HBASE-9023 URL: https://issues.apache.org/jira/browse/HBASE-9023 Project: HBase Issue Type: Bug Components: test Reporter: stack Fix For: 0.95.2 Anyone want to take a look at this one? https://builds.apache.org/job/HBase-TRUNK/4283/testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompactionAfterWALSync/ {code}
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at org.apache.hadoop.hbase.TestIOFencing.doTest(TestIOFencing.java:263)
at org.apache.hadoop.hbase.TestIOFencing.testFencingAroundCompactionAfterWALSync(TestIOFencing.java:217)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runners.Suite.runChild(Suite.java:127)
at org.junit.runners.Suite.runChild(Suite.java:26)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8741) Mutations on Regions in recovery mode might have same sequenceIDs
[ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726720#comment-13726720 ] Hadoop QA commented on HBASE-8741: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12595450/HBASE-8741-v4-again.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 36 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6555//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6555//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6555//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6555//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6555//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6555//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6555//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6555//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6555//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6555//console This message is automatically generated. Mutations on Regions in recovery mode might have same sequenceIDs - Key: HBASE-8741 URL: https://issues.apache.org/jira/browse/HBASE-8741 Project: HBase Issue Type: Bug Components: MTTR Affects Versions: 0.95.1 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Attachments: HBASE-8741-v0.patch, HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4-again.patch, HBASE-8741-v4.patch Currently, when opening a region, we find the maximum sequence ID from all its HFiles and then set the LogSequenceId of the log (in case the later is at a small value). 
This works well in the recovered.edits case, as we are not writing to the region until we have replayed all of its previous edits. With distributed log replay, if we want to enable writes while a region is under recovery, we need to make sure that the logSequenceId is greater than the maximum logSequenceId of the old regionserver. Otherwise, we might have a situation where new edits have the same (or smaller) sequenceIds. If we can store region-level information in the WALTrailer, then this scenario could be avoided by: a) reading the trailer of the last completed file, i.e., the last wal file which has a trailer and, b) completely reading the last wal file (this file would not have the trailer, so it needs to be read completely). In the future, if we switch to multiple wal files, we could read the trailer for all completed WAL files, and read the remaining incomplete files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
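The invariant being discussed -- edits accepted during or after recovery must receive sequence IDs strictly greater than anything the old regionserver wrote -- can be sketched as follows. This is a hypothetical illustration (names invented, not the HBase implementation):

```java
// Hypothetical sketch: choose the next WAL sequence id so that edits
// written during recovery cannot collide with the old server's edits.
class SequenceIdTracker {
  private long currentSeqId;

  SequenceIdTracker(long initial) {
    this.currentSeqId = initial;
  }

  // On region open: bump the log's sequence id up to the maximum id
  // recovered from HFiles / WAL trailers, if the log is behind it.
  // A stale (smaller) value must never move the id backwards.
  void advanceTo(long maxSeqIdFromOldServer) {
    if (maxSeqIdFromOldServer > currentSeqId) {
      currentSeqId = maxSeqIdFromOldServer;
    }
  }

  // Each new mutation gets a strictly larger id, so no edit written
  // after recovery can share an id with a pre-crash edit.
  long nextSeqId() {
    return ++currentSeqId;
  }
}
```

The advanceTo call corresponds to steps a) and b): the maximum sequence ID recovered from WAL trailers, or from fully reading the last (trailer-less) WAL file, is what the new server must advance past before accepting writes.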
[jira] [Updated] (HBASE-9103) Print lines that are longer than allowed in HadoopQA output.
[ https://issues.apache.org/jira/browse/HBASE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-9103: --- Fix Version/s: (was: 0.94.11) (was: 0.95.2) Print lines that are longer than allowed in HadoopQA output. Key: HBASE-9103 URL: https://issues.apache.org/jira/browse/HBASE-9103 Project: HBase Issue Type: Improvement Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.98.0 Attachments: hbase-9103-v0.patch It's a little annoying not to know which lines are too long - it's helpful to point out the ones that are over. Generally, this will be a small number of lines that the formatter didn't get quite right, so massive logging statements should be rare. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9097) Set HBASE_CLASSPATH before rest of the classpath
[ https://issues.apache.org/jira/browse/HBASE-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726722#comment-13726722 ] Jesse Yates commented on HBASE-9097: I'd like to commit in the next couple days, if no one has any objections. Set HBASE_CLASSPATH before rest of the classpath Key: HBASE-9097 URL: https://issues.apache.org/jira/browse/HBASE-9097 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.98.0, 0.95.2, 0.94.11 Reporter: Jesse Yates Assignee: Jesse Yates Attachments: hbase-9097-v0.patch We encountered this when one of the hadoop test jars (specifically hadoop-mapreduce-client-jobclient-2.0.0-cdh4.3.0-tests.jar, but that's beside the point) had an hdfs-site.xml. This clobbered the hdfs-site.xml that we included on the classpath via HBASE_CLASSPATH in hbase-env.sh, meaning the master didn't start in HA NN mode, because the proxy-provider wasn't found in the hdfs-site.xml from the test jar (even though it was in our config file) because that was the first resolution of that file. This should be a fairly simple fix in bin/hbase, but has some potentially wide-ranging effects on existing installs that just 'happen' to work. Generally, I'd expect things set on the HBASE_CLASSPATH to take precedence over anything else when starting the hbase daemon. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9111) Put back TestReplicationKill* except for the MasterRSCompressed one
[ https://issues.apache.org/jira/browse/HBASE-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726723#comment-13726723 ] Hadoop QA commented on HBASE-9111: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12595460/HBASE-9111.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:red}-1 hadoop1.0{color}. The patch failed to compile against the hadoop 1.0 profile. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6556//console This message is automatically generated. Put back TestReplicationKill* except for the MasterRSCompressed one --- Key: HBASE-9111 URL: https://issues.apache.org/jira/browse/HBASE-9111 Project: HBase Issue Type: Task Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.0, 0.95.2 Attachments: HBASE-9111.patch TestReplicationKillMasterRSCompressed was the only one affected in HBASE-8615, so it would be good to keep the others around. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9103) Print lines that are longer than allowed in HadoopQA output.
[ https://issues.apache.org/jira/browse/HBASE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726729#comment-13726729 ] Jesse Yates commented on HBASE-9103: Cool, I'll go ahead and commit it then. Thanks for taking a look Ted! Print lines that are longer than allowed in HadoopQA output. Key: HBASE-9103 URL: https://issues.apache.org/jira/browse/HBASE-9103 Project: HBase Issue Type: Improvement Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.98.0 Attachments: hbase-9103-v0.patch It's a little annoying not to know which lines are too long - it's helpful to point out the ones that are over. Generally, this will be a small number of lines that the formatter didn't get quite right, so massive logging statements should be rare. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9103) Print lines that are longer than allowed in HadoopQA output.
[ https://issues.apache.org/jira/browse/HBASE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-9103: --- Resolution: Fixed Status: Resolved (was: Patch Available) Print lines that are longer than allowed in HadoopQA output. Key: HBASE-9103 URL: https://issues.apache.org/jira/browse/HBASE-9103 Project: HBase Issue Type: Improvement Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.98.0 Attachments: hbase-9103-v0.patch It's a little annoying not to know which lines are too long - it's helpful to point out the ones that are over. Generally, this will be a small number of lines that the formatter didn't get quite right, so massive logging statements should be rare. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9092) OpenRegion could be ignored by mistake
[ https://issues.apache.org/jira/browse/HBASE-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-9092: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks. Integrated into trunk and 0.95. OpenRegion could be ignored by mistake -- Key: HBASE-9092 URL: https://issues.apache.org/jira/browse/HBASE-9092 Project: HBase Issue Type: Bug Components: Region Assignment Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: trunk-9092.patch Looked into the failed test: http://54.241.6.143/job/HBase-0.95/org.apache.hbase$hbase-server/721/testReport/ In this test run, several tests in TestAssignmentManagerOnCluster failed. Most of them timed out because the first failure, testOpenFailedUnrecoverable, used too much resource in deleting the table. http://54.241.6.143/job/HBase-0.95/org.apache.hbase$hbase-server/721/testReport/org.apache.hadoop.hbase.master/TestAssignmentManagerOnCluster/testOpenFailedUnrecoverable/ The reason testOpenFailedUnrecoverable failed is that the second openRegion call was ignored: the previous open call was still going on and stayed in OpenRegionHandler#doCleanUpOnFailedOpen for too long (perhaps a thread scheduling issue). The second openRegion call was skipped since the region was still in the middle of opening. However, the failed_open event had already been processed by the master. Therefore the region was stuck in transition and the table deletion went nowhere. It is similar to an issue we ran into before, except that in that case the region was closing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
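The essence of the race above is that a second open request is silently skipped while the first is still cleaning up, leaving the master waiting on a transition that will never complete. A minimal sketch of the safer behavior, with entirely hypothetical names (this is not the actual AssignmentManager/OpenRegionHandler code), is to admit at most one open per region atomically and reject the rest loudly so the caller can retry:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical guard: reject a concurrent open explicitly instead of
// silently ignoring it while the previous open is still in flight.
public class OpenRequestGuard {
    private final Set<String> regionsInTransition = ConcurrentHashMap.newKeySet();

    /** @return true if the open was admitted, false if the caller should retry later */
    boolean tryOpenRegion(String regionName) {
        // Set.add on a concurrent set is atomic: only one open per region wins.
        if (!regionsInTransition.add(regionName)) {
            // A previous open (possibly stuck in failure cleanup) is still in
            // flight; report the conflict rather than dropping the request.
            return false;
        }
        return true;
    }

    void finishOpen(String regionName) {
        regionsInTransition.remove(regionName); // cleanup done; a retry may now proceed
    }

    public static void main(String[] args) {
        OpenRequestGuard guard = new OpenRequestGuard();
        System.out.println(guard.tryOpenRegion("testRegion")); // true: first open admitted
        System.out.println(guard.tryOpenRegion("testRegion")); // false: previous open still in flight
    }
}
```

Returning an explicit failure lets the master distinguish "open in progress" from "open lost", which is exactly the ambiguity that left the region stuck in transition.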
[jira] [Commented] (HBASE-9107) [0.94] Backport HBASE-6950 TestAcidGuarantees system test now flushes too aggressively to 0.94
[ https://issues.apache.org/jira/browse/HBASE-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726755#comment-13726755 ] Andrew Purtell commented on HBASE-9107: --- +1, thanks. [0.94] Backport HBASE-6950 TestAcidGuarantees system test now flushes too aggressively to 0.94 -- Key: HBASE-9107 URL: https://issues.apache.org/jira/browse/HBASE-9107 Project: HBase Issue Type: Test Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.94.11 Attachments: hbase-9107_v1.patch As the description says. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viral Bajaria updated HBASE-9079: - Attachment: HBASE-9079-trunk.patch new patch for trunk FilterList getNextKeyHint skips rows that should be included in the results --- Key: HBASE-9079 URL: https://issues.apache.org/jira/browse/HBASE-9079 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.10 Reporter: Viral Bajaria Attachments: HBASE-9079-0.94.patch, HBASE-9079-trunk.patch, TestFail.patch, TestSuccess.patch I hit a weird issue/bug and am able to reproduce the error consistently. The problem arises when a FilterList has two filters, each of which implements the getNextKeyHint method. The way the current implementation works is: StoreScanner calls matcher.getNextKeyHint() whenever it gets a SEEK_NEXT_USING_HINT. This in turn calls filter.getNextKeyHint(), which at this stage is of type FilterList. The implementation in FilterList iterates through all the filters and keeps the max KeyValue that it sees. All is fine if you wrap filters in a FilterList in which only one of them implements getNextKeyHint, but if multiple of them implement it, that's where things get weird. For example: - Create two filters: one is FuzzyRowFilter and the second is ColumnRangeFilter. Both of them implement getNextKeyHint. - Wrap them in a FilterList with MUST_PASS_ALL. - FuzzyRowFilter will seek to the correct first row and then pass it to ColumnRangeFilter, which will return the SEEK_NEXT_USING_HINT code. - Now when getNextKeyHint is called on the FilterList, it calls the one on FuzzyRowFilter first, which basically says what the next row should be, while in reality we want the ColumnRangeFilter to give the seek hint. - The above behavior skips data that should be returned, which I have verified by using a RowFilter with RegexStringComparator. 
I updated the FilterList to maintain state on which filter returns the SEEK_NEXT_USING_HINT, and in getNextKeyHint I invoke the method on the saved filter and reset that state. I tested it with my current queries and it works fine, but I need to run the entire test suite to make sure I have not introduced any regression. In addition, I need to figure out what the behavior should be when the operation is MUST_PASS_ONE, but I doubt it should be any different. Is my understanding of it being a bug correct? Or am I trivializing it and ignoring something very important? If it's tough to wrap your head around the explanation, then I can open a JIRA and upload a patch against 0.94 head. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
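The fix described above — remember which member filter asked for the seek, then consult only that filter for the hint — can be sketched with simplified stand-in types. These are not the real org.apache.hadoop.hbase.filter classes (which operate on KeyValues, not strings); the sketch only illustrates the state-tracking idea:

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for FilterList: remembers which member filter
// requested SEEK_NEXT_USING_HINT so the seek hint is taken from that
// filter, not the max across all members.
public class FilterListSketch {
    enum ReturnCode { INCLUDE, SEEK_NEXT_USING_HINT }

    interface Filter {
        ReturnCode filterKeyValue(String kv);
        String getNextKeyHint(String kv);
    }

    private final List<Filter> filters;
    private Filter seekHintFilter; // state saved between calls

    FilterListSketch(Filter... filters) { this.filters = Arrays.asList(filters); }

    ReturnCode filterKeyValue(String kv) {
        for (Filter f : filters) {
            ReturnCode rc = f.filterKeyValue(kv);
            if (rc == ReturnCode.SEEK_NEXT_USING_HINT) {
                seekHintFilter = f; // remember who asked for the seek
                return rc;
            }
        }
        return ReturnCode.INCLUDE;
    }

    String getNextKeyHint(String kv) {
        // Consult only the filter that requested the hint, then reset the state.
        String hint = seekHintFilter.getNextKeyHint(kv);
        seekHintFilter = null;
        return hint;
    }

    public static void main(String[] args) {
        Filter fuzzy = new Filter() {
            public ReturnCode filterKeyValue(String kv) { return ReturnCode.INCLUDE; }
            public String getNextKeyHint(String kv) { return "fuzzy-hint"; }
        };
        Filter columnRange = new Filter() {
            public ReturnCode filterKeyValue(String kv) { return ReturnCode.SEEK_NEXT_USING_HINT; }
            public String getNextKeyHint(String kv) { return "column-range-hint"; }
        };
        FilterListSketch list = new FilterListSketch(fuzzy, columnRange);
        list.filterKeyValue("row1");
        System.out.println(list.getNextKeyHint("row1")); // prints "column-range-hint"
    }
}
```

Without the saved state, a max-over-all-filters hint would come from the fuzzy filter here, reproducing the skipped-rows symptom the report describes.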
[jira] [Updated] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viral Bajaria updated HBASE-9079: - Attachment: (was: HBASE-9079-trunk.patch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726775#comment-13726775 ] Viral Bajaria commented on HBASE-9079: -- (pressed enter too soon when attaching file... no easy way to edit a comment) I have uploaded a new patch for trunk after refreshing my workspace. I think the switch between branches wasn't clean for me when I did it the first time. The current patch should work fine on trunk too. I also cleaned up the TODO comment since there is no Configuration object anymore in FilterList. Also cleaned up the typo in the javadocs for areSerializedFieldsEqual() -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726777#comment-13726777 ] Ted Yu commented on HBASE-9079: --- There're a few long lines in test: {code} +ColumnRangeFilter columnRangeFilter = new ColumnRangeFilter(Bytes.toBytes(cqStart), true, Bytes.toBytes(4), true); {code} [~lhofhansl]: What do you think of latest patch ? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7634) Replication handling of changes to peer clusters is inefficient
[ https://issues.apache.org/jira/browse/HBASE-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726781#comment-13726781 ] Hudson commented on HBASE-7634: --- SUCCESS: Integrated in hbase-0.95 #393 (See [https://builds.apache.org/job/hbase-0.95/393/]) HBASE-7634 Replication handling of changes to peer clusters is inefficient (Gabriel Reid via JD) (jdcryans: rev 1509331) * /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java * /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeers.java * /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeersZKImpl.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSinkManager.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationChangingPeerRegionservers.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSinkManager.java Replication handling of changes to peer clusters is inefficient --- Key: HBASE-7634 URL: https://issues.apache.org/jira/browse/HBASE-7634 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.95.2 Reporter: Gabriel Reid Assignee: Gabriel Reid Fix For: 0.98.0, 0.95.2 Attachments: HBASE-7634.patch, HBASE-7634.v2.patch, HBASE-7634.v3.patch, HBASE-7634.v4.patch, HBASE-7634.v5.patch, HBASE-7634.v6.patch The current handling of changes to the region servers in a replication peer cluster is currently quite inefficient. The list of region servers that are being replicated to is only updated if there are a large number of issues encountered while replicating. 
This can cause it to take quite a while to recognize that a number of the regionservers in a peer cluster are no longer available. A potentially bigger problem is that if a replication peer cluster is started with a small number of regionservers, and then more region servers are added after replication has started, the additional region servers will never be used for replication (unless there are failures on the in-use regionservers). Part of the current issue is that the retry code in ReplicationSource#shipEdits checks a randomly-chosen replication peer regionserver (in ReplicationSource#isSlaveDown) to see if it is up after a replication write has failed on a different randomly-chosen replication peer. If the peer is seen as not down, another randomly-chosen peer is used for writing. A second part of the issue is that changes to the list of region servers in a peer cluster are not detected at all, and are only picked up if a certain number of failures have occurred when trying to ship edits. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
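One way to picture the improvement this issue is after: refresh the sink list both on an explicit change notification (e.g. a watcher on the peer cluster's regionserver list in ZooKeeper) and after a failure threshold, so newly added peer regionservers become eligible without waiting for failures. All names below are hypothetical sketch names, not the actual ReplicationSinkManager API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Supplier;

// Hypothetical sink manager: re-fetches the peer's regionserver list both
// when enough failures accumulate AND when a change notification arrives,
// instead of only reacting to failures.
public class SinkManagerSketch {
    static final int BAD_SINK_THRESHOLD = 3; // illustrative value

    private final Supplier<List<String>> peerListFetcher; // e.g. a ZooKeeper read
    private final Random random = new Random();
    private List<String> sinks = new ArrayList<>();
    private int failureCount = 0;

    SinkManagerSketch(Supplier<List<String>> peerListFetcher) {
        this.peerListFetcher = peerListFetcher;
        refreshSinks();
    }

    // Called from a watcher when the peer cluster's regionserver list changes.
    synchronized void onPeerListChanged() { refreshSinks(); }

    // Called when a replication write to the chosen sink fails.
    synchronized void reportSinkFailure() {
        if (++failureCount >= BAD_SINK_THRESHOLD) refreshSinks();
    }

    synchronized String chooseSink() {
        return sinks.get(random.nextInt(sinks.size()));
    }

    private void refreshSinks() {
        sinks = new ArrayList<>(peerListFetcher.get());
        failureCount = 0;
    }

    public static void main(String[] args) {
        List<String> peers = new ArrayList<>(List.of("rs1:16020", "rs2:16020"));
        SinkManagerSketch mgr = new SinkManagerSketch(() -> peers);
        System.out.println(mgr.chooseSink()); // one of rs1/rs2
        peers.add("rs3:16020");               // peer cluster grew
        mgr.onPeerListChanged();              // watcher fires; rs3 becomes eligible
        System.out.println(mgr.sinks.size()); // prints 3
    }
}
```

The change-notification path is what fixes the second half of the problem statement: list changes are noticed even when no writes are failing.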
[jira] [Commented] (HBASE-8408) Implement namespace
[ https://issues.apache.org/jira/browse/HBASE-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726807#comment-13726807 ] Hadoop QA commented on HBASE-8408: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12595462/HBASE-8015_9.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 653 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 17 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestAtomicOperation org.apache.hadoop.hbase.TestNamespaceUpgrade {color:red}-1 core zombie tests{color}. 
There are 2 zombie test(s): at org.apache.hadoop.hbase.master.TestMasterNoCluster.testNotPullingDeadRegionServerFromZK(TestMasterNoCluster.java:408) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6557//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6557//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6557//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6557//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6557//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6557//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6557//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6557//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6557//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6557//console This message is automatically generated. 
Implement namespace --- Key: HBASE-8408 URL: https://issues.apache.org/jira/browse/HBASE-8408 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-8015_1.patch, HBASE-8015_2.patch, HBASE-8015_3.patch, HBASE-8015_4.patch, HBASE-8015_5.patch, HBASE-8015_6.patch, HBASE-8015_7.patch, HBASE-8015_8.patch, HBASE-8015_9.patch, HBASE-8015.patch, TestNamespaceMigration.tgz, TestNamespaceUpgrade.tgz -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9111) Put back TestReplicationKill* except for the MasterRSCompressed one
[ https://issues.apache.org/jira/browse/HBASE-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726813#comment-13726813 ] Jean-Daniel Cryans commented on HBASE-9111: --- Erm, I'm not seeing what's wrong in the log... It works fine on my machine. I'll just commit and see what happens. Put back TestReplicationKill* except for the MasterRSCompressed one --- Key: HBASE-9111 URL: https://issues.apache.org/jira/browse/HBASE-9111 Project: HBase Issue Type: Task Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.0, 0.95.2 Attachments: HBASE-9111.patch TestReplicationKillMasterRSCompressed was the only one affected in HBASE-8615, so it would be good to keep the others around. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9111) Put back TestReplicationKill* except for the MasterRSCompressed one
[ https://issues.apache.org/jira/browse/HBASE-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726818#comment-13726818 ] Jean-Daniel Cryans commented on HBASE-9111: --- Ugh I'm an idiot I already committed it. Brain and I need to have a talk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8224) Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string
[ https://issues.apache.org/jira/browse/HBASE-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8224: - Summary: Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string (was: Add '-hadoop1' or '-hadoop2' to our version string) Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string --- Key: HBASE-8224 URL: https://issues.apache.org/jira/browse/HBASE-8224 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.95.2 Attachments: 8224-adding.classifiers.txt, 8224.gen.script.txt, hbase-8224-proto1.patch So we can publish both the hadoop1 and the hadoop2 jars to a maven repository, and so we can publish two packages, one for hadoop1 and one for hadoop2, given how maven works, our only alternative (to the best of my knowledge and after consulting others) is to amend the version string to include hadoop1 or hadoop2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9094) Rationalize dependencies; fix analysis complaints about used but non-declared dependencies
[ https://issues.apache.org/jira/browse/HBASE-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-9094: - Resolution: Duplicate Status: Resolved (was: Patch Available) Marking as a dup of HBASE-8224 (or at least subsumed by HBASE-8224). The patch attached here has been rolled into HBASE-8224's. It is easier doing this rationalization as part of the HBASE-8224 project. Rationalize dependencies; fix analysis complaints about used but non-declared dependencies -- Key: HBASE-9094 URL: https://issues.apache.org/jira/browse/HBASE-9094 Project: HBase Issue Type: Sub-task Components: build Reporter: stack Assignee: stack Fix For: 0.98.0, 0.95.2 Attachments: dep2.txt, dep3.txt, dep.txt Do a cleanup pass through our dependency set so downstreamers get a good picture of what they need to include by looking at the pom. Poking with the dependency plugin, found the following issues: + hbase-common is missing listing of a bunch of commons libs + Some of our classes make use of slf4j for no good reason. zk, thrift, netty, and yammer use slf4j, but there is no need for us to be in the slf4j managing game... so I undid our use of it and stopped its transitive include everywhere. + hbase-examples and hbase-it do not declare includes like hbase-client, hbase-protocol, etc. + hbase-hadoop1-compat depended on log4j but doesn't use it. + hbase-prefix-tree depends on hadoop but does not declare it (uses WritableUtils and RawComparator -- just two imports... it also uses the Audience annotations, which we could just remove, but these two critical includes remain) + hbase-server wasn't declaring its use of commons-*. Also added excludes of the transitive include of log4j by zk and yammer. Got rid of slf4j as a dependency. + Add declarations for used libs such as httpclient and commons-math to the top-level pom. + Removed setting versions on the hadoop1 and hadoop2 profiles for the slf4j mess. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9111) Put back TestReplicationKill* except for the MasterRSCompressed one
[ https://issues.apache.org/jira/browse/HBASE-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726882#comment-13726882 ] Hudson commented on HBASE-9111: --- SUCCESS: Integrated in HBase-TRUNK #4329 (See [https://builds.apache.org/job/HBase-TRUNK/4329/]) HBASE-9111 Put back TestReplicationKill* except for the MasterRSCompressed one (jdcryans: rev 1509358) * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationKillMasterRS.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationKillRS.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationKillSlaveRS.java -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7634) Replication handling of changes to peer clusters is inefficient
[ https://issues.apache.org/jira/browse/HBASE-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726883#comment-13726883 ] Hudson commented on HBASE-7634: --- SUCCESS: Integrated in HBase-TRUNK #4329 (See [https://builds.apache.org/job/HBase-TRUNK/4329/]) HBASE-7634 Replication handling of changes to peer clusters is inefficient (Gabriel Reid via JD) (jdcryans: rev 1509332) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeers.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeersZKImpl.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSinkManager.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationChangingPeerRegionservers.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSinkManager.java -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-8488) HBase transitive dependencies not being pulled in when building apps like Flume which depend on HBase
[ https://issues.apache.org/jira/browse/HBASE-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-8488. -- Resolution: Fixed Marking as duplicate HBASE-8224. HBASE-8224 fixes the original issue where we had a variable in our poms that was not being interpolated. HBase transitive dependencies not being pulled in when building apps like Flume which depend on HBase - Key: HBASE-8488 URL: https://issues.apache.org/jira/browse/HBASE-8488 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.95.0 Reporter: Roshan Naik Assignee: stack Priority: Blocker Fix For: 0.98.0, 0.95.2 Attachments: client.tgz Here is a snippet of the errors seen when building against Hbase {code} [WARNING] Invalid POM for org.apache.hbase:hbase-common:jar:0.97.0-SNAPSHOT, transitive dependencies (if any) will not be available, enable debug logging for more details: Some problems were encountered while processing the POMs: [ERROR] 'dependencyManagement.dependencies.dependency.artifactId' for org.apache.hbase:${compat.module}:jar with value '${compat.module}' does not match a valid id pattern. @ org.apache.hbase:hbase:0.97.0-SNAPSHOT, /Users/rnaik/.m2/repository/org/apache/hbase/hbase/0.97.0-SNAPSHOT/hbase-0.97.0-SNAPSHOT.pom, line 982, column 21 [ERROR] 'dependencyManagement.dependencies.dependency.artifactId' for org.apache.hbase:${compat.module}:test-jar with value '${compat.module}' does not match a valid id pattern. @ org.apache.hbase:hbase:0.97.0-SNAPSHOT, /Users/rnaik/.m2/repository/org/apache/hbase/hbase/0.97.0-SNAPSHOT/hbase-0.97.0-SNAPSHOT.pom, line 987, column 21 {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9095) AssignmentManager's handleRegion should respect the single threaded nature of the processing
[ https://issues.apache.org/jira/browse/HBASE-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-9095: --- Component/s: Region Assignment Fix Version/s: 0.95.2 Assignee: Devaraj Das AssignmentManager's handleRegion should respect the single threaded nature of the processing Key: HBASE-9095 URL: https://issues.apache.org/jira/browse/HBASE-9095 Project: HBase Issue Type: Bug Components: Region Assignment Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.95.2 Attachments: 9095-1.txt While debugging a case where a region was getting opened on a RegionServer and then closed soon after (and then never re-opened anywhere thereafter), it seemed like the ZK node deletion done in handleRegion should be synchronous. This achieves two things: 1. The synchronous deletion prevents the same event data from being processed more than once. Assuming that we do get more than one notification (on, let's say, a region OPENED event), the subsequent processing(s) in handleRegion for the same znode would end up with a zookeeper node not found exception. The return value of the data read would be null and that's already handled. If it is asynchronous, it leads to issues like this: the master opens a region on a certain RegionServer and soon after it sends that RegionServer a close for the same region, and then the znode is deleted. 2. The deletion is currently handled in an executor service. This is problematic since by design the events for a given region should be processed in order. By delegating a part of the processing to an executor service we are somewhat violating this contract since there is no guarantee of the ordering in the executor service executions... Thanks to [~jeffreyz] and [~enis] for the discussions on this issue. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9095) AssignmentManager's handleRegion should respect the single threaded nature of the processing
[ https://issues.apache.org/jira/browse/HBASE-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-9095: --- Attachment: 9095-1.txt Should fix the issue. AssignmentManager's handleRegion should respect the single threaded nature of the processing Key: HBASE-9095 URL: https://issues.apache.org/jira/browse/HBASE-9095 Project: HBase Issue Type: Bug Reporter: Devaraj Das Attachments: 9095-1.txt While debugging a case where a region was getting opened on a RegionServer and then closed soon after (and then never re-opened anywhere thereafter), it seemed like the ZK node deletion done in handleRegion should be synchronous. This achieves two things: 1. The synchronous deletion prevents the same event data from being processed more than once. Assuming that we do get more than one notification (on, let's say, a region OPENED event), the subsequent processing(s) in handleRegion for the same znode would end up with a zookeeper node not found exception. The return value of the data read would be null and that's already handled. If it is asynchronous, it leads to issues like this: the master opens a region on a certain RegionServer and soon after it sends that RegionServer a close for the same region, and then the znode is deleted. 2. The deletion is currently handled in an executor service. This is problematic since by design the events for a given region should be processed in order. By delegating a part of the processing to an executor service we are somewhat violating this contract since there is no guarantee of the ordering in the executor service executions... Thanks to [~jeffreyz] and [~enis] for the discussions on this issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9095) AssignmentManager's handleRegion should respect the single threaded nature of the processing
[ https://issues.apache.org/jira/browse/HBASE-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-9095: --- Status: Patch Available (was: Open) AssignmentManager's handleRegion should respect the single threaded nature of the processing Key: HBASE-9095 URL: https://issues.apache.org/jira/browse/HBASE-9095 Project: HBase Issue Type: Bug Components: Region Assignment Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.95.2 Attachments: 9095-1.txt While debugging a case where a region was getting opened on a RegionServer and then closed soon after (and then never re-opened anywhere thereafter), it seemed like the ZK node deletion done in handleRegion should be synchronous. This achieves two things: 1. The synchronous deletion prevents the same event data from being processed more than once. Assuming that we do get more than one notification (on, let's say, a region OPENED event), the subsequent processing(s) in handleRegion for the same znode would end up with a zookeeper node not found exception. The return value of the data read would be null and that's already handled. If it is asynchronous, it leads to issues like this: the master opens a region on a certain RegionServer and soon after it sends that RegionServer a close for the same region, and then the znode is deleted. 2. The deletion is currently handled in an executor service. This is problematic since by design the events for a given region should be processed in order. By delegating a part of the processing to an executor service we are somewhat violating this contract since there is no guarantee of the ordering in the executor service executions... Thanks to [~jeffreyz] and [~enis] for the discussions on this issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
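The ordering argument in point 2 is worth making concrete: once region events go to a shared thread pool, nothing guarantees they finish in submission order. One standard way to keep per-region ordering while still using threads (a sketch with made-up names, not the actual AssignmentManager code) is to pin all events for a given region to a single-threaded executor:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: stripe events so that all events for one region
// are handled by the same single-threaded executor, preserving their
// order, while different regions can still be processed in parallel.
class StripedRegionEventExecutor {
    private final ExecutorService[] stripes;

    StripedRegionEventExecutor(int nStripes) {
        stripes = new ExecutorService[nStripes];
        for (int i = 0; i < nStripes; i++) {
            stripes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Events for the same region name always hash to the same stripe,
    // so they run one at a time, in submission order.
    void submit(String regionName, Runnable event) {
        int idx = Math.floorMod(regionName.hashCode(), stripes.length);
        stripes[idx].execute(event);
    }

    void shutdown() {
        for (ExecutorService es : stripes) {
            es.shutdown();
            try {
                es.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

With this layout an OPENING event and the subsequent OPENED event for the same region cannot be reordered, which is exactly the contract the issue says the plain executor-service delegation violates.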
[jira] [Commented] (HBASE-9111) Put back TestReplicationKill* except for the MasterRSCompressed one
[ https://issues.apache.org/jira/browse/HBASE-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726921#comment-13726921 ] Hudson commented on HBASE-9111: --- SUCCESS: Integrated in hbase-0.95 #394 (See [https://builds.apache.org/job/hbase-0.95/394/]) HBASE-9111 Put back TestReplicationKill* except for the MasterRSCompressed one (jdcryans: rev 1509357) * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationKillMasterRS.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationKillRS.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationKillSlaveRS.java Put back TestReplicationKill* except for the MasterRSCompressed one --- Key: HBASE-9111 URL: https://issues.apache.org/jira/browse/HBASE-9111 Project: HBase Issue Type: Task Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.0, 0.95.2 Attachments: HBASE-9111.patch TestReplicationKillMasterRSCompressed was the only one affected in HBASE-8615 so it would be good to keep the others around. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9092) OpenRegion could be ignored by mistake
[ https://issues.apache.org/jira/browse/HBASE-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726920#comment-13726920 ] Hudson commented on HBASE-9092: --- SUCCESS: Integrated in hbase-0.95 #394 (See [https://builds.apache.org/job/hbase-0.95/394/]) HBASE-9092 OpenRegion could be ignored by mistake (jxiang: rev 1509385) * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java * /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java OpenRegion could be ignored by mistake -- Key: HBASE-9092 URL: https://issues.apache.org/jira/browse/HBASE-9092 Project: HBase Issue Type: Bug Components: Region Assignment Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.98.0, 0.95.2 Attachments: trunk-9092.patch Looked into failed test: http://54.241.6.143/job/HBase-0.95/org.apache.hbase$hbase-server/721/testReport/ In this test run, several tests in TestAssignmentManagerOnCluster failed. Most of them timed out because the first failure testOpenFailedUnrecoverable used too much resource in deleting the table. http://54.241.6.143/job/HBase-0.95/org.apache.hbase$hbase-server/721/testReport/org.apache.hadoop.hbase.master/TestAssignmentManagerOnCluster/testOpenFailedUnrecoverable/ The reason testOpenFailedUnrecoverable failed is that the second openRegion call was ignored since the previous open call was still going on and stayed in OpenRegionHandler#doCleanUpOnFailedOpen for too long (perhaps a thread scheduling issue). The second openRegion call was skipped since the region was still in the middle of opening. However, the failed_open event was already processed by the master. Therefore the region was stuck in transition and the table deletion went nowhere. 
It is similar to an issue we ran into before, except that time the region was closing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9103) Print lines that are longer than allowed in HadoopQA output.
[ https://issues.apache.org/jira/browse/HBASE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9103: -- Attachment: 9103.addendum Print lines that are longer than allowed in HadoopQA output. Key: HBASE-9103 URL: https://issues.apache.org/jira/browse/HBASE-9103 Project: HBase Issue Type: Improvement Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.98.0 Attachments: 9103.addendum, hbase-9103-v0.patch It's a little annoying not to know which lines are too long - it's helpful to see the ones that are over. Generally, this will be a small number of lines that the formatter didn't get quite right, so massive logging statements should be rare. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
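What the issue asks for amounts to a simple filter over the patch text: emit each offending line with its number instead of only a count. The real check lives in the dev-support test-patch script; the sketch below just illustrates the idea:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: report every line longer than the limit, prefixed with its
// 1-based line number, so an author can find offenders from QA output.
class LongLineReporter {
    static List<String> report(List<String> lines, int maxLen) {
        List<String> offenders = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).length() > maxLen) {
                offenders.add((i + 1) + ": " + lines.get(i));
            }
        }
        return offenders;
    }
}
```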
[jira] [Updated] (HBASE-8224) Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string
[ https://issues.apache.org/jira/browse/HBASE-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8224: - Attachment: 8224.gen.scriptv3.txt 8224.gen.scriptv3.txt Adds script to generate hadoop1 and hadoop2 poms that have modules and profiles appropriately set. Also cleaned up our dependency settings after review of dependency:tree and dependency:analyze w/ hadoop1 and hadoop2 setups. Purged our use of slf4j. We don't need it. A dev-support/generate-hadoopX-poms.sh Script to generate pom.xml.hadoop1 or pom.xml.hadoop2 everywhere which we pass to maven w/ -f flag when we want to publish coherent hbase-hadoop1 and hbase-hadoop2. M hbase-client/pom.xml Set marker string under hadoop1.1 profile. Ditto for hadoop2 profile for use by above script so it can find where to set profiles. Declare dependencies we were using anyway. M hbase-common/pom.xml M hbase-examples/pom.xml M hbase-prefix-tree/pom.xml M hbase-it/pom.xml M hbase-server/pom.xml Purge unused slf4j. Declare dependencies we were using anyway. Set marker string under hadoop1.1 profile. Ditto for hadoop2 profile for use by above script so it can find where to set profiles. M hbase-common/src/main/java/org/apache/hadoop/hbase/util/JVM.java Remove unwarranted use of slf4j. Use our usual logging instead. M hbase-hadoop1-compat/pom.xml M hbase-hadoop2-compat/pom.xml Moved the dependency up into parent pom rather than have it repeat twice, once here and once in hadoop2-compat. Declare dependencies we were using anyway. M hbase-server/src/main/java/org/apache/hadoop/hbase/thrift/HThreadedSelectorServerArgs.java M hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java Use commons logging instead of slf4j. M pom.xml Added to dependencyManagement libs we were using already but undeclared. Removed our dependence on an explicit slf4j version -- let the transitive includes sort it out since we don't use it anymore. Declare dependencies we were using anyway. 
Add some excludes for stuff we don't need. Set marker string under hadoop1.1 profile. Ditto for hadoop2 profile for use by above script so it can find where to set profiles. Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string --- Key: HBASE-8224 URL: https://issues.apache.org/jira/browse/HBASE-8224 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.95.2 Attachments: 8224-adding.classifiers.txt, 8224.gen.script.txt, 8224.gen.scriptv3.txt, 8224.gen.scriptv3.txt, hbase-8224-proto1.patch So we can publish both the hadoop1 and the hadoop2 jars to a maven repository, and so we can publish two packages, one for hadoop1 and one for hadoop2, given how maven works, our only alternative (to the best of my knowledge and after consulting others) is by amending the version string to include hadoop1 or hadoop2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8224) Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string
[ https://issues.apache.org/jira/browse/HBASE-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8224: - Fix Version/s: 0.98.0 Status: Patch Available (was: Open) Trying against hadoopqa. Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string --- Key: HBASE-8224 URL: https://issues.apache.org/jira/browse/HBASE-8224 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.98.0, 0.95.2 Attachments: 8224-adding.classifiers.txt, 8224.gen.script.txt, 8224.gen.scriptv3.txt, 8224.gen.scriptv3.txt, hbase-8224-proto1.patch So we can publish both the hadoop1 and the hadoop2 jars to a maven repository, and so we can publish two packages, one for hadoop1 and one for hadoop2, given how maven works, our only alternative (to the best of my knowledge and after consulting others) is by amending the version string to include hadoop1 or hadoop2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8224) Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string
[ https://issues.apache.org/jira/browse/HBASE-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726929#comment-13726929 ] stack commented on HBASE-8224: -- I tested building assemblies. They seem to work. Now let me try and use the mvn release plugin. See what it thinks. Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string --- Key: HBASE-8224 URL: https://issues.apache.org/jira/browse/HBASE-8224 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.98.0, 0.95.2 Attachments: 8224-adding.classifiers.txt, 8224.gen.script.txt, 8224.gen.scriptv3.txt, 8224.gen.scriptv3.txt, hbase-8224-proto1.patch So we can publish both the hadoop1 and the hadoop2 jars to a maven repository, and so we can publish two packages, one for hadoop1 and one for hadoop2, given how maven works, our only alternative (to the best of my knowledge and after consulting others) is by amending the version string to include hadoop1 or hadoop2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viral Bajaria updated HBASE-9079: - Attachment: (was: HBASE-9079-0.94.patch) FilterList getNextKeyHint skips rows that should be included in the results --- Key: HBASE-9079 URL: https://issues.apache.org/jira/browse/HBASE-9079 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.10 Reporter: Viral Bajaria Attachments: TestFail.patch, TestSuccess.patch I hit a weird issue/bug and am able to reproduce the error consistently. The problem arises when FilterList has two filters where each implements the getNextKeyHint method. The way the current implementation works is, StoreScanner will call matcher.getNextKeyHint() whenever it gets a SEEK_NEXT_USING_HINT. This in turn will call filter.getNextKeyHint() which at this stage is of type FilterList. The implementation in FilterList iterates through all the filters and keeps the max KeyValue that it sees. All is fine if you wrap filters in a FilterList in which only one of them implements getNextKeyHint, but if multiple of them implement it then that's where things get weird. For example: - create two filters: one is FuzzyRowFilter and second is ColumnRangeFilter. Both of them implement getNextKeyHint - wrap them in FilterList with MUST_PASS_ALL - FuzzyRowFilter will seek to the correct first row and then pass it to ColumnRangeFilter which will return the SEEK_NEXT_USING_HINT code. - Now in FilterList when getNextKeyHint is called, it calls the one on FuzzyRow first which basically says what the next row should be. While in reality we want the ColumnRangeFilter to give the seek hint. - The above behavior skips data that should be returned, which I have verified by using a RowFilter with RegexStringComparator. I updated the FilterList to maintain state on which filter returns the SEEK_NEXT_USING_HINT and in getNextKeyHint, I invoke the method on the saved filter and reset that state. 
I tested it with my current queries and it works fine but I need to run the entire test suite to make sure I have not introduced any regression. In addition to that I need to figure out what should be the behavior when the operation is MUST_PASS_ONE, but I doubt it should be any different. Is my understanding of it being a bug correct? Or am I trivializing it and ignoring something very important? If it's tough to wrap your head around the explanation, then I can open a JIRA and upload a patch against 0.94 head. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viral Bajaria updated HBASE-9079: - Attachment: (was: HBASE-9079-trunk.patch) FilterList getNextKeyHint skips rows that should be included in the results --- Key: HBASE-9079 URL: https://issues.apache.org/jira/browse/HBASE-9079 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.10 Reporter: Viral Bajaria Attachments: TestFail.patch, TestSuccess.patch I hit a weird issue/bug and am able to reproduce the error consistently. The problem arises when FilterList has two filters where each implements the getNextKeyHint method. The way the current implementation works is, StoreScanner will call matcher.getNextKeyHint() whenever it gets a SEEK_NEXT_USING_HINT. This in turn will call filter.getNextKeyHint() which at this stage is of type FilterList. The implementation in FilterList iterates through all the filters and keeps the max KeyValue that it sees. All is fine if you wrap filters in a FilterList in which only one of them implements getNextKeyHint, but if multiple of them implement it then that's where things get weird. For example: - create two filters: one is FuzzyRowFilter and second is ColumnRangeFilter. Both of them implement getNextKeyHint - wrap them in FilterList with MUST_PASS_ALL - FuzzyRowFilter will seek to the correct first row and then pass it to ColumnRangeFilter which will return the SEEK_NEXT_USING_HINT code. - Now in FilterList when getNextKeyHint is called, it calls the one on FuzzyRow first which basically says what the next row should be. While in reality we want the ColumnRangeFilter to give the seek hint. - The above behavior skips data that should be returned, which I have verified by using a RowFilter with RegexStringComparator. I updated the FilterList to maintain state on which filter returns the SEEK_NEXT_USING_HINT and in getNextKeyHint, I invoke the method on the saved filter and reset that state. 
I tested it with my current queries and it works fine but I need to run the entire test suite to make sure I have not introduced any regression. In addition to that I need to figure out what should be the behavior when the operation is MUST_PASS_ONE, but I doubt it should be any different. Is my understanding of it being a bug correct? Or am I trivializing it and ignoring something very important? If it's tough to wrap your head around the explanation, then I can open a JIRA and upload a patch against 0.94 head. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9111) Put back TestReplicationKill* except for the MasterRSCompressed one
[ https://issues.apache.org/jira/browse/HBASE-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-9111: -- Resolution: Fixed Status: Resolved (was: Patch Available) Committed Put back TestReplicationKill* except for the MasterRSCompressed one --- Key: HBASE-9111 URL: https://issues.apache.org/jira/browse/HBASE-9111 Project: HBase Issue Type: Task Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.98.0, 0.95.2 Attachments: HBASE-9111.patch TestReplicationKillMasterRSCompressed was the only one affected in HBASE-8615 so it would be good to keep the others around. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viral Bajaria updated HBASE-9079: - Attachment: HBASE-9079-0.94.patch HBASE-9079-trunk.patch Fixed the long lines as recommended by Ted. FilterList getNextKeyHint skips rows that should be included in the results --- Key: HBASE-9079 URL: https://issues.apache.org/jira/browse/HBASE-9079 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.10 Reporter: Viral Bajaria Attachments: HBASE-9079-0.94.patch, HBASE-9079-trunk.patch, TestFail.patch, TestSuccess.patch I hit a weird issue/bug and am able to reproduce the error consistently. The problem arises when FilterList has two filters where each implements the getNextKeyHint method. The way the current implementation works is, StoreScanner will call matcher.getNextKeyHint() whenever it gets a SEEK_NEXT_USING_HINT. This in turn will call filter.getNextKeyHint() which at this stage is of type FilterList. The implementation in FilterList iterates through all the filters and keeps the max KeyValue that it sees. All is fine if you wrap filters in a FilterList in which only one of them implements getNextKeyHint, but if multiple of them implement it then that's where things get weird. For example: - create two filters: one is FuzzyRowFilter and second is ColumnRangeFilter. Both of them implement getNextKeyHint - wrap them in FilterList with MUST_PASS_ALL - FuzzyRowFilter will seek to the correct first row and then pass it to ColumnRangeFilter which will return the SEEK_NEXT_USING_HINT code. - Now in FilterList when getNextKeyHint is called, it calls the one on FuzzyRow first which basically says what the next row should be. While in reality we want the ColumnRangeFilter to give the seek hint. - The above behavior skips data that should be returned, which I have verified by using a RowFilter with RegexStringComparator. 
I updated the FilterList to maintain state on which filter returns the SEEK_NEXT_USING_HINT and in getNextKeyHint, I invoke the method on the saved filter and reset that state. I tested it with my current queries and it works fine but I need to run the entire test suite to make sure I have not introduced any regression. In addition to that I need to figure out what should be the behavior when the operation is MUST_PASS_ONE, but I doubt it should be any different. Is my understanding of it being a bug correct? Or am I trivializing it and ignoring something very important? If it's tough to wrap your head around the explanation, then I can open a JIRA and upload a patch against 0.94 head. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
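The fix described above - remember which filter asked for SEEK_NEXT_USING_HINT and take the hint from that filter alone - can be sketched outside HBase with stand-in types (the interface below is illustrative, not the real org.apache.hadoop.hbase.filter API):

```java
import java.util.List;

// Minimal stand-ins for the real Filter/FilterList types.
interface HintingFilter {
    // Returns true when this filter wants SEEK_NEXT_USING_HINT.
    boolean wantsSeekHint(String currentRow);
    String getNextKeyHint(String currentRow);
}

class HintTrackingFilterList implements HintingFilter {
    private final List<HintingFilter> filters;
    // The filter that last requested SEEK_NEXT_USING_HINT (the reporter's fix).
    private HintingFilter seekHintFilter;

    HintTrackingFilterList(List<HintingFilter> filters) {
        this.filters = filters;
    }

    @Override
    public boolean wantsSeekHint(String currentRow) {
        for (HintingFilter f : filters) {
            if (f.wantsSeekHint(currentRow)) {
                seekHintFilter = f;  // remember who asked for the seek
                return true;
            }
        }
        return false;
    }

    @Override
    public String getNextKeyHint(String currentRow) {
        // Ask only the filter that requested the seek, instead of taking
        // the max hint across all filters (the buggy behavior described
        // above), then reset the saved state.
        HintingFilter f = seekHintFilter;
        seekHintFilter = null;
        return f == null ? null : f.getNextKeyHint(currentRow);
    }
}
```

With the old take-the-max behavior, FuzzyRowFilter's hint could win even though ColumnRangeFilter requested the seek; tracking the requester returns the hint the scan actually needs.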
[jira] [Created] (HBASE-9113) Expose region statistics on table.jsp
Bryan Beaudreault created HBASE-9113: Summary: Expose region statistics on table.jsp Key: HBASE-9113 URL: https://issues.apache.org/jira/browse/HBASE-9113 Project: HBase Issue Type: New Feature Components: Admin, UI Reporter: Bryan Beaudreault Priority: Minor While Hannibal (https://github.com/sentric/hannibal) is great, the goal should be to eventually make it obsolete by providing the same features in the main HBase web UI (and HBaseAdmin API). The first step for that is region statistics on the table.jsp. Please provide the same statistics per-region on table.jsp as in rs-status.jsp. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8224) Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string
[ https://issues.apache.org/jira/browse/HBASE-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727018#comment-13727018 ] Hadoop QA commented on HBASE-8224: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12595489/8224.gen.scriptv3.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6560//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6560//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6560//console This message is automatically generated. 
Publish hbase build against h1 and h2 adding '-hadoop1' or '-hadoop2' to version string --- Key: HBASE-8224 URL: https://issues.apache.org/jira/browse/HBASE-8224 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.98.0, 0.95.2 Attachments: 8224-adding.classifiers.txt, 8224.gen.script.txt, 8224.gen.scriptv3.txt, 8224.gen.scriptv3.txt, hbase-8224-proto1.patch So we can publish both the hadoop1 and the hadoop2 jars to a maven repository, and so we can publish two packages, one for hadoop1 and one for hadoop2, given how maven works, our only alternative (to the best of my knowledge and after consulting others) is by amending the version string to include hadoop1 or hadoop2. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727020#comment-13727020 ] Hadoop QA commented on HBASE-9079: --
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12595494/HBASE-9079-0.94.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:red}-1 patch{color}. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6561//console
This message is automatically generated.
FilterList getNextKeyHint skips rows that should be included in the results --- Key: HBASE-9079 URL: https://issues.apache.org/jira/browse/HBASE-9079 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.10 Reporter: Viral Bajaria Attachments: HBASE-9079-0.94.patch, HBASE-9079-trunk.patch, TestFail.patch, TestSuccess.patch
I hit a weird issue/bug and am able to reproduce the error consistently. The problem arises when a FilterList has two filters, each of which implements the getNextKeyHint method. The way the current implementation works is: StoreScanner calls matcher.getNextKeyHint() whenever it gets a SEEK_NEXT_USING_HINT. This in turn calls filter.getNextKeyHint(), which at this stage is of type FilterList. The implementation in FilterList iterates through all the filters and keeps the max KeyValue that it sees. All is fine if you wrap filters in a FilterList in which only one of them implements getNextKeyHint, but if multiple of them implement it, things get weird. For example:
- Create two filters: one is FuzzyRowFilter and the second is ColumnRangeFilter. Both of them implement getNextKeyHint.
- Wrap them in a FilterList with MUST_PASS_ALL.
- FuzzyRowFilter will seek to the correct first row and then pass it to ColumnRangeFilter, which will return the SEEK_NEXT_USING_HINT code.
- Now when getNextKeyHint is called on the FilterList, it calls the one on FuzzyRowFilter first, which basically says what the next row should be, while in reality we want the ColumnRangeFilter to give the seek hint.
- The above behavior skips data that should be returned, which I have verified by using a RowFilter with RegexStringComparator.
I updated the FilterList to maintain state on which filter returned the SEEK_NEXT_USING_HINT; in getNextKeyHint, I invoke the method on the saved filter and reset that state. I tested it with my current queries and it works fine, but I need to run the entire test suite to make sure I have not introduced any regression. In addition to that, I need to figure out what the behavior should be when the operator is MUST_PASS_ONE, but I doubt it should be any different. Is my understanding that this is a bug correct? Or am I trivializing it and ignoring something very important? If it's tough to wrap your head around the explanation, then I can open a JIRA and upload a patch against 0.94 head.
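The fix described above can be sketched with a minimal, self-contained mock. This is not the real HBase Filter API: the Filter interface, ReturnCode enum, and the String-based "KeyValue" here are simplified stand-ins used only to illustrate remembering which wrapped filter requested the seek.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterListSketch {
    enum ReturnCode { INCLUDE, SEEK_NEXT_USING_HINT }

    // Hypothetical minimal filter interface standing in for HBase's Filter.
    interface Filter {
        ReturnCode filterKeyValue(String kv);
        String getNextKeyHint(String kv);
    }

    static class FilterList implements Filter {
        private final List<Filter> filters = new ArrayList<>();
        private Filter seekHintFilter; // state added by the proposed fix

        void addFilter(Filter f) { filters.add(f); }

        @Override
        public ReturnCode filterKeyValue(String kv) {
            for (Filter f : filters) {
                ReturnCode rc = f.filterKeyValue(kv);
                if (rc == ReturnCode.SEEK_NEXT_USING_HINT) {
                    seekHintFilter = f; // remember which filter asked for the seek
                    return rc;
                }
            }
            return ReturnCode.INCLUDE;
        }

        @Override
        public String getNextKeyHint(String kv) {
            // Old behavior: take the max hint across all filters (can overshoot rows).
            // Fixed behavior: delegate to exactly the filter that requested the seek,
            // then reset the saved state.
            Filter f = seekHintFilter;
            seekHintFilter = null;
            return f == null ? null : f.getNextKeyHint(kv);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a FuzzyRowFilter already positioned on the right row:
        // includes everything, but its hint would jump far ahead.
        Filter fuzzy = new Filter() {
            public ReturnCode filterKeyValue(String kv) { return ReturnCode.INCLUDE; }
            public String getNextKeyHint(String kv) { return "zzz"; }
        };
        // Stand-in for a ColumnRangeFilter: asks to seek within the current row.
        Filter colRange = new Filter() {
            public ReturnCode filterKeyValue(String kv) { return ReturnCode.SEEK_NEXT_USING_HINT; }
            public String getNextKeyHint(String kv) { return "row1/colA"; }
        };
        FilterList fl = new FilterList();
        fl.addFilter(fuzzy);
        fl.addFilter(colRange);
        fl.filterKeyValue("row1/col0");
        // The hint now comes from colRange, not the far-ahead "zzz" hint,
        // so no rows in between are skipped.
        System.out.println(fl.getNextKeyHint("row1/col0"));
    }
}
```

With the old max-of-all-hints behavior, the scanner would have been told to seek to "zzz" and everything between would be skipped; delegating to the saved filter keeps the seek within the intended row.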
[jira] [Commented] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727039#comment-13727039 ] Viral Bajaria commented on HBASE-9079: -- Removed the TestFail and TestSuccess patches which were only here to demonstrate what was breaking. FilterList getNextKeyHint skips rows that should be included in the results --- Key: HBASE-9079 URL: https://issues.apache.org/jira/browse/HBASE-9079 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.10 Reporter: Viral Bajaria Attachments: HBASE-9079-0.94.patch, HBASE-9079-trunk.patch
[jira] [Commented] (HBASE-9103) Print lines that are longer than allowed in HadoopQA output.
[ https://issues.apache.org/jira/browse/HBASE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727037#comment-13727037 ] Hudson commented on HBASE-9103: --- SUCCESS: Integrated in HBase-TRUNK #4330 (See [https://builds.apache.org/job/HBase-TRUNK/4330/]) HBASE-9103: Print lines that are longer than allowed in HadoopQA output. (jyates: rev 1509381) * /hbase/trunk/dev-support/test-patch.sh Print lines that are longer than allowed in HadoopQA output. Key: HBASE-9103 URL: https://issues.apache.org/jira/browse/HBASE-9103 Project: HBase Issue Type: Improvement Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.98.0 Attachments: 9103.addendum, hbase-9103-v0.patch It's a little annoying not to know which lines are too long; it is helpful to print the ones that are over the limit. Generally, this will be a small number of lines that the formatter didn't get quite right, so massive logging statements should be rare.
[jira] [Updated] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viral Bajaria updated HBASE-9079: - Attachment: (was: TestSuccess.patch) FilterList getNextKeyHint skips rows that should be included in the results --- Key: HBASE-9079 URL: https://issues.apache.org/jira/browse/HBASE-9079 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.10 Reporter: Viral Bajaria Attachments: HBASE-9079-0.94.patch, HBASE-9079-trunk.patch
[jira] [Updated] (HBASE-9079) FilterList getNextKeyHint skips rows that should be included in the results
[ https://issues.apache.org/jira/browse/HBASE-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viral Bajaria updated HBASE-9079: - Attachment: (was: TestFail.patch) FilterList getNextKeyHint skips rows that should be included in the results --- Key: HBASE-9079 URL: https://issues.apache.org/jira/browse/HBASE-9079 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.10 Reporter: Viral Bajaria Attachments: HBASE-9079-0.94.patch, HBASE-9079-trunk.patch