[jira] [Created] (HBASE-17453) add Ping into HBase server for deprecated GetProtocolVersion

2017-01-11 Thread Tianying Chang (JIRA)
Tianying Chang created HBASE-17453:
--

 Summary: add Ping into HBase server for deprecated 
GetProtocolVersion
 Key: HBASE-17453
 URL: https://issues.apache.org/jira/browse/HBASE-17453
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 1.2.2
Reporter: Tianying Chang
Assignee: Tianying Chang
Priority: Minor


Our HBase service is hosted in AWS. We saw cases where the connection between 
the client (AsyncHBase in our case) and the server stopped working without 
throwing any exception, so traffic got stuck. We therefore added a "Ping" 
feature in AsyncHBase 1.5 that uses the GetProtocolVersion() API provided on 
the RS side: if there is no traffic for a given time, we send a "Ping"; if no 
response comes back for the "Ping", we assume the connection is bad and 
reconnect. 

Now we are upgrading our cluster from 0.94 to 1.2. However, 
GetProtocolVersion() is deprecated. To support the same detect/reconnect 
feature, we added Ping() in our internal HBase 1.2 branch, and also patched 
AsyncHBase 1.7 accordingly.

We would like to open source this feature since it is useful for use cases in 
AWS environments. 
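
The detect/reconnect logic described above can be sketched roughly as follows. This is only an illustration of the idea, not the AsyncHBase or HBase patch: the class name IdlePingSketch, the Action enum, and the two thresholds are hypothetical, and the caller is assumed to send the actual Ping RPC and to do the reconnect.

```java
// Hypothetical sketch of an idle-connection ping state machine.
// Nothing here is taken from AsyncHBase or HBase source code.
final class IdlePingSketch {
    enum Action { NONE, SEND_PING, RECONNECT }

    private final long idleMs;      // quiet period before a ping is sent
    private final long pingWaitMs;  // how long to wait for the ping reply
    private long lastTrafficMs;     // time of the last response seen
    private long pingSentMs = -1;   // -1 means no ping is outstanding

    IdlePingSketch(long idleMs, long pingWaitMs, long nowMs) {
        this.idleMs = idleMs;
        this.pingWaitMs = pingWaitMs;
        this.lastTrafficMs = nowMs;
    }

    // Call whenever any response (including a ping reply) arrives.
    void onTraffic(long nowMs) {
        lastTrafficMs = nowMs;
        pingSentMs = -1;
    }

    // Call periodically from a timer; the caller performs the returned action.
    Action check(long nowMs) {
        if (pingSentMs >= 0) {
            if (nowMs - pingSentMs >= pingWaitMs) {
                // No reply to the ping: treat the connection as bad.
                pingSentMs = -1;
                lastTrafficMs = nowMs;
                return Action.RECONNECT;
            }
            return Action.NONE; // still waiting for the ping reply
        }
        if (nowMs - lastTrafficMs >= idleMs) {
            pingSentMs = nowMs;
            return Action.SEND_PING;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        IdlePingSketch monitor = new IdlePingSketch(1000, 200, 0);
        System.out.println(monitor.check(1000)); // idle long enough: ping
        System.out.println(monitor.check(1200)); // ping unanswered: reconnect
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real client would presumably drive check() from its own timer.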


We used GetProtocolVersion in AsyncHBase to detect unhealthy connection to RS 
since in AWS, sometimes it enters a state the connection 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17452) Failed taking snapshot - region Manifest proto-message too large

2017-01-11 Thread huaxiang sun (JIRA)
huaxiang sun created HBASE-17452:


 Summary: Failed taking snapshot - region Manifest proto-message 
too large
 Key: HBASE-17452
 URL: https://issues.apache.org/jira/browse/HBASE-17452
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 2.0.0
Reporter: huaxiang sun
Assignee: huaxiang sun


With MOB, there can be a large number of files under mobdir, so the region 
manifest can become very large when taking a snapshot. Similar to HBASE-15430, 
we are seeing use cases run into the following exception:

{code}
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
    at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
    at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
    at com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811)
    at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile.<init>(SnapshotProtos.java:1313)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile.<init>(SnapshotProtos.java:1263)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile$1.parsePartialFrom(SnapshotProtos.java:1364)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile$1.parsePartialFrom(SnapshotProtos.java:1359)
    at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles.<init>(SnapshotProtos.java:2161)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles.<init>(SnapshotProtos.java:2103)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles$1.parsePartialFrom(SnapshotProtos.java:2197)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles$1.parsePartialFrom(SnapshotProtos.java:2192)
    at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.<init>(SnapshotProtos.java:1165)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.<init>(SnapshotProtos.java:1094)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1201)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1196)
    at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
    at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
    at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
    at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
    at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.parseFrom(SnapshotProtos.java:3111)
    at org.apache.hadoop.hbase.snapshot.SnapshotManifestV2$2.call(SnapshotManifestV2.java:139)
    at org.apache.hadoop.hbase.snapshot.SnapshotManifestV2$2.call(SnapshotManifestV2.java:134)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    ... 1 more
{code}





Re: Proposal: Create "branch RM" roles for non releasing branches branch-1, branch-2 (when it exists), and master

2017-01-11 Thread Enis Söztutar
I was thinking along similar lines, that the RM for the next 1.x release
would manage branch-1, but I am also concerned about the large gap in
terms of timing. For example, unless we are close to 1.4, a 1.4 RM will
not materialize.

So, I am in favor of having an informal branch-1 RM who will work with the
1.x RMs. A +1 for Andrew for that role.

Enis

On Wed, Jan 11, 2017 at 1:17 PM, Andrew Purtell 
wrote:

> We could do it that way but there would be nobody promising to watch
> branch-1 for any length of time. I'd like to do that. We could do this
> alternative for branch-2. And it makes sense once we have this sorted to
> write down what we'd like to do.
>
>
> > On Jan 9, 2017, at 3:27 PM, Nick Dimiduk  wrote:
> >
> > Somewhat late to the reply --
> >
> > Does it make sense, for branch-1, to have the person planning to RM the
> > next minor release act as the RM for the major-level branch? That person
> > would hand responsibility to the next minor RM upon cutting the
> > stabilization branch.
> >
> > This could be applied to master/branch-2 as well, but the further away we
> > get from a target release date, the more nebulous the RM role becomes.
> >
> >> On Fri, Jan 6, 2017 at 5:07 PM Andrew Purtell wrote:
> >>
> >> HBasers,
> >>
> >> I would like to propose extending our informal "branch RM" concept just a
> >> bit to include the nonreleasing branches like branch-1, branch-2 (when it
> >> exists), and master. These branches are where all commits are made passing
> >> through down to the releasing branches targeted for the change (like,
> >> branch-1.1, branch-1.2, branch-1.3, etc.)
> >>
> >> The releasing branches all have their own RM. I assume that RM is
> >> diligently monitoring its state, by way of review of commit history,
> >> occasional execution of the unit test suite, occasional execution of the
> >> integration tests, and has perhaps some automation in place to help with
> >> that on a nightly or weekly basis. No matter, let's assume there is a
> >> nonzero level of scrutiny applied to them, which leads to feedback to
> >> committers about inappropriate commits via compat guidelines, commits which
> >> have broken unit tests, or other indications of quality or functional
> >> concerns.  I think it would improve our overall velocity as a project if we
> >> could also have volunteers tending the development branches upstream from
> >> the releasing branches. Less work would fall to the RMs tending the release
> >> branches if a common troublesome commit can be caught upstream first. In
> >> particular I am thinking about branch-1.
> >>
> >> I would like to volunteer to become the new RM for branch-1, to test and
> >> refine my above proposal in practice. Unless I hear objections I will
> >> assume by lazy consensus everyone is ok with this experiment.
> >>
> >> What this would mean:
> >>
> >>   - JIRAs like "TestFooBar is broken on branch-1" will show up sooner, and
> >>   more likely with fix patches
> >>   - Semiregular performance reports on branch-1 code as of date X/Y/Z, can
> >>   compare with earlier reports for trending
> >>   - Occasional sweep through master history looking for appropriate
> >>   candidates for backport to branch-1, execution of said backport
> >>   - Occasional 1B row ITBLL torture tests, probably if failure with bisect
> >>   back to commit that introduced instability
> >>
> >> What this does not mean:
> >>
> >>   - The branch-1 RM will not attempt to tell other branch RMs what or what
> >>   not to include in their release branches
> >>   - The branch-1 RM won't commit anything backported from master to any of
> >>   the release branches; it will continue to be up to the release branch RMs
> >>   what they would or would not like to be included
> >>
> >> Also, I don't see why I couldn't spend some time looking at master now and
> >> then.
> >>
> >> I am going to assume our current co-RM team for branch-2 would maybe do
> >> something similar for branch-2, once it materializes.
> >>
> >> Thoughts? Comments? Concerns?
> >>
> >> --
> >> Best regards,
> >>
> >>   - Andy
> >>
> >> If you are given a choice, you believe you have acted freely. - Raymond
> >> Teller (via Peter Watts)
> >>
>


Re: Proposal: Create "branch RM" roles for non releasing branches branch-1, branch-2 (when it exists), and master

2017-01-11 Thread Andrew Purtell
We could do it that way but there would be nobody promising to watch branch-1 
for any length of time. I'd like to do that. We could do this alternative for 
branch-2. And it makes sense once we have this sorted to write down what we'd 
like to do. 


> On Jan 9, 2017, at 3:27 PM, Nick Dimiduk  wrote:
> 
> Somewhat late to the reply --
> 
> Does it make sense, for branch-1, to have the person planning to RM the
> next minor release act as the RM for the major-level branch? That person
> would hand responsibility to the next minor RM upon cutting the
> stabilization branch.
> 
> This could be applied to master/branch-2 as well, but the further away we
> get from a target release date, the more nebulous the RM role becomes.
> 


Successful: HBase Generate Website

2017-01-11 Thread Apache Jenkins Server
Build status: Successful

If successful, the website and docs have been generated. To update the live 
site, follow the instructions below. If failed, skip to the bottom of this 
email.

Use the following commands to download the patch and apply it to a clean branch 
based on origin/asf-site. If you prefer to keep the hbase-site repo around 
permanently, you can skip the clone step.

  git clone https://git-wip-us.apache.org/repos/asf/hbase-site.git

  cd hbase-site
  wget -O- https://builds.apache.org/job/hbase_generate_website/458/artifact/website.patch.zip | funzip > 953416eb3411f7361f39283fabd4a555bfc873f0.patch
  git fetch
  git checkout -b asf-site-953416eb3411f7361f39283fabd4a555bfc873f0 origin/asf-site
  git am --whitespace=fix 953416eb3411f7361f39283fabd4a555bfc873f0.patch

At this point, you can preview the changes by opening index.html or any of the 
other HTML pages in your local 
asf-site-953416eb3411f7361f39283fabd4a555bfc873f0 branch.

There are lots of spurious changes, such as timestamps and CSS styles in 
tables, so a generic git diff is not very useful. To see a list of files that 
have been added, deleted, renamed, changed type, or are otherwise interesting, 
use the following command:

  git diff --name-status --diff-filter=ADCRTXUB origin/asf-site

To see only files that had 100 or more lines changed:

  git diff --stat origin/asf-site | grep -E '[1-9][0-9]{2,}'

When you are satisfied, publish your changes to origin/asf-site using these 
commands:

  git commit --allow-empty -m "Empty commit" # to work around a current ASF INFRA bug
  git push origin asf-site-953416eb3411f7361f39283fabd4a555bfc873f0:asf-site
  git checkout asf-site
  git branch -D asf-site-953416eb3411f7361f39283fabd4a555bfc873f0

Changes take a couple of minutes to be propagated. You can verify whether they 
have been propagated by looking at the Last Published date at the bottom of 
http://hbase.apache.org/. It should match the date in the index.html on the 
asf-site branch in Git.

As a courtesy, reply-all to this email to let other committers know you pushed 
the site.



If failed, see https://builds.apache.org/job/hbase_generate_website/458/console

[jira] [Created] (HBASE-17451) [C++] HBase Request and Response Converter

2017-01-11 Thread Sudeep Sunthankar (JIRA)
Sudeep Sunthankar created HBASE-17451:
-

 Summary: [C++] HBase Request and Response Converter
 Key: HBASE-17451
 URL: https://issues.apache.org/jira/browse/HBASE-17451
 Project: HBase
  Issue Type: Sub-task
Reporter: Sudeep Sunthankar


Conversion of HBase client side data structures to Protobuf messages and 
vice-versa.





[jira] [Created] (HBASE-17450) TablePermission.equals throws NPE after support namespace

2017-01-11 Thread huzheng (JIRA)
huzheng created HBASE-17450:
---

 Summary: TablePermission.equals throws NPE  after support 
namespace 
 Key: HBASE-17450
 URL: https://issues.apache.org/jira/browse/HBASE-17450
 Project: HBase
  Issue Type: Bug
  Components: findbugs
Reporter: huzheng
Assignee: huzheng


The unit test below throws a NullPointerException:

{code}
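// Two namespace-level permissions built with identical arguments.
TablePermission p1 = new TablePermission(TEST_NAMESPACE, TablePermission.Action.READ);
TablePermission p2 = new TablePermission(TEST_NAMESPACE, TablePermission.Action.READ);
assertEquals(p1, p2); // equals() throws the NPE here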
{code}

This bug was introduced when the constructor for namespaces was added. 
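
One common way to avoid this kind of NPE is to compare nullable fields with java.util.Objects.equals. The sketch below is not the TablePermission source: PermissionSketch and its fields are hypothetical, and it assumes (the issue does not say so) that the NPE comes from dereferencing a table field that is null for namespace-level permissions.

```java
import java.util.Objects;

// Hypothetical stand-in for a permission that is either table-level or
// namespace-level; it only illustrates a null-safe equals()/hashCode().
final class PermissionSketch {
    private final String table;      // null for a namespace-level permission
    private final String namespace;  // null for a table-level permission
    private final String action;

    private PermissionSketch(String table, String namespace, String action) {
        this.table = table;
        this.namespace = namespace;
        this.action = action;
    }

    static PermissionSketch forTable(String table, String action) {
        return new PermissionSketch(table, null, action);
    }

    static PermissionSketch forNamespace(String namespace, String action) {
        return new PermissionSketch(null, namespace, action);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof PermissionSketch)) {
            return false;
        }
        PermissionSketch other = (PermissionSketch) obj;
        // Objects.equals treats two nulls as equal and never dereferences null.
        return Objects.equals(table, other.table)
            && Objects.equals(namespace, other.namespace)
            && Objects.equals(action, other.action);
    }

    @Override
    public int hashCode() {
        return Objects.hash(table, namespace, action);
    }

    public static void main(String[] args) {
        PermissionSketch p1 = forNamespace("ns", "READ");
        PermissionSketch p2 = forNamespace("ns", "READ");
        System.out.println(p1.equals(p2));
    }
}
```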





[jira] [Created] (HBASE-17449) Add explicit document on different timeout settings

2017-01-11 Thread Yu Li (JIRA)
Yu Li created HBASE-17449:
-

 Summary: Add explicit document on different timeout settings
 Key: HBASE-17449
 URL: https://issues.apache.org/jira/browse/HBASE-17449
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Yu Li


Currently we have several timeout settings, mainly:
* hbase.rpc.timeout
* hbase.client.operation.timeout
* hbase.client.scanner.timeout.period

And in the latest branch-1 and master code, we will have two more properties:
* hbase.rpc.read.timeout
* hbase.rpc.write.timeout

However, the current refguide has no explicit guidance on how these timeout 
settings differ (there are explanations for each property, but no instruction 
on when to use which).

In my understanding, at the RPC layer, that is, for each RPC call:
* Scan (openScanner/next): controlled by hbase.client.scanner.timeout.period
* Other operations:
   1. For released versions: controlled by hbase.rpc.timeout
   2. For 1.4+ versions: read operations are controlled by 
      hbase.rpc.read.timeout and write operations by hbase.rpc.write.timeout, 
      falling back to hbase.rpc.timeout if those two are not set.

And hbase.client.operation.timeout is a higher-level control that includes 
retries, that is, the overall limit for one user call.

After this JIRA, I hope that when users ask questions like "What settings 
should I use if I don't want to wait more than 1 second for a single 
put/get/scan.next call", we can give a neat answer.
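
The per-call rules above (for 1.4+) could be written down roughly like this. It is a sketch of the reporter's understanding as stated in this issue, not a reading of the HBase client code: TimeoutSketch, Op, and ASSUMED_DEFAULT_MS are hypothetical names, and the 60000 ms fallback is an assumption that should be checked against the defaults of the HBase version in use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that resolves which property bounds a single RPC call,
// following the rules listed in this issue (1.4+ behaviour).
final class TimeoutSketch {
    enum Op { SCAN, READ, WRITE }

    // Assumed fallback when a property is not set; verify for your version.
    static final int ASSUMED_DEFAULT_MS = 60000;

    static int rpcTimeoutMs(Map<String, String> conf, Op op) {
        int rpc = getInt(conf, "hbase.rpc.timeout", ASSUMED_DEFAULT_MS);
        switch (op) {
            case SCAN:
                return getInt(conf, "hbase.client.scanner.timeout.period",
                    ASSUMED_DEFAULT_MS);
            case READ:
                // Falls back to hbase.rpc.timeout when the read timeout is unset.
                return getInt(conf, "hbase.rpc.read.timeout", rpc);
            case WRITE:
                // Falls back to hbase.rpc.timeout when the write timeout is unset.
                return getInt(conf, "hbase.rpc.write.timeout", rpc);
            default:
                throw new IllegalArgumentException("unknown op: " + op);
        }
    }

    static int getInt(Map<String, String> conf, String key, int fallback) {
        String value = conf.get(key);
        return value == null ? fallback : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.rpc.timeout", "1000");
        conf.put("hbase.rpc.write.timeout", "500");
        System.out.println(rpcTimeoutMs(conf, Op.READ));
        System.out.println(rpcTimeoutMs(conf, Op.WRITE));
    }
}
```

Note that hbase.client.operation.timeout is deliberately left out: per the description it bounds the whole user call including retries, not a single RPC.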





[jira] [Reopened] (HBASE-14061) Support CF-level Storage Policy

2017-01-11 Thread Yu Li (JIRA)

 [ https://issues.apache.org/jira/browse/HBASE-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li reopened HBASE-14061:
---

> Support CF-level Storage Policy
> ---
>
> Key: HBASE-14061
> URL: https://issues.apache.org/jira/browse/HBASE-14061
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, regionserver
> Environment: hadoop-2.6.0
>Reporter: Victor Xu
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14061-master-v1.patch, HBASE-14061.addendum.patch, 
> HBASE-14061.addendum.patch, HBASE-14061.v2.patch, HBASE-14061.v3.patch, 
> HBASE-14061.v4.patch
>
>
> After reading [HBASE-12848|https://issues.apache.org/jira/browse/HBASE-12848] 
> and [HBASE-12934|https://issues.apache.org/jira/browse/HBASE-12934], I wrote 
> a patch to implement cf-level storage policy. 
> My main purpose is to improve random-read performance for some really hot 
> data, which usually locates in certain column family of a big table.
> Usage:
> $ hbase shell
> > alter 'TABLE_NAME', METADATA => {'hbase.hstore.block.storage.policy' => 'POLICY_NAME'}
> > alter 'TABLE_NAME', {NAME=>'CF_NAME', METADATA => {'hbase.hstore.block.storage.policy' => 'POLICY_NAME'}}
> HDFS's setStoragePolicy can only take effect when new hfile is created in a 
> configured directory, so I had to make sub directories(for each cf) in 
> region's .tmp directory and set storage policy for them.
> Besides, I had to upgrade hadoop version to 2.6.0 because 
> dfs.getStoragePolicy cannot be easily written in reflection, and I needed 
> this api to finish my unit test.


