Re: [DISCUSS] Planning changes on RegionServer totalRequestCount metrics

2017-08-07 Thread Anoop John
I see..  Good...  Ya +1

On Tue, Aug 8, 2017 at 8:46 AM, Yu Li  wrote:
> Thanks for chiming in @Anoop. Jerry raised the same question in JIRA and
> the patch is already updated there; will rename the metric to
> "totalRowActionRequestCount". Will add a release note to make it clear for
> users what the final changes are.
>
> Best Regards,
> Yu
>
> On 7 August 2017 at 15:19, Anoop John  wrote:
>
>> Sorry for being late here, Yu Li.
>> Regarding counting the rows (for the new metric) in multi: there
>> might be 2 Actions in a multi request for the same row. This is possible
>> sometimes. I don't think we should check that and try to make it
>> perfect. That will have a perf penalty also. So just saying that we
>> will have some possible inconsistency even after. Maybe we can say
>> how many actions in multi, not rows affected! Any better name?
>>
>> On Mon, Aug 7, 2017 at 8:07 AM, Yu Li  wrote:
>> > Thanks for chiming in @stack and @Jerry, will try to add a good release
>> > note when the work is done.
>> >
>> > Since already more than 72 hours passed and no objections, I'd like to
>> call
>> > this discussion closed and apply the change in HBASE-18469. Thanks.
>> >
>> > Best Regards,
>> > Yu
>> >
>> > On 4 August 2017 at 13:59, stack  wrote:
>> >
>> >> +1
>> >>
>> >> We need a fat release note on this change so operators can quickly learn
>> >> why traffic went down on upgrade.
>> >>
>> >> S
>> >>
>> >> On Aug 3, 2017 14:49, "Yu Li"  wrote:
>> >>
>> >> > Dear all,
>> >> >
>> >> > Recently in HBASE-18469 (https://issues.apache.org/jira/browse/HBASE-18469)
>> >> > we found some inconsistency on regionserver request related metrics,
>> >> > including:
>> >> > 1. totalRequestCount could be less than readRequestCount+
>> >> writeRequestCount
>> >> > 2. For multi request, we count action count into totalRequestCount,
>> while
>> >> > for scan with caching we count only one.
>> >> >
>> >> > To fix the inconsistency, we plan to make below changes:
>> >> > 1. Make totalRequestCount only count rpc requests, thus a multi request
>> >> will
>> >> > only count as one for totalRequestCount
>> >> > 2. Introduce a new metric named "totalRowsRequestCount", which
>> will
>> >> > count the DML workloads on RS by row-level action, and for this
>> metrics
>> >> we
>> >> > will count how many rows included for multi and scan-with-caching
>> >> request.
>> >> >
>> >> > After the change, there won't be any compatibility issue -- existing
>> >> > monitoring system could still work -- only that totalRequestCount
>> will be
>> >> > less than previous. And it's recommended to use totalRowsRequestCount
>> to
>> >> > check the RS DML workload.
>> >> >
>> >> > Please kindly let us know if you have any different idea or suggestion
>> >> > (operators' opinion is especially welcomed).
>> >> >
>> >> > Let's make this discussion open for 72 hours and will make the change
>> if
>> >> no
>> >> > objections.
>> >> >
>> >> > Thanks!
>> >> >
>> >> > Best Regards,
>> >> > Yu
>> >> >
>> >>
>>
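For readers skimming the thread, here is a minimal sketch of the counting scheme agreed on above, assuming the renamed metric "totalRowActionRequestCount"; the class and method names below are illustrative only and are not the actual HBASE-18469 patch (real HBase metrics go through the regionserver metrics sources).

{code}
// Illustrative sketch only, not the HBASE-18469 patch. Names are hypothetical.
import java.util.concurrent.atomic.LongAdder;

public class RequestCountSketch {
  private final LongAdder totalRequestCount = new LongAdder();          // one per RPC call
  private final LongAdder totalRowActionRequestCount = new LongAdder(); // row-level actions

  /** Every RPC (get, multi, scan next, ...) counts exactly once here. */
  public void onRpc() {
    totalRequestCount.increment();
  }

  /** A multi request adds its action count to the row-level metric. */
  public void onMulti(int actionCount) {
    totalRowActionRequestCount.add(actionCount);
  }

  /** A scan-with-caching adds the number of rows returned in the batch. */
  public void onScanNext(int rowsReturned) {
    totalRowActionRequestCount.add(rowsReturned);
  }
}
{code}

With this split, totalRequestCount moves only once per RPC, which is why traffic graphs drop after the upgrade, while the new row-action metric keeps reflecting the DML workload.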


Re: Notes from dev meetup in Shenzhen, August 5th, 2017

2017-08-07 Thread Yu Li
Thanks for the great write up sir. You're really multi-threading: cannot
imagine how to write almost all details down while fully involved in the
discussion, please teach us!

Maybe we could open several umbrellas in JIRA to follow up the
implementable topics and do some prioritizing? Thanks.

Best Regards,
Yu

On 8 August 2017 at 10:54, ashish singhi  wrote:

> Great write up, Stack. Covering everything we all discussed.
> It was very nice meeting you all and hope we can continue this HBaseCon
> Asia.
>
> Regards,
> Ashish
>
> From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
> Sent: 08 August 2017 00:07
> To: HBase Dev List 
> Subject: Notes from dev meetup in Shenzhen, August 5th, 2017
>
> At fancy Huawei headquarters, 10:00-12:00AM or so (with nice coffee and
> fancy little cake squares provided about half way through the session).
>
> For list of attendees, see picture at end of this email.
>
> Discussion was mostly in Chinese with about 25% in English plus some
> gracious sideline translation so the below is patchy. Hopefully you get the
> gist.
>
> For client-side scanner going against hfiles directly; is there a means of
> being able to pass the permissions from hbase to hdfs?
>
> Issues w/ the hbase 99th percentile were brought up. "DynamoDB can do
> 10ms". How to do better?
>
> SSD is not enough.
>
> GC messes us up.
>
> Will the Distributed Log Replay come back to help improve MTTR? We could
> redo on new ProcedureV2 basis. ZK timeout is the biggest issue. Do as we
> used to and just rely on the regionserver heartbeating...
>
> Read replica helps w/ MTTR.
>
> Ratis incubator project to do a quorum based hbase?
>
> Digression on licensing issues around fb wangle and folly.
>
> Redo of hbase but quorum based would be another project altogether.
>
> Decided to go around the table to talk about concerns and what people are
> working on.
>
> Jieshan wondered what could be done to improve OLAP over hbase.
>
> Client side scanner was brought up again as means of skipping RS overhead
> and doing better OLAP.
>
> Have HBase compact to parquet files. Query parquet and hbase.
>
> At Huawei, they are using 1.0 hbase. Most problems are assignment. They
> have .5M regions. RIT is a killer. Double assignment issues. And RIT. They
> run their own services. Suggested they upgrade to get fixes at least. Then
> 2.0.
>
> Will HBase federate like HDFS? Can Master handle load at large scale? It
> needs to do federation too?
>
> Anyone using Bulk loaded replication? (Yes, it just works so no one talks
> about it...)
>
> Request that fixes be backported to all active branches, not just most
> current.
>
> Andrew was good at backporting... not all RMs are.
>
> Too many branches. What should we do?
>
> Proliferation of branches makes for too much work.
>
> Need to cleanup bugs in 1.3. Make it stable release now.
>
> Lets do more active EOL'ing of branches. 1.1?.
>
> Hubert asked if we can have clusters where RS are differently capable?
> i.e. several generations of HW all running in the same cluster.
>
> What if fat server goes down.
>
> Balancer could take care of it all. RS Capacity. Balancer can take it into
> account.
> Regionserver labels like YARN labels. Characteristics.
>
> Or run it all in docker when heterogeneous cluster. The K8 talk from day
> before was mentioned; we should all look at being able to deploy in k8 and
> docker.
>
> Lets put out kubernetes blog...(Doing).
>
> Alibaba looking at HBase as native YARN app.
>
> i/o is hard even when containers.
>
> Use autoscaler of K8 when heavy user.
>
> Limit i/o use w/ CP. Throttle.
>
> Spark and client-side scanner came up again.
>
> Snapshot input format in spark.
>
> HBase federation came up again. jd.com talking of 3k to 4k
> nodes in a cluster. Millions of regions. Region assignment is messing them
> up.
>
> Maybe federation is good idea? Argument that it is too much operational
> complexity. Can we fix master load w/ splittable meta, etc?
>
> Was brought up that even w/ 100s of RS there is scale issue, nvm thousands.
>
> Alibaba talked about disaster recovery. Described issue where HDFS has
> fencing problem during an upgrade. There was no active NN. All RS went down.
> ZK is another POF. If ZK is not available. Operators were being asked how
> much longer the cluster was going to be down but they could not answer the
> question. No indicators from HBase on how much longer it will be down or
> how many WALs it's processed and how many more to go. Operator unable to
> tell his org how long it would be before it all came back on line. Should
> say how many regions are online and how many more to do.
>
> Alibaba use SQL to lower cost. HBase API is low-level. Row-key
> construction is tricky. New users make common mistakes. If you don't do
> schema right, high-performance is difficult.
>
> Alibaba are using a subset of Phoenix... simple sql only; throws
> exceptions if 

Re: [DISCUSS] Planning changes on RegionServer totalRequestCount metrics

2017-08-07 Thread Yu Li
Thanks for chiming in @Anoop. Jerry raised the same question in JIRA and
the patch is already updated there; will rename the metric to
"totalRowActionRequestCount". Will add a release note to make it clear for
users what the final changes are.

Best Regards,
Yu

On 7 August 2017 at 15:19, Anoop John  wrote:

> Sorry for being late here, Yu Li.
> Regarding counting the rows (for the new metric) in multi: there
> might be 2 Actions in a multi request for the same row. This is possible
> sometimes. I don't think we should check that and try to make it
> perfect. That will have a perf penalty also. So just saying that we
> will have some possible inconsistency even after. Maybe we can say
> how many actions in multi, not rows affected! Any better name?
>
> On Mon, Aug 7, 2017 at 8:07 AM, Yu Li  wrote:
> > Thanks for chiming in @stack and @Jerry, will try to add a good release
> > note when the work is done.
> >
> > Since already more than 72 hours passed and no objections, I'd like to
> call
> > this discussion closed and apply the change in HBASE-18469. Thanks.
> >
> > Best Regards,
> > Yu
> >
> > On 4 August 2017 at 13:59, stack  wrote:
> >
> >> +1
> >>
> >> We need a fat release note on this change so operators can quickly learn
> >> why traffic went down on upgrade.
> >>
> >> S
> >>
> >> On Aug 3, 2017 14:49, "Yu Li"  wrote:
> >>
> >> > Dear all,
> >> >
> >> > Recently in HBASE-18469 (https://issues.apache.org/jira/browse/HBASE-18469)
> >> > we found some inconsistency on regionserver request related metrics,
> >> > including:
> >> > 1. totalRequestCount could be less than readRequestCount+
> >> writeRequestCount
> >> > 2. For multi request, we count action count into totalRequestCount,
> while
> >> > for scan with caching we count only one.
> >> >
> >> > To fix the inconsistency, we plan to make below changes:
> >> > 1. Make totalRequestCount only count rpc requests, thus a multi request
> >> will
> >> > only count as one for totalRequestCount
> >> > 2. Introduce a new metric named "totalRowsRequestCount", which
> will
> >> > count the DML workloads on RS by row-level action, and for this
> metrics
> >> we
> >> > will count how many rows included for multi and scan-with-caching
> >> request.
> >> >
> >> > After the change, there won't be any compatibility issue -- existing
> >> > monitoring system could still work -- only that totalRequestCount
> will be
> >> > less than previous. And it's recommended to use totalRowsRequestCount
> to
> >> > check the RS DML workload.
> >> >
> >> > Please kindly let us know if you have any different idea or suggestion
> >> > (operators' opinion is especially welcomed).
> >> >
> >> > Let's make this discussion open for 72 hours and will make the change
> if
> >> no
> >> > objections.
> >> >
> >> > Thanks!
> >> >
> >> > Best Regards,
> >> > Yu
> >> >
> >>
>


[jira] [Resolved] (HBASE-18266) Eliminate the warnings from the spotbugs

2017-08-07 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved HBASE-18266.

Resolution: Fixed

All sub-tasks are resolved.

> Eliminate the warnings from the spotbugs
> 
>
> Key: HBASE-18266
> URL: https://issues.apache.org/jira/browse/HBASE-18266
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 3.0.0, 1.4.0, 1.3.2, 1.2.7, 2.0.0-alpha-2
>
>
> It is hard to get +1 from QA currently because spotbugs is always unhappy...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


RE: Notes from dev meetup in Shenzhen, August 5th, 2017

2017-08-07 Thread ashish singhi
Great write up, Stack. Covering everything we all discussed.
It was very nice meeting you all and hope we can continue this HBaseCon Asia.

Regards,
Ashish

From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
Sent: 08 August 2017 00:07
To: HBase Dev List 
Subject: Notes from dev meetup in Shenzhen, August 5th, 2017

At fancy Huawei headquarters, 10:00-12:00AM or so (with nice coffee and fancy 
little cake squares provided about half way through the session).

For list of attendees, see picture at end of this email.

Discussion was mostly in Chinese with about 25% in English plus some gracious 
sideline translation so the below is patchy. Hopefully you get the gist.

For client-side scanner going against hfiles directly; is there a means of 
being able to pass the permissions from hbase to hdfs?

Issues w/ the hbase 99th percentile were brought up. "DynamoDB can do 10ms". 
How to do better?

SSD is not enough.

GC messes us up.

Will the Distributed Log Replay come back to help improve MTTR? We could redo 
on new ProcedureV2 basis. ZK timeout is the biggest issue. Do as we used to and 
just rely on the regionserver heartbeating...

Read replica helps w/ MTTR.

Ratis incubator project to do a quorum based hbase?

Digression on licensing issues around fb wangle and folly.

Redo of hbase but quorum based would be another project altogether.

Decided to go around the table to talk about concerns and what people are 
working on.

Jieshan wondered what could be done to improve OLAP over hbase.

Client side scanner was brought up again as means of skipping RS overhead and 
doing better OLAP.

Have HBase compact to parquet files. Query parquet and hbase.

At Huawei, they are using 1.0 hbase. Most problems are assignment. They have 
.5M regions. RIT is a killer. Double assignment issues. And RIT. They run their 
own services. Suggested they upgrade to get fixes at least. Then 2.0.

Will HBase federate like HDFS? Can Master handle load at large scale? It needs 
to do federation too?

Anyone using Bulk loaded replication? (Yes, it just works so no one talks about 
it...)

Request that fixes be backported to all active branches, not just most current.

Andrew was good at backporting... not all RMs are.

Too many branches. What should we do?

Proliferation of branches makes for too much work.

Need to cleanup bugs in 1.3. Make it stable release now.

Lets do more active EOL'ing of branches. 1.1?.

Hubert asked if we can have clusters where RS are differently capable? i.e. 
several generations of HW all running in the same cluster.

What if fat server goes down.

Balancer could take care of it all. RS Capacity. Balancer can take it into account.
Regionserver labels like YARN labels. Characteristics.

Or run it all in docker when heterogeneous cluster. The K8 talk from day before 
was mentioned; we should all look at being able to deploy in k8 and docker.

Lets put out kubernetes blog...(Doing).

Alibaba looking at HBase as native YARN app.

i/o is hard even when containers.

Use autoscaler of K8 when heavy user.

Limit i/o use w/ CP. Throttle.

Spark and client-side scanner came up again.

Snapshot input format in spark.

HBase federation came up again. jd.com talking of 3k to 4k nodes 
in a cluster. Millions of regions. Region assignment is messing them up.

Maybe federation is good idea? Argument that it is too much operational 
complexity. Can we fix master load w/ splittable meta, etc?

Was brought up that even w/ 100s of RS there is scale issue, nvm thousands.

Alibaba talked about disaster recovery. Described issue where HDFS has fencing 
problem during an upgrade. There was no active NN. All RS went down.
ZK is another POF. If ZK is not available. Operators were being asked how much 
longer the cluster was going to be down but they could not answer the question. 
No indicators from HBase on how much longer it will be down or how many WALs 
it's processed and how many more to go. Operator unable to tell his org how long 
it would be before it all came back on line. Should say how many regions are 
online and how many more to do.

Alibaba use SQL to lower cost. HBase API is low-level. Row-key construction is 
tricky. New users make common mistakes. If you don't do schema right, 
high-performance is difficult.

Alibaba are using a subset of Phoenix... simple sql only; throws exceptions if 
user tries to do joins, etc.., anything but basic ops.

HareQL is using hive for meta store.  Don't have data typing in hbase.

HareQL could perhaps contribute some piece... or a module in hbase to sql... 
From phoenix?

Secondary index.

Client is complicated in phoenix. Was suggested thin client just does parse... 
and then offload to server for optimization and execution.

Then secondary index. Need transaction engine. Consistency of secondary index.

We adjourned.

Your dodgy secretary,
St.Ack
P.S. Please add to this base set of notes if I missed anything.





[jira] [Resolved] (HBASE-18078) [C++] Harden RPC by handling various communication abnormalities

2017-08-07 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar resolved HBASE-18078.
---
   Resolution: Fixed
Fix Version/s: HBASE-14850

I have committed the v8 version which already addresses the review comments. 
HBASE-18204 builds on top of this. Thanks [~xiaobingo] for the patch. 


> [C++] Harden RPC by handling various communication abnormalities
> 
>
> Key: HBASE-18078
> URL: https://issues.apache.org/jira/browse/HBASE-18078
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Fix For: HBASE-14850
>
> Attachments: HBASE-18078.000.patch, HBASE-18078.001.patch, 
> HBASE-18078.002.patch, HBASE-18078.003.patch, HBASE-18078.004.patch, 
> HBASE-18078.005.patch, HBASE-18078.006.patch, HBASE-18078.007.patch, 
> HBASE-18078.008.patch
>
>
> RPC layer should handle various communication abnormalities (e.g. connection 
> timeout, server aborted connection, and so on). Ideally, the corresponding 
> exceptions should be raised and propagated through the handlers of the
> pipeline in the client.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18537) [C++] Improvements to load-client

2017-08-07 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-18537:
-

 Summary: [C++] Improvements to load-client
 Key: HBASE-18537
 URL: https://issues.apache.org/jira/browse/HBASE-18537
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: HBASE-14850


A couple of improvements to the load-client after spending some time with 
testing:
 - better log messages
 - support for progress
 - minor bug fixes




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: hbaseconasia2017 slides and a few photos

2017-08-07 Thread Anoop John
Thanks for the write up Stack.

Thanks to Huawei and all sponsors, and especially to Jieshan. A few months
back, such a conference was just a dream, and it is mainly because of
him that this could happen. Hope we can continue this HBaseCon Asia.

-Anoop-

On Tue, Aug 8, 2017 at 7:02 AM, Bijieshan  wrote:
> Thanks for uploading all the slides and the good write-up, stack. Hope more
> people will join us next year :)
>
> Jieshan.
> -Original Message-
> From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
> Sent: 7 August 2017 23:12
> To: HBase Dev List ; Hbase-User 
> Subject: hbaseconasia2017 slides and a few photos
>
> We had a nice day out in Shenzhen at HBaseCon Asia last friday [0] (August 
> 4th). There were a bunch of great talks [1] by the likes of jd.com, huawei, 
> xiaomi, alibaba and others. Those given in Chinese had best attendance. A 
> good few folks graciously had slides in English for the language-dumb (like
> myself) while their talk was in their native language.
>
> A couple of hbase-as-a-service in the cloud are coming down the pipe, a few 
> presenters talked about being at current scale limits, China Life keeps all 
> data as JSON in hbase instances, and there was an interesting talk on upping 
> utilization by deploying hbase with kubernetes (>1 container per node).
>
> Best quote: "HBase is just a kid--it is only ten years old" from our Yu Li 
> talking about interesting write speedups coming from Alibaba arguing there 
> are many more speedups to be had.
>
> I attached a few pictures. I'll put up more after I fixup the hbasecon home 
> page and redirect later.
>
> The day after, the 5th, there was a dev meetup at the Huawei office; notes to 
> follow.
>
> Thanks again to the program committee, the sponsors, and to our gracious host 
> Huawei. Jieshan in particular did an amazing job running the show taking care 
> of speakers.
>
> St.Ack
> 0.  Click on 'view details' on the right hand side of this page for agenda 
> (eventbrite won't show the event page at the end of the link
> post-the-event): https://www.eventbrite.com/e/hbasecon-asia-2017-
> tickets-34935546159#
> 1. Slides up on slideshare: https://www.slideshare.net/ 
> search/slideshow?searchfrom=header=hbaseconasia2017=
> any=all=en=
> [Attachment: _YP99027.JPG]
>
> The program committee taking questions at the end of the day
>
> [Attachment: _YP99181.JPG]
>
> This one is of all the speakers
>
> [Attachment: _YP99193.JPG]


RE: hbaseconasia2017 slides and a few photos

2017-08-07 Thread Bijieshan
Thanks for uploading all the slides and the good write-up, stack. Hope more
people will join us next year :)

Jieshan.
-Original Message-
From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
Sent: 7 August 2017 23:12
To: HBase Dev List ; Hbase-User 
Subject: hbaseconasia2017 slides and a few photos

We had a nice day out in Shenzhen at HBaseCon Asia last friday [0] (August 
4th). There were a bunch of great talks [1] by the likes of jd.com, huawei, 
xiaomi, alibaba and others. Those given in Chinese had best attendance. A good 
few folks graciously had slides in English for the language-dumb (like
myself) while their talk was in their native language.

A couple of hbase-as-a-service in the cloud are coming down the pipe, a few 
presenters talked about being at current scale limits, China Life keeps all 
data as JSON in hbase instances, and there was an interesting talk on upping 
utilization by deploying hbase with kubernetes (>1 container per node).

Best quote: "HBase is just a kid--it is only ten years old" from our Yu Li 
talking about interesting write speedups coming from Alibaba arguing there are 
many more speedups to be had.

I attached a few pictures. I'll put up more after I fixup the hbasecon home 
page and redirect later.

The day after, the 5th, there was a dev meetup at the Huawei office; notes to 
follow.

Thanks again to the program committee, the sponsors, and to our gracious host 
Huawei. Jieshan in particular did an amazing job running the show taking care 
of speakers.

St.Ack
0.  Click on 'view details' on the right hand side of this page for agenda 
(eventbrite won't show the event page at the end of the link
post-the-event): https://www.eventbrite.com/e/hbasecon-asia-2017-
tickets-34935546159#
1. Slides up on slideshare: https://www.slideshare.net/ 
search/slideshow?searchfrom=header=hbaseconasia2017=
any=all=en=

[Attachment: _YP99027.JPG]

The program committee taking questions at the end of the day

[Attachment: _YP99181.JPG]

This one is of all the speakers

[Attachment: _YP99193.JPG]

[jira] [Created] (HBASE-18536) [C++]

2017-08-07 Thread Xiaobing Zhou (JIRA)
Xiaobing Zhou created HBASE-18536:
-

 Summary: [C++] 
 Key: HBASE-18536
 URL: https://issues.apache.org/jira/browse/HBASE-18536
 Project: HBase
  Issue Type: Sub-task
Reporter: Xiaobing Zhou






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18535) [C++] make RPC test mode transparent to initialization of RpcPipeline

2017-08-07 Thread Xiaobing Zhou (JIRA)
Xiaobing Zhou created HBASE-18535:
-

 Summary: [C++] make RPC test mode transparent to initialization of 
RpcPipeline
 Key: HBASE-18535
 URL: https://issues.apache.org/jira/browse/HBASE-18535
 Project: HBase
  Issue Type: Sub-task
Reporter: Xiaobing Zhou






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18534) [C++] Support timeout in Rpc

2017-08-07 Thread Xiaobing Zhou (JIRA)
Xiaobing Zhou created HBASE-18534:
-

 Summary: [C++] Support timeout in Rpc
 Key: HBASE-18534
 URL: https://issues.apache.org/jira/browse/HBASE-18534
 Project: HBase
  Issue Type: Sub-task
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18533) Expose BucketCache values to be configured

2017-08-07 Thread Zach York (JIRA)
Zach York created HBASE-18533:
-

 Summary: Expose BucketCache values to be configured
 Key: HBASE-18533
 URL: https://issues.apache.org/jira/browse/HBASE-18533
 Project: HBase
  Issue Type: Improvement
  Components: BucketCache
Reporter: Zach York
Assignee: Zach York


BucketCache always uses the default values for all cache configuration. 
However, this doesn't work for all use cases. In particular, users want to be 
able to configure the percentage of the cache that is single access, multi 
access, and in-memory access.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Notes from dev meetup in Shenzhen, August 5th, 2017

2017-08-07 Thread Jerry He
Looks like you folks had a good time there.  I wish I could've made it.
Good write-up too.

Thanks.

Jerry

On Mon, Aug 7, 2017 at 10:38 AM, ramkrishna vasudevan
 wrote:
> Thanks for the write up Stack. I could not make it to Shenzhen. Nice to
> know the conference and meet up went great.
>
> Regards
> Ram
>
> On Mon, Aug 7, 2017 at 9:36 PM, Stack  wrote:
>
>> At fancy Huawei headquarters, 10:00-12:00AM or so (with nice coffee and
>> fancy little cake squares provided about half way through the session).
>>
>> For list of attendees, see picture at end of this email.
>>
>> Discussion was mostly in Chinese with about 25% in English plus some
>> gracious sideline translation so the below is patchy. Hopefully you get the
>> gist.
>>
>> For client-side scanner going against hfiles directly; is there a means of
>> being able to pass the permissions from hbase to hdfs?
>>
>> Issues w/ the hbase 99th percentile were brought up. "DynamoDB can do
>> 10ms". How to do better?
>>
>> SSD is not enough.
>>
>> GC messes us up.
>>
>> Will the Distributed Log Replay come back to help improve MTTR? We could
>> redo on new ProcedureV2 basis. ZK timeout is the biggest issue. Do as we
>> used to and just rely on the regionserver heartbeating...
>>
>> Read replica helps w/ MTTR.
>>
>> Ratis incubator project to do a quorum based hbase?
>>
>> Digression on licensing issues around fb wangle and folly.
>>
>> Redo of hbase but quorum based would be another project altogether.
>>
>> Decided to go around the table to talk about concerns and what people are
>> working on.
>>
>> Jieshan wondered what could be done to improve OLAP over hbase.
>>
>> Client side scanner was brought up again as means of skipping RS overhead
>> and doing better OLAP.
>>
>> Have HBase compact to parquet files. Query parquet and hbase.
>>
>> At Huawei, they are using 1.0 hbase. Most problems are assignment. They
>> have .5M regions. RIT is a killer. Double assignment issues. And RIT. They
>> run their own services. Suggested they upgrade to get fixes at least. Then
>> 2.0.
>>
>> Will HBase federate like HDFS? Can Master handle load at large scale? It
>> needs to do federation too?
>>
>> Anyone using Bulk loaded replication? (Yes, it just works so no one talks
>> about it...)
>>
>> Request that fixes be backported to all active branches, not just most
>> current.
>>
>> Andrew was good at backporting... not all RMs are.
>>
>> Too many branches. What should we do?
>>
>> Proliferation of branches makes for too much work.
>>
>> Need to cleanup bugs in 1.3. Make it stable release now.
>>
>> Lets do more active EOL'ing of branches. 1.1?.
>>
>> Hubert asked if we can have clusters where RS are differently capable?
>> i.e. several generations of HW all running in the same cluster.
>>
>> What if fat server goes down.
>>
>> Balancer could take care of it all. RS Capacity. Balancer can take it into
>> account.
>> Regionserver labels like YARN labels. Characteristics.
>>
>> Or run it all in docker when heterogeneous cluster. The K8 talk from day
>> before was mentioned; we should all look at being able to deploy in k8 and
>> docker.
>>
>> Lets put out kubernetes blog...(Doing).
>>
>> Alibaba looking at HBase as native YARN app.
>>
>> i/o is hard even when containers.
>>
>> Use autoscaler of K8 when heavy user.
>>
>> Limit i/o use w/ CP. Throttle.
>>
>> Spark and client-side scanner came up again.
>>
>> Snapshot input format in spark.
>>
>> HBase federation came up again. jd.com talking of 3k to 4k nodes in a
>> cluster. Millions of regions. Region assignment is messing them up.
>>
>> Maybe federation is good idea? Argument that it is too much operational
>> complexity. Can we fix master load w/ splittable meta, etc?
>>
>> Was brought up that even w/ 100s of RS there is scale issue, nvm thousands.
>>
>> Alibaba talked about disaster recovery. Described issue where HDFS has
>> fencing problem during an upgrade. There was no active NN. All RS went down.
>> ZK is another POF. If ZK is not available. Operators were being asked how
>> much longer the cluster was going to be down but they could not answer the
>> question. No indicators from HBase on how much longer it will be down or
>> how many WALs it's processed and how many more to go. Operator unable to
>> tell his org how long it would be before it all came back on line. Should
>> say how many regions are online and how many more to do.
>>
>> Alibaba use SQL to lower cost. HBase API is low-level. Row-key
>> construction is tricky. New users make common mistakes. If you don't do
>> schema right, high-performance is difficult.
>>
>> Alibaba are using a subset of Phoenix... simple sql only; throws
>> exceptions if user tries to do joins, etc.., anything but basic ops.
>>
>> HareQL is using hive for meta store.  Don't have data typing in hbase.
>>
>> HareQL could perhaps contribute some piece... or a module in hbase to
>> sql... From phoenix?
>>
>> Secondary index.
>>
>> Client is 

[jira] [Created] (HBASE-18532) Improve cache related stats rendered on RS UI

2017-08-07 Thread Biju Nair (JIRA)
Biju Nair created HBASE-18532:
-

 Summary: Improve cache related stats rendered on RS UI
 Key: HBASE-18532
 URL: https://issues.apache.org/jira/browse/HBASE-18532
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 1.1.2
Reporter: Biju Nair


The stats currently rendered for the L1 and L2 caches are incorrect. Refer to the
attached screenshots of stats from a cluster showing the combined cache stats,
L1 stats and L2 stats. For example, the combined stats show 38 GB used for cache,
while if we sum the sizes of the L1 and L2 caches the value is way less. One way we
can improve this is to use the same stats used to populate the combined stats to
render the values of the L1 & L2 caches. Also, for usability, we can remove the table
with details of BucketCache buckets from the L2 cache stats, since this is
going to be long for any installation using L2 cache. This will help in
understanding the cache usage better. Thoughts? If there are no concerns I will
submit a patch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18531) incorrect error msg

2017-08-07 Thread david serafini (JIRA)
david serafini created HBASE-18531:
--

 Summary: incorrect error msg
 Key: HBASE-18531
 URL: https://issues.apache.org/jira/browse/HBASE-18531
 Project: HBase
  Issue Type: Bug
  Components: hbase
Affects Versions: 1.1.2
 Environment: CentOS release 6.8 (Final)
Reporter: david serafini
Priority: Minor


HBase 1.1.2.2.5.3.0-37  Hadoop 2.7.3.2.5.3.0-37

The hbase shell command :

{noformat}
hbase(main):003:0> grant 'username', 'RWXC', '@namespacename'
{noformat}

produces the error msg

{noformat}
ERROR: Unknown namespace username!
{noformat}

when it actually is the namespacename that is incorrect.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Notes from dev meetup in Shenzhen, August 5th, 2017

2017-08-07 Thread ramkrishna vasudevan
Thanks for the write up Stack. I could not make it to Shenzhen. Nice to
know the conference and meet up went great.

Regards
Ram

On Mon, Aug 7, 2017 at 9:36 PM, Stack  wrote:

> At fancy Huawei headquarters, 10:00-12:00AM or so (with nice coffee and
> fancy little cake squares provided about half way through the session).
>
> For list of attendees, see picture at end of this email.
>
> Discussion was mostly in Chinese with about 25% in English plus some
> gracious sideline translation so the below is patchy. Hopefully you get the
> gist.
>
> For client-side scanner going against hfiles directly; is there a means of
> being able to pass the permissions from hbase to hdfs?
>
> Issues w/ the hbase 99th percentile were brought up. "DynamoDB can do
> 10ms". How to do better?
>
> SSD is not enough.
>
> GC messes us up.
>
> Will the Distributed Log Replay come back to help improve MTTR? We could
> redo on new ProcedureV2 basis. ZK timeout is the biggest issue. Do as we
> used to and just rely on the regionserver heartbeating...
>
> Read replica helps w/ MTTR.
>
> Ratis incubator project to do a quorum based hbase?
>
> Digression on licensing issues around fb wangle and folly.
>
> Redo of hbase but quorum based would be another project altogether.
>
> Decided to go around the table to talk about concerns and what people are
> working on.
>
> Jieshan wondered what could be done to improve OLAP over hbase.
>
> Client side scanner was brought up again as means of skipping RS overhead
> and doing better OLAP.
>
> Have HBase compact to parquet files. Query parquet and hbase.
>
> At Huawei, they are using 1.0 hbase. Most problems are assignment. They
> have .5M regions. RIT is a killer. Double assignment issues. And RIT. They
> run their own services. Suggested they upgrade to get fixes at least. Then
> 2.0.
>
> Will HBase federate like HDFS? Can Master handle load at large scale? It
> needs to do federation too?
>
> Anyone using Bulk loaded replication? (Yes, it just works so no one talks
> about it...)
>
> Request that fixes be backported to all active branches, not just most
> current.
>
> Andrew was good at backporting... not all RMs are.
>
> Too many branches. What should we do?
>
> Proliferation of branches makes for too much work.
>
> Need to cleanup bugs in 1.3. Make it stable release now.
>
> Lets do more active EOL'ing of branches. 1.1?.
>
> Hubert asked if we can have clusters where RS are differently capable?
> i.e. several generations of HW all running in the same cluster.
>
> What if fat server goes down.
>
> Balancer could take care of it all. RS Capacity. Balancer can take it into
> account.
> Regionserver labels like YARN labels. Characteristics.
>
> Or run it all in docker when heterogeneous cluster. The K8 talk from day
> before was mentioned; we should all look at being able to deploy in k8 and
> docker.
>
> Lets put out kubernetes blog...(Doing).
>
> Alibaba looking at HBase as native YARN app.
>
> i/o is hard even when containers.
>
> Use autoscaler of K8 when heavy user.
>
> Limit i/o use w/ CP. Throttle.
>
> Spark and client-side scanner came up again.
>
> Snapshot input format in spark.
>
> HBase federation came up again. jd.com talking of 3k to 4k nodes in a
> cluster. Millions of regions. Region assignment is messing them up.
>
> Maybe federation is good idea? Argument that it is too much operational
> complexity. Can we fix master load w/ splittable meta, etc?
>
> Was brought up that even w/ 100s of RS there is scale issue, nvm thousands.
>
> Alibaba talked about disaster recovery. Described issue where HDFS has
> fencing problem during an upgrade. There was no active NN. All RS went down.
> ZK is another POF. If ZK is not available. Operators were being asked how
> much longer the cluster was going to be down but they could not answer the
> question. No indicators from HBase on how much longer it will be down or
> how many WALs it's processed and how many more to go. Operator unable to
> tell his org how long it would be before it all came back on line. Should
> say how many regions are online and how many more to do.
>
> Alibaba use SQL to lower cost. HBase API is low-level. Row-key
> construction is tricky. New users make common mistakes. If you don't do
> schema right, high-performance is difficult.
>
> Alibaba are using a subset of Phoenix... simple sql only; throws
> exceptions if user tries to do joins, etc.., anything but basic ops.
>
> HareQL is using hive for meta store.  Don't have data typing in hbase.
>
> HareQL could perhaps contribute some piece... or a module in hbase to
> sql... From phoenix?
>
> Secondary index.
>
> Client is complicated in phoenix. Was suggested thin client just does
> parse... and then offload to server for optimization and execution.
>
> Then secondary index. Need transaction engine. Consistency of secondary
> index.
>
> We adjourned.
>
> Your dodgy secretary,
> St.Ack
> P.S. Please add to this base set of notes if I missed anything.
>
>
>
>

[jira] [Created] (HBASE-18530) precommit should check validity of changes to nightly jenkinsfile

2017-08-07 Thread Sean Busbey (JIRA)
Sean Busbey created HBASE-18530:
---

 Summary: precommit should check validity of changes to nightly 
jenkinsfile
 Key: HBASE-18530
 URL: https://issues.apache.org/jira/browse/HBASE-18530
 Project: HBase
  Issue Type: New Feature
  Components: community, test
Reporter: Sean Busbey


It'd be nice if our precommit job could check changes to the nightly 
jenkinsfile. Even if it's just a simple syntax check.

I believe there's a plugin for jenkins that you can curl with a proposed 
Jenkinsfile, but someone will need to chase down specifics.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Notes from dev meetup in Shenzhen, August 5th, 2017

2017-08-07 Thread Stack
At fancy Huawei headquarters, 10:00-12:00AM or so (with nice coffee and
fancy little cake squares provided about half way through the session).

For list of attendees, see picture at end of this email.

Discussion was mostly in Chinese with about 25% in English plus some
gracious sideline translation so the below is patchy. Hopefully you get the
gist.

For client-side scanner going against hfiles directly; is there a means of
being able to pass the permissions from hbase to hdfs?

Issues w/ the hbase 99th percentile were brought up. "DynamoDB can do
10ms". How to do better?

SSD is not enough.

GC messes us up.

Will the Distributed Log Replay come back to help improve MTTR? We could
redo on new ProcedureV2 basis. ZK timeout is the biggest issue. Do as we
used to and just rely on the regionserver heartbeating...

Read replica helps w/ MTTR.

Ratis incubator project to do a quorum based hbase?

Digression on licensing issues around fb wangle and folly.

Redo of hbase but quorum based would be another project altogether.

Decided to go around the table to talk about concerns and what people are
working on.

Jieshan wondered what could be done to improve OLAP over hbase.

Client side scanner was brought up again as means of skipping RS overhead
and doing better OLAP.

Have HBase compact to parquet files. Query parquet and hbase.

At Huawei, they are using 1.0 hbase. Most problems are assignment. They
have .5M regions. RIT is a killer. Double assignment issues. And RIT. They
run their own services. Suggested they upgrade to get fixes at least. Then
2.0.

Will HBase federate like HDFS? Can Master handle load at large scale? It
needs to do federation too?

Anyone using Bulk loaded replication? (Yes, it just works so no one talks
about it...)

Request that fixes be backported to all active branches, not just most
current.

Andrew was good at backporting... not all RMs are.

Too many branches. What should we do?

Proliferation of branches makes for too much work.

Need to cleanup bugs in 1.3. Make it stable release now.

Lets do more active EOL'ing of branches. 1.1?.

Hubert asked if we can have clusters where RS are differently capable? i.e.
several generations of HW all running in the same cluster.

What if fat server goes down.

Balancer could take care of it all. RS Capacity. Balancer can take it into
account.
Regionserver labels like YARN labels. Characteristics.

Or run it all in docker when heterogeneous cluster. The K8 talk from day
before was mentioned; we should all look at being able to deploy in k8 and
docker.

Lets put out kubernetes blog...(Doing).

Alibaba looking at HBase as native YARN app.

i/o is hard even when containers.

Use autoscaler of K8 when heavy user.

Limit i/o use w/ CP. Throttle.

Spark and client-side scanner came up again.

Snapshot input format in spark.

HBase federation came up again. jd.com talking of 3k to 4k nodes in a
cluster. Millions of regions. Region assignment is messing them up.

Maybe federation is good idea? Argument that it is too much operational
complexity. Can we fix master load w/ splittable meta, etc?

Was brought up that even w/ 100s of RS there is scale issue, nvm thousands.

Alibaba talked about disaster recovery. Described issue where HDFS has
fencing problem during an upgrade. There was no active NN. All RS went down.
ZK is another POF. If ZK is not available. Operators were being asked how
much longer the cluster was going to be down but they could not answer the
question. No indicators from HBase on how much longer it will be down or
how many WALs it's processed and how many more to go. Operator unable to
tell his org how long it would be before it all came back on line. Should
say how many regions are online and how many more to do.

Alibaba use SQL to lower cost. HBase API is low-level. Row-key construction
is tricky. New users make common mistakes. If you don't do schema right,
high-performance is difficult.

Alibaba are using a subset of Phoenix... simple sql only; throws exceptions
if user tries to do joins, etc.., anything but basic ops.

HareQL is using hive for meta store.  Don't have data typing in hbase.

HareQL could perhaps contribute some piece... or a module in hbase to
sql... From phoenix?

Secondary index.

Client is complicated in phoenix. Was suggested thin client just does
parse... and then offload to server for optimization and execution.

Then secondary index. Need transaction engine. Consistency of secondary
index.

We adjourned.

Your dodgy secretary,
St.Ack
P.S. Please add to this base set of notes if I missed anything.


hbaseconasia2017 slides and a few photos

2017-08-07 Thread Stack
We had a nice day out in Shenzhen at HBaseCon Asia last friday [0] (August
4th). There were a bunch of great talks [1] by the likes of jd.com, huawei,
xiaomi, alibaba and others. Those given in Chinese had best attendance. A
good few folks graciously had slides in English for the language-dumb (like
myself) while their talk was in their native language.

A couple of hbase-as-a-service in the cloud are coming down the pipe, a few
presenters talked about being at current scale limits, China Life keeps all
data as JSON in hbase instances, and there was an interesting talk on
upping utilization by deploying hbase with kubernetes (>1 container per
node).

Best quote: "HBase is just a kid--it is only ten years old" from our Yu Li
talking about interesting write speedups coming from Alibaba arguing there
are many more speedups to be had.

I attached a few pictures. I'll put up more after I fixup the hbasecon home
page and redirect later.

The day after, the 5th, there was a dev meetup at the Huawei office; notes
to follow.

Thanks again to the program committee, the sponsors, and to our gracious
host Huawei. Jieshan in particular did an amazing job running the show
taking care of speakers.

St.Ack
0.  Click on 'view details' on the right hand side of this page for agenda
(eventbrite won't show the event page at the end of the link
post-the-event): https://www.eventbrite.com/e/hbasecon-asia-2017-
tickets-34935546159#
1. Slides up on slideshare: https://www.slideshare.net/
search/slideshow?searchfrom=header=hbaseconasia2017=
any=all=en=

[Attachment: _YP99027.JPG]

The program committee taking questions at the end of the day

[Attachment: _YP99181.JPG]

This one is of all the speakers

[Attachment: _YP99193.JPG]

Still Failing: HBase Generate Website

2017-08-07 Thread Apache Jenkins Server
Build status: Still Failing

The HBase website has not been updated to incorporate HBase commit 
${HBASE_GIT_SHA}.

See https://builds.apache.org/job/hbase_generate_website/1073/console

[jira] [Created] (HBASE-18529) Do not delete the tmp jars dir when load the coprocessor jar

2017-08-07 Thread Yun Zhao (JIRA)
Yun Zhao created HBASE-18529:


 Summary: Do not delete the tmp jars dir when load the coprocessor 
jar
 Key: HBASE-18529
 URL: https://issues.apache.org/jira/browse/HBASE-18529
 Project: HBase
  Issue Type: Bug
Reporter: Yun Zhao


When multiple regionservers are deployed on a single server using the default
hbase.local.dir, the tmp jars dir will be deleted when one of them is restarted.
Also, when multiple regionservers start at the same time, the jar being copied in the
copyToLocalFile process may be deleted, causing the coprocessor load to fail.

{code}
2017-08-06 20:02:15,326 ERROR [RS_OPEN_REGION--2] 
regionserver.RegionCoprocessorHost: Failed to load coprocessor 
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:226)
at 
org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:629)
at 
org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:467)
at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:365)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1968)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1937)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1913)
at 
org.apache.hadoop.hbase.util.CoprocessorClassLoader.init(CoprocessorClassLoader.java:168)
at 
org.apache.hadoop.hbase.util.CoprocessorClassLoader.getClassLoader(CoprocessorClassLoader.java:250)
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Tags class using wrong length?

2017-08-07 Thread Chia-Ping Tsai
Sorry for the typo. I mean that "bytes[offset + TAG_LENGTH_SIZE]" is correct.

BTW, the name of "upcoming HBase book" you mentioned is "HBase: The Definitive 
Guide"?

On 2017-08-07 14:13, Lars George  wrote: 
> Gotcha, sorry for the noise. I documented it properly in the upcoming HBase
> book. :)
> 
> Sent from my iPhone
> 
> > On 7. Aug 2017, at 07:02, ramkrishna vasudevan 
> >  wrote:
> > 
> > I think the layout of tags is missing now in the javadoc. Maybe it got
> > missed or moved to some other place?
> > I remember we had a layout explaining the tag structure; with it this code
> > is much easier to read.
> > 
> > As Chia-Ping said, |tag length (2 bytes)|tag type (1 byte)|tag| is the
> > layout.
> > So from the KeyValue layout we extract the tag part, which in itself has a
> > tag length to represent the complete set of tags.
> > 
> > From the tags offset and tags length from the KV we extract individual tags
> > in that KV.
> > 
> > For eg
> > See TagUtil#asList
> > 
> > {code}
> > List<Tag> tags = new ArrayList<>();
> > int pos = offset;
> > while (pos < offset + length) {
> >   int tagLen = Bytes.readAsInt(b, pos, TAG_LENGTH_SIZE);
> >   tags.add(new ArrayBackedTag(b, pos, tagLen + TAG_LENGTH_SIZE));
> >   pos += TAG_LENGTH_SIZE + tagLen;
> > }
> > return tags;
> > {code}
> > 
> > Regards
> > Ram
> > 
> > 
> > 
> > 
> >> On Mon, Aug 7, 2017 at 3:25 AM, Ted Yu  wrote:
> >> 
> >> The byte following the tag length (a short) is the tag type.
> >> 
> >> The current code is correct.
> >> 
> >> On Sun, Aug 6, 2017 at 5:40 AM, Chia-Ping Tsai 
> >> wrote:
> >> 
> >>> According to the following code:
> >>>  public ArrayBackedTag(byte tagType, byte[] tag) {
> >>>    int tagLength = tag.length + TYPE_LENGTH_SIZE;
> >>>    if (tagLength > MAX_TAG_LENGTH) {
> >>>      throw new IllegalArgumentException(
> >>>          "Invalid tag data being passed. Its length can not exceed " + MAX_TAG_LENGTH);
> >>>    }
> >>>    length = TAG_LENGTH_SIZE + tagLength;
> >>>    bytes = new byte[length];
> >>>    int pos = Bytes.putAsShort(bytes, 0, tagLength);
> >>>    pos = Bytes.putByte(bytes, pos, tagType);
> >>>    Bytes.putBytes(bytes, pos, tag, 0, tag.length);
> >>>    this.type = tagType;
> >>>  }
> >>> The layout of the byte array should be:
> >>> |tag length (2 bytes)|tag type (1 byte)|tag|
> >>> 
> >>> It seems to me that the "bytes[offset + TYPE_LENGTH_SIZE]" is correct.
> >>> 
>  On 2017-08-06 16:35, Lars George  wrote:
>  Hi,
>  
>  I found this reading through tags in 1.3, but checked in trunk as
>  well. There is this code:
>  
>   public ArrayBackedTag(byte[] bytes, int offset, int length) {
>     if (length > MAX_TAG_LENGTH) {
>       throw new IllegalArgumentException(
>           "Invalid tag data being passed. Its length can not exceed " + MAX_TAG_LENGTH);
>     }
>     this.bytes = bytes;
>     this.offset = offset;
>     this.length = length;
>     this.type = bytes[offset + TAG_LENGTH_SIZE];
>   }
>  
>  I am concerned about the last line of the code, using the wrong
> >> constant?
>  
>   public final static int TYPE_LENGTH_SIZE = Bytes.SIZEOF_BYTE;
>   public final static int TAG_LENGTH_SIZE = Bytes.SIZEOF_SHORT;
>  
>  Should this not read
>  
> this.type = bytes[offset + TYPE_LENGTH_SIZE];
>  
>  Would this not read the type from the wrong place in the array?
>  
>  Cheers,
>  Lars
>  
> >>> 
> >> 
> 
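To make the conclusion concrete, here is a small standalone sketch of the layout discussed in this thread. The constants are copied from the quoted code, but the class itself is illustrative and is not ArrayBackedTag: the type byte sits immediately after the two-byte length, i.e. at offset + TAG_LENGTH_SIZE, which is exactly what the existing constructor reads.

{code}
// Standalone illustration of the tag byte layout from the thread:
// |tag length (2 bytes)|tag type (1 byte)|tag payload|
// Not the real ArrayBackedTag class, just the offsets it relies on.
public class TagLayoutSketch {
  static final int TYPE_LENGTH_SIZE = 1; // Bytes.SIZEOF_BYTE
  static final int TAG_LENGTH_SIZE = 2;  // Bytes.SIZEOF_SHORT

  /** Serialize one tag: 2-byte length (type + payload), 1-byte type, then the payload. */
  static byte[] encode(byte tagType, byte[] payload) {
    int tagLength = payload.length + TYPE_LENGTH_SIZE;  // the value stored in the length field
    byte[] out = new byte[TAG_LENGTH_SIZE + tagLength];
    out[0] = (byte) (tagLength >>> 8);                   // big-endian short, as Bytes.putAsShort writes
    out[1] = (byte) tagLength;
    out[TAG_LENGTH_SIZE] = tagType;                      // type byte directly after the length
    System.arraycopy(payload, 0, out, TAG_LENGTH_SIZE + TYPE_LENGTH_SIZE, payload.length);
    return out;
  }

  /** Mirrors the constructor in the thread: the type is read at offset + TAG_LENGTH_SIZE. */
  static byte readType(byte[] bytes, int offset) {
    return bytes[offset + TAG_LENGTH_SIZE];
  }
}
{code}

With this layout, readType(encode(t, p), 0) returns t, matching Chia-Ping's correction that bytes[offset + TAG_LENGTH_SIZE] is the right read.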


Re: [DISCUSS] Should flush decisions be made based on data size (key-value only) or based on heap size (including metadata overhead)?

2017-08-07 Thread Anoop John
Sorry for being late to reply.

So you mean we should track both sizes even at Region level? This was
considered at that time but we did not do it, as that would add more overhead.
We would have to deal with 2 AtomicLongs in every Region. Right now we
handle this double check at RS level only, so that added just one more
variable to deal with.

-Anoop-

On Mon, Jul 10, 2017 at 7:34 PM, Eshcar Hillel
 wrote:
> Here is a suggestion: We can track both heap and off-heap sizes and have 2
> thresholds, one for limiting heap size and one for limiting off-heap size. And
> at all decision-making junctions we check whether one of the thresholds is
> exceeded, and if it is we trigger a flush. We can choose which entity to flush
> based on the cause. For example, if we decided to flush because the heap size
> exceeds the heap threshold, then we flush the region/store with the greatest heap
> size, and likewise for an off-heap flush.
>
> I can prepare a patch.
>
> This is not rolling back HBASE-18294, simply refining it to have different
> decision making for the on-heap and off-heap cases.
>
> On Monday, July 10, 2017, 8:25:12 AM GMT+3, Anoop John 
>  wrote:
>
> Stack and others..
> We won't have any OOM or FullGC issues, because globally at RS level we
> will track both the data size (of all the memstores) and the heap
> size. The decision there accounts for both. In fact, in the case of normal
> on-heap memstores, the accounting is like the old heap-size-based way.
>
> At region level (and at Segments level) we track data size only. The
> decisions are based on data size.
>
> So in the past a region flush size of 128 MB meant we would flush when
> the heap size of that region crossed 128 MB. But now it is data size
> alone. What I feel is that this is more in line with a normal user's
> thinking: they say a flush size of 128 MB and then the expectation can be
> 128 MB of data.
>
> The background of this change is the off-heap memstores, where we need
> separate tracking of both data and heap overhead sizes. But at
> region level this behavior change was made thinking that it is more user
> oriented.
>
> I agree with Yu that it is a surprising behavior change. Ya, if not tuned
> accordingly one might see more blocked writes, because the per-region
> flushes are more delayed now and so the chances of reaching the global
> memstore upper barrier are higher. And then we will block
> writes and force flushes. (But off-heap memstores will do a better job
> here.) But this would NOT cause any OOME or FullGC.
>
> I guess we should have reduced the 128 MB default flush size then?  I
> asked this Q in that jira and then we did not discuss further.
>
> I hope I explained the background and the change and the impacts.  Thanks.
>
> -Anoop-
>
> On Thu, Jul 6, 2017 at 11:43 AM, 宾莉金(binlijin)  wrote:
>> I'd like to use the former, heap occupancy, so we do not need to worry about
>> OOM and FullGC, or change configuration to adapt to the new policy.
>>
>> 2017-07-06 14:03 GMT+08:00 Stack :
>>
>>> On Wed, Jul 5, 2017 at 9:59 PM, ramkrishna vasudevan <
>>> ramkrishna.s.vasude...@gmail.com> wrote:
>>>
>>> >
>>> > >>Sounds like we should be doing the former, heap occupancy
>>> > Stack, so do you mean we need to roll back this new change in trunk? The
>>> > background is https://issues.apache.org/jira/browse/HBASE-16747.
>>> >
>>> >
>>> I remember that issue. It seems good to me (as it did then) where we have
>>> the global tracking in RS of all data and overhead so we shouldn't OOME and
>>> we keep accounting of overhead and data distinct because now data can be
>>> onheap or offheap.
>>>
>>> We shouldn't be doing blocking updates -- not when there is probably loads
>>> of memory still available -- but that is a different (critical) issue.
>>> Sounds like current configs can 'surprise' -- see Yu Li note -- given the
>>> new accounting.
>>>
>>> Looks like I need to read HBASE-18294
>>>  to figure what the
>>> pivot/problem w/ the new policy is.
>>>
>>> Thanks,
>>> St.Ack
>>>
>>>
>>>
>>>
>>>
>>> > Regards
>>> > Ram
>>> >
>>> >
>>> > On Thu, Jul 6, 2017 at 8:40 AM, Yu Li  wrote:
>>> >
>>> > > We've also observed more blocking updates happening with the new policy
>>> > > (flush decision made on data size), but could work around it by
>>> reducing
>>> > > the hbase.hregion.memstore.flush.size setting. The advantage of
>>> current
>>> > > policy is we could control the flushed file size more accurately, but
>>> > > meanwhile losing some "compatibility" (requires configuration updating
>>> > > during rolling upgrade).
>>> > >
>>> > > I'm not sure whether we should rollback, but if stick on current policy
>>> > > there should be more documents, metrics (monitoring heap/data occupancy
>>> > > separately) and log message refinements, etc. Attaching some of the
>>> logs
>>> > we
>>> > > observed, which is pretty confusing w/o knowing the details 
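A minimal sketch of the two-threshold idea floated above (track data size and heap overhead separately and flush when either limit trips); the class, field, and method names are made up for illustration and are not RegionServerAccounting or any other HBase internal.

{code}
// Illustrative only: models "two thresholds, flush when either trips".
// Names are invented; this is not HBase code.
import java.util.concurrent.atomic.AtomicLong;

public class TwoThresholdFlushSketch {
  private final AtomicLong dataSize = new AtomicLong();     // key-value bytes only
  private final AtomicLong heapOverhead = new AtomicLong(); // metadata / object overhead
  private final long dataSizeLimit;
  private final long heapSizeLimit;

  public TwoThresholdFlushSketch(long dataSizeLimit, long heapSizeLimit) {
    this.dataSizeLimit = dataSizeLimit;
    this.heapSizeLimit = heapSizeLimit;
  }

  /** Called on every memstore write with the bytes added to each accounting bucket. */
  public boolean addAndCheckFlush(long dataDelta, long overheadDelta) {
    long d = dataSize.addAndGet(dataDelta);
    long h = heapOverhead.addAndGet(overheadDelta);
    // Flush if either the raw data or the total heap footprint crosses its limit;
    // the caller can use whichever limit tripped to decide what to flush first.
    return d >= dataSizeLimit || (d + h) >= heapSizeLimit;
  }
}
{code}

This is also where Anoop's overhead concern shows up: doing it per region means two atomics per region instead of one, which is why the double check was kept at the RS level.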

Re: [DISCUSS] Planning changes on RegionServer totalRequestCount metrics

2017-08-07 Thread Anoop John
Sorry for being late here, Yu Li.
Regarding counting the rows (for the new metric) in multi: there
might be 2 Actions in a multi request for the same row. This is possible
sometimes. I don't think we should check that and try to make it
perfect. That will have a perf penalty also. So just saying that we
will have some possible inconsistency even after. Maybe we can say
how many actions in multi, not rows affected! Any better name?

On Mon, Aug 7, 2017 at 8:07 AM, Yu Li  wrote:
> Thanks for chiming in @stack and @Jerry, will try to add a good release
> note when the work is done.
>
> Since already more than 72 hours passed and no objections, I'd like to call
> this discussion closed and apply the change in HBASE-18469. Thanks.
>
> Best Regards,
> Yu
>
> On 4 August 2017 at 13:59, stack  wrote:
>
>> +1
>>
>> We need a fat release note on this change so operators can quickly learn
>> why traffic went down on upgrade.
>>
>> S
>>
>> On Aug 3, 2017 14:49, "Yu Li"  wrote:
>>
>> > Dear all,
>> >
>> > Recently in HBASE-18469 (https://issues.apache.org/jira/browse/HBASE-18469)
>> > we found some inconsistency on regionserver request related metrics,
>> > including:
>> > 1. totalRequestCount could be less than readRequestCount+
>> writeRequestCount
>> > 2. For multi request, we count action count into totalRequestCount, while
>> > for scan with caching we count only one.
>> >
>> > To fix the inconsistency, we plan to make below changes:
>> > 1. Make totalRequestCount only count rpc requests, thus a multi request
>> will
>> > only count as one for totalRequestCount
>> > 2. Introduce a new metric named "totalRowsRequestCount", which will
>> > count the DML workloads on RS by row-level action, and for this metrics
>> we
>> > will count how many rows included for multi and scan-with-caching
>> request.
>> >
>> > After the change, there won't be any compatibility issue -- existing
>> > monitoring system could still work -- only that totalRequestCount will be
>> > less than previous. And it's recommended to use totalRowsRequestCount to
>> > check the RS DML workload.
>> >
>> > Please kindly let us know if you have any different idea or suggestion
>> > (operators' opinion is especially welcomed).
>> >
>> > Let's make this discussion open for 72 hours and will make the change if
>> no
>> > objections.
>> >
>> > Thanks!
>> >
>> > Best Regards,
>> > Yu
>> >
>>


Re: Tags class using wrong length?

2017-08-07 Thread Lars George
Gotcha, sorry for the noise. I documented it properly in the upcoming HBase
book. :)

Sent from my iPhone

> On 7. Aug 2017, at 07:02, ramkrishna vasudevan 
>  wrote:
> 
> I think the layout of tags is missing now in the javadoc. Maybe it got
> missed or moved to some other place?
> I remember we had a layout explaining the tag structure; with it this code
> is much easier to read.
> 
> As Chia-Ping said, |tag length (2 bytes)|tag type (1 byte)|tag| is the
> layout.
> So from the KeyValue layout we extract the tag part, which in itself has a
> tag length to represent the complete set of tags.
> 
> From the tags offset and tags length from the KV we extract individual tags
> in that KV.
> 
> For eg
> See TagUtil#asList
> 
> {code}
> List<Tag> tags = new ArrayList<>();
> int pos = offset;
> while (pos < offset + length) {
>   int tagLen = Bytes.readAsInt(b, pos, TAG_LENGTH_SIZE);
>   tags.add(new ArrayBackedTag(b, pos, tagLen + TAG_LENGTH_SIZE));
>   pos += TAG_LENGTH_SIZE + tagLen;
> }
> return tags;
> {code}
> 
> Regards
> Ram
> 
> 
> 
> 
>> On Mon, Aug 7, 2017 at 3:25 AM, Ted Yu  wrote:
>> 
>> The byte following the tag length (a short) is the tag type.
>> 
>> The current code is correct.
>> 
>> On Sun, Aug 6, 2017 at 5:40 AM, Chia-Ping Tsai 
>> wrote:
>> 
>>> According to the following code:
>>>  public ArrayBackedTag(byte tagType, byte[] tag) {
>>>    int tagLength = tag.length + TYPE_LENGTH_SIZE;
>>>    if (tagLength > MAX_TAG_LENGTH) {
>>>      throw new IllegalArgumentException(
>>>          "Invalid tag data being passed. Its length can not exceed " + MAX_TAG_LENGTH);
>>>    }
>>>    length = TAG_LENGTH_SIZE + tagLength;
>>>    bytes = new byte[length];
>>>    int pos = Bytes.putAsShort(bytes, 0, tagLength);
>>>    pos = Bytes.putByte(bytes, pos, tagType);
>>>    Bytes.putBytes(bytes, pos, tag, 0, tag.length);
>>>    this.type = tagType;
>>>  }
>>> The layout of the byte array should be:
>>> |tag length (2 bytes)|tag type (1 byte)|tag|
>>> 
>>> It seems to me that the "bytes[offset + TYPE_LENGTH_SIZE]" is correct.
>>> 
 On 2017-08-06 16:35, Lars George  wrote:
 Hi,
 
 I found this reading through tags in 1.3, but checked in trunk as
 well. There is this code:
 
  public ArrayBackedTag(byte[] bytes, int offset, int length) {
    if (length > MAX_TAG_LENGTH) {
      throw new IllegalArgumentException(
          "Invalid tag data being passed. Its length can not exceed " + MAX_TAG_LENGTH);
    }
    this.bytes = bytes;
    this.offset = offset;
    this.length = length;
    this.type = bytes[offset + TAG_LENGTH_SIZE];
  }
 
 I am concerned about the last line of the code, using the wrong
>> constant?
 
  public final static int TYPE_LENGTH_SIZE = Bytes.SIZEOF_BYTE;
  public final static int TAG_LENGTH_SIZE = Bytes.SIZEOF_SHORT;
 
 Should this not read
 
this.type = bytes[offset + TYPE_LENGTH_SIZE];
 
 Would this not read the type from the wrong place in the array?
 
 Cheers,
 Lars
 
>>> 
>>