Re: Kylin Query Latency and Number of Parallel Queries

Adunuthula, Seshu Fri, 19 Jun 2015 07:52:59 -0700

Sizing and tuning HBase takes some skill, but there is a lot of help
available on the web. Here are some basic principles to begin with.

1. Do not colocate HBase Region Servers and MapReduce on the same nodes.
Shut down the Node Managers on the nodes running the Region Servers. This
reduces your MR capacity but makes HBase a lot more stable.
2. Size your Region Servers correctly. Here is a great blog by Lars on
this subject.
https://www.quora.com/HBase-Region-Server-guidelines-give-a-size-range-of-about-1TB-whereas-data-nodes-are-configured-20-times-bigger-Why
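
Since the number of parallel queries is bounded by the HBase scans behind
them, one way to find the practical limit on a given cluster is to measure
latency at increasing parallelism. Below is a minimal sketch of that idea,
not a Kylin or HBase API: `run_query` is a hypothetical stand-in that just
sleeps, and it would need to be replaced with a real call to your own query
endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query(sql):
    """Hypothetical stand-in: replace with a real call to your query endpoint."""
    time.sleep(0.01)  # simulate a query that takes about 10 ms
    return sql


def timed(sql):
    """Run one query and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    run_query(sql)
    return time.perf_counter() - start


def measure(sql, parallelism, total=20):
    """Issue `total` queries using `parallelism` worker threads.

    Returns (mean latency, max latency) in seconds.
    """
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        latencies = list(pool.map(timed, [sql] * total))
    return sum(latencies) / len(latencies), max(latencies)


if __name__ == "__main__":
    # Step up the parallelism and watch where latency starts to climb.
    for n in (1, 2, 4, 8):
        mean, worst = measure("SELECT 1", n)
        print(f"parallelism={n} mean={mean:.3f}s max={worst:.3f}s")
```

With a real `run_query`, the parallelism level at which mean latency starts
to rise sharply is a rough indicator of where the scans saturate the Region
Servers.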

Regards
Seshu Adunuthula

On 6/19/15, 3:12 AM, "Li Yang" <[email protected]> wrote:

>In the end, HBase is the bottleneck for the number of parallel queries,
>because every query will be translated into one or more HBase scans.
>Assuming not much online processing is required (the data is
>pre-aggregated, right), the HBase scan will be the bottleneck.
>
>On Thu, Jun 11, 2015 at 5:34 PM, Shi, Shaofeng <[email protected]> wrote:
>
>> Recommend for reading:
>>
>> http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin
>>
>>
>> On 6/11/15, 4:28 PM, "Vineet Mishra" <[email protected]> wrote:
>>
>> >Hi,
>> >
>> >I was trying Kylin for some of my use cases, where the data cube size is
>> >110 MB with 5 million records. A query over the full data takes around a
>> >minute, which seems like a lot of time. Apart from this, I was wondering
>> >how many queries Kylin can handle in parallel.
>> >
>> >For instance, how many queries can be fired in parallel at our aggregated
>> >data cubes, and is there some practice that can improve query
>> >performance?
>> >
>> >Urgent Call!
>> >
>> >Thanks!
>>
>>
