Re: Kylin Query Latency and Number of Parallel Queries

Luke Han Fri, 19 Jun 2015 23:16:13 -0700

Hi Vineet,
    I got it. Please feel free to keep posting your questions here. We are
happy to help but, frankly, we can't guarantee a response time since we
also have our own internal tasks. We will try our best to help everyone
use Kylin smoothly.
    For your case, concurrency should not be an issue as long as you
control the queries coming from Tableau, meaning you do not allow a Tableau
dashboard/report to pull a huge amount of data in one query. For example,
please use "connect live", not "import data", in Tableau.
    Also, please set up more nodes to serve high-concurrency requests.
Kylin's REST server is stateless, so it scales out very well.
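
    The two points above (small aggregated queries, spread over several
stateless query servers) can be sketched roughly as follows. This is only
an illustration under assumptions: the host names, project name, table name
and the helper build_query_request are hypothetical, and the
/kylin/api/query path and JSON field names should be checked against the
REST API documentation for your Kylin version. Authentication is omitted,
and the request is built but not sent.

```python
import itertools
import json
import urllib.request

# Hypothetical query-server addresses; replace with your own hosts.
KYLIN_SERVERS = [
    "http://kylin-1:7070",
    "http://kylin-2:7070",
    "http://kylin-3:7070",
]

# Because the servers are stateless, any of them can answer any query,
# so a plain round-robin over the list is enough for this sketch.
_next_server = itertools.cycle(KYLIN_SERVERS)


def build_query_request(sql, project, limit=1000):
    """Build (but do not send) a POST request for the next server in turn."""
    base = next(_next_server)
    # Field names are assumed from the Kylin query API; "limit" caps the
    # number of rows so one dashboard query cannot pull the whole cube.
    payload = {"sql": sql, "project": project, "offset": 0, "limit": limit}
    return urllib.request.Request(
        url=base + "/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# An aggregated query over a hypothetical table, rather than SELECT * on
# all 5 million rows.
request = build_query_request(
    "SELECT region, SUM(amount) FROM sales GROUP BY region",
    project="demo",
)
```

    In practice a load balancer in front of the query servers would
usually do the round-robin; the point is only that no session state ties a
client to one node.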


    If you hit any issue, please let us know.
    Thanks.




Best Regards!
---------------------

Luke Han

On Fri, Jun 19, 2015 at 5:48 PM, Vineet Mishra <[email protected]>
wrote:

> Thanks Luke for the prompt response.
>
> As the Kylin project is still in incubation, with a comparatively less
> active mailing list, and my project has already passed its expected
> delivery date, I had to put it that way! :)
>
> Well, my use case is to get aggregated data across various dimensions and
> visualize it in Tableau. The visualization will be accessed by 100 or more
> users over a live connection, so multiple concurrent queries are
> expected.
>
> On Fri, Jun 19, 2015 at 11:57 PM, Luke Han <[email protected]> wrote:
>
> > Hi Vineet,
> >     A single query that pulls 5 million rows will take time, and it is
> > not the recommended way to use Kylin.
> >     In our internal performance testing, Kylin could handle hundreds of
> > QPS for small queries on a single machine with several Tomcat instances.
> > Please refer to these slides (p. 31) for more detail:
> >
> > http://www.slideshare.net/lukehan/apache-kylin-big-data-technology-conference-2014-beijing-v2
> >
> >     Kylin is not a database, and it serves only certain cases well.
> > Please evaluate your requirements, use case, and data. We would
> > appreciate it if you could share more detail about your case, so that we
> > have a clearer idea of how to help you :)
> >
> >     BTW, is "Urgent Call!" your signature, or is this really urgent? I
> > see it in every thread of yours and wonder about it :-)
> >
> >     Thank you very much
> >
> > Luke
> >
> >
> >
> > Best Regards!
> > ---------------------
> >
> > Luke Han
> >
> > On Fri, Jun 19, 2015 at 7:51 AM, Adunuthula, Seshu <[email protected]>
> > wrote:
> >
> > > Sizing and tuning HBase requires some skill, but there is a lot of help
> > > available on the web. Here are some basic principles to begin with.
> > >
> > > 1. Do not colocate HBase Region Servers and MapReduce on the same nodes.
> > > Shut down the Node Managers on the nodes running the Region Servers. This
> > > reduces your MR capacity but makes your HBase a lot more stable.
> > > 2. Size your Region Servers correctly. Here is a great blog by Lars on
> > > this subject.
> > >
> > > https://www.quora.com/HBase-Region-Server-guidelines-give-a-size-range-of-about-1TB-whereas-data-nodes-are-configured-20-times-bigger-Why
> > >
> > > Regards
> > > Seshu Adunuthula
> > >
> > >
> > > On 6/19/15, 3:12 AM, "Li Yang" <[email protected]> wrote:
> > >
> > > >In the end, HBase is the bottleneck for the number of parallel queries,
> > > >because every query is translated into one or more HBase scans. Assuming
> > > >not much online processing is required (the data is pre-aggregated,
> > > >right?), the HBase scan will be the bottleneck.
> > > >
> > > >On Thu, Jun 11, 2015 at 5:34 PM, Shi, Shaofeng <[email protected]> wrote:
> > > >
> > > >> Recommended reading:
> > > >>
> > > >> http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin
> > > >>
> > > >>
> > > >> On 6/11/15, 4:28 PM, "Vineet Mishra" <[email protected]> wrote:
> > > >>
> > > >> >Hi,
> > > >> >
> > > >> >I was trying Kylin for one of my use cases, where the data cube size is
> > > >> >110 MB with 5 million records. A query for the full data takes around a
> > > >> >minute, which seems like a very long time. Apart from this, I was
> > > >> >wondering how many queries Kylin can handle in parallel.
> > > >> >
> > > >> >For instance, how many queries can be fired in parallel at our
> > > >> >aggregated data cubes, and are there practices that can improve query
> > > >> >performance?
> > > >> >
> > > >> >Urgent Call!
> > > >> >
> > > >> >Thanks!
> > > >>
> > > >>
> > >
> > >
> >
>
