Thanks Luke! :)

On Sat, Jun 20, 2015 at 11:44 AM, Luke Han <[email protected]> wrote:

> Hi Vineet,
> I got it, please feel free to continue posting your questions here. We are
> happy to help, but frankly speaking, we can't guarantee the response time
> since we also have tasks inside. But we will try our best to help everyone
> to use Kylin smoothly.
> For your case, the concurrency should not be an issue if you can
> control the queries coming from Tableau, that means do not allow a Tableau
> dashboard/report to pull huge data in one query. For example, please use
> "connect live" not "import data" in Tableau.
> And, please set up more nodes to serve high concurrency requests,
> Kylin's REST server is stateless which could scale out very well.
>
> Any issue, please let us know.
> Thanks.
>
> Best Regards!
> ---------------------
>
> Luke Han
>
> On Fri, Jun 19, 2015 at 5:48 PM, Vineet Mishra <[email protected]>
> wrote:
>
> > Thanks Luke for the prompt response.
> >
> > As the Kylin project is in incubation mode with comparatively less
> > active mailers, and due to the demand of my project, which has already
> > crossed the expected deliverable timeline, I have to put it that way! :)
> >
> > Well, my use case is to get the aggregated data across various dimensions
> > to visualize it on Tableau. The visualization will be accessed by 100s of
> > users (even more) and the connection will be live, as a result multiple
> > queries are expected.
> >
> > On Fri, Jun 19, 2015 at 11:57 PM, Luke Han <[email protected]> wrote:
> >
> > > Hi Vineet,
> > > One query to pull 5 million records will take time, which is not the
> > > recommended way to leverage Kylin.
> > > In our internal performance testing, Kylin could handle hundreds of
> > > QPS for small queries on a single machine with several Tomcat
> > > instances, please refer to these slides (P31) for more detail:
> > >
> > > http://www.slideshare.net/lukehan/apache-kylin-big-data-technology-conference-2014-beijing-v2
> > >
> > > Kylin is not a database which can only serve well for certain cases,
> > > please evaluate your requirements, case, data, it's appreciated if you
> > > could share more detail about your case, then we could have a clearer
> > > idea to help you :)
> > >
> > > BTW, is "Urgent Call!" your signature or is it really urgent? I saw it
> > > in every thread of yours and was wondering about it :-)
> > >
> > > Thank you very much
> > >
> > > Luke
> > >
> > > Best Regards!
> > > ---------------------
> > >
> > > Luke Han
> > >
> > > On Fri, Jun 19, 2015 at 7:51 AM, Adunuthula, Seshu <[email protected]>
> > > wrote:
> > >
> > > > Sizing & tuning HBase requires some skills, but there is a lot of
> > > > help available on the web. Here are some basic principles to begin
> > > > with.
> > > >
> > > > 1. Do not colocate HBase Region Servers and MapReduce on the same
> > > > nodes. Shut down the Node Managers on the nodes running the Region
> > > > Servers. It reduces your MR capacity but makes your HBase a lot more
> > > > stable.
> > > > 2. Size your Region Servers correctly. Here is a great blog by Lars
> > > > on this subject.
> > > >
> > > > https://www.quora.com/HBase-Region-Server-guidelines-give-a-size-range-of-about-1TB-whereas-data-nodes-are-configured-20-times-bigger-Why
> > > >
> > > > Regards
> > > > Seshu Adunuthula
> > > >
> > > >
> > > > On 6/19/15, 3:12 AM, "Li Yang" <[email protected]> wrote:
> > > >
> > > > >In the end, HBase is the bottleneck for the number of parallel
> > > > >queries, because every query will be translated into one or more
> > > > >HBase scans.
> > > > >Assuming not much online processing is required (data is
> > > > >pre-aggregated, right), the HBase scan will be the bottleneck.
> > > > >
> > > > >On Thu, Jun 11, 2015 at 5:34 PM, Shi, Shaofeng <[email protected]>
> > > > >wrote:
> > > > >
> > > > >> Recommended reading:
> > > > >>
> > > > >> http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin
> > > > >>
> > > > >>
> > > > >> On 6/11/15, 4:28 PM, "Vineet Mishra" <[email protected]> wrote:
> > > > >>
> > > > >> >Hi,
> > > > >> >
> > > > >> >I was trying Kylin for one of my use cases, where the data cube
> > > > >> >size is 110 MB with 5 million records. The query for the full
> > > > >> >data takes around a minute or so, which seems like a hell of a
> > > > >> >lot of time. Apart from this, I was wondering what query
> > > > >> >threshold Kylin can handle in parallel.
> > > > >> >
> > > > >> >For instance, how many queries can be fired in parallel at our
> > > > >> >aggregated data cubes, and is there some practice which can
> > > > >> >improve the query performance?
> > > > >> >
> > > > >> >Urgent Call!
> > > > >> >
> > > > >> >Thanks!
