Echoing this: "*This is the beauty of Hadoop Ecosystem i like that we have lots of different tools which can be used according to requirement.*"
Can't agree more: the ecosystem is the most important thing around Hadoop itself. Different projects aim to solve different problems. Although there is some overlap between them, it depends on how and where you leverage them; there is no one-size-fits-all solution.

Back to the original question: Impala is SQL on Hadoop, Kylin is OLAP on Hadoop. That's the biggest difference :)

Thanks.

Best Regards!
---------------------

Luke Han

2015-05-29 19:25 GMT+08:00 Akash Mishra <[email protected]>:

> Hi Sun,
>
> I don't see Impala competing with Kylin. Both of them have very different
> use cases. We have built a prediction system with reporting capability for
> our client using Impala as a real-time SQL-on-Hadoop solution.
>
> Impala is fast and works well with point queries and aggregation within
> the same table. Most of our queries completed in real time. We started
> facing some problems with our reporting feature, where a lot of
> aggregation and filtering happened across several tables. We had to write
> a lot of code for each aggregation and filter. Meanwhile we found Kylin
> and found that we could use it to move our reporting feature from Impala
> to Kylin.
> We still felt that both Impala and Kylin can stay in the system and be
> used as per the requirement.
>
> This is the beauty of Hadoop Ecosystem i like that we have lots of
> different tools which can be used according to requirement.
>
>
> On Thu, May 28, 2015 at 2:58 PM, Adunuthula, Seshu <[email protected]>
> wrote:
>
> > We are in the process of releasing TPC-DS benchmarks for Kylin to
> > compare against Hive.
> >
> > Also, I do not see Kylin competing with SQL-on-Hadoop solutions like
> > Impala, but complementing them. There is a subset of SQL workloads that
> > can be represented in a classic star schema format and allow for
> > pre-aggregation, where Kylin will do better.
> >
> > Regards
> > Seshu
> >
> >
> > On 5/27/15, 7:30 PM, "Luke Han" <[email protected]> wrote:
> >
> > >Hi Sun,
> > >    There's no benchmark from our side yet, especially in a prod env.
> > >I'm also very curious to know if someone has done such a comparison.
> > >
> > >    The direct advantages of Kylin over Impala (and other MPP
> > >solutions):
> > >    1. Non-invasive design: you do not need to install any agent,
> > >library or anything else in your existing Hadoop cluster (neither on
> > >the NameNode nor on the DataNodes).
> > >    2. Pre-calculated results avoid runtime scans/aggregation, which
> > >means you can get results faster, with latency in seconds over
> > >billions of rows.
> > >
> > >    Thanks.
> > >
> > >Luke
> > >
> > >
> > >Best Regards!
> > >---------------------
> > >
> > >Luke Han
> > >
> > >2015-05-28 10:20 GMT+08:00 [email protected] <[email protected]>:
> > >
> > >> Hi, team
> > >>
> > >> Really interested in the performance comparison and also the native
> > >> design advantages between Apache Kylin and Cloudera Impala.
> > >> According to the official description, Cloudera Impala offers
> > >> "Lightning-fast, distributed SQL queries for petabytes of data
> > >> stored in Apache Hadoop clusters". Kylin can reach 10-1000x the
> > >> query efficiency of Hive when used for MOLAP, while Cloudera Impala
> > >> also achieves a large performance improvement over Hive.
> > >>
> > >> The question is: has Kylin done any benchmark tests or performance
> > >> comparisons with Cloudera Impala in a production environment? What
> > >> would be the direct advantages of Apache Kylin over Cloudera Impala?
> > >>
> > >> If anyone has deployed and used both products, please kindly share
> > >> any suggestions.
> > >>
> > >> Best regards,
> > >>
> > >> Sun.
> > >>
> > >>
> > >> [email protected]
> > >>
> > >
> >
>
>
> --
>
> With Sincere Regards,
> Yours Sincerely,
>
> Akash Mishra.
>
>
> "Its not our abilities that make us, but our decisions."--Albus Dumbledore
>
