The following text is part of an article (https://zhuanlan.zhihu.com/p/343394287).
===============================================================================
Kylin relies on precomputation, so it suits aggregation queries with fixed patterns, for example when the join, group by, and where-condition patterns in the SQL are relatively stable. The larger the data volume, the more pronounced Kylin's advantage. It is especially strong at deduplication (count distinct), Top N, and Percentile, and it is widely used for dashboards, reports of all kinds, large-screen displays, traffic statistics, and user behavior analysis. Meituan, Aurora, Shell Housing, and others use Kylin to build their data service platforms, serving millions to tens of millions of queries per day, most of which complete within 2 - 3 seconds. There is no better alternative for such high-concurrency scenarios.

ClickHouse, because of its MPP architecture, has strong computing power and is a better fit when queries are more flexible, or when detail-level queries are needed at low concurrency. Typical scenarios include user-label filtering over very many columns with arbitrarily combined where conditions, and complex ad hoc queries at modest concurrency. If the data volume and traffic are large, you need to deploy a distributed ClickHouse cluster, which is more demanding to operate and maintain.

If some queries are very flexible but infrequent, computing them on the fly is more resource-efficient. Since there are few such queries, it is still cost-effective overall even if each one consumes a lot of computing resources.
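The cost argument here can be made concrete with a rough break-even model. The following is only a sketch: the function names and every number in it are hypothetical, chosen for illustration, and are not measurements of Kylin or ClickHouse. It compares paying a one-time precomputation cost plus a small per-query cost against paying a larger cost on every query computed on the fly.

```python
def total_cost(queries: int, per_query: float, upfront: float = 0.0) -> float:
    """Total cost = one-time upfront cost + per-query cost * number of queries."""
    return upfront + per_query * queries


def break_even_queries(build_cost: float,
                       precomputed_query_cost: float,
                       on_the_fly_query_cost: float) -> float:
    """Number of queries above which precomputation is cheaper overall.

    Returns infinity when precomputed queries are not cheaper per query,
    because the upfront cost can then never be recovered.
    """
    saving_per_query = on_the_fly_query_cost - precomputed_query_cost
    if saving_per_query <= 0:
        return float("inf")
    return build_cost / saving_per_query


# Hypothetical figures in arbitrary cost units (assumptions, not benchmarks).
BUILD_COST = 10_000.0            # one-time cost of building the precomputed results
PRECOMPUTED_QUERY_COST = 0.01    # cost of answering one query from precomputed results
ON_THE_FLY_QUERY_COST = 1.0      # cost of computing one query from raw data

if __name__ == "__main__":
    threshold = break_even_queries(
        BUILD_COST, PRECOMPUTED_QUERY_COST, ON_THE_FLY_QUERY_COST
    )
    print(f"break-even at about {threshold:.0f} queries")

    for n in (100, 1_000_000):
        on_the_fly = total_cost(n, ON_THE_FLY_QUERY_COST)
        precomputed = total_cost(n, PRECOMPUTED_QUERY_COST, BUILD_COST)
        cheaper = "on the fly" if on_the_fly < precomputed else "precomputed"
        print(f"{n} queries: on the fly={on_the_fly:.0f}, "
              f"precomputed={precomputed:.0f}, cheaper: {cheaper}")
```

With these made-up figures, a workload of a few hundred queries is cheaper to compute on the fly, while a workload of millions of queries is far cheaper with precomputation.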
If some queries have a fixed pattern and the query volume is large, Kylin is the better fit: the upfront cost of computing and storing the results with substantial resources is amortized across many queries, which makes it the most economical option.

--- Translated with DeepL.com (free version)

------------------------
With warm regard
Xiaoxiang Yu


On Mon, Dec 4, 2023 at 3:16 PM Nam Đỗ Duy <na...@vnpay.vn.invalid> wrote:

> Thank you Xiaoxiang for the near real time streaming feature. That's great.
>
> This morning there has been a new challenge to my team: clickhouse offered
> us the speed of calculating 8 billion rows in milliseconds, which is faster
> than my demonstration (I used Kylin to calculate 1 billion rows in 2.9
> seconds)
>
> Can you briefly suggest the advantages of kylin over clickhouse so that I
> can defend my demonstration.
>
> On Mon, Dec 4, 2023 at 1:55 PM Xiaoxiang Yu <x...@apache.org> wrote:
>
> > 1. "In this important scenario of realtime analytics, the reason here is
> > that kylin has lag time due to model update of new segment build, is that
> > correct?"
> >
> > You are correct.
> >
> > 2. "If that is true, then can you suggest a work-around of combination of
> > ... "
> >
> > Kylin is planning to introduce NRT streaming (coding is completed but not
> > released), which can bring the time lag down to about 3 minutes (that is
> > my estimate, but I am quite certain about it).
> > NRT stands for 'near real-time'. It will run a job that periodically does
> > micro-batch aggregation and persistence. The price is that you need to run
> > and monitor a long-running job. This feature is based on Spark Streaming,
> > so you need knowledge of it.
> >
> > I am curious: what is the maximum time lag your customers can tolerate?
> > Personally, I guess a minute-level time lag is OK for most cases.
> >
> > ------------------------
> > With warm regard
> > Xiaoxiang Yu
> >
> >
> >
> > On Mon, Dec 4, 2023 at 12:28 PM Nam Đỗ Duy <na...@vnpay.vn.invalid> wrote:
> >
> > > Druid is better in
> > > - Have a real-time datasource like Kafka etc.
> > >
> > > ==========================
> > >
> > > Hi Xiaoxiang, thank you for your response.
> > >
> > > In this important scenario of realtime analytics, the reason here is that
> > > kylin has lag time due to model update of new segment build, is that
> > > correct?
> > >
> > > If that is true, then can you suggest a work-around that combines:
> > >
> > > (time-lag kylin cube) + (realtime DB update) to provide
> > > realtime capability?
> > >
> > > IMO, the point here is to find that (realtime DB update) and integrate it
> > > with (time-lag kylin cube).
> > >
> > > On Fri, Dec 1, 2023 at 1:53 PM Xiaoxiang Yu <x...@apache.org> wrote:
> > >
> > > > I researched and tested Druid two years ago (I don't know much about
> > > > how Druid has changed in these two years. New features that I know of
> > > > are: new UI, fully on K8s etc).
> > > >
> > > > Here are some cases where you should consider using Druid rather than
> > > > Kylin at the moment (comparing Kylin 5.0-beta with the Druid I used two
> > > > years ago):
> > > >
> > > > - Have a real-time datasource like Kafka etc.
> > > > - Most queries are small (Based on my test result, I think Druid had
> > > > better response time for small queries two years ago.)
> > > > - Don't know how to optimize Spark/Hadoop, want to use the K8S/public
> > > > cloud platform as your deployment platform.
> > > >
> > > > But I do think there are many scenarios in which Kylin could be better,
> > > > like:
> > > >
> > > > - Better performance for complex/big queries. Kylin can have a more
> > > > exact-match/fine-grained Index for queries containing different
> > > > `Group By dimensions`.
> > > > - User-friendly UI for modeling.
> > > > - Support 'Join' better? (Not sure at the moment)
> > > > - ODBC driver for different BI tools. (Druid's website did not show
> > > > that it supports ODBC well)
> > > > - Looks like Kylin supports ANSI SQL better than Druid.
> > > >
> > > >
> > > > I don't know Pinot, so I have nothing to say about it.
> > > > Hope this helps you, and you are free to share your opinion.
> > > >
> > > > ------------------------
> > > > With warm regard
> > > > Xiaoxiang Yu
> > > >
> > > >
> > > >
> > > > On Fri, Dec 1, 2023 at 11:11 AM Nam Đỗ Duy <na...@vnpay.vn.invalid> wrote:
> > > >
> > > >> Dear Xiaoxiang,
> > > >> Sirs/Madams,
> > > >>
> > > >> May I post my boss's question:
> > > >>
> > > >> What are the pros and cons of the OLAP platform Kylin compared to
> > > >> Pinot and Druid?
> > > >>
> > > >> Please kindly let me know
> > > >>
> > > >> Thank you very much and best regards
> > > >>
> > > >