Hi,

Supporting dynamically added measures in Kylin is an impressive idea.
But in my opinion, once you add a measure, the existing segments should also calculate the new measure (just add a new measure column). Users can have many cubes, and a cube can have many segments; if the set of available measures differs from segment to segment, it will increase the burden on the user.

--
Regards!

Aron Tao

yuzhang <shifengdefan...@163.com> wrote on Sat, Apr 20, 2019 at 1:43 AM:

> Hi dear Kylin users and development team,
>
> There are a few things I would like to discuss with the community.
>
> As a representative MOLAP engine, Kylin uses pre-aggregation to provide
> high-concurrency analysis with second-level response times, but it also
> loses some flexibility.
>
> The limitation that existing segments must be purged before an additional
> measure can be added causes a lot of repeated calculation and unnecessary
> disk I/O. Such waste should be avoided, especially in a MOLAP engine.
>
> For example, there is a cube, cubeA, with one measure m1 and segments over
> time range 1 (tr1). Now the user adds a measure m2 but doesn't want to
> clear the segments over tr1. The value of m2 will exist in tr2, the
> segments built subsequently. Of course, tr1 doesn't contain values of m2,
> which will be understood by a user who knows a little about MOLAP.
> Querying over tr1 and tr2 is valid for both m1 and m2, but the result of
> m2 over tr1 will be null. It would be better to remind the user that the
> measure is missing. Moreover, refreshing will supply m2 to the segments
> over tr1.
>
> Currently, Kylin's storage engine uses HBase. Measures are aggregated
> values based on combinations of various dimension members and are stored
> in a column of a column family in HBase. For the same cube, adding a new
> measure will add a column to the HBase table (mapping) and will take
> effect in the next build. For the existing HTables (segments), the new
> column is allowed to be missing. Refreshing old existing segments will add
> a new column to their HTables to store the new measure.
> The value of the new measure is aggregated according to the combination
> of dimension members in the rowkey, without recalculating existing
> measures.
>
> Now, for additional measures and even additional dimensions, Kylin's
> current solution is Hybrid, but we found the following shortcomings in
> use:
>
> 1. Management costs: Similar cubes must be maintained repeatedly, most of
>    which share many dimensions and indicators. If you want to perform
>    optimization operations such as pruning, you need to configure all of
>    these cubes.
> 2. A large number of cubes: The initial analysis of a business is not
>    stable, and analysts often need to add measures. Cubes are continuously
>    added to the Hybrid group, which produces a lot of cubes.
> 3. Repeated calculation: If you want to drop the old cube in the Hybrid
>    group, you need to build the latest cube by computing over historical
>    data to cover the old cube.
>
> These result in a lot of waste.
>
> In addition, while using Kylin I felt that the metadata about measures was
> not ideal:
>
> 1. Measures are one of the most important concerns of analysts. If the
>    measures of the analysis system could be decoupled from the
>    materialized view (cube) and have their own management system, it
>    might be more flexible.
> 2. Once the dimensions have been chosen in cube design, the cuboids are
>    fixed regardless of the number of measures. It can be confusing to
>    maintain cubes with different measures but the same cuboids. Only cubes
>    with different cuboids should be considered different cubes, which is
>    the definition of a cube, isn't it?
>
> These are just some thoughts about MOLAP from my use of Kylin. What do you
> think? I sincerely look forward to your reply.
>
> There may be some mistakes or misunderstandings here; please feel free to
> correct me or discuss further if you find any.
>
> Best regards,
> yuzhang
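
To make the semantics discussed in this thread concrete, here is a minimal sketch of the m1/m2 example over tr1 and tr2. It is not Kylin code and does not use any Kylin or HBase API: the `Segment` class and the `query_sum` and `refresh_with_measure` functions are hypothetical names invented for illustration, and the rowkeys and values are made up. It assumes SUM measures and models each segment as a map from rowkey to measure columns, where a column may be absent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

RowKey = Tuple[str, ...]  # one combination of dimension members


@dataclass
class Segment:
    """Hypothetical stand-in for one cube segment (one HTable in the proposal)."""
    name: str
    # rowkey -> {measure name -> pre-aggregated value}
    rows: Dict[RowKey, Dict[str, float]] = field(default_factory=dict)

    def measures(self) -> Set[str]:
        """Measure columns that are actually materialized in this segment."""
        found: Set[str] = set()
        for columns in self.rows.values():
            found.update(columns)
        return found


def query_sum(segments: List[Segment], measure: str) -> Tuple[Optional[float], List[str]]:
    """Sum a measure over segments; also return the segments that lack it."""
    total: Optional[float] = None
    missing: List[str] = []
    for seg in segments:
        if measure not in seg.measures():
            missing.append(seg.name)  # the "reminder" that the measure is missing
            continue
        seg_sum = sum(cols[measure] for cols in seg.rows.values() if measure in cols)
        total = seg_sum if total is None else total + seg_sum
    return total, missing


def refresh_with_measure(segment: Segment, measure: str, values: Dict[RowKey, float]) -> None:
    """Add a new measure column to an old segment without touching existing columns."""
    for rowkey, value in values.items():
        segment.rows.setdefault(rowkey, {})[measure] = value


# Illustrative data only: tr1 was built before m2 existed, tr2 after.
tr1 = Segment("tr1", {("cn",): {"m1": 10.0}, ("us",): {"m1": 5.0}})
tr2 = Segment("tr2", {("cn",): {"m1": 3.0, "m2": 7.0}})
```

With this model, querying m2 over tr1 alone yields no value (`None`) plus a list naming tr1 as missing the measure, while querying across tr1 and tr2 returns only the tr2 contribution. That partial result is the per-segment difference Aron's reply is concerned about. Calling `refresh_with_measure` on tr1 fills in m2 and leaves m1 as it was.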