Hi

The idea that supports Kylin adding measures dynamically is impressive.

But in my opinion, once you add a measure, the existing segments should
also calculate the new measure(just add a new measure column). Users can
have many cubes, a cube can have many segments, if measure's view is
different in each segment, it will increase the burden of the user.

-- 


Regards!

Aron Tao

yuzhang <shifengdefan...@163.com> 于2019年4月20日周六 上午1:43写道:

> Hi dear kylin users and develop team:
> Here have some things I want to discuss with community.
> As a representative of MOLAP engine, kylin uses pre-aggregation strategies
> to provide high-concurrency and second-level response analysis
> capabilities, but also loses some flexibility.
> The limitation that purge existing segment firstly to add an additional
> measure will cause many double calculation and unnecessary disk IO. Such
> waste should be avoid especially in MOLAP engine.
> For example, there is an cubeA with one measure m1 and segments over time
> range1(tr1). Now, user add one measure m2, but don't want to clear segments
> over tr1. The value of m2 will exist in tr2, the segments build
> subsequently. Sure, tr1 doesn't contain value of m2, which will be
> understanded by user who know litte about MOLAP. Querying over tr1 and tr2
> is valid for both m1 and m2, but the result of m2 over tr1 will be null.
> It's will be better to reminder user the measure missing.Moreover,
> refreshing will supply the m2 to segments over tr1.
> Currently, kylin's storage engine uses HBase. The measure are aggregated
> values based on combination of various dimension members and stored in a
> column of a Column Family in HBase. For the same cube, adding a new measure
> will add a column to the HBase table(mapping) and will take effect in the
> next build. For the existing HTables(segments), the new column is allowed
> to be missing. Refreshing old existing segments will add a new column in
> their HTable to store new measure. Value of new measure is aggregated
> according to the combination of dimension members in rowkey, without
> recalculating existing measure.
> Now, For additional measure and even additional dimensions, Kylin's
> current solution is Hybrid, but we found the following shortcomings during
> use:
> 1. Management costs: Repeated maintenance of similar Cubes, most of which
> have many intersections of dimensions and indicators. If you want to
> perform optimization operations such as pruning, you need to configure all
> of these cubes.
> 2. A large number of cubes: The initial analysis of the business is not
> stable, and analysts often have the need to increase some measures. The
> cube is added continuously to the Hybrid group, which will produce a lot of
> cubes.
> 3. Repeat calculation: If you want to drop the old cube in the Hybrid
> group, you need to build the latest cube by compute historical data to
> cover the old cube.
> Those will result in a lot of waste.
> In addition, I felt that the metadata about the measure was not perfect
> during the applying of Kylin.
> 1. As one of the most important concerns of analysts, if the measures of
> the analysis system can be decoupled from the materialized view(cube) and
> have their own management system, it may be more flexibility.
> 2. Once the dimensions have been choose in cube designing, it's cuboids
> are confirmed no matter the number of measures. It may make confuse to
> maintenance cubes with different measures but same cuboids. Cubes with
> different cuboids should be considered different cube, which is the
> definition of cube, isn't it?
> It's just some thinking about MOLAP during I using kylin. How do you think
> about this? Looking forward your reply, sincerely.
> Maybe here are some mistake or misunderstanding, please feel free to
> correct me or discuss further more if you find any of them.
> Best regards
> yuzhang
>
>
> yuzhang
> shifengdefan...@163.com
>
> <https://maas.mail.163.com/dashi-web-extend/html/proSignature.html?ftlId=1&name=yuzhang&uid=shifengdefannao%40163.com&iconUrl=http%3A%2F%2Fmail-online.nosdn.127.net%2Fsm1c0446ade9371d208d1e209c8bc0827f.jpg&items=%5B%22shifengdefannao%40163.com%22%5D>
> 签名由 网易邮箱大师 <https://mail.163.com/dashi/dlpro.html?from=mail81> 定制
>

Reply via email to