Hi Dayue,

I agree with you. The current cube desc/model desc design is the result of
multiple rounds of redesign, and it may have failed to take ease of
maintenance into account. To be honest, it is quite complex now, especially
where cube/model updates are involved.

Making cubes/models immutable looks appealing to me. However, we might need
some more front-end work to reduce the overhead of recreating cubes/models
for users.
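
To make the idea a bit more concrete, here is a rough sketch of what an
immutable desc plus a "create new from existing" shortcut could look like.
The class, fields and method names below are hypothetical and much simpler
than Kylin's real CubeDesc; this only illustrates the approach, it is not a
proposal for the actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for a cube desc; NOT Kylin's real CubeDesc.
final class ImmutableCubeDesc {
    private final String name;
    private final String modelName;
    private final List<String> dimensions;
    private final String description;

    ImmutableCubeDesc(String name, String modelName, List<String> dimensions, String description) {
        this.name = name;
        this.modelName = modelName;
        // Defensive copy, so later changes to the caller's list cannot leak in.
        this.dimensions = Collections.unmodifiableList(new ArrayList<>(dimensions));
        this.description = description;
    }

    String getName() { return name; }
    String getModelName() { return modelName; }
    List<String> getDimensions() { return dimensions; }
    String getDescription() { return description; }

    // The "shortcut": build a new desc under a new name from this one.
    // The existing desc (and any segments built from it) stays untouched.
    ImmutableCubeDesc deriveWithDimensions(String newName, List<String> newDimensions) {
        return new ImmutableCubeDesc(newName, modelName, newDimensions, description);
    }
}
```

With something like this, the front end would only need a "clone and edit"
action that pre-fills the form from the old desc, so users don't have to
re-enter everything when they recreate a cube.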

@liyang and @luke, will you please comment on this?

On Tue, Aug 25, 2015 at 5:12 AM, Dayue Gao <[email protected]> wrote:

> Hi developers,
>
> When I was working on https://issues.apache.org/jira/browse/KYLIN-958,
> I found it difficult to implement CubeController.updateCubeDesc. The
> problems are:
>
> 1. CubeDesc.calculateSignature only includes the fact table name and
> partition desc as data model information
>
> This means that if a user changes the lookup tables or the filter
> condition, the cube desc signature won't change and Kylin will not clear
> the already built cube segments. BTW, why do we store the signature in
> metadata rather than calculate it on demand? I know it may be an
> optimization to avoid recalculating the signature every time, but changing
> a desc shouldn't be a regular operation, so persisting the signature won't
> give us much benefit. What's more, once it has been recorded in metadata,
> it becomes difficult for us to change the computing logic.
>
> 2. Maintaining metadata consistency
>
> This is a more general problem. We have separated metadata into
> different files (cube, cube_desc, model_desc, project, etc.), and
> maintaining consistency across these files is not an easy task in either
> FileResourceStore or HBaseResourceStore, so IMO we'd better avoid
> operations that change multiple metadata files as much as possible.
> "CubeController.updateCubeDesc" is a notable counter-example. In order to
> complete this operation, a sequence of metadata updates (model_desc -> cube
> -> cube_desc -> cube -> project) is performed. Making sure that
> "CubeController.updateCubeDesc" won't leave metadata in a half-successful
> state is not easy.
>
> Given all these difficulties, do we really need to allow users to change
> the data model? Can we just make the data model immutable and only allow
> users to change the cube desc? Immutable or versioned metadata is always
> good in my experience, so a further question is: can we also make the key
> parts of the cube desc (properties that define how the cube was built,
> excluding description and notify_list, for example) immutable, and just add
> a shortcut in the front end to let users create a new cube desc based on an
> existing one?
>
> Best,
> Dayue
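
On point 1 in your mail, a minimal sketch of computing the signature on
demand over the whole data model (fact table, lookup tables, filter
condition and partition desc) rather than persisting it. Everything here is
an assumption for illustration: the class name, the flat String inputs, the
field separator and the choice of SHA-256 are mine, not what
CubeDesc.calculateSignature does today.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

// Hypothetical helper; not part of Kylin.
final class ModelSignature {
    private ModelSignature() {}

    // Recomputed whenever it is needed, so the hashing logic can change
    // later without migrating a value stored in metadata.
    static String calculate(String factTable, List<String> lookupTables,
                            String filterCondition, String partitionDesc) {
        StringBuilder sb = new StringBuilder();
        sb.append(factTable).append('|')
          // Lookup order is kept as given, on the assumption that join order matters.
          .append(String.join(",", lookupTables)).append('|')
          .append(filterCondition == null ? "" : filterCondition).append('|')
          .append(partitionDesc == null ? "" : partitionDesc);
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(sb.toString().getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

With all four inputs in the hash, a changed lookup table or filter condition
yields a different signature, which is the case that currently slips
through.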




-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
