Hi Dayue, I agree with you. The current cube desc/model desc design is the result of several rounds of redesign, and it may not have given enough consideration to ease of maintenance. To be honest, it is quite complex now, especially where cube/model updates are involved.
Making cubes/models immutable looks appealing to me. However, we might need some more front-end work to reduce the overhead of recreating cubes/models for users. @liyang and @luke, could you please comment on this?

On Tue, Aug 25, 2015 at 5:12 AM, Dayue Gao <[email protected]> wrote:

> Hi developers,
>
> When I was working on https://issues.apache.org/jira/browse/KYLIN-958, I
> found it difficult to implement CubeController.updateCubeDesc. The problems
> are
>
> 1. CubeDesc.calculateSignature only includes fact table name and partition
> desc as data model information
>
> This means if the user changes lookup tables or the filter condition, the
> cube desc signature won't change and Kylin will not clear already built cube
> segments. BTW, why do we store the signature in metadata rather than
> calculate it on demand? I know it may be an optimization to avoid
> recalculating the signature every time, however changing a desc shouldn't
> be a regular operation, so persisting the signature won't give us much
> benefit. What's more, once it has been recorded in metadata, it becomes
> difficult for us to change the computing logic.
>
> 2. Maintain metadata consistency
>
> This is a more general problem. As we have separated metadata into
> different files (cube, cube_desc, model_desc, project, etc) and maintaining
> consistency across these files is not an easy task in both
> FileResourceStore and HBaseResourceStore, IMO we'd better avoid operations
> that change multiple metadata files as much as possible.
> "CubeController.updateCubeDesc" is a notable counter-example. In order to
> complete this operation, a sequence of metadata updates (model_desc -> cube
> -> cube_desc -> cube -> project) is performed. Making sure
> "CubeController.updateCubeDesc" won't leave metadata in a half-success state
> is not easy.
>
> Given all these difficulties, do we really need to allow the user to change
> the data model? Can we just make the data model immutable and only allow the
> user to change the cube desc? Immutable or versioned metadata is always good
> in my experience, so a further question is: can we make the key parts
> (properties that define how the cube was built, excluding description and
> notify_list for example) of the cube desc also immutable, and just add a
> shortcut in the front end to let the user create a new cube desc based on an
> existing one?
>
> Best,
> Dayue

--
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
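To make point 1 of the quoted message more concrete, here is a minimal sketch of a signature that is computed on demand and covers lookup tables and the filter condition as well as the fact table and partition desc. It is only an illustration under stated assumptions: ModelSketch and its fields are hypothetical names, not Kylin's actual classes, and the choice of SHA-256 plus Base64 is mine rather than what CubeDesc.calculateSignature does.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.Collections;
import java.util.List;

// Hypothetical, immutable stand-in for a data model (not Kylin's real class).
final class ModelSketch {
    final String factTable;
    final String partitionDesc;
    final List<String> lookupTables;
    final String filterCondition;

    ModelSketch(String factTable, String partitionDesc,
                List<String> lookupTables, String filterCondition) {
        this.factTable = factTable;
        this.partitionDesc = partitionDesc;
        // Defensive copy so later changes to the caller's list cannot leak in.
        this.lookupTables = Collections.unmodifiableList(new ArrayList<>(lookupTables));
        this.filterCondition = filterCondition;
    }

    // Computed on demand from every field that affects how a cube is built;
    // nothing is persisted, so the hashing logic can change freely later.
    String signature() {
        String payload = String.valueOf(factTable) + '|'
                + String.valueOf(partitionDesc) + '|'
                + String.join(",", lookupTables) + '|'
                + String.valueOf(filterCondition);
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

With a signature like this, adding a lookup table or editing the filter condition yields a different value, so stale segments could be detected by comparing signatures at the time they are needed. Because the object is immutable, "changing" a model would mean constructing a new ModelSketch from the existing one, which is in line with the create-from-existing shortcut suggested in the quoted message.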
