Hi Dayue,
    You are right, metadata is a key part of the system.
    For KYLIN-958, you could apply any workaround in the short term. For the
long term, we will go through the current implementation and try to fix it
with the right approach to avoid conflicts.

    The underlying storage is not an issue. We actually migrated from MySQL
to HBase early in the 0.6 version to remove one more dependency. I think the
metadata storage has already been extracted as an interface, so it should be
easy to add another storage backend again if necessary.
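
    To illustrate what "extracted as an interface" could look like, here is a
minimal sketch. The names MetadataStore and InMemoryMetadataStore are
hypothetical and are not Kylin's actual classes; the in-memory map only stands
in for a real backend such as HBase or MySQL.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical storage abstraction (an assumption, not Kylin's real API).
interface MetadataStore {
    Optional<String> get(String path);
    void put(String path, String content);
    void delete(String path);
}

// Illustrative backend; an HBase- or MySQL-backed class would implement
// the same three methods.
class InMemoryMetadataStore implements MetadataStore {
    private final Map<String, String> resources = new ConcurrentHashMap<>();

    @Override
    public Optional<String> get(String path) {
        return Optional.ofNullable(resources.get(path));
    }

    @Override
    public void put(String path, String content) {
        resources.put(path, content);
    }

    @Override
    public void delete(String path) {
        resources.remove(path);
    }
}
```

    Callers that depend only on the interface would not need to change when a
new backend is added.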

    Thanks.

Best Regards!
---------------------

Luke Han

On Tue, Aug 25, 2015 at 11:35 AM, Dayue Gao <[email protected]> wrote:

> Metadata consistency is one of the most crucial things for many systems.
>
> So in the short run, to fix KYLIN-958, I suggest disallowing users from
> updating the data model. Even so, users can still create a new data model
> to fulfill their needs.
>
> In the long run, I'd suggest migrating metadata persistence from a NoSQL
> store like HBase to a transactional database like MySQL. Although a lot of
> work needs to be done, it will make keeping metadata consistent a lot easier.
>
> What do you think?
>
> Best,
> Dayue
>
> > On Aug 25, 2015, at 11:11 AM, Li Yang <[email protected]> wrote:
> >
> > Dayue has a good point. Although updating multiple resources in one
> > request is doable, the complexity is not worth the effort.
> >
> > Making model desc and cube desc immutable is a good idea. We can still
> > implement "update" by first deleting the old model and cube, then creating
> > new ones with the same name. So from the user's point of view, it looks
> > like an update. This workaround should do well on the 0.7 branch, where
> > model and cube are strictly 1-1.
> >
> > The reason model and cube are separate resources is that in the 0.8
> > branch they have a 1-m relationship. A user can create a model and create
> > multiple cubes on it.
> >
> > On Tue, Aug 25, 2015 at 10:31 AM, hongbin ma <[email protected]> wrote:
> >
> >> hi dayue,
> >>
> >> I agree with you. The current cube desc/model desc design is the result
> >> of multiple rounds of redesign, and it may have failed to take ease of
> >> maintenance into proper consideration. To be honest, it is quite complex
> >> now, especially where cube/model updates are involved.
> >>
> >> Making cubes/models immutable looks appealing to me. However, we might
> >> need some more front-end work to reduce the cube/model re-creation
> >> overhead for users.
> >>
> >> @liyang and @luke, will you please comment on this?
> >>
> >> On Tue, Aug 25, 2015 at 5:12 AM, Dayue Gao <[email protected]> wrote:
> >>
> >>> Hi developers,
> >>>
> >>> When I was working on https://issues.apache.org/jira/browse/KYLIN-958,
> >>> I found it difficult to implement CubeController.updateCubeDesc. The
> >>> problems are:
> >>>
> >>> 1. CubeDesc.calculateSignature only includes the fact table name and
> >>> partition desc as data model information
> >>>
> >>> This means that if a user changes lookup tables or the filter condition,
> >>> the cube desc signature won't change and Kylin will not clear already
> >>> built cube segments. BTW, why do we store the signature in metadata
> >>> rather than calculate it on demand? I know it may be an optimization to
> >>> avoid recalculating the signature every time; however, changing a desc
> >>> shouldn't be a regular operation, so persisting the signature won't give
> >>> us much benefit. What's more, once it has been recorded in metadata, it
> >>> becomes difficult for us to change the computing logic.
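
A minimal sketch of the on-demand idea in point 1, assuming a simplified model
description. ModelInfo, its fields, and Signatures.of are hypothetical names
for illustration and do not reflect how CubeDesc.calculateSignature is actually
written. The point is only that lookup tables and the filter condition feed
into the digest, and that nothing is persisted.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified view of the data model (illustrative fields only).
final class ModelInfo {
    final String factTable;
    final List<String> lookupTables;
    final String filterCondition;
    final String partitionDesc;

    ModelInfo(String factTable, List<String> lookupTables,
              String filterCondition, String partitionDesc) {
        this.factTable = factTable;
        this.lookupTables = lookupTables;
        this.filterCondition = filterCondition;
        this.partitionDesc = partitionDesc;
    }
}

final class Signatures {
    // Computes the signature each time it is needed instead of reading a
    // stored value, so the hashing logic can change without a migration.
    static String of(ModelInfo m) {
        List<String> lookups = new ArrayList<>(m.lookupTables);
        Collections.sort(lookups); // lookup table order should not matter
        String canonical = String.join("|", m.factTable,
                String.join(",", lookups), m.filterCondition, m.partitionDesc);
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(canonical.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is a required JDK algorithm
        }
    }
}
```

With this shape, changing a lookup table or the filter condition yields a
different signature, which is the signal needed to invalidate built segments.
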
> >>>
> >>> 2. Maintaining metadata consistency
> >>>
> >>> This is a more general problem. We have separated metadata into
> >>> different files (cube, cube_desc, model_desc, project, etc.), and
> >>> maintaining consistency across these files is not an easy task in either
> >>> FileResourceStore or HBaseResourceStore, so IMO we'd better avoid
> >>> operations that change multiple metadata files as much as possible.
> >>> "CubeController.updateCubeDesc" is a notable counter-example. In order
> >>> to complete this operation, a sequence of metadata updates (model_desc
> >>> -> cube -> cube_desc -> cube -> project) is performed. Making sure
> >>> "CubeController.updateCubeDesc" won't leave metadata in a half-successful
> >>> state is not easy.
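
To make the difficulty in point 2 concrete, here is a rough sketch of a
best-effort multi-resource update with compensation. Write and
BestEffortUpdater are hypothetical names, and a plain Map stands in for a
resource store. This is not how Kylin performs the update, and it is not
atomic: a crash during rollback or a concurrent writer can still leave
metadata inconsistent, which is why avoiding multi-file updates is attractive.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One step in an update sequence (hypothetical helper type).
final class Write {
    final String path;
    final String content;

    Write(String path, String content) {
        this.path = path;
        this.content = content;
    }
}

final class BestEffortUpdater {
    private final Map<String, String> store; // stand-in for a resource store

    BestEffortUpdater(Map<String, String> store) {
        this.store = store;
    }

    // Applies the writes in order. If one fails, restores the original
    // content of every path that was already written, then rethrows.
    void applyAll(List<Write> writes) {
        Map<String, String> previous = new HashMap<>();
        try {
            for (Write w : writes) {
                String old = store.get(w.path); // null means "did not exist"
                store.put(w.path, w.content);
                // Keep only the earliest value when a path is written twice.
                if (!previous.containsKey(w.path)) {
                    previous.put(w.path, old);
                }
            }
        } catch (RuntimeException e) {
            for (Map.Entry<String, String> p : previous.entrySet()) {
                if (p.getValue() == null) {
                    store.remove(p.getKey());
                } else {
                    store.put(p.getKey(), p.getValue());
                }
            }
            throw e;
        }
    }
}
```

Even this small sketch needs extra bookkeeping for a path that is written
twice, as "cube" is in the sequence above.
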
> >>>
> >>> Given all these difficulties, do we really need to allow users to change
> >>> the data model? Can we just make the data model immutable and only allow
> >>> users to change the cube desc? Immutable or versioned metadata is always
> >>> good in my experience, so a further question is: can we make the key
> >>> parts of the cube desc (properties that define how the cube was built,
> >>> excluding description and notify_list, for example) also immutable and
> >>> just add a shortcut in the front end to let users create a new cube desc
> >>> based on an existing one?
> >>>
> >>> Best,
> >>> Dayue
> >>
> >>
> >>
> >>
> >> --
> >> Regards,
> >>
> >> *Bin Mahone | 马洪宾*
> >> Apache Kylin: http://kylin.io
> >> Github: https://github.com/binmahone
> >>
>
>
>
