[
https://issues.apache.org/jira/browse/LENS-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867661#comment-15867661
]
Rajat Khandelwal commented on LENS-974:
---------------------------------------
But I see some scope for improvement in both approaches. Both are variations
of `populate all and prune most`. Note that some types of pruning are fatal
(once pruned, a fact can never be a candidate again) and some are recoverable
(e.g. a fact pruned by `LightestFactResolver` can still be a candidate if all
the candidates that survived that pruning are later pruned for other reasons).
This is a problem in our current setup as well: sometimes a fact (say `f1`) is
pruned because it's heavy, and then the other facts are later pruned for
absence of partitions, so `f1` could have answered the query but no longer can.

One possible solution would be to categorize the types of pruning and, if no
candidates are left at the end, backtrack to the facts that were pruned for
non-fatal reasons and take them forward from where they were pruned.

After the introduction of segmentations, this problem might gain more weight,
since recursively processing a segmentation at any point would be very
expensive; we'd like to minimize that and open the black box only for the
strongest contenders. Basically, the `populate all and prune most` approach
will be more expensive than an approach where we try to minimize processing on
weak candidates. We can sort the candidates, do complete processing for each
one in order, and select the first one that goes through all the steps. This
removes the possible complications involved in backtracking. So we arrive at a
third approach.
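
To make the difference concrete, here is a rough Python sketch. It is not Lens
code: `Candidate`, `has_required_partitions`, `populate_all_prune_most` and
`pick_first_viable` are made-up names, the partition check is reduced to a
boolean, and sorting by weight is only one assumed ordering. It reproduces the
`f1` situation described above and shows how the sort-and-process-fully
approach would avoid it.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for a fact or a segmentation."""
    name: str
    weight: float         # assumed ordering key: lower means cheaper
    has_partitions: bool  # stand-in for the outcome of a fatal check


# A check returns True if the candidate survives that processing step.
Check = Callable[[Candidate], bool]


def has_required_partitions(c: Candidate) -> bool:
    # Simplified fatal check; a real check would inspect storage partitions.
    return c.has_partitions


def populate_all_prune_most(candidates: Iterable[Candidate]) -> Optional[Candidate]:
    """Simplified failure mode: weight pruning runs before the partition check."""
    pool = list(candidates)
    if not pool:
        return None
    lightest = min(c.weight for c in pool)
    # Recoverable pruning: heavier candidates are dropped here ...
    pool = [c for c in pool if c.weight == lightest]
    # ... and never reconsidered when the fatal check empties the pool.
    pool = [c for c in pool if has_required_partitions(c)]
    return pool[0] if pool else None


def pick_first_viable(candidates: Iterable[Candidate],
                      checks: Iterable[Check]) -> Optional[Candidate]:
    """Sort candidates, fully process each in turn, return the first survivor."""
    steps = list(checks)
    for c in sorted(candidates, key=lambda cand: cand.weight):
        # all() short-circuits, so later (possibly expensive) steps are
        # skipped as soon as one step rejects the candidate.
        if all(step(c) for step in steps):
            return c
    return None


facts = [
    Candidate("f1", weight=10.0, has_partitions=True),
    Candidate("f2", weight=1.0, has_partitions=False),
    Candidate("f3", weight=1.0, has_partitions=False),
]
pruned_result = populate_all_prune_most(facts)
sorted_result = pick_first_viable(facts, [has_required_partitions])
```

With these inputs the first function ends up with no candidate even though `f1`
could answer the query, while the second selects `f1` without any backtracking.
Heavier candidates are only processed if every lighter one has already failed,
which is the property we want for segmentations.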
> Add cube-segmentation for base cube
> -----------------------------------
>
> Key: LENS-974
> URL: https://issues.apache.org/jira/browse/LENS-974
> Project: Apache Lens
> Issue Type: New Feature
> Components: cube
> Reporter: Sushil Mohanty
>
> With cube segmentation a cube can have multiple cubes and all these child
> cubes together will make the cube complete.
> CubeSegmentation and CubeFactTable will sit together, which means it can
> belong to only one base cube. A base cube can have one or more cube
> segmentations. Fields of segmentation will be intersection of all columns of
> its cubes. Segmentation will have weight to compare with its buddies (facts
> or other segmentations). Also it can have start and end time defined or it
> can derive them from its underlying facts.
> eg:
> base_cube
> |_fact1
> |_fact2
> |_cube_segment1
>     |_cube1
>         |_fact_11
>         |_fact_12
>         ...
>     ...
> |_cube_segment2
>     |_cube2
>         |_fact_21
>         |_fact_22
>         ...
>     ...