[
https://issues.apache.org/jira/browse/CALCITE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100463#comment-17100463
]
Jinpeng Wu commented on CALCITE-3963:
-------------------------------------
I think we all agree that RelNodes in a RelSet should share the same logical
properties. The difference is in how to achieve this.
I agree with Julian that MetadataQuery is a good design for propagating logical
properties to a new RelNode. Storing a concrete value associated with a RelSet
requires complicated logic to maintain and invalidate the cached value. If some
logic is considered flawed, it is a bug in the metadata handler. It should be
the metadata handler's job to ensure that logical properties are consistent
across the RelSet.
Haisheng mentioned that we have to decide when this value can be used for
logical space pruning. I think we can add a state field to RelSet, with values
such as EXPLORED or SUBSTITUTION_APPLIED. A MetadataHandler can also use this
state to decide its logic. The state needs to be invalidated when RelSets get
merged, but that should be much simpler than storing a concrete metadata result.
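A minimal sketch of such a state field, under stated assumptions: SketchRelSet
is a simplified stand-in and not Calcite's RelSet class, the UNEXPLORED value
and all method names are hypothetical, and RelNodes are represented by plain
digest strings. It only illustrates that merging resets the state.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical exploration states; UNEXPLORED is an assumed initial value. */
enum SetState { UNEXPLORED, EXPLORED, SUBSTITUTION_APPLIED }

/** Simplified stand-in for an equivalence set; NOT Calcite's RelSet. */
final class SketchRelSet {
  private final List<String> rels = new ArrayList<>();
  private SetState state = SetState.UNEXPLORED;

  /** Adds a member, identified here by a digest string. */
  void add(String relDigest) {
    rels.add(relDigest);
  }

  void markState(SetState newState) {
    state = newState;
  }

  SetState state() {
    return state;
  }

  int size() {
    return rels.size();
  }

  /**
   * Absorbs the members of the other set and resets the state, because the
   * merged set may contain alternatives that were never explored together.
   */
  void mergeWith(SketchRelSet other) {
    rels.addAll(other.rels);
    state = SetState.UNEXPLORED;
  }
}
```

The only thing to invalidate on a merge is one enum value, rather than every
metadata result that might have been stored on the set.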
This strategy is somewhat like combining option one and option two. When a new
RelNode is registered into a RelSet, logical properties are recomputed because
the cache in RelMetadataQuery is invalidated. The value cannot be used for
logical space pruning until the RelSet is in a suitable state. And how do we
decide the state? It may be difficult now, but it should be much simpler with a
top-down rule application strategy.
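A rough sketch of that combination, again under assumptions: SketchGroup and
its methods are hypothetical and do not use Calcite's RelMetadataQuery API, and
deriving the group row count as the minimum of the member estimates is an
arbitrary choice made only so the example is complete. Registering a member
invalidates the derived value, and the value is exposed for pruning only after
the group has been marked as explored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalDouble;

/** Hypothetical group whose row count is derived on demand. */
final class SketchGroup {
  private final List<Double> memberRowCounts = new ArrayList<>();
  private Double cachedRowCount; // null means "invalidated, recompute on read"
  private boolean explored;

  /** Registering a new member drops the cached value. */
  void register(double memberRowCount) {
    memberRowCounts.add(memberRowCount);
    cachedRowCount = null;
  }

  void markExplored() {
    explored = true;
  }

  /** Recomputes the row count if needed (minimum of member estimates). */
  double rowCount() {
    if (cachedRowCount == null) {
      double min = Double.POSITIVE_INFINITY;
      for (double count : memberRowCounts) {
        min = Math.min(min, count);
      }
      cachedRowCount = min;
    }
    return cachedRowCount;
  }

  /** Value usable for pruning; empty until the group is in a suitable state. */
  OptionalDouble rowCountForPruning() {
    return explored ? OptionalDouble.of(rowCount()) : OptionalDouble.empty();
  }
}
```

A caller that wants to prune would check rowCountForPruning() and fall back to
not pruning while the result is empty.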
> Maintains logical properties at RelSet (equivalent group) instead of RelNode
> ----------------------------------------------------------------------------
>
> Key: CALCITE-3963
> URL: https://issues.apache.org/jira/browse/CALCITE-3963
> Project: Calcite
> Issue Type: Bug
> Reporter: Xiening Dai
> Assignee: Xiening Dai
> Priority: Major
>
> Currently the logical properties (such as row count, distinct row count, etc.)
> are maintained at the RelNode level. This creates a number of metadata
> consistency problems, e.g. CALCITE-1048, CALCITE-2166.
> In theory, all RelNodes in a RelSet should share the same logical properties
> by definition of relational equivalence. So it makes more sense to keep
> logical properties at the RelSet level rather than at the RelNode level. Such
> properties shouldn't change when a new subset is created or a subset's best
> is changed.
> Specifically, I think the built-in metadata below should fall into the
> logical properties category:
> Selectivity
> UniqueKeys
> ColumnUniqueness
> RowCount
> MaxRowCount
> MinRowCount
> DistinctRowCount
> Size (averageRowSize, averageColumnSize)
>