[ https://issues.apache.org/jira/browse/HIVE-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15690721#comment-15690721 ]

Jesus Camacho Rodriguez commented on HIVE-15234:
------------------------------------------------

[~ashutoshc], patch looks good in general. I have only one question. Why don't we use HiveSemiJoin in all cases? I think we are always generating HiveSemiJoin instances in any case if we are using the right builders (or we probably should).

> Semijoin cardinality estimation can be improved
> -----------------------------------------------
>
>                 Key: HIVE-15234
>                 URL: https://issues.apache.org/jira/browse/HIVE-15234
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO, Logical Optimizer
>    Affects Versions: 2.0.0, 2.1.0
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>         Attachments: HIVE-15234.1.patch, HIVE-15234.2.patch, HIVE-15234.patch
>
>
> Currently calcite optimization rules rely on (Hive)SemiJoin to represent semi
> join node, whereas Stats estimate use {{leftSemiJoin}} field of Join to
> estimate stats. As a result semi-join specific stats calculation logic is
> never hit since at plan generation time HiveSemiJoin is created and
> leftSemiJoin field of Join is never set.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)