Thank you for the clarification.

On Sat, Nov 7, 2020 at 7:37 AM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Rex,
>
> "HasUniqueKey" means that the left input has a unique key.
> "JoinKeyContainsUniqueKey" means that the join key of the right side
> contains the unique key of this relation. Hence, it looks normal to me.
>
> Cheers,
> Till
>
> On Fri, Nov 6, 2020 at 7:29 PM Rex Fenley <r...@remind101.com> wrote:
>
>> Hello,
>>
>> I have a Job that's a series of Joins, GroupBys, and Aggs and it's
>> bottlenecked in one of the joins. The join's cardinality is ~300 million
>> rows on the left and ~200 million rows on the right all with unique keys.
>> I'm seeing this in the plan for that bottlenecked Join.
>>
>> Join(joinType=[InnerJoin], where=[(user_id = id0)], select=[id, group_id,
>> user_id, uuid, owner, id0, deleted_at], leftInputSpec=[HasUniqueKey],
>> rightInputSpec=[JoinKeyContainsUniqueKey])
>>
>> The join condition is basically (left.user_id === right.id). So `id0`
>> must be right.id here.
>>
>> My first question is, what is the difference between
>>
>> leftInputSpec=[HasUniqueKey]
>>
>> and
>>
>> rightInputSpec=[JoinKeyContainsUniqueKey]
>>
>> ?
>>
>> Is the left side not using the join key for hashing the join but instead
>> using its pk id, which would be underperformant?
>>
>> Is there anything else about this that stands out?
>>
>> Thanks!
>>
>> --
>>
>> Rex Fenley | Software Engineer - Mobile and Backend
>>
>>
>> Remind.com <https://www.remind.com/> | BLOG <http://blog.remind.com/>
>> | FOLLOW US <https://twitter.com/remindhq> | LIKE US
>> <https://www.facebook.com/remindhq>
>>
>

--
Rex Fenley | Software Engineer - Mobile and Backend
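
For readers of the archive, the distinction Till describes can be sketched as a small classification over column sets. This is only an illustration of that explanation, not Flink's planner code: `InputSideSpec` and `classify_input_side` are hypothetical names, and the left side's unique key is assumed to be `id` (the "pk id" mentioned in the question).

```python
from enum import Enum


class InputSideSpec(Enum):
    # Labels mirror the strings printed in the plan; the enum itself is hypothetical.
    NO_UNIQUE_KEY = "NoUniqueKey"
    HAS_UNIQUE_KEY = "HasUniqueKey"
    JOIN_KEY_CONTAINS_UNIQUE_KEY = "JoinKeyContainsUniqueKey"


def classify_input_side(unique_keys, join_key):
    """Classify one join input from its unique keys and its join-key columns.

    unique_keys: iterable of column-name sets, one per known unique key.
    join_key:    iterable of the columns this side contributes to the join condition.
    """
    join_cols = frozenset(join_key)
    # Ignore empty entries so an empty set never counts as "contained".
    candidates = [frozenset(uk) for uk in unique_keys if uk]
    if not candidates:
        return InputSideSpec.NO_UNIQUE_KEY
    # Some unique key lies entirely inside the join key.
    if any(uk <= join_cols for uk in candidates):
        return InputSideSpec.JOIN_KEY_CONTAINS_UNIQUE_KEY
    # A unique key exists, but the join key does not cover it.
    return InputSideSpec.HAS_UNIQUE_KEY


# Left side: assumed unique key `id`, joined on `user_id`.
left = classify_input_side(unique_keys=[{"id"}], join_key={"user_id"})
# Right side: unique key `id` (shown as `id0` in the plan), joined on `id`.
right = classify_input_side(unique_keys=[{"id"}], join_key={"id"})

print(left.value)
print(right.value)
```

Under these assumptions the two sides come out as the labels shown in the plan, which is consistent with Till's reading that the output is expected for a join of `left.user_id` against `right.id`.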