[ https://issues.apache.org/jira/browse/PHOENIX-852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105520#comment-14105520 ]
Maryann Xue commented on PHOENIX-852:
-------------------------------------

bq. What about the case where the RHS has been filtered down a lot and you have a fully qualified key? Then a full scan over the LHS will be much worse than a skip scan driven by the keys formed through the RHS rows. I think this may be the most common case.

I don't think there's a silver bullet for this problem before we have stats, and I assume the goal at this stage is to be relatively conservative. So why don't we just check in now and go with "by default we do BETWEEN-AND for a full key match (e.g. c1,c2,c3 matched in c1,c2,c3), but only use the IN clause if the SKIP_SCAN_HASH_JOIN hint is on"? At least people can start using this feature to optimize their queries, and in some cases they'll have to be aware of the hints to do even better.

> Optimize child/parent foreign key joins
> ---------------------------------------
>
>                 Key: PHOENIX-852
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-852
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Maryann Xue
>         Attachments: 852.patch, PHOENIX-852.patch
>
>
> Oftentimes a join will occur from a child to a parent. Our current algorithm
> would do a full scan of one side or the other. We can do much better than
> that if the HashCache contains the PK (or even part of the PK) from the table
> being joined to. In these cases, we should drive the second scan through a
> skip scan on the server side.
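
The default proposed in the comment (a BETWEEN-AND range for a full key match, and an IN list only when the SKIP_SCAN_HASH_JOIN hint is on) could be sketched roughly as follows. This is an illustration only, not the code in the attached patch: the function name `build_lhs_filter`, the `skip_scan_hint` flag, and the tuple representation of composite keys are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: one composite key is a tuple such as (c1, c2, c3).
Key = Tuple


def build_lhs_filter(rhs_keys: List[Key], skip_scan_hint: bool = False):
    """Choose how to constrain the LHS scan from the keys found on the RHS.

    Returns a (kind, values) pair:
      ("NONE", [])             no RHS rows, so nothing can match
      ("BETWEEN", [low, high]) conservative default: one range scan
      ("IN", [k1, k2, ...])    hint is on: point lookups (skip scan)
    """
    if not rhs_keys:
        return ("NONE", [])

    # Deduplicate and sort; tuples compare lexicographically, which is used
    # here as a stand-in for row key order (an assumption of this sketch).
    keys = sorted(set(rhs_keys))

    if skip_scan_hint:
        # Analogue of the IN clause: every distinct key is looked up directly.
        return ("IN", keys)

    # Analogue of BETWEEN-AND: scan everything from the smallest to the
    # largest key, including rows in between that will not match.
    return ("BETWEEN", [keys[0], keys[-1]])


if __name__ == "__main__":
    rhs = [(2, "b"), (1, "a"), (2, "b")]
    print(build_lhs_filter(rhs))
    print(build_lhs_filter(rhs, skip_scan_hint=True))
```

The range form never reads more than a bounded slice of the LHS but may touch non-matching rows inside it, while the IN form touches only the listed keys and pays off when the RHS has been filtered down to a few rows. Without stats the planner cannot tell which case applies, which is why the comment suggests leaving the choice to a hint.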