gianm commented on issue #4032: [Proposal] Simple join support
URL: https://github.com/apache/incubator-druid/issues/4032#issuecomment-481850459

@yurmix for multi-value lookups, would you be thinking about extending the lookup framework to support multiple values for the same key? Or avoiding the lookup framework entirely?

By the way, I've been thinking recently about reviving proposals for joins in Druid and had been considering a few goals for a solution, below.

- Integrate with Druid SQL so you can use the "JOIN" keyword. Druid SQL usage would be the main motivator for this feature, and so having nice SQL syntax is more important than having nice JSON syntax.
- Only the left-most datasource could be a distributed, multi-server datasource. This is so the join doesn't require a system to shuffle data between servers.
- All other datasources, other than the leftmost, being joined should be lookup tables (we should allow JOIN onto a lookup table, why not), broadcast datasources, queries, or literal datasources (like an array of rows embedded in the query).
- If one of the datasources being joined is a query datasource, I'm thinking the broker should run the query first, and then convert it to a literal datasource before sending it down to the historicals. Druid SQL already has something like this special cased for semi-joins. It should hopefully allow us to remove the special case and just handle semi-joins like any other join.
- Start with left/right/inner equijoins only (using a hash-based algorithm).
- Later on, add support for non-equijoins.
- Later still, add support for shuffling data between servers.
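
To make the "hash-based equijoin with a small right-hand side" goal a bit more concrete, here is a minimal Python sketch. It is not Druid code and does not reflect any existing Druid API: the names `build_hash_table` and `hash_join`, the row-as-dict representation, and the sample columns are all assumptions made for illustration. The right-hand side stands in for a lookup table, broadcast datasource, or literal datasource that fits in memory on every server, and the left-hand side is streamed, so no shuffle is needed. Only inner and left joins are shown; a right join would additionally have to track which right-hand rows never matched.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical row representation: one dict per row, column name -> value.
Row = Dict[str, object]


def build_hash_table(right_rows: Iterable[Row], key: str) -> Dict[object, List[Row]]:
    """Index the small (lookup / broadcast / literal) side by its join key."""
    table: Dict[object, List[Row]] = {}
    for row in right_rows:
        value = row.get(key)
        if value is None:
            # SQL semantics: a NULL key never equals anything, so skip it.
            continue
        table.setdefault(value, []).append(row)
    return table


def hash_join(
    left_rows: Iterable[Row],
    right_rows: Iterable[Row],
    left_key: str,
    right_key: str,
    how: str = "inner",
) -> Iterator[Row]:
    """Stream the left side and probe an in-memory hash table of the right side."""
    if how not in ("inner", "left"):
        raise ValueError("this sketch only supports 'inner' and 'left' joins")

    right_list = list(right_rows)
    table = build_hash_table(right_list, right_key)

    # Used to pad unmatched left rows with NULLs in a left join.
    null_right = {col: None for row in right_list for col in row}

    for left in left_rows:
        value = left.get(left_key)
        matches = table.get(value, []) if value is not None else []
        if matches:
            for right in matches:
                yield {**left, **right}
        elif how == "left":
            yield {**left, **null_right}


if __name__ == "__main__":
    # Left side: rows as they might be scanned from a distributed datasource.
    events = [
        {"page": "A", "country_code": "US"},
        {"page": "B", "country_code": "FR"},
        {"page": "C", "country_code": "ZZ"},
    ]
    # Right side: a literal datasource embedded in the query.
    countries = [
        {"code": "US", "country_name": "United States"},
        {"code": "FR", "country_name": "France"},
    ]
    for row in hash_join(events, countries, "country_code", "code", how="left"):
        print(row)
```

Because the hash table is built only from the small side, each server holding part of the left-most datasource could build the same table locally and join its own rows independently, which is the property the "no shuffle" goal relies on.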
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
