[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4850: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Remus! Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Fix For: 0.13.0 Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.07.patch, HIVE-4850.08.patch, HIVE-4850.09.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Attachment: HIVE-4850.08.patch Iteration 8, this must be the right one! Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.07.patch, HIVE-4850.08.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Status: Patch Available (was: In Progress) Added the SET recommended by Jitendra. Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.07.patch, HIVE-4850.08.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Status: In Progress (was: Patch Available) Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.07.patch, HIVE-4850.08.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Status: Patch Available (was: In Progress) Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.07.patch, HIVE-4850.08.patch, HIVE-4850.09.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Attachment: HIVE-4850.09.patch Whitespace-only diff results from Jitendra's run Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.07.patch, HIVE-4850.08.patch, HIVE-4850.09.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Attachment: HIVE-4850.06.patch Fixed the test to use a mapjoin hint; without it the queries were resulting in a shuffle join. Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Status: Patch Available (was: In Progress) Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Status: In Progress (was: Patch Available) Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Status: In Progress (was: Patch Available) Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.07.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Attachment: HIVE-4850.07.patch Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.07.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Status: Patch Available (was: In Progress) Removed the /* MAPJOIN */ hint; the test now uses hive.auto.convert.join=true; instead. The failing test passes on my machine; trying to figure out why it fails on Jenkins. Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.06.patch, HIVE-4850.07.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Attachment: HIVE-4850.04.patch Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Status: Patch Available (was: Open) Fixed the JoinUtils computeValue regression Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Attachment: HIVE-4850.04.patch Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Attachment: (was: HIVE-4850.04.patch) Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.04.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Attachment: HIVE-4850.03.patch Added test query but my local environment does not pass trunk clean. Uploading to get a pre-commit build infra run on the patch. Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Status: Open (was: Patch Available) Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.03.patch, HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Attachment: HIVE-4850.2.patch This is a working implementation based on current trunk. It is simpler than the .1 patch in that it delegates the JOIN entirely to the row-mode MapJoinOperator. The vectorized operator literally calls the row-mode implementation for each row in the input batch and collects the row-mode forwards into the output batch. This is not as bad as it seems, because the JOIN operators have to resort to row-mode operations anyway, due to the small tables (hashtables) being row-mode (objects and object-inspectors). By delegating the entire join logic to the row mode we piggyback on the correctness of the existing implementation. I do plan to come up with a fully vectorized implementation, but that would require changes to the hash table creation-serialization. Note that the filtering and key evaluation of the big table *does* use vectorized operators. The row mode applies only to the key HT lookup and to the JOIN logic. Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.1.patch, HIVE-4850.2.patch Easysauce -- This message was sent by Atlassian JIRA (v6.1#6144)
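The delegation described in the .2 patch comment can be illustrated with a small, self-contained sketch. It is not taken from the patch: `DelegatingJoinSketch`, `RowModeJoin`, `RowForwarder`, and `Batch` are hypothetical stand-ins rather than Hive classes, the join is assumed to be an inner join on a single long key, and the collected output is kept as a plain list of rows instead of a real output batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; none of these types are Hive classes. */
final class DelegatingJoinSketch {

    /** Receives each joined row that the row-mode join forwards. */
    interface RowForwarder {
        void forward(Object[] joinedRow);
    }

    /** Stand-in for a row-mode map join: one hash lookup per input row. */
    static final class RowModeJoin {
        private final Map<Object, List<Object>> smallTable; // key -> small-table values
        private final RowForwarder out;

        RowModeJoin(Map<Object, List<Object>> smallTable, RowForwarder out) {
            this.smallTable = smallTable;
            this.out = out;
        }

        void processRow(Object key, Object bigValue) {
            List<Object> matches = smallTable.get(key);
            if (matches == null) {
                return; // assumed inner join: no match, no output
            }
            for (Object match : matches) {
                out.forward(new Object[] {key, bigValue, match});
            }
        }
    }

    /** Column-oriented input: keys[i] and values[i] form row i; only the first size rows are valid. */
    static final class Batch {
        final long[] keys;
        final long[] values;
        final int size;

        Batch(long[] keys, long[] values, int size) {
            this.keys = keys;
            this.values = values;
            this.size = size;
        }
    }

    /** Calls the row-mode join once per batch row and collects everything it forwards. */
    static List<Object[]> joinBatch(Batch batch, Map<Object, List<Object>> smallTable) {
        List<Object[]> output = new ArrayList<>();
        RowModeJoin rowJoin = new RowModeJoin(smallTable, output::add);
        for (int i = 0; i < batch.size; i++) {
            // The primitive long values are boxed here, which is the per-row cost of delegating.
            rowJoin.processRow(batch.keys[i], batch.values[i]);
        }
        return output;
    }
}
```

The point of the sketch is that the vectorized wrapper contains no join logic of its own; whatever the row-mode join gets right (or wrong) carries over unchanged.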
[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-4850: --- Attachment: HIVE-4850.1.patch This is an initial implementation of Map join. Multiple join aliases and multiple values per key work. The small aliases are row mode data (writable objects) and get converted to vector values *for each row in the big table* (after filtering). Also the map hash has row mode keys (objects) and the vector mode keys get converted to object keys for lookup of *each row in the big table* (after filtering). Implement vectorized JOIN operators --- Key: HIVE-4850 URL: https://issues.apache.org/jira/browse/HIVE-4850 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Attachments: HIVE-4850.1.patch Easysauce -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
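The two per-row conversions mentioned in the .1 patch comment can be sketched as follows. This is an illustration under assumptions, not the patch's code: `KeyBoxingLookupSketch`, `OutputColumns`, and `probe` are hypothetical names, the key is assumed to be a single long column, the small-table values are assumed to be strings, and the `selected` index array stands in for the rows that survived filtering.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; none of these types are Hive classes. */
final class KeyBoxingLookupSketch {

    /** Output in columnar form: output row i is (bigKey[i], smallValue[i]). */
    static final class OutputColumns {
        final long[] bigKey;
        final String[] smallValue;
        int size;

        OutputColumns(int capacity) {
            this.bigKey = new long[capacity];
            this.smallValue = new String[capacity];
        }

        void append(long key, String value) {
            if (size == bigKey.length) {
                // A real operator would flush the full batch downstream instead of failing.
                throw new IllegalStateException("output batch full");
            }
            bigKey[size] = key;
            smallValue[size] = value;
            size++;
        }
    }

    /**
     * Probes a row-mode hash table with the rows of a key column that passed filtering.
     * selected[0..selectedCount) holds the indexes of those rows.
     */
    static OutputColumns probe(long[] keyColumn, int[] selected, int selectedCount,
                               Map<Long, List<String>> hashTable, int capacity) {
        OutputColumns out = new OutputColumns(capacity);
        for (int j = 0; j < selectedCount; j++) {
            int row = selected[j];
            // Conversion 1: vector-mode key -> object key, once per big-table row.
            Long objectKey = Long.valueOf(keyColumn[row]);
            List<String> matches = hashTable.get(objectKey);
            if (matches == null) {
                continue; // assumed inner join: unmatched rows produce nothing
            }
            for (String value : matches) {
                // Conversion 2: row-mode small-table value -> slot in an output column.
                out.append(keyColumn[row], value);
            }
        }
        return out;
    }
}
```

Both conversions sit inside the per-row loop, which is why the comment stresses that they happen for each row of the big table rather than once per batch.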