[jira] [Updated] (HIVE-10104) LLAP: Generate consistent splits and locations for the same split across jobs
[ https://issues.apache.org/jira/browse/HIVE-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated HIVE-10104:
----------------------------------
    Attachment: HIVE-10104.1.txt

Patch to order the original splits by size and name. Location is based on a hash of the filename and start position.

[~hagleitn] - could you please take a quick look for sanity? Will commit after I'm able to test it a bit on a cluster larger than 1 node.

LLAP: Generate consistent splits and locations for the same split across jobs
-----------------------------------------------------------------------------

                Key: HIVE-10104
                URL: https://issues.apache.org/jira/browse/HIVE-10104
            Project: Hive
         Issue Type: Sub-task
           Reporter: Siddharth Seth
           Assignee: Siddharth Seth
            Fix For: llap
        Attachments: HIVE-10104.1.txt

Locations for splits are currently randomized. Also, the order of splits is random - depending on how threads end up generating the splits.
Add an option to sort the splits, and generate repeatable locations - assuming all other factors are the same.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
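For readers following the thread, here is a rough sketch of the idea described in the comment above: order the splits by size and name, then derive each split's location from a hash of the filename and start position. This is not the code in HIVE-10104.1.txt. The `Split` class, the `ORDER` comparator, `sorted`, `locationFor`, and the node names are all hypothetical, and the largest-first direction and the modulo mapping onto a node list are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;

class ConsistentSplits {

    /** Hypothetical stand-in for a file split: path, start offset, length. */
    public static final class Split {
        public final String path;
        public final long start;
        public final long length;

        public Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Size first (largest first is an assumption), then name, then start
     * offset, so the order does not depend on which thread produced a split.
     */
    public static final Comparator<Split> ORDER =
            Comparator.comparingLong((Split s) -> s.length).reversed()
                    .thenComparing((Split s) -> s.path)
                    .thenComparingLong((Split s) -> s.start);

    /** Returns a sorted copy; the input list is left untouched. */
    public static List<Split> sorted(List<Split> splits) {
        List<Split> copy = new ArrayList<>(splits);
        copy.sort(ORDER);
        return copy;
    }

    /**
     * Picks a node from a hash of the filename and start position only.
     * The result is repeatable as long as the node list is unchanged.
     */
    public static String locationFor(Split s, List<String> nodes) {
        int hash = Objects.hash(s.path, s.start);
        return nodes.get(Math.floorMod(hash, nodes.size()));
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node-1", "node-2", "node-3");
        List<Split> generated = Arrays.asList(
                new Split("/warehouse/t/000001_0", 0L, 64L),
                new Split("/warehouse/t/000000_0", 128L, 128L),
                new Split("/warehouse/t/000000_0", 0L, 128L));

        // Whatever order the splits arrive in, the output is the same.
        for (Split s : sorted(generated)) {
            System.out.println(s.path + "@" + s.start + " -> " + locationFor(s, nodes));
        }
    }
}
```

Note that a plain modulo mapping reshuffles most locations whenever the node list changes, which is consistent with the "assuming all other factors are the same" caveat in the issue description.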
[jira] [Updated] (HIVE-10104) LLAP: Generate consistent splits and locations for the same split across jobs
[ https://issues.apache.org/jira/browse/HIVE-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated HIVE-10104:
----------------------------------
    Attachment: HIVE-10104.2.txt

Updated patch with the sort removed from the scheduler. Tested on a multi-node cluster. Will commit after the next rebase of the LLAP branch.

LLAP: Generate consistent splits and locations for the same split across jobs
-----------------------------------------------------------------------------

                Key: HIVE-10104
                URL: https://issues.apache.org/jira/browse/HIVE-10104
            Project: Hive
         Issue Type: Sub-task
           Reporter: Siddharth Seth
           Assignee: Siddharth Seth
            Fix For: llap
        Attachments: HIVE-10104.1.txt, HIVE-10104.2.txt