[
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sergey Shelukhin updated HIVE-14589:
------------------------------------
Attachment: HIVE-14589.04.patch
> add consistent node replacement to LLAP for splits
> --------------------------------------------------
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.02.patch,
> HIVE-14589.03.patch, HIVE-14589.04.patch, HIVE-14589.patch
>
>
> See HIVE-14574. (copied from the comment below) This basically creates the
> nodes in ZK for "slots" in the cluster. The LLAPs try to take the lowest
> available slot, starting from 0. Unlike worker-... nodes, the slots are
> reused, which is the intent. The LLAPs are always sorted by the slot number
> for splits.
> The idea is that as long as an LLAP is running, it will retain the same
> position in the ordering, regardless of other LLAPs restarting, and without
> the LLAPs knowing about each other, their predecessors' locations (if
> restarted in a different place), or the total size of the cluster.
> The restarting LLAPs may not take the same positions as their predecessors
> (i.e. if two LLAPs restart they can swap slots) but it shouldn't matter
> because they have lost their cache anyway.
> I.e. if you have LLAPs with slots 1-2-3-4 and I nuke and restart 1, 2, and 4,
> they will take whatever slots, but 3 will stay the 3rd and retain cache
> locality.
> This also handles size increase, as new LLAPs will always be added to the end
> of the sequence, which is what consistent hashing needs.
> One case it doesn't handle is permanent cluster size reduction. If LLAPs that
> hold slots in the middle are removed, there will be a permanent gap; until
> some LLAPs are restarted, it will result in misses.
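
The slot behavior described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the code in the patch: SlotRegistry, acquire, release and ordered_instances are hypothetical names, a plain dict stands in for the ZK slot nodes, and slots are numbered from 0 as in the description. A real implementation would presumably rely on ZK to make claiming a slot atomic and to free the slot when the LLAP goes away.

```python
class SlotRegistry:
    """In-memory stand-in for the ZK slot nodes (hypothetical, not Hive's API)."""

    def __init__(self):
        self._slots = {}  # slot number -> LLAP instance id

    def acquire(self, instance_id):
        """Take the lowest available slot, starting from 0."""
        slot = 0
        while slot in self._slots:
            slot += 1
        self._slots[slot] = instance_id
        return slot

    def release(self, instance_id):
        """Free whatever slot the instance held, so it can be reused."""
        for slot, owner in list(self._slots.items()):
            if owner == instance_id:
                del self._slots[slot]

    def taken_slots(self):
        return sorted(self._slots)

    def ordered_instances(self):
        """LLAPs sorted by slot number, as used for splits."""
        return [self._slots[s] for s in sorted(self._slots)]


reg = SlotRegistry()
for name in ["a", "b", "c", "d"]:
    reg.acquire(name)             # a=0, b=1, c=2, d=3

# Nuke and restart everything except "c"; restarts arrive in a different order.
for name in ["a", "b", "d"]:
    reg.release(name)
for name in ["d2", "a2", "b2"]:
    reg.acquire(name)             # d2=0, a2=1, b2=3

print(reg.ordered_instances())    # ['d2', 'a2', 'c', 'b2']

# Size increase: the new LLAP is appended at the end of the sequence.
print(reg.acquire("e"))           # 4

# Permanent reduction: removing a middle slot leaves a gap until a restart.
reg.release("a2")
print(reg.taken_slots())          # [0, 2, 3, 4]
```

In this sketch "c" keeps slot 2 throughout, while the restarted LLAPs land on whichever slots happen to be free. The last step shows the unhandled case: slot 1 stays empty until some LLAP starts and claims it.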
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)