[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504305#comment-15504305
 ] 

Gopal V commented on HIVE-14680:
--------------------------------

The offset difference is usually the ORC file magic, "ORC" (3 bytes).
BISplit ignores the magic and produces (0 + 32 MB), while ETLSplit starts at
the 1st stripe (3 + 33.99 MB).

This is not expected to happen for any split other than stripe #1 of a file.
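
To make the offset difference concrete, here is a minimal sketch of how the first split of a file could start at byte 0 under one strategy and at byte 3 under the other. The class and method names are hypothetical and this is not Hive's actual split-generation code; it assumes stripe #1 begins immediately after the 3-byte magic, and the stripe length used in main is a made-up value.

```java
// Hypothetical illustration only; not Hive's split-generation code.
final class SplitOffsetSketch {
    // Length of the file magic "ORC" at the start of the file.
    static final long ORC_MAGIC_LENGTH = 3;

    /** A split described by its start offset and length, both in bytes. */
    static final class Split {
        final long start;
        final long length;

        Split(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    // Block-based strategy: the first split begins at byte 0, magic included.
    static Split firstBlockBasedSplit(long blockSize) {
        return new Split(0, blockSize);
    }

    // Stripe-based strategy: the first split begins where stripe #1 begins,
    // assumed here to be right after the magic.
    static Split firstStripeBasedSplit(long firstStripeLength) {
        return new Split(ORC_MAGIC_LENGTH, firstStripeLength);
    }

    public static void main(String[] args) {
        Split blockBased = firstBlockBasedSplit(32L * 1024 * 1024);
        Split stripeBased = firstStripeBasedSplit(35_000_000L); // made-up stripe length
        System.out.println("start offset difference = "
                + (stripeBased.start - blockBased.start));
    }
}
```

Later splits in the same file would start at stripe or block boundaries past the magic, which is consistent with the difference showing up only for the first split.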

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-14680
>                 URL: https://issues.apache.org/jira/browse/HIVE-14680
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes.
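
The fallback idea in the description above can be sketched roughly as follows. This is a hypothetical illustration, not the HostAffinitySplitLocationProvider implementation: it assumes the slot list covers every slot up to the last one actually present, marks inactive slots with null, and uses plain linear probing as one possible form of "some sort of probing". The splitHash parameter stands in for whatever hash of the split is used.

```java
import java.util.List;

// Hypothetical sketch; not the HostAffinitySplitLocationProvider code.
final class SlotProbingSketch {
    /**
     * Picks a host for a split.
     *
     * @param slots     one entry per slot; null marks an inactive (missing) slot
     * @param splitHash assumed hash of the split (for example of path and offset)
     */
    static String pickLocation(List<String> slots, int splitHash) {
        int n = slots.size();
        if (n == 0) {
            throw new IllegalStateException("no slots");
        }
        // Primary slot for this split; floorMod keeps the index non-negative.
        int start = Math.floorMod(splitHash, n);
        // Linear probing: walk forward until an active slot is found.
        for (int i = 0; i < n; i++) {
            String host = slots.get((start + i) % n);
            if (host != null) {
                return host;
            }
        }
        throw new IllegalStateException("no active slots");
    }
}
```

With slots ["a", null, "c"], a split hashing to slot 0 still goes to "a" and one hashing to slot 2 still goes to "c"; only splits hashing to the inactive slot 1 are moved (to "c" here). As the description notes, this does not help when the trailing slots disappear, because the slot count itself changes.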



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
