[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-20 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15507337#comment-15507337
 ] 

Siddharth Seth commented on HIVE-14680:
---

Really think we're better off fixing this within Orc itself - instead of 
working around it in the split generator (which at some point will handle 
different file types). Can the ORC getSplits not deal with this in BI mode?

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.03.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504607#comment-15504607
 ] 

Hive QA commented on HIVE-14680:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12829250/HIVE-14680.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10554 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/jenkins-PreCommit-HIVE-Build/1232/testReport
Console output: 
https://builds.apache.org/job/jenkins-PreCommit-HIVE-Build/1232/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/jenkins-PreCommit-HIVE-Build-1232/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12829250 - jenkins-PreCommit-HIVE-Build

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.03.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504329#comment-15504329
 ] 

Sergey Shelukhin commented on HIVE-14680:
-

One-byte change.

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.03.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504324#comment-15504324
 ] 

Sergey Shelukhin commented on HIVE-14680:
-

[~sseth] I was assuming the normal block/split boundaries were k*large-ish 
power of two, so this would suffice.
Apparently there's no such restriction. +-3 can affect another bit, however if 
we make no assumptions about split boundaries, we cannot tell which way the 3 
goes (e.g. for 31323, we don't know if it has to be consistent with 31320 or 
31326). I guess we can just remove an extra bit.

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-19 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504305#comment-15504305
 ] 

Gopal V commented on HIVE-14680:


The off-by value is usually the file magic for the ORC file "ORC" (3 bytes). 
BISplit will ignore it & do (0+32Mb), the ETLSplit will start at the 1st stripe 
(3+33.99Mb).

This is not expected to happen for any split other than stripe #1 of a file.

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-19 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504287#comment-15504287
 ] 

Siddharth Seth commented on HIVE-14680:
---

bq. As for removing the 2 lowest bits, yes
Let me clarify the question.
Block boundary is 30MB. Split via read-footers generates the split start 30MB + 
3 bytes. Split without reading footer generates the start-offset as 30MB.
Will removing the 2 lower bits provide the same start offset for both splits. 
Otherwise these splits are not consistent, and will not go to the same node. 
30MB is probably a bad example. Will this work in all cases (32MB -2 bytes).

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15498036#comment-15498036
 ] 

Hive QA commented on HIVE-14680:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12828889/HIVE-14680.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10562 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testMetaDataCounts
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1218/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1218/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1218/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12828889 - PreCommit-HIVE-MASTER-Build

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497519#comment-15497519
 ] 

Sergey Shelukhin commented on HIVE-14680:
-

Note that we don't even know where they executed (by definition since we know 
they are inactive due to the fact that the znodes are not there), and we also 
don't know if the slots are empty because of temporary failure or resize. So we 
cannot display anything other than a missing slot.

As for removing the 2 lowest bits, yes :P

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-16 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497505#comment-15497505
 ] 

Siddharth Seth commented on HIVE-14680:
---

+1 for the latest patch.

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-16 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497503#comment-15497503
 ] 

Siddharth Seth commented on HIVE-14680:
---

Displaying them as inactive, or at least showing the expected / requested count 
would be useful.

Does the 4 byte offset fix work, if a split starts at something like 30MB into 
a file.

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497262#comment-15497262
 ] 

Hive QA commented on HIVE-14680:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12828315/HIVE-14680.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10562 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testMetaDataCounts
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1212/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1212/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1212/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12828315 - PreCommit-HIVE-MASTER-Build

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.02.patch, 
> HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497238#comment-15497238
 ] 

Sergey Shelukhin commented on HIVE-14680:
-

Displaying them as "inactive" is actually just flipping the flag from false to 
true in the call to getAllOrdered..., but it looks rather weird to have them 
here for unknown (to the users) reason.
Yes startOffset is rounded to 4 bytes.

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15496829#comment-15496829
 ] 

Hive QA commented on HIVE-14680:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12828315/HIVE-14680.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10562 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.testMetaDataCounts
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1209/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1209/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1209/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12828315 - PreCommit-HIVE-MASTER-Build

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492075#comment-15492075
 ] 

Siddharth Seth commented on HIVE-14680:
---

Mostly looks good.
Slots start at 0, correct?

{code}// Since our probing method is totally bogus, give up after some time and 
return everything.{code}
Think it'll be better to return nothing. That'll cause the scheduler to go 
random. Everything has a good chance of overwhelming a box - at least without 
grouping. With grouping, it may spread out.

Would be good to display the number of expected instances on the LlapWebService 
(it shows the ones which are up and running only). Could probably show the ones 
which have gone down, with the isAlive set to false. Separate jira?

[~gopalv] - any comments on the second level hash function, and how it moves 
between hosts with the (+1 * hash2)

{code}startOffset >> 2{code}
This is unrelated to this patch specifically, but related to the series of 
patches. This is trying to avoid differences in the start position, correct? 
Upto a difference of 4. Will it work for splits that don't start at 0?



> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488746#comment-15488746
 ] 

Hive QA commented on HIVE-14680:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12828315/HIVE-14680.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10547 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats0]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1168/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/1168/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-1168/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12828315 - PreCommit-HIVE-MASTER-Build

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488492#comment-15488492
 ] 

Sergey Shelukhin commented on HIVE-14680:
-

It's a golden-file-style test. It tests that the distributions fall within 
parameters that were determined by running the test, looking at results, and 
deciding that they are ok; then encoding that as acceptable results.

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-13 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488455#comment-15488455
 ] 

Siddharth Seth commented on HIVE-14680:
---

Does the silly test test anything? Otherwise may as well delete it. Maybe add a 
non-silly test.

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.01.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-12 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485111#comment-15485111
 ] 

Sergey Shelukhin commented on HIVE-14680:
-

[~prasanth_j] [~gopalv] [~hagleitn] ping?

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-12 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485118#comment-15485118
 ] 

Gopal V commented on HIVE-14680:


I've been thinking about how this will jump through locality preferences.

{code}
 index = Hashing.consistentHash(hash1 + iter * hash2, locations.length);
{code}

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

2016-09-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15459654#comment-15459654
 ] 

Sergey Shelukhin commented on HIVE-14680:
-

Tested the patch on cluster. When it doesn't hit HIVE-14608, it behaves very 
well... during node restarts, after node restarts, flexing the cluster down and 
back up, and the a higher size from down, too.

> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> ---
>
> Key: HIVE-14680
> URL: https://issues.apache.org/jira/browse/HIVE-14680
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)