[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-15 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15752322#comment-15752322
 ] 

Alan Gates commented on HIVE-15048:
---

Sorry, I missed that the new method updateOutputs was actually the result of 
breaking up analyzeMerge into multiple methods.  I was reading it as a whole 
new method.  Ok, given that:

+1

> Update/Delete statement using wrong WriteEntity when subqueries are involved
> 
>
> Key: HIVE-15048
> URL: https://issues.apache.org/jira/browse/HIVE-15048
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Critical
> Attachments: HIVE-15048.01.patch, HIVE-15048.02.patch, 
> HIVE-15048.03.patch, HIVE-15048.04.patch
>
>
> See TestDbTxnManager2 for referenced methods
> {noformat}
> checkCmdOnDriver(driver.run("create table target (a int, b int) " +
>   "partitioned by (p int, q int) clustered by (a) into 2  buckets " +
>   "stored as orc TBLPROPERTIES ('transactional'='true')"));
> checkCmdOnDriver(driver.run("create table source (a1 int, b1 int, p1 int, q1 int) " +
>   "clustered by (a1) into 2  buckets " +
>   "stored as orc TBLPROPERTIES ('transactional'='true')"));
> checkCmdOnDriver(driver.run("insert into target partition(p,q) values " +
>   "(1,2,1,2), (3,4,1,2), (5,6,1,3), (7,8,2,2)"));
> checkCmdOnDriver(driver.run(
>   "update source set b1 = 1 where p1 in (select t.q from target t where t.p=2)"));
> {noformat}
> The last Update stmt creates the following Entity objects in the QueryPlan:
> inputs: [default@source, default@target, default@target@p=2/q=2]
> outputs: [default@target@p=2/q=2]
> This is clearly wrong for outputs: the table being updated (source) is not 
> even partitioned (or called 'target').
> This happens in UpdateDeleteSemanticAnalyzer.reparseAndSuperAnalyze().
> I suspect that a query of the form
> update T ... where T.p IN (select d from T where ...) 
> would also get messed up (but not necessarily fail) if T is partitioned and 
> the subquery filters out some partitions, because that does not mean that the 
> same partitions are filtered out in the parent query.
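
The mismatch in the plan quoted above can be stated as a simple invariant: every 
output entity should belong to the table the statement writes to. Below is a 
minimal Java sketch of that check. It works only on the entity names as they are 
printed in the description (db@table or db@table@partspec); the class and method 
names are hypothetical and are not part of Hive.

```java
import java.util.List;

public class OutputEntityCheck {
    // Returns the "db@table" prefix of an entity name such as
    // "default@target@p=2/q=2"; a table-level name is returned unchanged.
    static String tableOf(String entityName) {
        int first = entityName.indexOf('@');
        int second = entityName.indexOf('@', first + 1);
        return second < 0 ? entityName : entityName.substring(0, second);
    }

    // True only if every output entity belongs to the table being written.
    static boolean outputsMatchTarget(String writtenTable, List<String> outputs) {
        return outputs.stream().allMatch(o -> tableOf(o).equals(writtenTable));
    }

    public static void main(String[] args) {
        // Outputs reported in the description for "update source ...".
        List<String> reported = List.of("default@target@p=2/q=2");
        System.out.println(outputsMatchTarget("default@source", reported));

        // An output set that does satisfy the invariant.
        System.out.println(
            outputsMatchTarget("default@source", List.of("default@source")));
    }
}
```

Run as written, this prints false for the reported outputs and true for the 
second case.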



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736240#comment-15736240
 ] 

Eugene Koifman commented on HIVE-15048:
---

WRT dynamic partitioning, that is also not new.  Update/delete statements have 
always run with dyn part regardless of what WriteEntity objects are there.
we.setDynamicPartitionWrite(original.isDynamicPartitionWrite()); just makes the 
lock management logic aware of it.



[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736227#comment-15736227
 ] 

Eugene Koifman commented on HIVE-15048:
---

That is not what it does.  The code removes the table WriteEntity for the 
target table and replaces it with some number of partition WriteEntity objects 
for that table.
So conceptually it does the same thing as before.

If you look at the new .q.out, the output shows the set of inputs/outputs that 
it ends up with (not clearly highlighted, but they are there).
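
To illustrate the replacement described above, here is a minimal Java sketch. 
It is not the patch: Entity is a hypothetical stand-in for Hive's entity 
classes, and rewriteOutputs is an invented name. It also assumes that only input 
partitions belonging to the table being written are promoted to outputs, and 
that the table-level entity is kept when there are none (both are assumptions 
drawn from the bug description, not from the actual code).

```java
import java.util.ArrayList;
import java.util.List;

public class RewriteOutputsSketch {
    // Hypothetical stand-in for a plan entity; NOT Hive's WriteEntity class.
    static final class Entity {
        final String table;       // e.g. "default@target"
        final String partition;   // e.g. "p=2/q=2", or null for a table-level entity
        final boolean dynamicPartitionWrite;

        Entity(String table, String partition, boolean dynamicPartitionWrite) {
            this.table = table;
            this.partition = partition;
            this.dynamicPartitionWrite = dynamicPartitionWrite;
        }

        @Override
        public String toString() {
            return partition == null ? table : table + "@" + partition;
        }
    }

    static List<Entity> rewriteOutputs(String updatedTable,
                                       List<Entity> inputs,
                                       List<Entity> outputs) {
        List<Entity> result = new ArrayList<>();
        Entity original = null;
        for (Entity out : outputs) {
            if (out.partition == null && out.table.equals(updatedTable)) {
                original = out;            // drop the table-level entity, remember it
            } else {
                result.add(out);
            }
        }
        if (original == null) {
            return result;                 // nothing to replace
        }
        boolean added = false;
        for (Entity in : inputs) {
            // Only partitions of the table being written become outputs;
            // partitions read by a subquery on another table are ignored.
            if (in.partition != null && in.table.equals(updatedTable)) {
                // Carry the flag over, mirroring
                // we.setDynamicPartitionWrite(original.isDynamicPartitionWrite()).
                result.add(new Entity(in.table, in.partition,
                                      original.dynamicPartitionWrite));
                added = true;
            }
        }
        if (!added) {
            result.add(original);          // assumption: keep the table-level entity
        }
        return result;
    }

    public static void main(String[] args) {
        // Inputs from the issue description; the statement updates "source".
        List<Entity> inputs = List.of(
            new Entity("default@source", null, false),
            new Entity("default@target", null, false),
            new Entity("default@target", "p=2/q=2", false));
        List<Entity> outputs = List.of(new Entity("default@source", null, true));
        System.out.println(rewriteOutputs("default@source", inputs, outputs));
    }
}
```

With the inputs from the issue description this prints [default@source], 
because the only partition among the inputs belongs to target, not to the table 
being updated.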



[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-09 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736090#comment-15736090
 ] 

Alan Gates commented on HIVE-15048:
---

I'm not sure I understand the change here.  The previous code looks like it was 
trying to avoid locking the whole table by figuring out which partitions would 
be read and only locking those partitions.  It looks like this goes wrong when 
there's a subquery involved, but in general should be sound.  If I understand 
your changes, you're just moving it to always use dynamic partitioning.  But 
that locks the whole table, which we don't want.



[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-08 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733862#comment-15733862
 ] 

Eugene Koifman commented on HIVE-15048:
---

Failures are not related.

[~alangates], could you review, please?



[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733846#comment-15733846
 ] 

Hive QA commented on HIVE-15048:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842444/HIVE-15048.04.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10785 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2504/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2504/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2504/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842444 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732720#comment-15732720
 ] 

Hive QA commented on HIVE-15048:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842365/HIVE-15048.01.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10783 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=92)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join_filters] (batchId=118)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join0] (batchId=118)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join37] (batchId=118)
org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite (batchId=185)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2491/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2491/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2491/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842365 - PreCommit-HIVE-Build
