[jira] [Commented] (HIVE-22107) Correlated subquery producing wrong schema

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909588#comment-16909588
 ] 

Hive QA commented on HIVE-22107:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12977839/HIVE-22107.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 16742 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_notexists] 
(batchId=99)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[external_jdbc_table_perf]
 (batchId=184)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi]
 (batchId=165)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=130)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query10] 
(batchId=296)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query16] 
(batchId=296)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query35] 
(batchId=296)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query69] 
(batchId=296)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query94] 
(batchId=296)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query10]
 (batchId=296)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query16]
 (batchId=296)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query35]
 (batchId=296)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query69]
 (batchId=296)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query94]
 (batchId=296)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18361/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18361/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18361/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12977839 - PreCommit-HIVE-Build

> Correlated subquery producing wrong schema
> --
>
> Key: HIVE-22107
> URL: https://issues.apache.org/jira/browse/HIVE-22107
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22107.1.patch, HIVE-22107.2.patch, 
> HIVE-22107.3.patch, HIVE-22107.4.patch
>
>
> *Repro*
> {code:sql}
> create table test(id int, name string,dept string);
> insert into test values(1,'a','it'),(2,'b','eee'),(NULL, 'c', 'cse');
> select distinct 'empno' as eid, a.id from test a where NOT EXISTS (select 
> c.id from test c where a.id=c.id);
> {code}
> {code}
> +-------+--------+
> |  eid  |  a.id  |
> +-------+--------+
> | NULL  | empno  |
> +-------+--------+
> {code}
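For reference, the expected result of the repro can be reasoned out without Hive: under SQL three-valued logic, `a.id = c.id` is never true when `a.id` is NULL, so only the NULL row survives NOT EXISTS, and the projected row should be (`empno`, NULL). The output above has those two columns swapped, which is the schema bug. A minimal plain-Java sketch of that evaluation (class and method names are illustrative, not Hive code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NotExistsRepro {
    // Rows surviving: select distinct 'empno' as eid, a.id from test a
    // where NOT EXISTS (select c.id from test c where a.id = c.id)
    static List<String[]> expectedRows(List<Integer> ids) {
        List<String[]> rows = new ArrayList<>();
        for (Integer a : ids) {
            // A NULL a.id never satisfies a.id = c.id, so only the NULL row
            // fails the EXISTS check and therefore passes NOT EXISTS.
            boolean exists = a != null && ids.contains(a);
            if (!exists) {
                rows.add(new String[] {"empno", a == null ? "NULL" : a.toString()});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : expectedRows(Arrays.asList(1, 2, null))) {
            System.out.println(row[0] + " | " + row[1]); // prints: empno | NULL
        }
    }
}
```

The expected row is (`empno`, NULL); the output in the report is (`NULL`, `empno`), i.e. the select-list schema comes out reversed.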



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296729&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296729
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 17/Aug/19 04:57
Start Date: 17/Aug/19 04:57
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #742: HIVE-22068 : 
Add more logging to notification cleaner and replication to track events
URL: https://github.com/apache/hive/pull/742#discussion_r314934595
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
 ##
 @@ -522,6 +525,41 @@ private int executeIncrementalLoad(DriverContext 
driverContext) {
   // bootstrap of tables if exist.
   if (builder.hasMoreWork() || work.getPathsToCopyIterator().hasNext() || 
work.hasBootstrapLoadTasks()) {
 DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
+  } else {
+// Nothing to be done for repl load now. Add a task to update the 
last.repl.id of the
+// target database to the event id of the last event considered by the 
dump. Next
+// incremental cycle if starts from this id, the events considered for 
this dump, won't
+// be considered again.
+
+// The name of the database to be loaded into is either specified 
directly or is
+// available from the dump metadata.
+String dbName = work.dbNameToLoadIn;
+if (dbName == null || StringUtils.isNotBlank(dbName)) {
 
 Review comment:
   Should use StringUtils.isBlank(dbName). 
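For context on the suggestion: `StringUtils.isBlank` already treats null and whitespace-only strings the same way, so the quoted guard `dbName == null || StringUtils.isNotBlank(dbName)` is true for almost every input (it only rejects a non-null blank string), which is unlikely to be the intended check. A sketch of the semantics, using a plain-JDK stand-in for the commons-lang method so the snippet runs without that dependency (the stand-in is an assumption, not the library source):

```java
public class BlankCheck {
    // Stand-in approximating org.apache.commons.lang3.StringUtils.isBlank
    // (assumption: trim() is close enough to its per-char whitespace check here)
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBlank(null));    // true -- null needs no separate check
        System.out.println(isBlank(""));      // true
        System.out.println(isBlank("   "));   // true
        System.out.println(isBlank("repl"));  // false
        // The reviewed condition, by contrast, is true both for null and for any
        // non-blank name -- effectively the opposite of a "missing name" guard:
        String dbName = "repl";
        System.out.println(dbName == null || !isBlank(dbName)); // prints: true
    }
}
```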
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296729)
Time Spent: 1h 40m  (was: 1.5h)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch, HIVE-22068.05.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In repl load, update the status of the target database to the last event dumped 
> so that repl status returns that and the next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296730
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 17/Aug/19 04:57
Start Date: 17/Aug/19 04:57
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #742: HIVE-22068 : 
Add more logging to notification cleaner and replication to track events
URL: https://github.com/apache/hive/pull/742#discussion_r314934615
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
 ##
 @@ -522,6 +525,41 @@ private int executeIncrementalLoad(DriverContext 
driverContext) {
   // bootstrap of tables if exist.
   if (builder.hasMoreWork() || work.getPathsToCopyIterator().hasNext() || 
work.hasBootstrapLoadTasks()) {
 DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
+  } else {
+// Nothing to be done for repl load now. Add a task to update the 
last.repl.id of the
+// target database to the event id of the last event considered by the 
dump. Next
+// incremental cycle if starts from this id, the events considered for 
this dump, won't
+// be considered again.
+
+// The name of the database to be loaded into is either specified 
directly or is
+// available from the dump metadata.
+String dbName = work.dbNameToLoadIn;
+if (dbName == null || StringUtils.isNotBlank(dbName)) {
+  if (work.currentReplScope != null) {
 
 Review comment:
   Add a comment about in which scenario we hit this case.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296730)
Time Spent: 1h 50m  (was: 1h 40m)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch, HIVE-22068.05.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> In repl load, update the status of the target database to the last event dumped 
> so that repl status returns that and the next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296728&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296728
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 17/Aug/19 04:54
Start Date: 17/Aug/19 04:54
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #742: HIVE-22068 : 
Add more logging to notification cleaner and replication to track events
URL: https://github.com/apache/hive/pull/742#discussion_r314934579
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
 ##
 @@ -522,6 +525,25 @@ private int executeIncrementalLoad(DriverContext 
driverContext) {
   // bootstrap of tables if exist.
   if (builder.hasMoreWork() || work.getPathsToCopyIterator().hasNext() || 
work.hasBootstrapLoadTasks()) {
 DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
+  } else if (work.dbNameToLoadIn != null) {
+// Nothing to be done for repl load now. Add a task to update the 
last.repl.id of the
+// target database to the event id of the last event considered by the 
dump. Next
+// incremental cycle if starts from this id, the events considered for 
this dump, won't
+// be considered again. If we are replicating to multiple databases at 
a time, it's not
+// possible to know which all databases we are replicating into and 
hence we can not
+// update repl id in all those databases.
+String lastEventid = builder.eventTo().toString();
 
 Review comment:
   OK
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296728)
Time Spent: 1.5h  (was: 1h 20m)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch, HIVE-22068.05.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In repl load, update the status of the target database to the last event dumped 
> so that repl status returns that and the next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296727
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 17/Aug/19 04:53
Start Date: 17/Aug/19 04:53
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #742: HIVE-22068 : 
Add more logging to notification cleaner and replication to track events
URL: https://github.com/apache/hive/pull/742#discussion_r314934575
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -750,6 +766,38 @@ public Table apply(@Nullable Table table) {
 .verifyResults(Arrays.asList("1", "2"));
   }
 
+  @Test
+  public void testIncrementalDumpEmptyDumpDirectory() throws Throwable {
 
 Review comment:
   We cannot reproduce this scenario with ACID enabled. With ACID enabled, every 
incremental dump will contain at least an open/commit txn event, so an empty 
incremental dump is impossible. Please try this scenario with ACID disabled. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296727)
Time Spent: 1h 20m  (was: 1h 10m)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch, HIVE-22068.05.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In repl load, update the status of the target database to the last event dumped 
> so that repl status returns that and the next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22107) Correlated subquery producing wrong schema

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909577#comment-16909577
 ] 

Hive QA commented on HIVE-22107:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
54s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
42s{color} | {color:red} ql: The patch generated 1 new + 21 unchanged - 0 fixed 
= 22 total (was 21) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18361/dev-support/hive-personality.sh
 |
| git revision | master / 3934de0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18361/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18361/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Correlated subquery producing wrong schema
> --
>
> Key: HIVE-22107
> URL: https://issues.apache.org/jira/browse/HIVE-22107
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22107.1.patch, HIVE-22107.2.patch, 
> HIVE-22107.3.patch, HIVE-22107.4.patch
>
>
> *Repro*
> {code:sql}
> create table test(id int, name string,dept string);
> insert into test values(1,'a','it'),(2,'b','eee'),(NULL, 'c', 'cse');
> select distinct 'empno' as eid, a.id from test a where NOT EXISTS (select 
> c.id from test c where a.id=c.id);
> {code}
> {code}
> +-------+--------+
> |  eid  |  a.id  |
> +-------+--------+
> | NULL  | empno  |
> +-------+--------+
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909557#comment-16909557
 ] 

Hive QA commented on HIVE-22068:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12977811/HIVE-22068.05.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16744 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18360/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18360/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18360/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12977811 - PreCommit-HIVE-Build

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch, HIVE-22068.05.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In repl load, update the status of the target database to the last event dumped 
> so that repl status returns that and the next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909548#comment-16909548
 ] 

Hive QA commented on HIVE-22068:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
22s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
10s{color} | {color:blue} standalone-metastore/metastore-server in master has 
181 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
6s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
39s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
39s{color} | {color:red} ql: The patch generated 2 new + 25 unchanged - 0 fixed 
= 27 total (was 25) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
14s{color} | {color:red} ql generated 3 new + 2251 unchanged - 0 fixed = 2254 
total (was 2251) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 36m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Unread field:IncrementalDumpBegin.java:[line 52] |
|  |  Unread field:IncrementalDumpBegin.java:[line 54] |
|  |  Unread field:IncrementalDumpBegin.java:[line 53] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18360/dev-support/hive-personality.sh
 |
| git revision | master / 3934de0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18360/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18360/yetus/new-findbugs-ql.html
 |
| modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18360/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: 

[jira] [Commented] (HIVE-22099) Several date related UDFs can't handle Julian dates properly since HIVE-20007

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909526#comment-16909526
 ] 

Hive QA commented on HIVE-22099:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12977801/HIVE-22099.4.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18359/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18359/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18359/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12977801/HIVE-22099.4.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12977801 - PreCommit-HIVE-Build

> Several date related UDFs can't handle Julian dates properly since HIVE-20007
> -
>
> Key: HIVE-22099
> URL: https://issues.apache.org/jira/browse/HIVE-22099
> Project: Hive
>  Issue Type: Bug
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-22099.0.patch, HIVE-22099.1.patch, 
> HIVE-22099.2.patch, HIVE-22099.3.patch, HIVE-22099.4.patch
>
>
> Currently dates that belong to the Julian calendar (before Oct 15, 1582) are 
> handled improperly by date/timestamp UDFs.
> E.g. DateFormat UDF:
> Although the dates are in the Julian calendar, the formatter insists on printing 
> them according to the Gregorian calendar, causing multiple days of difference in 
> some cases:
>  
> {code:java}
> beeline> select date_format('1001-01-05','dd---MM--');
> +----------------+
> |      _c0       |
> +----------------+
> | 30---12--1000  |
> +----------------+{code}
>  I've observed similar problems in the following UDFs:
>  * add_months
>  * date_format
>  * day
>  * month
>  * months_between
>  * weekofyear
>  * year
>  
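The drift described above can be reproduced with the JDK alone: `java.time` works in the proleptic Gregorian calendar, while the legacy `GregorianCalendar` underneath `SimpleDateFormat` switches to the Julian calendar for instants before Oct 15, 1582, and around the year 1001 the two calendars disagree by six days. A minimal sketch of the mismatch (class name is illustrative; this demonstrates the JDK calendar behavior, not Hive's internals):

```java
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.util.Date;
import java.util.TimeZone;

public class JulianDrift {
    public static void main(String[] args) {
        // Midnight UTC of 1001-01-05 in the proleptic Gregorian calendar
        LocalDate gregorian = LocalDate.of(1001, 1, 5);
        Date instant = new Date(gregorian.toEpochDay() * 86_400_000L);

        // SimpleDateFormat renders pre-1582 instants using the Julian calendar
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        System.out.println(fmt.format(instant)); // prints: 1000-12-30
    }
}
```

Six days behind the input date, matching the `30---12--1000` output in the ticket.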



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-22063) Ranger Authorization in Hive based on object ownership - HMS code path

2019-08-16 Thread Sam An (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam An updated HIVE-22063:
--
Attachment: HIVE-22063.8.patch

> Ranger Authorization in Hive based on object ownership - HMS code path
> --
>
> Key: HIVE-22063
> URL: https://issues.apache.org/jira/browse/HIVE-22063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22063.1.patch, HIVE-22063.2.patch, 
> HIVE-22063.3.patch, HIVE-22063.4.patch, HIVE-22063.5.patch, 
> HIVE-22063.6.patch, HIVE-22063.7.patch, HIVE-22063.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This takes care of adding the owner and ownertype in the HMS code path



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22099) Several date related UDFs can't handle Julian dates properly since HIVE-20007

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909444#comment-16909444
 ] 

Hive QA commented on HIVE-22099:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12977801/HIVE-22099.4.patch

{color:green}SUCCESS:{color} +1 due to 8 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16744 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18358/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18358/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18358/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12977801 - PreCommit-HIVE-Build

> Several date related UDFs can't handle Julian dates properly since HIVE-20007
> -
>
> Key: HIVE-22099
> URL: https://issues.apache.org/jira/browse/HIVE-22099
> Project: Hive
>  Issue Type: Bug
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-22099.0.patch, HIVE-22099.1.patch, 
> HIVE-22099.2.patch, HIVE-22099.3.patch, HIVE-22099.4.patch
>
>
> Currently, dates that belong to the Julian calendar (before Oct 15, 1582) are 
> handled improperly by date/timestamp UDFs.
> E.g. the DateFormat UDF:
> Although the dates are in the Julian calendar, the formatter insists on 
> printing them according to the Gregorian calendar, causing a difference of 
> multiple days in some cases:
>  
> {code:java}
> beeline> select date_format('1001-01-05','dd---MM--');
> +----------------+
> |      _c0       |
> +----------------+
> | 30---12--1000  |
> +----------------+{code}
>  I've observed similar problems in the following UDFs:
>  * add_months
>  * date_format
>  * day
>  * month
>  * months_between
>  * weekofyear
>  * year
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-22120) Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions

2019-08-16 Thread Ramesh Kumar Thangarajan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22120:

Attachment: HIVE-22120.2.patch
Status: Patch Available  (was: Open)

> Fix wrong results/ArrayOutOfBound exception in left outer map joins on 
> specific boundary conditions
> ---
>
> Key: HIVE-22120
> URL: https://issues.apache.org/jira/browse/HIVE-22120
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, llap, Vectorization
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22120.1.patch, HIVE-22120.2.patch
>
>
> Vectorized version of left outer map join produces wrong results or 
> encounters ArrayOutOfBound exception.
> The boundary conditions are:
>  * The complete batch of the big table should have the join key repeated for 
> all the join columns.
>  * The complete batch of the big table should not have a matched key 
> value in the small table
>  * The repeated value should not be a null value
>  * Some rows should be filtered out as part of the on clause filter.
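A much-simplified sketch of the repeated-key case these conditions describe (illustrative only; names and structure are invented and this is not Hive's vectorized join code): when a batch's key column is "repeated", one small-table lookup serves the whole batch, and on a miss every selected row must still be emitted with a NULL small-table side to preserve left-outer semantics:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class RepeatedKeyLeftOuterJoinSketch {
    // keys: the key column of the big-table batch; keyRepeated: the whole
    // batch shares keys[0]; selected: row indices that survived the
    // ON-clause filter; smallTable: the build side of the map join.
    // Left-outer semantics: every selected big-table row is emitted,
    // paired with null when its key has no match.
    static List<String> joinBatch(int[] keys, boolean keyRepeated,
                                  int[] selected, Map<Integer, String> smallTable) {
        List<String> out = new ArrayList<>();
        for (int row : selected) {
            int key = keyRepeated ? keys[0] : keys[row];
            String match = smallTable.get(key);  // null on a miss
            out.add(key + "|" + match);
        }
        return out;
    }

    public static void main(String[] args) {
        // The boundary case from the report: a repeated non-null key with no
        // match in the small table, and some rows filtered out by the ON clause.
        Map<Integer, String> small = Map.of(1, "a", 2, "b");
        System.out.println(joinBatch(new int[] {7, 7, 7, 7}, true,
                                     new int[] {0, 2}, small));
        // prints [7|null, 7|null]
    }
}
```

The bug concerns the vectorized implementation losing rows (or indexing past the batch) in exactly this repeated-key, all-miss, partially-filtered configuration.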



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-22120) Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions

2019-08-16 Thread Ramesh Kumar Thangarajan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22120:

Status: Open  (was: Patch Available)

> Fix wrong results/ArrayOutOfBound exception in left outer map joins on 
> specific boundary conditions
> ---
>
> Key: HIVE-22120
> URL: https://issues.apache.org/jira/browse/HIVE-22120
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, llap, Vectorization
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22120.1.patch, HIVE-22120.2.patch
>
>
> Vectorized version of left outer map join produces wrong results or 
> encounters ArrayOutOfBound exception.
> The boundary conditions are:
>  * The complete batch of the big table should have the join key repeated for 
> all the join columns.
>  * The complete batch of the big table should not have a matched key 
> value in the small table
>  * The repeated value should not be a null value
>  * Some rows should be filtered out as part of the on clause filter.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22099) Several date related UDFs can't handle Julian dates properly since HIVE-20007

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909419#comment-16909419
 ] 

Hive QA commented on HIVE-22099:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
7s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
41s{color} | {color:red} ql: The patch generated 1 new + 206 unchanged - 3 
fixed = 207 total (was 209) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18358/dev-support/hive-personality.sh
 |
| git revision | master / 3934de0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18358/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18358/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Several date related UDFs can't handle Julian dates properly since HIVE-20007
> -
>
> Key: HIVE-22099
> URL: https://issues.apache.org/jira/browse/HIVE-22099
> Project: Hive
>  Issue Type: Bug
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-22099.0.patch, HIVE-22099.1.patch, 
> HIVE-22099.2.patch, HIVE-22099.3.patch, HIVE-22099.4.patch
>
>
> Currently, dates that belong to the Julian calendar (before Oct 15, 1582) are 
> handled improperly by date/timestamp UDFs.
> E.g. the DateFormat UDF:
> Although the dates are in the Julian calendar, the formatter insists on 
> printing them according to the Gregorian calendar, causing a difference of 
> multiple days in some cases:
>  
> {code:java}
> beeline> select date_format('1001-01-05','dd---MM--');
> +----------------+
> |      _c0       |
> +----------------+
> | 30---12--1000  |
> +----------------+{code}
>  I've observed similar problems in the following UDFs:
>  * add_months
>  * date_format
>  * day
>  * month
>  * months_between
>  * weekofyear
>  * year
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-22107) Correlated subquery producing wrong schema

2019-08-16 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-22107:
---
Status: Open  (was: Patch Available)

> Correlated subquery producing wrong schema
> --
>
> Key: HIVE-22107
> URL: https://issues.apache.org/jira/browse/HIVE-22107
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22107.1.patch, HIVE-22107.2.patch, 
> HIVE-22107.3.patch, HIVE-22107.4.patch
>
>
> *Repro*
> {code:sql}
> create table test(id int, name string,dept string);
> insert into test values(1,'a','it'),(2,'b','eee'),(NULL, 'c', 'cse');
> select distinct 'empno' as eid, a.id from test a where NOT EXISTS (select 
> c.id from test c where a.id=c.id);
> {code}
> {code}
> +-------+--------+
> |  eid  |  a.id  |
> +-------+--------+
> | NULL  | empno  |
> +-------+--------+
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-22107) Correlated subquery producing wrong schema

2019-08-16 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-22107:
---
Attachment: HIVE-22107.4.patch

> Correlated subquery producing wrong schema
> --
>
> Key: HIVE-22107
> URL: https://issues.apache.org/jira/browse/HIVE-22107
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22107.1.patch, HIVE-22107.2.patch, 
> HIVE-22107.3.patch, HIVE-22107.4.patch
>
>
> *Repro*
> {code:sql}
> create table test(id int, name string,dept string);
> insert into test values(1,'a','it'),(2,'b','eee'),(NULL, 'c', 'cse');
> select distinct 'empno' as eid, a.id from test a where NOT EXISTS (select 
> c.id from test c where a.id=c.id);
> {code}
> {code}
> +-------+--------+
> |  eid  |  a.id  |
> +-------+--------+
> | NULL  | empno  |
> +-------+--------+
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-22107) Correlated subquery producing wrong schema

2019-08-16 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-22107:
---
Status: Patch Available  (was: Open)

> Correlated subquery producing wrong schema
> --
>
> Key: HIVE-22107
> URL: https://issues.apache.org/jira/browse/HIVE-22107
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22107.1.patch, HIVE-22107.2.patch, 
> HIVE-22107.3.patch, HIVE-22107.4.patch
>
>
> *Repro*
> {code:sql}
> create table test(id int, name string,dept string);
> insert into test values(1,'a','it'),(2,'b','eee'),(NULL, 'c', 'cse');
> select distinct 'empno' as eid, a.id from test a where NOT EXISTS (select 
> c.id from test c where a.id=c.id);
> {code}
> {code}
> +-------+--------+
> |  eid  |  a.id  |
> +-------+--------+
> | NULL  | empno  |
> +-------+--------+
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (HIVE-22125) Move to Kafka 2.3 Clients

2019-08-16 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra reassigned HIVE-22125:
-


> Move to Kafka 2.3 Clients
> -
>
> Key: HIVE-22125
> URL: https://issues.apache.org/jira/browse/HIVE-22125
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (HIVE-22124) Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions with limit

2019-08-16 Thread Ramesh Kumar Thangarajan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-22124:
---


> Fix wrong results/ArrayOutOfBound exception in left outer map joins on 
> specific boundary conditions with limit
> --
>
> Key: HIVE-22124
> URL: https://issues.apache.org/jira/browse/HIVE-22124
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, llap, Vectorization
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> This is an extension of bug HIVE-22120. All the boundary conditions 
> mentioned in HIVE-22120 also apply here, plus the query needs to have a limit.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work started] (HIVE-22124) Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions with limit

2019-08-16 Thread Ramesh Kumar Thangarajan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-22124 started by Ramesh Kumar Thangarajan.
---
> Fix wrong results/ArrayOutOfBound exception in left outer map joins on 
> specific boundary conditions with limit
> --
>
> Key: HIVE-22124
> URL: https://issues.apache.org/jira/browse/HIVE-22124
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, llap, Vectorization
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> This is an extension of bug HIVE-22120. All the boundary conditions 
> mentioned in HIVE-22120 also apply here, plus the query needs to have a limit.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=296426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296426
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 16:56
Start Date: 16/Aug/19 16:56
Worklog Time Spent: 10m 
  Work Description: b-slim commented on pull request #723: [HIVE-20683] Add 
the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r314805297
 
 

 ##
 File path: ql/src/test/results/clientpositive/druid/druidmini_expressions.q.out
 ##
 @@ -1868,9 +1868,9 @@ POSTHOOK: query: SELECT DATE_ADD(cast(`__time` as date), 
CAST((cdouble / 1000) A
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@druid_table_alltypesorc
 POSTHOOK: Output: hdfs://### HDFS PATH ###
-1969-02-26 1970-11-04
-1969-03-19 1970-10-14
-1969-11-13 1970-02-17
+1969-12-15 1970-01-16
+1969-12-15 1970-01-16
+1969-12-15 1970-01-16
 
 Review comment:
   Same here; seems like a timezone issue.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296426)
Time Spent: 1h 10m  (was: 1h)

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, 
> HIVE-20683.3.patch, HIVE-20683.4.patch, HIVE-20683.5.patch, 
> HIVE-20683.6.patch, HIVE-20683.8.patch, HIVE-20683.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> For optimizing joins, Hive generates BETWEEN filter with min-max and BLOOM 
> filter for filtering one side of semi-join.
> Druid 0.13.0 will have support for Bloom filters (Added via 
> https://github.com/apache/incubator-druid/pull/6222)
> Implementation details - 
> # Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
> # DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
> # During execution phase, before sending the query to druid in 
> DruidQueryBasedRecordReader we will deserialize this filter, translate it 
> into a DruidDimFilter and add it to existing DruidQuery.  Tez executor 
> already ensures that when we start reading results from the record reader, 
> all the dynamic values are initialized. 
> # Explaining a druid query also prints the query sent to druid as 
> {{druid.json.query}}. We also need to make sure to update the druid query 
> with the filters. During explain we do not have the actual values for the 
> dynamic values, so instead of values we will print the dynamic expression 
> itself as part of druid query. 
> Note:- This work needs druid to be updated to version 0.13.0
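The semi-join reduction described in the steps above can be sketched in a few lines (illustrative only; the class and method names are hypothetical, and a plain `HashSet` stands in for a real probabilistic Bloom filter, which would allow false positives but never false negatives): the build side of the join yields a min/max pair for the BETWEEN filter plus a membership structure, and both are applied to the probe side before the query is sent:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntPredicate;

public class SemiJoinReductionSketch {
    // Build a runtime filter from the small (build) side of a join:
    // a BETWEEN predicate from the min/max of the keys, plus a membership
    // test standing in for the Bloom filter.
    static IntPredicate buildRuntimeFilter(int[] buildKeys) {
        int min = Arrays.stream(buildKeys).min().getAsInt();
        int max = Arrays.stream(buildKeys).max().getAsInt();
        Set<Integer> members = new HashSet<>();
        for (int k : buildKeys) members.add(k);
        return k -> k >= min && k <= max && members.contains(k);
    }

    public static void main(String[] args) {
        IntPredicate filter = buildRuntimeFilter(new int[] {10, 12, 15});
        // Probe-side rows that survive the pushed-down filter:
        for (int k : new int[] {1, 10, 11, 15, 99}) {
            if (filter.test(k)) System.out.println(k); // prints 10 then 15
        }
    }
}
```

In the patch, the analogous filters are deserialized from the TableScan's filterExpr and translated into Druid DimFilters so the reduction happens inside Druid rather than in Hive.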



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=296425&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296425
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 16:55
Start Date: 16/Aug/19 16:55
Worklog Time Spent: 10m 
  Work Description: b-slim commented on pull request #723: [HIVE-20683] Add 
the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r314805115
 
 

 ##
 File path: ql/src/test/results/clientpositive/druid/druidmini_expressions.q.out
 ##
 @@ -1868,9 +1868,9 @@ POSTHOOK: query: SELECT DATE_ADD(cast(`__time` as date), 
CAST((cdouble / 1000) A
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@druid_table_alltypesorc
 POSTHOOK: Output: hdfs://### HDFS PATH ###
-1969-02-26 1970-11-04
 
 Review comment:
   Those changes do not seem correct and might have some issue with timezones.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296425)
Time Spent: 1h  (was: 50m)

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, 
> HIVE-20683.3.patch, HIVE-20683.4.patch, HIVE-20683.5.patch, 
> HIVE-20683.6.patch, HIVE-20683.8.patch, HIVE-20683.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> For optimizing joins, Hive generates BETWEEN filter with min-max and BLOOM 
> filter for filtering one side of semi-join.
> Druid 0.13.0 will have support for Bloom filters (Added via 
> https://github.com/apache/incubator-druid/pull/6222)
> Implementation details - 
> # Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
> # DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
> # During execution phase, before sending the query to druid in 
> DruidQueryBasedRecordReader we will deserialize this filter, translate it 
> into a DruidDimFilter and add it to existing DruidQuery.  Tez executor 
> already ensures that when we start reading results from the record reader, 
> all the dynamic values are initialized. 
> # Explaining a druid query also prints the query sent to druid as 
> {{druid.json.query}}. We also need to make sure to update the druid query 
> with the filters. During explain we do not have the actual values for the 
> dynamic values, so instead of values we will print the dynamic expression 
> itself as part of druid query. 
> Note:- This work needs druid to be updated to version 0.13.0



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=296421&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296421
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 16:51
Start Date: 16/Aug/19 16:51
Worklog Time Spent: 10m 
  Work Description: b-slim commented on pull request #723: [HIVE-20683] Add 
the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r314803659
 
 

 ##
 File path: ql/src/test/results/clientpositive/druid/druidmini_joins.q.out
 ##
 @@ -223,8 +228,8 @@ GROUP BY `tbl1`.`username`
 POSTHOOK: type: QUERY
 
 Review comment:
   Can we add an ORDER BY here to avoid test flakiness?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296421)
Time Spent: 50m  (was: 40m)

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, 
> HIVE-20683.3.patch, HIVE-20683.4.patch, HIVE-20683.5.patch, 
> HIVE-20683.6.patch, HIVE-20683.8.patch, HIVE-20683.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> For optimizing joins, Hive generates BETWEEN filter with min-max and BLOOM 
> filter for filtering one side of semi-join.
> Druid 0.13.0 will have support for Bloom filters (Added via 
> https://github.com/apache/incubator-druid/pull/6222)
> Implementation details - 
> # Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
> # DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
> # During execution phase, before sending the query to druid in 
> DruidQueryBasedRecordReader we will deserialize this filter, translate it 
> into a DruidDimFilter and add it to existing DruidQuery.  Tez executor 
> already ensures that when we start reading results from the record reader, 
> all the dynamic values are initialized. 
> # Explaining a druid query also prints the query sent to druid as 
> {{druid.json.query}}. We also need to make sure to update the druid query 
> with the filters. During explain we do not have the actual values for the 
> dynamic values, so instead of values we will print the dynamic expression 
> itself as part of druid query. 
> Note:- This work needs druid to be updated to version 0.13.0



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=296416&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296416
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 16:46
Start Date: 16/Aug/19 16:46
Worklog Time Spent: 10m 
  Work Description: b-slim commented on pull request #723: [HIVE-20683] Add 
the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r314801856
 
 

 ##
 File path: 
druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java
 ##
 @@ -894,4 +944,257 @@ public static IndexSpec getIndexSpec(Configuration jc) {
  ImmutableList<AggregatorFactory> aggregatorFactories = aggregatorFactoryBuilder.build();
 return Pair.of(dimensions, aggregatorFactories.toArray(new 
AggregatorFactory[0]));
   }
+
+  // Druid only supports String,Long,Float,Double selectors
+  private static Set<PrimitiveTypeInfo> druidSupportedTypeInfos = ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo, TypeInfoFactory.charTypeInfo,
+  TypeInfoFactory.varcharTypeInfo, TypeInfoFactory.byteTypeInfo,
+  TypeInfoFactory.intTypeInfo, TypeInfoFactory.longTypeInfo,
+  TypeInfoFactory.shortTypeInfo, TypeInfoFactory.doubleTypeInfo
+  );
+
+  private static Set<PrimitiveTypeInfo> stringTypeInfos = ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo,
+  TypeInfoFactory.charTypeInfo, TypeInfoFactory.varcharTypeInfo
+  );
+
+
+  public static org.apache.druid.query.Query 
addDynamicFilters(org.apache.druid.query.Query query,
 
 Review comment:
   What do you think about extracting all those utils into a new class called 
`DruidExpressionsUtils`?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296416)
Time Spent: 40m  (was: 0.5h)

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, 
> HIVE-20683.3.patch, HIVE-20683.4.patch, HIVE-20683.5.patch, 
> HIVE-20683.6.patch, HIVE-20683.8.patch, HIVE-20683.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For optimizing joins, Hive generates BETWEEN filter with min-max and BLOOM 
> filter for filtering one side of semi-join.
> Druid 0.13.0 will have support for Bloom filters (Added via 
> https://github.com/apache/incubator-druid/pull/6222)
> Implementation details - 
> # Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
> # DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
> # During execution phase, before sending the query to druid in 
> DruidQueryBasedRecordReader we will deserialize this filter, translate it 
> into a DruidDimFilter and add it to existing DruidQuery.  Tez executor 
> already ensures that when we start reading results from the record reader, 
> all the dynamic values are initialized. 
> # Explaining a druid query also prints the query sent to druid as 
> {{druid.json.query}}. We also need to make sure to update the druid query 
> with the filters. During explain we do not have the actual values for the 
> dynamic values, so instead of values we will print the dynamic expression 
> itself as part of druid query. 
> Note:- This work needs druid to be updated to version 0.13.0



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=296412=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296412
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 16:44
Start Date: 16/Aug/19 16:44
Worklog Time Spent: 10m 
  Work Description: b-slim commented on pull request #723: [HIVE-20683] Add 
the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r314801367
 
 

 ##
 File path: 
druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java
 ##
 @@ -894,4 +944,257 @@ public static IndexSpec getIndexSpec(Configuration jc) {
  ImmutableList<AggregatorFactory> aggregatorFactories = aggregatorFactoryBuilder.build();
 return Pair.of(dimensions, aggregatorFactories.toArray(new 
AggregatorFactory[0]));
   }
+
+  // Druid only supports String,Long,Float,Double selectors
+  private static Set<PrimitiveTypeInfo> druidSupportedTypeInfos = ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo, TypeInfoFactory.charTypeInfo,
+  TypeInfoFactory.varcharTypeInfo, TypeInfoFactory.byteTypeInfo,
+  TypeInfoFactory.intTypeInfo, TypeInfoFactory.longTypeInfo,
+  TypeInfoFactory.shortTypeInfo, TypeInfoFactory.doubleTypeInfo
+  );
+
+  private static Set<PrimitiveTypeInfo> stringTypeInfos = ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo,
+  TypeInfoFactory.charTypeInfo, TypeInfoFactory.varcharTypeInfo
+  );
+
+
+  public static org.apache.druid.query.Query 
addDynamicFilters(org.apache.druid.query.Query query,
+  ExprNodeGenericFuncDesc filterExpr, Configuration conf, boolean 
resolveDynamicValues
+  ) {
+List<VirtualColumn> virtualColumns = Arrays
+.asList(getVirtualColumns(query).getVirtualColumns());
+org.apache.druid.query.Query rv = query;
+DimFilter joinReductionFilter = toDruidFilter(filterExpr, conf, 
virtualColumns,
+resolveDynamicValues
+);
+if(joinReductionFilter != null) {
+  String type = query.getType();
+  DimFilter filter = new AndDimFilter(joinReductionFilter, 
query.getFilter());
+  switch (type) {
+  case org.apache.druid.query.Query.TIMESERIES:
+rv = Druids.TimeseriesQueryBuilder.copy((TimeseriesQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.TOPN:
+rv = new TopNQueryBuilder((TopNQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.GROUP_BY:
+rv = new GroupByQuery.Builder((GroupByQuery) query)
+.setDimFilter(filter)
+.setVirtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.SCAN:
+rv = ScanQuery.ScanQueryBuilder.copy((ScanQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.SELECT:
+rv = Druids.SelectQueryBuilder.copy((SelectQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  default:
+throw new UnsupportedOperationException("Unsupported Query type " + type);
+  }
+}
+return rv;
+  }
+
+  private static DimFilter toDruidFilter(ExprNodeDesc filterExpr, Configuration configuration,
+  List<VirtualColumn> virtualColumns, boolean resolveDynamicValues
+  ) {
+if(filterExpr == null) {
+  return null;
+}
+Class genericUDFClass = getGenericUDFClassFromExprDesc(filterExpr);
+if(FunctionRegistry.isOpAnd(filterExpr)) {
+  Iterator<ExprNodeDesc> iterator = filterExpr.getChildren().iterator();
+  List<DimFilter> delegates = Lists.newArrayList();
+  while (iterator.hasNext()) {
+DimFilter filter = toDruidFilter(iterator.next(), configuration, virtualColumns, resolveDynamicValues);
+if(filter != null) {
+  delegates.add(filter);
+}
+  }
+  if(delegates != null && !delegates.isEmpty()) {
+return new AndDimFilter(delegates);
+  }
+}
+if(FunctionRegistry.isOpOr(filterExpr)) {
+  Iterator<ExprNodeDesc> iterator = filterExpr.getChildren().iterator();
+  List<DimFilter> delegates = Lists.newArrayList();
+  while (iterator.hasNext()) {
+DimFilter filter = toDruidFilter(iterator.next(), configuration, virtualColumns, resolveDynamicValues);
+if(filter != null) {
+  delegates.add(filter);
+}
+  }
+  if(delegates != null) {
+return new OrDimFilter(delegates);
+  }
+} else if(GenericUDFBetween.class == 

[jira] [Work logged] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20683?focusedWorklogId=296403=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296403
 ]

ASF GitHub Bot logged work on HIVE-20683:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 16:35
Start Date: 16/Aug/19 16:35
Worklog Time Spent: 10m 
  Work Description: b-slim commented on pull request #723: [HIVE-20683] Add 
the Ability to push Dynamic Between and Bloom filters to Druid
URL: https://github.com/apache/hive/pull/723#discussion_r314798310
 
 

 ##
 File path: druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java
 ##
 @@ -894,4 +944,257 @@ public static IndexSpec getIndexSpec(Configuration jc) {
  ImmutableList<AggregatorFactory> aggregatorFactories = aggregatorFactoryBuilder.build();
  return Pair.of(dimensions, aggregatorFactories.toArray(new AggregatorFactory[0]));
   }
+
+  // Druid only supports String, Long, Float and Double selectors
+  private static Set<TypeInfo> druidSupportedTypeInfos = ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo, TypeInfoFactory.charTypeInfo,
+  TypeInfoFactory.varcharTypeInfo, TypeInfoFactory.byteTypeInfo,
+  TypeInfoFactory.intTypeInfo, TypeInfoFactory.longTypeInfo,
+  TypeInfoFactory.shortTypeInfo, TypeInfoFactory.doubleTypeInfo
+  );
+
+  private static Set<TypeInfo> stringTypeInfos = ImmutableSet.of(
+  TypeInfoFactory.stringTypeInfo,
+  TypeInfoFactory.charTypeInfo, TypeInfoFactory.varcharTypeInfo
+  );
+
+
+  public static org.apache.druid.query.Query addDynamicFilters(org.apache.druid.query.Query query,
+  ExprNodeGenericFuncDesc filterExpr, Configuration conf, boolean resolveDynamicValues
+  ) {
+List<VirtualColumn> virtualColumns = Arrays.asList(getVirtualColumns(query).getVirtualColumns());
+org.apache.druid.query.Query rv = query;
+DimFilter joinReductionFilter = toDruidFilter(filterExpr, conf, virtualColumns, resolveDynamicValues);
+if(joinReductionFilter != null) {
+  String type = query.getType();
+  DimFilter filter = new AndDimFilter(joinReductionFilter, query.getFilter());
+  switch (type) {
+  case org.apache.druid.query.Query.TIMESERIES:
+rv = Druids.TimeseriesQueryBuilder.copy((TimeseriesQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.TOPN:
+rv = new TopNQueryBuilder((TopNQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.GROUP_BY:
+rv = new GroupByQuery.Builder((GroupByQuery) query)
+.setDimFilter(filter)
+.setVirtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.SCAN:
+rv = ScanQuery.ScanQueryBuilder.copy((ScanQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  case org.apache.druid.query.Query.SELECT:
+rv = Druids.SelectQueryBuilder.copy((SelectQuery) query)
+.filters(filter)
+.virtualColumns(VirtualColumns.create(virtualColumns))
+.build();
+break;
+  default:
+throw new UnsupportedOperationException("Unsupported Query type " + type);
+  }
+}
+return rv;
+  }
+
+  private static DimFilter toDruidFilter(ExprNodeDesc filterExpr, Configuration configuration,
 
 Review comment:
   please add nullable annotations
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296403)
Time Spent: 20m  (was: 10m)

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, 
> HIVE-20683.3.patch, HIVE-20683.4.patch, HIVE-20683.5.patch, 
> HIVE-20683.6.patch, HIVE-20683.8.patch, HIVE-20683.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> For optimizing joins, Hive generates 

[jira] [Updated] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-22068:
--
Attachment: HIVE-22068.05.patch
Status: Patch Available  (was: In Progress)

Addressing comments by Sankar.

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch, HIVE-22068.05.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In repl load, update the status of target database to the last event dumped 
> so that repl status returns that and next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *
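The failure mode described above can be sketched as a small model: an incremental dump starting from the repl status event id needs every later event, so a stale status combined with the cleaner purging old events leaves a gap. This is an illustrative toy, not Hive's actual notification or TxnHandler code; all names are hypothetical.

```java
public class ReplStatusSketch {
    /**
     * Illustrative only: the next incremental dump needs every event after
     * statusEventId; if the cleaner has already purged everything below
     * oldestRetainedEventId, a stale (too old) status means some needed events
     * are gone, producing the "notification event missing" error.
     */
    static boolean incrementalDumpPossible(long statusEventId, long oldestRetainedEventId) {
        return statusEventId + 1 >= oldestRetainedEventId;
    }

    public static void main(String[] args) {
        // Status updated to the last dumped event (100), cleaner retains from 95: ok.
        System.out.println(incrementalDumpPossible(100, 95));
        // Stale status (80) after the cleaner purged up to 95: events 81..94 are gone.
        System.out.println(incrementalDumpPossible(80, 95));
    }
}
```

Returning the last event id actually dumped keeps the window closed regardless of how aggressive the cleaner is.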



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-22068:
--
Status: In Progress  (was: Patch Available)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In repl load, update the status of target database to the last event dumped 
> so that repl status returns that and next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *





[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296382=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296382
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 16:12
Start Date: 16/Aug/19 16:12
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #742: 
HIVE-22068 : Add more logging to notification cleaner and replication to track 
events
URL: https://github.com/apache/hive/pull/742#discussion_r31479
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
 ##
 @@ -522,6 +525,25 @@ private int executeIncrementalLoad(DriverContext driverContext) {
   // bootstrap of tables if exist.
   if (builder.hasMoreWork() || work.getPathsToCopyIterator().hasNext() || work.hasBootstrapLoadTasks()) {
 DAGTraversal.traverse(childTasks, new AddDependencyToLeaves(TaskFactory.get(work, conf)));
+  } else if (work.dbNameToLoadIn != null) {
+// Nothing to be done for repl load now. Add a task to update the last.repl.id of the
+// target database to the event id of the last event considered by the dump. If the next
+// incremental cycle starts from this id, the events considered for this dump won't
+// be considered again. If we are replicating to multiple databases at a time, it's not
+// possible to know which databases we are replicating into and hence we cannot
+// update the repl id in all those databases.
+String lastEventid = builder.eventTo().toString();
 
 Review comment:
   I think the reason this code is getting duplicated multiple times is that the 
number of variables that change from one code snippet to the other is almost the 
same as the number of lines of code. So, it's probably a 50/50 chance that we 
will have any real benefit from de-duplication.
 



Issue Time Tracking
---

Worklog Id: (was: 296382)
Time Spent: 1h 10m  (was: 1h)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In repl load, update the status of target database to the last event dumped 
> so that repl status returns that and next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *





[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296375=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296375
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 16:06
Start Date: 16/Aug/19 16:06
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #742: 
HIVE-22068 : Add more logging to notification cleaner and replication to track 
events
URL: https://github.com/apache/hive/pull/742#discussion_r314787910
 
 

 ##
 File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -750,6 +766,38 @@ public Table apply(@Nullable Table table) {
 .verifyResults(Arrays.asList("1", "2"));
   }
 
+  @Test
+  public void testIncrementalDumpEmptyDumpDirectory() throws Throwable {
 
 Review comment:
   I have added a testcase in the patch. It's passing for me. Can you please 
check the same?
 



Issue Time Tracking
---

Worklog Id: (was: 296375)
Time Spent: 1h  (was: 50m)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In repl load, update the status of target database to the last event dumped 
> so that repl status returns that and next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *





[jira] [Updated] (HIVE-22087) HMS Translation: Translate getDatabase() API to alter warehouse location

2019-08-16 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-22087:
-
Fix Version/s: 4.0.0

> HMS Translation: Translate getDatabase() API to alter warehouse location
> 
>
> Key: HIVE-22087
> URL: https://issues.apache.org/jira/browse/HIVE-22087
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22087.1.patch, HIVE-22087.2.patch, 
> HIVE-22087.3.patch, HIVE-22087.5.patch, HIVE-22087.6.patch, HIVE-22087.7.patch
>
>
> It makes sense to translate getDatabase() calls as well, to alter the 
> location for the Database based on whether or not the processor has 
> capabilities to write to the managed warehouse directory. Every DB has 2 
> locations, one external and the other in the managed warehouse directory. If 
> the processor has any AcidWrite capability, then the location remains 
> unchanged for the database.
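The location rule described above can be illustrated with a trivial sketch. This is hypothetical code for exposition only; the real HMS translation layer operates on Database and capability objects, not plain strings, and the names below are invented.

```java
public class DbLocationSketch {
    // Hypothetical sketch of the capability-based choice: a processor that can
    // perform ACID writes keeps the managed warehouse location for the database;
    // any other processor is steered to the external location.
    static String effectiveLocation(boolean hasAcidWriteCapability,
                                    String managedLocation, String externalLocation) {
        return hasAcidWriteCapability ? managedLocation : externalLocation;
    }

    public static void main(String[] args) {
        System.out.println(effectiveLocation(true, "/warehouse/managed/db", "/warehouse/external/db"));
        System.out.println(effectiveLocation(false, "/warehouse/managed/db", "/warehouse/external/db"));
    }
}
```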





[jira] [Updated] (HIVE-22087) HMS Translation: Translate getDatabase() API to alter warehouse location

2019-08-16 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-22087:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Fix has been committed to master. Thanks for the review [~thejas].

> HMS Translation: Translate getDatabase() API to alter warehouse location
> 
>
> Key: HIVE-22087
> URL: https://issues.apache.org/jira/browse/HIVE-22087
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-22087.1.patch, HIVE-22087.2.patch, 
> HIVE-22087.3.patch, HIVE-22087.5.patch, HIVE-22087.6.patch, HIVE-22087.7.patch
>
>
> It makes sense to translate getDatabase() calls as well, to alter the 
> location for the Database based on whether or not the processor has 
> capabilities to write to the managed warehouse directory. Every DB has 2 
> locations, one external and the other in the managed warehouse directory. If 
> the processor has any AcidWrite capability, then the location remains 
> unchanged for the database.





[jira] [Resolved] (HIVE-21839) HMS Translation: Hive need to block create a type of table if the client does not have write capability

2019-08-16 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam resolved HIVE-21839.
--
   Resolution: Fixed
Fix Version/s: 4.0.0

> HMS Translation: Hive need to block create a type of table if the client does 
> not have write capability
> ---
>
> Key: HIVE-21839
> URL: https://issues.apache.org/jira/browse/HIVE-21839
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Yongzhi Chen
>Assignee: Naveen Gangam
>Priority: Minor
> Fix For: 4.0.0
>
>
> Hive can either return an error message or provide an API call to check the 
> capability even without a table instance.





[jira] [Commented] (HIVE-21839) HMS Translation: Hive need to block create a type of table if the client does not have write capability

2019-08-16 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909090#comment-16909090
 ] 

Naveen Gangam commented on HIVE-21839:
--

This has been included in HIVE-21838 along with other fixes. Closing the jira.

> HMS Translation: Hive need to block create a type of table if the client does 
> not have write capability
> ---
>
> Key: HIVE-21839
> URL: https://issues.apache.org/jira/browse/HIVE-21839
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Yongzhi Chen
>Assignee: Naveen Gangam
>Priority: Minor
>
> Hive can either return an error message or provide an API call to check the 
> capability even without a table instance.





[jira] [Assigned] (HIVE-22123) Use GetDatabaseResponse to allow for future extension

2019-08-16 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-22123:



> Use GetDatabaseResponse to allow for future extension
> -
>
> Key: HIVE-22123
> URL: https://issues.apache.org/jira/browse/HIVE-22123
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
>
> As part of the review, it was suggested to use the GetDatabaseResponse object 
> to allow for any potential future expansions for these requests.
> https://reviews.apache.org/r/71267/#comment304501





[jira] [Updated] (HIVE-22099) Several date related UDFs can't handle Julian dates properly since HIVE-20007

2019-08-16 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-22099:
--
Attachment: HIVE-22099.4.patch

> Several date related UDFs can't handle Julian dates properly since HIVE-20007
> -
>
> Key: HIVE-22099
> URL: https://issues.apache.org/jira/browse/HIVE-22099
> Project: Hive
>  Issue Type: Bug
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-22099.0.patch, HIVE-22099.1.patch, 
> HIVE-22099.2.patch, HIVE-22099.3.patch, HIVE-22099.4.patch
>
>
> Currently dates that belong to the Julian calendar (before Oct 15, 1582) are 
> handled improperly by date/timestamp UDFs.
> E.g. the DateFormat UDF:
> Although the dates are in the Julian calendar, the formatter insists on 
> printing these according to the Gregorian calendar, causing multiple days of 
> difference in some cases:
>  
> {code:java}
> beeline> select date_format('1001-01-05','dd---MM--');
> +----------------+
> | _c0            |
> +----------------+
> | 30---12--1000  |
> +----------------+{code}
>  I've observed similar problems in the following UDFs:
>  * add_months
>  * date_format
>  * day
>  * month
>  * months_between
>  * weekofyear
>  * year
>  
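The calendar mismatch behind this can be reproduced with plain JDK classes: java.util.GregorianCalendar uses the hybrid calendar (Julian before the Oct 15, 1582 cutover), while java.time uses the proleptic Gregorian calendar, so the same epoch day renders as different dates before the cutover. A minimal sketch, not Hive code; the class and method names are illustrative:

```java
import java.time.LocalDate;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class CalendarCutoverDemo {
    /** Epoch day of a calendar date as the hybrid (Julian-before-1582) calendar sees it. */
    static long hybridEpochDay(int year, int month1Based, int day) {
        GregorianCalendar hybrid = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        hybrid.clear();
        hybrid.set(year, month1Based - 1, day);
        return Math.floorDiv(hybrid.getTimeInMillis(), 86_400_000L);
    }

    public static void main(String[] args) {
        // Same physical day rendered by the proleptic Gregorian calendar (java.time):
        LocalDate preCutover = LocalDate.ofEpochDay(hybridEpochDay(1001, 1, 5));
        // For modern dates the two calendars agree:
        LocalDate modern = LocalDate.ofEpochDay(hybridEpochDay(2019, 8, 16));
        System.out.println(preCutover + " vs " + modern);
    }
}
```

For 1001-01-05 the two renderings disagree by several days, which is exactly the kind of shift the UDFs listed above exhibit.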





[jira] [Updated] (HIVE-22099) Several date related UDFs can't handle Julian dates properly since HIVE-20007

2019-08-16 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-22099:
--
Status: Patch Available  (was: In Progress)

> Several date related UDFs can't handle Julian dates properly since HIVE-20007
> -
>
> Key: HIVE-22099
> URL: https://issues.apache.org/jira/browse/HIVE-22099
> Project: Hive
>  Issue Type: Bug
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-22099.0.patch, HIVE-22099.1.patch, 
> HIVE-22099.2.patch, HIVE-22099.3.patch, HIVE-22099.4.patch
>
>
> Currently dates that belong to the Julian calendar (before Oct 15, 1582) are 
> handled improperly by date/timestamp UDFs.
> E.g. the DateFormat UDF:
> Although the dates are in the Julian calendar, the formatter insists on 
> printing these according to the Gregorian calendar, causing multiple days of 
> difference in some cases:
>  
> {code:java}
> beeline> select date_format('1001-01-05','dd---MM--');
> +----------------+
> | _c0            |
> +----------------+
> | 30---12--1000  |
> +----------------+{code}
>  I've observed similar problems in the following UDFs:
>  * add_months
>  * date_format
>  * day
>  * month
>  * months_between
>  * weekofyear
>  * year
>  





[jira] [Updated] (HIVE-22099) Several date related UDFs can't handle Julian dates properly since HIVE-20007

2019-08-16 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-22099:
--
Status: In Progress  (was: Patch Available)

> Several date related UDFs can't handle Julian dates properly since HIVE-20007
> -
>
> Key: HIVE-22099
> URL: https://issues.apache.org/jira/browse/HIVE-22099
> Project: Hive
>  Issue Type: Bug
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-22099.0.patch, HIVE-22099.1.patch, 
> HIVE-22099.2.patch, HIVE-22099.3.patch
>
>
> Currently dates that belong to the Julian calendar (before Oct 15, 1582) are 
> handled improperly by date/timestamp UDFs.
> E.g. the DateFormat UDF:
> Although the dates are in the Julian calendar, the formatter insists on 
> printing these according to the Gregorian calendar, causing multiple days of 
> difference in some cases:
>  
> {code:java}
> beeline> select date_format('1001-01-05','dd---MM--');
> +----------------+
> | _c0            |
> +----------------+
> | 30---12--1000  |
> +----------------+{code}
>  I've observed similar problems in the following UDFs:
>  * add_months
>  * date_format
>  * day
>  * month
>  * months_between
>  * weekofyear
>  * year
>  





[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296269=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296269
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 12:49
Start Date: 16/Aug/19 12:49
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #742: 
HIVE-22068 : Add more logging to notification cleaner and replication to track 
events
URL: https://github.com/apache/hive/pull/742#discussion_r314705989
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
 ##
 @@ -522,6 +525,25 @@ private int executeIncrementalLoad(DriverContext driverContext) {
   // bootstrap of tables if exist.
   if (builder.hasMoreWork() || work.getPathsToCopyIterator().hasNext() || work.hasBootstrapLoadTasks()) {
 DAGTraversal.traverse(childTasks, new AddDependencyToLeaves(TaskFactory.get(work, conf)));
+  } else if (work.dbNameToLoadIn != null) {
 
 Review comment:
   Ok. Thanks for bringing that up. Done.
 



Issue Time Tracking
---

Worklog Id: (was: 296269)
Time Spent: 50m  (was: 40m)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In repl load, update the status of target database to the last event dumped 
> so that repl status returns that and next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *





[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296244=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296244
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 11:58
Start Date: 16/Aug/19 11:58
Worklog Time Spent: 10m 
  Work Description: ashutosh-bapat commented on pull request #742: 
HIVE-22068 : Add more logging to notification cleaner and replication to track 
events
URL: https://github.com/apache/hive/pull/742#discussion_r314690479
 
 

 ##
 File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -750,6 +766,38 @@ public Table apply(@Nullable Table table) {
 .verifyResults(Arrays.asList("1", "2"));
   }
 
+  @Test
+  public void testIncrementalDumpEmptyDumpDirectory() throws Throwable {
+WarehouseInstance.Tuple tuple = primary.run("use " + primaryDbName)
+.run("create external table t1 (id int)")
+.run("insert into table t1 values (1)")
+.run("insert into table t1 values (2)")
+.dump(primaryDbName, null);
+
+replica.load(replicatedDbName, tuple.dumpLocation)
+.status(replicatedDbName)
+.verifyResult(tuple.lastReplicationId);
+
+WarehouseInstance.Tuple incTuple = primary.dump(primaryDbName, tuple.lastReplicationId);
+
+replica.load(replicatedDbName, incTuple.dumpLocation)
+.status(replicatedDbName)
+.verifyResult(incTuple.lastReplicationId);
+
+// create events for some other database and then dump the primaryDbName to dump an empty directory.
+primary.run("create database " + extraPrimaryDb + " WITH DBPROPERTIES ( '" +
+SOURCE_OF_REPLICATION + "' = '1,2,3')");
+WarehouseInstance.Tuple inc2Tuple = primary.run("use " + extraPrimaryDb)
+.run("create table tbl (fld int)")
+.run("use " + primaryDbName)
+.dump(primaryDbName, incTuple.lastReplicationId);
+
 
 Review comment:
   Done.
 



Issue Time Tracking
---

Worklog Id: (was: 296244)
Time Spent: 40m  (was: 0.5h)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In repl load, update the status of target database to the last event dumped 
> so that repl status returns that and next incremental can specify it as the 
> event from which to start the dump. Without that, repl status might return an 
> old event, which might cause older events to be dumped again and/or a 
> notification event missing error if the older events are cleaned by the 
> cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *





[jira] [Commented] (HIVE-22122) TxnHandler.getValidWriteIdsForTable optimization for compacted tables

2019-08-16 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908933#comment-16908933
 ] 

Peter Vary commented on HIVE-22122:
---

CC: [~sankarh]

> TxnHandler.getValidWriteIdsForTable optimization for compacted tables
> -
>
> Key: HIVE-22122
> URL: https://issues.apache.org/jira/browse/HIVE-22122
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Peter Vary
>Priority: Major
>
> When we do not find open writes for the given validTxnList, then either:
>  # we do not have any writes on the table - we can return writeIdHwm = 0 and 
> no invalid/aborted writes;
>  # we have only compacted writes on the table - we can return writeIdHwm = 
> nextWriteId - 1 and no invalid/aborted writes;
>  # we have compacted writes and some invalid writes on the table - we can 
> return the lowest invalid write as the writeIdHwm and set it as invalid.
> What the current code does instead is send writeIdHwm = nextWriteId - 1 and 
> mark every write as invalid. This yields the same response in cases 1-2, but 
> probably a longer list in case 3.
> So there is room for optimization here.
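
The three cases above can be sketched as a standalone decision function (plain 
Python; the names `next_write_id`, `open_write_ids`, and the returned pair are 
illustrative, not TxnHandler's actual API):

```python
def compute_write_id_hwm(next_write_id, open_write_ids, has_any_writes):
    """Return (writeIdHwm, invalid_write_ids) per the three cases described
    above. Illustrative sketch only, not Hive's TxnHandler implementation."""
    if not has_any_writes:
        # Case 1: no writes on the table at all.
        return 0, []
    if not open_write_ids:
        # Case 2: only compacted writes -> everything below nextWriteId is valid.
        return next_write_id - 1, []
    # Case 3: the lowest open/aborted write becomes the HWM and is reported
    # invalid, instead of listing every write id below nextWriteId as invalid.
    lowest = min(open_write_ids)
    return lowest, [lowest]
```

For example, `compute_write_id_hwm(10, [4, 7], True)` reports a high-water mark 
of 4 with a single invalid id, where the current code would report 9 with a 
full invalid list.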



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22120) Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908914#comment-16908914
 ] 

Hive QA commented on HIVE-22120:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12977741/HIVE-22120.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 16709 tests 
executed
*Failed tests:*
{noformat}
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=232)
TestObjectStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=232)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[EAR-10592] (batchId=99)
org.apache.hadoop.hive.llap.cache.TestBuddyAllocator.testMTT[2] (batchId=360)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18357/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18357/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18357/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12977741 - PreCommit-HIVE-Build

> Fix wrong results/ArrayOutOfBound exception in left outer map joins on 
> specific boundary conditions
> ---
>
> Key: HIVE-22120
> URL: https://issues.apache.org/jira/browse/HIVE-22120
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, llap, Vectorization
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22120.1.patch
>
>
> Vectorized version of left outer map join produces wrong results or 
> encounters ArrayOutOfBound exception.
> The boundary conditions are:
>  * The complete batch of the big table should have the join key repeated for 
> all the join columns.
>  * The complete batch of the big table should not have a matched key value 
> in the small table.
>  * The repeated value should not be a null value
>  * Some rows should be filtered out as part of the on clause filter.
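
A scalar reference for what the vectorized join should produce under those 
conditions (illustrative Python, not Hive's vectorized operators): even when an 
entire batch repeats one non-null key with no match in the small table and some 
rows fail the ON-clause filter, every big-table row must still appear exactly 
once, padded with NULLs.

```python
def left_outer_map_join(big_rows, small_map, on_filter):
    """Row-at-a-time LEFT OUTER JOIN semantics: big-table rows with no match
    in the small table -- or failing the ON-clause filter -- still emit, with
    None standing in for the small-table side."""
    out = []
    for row in big_rows:
        key = row[0]
        if on_filter(row) and key in small_map:
            out.extend((row, m) for m in small_map[key])
        else:
            out.append((row, None))
    return out


# Boundary case from the report: the whole batch repeats an unmatched,
# non-null key, and the ON filter drops some rows.
big = [(5, i) for i in range(4)]   # repeated key 5, never in the small side
small = {1: ['x']}
res = left_outer_map_join(big, small, lambda r: r[1] % 2 == 0)
# Expected: all 4 rows survive, all padded with None.
```

The bug is that the vectorized path loses or misindexes rows in exactly this 
configuration, where a correct scalar evaluation keeps all of them.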



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22120) Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908875#comment-16908875
 ] 

Hive QA commented on HIVE-22120:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
8s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18357/dev-support/hive-personality.sh
 |
| git revision | master / bd42f23 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18357/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix wrong results/ArrayOutOfBound exception in left outer map joins on 
> specific boundary conditions
> ---
>
> Key: HIVE-22120
> URL: https://issues.apache.org/jira/browse/HIVE-22120
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, llap, Vectorization
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22120.1.patch
>
>
> Vectorized version of left outer map join produces wrong results or 
> encounters ArrayOutOfBound exception.
> The boundary conditions are:
>  * The complete batch of the big table should have the join key repeated for 
> all the join columns.
>  * The complete batch of the big table should not have a matched key value 
> in the small table.
>  * The repeated value should not be a null value
>  * Some rows should be filtered out as part of the on clause filter.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22107) Correlated subquery producing wrong schema

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908819#comment-16908819
 ] 

Hive QA commented on HIVE-22107:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12977740/HIVE-22107.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 16740 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_5]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=171)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] 
(batchId=130)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18356/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18356/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18356/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12977740 - PreCommit-HIVE-Build

> Correlated subquery producing wrong schema
> --
>
> Key: HIVE-22107
> URL: https://issues.apache.org/jira/browse/HIVE-22107
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22107.1.patch, HIVE-22107.2.patch, 
> HIVE-22107.3.patch
>
>
> *Repro*
> {code:sql}
> create table test(id int, name string,dept string);
> insert into test values(1,'a','it'),(2,'b','eee'),(NULL, 'c', 'cse');
> select distinct 'empno' as eid, a.id from test a where NOT EXISTS (select 
> c.id from test c where a.id=c.id);
> {code}
> {code}
> +---++
> |  eid  |  a.id  |
> +---++
> | NULL  | empno  |
> +---++
> {code}
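
As a sanity check on what the query should return, the NOT EXISTS semantics can 
be simulated outside Hive (plain Python, not Hive code): a NULL id never 
satisfies a.id = c.id, so only the NULL row survives, i.e. the correct result 
is eid='empno', id=NULL — the output above has the two columns transposed.

```python
rows = [(1, 'a', 'it'), (2, 'b', 'eee'), (None, 'c', 'cse')]

def exists_match(a_id):
    # SQL three-valued logic: NULL = anything is never True,
    # so the NULL id finds no match in the subquery.
    return any(a_id is not None and a_id == c_id for (c_id, _, _) in rows)

# select distinct 'empno' as eid, a.id from test a where NOT EXISTS (...)
result = sorted({('empno', a_id) for (a_id, _, _) in rows
                 if not exists_match(a_id)})
print(result)  # [('empno', None)]
```

The wrong schema in the bug puts the literal 'empno' under a.id and NULL under 
eid, reversing the select list.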



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22107) Correlated subquery producing wrong schema

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908782#comment-16908782
 ] 

Hive QA commented on HIVE-22107:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
9s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
39s{color} | {color:red} ql: The patch generated 1 new + 21 unchanged - 0 fixed 
= 22 total (was 21) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 50s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18356/dev-support/hive-personality.sh
 |
| git revision | master / bd42f23 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18356/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18356/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Correlated subquery producing wrong schema
> --
>
> Key: HIVE-22107
> URL: https://issues.apache.org/jira/browse/HIVE-22107
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22107.1.patch, HIVE-22107.2.patch, 
> HIVE-22107.3.patch
>
>
> *Repro*
> {code:sql}
> create table test(id int, name string,dept string);
> insert into test values(1,'a','it'),(2,'b','eee'),(NULL, 'c', 'cse');
> select distinct 'empno' as eid, a.id from test a where NOT EXISTS (select 
> c.id from test c where a.id=c.id);
> {code}
> {code}
> +---++
> |  eid  |  a.id  |
> +---++
> | NULL  | empno  |
> +---++
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296104
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 06:32
Start Date: 16/Aug/19 06:32
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #742: HIVE-22068 : 
Add more logging to notification cleaner and replication to track events
URL: https://github.com/apache/hive/pull/742#discussion_r314595417
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
 ##
 @@ -522,6 +525,25 @@ private int executeIncrementalLoad(DriverContext 
driverContext) {
   // bootstrap of tables if exist.
   if (builder.hasMoreWork() || work.getPathsToCopyIterator().hasNext() || 
work.hasBootstrapLoadTasks()) {
 DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
+  } else if (work.dbNameToLoadIn != null) {
 
 Review comment:
   I think, work.dbNameToLoadIn will be null if you don't specify the name in 
REPL LOAD command. In this case, we should get the name from DumpMetadata to 
set the last repl ID.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296104)
Time Spent: 20m  (was: 10m)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In repl load, update the status of the target database to the last event 
> dumped so that repl status returns it and the next incremental dump can 
> specify it as the event from which to start. Without that, repl status might 
> return an old event, which might cause older events to be dumped again and/or 
> a notification event missing error if the older events have been cleaned by 
> the cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296105&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296105
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 06:32
Start Date: 16/Aug/19 06:32
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #742: HIVE-22068 : 
Add more logging to notification cleaner and replication to track events
URL: https://github.com/apache/hive/pull/742#discussion_r314596061
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -750,6 +766,38 @@ public Table apply(@Nullable Table table) {
 .verifyResults(Arrays.asList("1", "2"));
   }
 
+  @Test
+  public void testIncrementalDumpEmptyDumpDirectory() throws Throwable {
 
 Review comment:
   Add another test case where we dynamically bootstrap a table (table-level 
replication) with an incremental dump but no events are dumped. It takes a 
special route in the executeIncrementalLoad() method (line 503), so I guess, 
as per the current change, it won't update the database's last repl ID.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296105)
Time Spent: 20m  (was: 10m)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In repl load, update the status of the target database to the last event 
> dumped so that repl status returns it and the next incremental dump can 
> specify it as the event from which to start. Without that, repl status might 
> return an old event, which might cause older events to be dumped again and/or 
> a notification event missing error if the older events have been cleaned by 
> the cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296106&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296106
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 06:32
Start Date: 16/Aug/19 06:32
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #742: HIVE-22068 : 
Add more logging to notification cleaner and replication to track events
URL: https://github.com/apache/hive/pull/742#discussion_r314592846
 
 

 ##
 File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
 ##
 @@ -750,6 +766,38 @@ public Table apply(@Nullable Table table) {
 .verifyResults(Arrays.asList("1", "2"));
   }
 
+  @Test
+  public void testIncrementalDumpEmptyDumpDirectory() throws Throwable {
+WarehouseInstance.Tuple tuple = primary.run("use " + primaryDbName)
+.run("create external table t1 (id int)")
+.run("insert into table t1 values (1)")
+.run("insert into table t1 values (2)")
+.dump(primaryDbName, null);
+
+replica.load(replicatedDbName, tuple.dumpLocation)
+.status(replicatedDbName)
+.verifyResult(tuple.lastReplicationId);
+
+WarehouseInstance.Tuple incTuple = primary.dump(primaryDbName, 
tuple.lastReplicationId);
+
+replica.load(replicatedDbName, incTuple.dumpLocation)
+.status(replicatedDbName)
+.verifyResult(incTuple.lastReplicationId);
+
+// create events for some other database and then dump the primaryDbName 
to dump an empty directory.
+primary.run("create database " + extraPrimaryDb + " WITH DBPROPERTIES ( '" 
+
+SOURCE_OF_REPLICATION + "' = '1,2,3')");
+WarehouseInstance.Tuple inc2Tuple = primary.run("use " + extraPrimaryDb)
+.run("create table tbl (fld int)")
+.run("use " + primaryDbName)
+.dump(primaryDbName, incTuple.lastReplicationId);
+
 
 Review comment:
   We should add a validation that the last_repl_id returned by REPL DUMP is 
the same as the latest event ID in the notification event table, even though 
no events were dumped for this db.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296106)
Time Spent: 0.5h  (was: 20m)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In repl load, update the status of the target database to the last event 
> dumped so that repl status returns it and the next incremental dump can 
> specify it as the event from which to start. Without that, repl status might 
> return an old event, which might cause older events to be dumped again and/or 
> a notification event missing error if the older events have been cleaned by 
> the cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Work logged] (HIVE-22068) Return the last event id dumped as repl status to avoid notification event missing error.

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22068?focusedWorklogId=296103&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296103
 ]

ASF GitHub Bot logged work on HIVE-22068:
-

Author: ASF GitHub Bot
Created on: 16/Aug/19 06:32
Start Date: 16/Aug/19 06:32
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #742: HIVE-22068 : 
Add more logging to notification cleaner and replication to track events
URL: https://github.com/apache/hive/pull/742#discussion_r314596395
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
 ##
 @@ -522,6 +525,25 @@ private int executeIncrementalLoad(DriverContext 
driverContext) {
   // bootstrap of tables if exist.
   if (builder.hasMoreWork() || work.getPathsToCopyIterator().hasNext() || 
work.hasBootstrapLoadTasks()) {
 DAGTraversal.traverse(childTasks, new 
AddDependencyToLeaves(TaskFactory.get(work, conf)));
+  } else if (work.dbNameToLoadIn != null) {
+// Nothing to be done for repl load now. Add a task to update the 
last.repl.id of the
+// target database to the event id of the last event considered by the 
dump. Next
+// incremental cycle if starts from this id, the events considered for 
this dump, won't
+// be considered again. If we are replicating to multiple databases at 
a time, it's not
+// possible to know which all databases we are replicating into and 
hence we can not
+// update repl id in all those databases.
+String lastEventid = builder.eventTo().toString();
 
 Review comment:
   Can we try to re-use ReplLoadTask.updateDatabaseLastReplID method instead of 
duplicating the code here?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 296103)
Time Spent: 20m  (was: 10m)

> Return the last event id dumped as repl status to avoid notification event 
> missing error.
> -
>
> Key: HIVE-22068
> URL: https://issues.apache.org/jira/browse/HIVE-22068
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22068.01.patch, HIVE-22068.02.patch, 
> HIVE-22068.03.patch, HIVE-22068.04.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In repl load, update the status of the target database to the last event 
> dumped so that repl status returns it and the next incremental dump can 
> specify it as the event from which to start. Without that, repl status might 
> return an old event, which might cause older events to be dumped again and/or 
> a notification event missing error if the older events have been cleaned by 
> the cleaner.
> While at it
>  * Add more logging to DB notification listener cleaner thread
>  ** The time when it considered cleaning, the interval and time before which 
> events were cleared, the min and max id at that time
>  ** how many events were cleared
>  ** min and max id after the cleaning.
>  * In REPL::START document the starting event, end event if specified and the 
> maximum number of events, if specified.
>  *



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22115) Prevent the creation of query-router logger in HS2 as per property

2019-08-16 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908767#comment-16908767
 ] 

Hive QA commented on HIVE-22115:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12977738/HIVE-22115.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16740 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18355/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18355/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18355/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12977738 - PreCommit-HIVE-Build

> Prevent the creation of query-router logger in HS2 as per property
> --
>
> Key: HIVE-22115
> URL: https://issues.apache.org/jira/browse/HIVE-22115
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-22115.patch, HIVE-22115.patch, HIVE-22115.patch
>
>
> Avoid the creation and registration of query-router logger if the Hive server 
> Property is set to false by the user
> {code}
> HiveConf.ConfVars.HIVE_SERVER2_LOGGING_OPERATION_ENABLED
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)