[jira] [Work logged] (HIVE-23922) Improve code quality, UDFArgumentException.getMessage Method requires only two parameters

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23922?focusedWorklogId=469538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469538
 ]

ASF GitHub Bot logged work on HIVE-23922:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 06:41
Start Date: 12/Aug/20 06:41
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1307:
URL: https://github.com/apache/hive/pull/1307#issuecomment-672641648


   @dh20 looks good - can you remove `typeNames` as well in the method body? it 
is also not used.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469538)
Time Spent: 20m  (was: 10m)

> Improve code quality, UDFArgumentException.getMessage Method requires only 
> two parameters
> -
>
> Key: HIVE-23922
> URL: https://issues.apache.org/jira/browse/HIVE-23922
> Project: Hive
>  Issue Type: Improvement
>Reporter: hao
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [UDFArgumentException.getMessage] This method only needs two parameters, 
> message and methods; the remaining parameters are not used.
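
For illustration, a minimal sketch of the simplified method; this is not the actual Hive source, and the exact message layout shown is an assumption:

{code:java}
import java.lang.reflect.Method;
import java.util.List;

// Illustrative sketch only: with the unused parameters dropped, getMessage
// needs just the error message and the candidate methods it lists for the user.
final class GetMessageSketch {
  static String getMessage(String message, List<Method> methods) {
    StringBuilder sb = new StringBuilder(message);
    sb.append(". Possible choices: ");
    for (Method m : methods) {
      sb.append(m.toString()).append("  ");
    }
    return sb.toString();
  }
}
{code}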



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23993) Handle irrecoverable errors

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23993?focusedWorklogId=469537&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469537
 ]

ASF GitHub Bot logged work on HIVE-23993:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 06:39
Start Date: 12/Aug/20 06:39
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #1367:
URL: https://github.com/apache/hive/pull/1367#discussion_r469036313



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java
##
@@ -418,6 +420,27 @@ private void analyzeReplLoad(ASTNode ast) throws 
SemanticException {
 }
   }
 
+  private Path getLatestDumpPath() throws IOException {

Review comment:
   We can reuse the same code in ReplDumpTask





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469537)
Time Spent: 1h 50m  (was: 1h 40m)

> Handle irrecoverable errors
> ---
>
> Key: HIVE-23993
> URL: https://issues.apache.org/jira/browse/HIVE-23993
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23993.01.patch, HIVE-23993.02.patch, 
> HIVE-23993.03.patch, Retry Logic for Replication.pdf
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23922) Improve code quality, UDFArgumentException.getMessage Method requires only two parameters

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23922:
--
Labels: pull-request-available  (was: )

> Improve code quality, UDFArgumentException.getMessage Method requires only 
> two parameters
> -
>
> Key: HIVE-23922
> URL: https://issues.apache.org/jira/browse/HIVE-23922
> Project: Hive
>  Issue Type: Improvement
>Reporter: hao
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [UDFArgumentException.getMessage] This method only needs two parameters, 
> message and methods; the remaining parameters are not used.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23922) Improve code quality, UDFArgumentException.getMessage Method requires only two parameters

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23922?focusedWorklogId=469534&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469534
 ]

ASF GitHub Bot logged work on HIVE-23922:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 06:34
Start Date: 12/Aug/20 06:34
Worklog Time Spent: 10m 
  Work Description: dh20 commented on pull request #1307:
URL: https://github.com/apache/hive/pull/1307#issuecomment-672638988


   > @dh20 can you fix the title and the JIRA number? 
[HIVE-23896](https://issues.apache.org/jira/browse/HIVE-23896) seems unrelated 
to this PR.
   
   @sunchao Sorry, I made a mistake in the Jira number. Now I have corrected it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469534)
Remaining Estimate: 0h
Time Spent: 10m

> Improve code quality, UDFArgumentException.getMessage Method requires only 
> two parameters
> -
>
> Key: HIVE-23922
> URL: https://issues.apache.org/jira/browse/HIVE-23922
> Project: Hive
>  Issue Type: Improvement
>Reporter: hao
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [UDFArgumentException.getMessage] This method only needs two parameters, 
> message and methods; the remaining parameters are not used.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23993) Handle irrecoverable errors

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23993?focusedWorklogId=469532&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469532
 ]

ASF GitHub Bot logged work on HIVE-23993:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 06:29
Start Date: 12/Aug/20 06:29
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #1367:
URL: https://github.com/apache/hive/pull/1367#discussion_r469032996



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/util/ReplUtils.java
##
@@ -366,4 +371,26 @@ public static boolean includeAcidTableInDump(HiveConf 
conf) {
   public static boolean tableIncludedInReplScope(ReplScope replScope, String 
tableName) {
 return ((replScope == null) || 
replScope.tableIncludedInReplScope(tableName));
   }
+
+  public static boolean failedWithNonRecoverableError(Path dumpRoot, HiveConf 
conf) throws SemanticException {
+if (dumpRoot == null) {

Review comment:
   Is this also applicable during load? I mean, can the dumpRoot here be 
null in load case as well?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469532)
Time Spent: 1h 40m  (was: 1.5h)

> Handle irrecoverable errors
> ---
>
> Key: HIVE-23993
> URL: https://issues.apache.org/jira/browse/HIVE-23993
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23993.01.patch, HIVE-23993.02.patch, 
> HIVE-23993.03.patch, Retry Logic for Replication.pdf
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469530&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469530
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 06:23
Start Date: 12/Aug/20 06:23
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #1395:
URL: https://github.com/apache/hive/pull/1395#issuecomment-672634649


   @sunchao Yeah, it seems the tests were triggered.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469530)
Time Spent: 4h 50m  (was: 4h 40m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469529&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469529
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 06:22
Start Date: 12/Aug/20 06:22
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #1395:
URL: https://github.com/apache/hive/pull/1395#issuecomment-672634424


   cc @sunchao 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469529)
Time Spent: 4h 40m  (was: 4.5h)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469527&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469527
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 06:22
Start Date: 12/Aug/20 06:22
Worklog Time Spent: 10m 
  Work Description: viirya opened a new pull request #1395:
URL: https://github.com/apache/hive/pull/1395


   
   
   ### What changes were proposed in this pull request?
   
   
   This PR proposes to upgrade Guava to 27 in Hive branch-2. It is basically 
used to trigger the tests for #1394.
   
   ### Why are the changes needed?
   
   
   When trying to upgrade Guava in Spark, we found the following error. A Guava 
method became package-private in Guava version 20, so there is an 
incompatibility with Guava versions > 19.0.
   
   ```
   sbt.ForkMain$ForkError: sbt.ForkMain$ForkError: 
java.lang.IllegalAccessError: tried to access method 
com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator;
 from class org.apache.hadoop.hive.ql.exec.FetchOperator
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:108)
at 
org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
   ```
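   
   As context for the stack trace above, here is a minimal sketch (hypothetical class, not the
   actual Hive patch) of the kind of call-site change that sidesteps the package-private Guava
   method by using the JDK equivalent:
   
   ```java
   import java.util.Collections;
   import java.util.Iterator;
   
   public final class GuavaCompatSketch {
     // com.google.common.collect.Iterators.emptyIterator() compiles against Guava 19
     // but throws IllegalAccessError at runtime on Guava 20+, where the method became
     // package-private. Collections.emptyIterator() behaves the same on every version.
     static <T> Iterator<T> emptyIterator() {
       return Collections.emptyIterator();
     }
   
     public static void main(String[] args) {
       System.out.println(GuavaCompatSketch.<String>emptyIterator().hasNext()); // prints: false
     }
   }
   ```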
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   Yes. This upgrades Guava to 27.
   
   ### How was this patch tested?
   
   
   Built Hive locally.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469527)
Time Spent: 4.5h  (was: 4h 20m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23896) hiveserver2 not listening on any port, am i miss some configurations?

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23896?focusedWorklogId=469524&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469524
 ]

ASF GitHub Bot logged work on HIVE-23896:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 06:19
Start Date: 12/Aug/20 06:19
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1307:
URL: https://github.com/apache/hive/pull/1307#issuecomment-672633192


   @dh20 can you fix the title and the JIRA number? HIVE-23896 seems unrelated 
to this PR.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469524)
Time Spent: 1h 40m  (was: 1.5h)

> hiveserver2 not listening on any port, am i miss some configurations?
> -
>
> Key: HIVE-23896
> URL: https://issues.apache.org/jira/browse/HIVE-23896
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.2
> Environment: hive: 3.1.2
> hadoop: 3.2.1, standalone, url: hdfs://namenode.hadoop.svc.cluster.local:9000
> {quote}$ $HADOOP_HOME/bin/hadoop fs -mkdir /tmp
>  $ $HADOOP_HOME/bin/hadoop fs -mkdir /user/hive/warehouse
> {quote}
> Hadoop commands work in the hiveserver node (pod).
>  
>Reporter: alanwake
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
>  
>  
> I tried to deploy Hive 3.1.2 on k8s; this setup worked on version 2.3.2.
> The metastore node and postgres node are OK, but it looks like the hiveserver is 
> missing some important configuration properties.
> {code:java}
>  {code}
>  
>  
>  
> {code:java}
> [root@master hive]# ./get.sh 
> NAME READY   STATUSRESTARTS   AGE   IP
>  NODE   NOMINATED NODE   READINESS GATES
> hive-7bd48747d4-5zjmh1/1 Running   0  56s   10.244.3.110  
>  node03.51.local  
> metastore-66b58f9f76-6wsxj   1/1 Running   0  56s   10.244.3.109  
>  node03.51.local  
> postgres-57794b99b7-pqxwm1/1 Running   0  56s   10.244.2.241  
>  node02.51.local  NAMETYPECLUSTER-IP  
>  EXTERNAL-IP   PORT(S)   AGE   SELECTOR
> hiveNodePort10.108.40.17 
> 10002:30626/TCP,1:31845/TCP   56s   app=hive
> metastore   ClusterIP   10.106.159.220   9083/TCP   
>56s   app=metastore
> postgresClusterIP   10.108.85.47 5432/TCP   
>56s   app=postgres
> {code}
>  
>  
> {code:java}
> [root@master hive]# kubectl logs hive-7bd48747d4-5zjmh -n=hive
> Configuring core
>  - Setting hadoop.proxyuser.hue.hosts=*
>  - Setting fs.defaultFS=hdfs://namenode.hadoop.svc.cluster.local:9000
>  - Setting hadoop.http.staticuser.user=root
>  - Setting hadoop.proxyuser.hue.groups=*
> Configuring hdfs
>  - Setting dfs.namenode.datanode.registration.ip-hostname-check=false
>  - Setting dfs.webhdfs.enabled=true
>  - Setting dfs.permissions.enabled=false
> Configuring yarn
>  - Setting yarn.timeline-service.enabled=true
>  - Setting yarn.resourcemanager.system-metrics-publisher.enabled=true
>  - Setting 
> yarn.resourcemanager.store.class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
>  - Setting 
> yarn.log.server.url=http://historyserver.hadoop.svc.cluster.local:8188/applicationhistory/logs/
>  - Setting yarn.resourcemanager.fs.state-store.uri=/rmstate
>  - Setting yarn.timeline-service.generic-application-history.enabled=true
>  - Setting yarn.log-aggregation-enable=true
>  - Setting 
> yarn.resourcemanager.hostname=resourcemanager.hadoop.svc.cluster.local
>  - Setting 
> yarn.resourcemanager.resource.tracker.address=resourcemanager.hadoop.svc.cluster.local:8031
>  - Setting 
> yarn.timeline-service.hostname=historyserver.hadoop.svc.cluster.local
>  - Setting 
> yarn.resourcemanager.scheduler.address=resourcemanager.hadoop.svc.cluster.local:8030
>  - Setting 
> yarn.resourcemanager.address=resourcemanager.hadoop.svc.cluster.local:8032
>  - Setting yarn.nodemanager.remote-app-log-dir=/app-logs
>  - Setting yarn.resourcemanager.recovery.enabled=true
> Configuring httpfs
> Configuring kms
> Configuring mapred
> Configuring hive
>  - Setting datanucleus.autoCreateSchema=false
>  - Setting javax.jdo.option.ConnectionPassword=hive
>  - Setting hive.metastore.uris=thrift://metastore:9083
>  - Setting 
> javax.jdo.option.Con

[jira] [Updated] (HIVE-23995) Don't set location for managed tables in case of replication

2020-08-11 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23995:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the patch [~aasha] and the review [~pkumarsinha]

> Don't set location for managed tables in case of replication
> 
>
> Key: HIVE-23995
> URL: https://issues.apache.org/jira/browse/HIVE-23995
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23995.01.patch, HIVE-23995.02.patch, 
> HIVE-23995.03.patch, HIVE-23995.04.patch, HIVE-23995.05.patch, 
> HIVE-23995.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Managed table location should not be set.
> Migration code of replication should be removed.
> Add logging to all ack files.
> Set hive.repl.data.copy.lazy to true.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23896) hiveserver2 not listening on any port, am i miss some configurations?

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23896?focusedWorklogId=469519&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469519
 ]

ASF GitHub Bot logged work on HIVE-23896:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 05:46
Start Date: 12/Aug/20 05:46
Worklog Time Spent: 10m 
  Work Description: dh20 commented on pull request #1307:
URL: https://github.com/apache/hive/pull/1307#issuecomment-672609358


Hi, @sunchao 
   Could you review this PR, please?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469519)
Time Spent: 1.5h  (was: 1h 20m)

> hiveserver2 not listening on any port, am i miss some configurations?
> -
>
> Key: HIVE-23896
> URL: https://issues.apache.org/jira/browse/HIVE-23896
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.2
> Environment: hive: 3.1.2
> hadoop: 3.2.1, standalone, url: hdfs://namenode.hadoop.svc.cluster.local:9000
> {quote}$ $HADOOP_HOME/bin/hadoop fs -mkdir /tmp
>  $ $HADOOP_HOME/bin/hadoop fs -mkdir /user/hive/warehouse
> {quote}
> Hadoop commands work in the hiveserver node (pod).
>  
>Reporter: alanwake
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
>  
>  
> I tried to deploy Hive 3.1.2 on k8s; this setup worked on version 2.3.2.
> The metastore node and postgres node are OK, but it looks like the hiveserver is 
> missing some important configuration properties.
> {code:java}
>  {code}
>  
>  
>  
> {code:java}
> [root@master hive]# ./get.sh 
> NAME READY   STATUSRESTARTS   AGE   IP
>  NODE   NOMINATED NODE   READINESS GATES
> hive-7bd48747d4-5zjmh1/1 Running   0  56s   10.244.3.110  
>  node03.51.local  
> metastore-66b58f9f76-6wsxj   1/1 Running   0  56s   10.244.3.109  
>  node03.51.local  
> postgres-57794b99b7-pqxwm1/1 Running   0  56s   10.244.2.241  
>  node02.51.local  NAMETYPECLUSTER-IP  
>  EXTERNAL-IP   PORT(S)   AGE   SELECTOR
> hiveNodePort10.108.40.17 
> 10002:30626/TCP,1:31845/TCP   56s   app=hive
> metastore   ClusterIP   10.106.159.220   9083/TCP   
>56s   app=metastore
> postgresClusterIP   10.108.85.47 5432/TCP   
>56s   app=postgres
> {code}
>  
>  
> {code:java}
> [root@master hive]# kubectl logs hive-7bd48747d4-5zjmh -n=hive
> Configuring core
>  - Setting hadoop.proxyuser.hue.hosts=*
>  - Setting fs.defaultFS=hdfs://namenode.hadoop.svc.cluster.local:9000
>  - Setting hadoop.http.staticuser.user=root
>  - Setting hadoop.proxyuser.hue.groups=*
> Configuring hdfs
>  - Setting dfs.namenode.datanode.registration.ip-hostname-check=false
>  - Setting dfs.webhdfs.enabled=true
>  - Setting dfs.permissions.enabled=false
> Configuring yarn
>  - Setting yarn.timeline-service.enabled=true
>  - Setting yarn.resourcemanager.system-metrics-publisher.enabled=true
>  - Setting 
> yarn.resourcemanager.store.class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
>  - Setting 
> yarn.log.server.url=http://historyserver.hadoop.svc.cluster.local:8188/applicationhistory/logs/
>  - Setting yarn.resourcemanager.fs.state-store.uri=/rmstate
>  - Setting yarn.timeline-service.generic-application-history.enabled=true
>  - Setting yarn.log-aggregation-enable=true
>  - Setting 
> yarn.resourcemanager.hostname=resourcemanager.hadoop.svc.cluster.local
>  - Setting 
> yarn.resourcemanager.resource.tracker.address=resourcemanager.hadoop.svc.cluster.local:8031
>  - Setting 
> yarn.timeline-service.hostname=historyserver.hadoop.svc.cluster.local
>  - Setting 
> yarn.resourcemanager.scheduler.address=resourcemanager.hadoop.svc.cluster.local:8030
>  - Setting 
> yarn.resourcemanager.address=resourcemanager.hadoop.svc.cluster.local:8032
>  - Setting yarn.nodemanager.remote-app-log-dir=/app-logs
>  - Setting yarn.resourcemanager.recovery.enabled=true
> Configuring httpfs
> Configuring kms
> Configuring mapred
> Configuring hive
>  - Setting datanucleus.autoCreateSchema=false
>  - Setting javax.jdo.option.ConnectionPassword=hive
>  - Setting hive.metastore.uris=thrift://metastore:9083
>  - Setting 
> javax.jdo.option.ConnectionURL=jdbc:postgresql://metastore/met

[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469518&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469518
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 05:25
Start Date: 12/Aug/20 05:25
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1394:
URL: https://github.com/apache/hive/pull/1394#issuecomment-672586082


   Yes that's right. If tests pass on branch-2 we can cherry-pick this to 
branch-2.3 and probably commit it directly, I think.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469518)
Time Spent: 4h 20m  (was: 4h 10m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469517&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469517
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 05:23
Start Date: 12/Aug/20 05:23
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #1394:
URL: https://github.com/apache/hive/pull/1394#issuecomment-672583314


   @sunchao OK, let me try it. But I think in the end the 2.3.8 release should come from 
branch-2.3, right?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469517)
Time Spent: 4h 10m  (was: 4h)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469515&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469515
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:45
Start Date: 12/Aug/20 04:45
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1394:
URL: https://github.com/apache/hive/pull/1394#issuecomment-672569787


   @viirya I wonder if we can create another PR targeting branch-2. It seems 
there is minimal conflict between branch-2.3 and branch-2 for this, and the 
testing seems to be working for branch-2 (see 
[this](https://github.com/apache/hive/pull/1234) as an example).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469515)
Time Spent: 4h  (was: 3h 50m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469510&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469510
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:33
Start Date: 12/Aug/20 04:33
Worklog Time Spent: 10m 
  Work Description: viirya opened a new pull request #1394:
URL: https://github.com/apache/hive/pull/1394


   
   
   ### What changes were proposed in this pull request?
   
   
   This PR proposes to upgrade Guava to 27 in Hive 2.3 branch.
   
   ### Why are the changes needed?
   
   
   When trying to upgrade Guava in Spark, we found the following error. A Guava 
method became package-private in Guava version 20, so there is an 
incompatibility with Guava versions > 19.0.
   
   ```
   sbt.ForkMain$ForkError: sbt.ForkMain$ForkError: 
java.lang.IllegalAccessError: tried to access method 
com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator;
 from class org.apache.hadoop.hive.ql.exec.FetchOperator
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:108)
at 
org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   Yes. This upgrades Guava to 27.
   
   ### How was this patch tested?
   
   
   Built Hive locally.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469510)
Time Spent: 3h 40m  (was: 3.5h)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469511&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469511
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:33
Start Date: 12/Aug/20 04:33
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #1394:
URL: https://github.com/apache/hive/pull/1394#issuecomment-672566845


   cc @sunchao Thanks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469511)
Time Spent: 3h 50m  (was: 3h 40m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469509&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469509
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:29
Start Date: 12/Aug/20 04:29
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1365:
URL: https://github.com/apache/hive/pull/1365#issuecomment-672565803


   @viirya reverted - please ping me on the new PR and let's see if we can 
figure this testing thing out.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469509)
Time Spent: 3.5h  (was: 3h 20m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469508&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469508
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:28
Start Date: 12/Aug/20 04:28
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #1365:
URL: https://github.com/apache/hive/pull/1365#issuecomment-672565439


   No problem at all. @sunchao 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469508)
Time Spent: 3h 20m  (was: 3h 10m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469507&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469507
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:27
Start Date: 12/Aug/20 04:27
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1365:
URL: https://github.com/apache/hive/pull/1365#issuecomment-672565238


   @viirya Yes please do. Sorry about that :(



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469507)
Time Spent: 3h 10m  (was: 3h)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469506&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469506
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:26
Start Date: 12/Aug/20 04:26
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #1365:
URL: https://github.com/apache/hive/pull/1365#issuecomment-672564936


   Oh, I mean a new PR for this change.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469506)
Time Spent: 3h  (was: 2h 50m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469505&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469505
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:25
Start Date: 12/Aug/20 04:25
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1365:
URL: https://github.com/apache/hive/pull/1365#issuecomment-672564659


   No, there is already a PR to revert it (see the message above). I'll get it 
merged soon.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469505)
Time Spent: 2h 50m  (was: 2h 40m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469503&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469503
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:24
Start Date: 12/Aug/20 04:24
Worklog Time Spent: 10m 
  Work Description: viirya commented on pull request #1365:
URL: https://github.com/apache/hive/pull/1365#issuecomment-672564334


   Oh ok. I need to create a new PR for this?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469503)
Time Spent: 2h 40m  (was: 2.5h)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469501&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469501
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:23
Start Date: 12/Aug/20 04:23
Worklog Time Spent: 10m 
  Work Description: sunchao commented on pull request #1365:
URL: https://github.com/apache/hive/pull/1365#issuecomment-672564132


   Oops. Sorry I merged this by mistake. Will create a PR to revert it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469501)
Time Spent: 2.5h  (was: 2h 20m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24030) Upgrade ORC to 1.5.10

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24030?focusedWorklogId=469500&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469500
 ]

ASF GitHub Bot logged work on HIVE-24030:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:22
Start Date: 12/Aug/20 04:22
Worklog Time Spent: 10m 
  Work Description: sunchao merged pull request #1393:
URL: https://github.com/apache/hive/pull/1393


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469500)
Time Spent: 40m  (was: 0.5h)

> Upgrade ORC to 1.5.10
> -
>
> Key: HIVE-24030
> URL: https://issues.apache.org/jira/browse/HIVE-24030
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23998) Upgrave Guava to 27 for Hive 2.3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23998?focusedWorklogId=469499&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469499
 ]

ASF GitHub Bot logged work on HIVE-23998:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:21
Start Date: 12/Aug/20 04:21
Worklog Time Spent: 10m 
  Work Description: sunchao merged pull request #1365:
URL: https://github.com/apache/hive/pull/1365


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469499)
Time Spent: 2h 20m  (was: 2h 10m)

> Upgrave Guava to 27 for Hive 2.3
> 
>
> Key: HIVE-23998
> URL: https://issues.apache.org/jira/browse/HIVE-23998
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.7
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23998.01.branch-2.3.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Try to upgrade Guava to 27.0-jre for Hive 2.3 branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24030) Upgrade ORC to 1.5.10

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24030?focusedWorklogId=469498&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469498
 ]

ASF GitHub Bot logged work on HIVE-24030:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:10
Start Date: 12/Aug/20 04:10
Worklog Time Spent: 10m 
  Work Description: dongjoon-hyun commented on pull request #1393:
URL: https://github.com/apache/hive/pull/1393#issuecomment-672561185


   Thank you so much, @sunchao !



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469498)
Time Spent: 0.5h  (was: 20m)

> Upgrade ORC to 1.5.10
> -
>
> Key: HIVE-24030
> URL: https://issues.apache.org/jira/browse/HIVE-24030
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23995) Don't set location for managed tables in case of replication

2020-08-11 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23995:
---
Description: 
Managed table location should not be set.
Migration code of replication should be removed.
Add logging to all ack files.
Set hive.repl.data.copy.lazy to true.

> Don't set location for managed tables in case of replication
> 
>
> Key: HIVE-23995
> URL: https://issues.apache.org/jira/browse/HIVE-23995
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23995.01.patch, HIVE-23995.02.patch, 
> HIVE-23995.03.patch, HIVE-23995.04.patch, HIVE-23995.05.patch, 
> HIVE-23995.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Managed table location should not be set.
> Migration code of replication should be removed.
> Add logging to all ack files.
> Set hive.repl.data.copy.lazy to true.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24030) Upgrade ORC to 1.5.10

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24030?focusedWorklogId=469497&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469497
 ]

ASF GitHub Bot logged work on HIVE-24030:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 04:07
Start Date: 12/Aug/20 04:07
Worklog Time Spent: 10m 
  Work Description: dongjoon-hyun commented on pull request #1393:
URL: https://github.com/apache/hive/pull/1393#issuecomment-672560404


   Hi, @sunchao .
   Could you review this PR, please?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469497)
Time Spent: 20m  (was: 10m)

> Upgrade ORC to 1.5.10
> -
>
> Key: HIVE-24030
> URL: https://issues.apache.org/jira/browse/HIVE-24030
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23995) Don't set location for managed tables in case of replication

2020-08-11 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23995:
---
Summary: Don't set location for managed tables in case of replication  
(was: Don't set location for managed tables in case of repl load)

> Don't set location for managed tables in case of replication
> 
>
> Key: HIVE-23995
> URL: https://issues.apache.org/jira/browse/HIVE-23995
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23995.01.patch, HIVE-23995.02.patch, 
> HIVE-23995.03.patch, HIVE-23995.04.patch, HIVE-23995.05.patch, 
> HIVE-23995.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-18325) Config to do case unaware schema evolution to ORC reader.

2020-08-11 Thread zhaolong (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-18325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175952#comment-17175952
 ] 

zhaolong commented on HIVE-18325:
-

[~liujiayi771] Yes, you are right.

> Config to do case unaware schema evolution to ORC reader.
> -
>
> Key: HIVE-18325
> URL: https://issues.apache.org/jira/browse/HIVE-18325
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: piyush mukati
>Priority: Critical
>
> In the case of ORC, the reader schema passed by Hive is all lowercase, so if a 
> column name stored in the file contains any uppercase characters, the reader 
> returns null values for that column even though the data is present in the file. 
> Column-name matching during schema evolution should be case-unaware, and Hive 
> needs to pass a config for this. The config (orc.schema.evolution.case.sensitive) 
> in ORC will be exposed by 
> https://issues.apache.org/jira/browse/ORC-264
> 
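
For illustration, a minimal sketch of how such a flag could be set once ORC-264 exposes it, assuming a plain Hadoop Configuration object (this is not taken from the Hive patch):

{code:java}
import org.apache.hadoop.conf.Configuration;

public final class OrcCaseInsensitiveEvolutionSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Request case-unaware column-name matching during ORC schema evolution,
    // using the orc.schema.evolution.case.sensitive config described above.
    conf.setBoolean("orc.schema.evolution.case.sensitive", false);
    System.out.println(conf.getBoolean("orc.schema.evolution.case.sensitive", true));
  }
}
{code}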



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-20033) Backport HIVE-19432 to branch-2, branch-3

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20033?focusedWorklogId=469469&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469469
 ]

ASF GitHub Bot logged work on HIVE-20033:
-

Author: ASF GitHub Bot
Created on: 12/Aug/20 00:37
Start Date: 12/Aug/20 00:37
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #381:
URL: https://github.com/apache/hive/pull/381#issuecomment-672397556


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469469)
Time Spent: 1h  (was: 50m)

> Backport HIVE-19432 to branch-2, branch-3
> -
>
> Key: HIVE-20033
> URL: https://issues.apache.org/jira/browse/HIVE-20033
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20033.02-branch-3.patch, 
> HIVE-20033.03-branch-3.patch, HIVE-20033.1.branch-2.patch, 
> HIVE-20033.1.branch-3.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Backport HIVE-19432 to branch-2, branch-3



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24030) Upgrade ORC to 1.5.10

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24030?focusedWorklogId=469447&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469447
 ]

ASF GitHub Bot logged work on HIVE-24030:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 23:30
Start Date: 11/Aug/20 23:30
Worklog Time Spent: 10m 
  Work Description: dongjoon-hyun opened a new pull request #1393:
URL: https://github.com/apache/hive/pull/1393


   ### What changes were proposed in this pull request?
   
   This PR aims to upgrade ORC to 1.5.10.
   
   ### Why are the changes needed?
   
   This will bring the latest bug fixes.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Pass the CI.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469447)
Remaining Estimate: 0h
Time Spent: 10m

> Upgrade ORC to 1.5.10
> -
>
> Key: HIVE-24030
> URL: https://issues.apache.org/jira/browse/HIVE-24030
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24030) Upgrade ORC to 1.5.10

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24030:
--
Labels: pull-request-available  (was: )

> Upgrade ORC to 1.5.10
> -
>
> Key: HIVE-24030
> URL: https://issues.apache.org/jira/browse/HIVE-24030
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23952) Reuse VectorAggregationBuffer to reduce GC pressure in VectorGroupByOperator

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23952?focusedWorklogId=469430&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469430
 ]

ASF GitHub Bot logged work on HIVE-23952:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 21:54
Start Date: 11/Aug/20 21:54
Worklog Time Spent: 10m 
  Work Description: mustafaiman closed pull request #1337:
URL: https://github.com/apache/hive/pull/1337


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469430)
Time Spent: 40m  (was: 0.5h)

> Reuse VectorAggregationBuffer to reduce GC pressure in VectorGroupByOperator
> 
>
> Key: HIVE-23952
> URL: https://issues.apache.org/jira/browse/HIVE-23952
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Screenshot 2020-07-30 at 7.38.13 AM.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> !Screenshot 2020-07-30 at 7.38.13 AM.png|width=1171,height=892!
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java]
> {code:java}
> aggregationBuffer = allocateAggregationBuffer(); {code}
> Flushed-out aggregation buffers could be reused instead of being allocated 
> every time here, to reduce GC pressure.
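
For illustration, a minimal sketch of the reuse idea with a hypothetical free-list helper; the actual VectorGroupByOperator change may look quite different:

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical helper, not the actual Hive patch: flushed aggregation buffers are
// parked in a free list and handed back out instead of being reallocated each time.
final class AggregationBufferPool<T> {
  private final Deque<T> free = new ArrayDeque<>();
  private final Supplier<T> allocator;

  AggregationBufferPool(Supplier<T> allocator) {
    this.allocator = allocator;
  }

  T borrow() {
    T buffer = free.poll();
    return buffer != null ? buffer : allocator.get(); // allocate only when the pool is empty
  }

  void recycle(T buffer) {
    free.push(buffer); // caller is expected to reset the buffer before handing it back
  }
}
{code}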



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23975) Reuse evicted keys from aggregation buffers

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23975?focusedWorklogId=469429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469429
 ]

ASF GitHub Bot logged work on HIVE-23975:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 21:54
Start Date: 11/Aug/20 21:54
Worklog Time Spent: 10m 
  Work Description: mustafaiman closed pull request #1352:
URL: https://github.com/apache/hive/pull/1352


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469429)
Time Spent: 50m  (was: 40m)

> Reuse evicted keys from aggregation buffers
> ---
>
> Key: HIVE-23975
> URL: https://issues.apache.org/jira/browse/HIVE-23975
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24011) Flaky test AsyncResponseHandlerTest

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24011?focusedWorklogId=469428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469428
 ]

ASF GitHub Bot logged work on HIVE-24011:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 21:53
Start Date: 11/Aug/20 21:53
Worklog Time Spent: 10m 
  Work Description: mustafaiman closed pull request #1381:
URL: https://github.com/apache/hive/pull/1381


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469428)
Time Spent: 0.5h  (was: 20m)

> Flaky test AsyncResponseHandlerTest
> ---
>
> Key: HIVE-24011
> URL: https://issues.apache.org/jira/browse/HIVE-24011
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> http://ci.hive.apache.org/job/hive-precommit/job/PR-1343/6/testReport/junit/org.apache.hadoop.hive.llap/AsyncResponseHandlerTest/Testing___split_01___Archive___testStress/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values

2020-08-11 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175829#comment-17175829
 ] 

Jesus Camacho Rodriguez commented on HIVE-23927:


[~prasad-acit], I remember discussing a similar type conversion issue in ORC 
too, since ORC schema evolution copied the automatic conversions that Hive was 
doing (probably in the context of ORC-539 / ORC-554). I think the conversion is 
uniform across different types in ORC now... That would cause some backwards 
compatibility issues; however, I am not sure how common schema evolution from 
those types is. [~abstractdog], do you recall what was done for ORC?

Cc [~omalley]

> Cast to Timestamp generates different output for Integer & Float values 
> 
>
> Key: HIVE-23927
> URL: https://issues.apache.org/jira/browse/HIVE-23927
> Project: Hive
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Priority: Major
>
> A Double input is treated as SECONDS and converted into millis internally,
> whereas an Integer value is treated as millis, which produces different 
> output.
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getTimestamp(Object,
>  PrimitiveObjectInspector, boolean) - Handles Integral & Decimal values 
> differently. This causes the issue.
> 0: jdbc:hive2://localhost:1> select cast(1.204135216E9 as timestamp) 
> Double2TimeStamp, cast(1204135216 as timestamp) Int2TimeStamp from abc 
> tablesample(1 rows);
> OK
> INFO  : Compiling 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14): 
> select cast(1.204135216E9 as timestamp) Double2TimeStamp, cast(1204135216 as 
> timestamp) Int2TimeStamp from abc tablesample(1 rows)
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:double2timestamp, type:timestamp, 
> comment:null), FieldSchema(name:int2timestamp, type:timestamp, 
> comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14); 
> Time taken: 0.175 seconds
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Executing 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14): 
> select cast(1.204135216E9 as timestamp) Double2TimeStamp, cast(1204135216 as 
> timestamp) Int2TimeStamp from abc tablesample(1 rows)
> INFO  : Completed executing 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14); 
> Time taken: 0.001 seconds
> INFO  : OK
> INFO  : Concurrency mode is disabled, not creating a lock manager
> ++--+
> |double2timestamp|  int2timestamp   |
> ++--+
> | 2008-02-27 18:00:16.0  | 1970-01-14 22:28:55.216  |
> ++--+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24026) HMS/Ranger Spark view authorization plan

2020-08-11 Thread Sai Hemanth Gantasala (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175828#comment-17175828
 ] 

Sai Hemanth Gantasala commented on HIVE-24026:
--

+Using a Deferred View+: use a special tag, e.g. authorized=false, in table 
properties to indicate that the view was created outside of HS2/Ranger 
authorization, meaning we did not look inside the view when it was created and 
therefore cannot tell whether it is legitimate to allow selection from that 
view in HS2. 

So the original view statement is simply stored and the authorized flag is set 
to false. Then, each time we try to SELECT (from HS2), we have to check:
 # Is there a Ranger policy granting me access to the view? If yes, proceed as 
if the view were authorized. This assumes that only a superuser could have 
granted me SELECT privileges on the view.
 # If there is a Ranger policy revoking access to the view, throw an error.
 # If there is no Ranger policy, resolve the view and check that we have access 
to each underlying table (just like executing the original SQL statement) -> 
deferred AuthZ check. A rough sketch of this flow follows below.
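The sketch below only shows the decision shape; PolicyDecision, RangerClient and ViewResolver are illustrative stand-ins, not the real Ranger or HS2 APIs:
{code:java}
import java.util.List;

enum PolicyDecision { ALLOW, DENY, NONE }

interface RangerClient {
  PolicyDecision checkSelectOnView(String user, String view);
  boolean canSelect(String user, String table);
}

interface ViewResolver {
  List<String> underlyingTables(String view);
}

class DeferredViewAuthorizer {
  private final RangerClient ranger;
  private final ViewResolver resolver;

  DeferredViewAuthorizer(RangerClient ranger, ViewResolver resolver) {
    this.ranger = ranger;
    this.resolver = resolver;
  }

  /** Decides whether SELECT on a view tagged authorized=false should be allowed. */
  boolean allowSelect(String user, String view) {
    PolicyDecision decision = ranger.checkSelectOnView(user, view);
    if (decision == PolicyDecision.ALLOW) {
      return true;    // 1. an explicit policy grants access
    }
    if (decision == PolicyDecision.DENY) {
      return false;   // 2. an explicit policy revokes access
    }
    // 3. no policy: fall back to checking every underlying table (deferred AuthZ)
    return resolver.underlyingTables(view).stream()
        .allMatch(table -> ranger.canSelect(user, table));
  }
}
{code}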

> HMS/Ranger Spark view authorization plan
> 
>
> Key: HIVE-24026
> URL: https://issues.apache.org/jira/browse/HIVE-24026
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Security
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>
> Currently, Ranger disallows Spark from creating virtual views via HMS because 
> Spark clients are normal users. We should have a capability where Spark 
> clients can create views in HS2.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23890) Create HMS endpoint for querying file lists using FlatBuffers as serialization

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23890?focusedWorklogId=469386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469386
 ]

ASF GitHub Bot logged work on HIVE-23890:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 20:02
Start Date: 11/Aug/20 20:02
Worklog Time Spent: 10m 
  Work Description: vihangk1 commented on a change in pull request #1330:
URL: https://github.com/apache/hive/pull/1330#discussion_r468832097



##
File path: 
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##
@@ -1861,6 +1861,19 @@ struct ScheduledQueryProgressInfo{
   4: optional string errorMessage,
 }
 
+struct GetFileListRequest {
+  1: optional string catName,
+  2: optional string dbName,
+  3: optional string tableName,
+  4: optional list<string> partVals,

Review comment:
   I think there is a trade-off here. On larger tables with lots of 
partitions, doing multiple RPCs to the metastore to fetch the file metadata 
one partition at a time is not only less efficient, it also makes it likely 
that the ValidWriteIdList is updated for the table in the meantime, so the 
cache hit ratio could go down. You are right about large amounts of data being 
sent over the network, but in my experience the file metadata we are sending 
here is a few hundred bytes per partition, and it is not very large even for a 
few thousand partitions. If we use a partitionNames list here in the request, 
clients can always do batching, e.g. requesting 1000 partitions at a time, 
which would be more efficient.
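A minimal sketch of that client-side batching idea; the Metastore interface and getFileList signature below are simplified stand-ins for the proposed thrift API, not the actual HMS client:
{code:java}
import java.util.List;

class FileListFetcher {
  private static final int BATCH_SIZE = 1000;

  // Simplified stand-in for the proposed HMS call.
  interface Metastore {
    byte[] getFileList(String db, String table, List<String> partNames, String validWriteIdList);
  }

  void fetchAll(Metastore hms, String db, String table,
                List<String> allPartNames, String validWriteIdList) {
    for (int from = 0; from < allPartNames.size(); from += BATCH_SIZE) {
      int to = Math.min(from + BATCH_SIZE, allPartNames.size());
      // One RPC per batch of partition names instead of one per partition.
      byte[] fileListData =
          hms.getFileList(db, table, allPartNames.subList(from, to), validWriteIdList);
      // ... decode the FlatBuffer payload for this batch ...
    }
  }
}
{code}
The batch size is the tunable trade-off between RPC count and per-response payload size.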





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469386)
Time Spent: 2h 20m  (was: 2h 10m)

> Create HMS endpoint for querying file lists using FlatBuffers as serialization
> --
>
> Key: HIVE-23890
> URL: https://issues.apache.org/jira/browse/HIVE-23890
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Barnabas Maidics
>Assignee: Barnabas Maidics
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> New thrift objects would be:
> {code:java}
> struct GetFileListRequest {
> 1: optional string catName,
> 2: required string dbName,
> 3: required string tableName,
> 4: required list<string> partVals,
> 6: optional string validWriteIdList
> }
> struct GetFileListResponse {
> 1: required binary fileListData
> }
> {code}
> Where GetFileListResponse contains a binary field, which would be a 
> FlatBuffer object



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values

2020-08-11 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175788#comment-17175788
 ] 

Renukaprasad C commented on HIVE-23927:
---

Thanks [~klcopp].
[~jcamachorodriguez], please suggest how to proceed with this compatibility 
issue. Thank you.

> Cast to Timestamp generates different output for Integer & Float values 
> 
>
> Key: HIVE-23927
> URL: https://issues.apache.org/jira/browse/HIVE-23927
> Project: Hive
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Priority: Major
>
> A Double input is treated as SECONDS and converted into millis internally,
> whereas an Integer value is treated as millis, which produces different 
> output.
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getTimestamp(Object,
>  PrimitiveObjectInspector, boolean) - Handles Integral & Decimal values 
> differently. This causes the issue.
> 0: jdbc:hive2://localhost:1> select cast(1.204135216E9 as timestamp) 
> Double2TimeStamp, cast(1204135216 as timestamp) Int2TimeStamp from abc 
> tablesample(1 rows);
> OK
> INFO  : Compiling 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14): 
> select cast(1.204135216E9 as timestamp) Double2TimeStamp, cast(1204135216 as 
> timestamp) Int2TimeStamp from abc tablesample(1 rows)
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:double2timestamp, type:timestamp, 
> comment:null), FieldSchema(name:int2timestamp, type:timestamp, 
> comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14); 
> Time taken: 0.175 seconds
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Executing 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14): 
> select cast(1.204135216E9 as timestamp) Double2TimeStamp, cast(1204135216 as 
> timestamp) Int2TimeStamp from abc tablesample(1 rows)
> INFO  : Completed executing 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14); 
> Time taken: 0.001 seconds
> INFO  : OK
> INFO  : Concurrency mode is disabled, not creating a lock manager
> ++--+
> |double2timestamp|  int2timestamp   |
> ++--+
> | 2008-02-27 18:00:16.0  | 1970-01-14 22:28:55.216  |
> ++--+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23880) Bloom filters can be merged in a parallel way in VectorUDAFBloomFilterMerge

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23880?focusedWorklogId=469330&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469330
 ]

ASF GitHub Bot logged work on HIVE-23880:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 17:33
Start Date: 11/Aug/20 17:33
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1280:
URL: https://github.com/apache/hive/pull/1280#discussion_r468748810



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFBloomFilterMerge.java
##
@@ -77,6 +75,211 @@ public void reset() {
   // Do not change the initial bytes which contain 
NumHashFunctions/NumBits!
   Arrays.fill(bfBytes, BloomKFilter.START_OF_SERIALIZED_LONGS, 
bfBytes.length, (byte) 0);
 }
+
+public boolean mergeBloomFilterBytesFromInputColumn(BytesColumnVector 
inputColumn,
+int batchSize, boolean selectedInUse, int[] selected, Configuration 
conf) {
+  // already set in previous iterations, no need to call initExecutor again
+  if (numThreads == 0) {
+return false;
+  }
+  if (executor == null) {
+initExecutor(conf, batchSize);
+if (!isParallel) {
+  return false;
+}
+  }
+
+  // split every bloom filter (represented by a part of a byte[]) across 
workers
+  for (int j = 0; j < batchSize; j++) {
+if (!selectedInUse && inputColumn.noNulls) {
+  splitVectorAcrossWorkers(workers, inputColumn.vector[j], 
inputColumn.start[j],
+  inputColumn.length[j]);
+} else if (!selectedInUse) {
+  if (!inputColumn.isNull[j]) {
+splitVectorAcrossWorkers(workers, inputColumn.vector[j], 
inputColumn.start[j],
+inputColumn.length[j]);
+  }
+} else if (inputColumn.noNulls) {
+  int i = selected[j];
+  splitVectorAcrossWorkers(workers, inputColumn.vector[i], 
inputColumn.start[i],
+  inputColumn.length[i]);
+} else {
+  int i = selected[j];
+  if (!inputColumn.isNull[i]) {
+splitVectorAcrossWorkers(workers, inputColumn.vector[i], 
inputColumn.start[i],
+inputColumn.length[i]);
+  }
+}
+  }
+
+  return true;
+}
+
+private void initExecutor(Configuration conf, int batchSize) {
+  numThreads = 
conf.getInt(HiveConf.ConfVars.TEZ_BLOOM_FILTER_MERGE_THREADS.varname,
+  HiveConf.ConfVars.TEZ_BLOOM_FILTER_MERGE_THREADS.defaultIntVal);
+  LOG.info("Number of threads used for bloom filter merge: {}", 
numThreads);
+
+  if (numThreads < 0) {
+throw new RuntimeException(
+"invalid number of threads for bloom filter merge: " + numThreads);
+  }
+  if (numThreads == 0) { // disable parallel feature
+return; // this will leave isParallel=false
+  }
+  isParallel = true;
+  executor = Executors.newFixedThreadPool(numThreads);
+
+  workers = new BloomFilterMergeWorker[numThreads];
+  for (int f = 0; f < numThreads; f++) {
+workers[f] = new BloomFilterMergeWorker(bfBytes, 0, bfBytes.length);
+  }
+
+  for (int f = 0; f < numThreads; f++) {
+executor.submit(workers[f]);
+  }
+}
+
+public int getNumberOfWaitingMergeTasks(){
+  int size = 0;
+  for (BloomFilterMergeWorker w : workers){
+size += w.queue.size();
+  }
+  return size;
+}
+
+public int getNumberOfMergingWorkers() {

Review comment:
   I see this method is used only for logging. What is the benefit of 
having that log? I am asking because if we get rid of this method, we can also 
get rid of the isMerging atomic variable.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469330)
Time Spent: 3h 40m  (was: 3.5h)

> Bloom filters can be merged in a parallel way in VectorUDAFBloomFilterMerge
> ---
>
> Key: HIVE-23880
> URL: https://issues.apache.org/jira/browse/HIVE-23880
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: lipwig-output3605036885489193068.svg
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Merging bloom filters in semijoin reduction can become the main bottleneck in 
> case of a large number of source mapper tasks

[jira] [Work logged] (HIVE-23880) Bloom filters can be merged in a parallel way in VectorUDAFBloomFilterMerge

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23880?focusedWorklogId=469328&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469328
 ]

ASF GitHub Bot logged work on HIVE-23880:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 17:27
Start Date: 11/Aug/20 17:27
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1280:
URL: https://github.com/apache/hive/pull/1280#discussion_r468723475



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java
##
@@ -252,6 +258,13 @@ protected VectorAggregationBufferRow 
allocateAggregationBuffer() throws HiveExce
   return bufferSet;
 }
 
+protected void finishAggregators(boolean aborted) {

Review comment:
   Instead of `finishAggregators`, can you make this method the default `close` 
method for `ProcessingModeBase` and call `super.close(boolean)` from the close 
methods of the appropriate subclasses? That way the common finalization code 
would live in `close` of the common superclass and the specific finalization 
code would be in the `close` method of each subclass.
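A small sketch of the shape being suggested (hypothetical class names, not the actual patch): shared cleanup in the base close(), mode-specific cleanup layered on top via super.close():
{code:java}
abstract class ProcessingModeBaseSketch {
  void close(boolean aborted) {
    // common finalization for every mode, e.g. finishing/aborting aggregators
  }
}

class HashModeSketch extends ProcessingModeBaseSketch {
  @Override
  void close(boolean aborted) {
    super.close(aborted);   // run the shared cleanup first
    // mode-specific cleanup afterwards
  }
}
{code}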

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java
##
@@ -517,6 +532,10 @@ public void close(boolean aborted) throws HiveException {
 
 }
 
+//TODO: implement finishAggregators
+protected void finishAggregators(boolean aborted) {

Review comment:
   What about this mode? It seems incomplete.

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java
##
@@ -1126,6 +1137,7 @@ protected void initializeOp(Configuration hconf) throws 
HiveException {
 VectorAggregateExpression vecAggrExpr = null;
 try {
   vecAggrExpr = ctor.newInstance(vecAggrDesc);
+  vecAggrExpr.withConf(hconf);

Review comment:
   Why is `withConf` a separate method? Conf should be a parameter to 
VectorAggregateExpression's constructor.
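For comparison, a minimal sketch of the constructor-injection alternative raised here (simplified, hypothetical signature, not the actual class):
{code:java}
import org.apache.hadoop.conf.Configuration;

class VectorAggregateExpressionSketch {
  private final Configuration conf;

  VectorAggregateExpressionSketch(Object vecAggrDesc, Configuration conf) {
    this.conf = conf;  // conf is available from construction on; no separate withConf() call needed
  }
}
{code}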

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFBloomFilterMerge.java
##
@@ -77,6 +75,211 @@ public void reset() {
   // Do not change the initial bytes which contain 
NumHashFunctions/NumBits!
   Arrays.fill(bfBytes, BloomKFilter.START_OF_SERIALIZED_LONGS, 
bfBytes.length, (byte) 0);
 }
+
+public boolean mergeBloomFilterBytesFromInputColumn(BytesColumnVector 
inputColumn,
+int batchSize, boolean selectedInUse, int[] selected, Configuration 
conf) {
+  // already set in previous iterations, no need to call initExecutor again
+  if (numThreads == 0) {
+return false;
+  }
+  if (executor == null) {
+initExecutor(conf, batchSize);
+if (!isParallel) {
+  return false;
+}
+  }
+
+  // split every bloom filter (represented by a part of a byte[]) across 
workers
+  for (int j = 0; j < batchSize; j++) {
+if (!selectedInUse && inputColumn.noNulls) {
+  splitVectorAcrossWorkers(workers, inputColumn.vector[j], 
inputColumn.start[j],
+  inputColumn.length[j]);
+} else if (!selectedInUse) {
+  if (!inputColumn.isNull[j]) {
+splitVectorAcrossWorkers(workers, inputColumn.vector[j], 
inputColumn.start[j],
+inputColumn.length[j]);
+  }
+} else if (inputColumn.noNulls) {
+  int i = selected[j];
+  splitVectorAcrossWorkers(workers, inputColumn.vector[i], 
inputColumn.start[i],
+  inputColumn.length[i]);
+} else {
+  int i = selected[j];
+  if (!inputColumn.isNull[i]) {
+splitVectorAcrossWorkers(workers, inputColumn.vector[i], 
inputColumn.start[i],
+inputColumn.length[i]);
+  }
+}
+  }
+
+  return true;
+}
+
+private void initExecutor(Configuration conf, int batchSize) {
+  numThreads = 
conf.getInt(HiveConf.ConfVars.TEZ_BLOOM_FILTER_MERGE_THREADS.varname,
+  HiveConf.ConfVars.TEZ_BLOOM_FILTER_MERGE_THREADS.defaultIntVal);
+  LOG.info("Number of threads used for bloom filter merge: {}", 
numThreads);
+
+  if (numThreads < 0) {
+throw new RuntimeException(
+"invalid number of threads for bloom filter merge: " + numThreads);
+  }
+  if (numThreads == 0) { // disable parallel feature
+return; // this will leave isParallel=false
+  }
+  isParallel = true;
+  executor = Executors.newFixedThreadPool(numThreads);
+
+  workers = new BloomFilterMergeWorker[numThreads];
+  for (int f = 0; f < numThreads; f++) {
+workers[f] = new BloomFilterMergeWorker(bfBytes, 0, bfBytes.length);
+  }
+
+  for (int f = 0; f < numThreads; f++) {
+executor.submit(workers[f]);
+  }
+}
+
+public int getNumberOfWaitingMergeTasks(){
+ 

[jira] [Assigned] (HIVE-24026) HMS/Ranger Spark view authorization plan

2020-08-11 Thread Sai Hemanth Gantasala (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Hemanth Gantasala reassigned HIVE-24026:



> HMS/Ranger Spark view authorization plan
> 
>
> Key: HIVE-24026
> URL: https://issues.apache.org/jira/browse/HIVE-24026
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Security
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>
> Currently, Ranger disallows Spark from creating virtual views via HMS because 
> Spark clients are normal users. We should have a capability where Spark 
> clients can create views in HS2.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22126) hive-exec packaging should shade guava

2020-08-11 Thread zhengchenyu (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175634#comment-17175634
 ] 

zhengchenyu commented on HIVE-22126:


[~euigeun_chung] I solved this problem by excluding the calcite package from 
bin.tar.gz. Sorry for the huge gap; my version (3.2.1) is much older than master.

> hive-exec packaging should shade guava
> --
>
> Key: HIVE-22126
> URL: https://issues.apache.org/jira/browse/HIVE-22126
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Eugene Chung
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22126.01.patch, HIVE-22126.02.patch, 
> HIVE-22126.03.patch, HIVE-22126.04.patch, HIVE-22126.05.patch, 
> HIVE-22126.06.patch, HIVE-22126.07.patch, HIVE-22126.08.patch, 
> HIVE-22126.09.patch, HIVE-22126.09.patch, HIVE-22126.09.patch, 
> HIVE-22126.09.patch, HIVE-22126.09.patch
>
>
> The ql/pom.xml includes the complete guava library in hive-exec.jar 
> https://github.com/apache/hive/blob/master/ql/pom.xml#L990 This causes 
> problems for downstream clients of hive which have hive-exec.jar in their 
> classpath, since they are pinned to the same guava version as that of hive. 
> We should shade guava classes so that other components which depend on 
> hive-exec can independently use a different version of guava as needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (HIVE-22126) hive-exec packaging should shade guava

2020-08-11 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated HIVE-22126:
---
Comment: was deleted

(was: [~euigeun_chung] I think it's not a version problem. If we want to shade 
guava, we need to shade guava from all other components (for 
example: hadoop, spark), not just make it work on a specific version. )

> hive-exec packaging should shade guava
> --
>
> Key: HIVE-22126
> URL: https://issues.apache.org/jira/browse/HIVE-22126
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Eugene Chung
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22126.01.patch, HIVE-22126.02.patch, 
> HIVE-22126.03.patch, HIVE-22126.04.patch, HIVE-22126.05.patch, 
> HIVE-22126.06.patch, HIVE-22126.07.patch, HIVE-22126.08.patch, 
> HIVE-22126.09.patch, HIVE-22126.09.patch, HIVE-22126.09.patch, 
> HIVE-22126.09.patch, HIVE-22126.09.patch
>
>
> The ql/pom.xml includes the complete guava library in hive-exec.jar 
> https://github.com/apache/hive/blob/master/ql/pom.xml#L990 This causes 
> problems for downstream clients of hive which have hive-exec.jar in their 
> classpath, since they are pinned to the same guava version as that of hive. 
> We should shade guava classes so that other components which depend on 
> hive-exec can independently use a different version of guava as needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24025) Add getTable and getAggrStatsFor to HS2 cache

2020-08-11 Thread Soumyakanti Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumyakanti Das reassigned HIVE-24025:
--


> Add getTable and getAggrStatsFor to HS2 cache
> -
>
> Key: HIVE-24025
> URL: https://issues.apache.org/jira/browse/HIVE-24025
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Minor
>
> getAggrColStats API takes a long time to run in HMS. Adding it to the HS2 
> local cache can reduce the query compilation time significantly.
> Local cache was introduced in https://issues.apache.org/jira/browse/HIVE-23949



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23958) HiveServer2 should support additional keystore/truststores types besides JKS

2020-08-11 Thread Naveen Gangam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175580#comment-17175580
 ] 

Naveen Gangam commented on HIVE-23958:
--

[~krisden] Typically in hive, there is a 1 day layover between approval and 
commit. I will commit this today.

> HiveServer2 should support additional keystore/truststores types besides JKS
> 
>
> Key: HIVE-23958
> URL: https://issues.apache.org/jira/browse/HIVE-23958
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Kevin Risden
>Assignee: Kevin Risden
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently HiveServer2 (through Jetty and Thrift) only supports JKS (and 
> PKCS12 based on JDK fallback) keystore/truststore types. There are additional 
> keystore/truststore types used for different applications like for FIPS 
> crypto algorithms. HS2 should support the default keystore type specified for 
> the JDK and not always use JKS.
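For illustration (not the HS2 patch itself), the JDK already exposes the default keystore type, so a hard-coded "JKS" can be avoided:
{code:java}
import java.io.FileInputStream;
import java.security.KeyStore;

public class KeystoreTypeDemo {
  public static void main(String[] args) throws Exception {
    // KeyStore.getDefaultType() honors the JDK's keystore.type security property
    // (e.g. "pkcs12" on recent JDKs, or a FIPS provider's type).
    KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
    try (FileInputStream in = new FileInputStream(args[0])) {
      ks.load(in, args[1].toCharArray());
    }
    System.out.println("Loaded keystore of type " + ks.getType());
  }
}
{code}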



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-23959) Provide an option to wipe out column stats for partitioned tables in case of column removal

2020-08-11 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-23959.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

pushed to master. Thank you Peter for reviewing the changes!

> Provide an option to wipe out column stats for partitioned tables in case of 
> column removal
> ---
>
> Key: HIVE-23959
> URL: https://issues.apache.org/jira/browse/HIVE-23959
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> in case of column removal / replacement - an update for each partition is 
> necessary, which could take a while.
> The goal here is to provide an option to switch to the bulk removal of column 
> statistics instead of working hard to retain as much as possible of the old 
> stats.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23959) Provide an option to wipe out column stats for partitioned tables in case of column removal

2020-08-11 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175578#comment-17175578
 ] 

Zoltan Haindrich commented on HIVE-23959:
-

[~yhaya]: sorry, I missed your comment.

Yes, the key difference when this feature is enabled is that the careful 1-by-1 
partition update is skipped; instead, it removes all column statistics for 
all partitions of the table. It only executes a few queries - independently 
of the number of partitions - so it will be quick.
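To make the difference concrete, a hypothetical sketch (StatsStore is an illustrative stand-in, not the metastore API): per-partition updates scale with the number of partitions, while the bulk path issues a constant number of statements:
{code:java}
import java.util.List;

class ColumnStatsCleanup {
  // Illustrative stand-in for the metastore-side operations.
  interface StatsStore {
    void updatePartitionColumnStats(String table, String partition);  // one call per partition
    void deleteAllColumnStatsForTable(String table);                  // a few bulk statements
  }

  void onColumnRemoved(StatsStore store, String table, List<String> partitions, boolean bulkWipe) {
    if (bulkWipe) {
      // O(1) statements regardless of partition count
      store.deleteAllColumnStatsForTable(table);
    } else {
      // careful 1-by-1 update: O(number of partitions)
      for (String partition : partitions) {
        store.updatePartitionColumnStats(table, partition);
      }
    }
  }
}
{code}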

> Provide an option to wipe out column stats for partitioned tables in case of 
> column removal
> ---
>
> Key: HIVE-23959
> URL: https://issues.apache.org/jira/browse/HIVE-23959
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> in case of column removal / replacement - an update for each partition is 
> necessary, which could take a while.
> The goal here is to provide an option to switch to the bulk removal of column 
> statistics instead of working hard to retain as much as possible of the old 
> stats.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values

2020-08-11 Thread Karen Coppage (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175576#comment-17175576
 ] 

Karen Coppage commented on HIVE-23927:
--

I think [~jcamachorodriguez] is your person.

> Cast to Timestamp generates different output for Integer & Float values 
> 
>
> Key: HIVE-23927
> URL: https://issues.apache.org/jira/browse/HIVE-23927
> Project: Hive
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Priority: Major
>
> A Double input is treated as SECONDS and converted into millis internally,
> whereas an Integer value is treated as millis, which produces different 
> output.
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getTimestamp(Object,
>  PrimitiveObjectInspector, boolean) - Handles Integral & Decimal values 
> differently. This causes the issue.
> 0: jdbc:hive2://localhost:1> select cast(1.204135216E9 as timestamp) 
> Double2TimeStamp, cast(1204135216 as timestamp) Int2TimeStamp from abc 
> tablesample(1 rows);
> OK
> INFO  : Compiling 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14): 
> select cast(1.204135216E9 as timestamp) Double2TimeStamp, cast(1204135216 as 
> timestamp) Int2TimeStamp from abc tablesample(1 rows)
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:double2timestamp, type:timestamp, 
> comment:null), FieldSchema(name:int2timestamp, type:timestamp, 
> comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14); 
> Time taken: 0.175 seconds
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Executing 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14): 
> select cast(1.204135216E9 as timestamp) Double2TimeStamp, cast(1204135216 as 
> timestamp) Int2TimeStamp from abc tablesample(1 rows)
> INFO  : Completed executing 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14); 
> Time taken: 0.001 seconds
> INFO  : OK
> INFO  : Concurrency mode is disabled, not creating a lock manager
> ++--+
> |double2timestamp|  int2timestamp   |
> ++--+
> | 2008-02-27 18:00:16.0  | 1970-01-14 22:28:55.216  |
> ++--+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24024) Improve logging around CompactionTxnHandler

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24024?focusedWorklogId=469209&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469209
 ]

ASF GitHub Bot logged work on HIVE-24024:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 13:25
Start Date: 11/Aug/20 13:25
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #1389:
URL: https://github.com/apache/hive/pull/1389


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469209)
Remaining Estimate: 0h
Time Spent: 10m

> Improve logging around CompactionTxnHandler
> ---
>
> Key: HIVE-24024
> URL: https://issues.apache.org/jira/browse/HIVE-24024
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> CompactionTxnHandler often doesn't log the preparedStatement parameters, 
> which is really painful when compaction isn't working the way it should. Also 
> expand logging around compaction Cleaner, Initiator, Worker. And some 
> formatting cleanup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24024) Improve logging around CompactionTxnHandler

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24024:
--
Labels: pull-request-available  (was: )

> Improve logging around CompactionTxnHandler
> ---
>
> Key: HIVE-24024
> URL: https://issues.apache.org/jira/browse/HIVE-24024
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> CompactionTxnHandler often doesn't log the preparedStatement parameters, 
> which is really painful when compaction isn't working the way it should. Also 
> expand logging around compaction Cleaner, Initiator, Worker. And some 
> formatting cleanup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22932) Unable to kill Beeline with Ctrl+C

2020-08-11 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175566#comment-17175566
 ] 

Renukaprasad C commented on HIVE-22932:
---

Sure. Thank you [~wenjunma003].

> Unable to kill Beeline with Ctrl+C
> --
>
> Key: HIVE-22932
> URL: https://issues.apache.org/jira/browse/HIVE-22932
> Project: Hive
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: wenjun ma
>Priority: Blocker
>
> Stopped the server and tried to stop the beeline console with "Ctrl+C". But 
> it is unable to kill the process and the process hangs. 
> The read call is blocked. 
> Attached the thread dump.
> 0: jdbc:hive2://localhost:1> show tables;
> Unknown HS2 problem when communicating with Thrift server.
> Error: org.apache.thrift.transport.TTransportException: 
> java.net.SocketException: Broken pipe (state=08S01,code=0)
> 0: jdbc:hive2://localhost:1> Interrupting... Please be patient this may 
> take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> Interrupting... Please be patient this may take some time.
> 2020-02-26 17:40:42
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.72-b15 mixed mode):
> "NonBlockingInputStreamThread" #16 daemon prio=5 os_prio=0 
> tid=0x7f0318c10800 nid=0x258c in Object.wait() [0x7f031c193000]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xfe9113c0> (a 
> jline.internal.NonBlockingInputStream)
> at 
> jline.internal.NonBlockingInputStream.run(NonBlockingInputStream.java:278)
> - locked <0xfe9113c0> (a 
> jline.internal.NonBlockingInputStream)
> at java.lang.Thread.run(Thread.java:745)
> "Service Thread" #11 daemon prio=9 os_prio=0 tid=0x7f032006c000 
> nid=0x257b runnable [0x]
>java.lang.Thread.State: RUNNABLE
> "C1 CompilerThread3" #10 daemon prio=9 os_prio=0 tid=0x7f0320060800 
> nid=0x257a waiting on condition [0x]
>java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread2" #9 daemon prio=9 os_prio=0 tid=0x7f0320056000 
> nid=0x2579 waiting on condition [0x]
>java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread1" #8 daemon prio=9 os_prio=0 tid=0x7f0320054000 
> nid=0x2578 waiting on condition [0x]
>java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread0" #7 daemon prio=9 os_prio=0 tid=0x7f0320051000 
> nid=0x2577 waiting on condition [0x]
>java.lang.Thread.State: RUNNABLE
> "JDWP Event Helper Thread" #6 daemon prio=10 os_prio=0 tid=0x7f032004f000 
> nid=0x2576 runnable [0x]
>java.lang.Thread.State: RUNNABLE
> "JDWP Transport Listener: dt_socket" #5 daemon prio=10 os_prio=0 
> tid=0x7f032004b800 nid=0x2575 runnable [0x]
>java.lang.Thread.State: RUNNABLE
> "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x7f0320035800 
> nid=0x2574 waiting on condition [0x]
>java.lang.Thread.State: RUNNABLE
> "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x7f0320003800 nid=0x2572 in 
> Object.wait() [0x7f0324b1c000]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xfe930770> (a 
> java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
> - locked <0xfe930770> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x006a6000 
> nid=0x2571 in Object.wait() [0x7f0324c1d000]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Nati

[jira] [Work logged] (HIVE-23958) HiveServer2 should support additional keystore/truststores types besides JKS

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23958?focusedWorklogId=469208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469208
 ]

ASF GitHub Bot logged work on HIVE-23958:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 13:19
Start Date: 11/Aug/20 13:19
Worklog Time Spent: 10m 
  Work Description: risdenk commented on pull request #1342:
URL: https://github.com/apache/hive/pull/1342#issuecomment-671942388


   Thanks for the review @nrg4878. Is there anything more to do to merge? 
Thanks!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469208)
Time Spent: 50m  (was: 40m)

> HiveServer2 should support additional keystore/truststores types besides JKS
> 
>
> Key: HIVE-23958
> URL: https://issues.apache.org/jira/browse/HIVE-23958
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Kevin Risden
>Assignee: Kevin Risden
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently HiveServer2 (through Jetty and Thrift) only supports JKS (and 
> PKCS12 based on JDK fallback) keystore/truststore types. There are additional 
> keystore/truststore types used for different applications like for FIPS 
> crypto algorithms. HS2 should support the default keystore type specified for 
> the JDK and not always use JKS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23958) HiveServer2 should support additional keystore/truststores types besides JKS

2020-08-11 Thread Kevin Risden (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175564#comment-17175564
 ] 

Kevin Risden commented on HIVE-23958:
-

Thanks for the review [~ngangam]. Is there anything more to do to merge? Thanks!

> HiveServer2 should support additional keystore/truststores types besides JKS
> 
>
> Key: HIVE-23958
> URL: https://issues.apache.org/jira/browse/HIVE-23958
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Kevin Risden
>Assignee: Kevin Risden
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently HiveServer2 (through Jetty and Thrift) only supports JKS (and 
> PKCS12 based on JDK fallback) keystore/truststore types. There are additional 
> keystore/truststore types used for different applications like for FIPS 
> crypto algorithms. HS2 should support the default keystore type specified for 
> the JDK and not always use JKS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24024) Improve logging around CompactionTxnHandler

2020-08-11 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage reassigned HIVE-24024:



> Improve logging around CompactionTxnHandler
> ---
>
> Key: HIVE-24024
> URL: https://issues.apache.org/jira/browse/HIVE-24024
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>
> CompactionTxnHandler often doesn't log the preparedStatement parameters, 
> which is really painful when compaction isn't working the way it should. Also 
> expand logging around compaction Cleaner, Initiator, Worker. And some 
> formatting cleanup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23880) Bloom filters can be merged in a parallel way in VectorUDAFBloomFilterMerge

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23880?focusedWorklogId=469182&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469182
 ]

ASF GitHub Bot logged work on HIVE-23880:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 12:12
Start Date: 11/Aug/20 12:12
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #1280:
URL: https://github.com/apache/hive/pull/1280#discussion_r468513649



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -4330,6 +4330,12 @@ private static void 
populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal
 "Bloom filter should be of at max certain size to be effective"),
 TEZ_BLOOM_FILTER_FACTOR("hive.tez.bloom.filter.factor", (float) 1.0,
 "Bloom filter should be a multiple of this factor with nDV"),
+TEZ_BLOOM_FILTER_MERGE_THREADS("hive.tez.bloom.filter.merge.threads", 1,
+"How many threads are used for merging bloom filters?\n"

Review comment:
   The number of threads configured here is actually in **addition to the 
tasks' main threads** -- I would make this a bit clearer

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorAggregateExpression.java
##
@@ -20,24 +20,25 @@
 
 import java.io.Serializable;
 
+import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.common.type.DataTypePhysicalVariation;
 import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;
 import org.apache.hadoop.hive.ql.exec.vector.VectorAggregationBufferRow;
 import org.apache.hadoop.hive.ql.exec.vector.VectorAggregationDesc;
 import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
 import org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression;
 import org.apache.hadoop.hive.ql.metadata.HiveException;
-import org.apache.hadoop.hive.ql.plan.AggregationDesc;
 import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator;
-import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
 import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.Mode;
 
 /**
  * Base class for aggregation expressions.
  */
 public abstract class VectorAggregateExpression  implements Serializable {
-
+  protected final Logger LOG = LoggerFactory.getLogger(getClass().getName());

Review comment:
   Should we make this static? Do we really want an instance per Expr?
   PS: it also seems that we don't use it at all below.

##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -4330,6 +4330,12 @@ private static void 
populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal
 "Bloom filter should be of at max certain size to be effective"),
 TEZ_BLOOM_FILTER_FACTOR("hive.tez.bloom.filter.factor", (float) 1.0,
 "Bloom filter should be a multiple of this factor with nDV"),
+TEZ_BLOOM_FILTER_MERGE_THREADS("hive.tez.bloom.filter.merge.threads", 1,
+"How many threads are used for merging bloom filters?\n"
++ "-1: sanity check, it will fail if execution hits bloom filter 
merge codepath\n"
++ " 0: feature is disabled\n"

Review comment:
   feature disabled -- use only task main thread for BF merging

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFBloomFilterMerge.java
##
@@ -77,6 +75,211 @@ public void reset() {
   // Do not change the initial bytes which contain 
NumHashFunctions/NumBits!
   Arrays.fill(bfBytes, BloomKFilter.START_OF_SERIALIZED_LONGS, 
bfBytes.length, (byte) 0);
 }
+

Review comment:
   Could add a comment describing the return boolean value 

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFBloomFilterMerge.java
##
@@ -77,6 +75,211 @@ public void reset() {
   // Do not change the initial bytes which contain 
NumHashFunctions/NumBits!
   Arrays.fill(bfBytes, BloomKFilter.START_OF_SERIALIZED_LONGS, 
bfBytes.length, (byte) 0);
 }
+
+public boolean mergeBloomFilterBytesFromInputColumn(BytesColumnVector 
inputColumn,
+int batchSize, boolean selectedInUse, int[] selected, Configuration 
conf) {
+  // already set in previous iterations, no need to call initExecutor again
+  if (numThreads == 0) {
+return false;
+  }
+  if (executor == null) {
+initExecutor(conf, batchSize);
+if (!isParallel) {
+  return false;
+}
+  }
+
+  // split every bloom filter (represented by a part of a byte[]) across 
workers
+  for (int j = 0; j < batchSize; j++) {
+if (!selectedInUse && inputColumn.noNulls) {
+  splitVectorAcrossWo

[jira] [Commented] (HIVE-23995) Don't set location for managed tables in case of repl load

2020-08-11 Thread Pravin Sinha (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175496#comment-17175496
 ] 

Pravin Sinha commented on HIVE-23995:
-

+1

> Don't set location for managed tables in case of repl load
> --
>
> Key: HIVE-23995
> URL: https://issues.apache.org/jira/browse/HIVE-23995
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23995.01.patch, HIVE-23995.02.patch, 
> HIVE-23995.03.patch, HIVE-23995.04.patch, HIVE-23995.05.patch, 
> HIVE-23995.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24021) Read insert-only tables truncated by Impala correctly

2020-08-11 Thread Karen Coppage (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175494#comment-17175494
 ] 

Karen Coppage commented on HIVE-24021:
--

HIVE-24023 is also required for reading Impala-truncated insert-only parquet 
tables.

> Read insert-only tables truncated by Impala correctly
> -
>
> Key: HIVE-24021
> URL: https://issues.apache.org/jira/browse/HIVE-24021
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Impala truncates insert-only tables by writing a base directory containing an 
> empty file named "_empty". (Like Hive should, see HIVE-20137) Generally in 
> Hive a file name beginning with an underscore connotes a temporary file that 
> isn't supposed to be read by operations that didn't create it.
>  Before HIVE-23495, getAcidState listed each directory in the table 
> (HdfsUtils#listLocatedStatus) – and filtered out directories with names 
> beginning with an underscore or period as they are presumably temporary. This 
> allowed files called "_empty" to be read, since hive checked the directory 
> name and not the file name.
>  After HIVE-23495, we recursively list each file in the table 
> (AcidUtils#getHdfsDirSnapshots) with a filter that doesn't accept files with 
> names beginning with an underscore or period as they are presumably 
> temporary. As a result Hive reads the table data as if the truncate operation 
> had not happened.
> Since performance in getAcidState is important, probably the best solution is 
> to make an exception in the filter and accept files with the name "_empty".
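As a rough sketch of that exception (the real change would live in Hive's AcidUtils; this just shows the shape of the rule using Hadoop's PathFilter):
{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

public class HiddenFilesExceptEmptyFilter implements PathFilter {
  @Override
  public boolean accept(Path path) {
    String name = path.getName();
    if ("_empty".equals(name)) {
      return true;   // keep Impala's truncate marker visible
    }
    // keep the usual rule: names starting with '_' or '.' are treated as temporary
    return !name.startsWith("_") && !name.startsWith(".");
  }
}
{code}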



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24023) Hive parquet reader can't read files with length=0

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24023:
--
Labels: pull-request-available  (was: )

> Hive parquet reader can't read files with length=0
> --
>
> Key: HIVE-24023
> URL: https://issues.apache.org/jira/browse/HIVE-24023
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Impala truncates insert-only parquet tables by creating a base directory 
> containing a completely empty file.
> Hive throws an exception upon reading when it looks for metadata:
> {code:java}
> Error: java.io.IOException: java.lang.RuntimeException:  is not a 
> Parquet file (too small length: 0) (state=,code=0){code}
> We can introduce a check for an empty file before Hive tries to read the 
> metadata.
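A sketch of the kind of guard described (assumption: not the actual patch), using Hadoop's FileSystem API to skip zero-length files before parsing the Parquet footer:
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class EmptyFileCheck {
  private EmptyFileCheck() {}

  /** Returns true if the file is zero-length and should be skipped rather than parsed. */
  public static boolean isEmpty(Path file, Configuration conf) throws IOException {
    FileSystem fs = file.getFileSystem(conf);
    return fs.getFileStatus(file).getLen() == 0;
  }
}
{code}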



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24023) Hive parquet reader can't read files with length=0

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24023?focusedWorklogId=469172&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469172
 ]

ASF GitHub Bot logged work on HIVE-24023:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 11:43
Start Date: 11/Aug/20 11:43
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #1388:
URL: https://github.com/apache/hive/pull/1388


   See HIVE-24023 for details.
   
   # How was this patch tested?
   Unit test



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469172)
Remaining Estimate: 0h
Time Spent: 10m

> Hive parquet reader can't read files with length=0
> --
>
> Key: HIVE-24023
> URL: https://issues.apache.org/jira/browse/HIVE-24023
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Impala truncates insert-only parquet tables by creating a base directory 
> containing a completely empty file.
> Hive throws an exception upon reading when it looks for metadata:
> {code:java}
> Error: java.io.IOException: java.lang.RuntimeException:  is not a 
> Parquet file (too small length: 0) (state=,code=0){code}
> We can introduce a check for an empty file before Hive tries to read the 
> metadata.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24023) Hive parquet reader can't read files with length=0

2020-08-11 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage reassigned HIVE-24023:



> Hive parquet reader can't read files with length=0
> --
>
> Key: HIVE-24023
> URL: https://issues.apache.org/jira/browse/HIVE-24023
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>
> Impala truncates insert-only parquet tables by creating a base directory 
> containing a completely empty file.
> Hive throws an exception upon reading when it looks for metadata:
> {code:java}
> Error: java.io.IOException: java.lang.RuntimeException:  is not a 
> Parquet file (too small length: 0) (state=,code=0){code}
> We can introduce a check for an empty file before Hive tries to read the 
> metadata.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23995) Don't set location for managed tables in case of repl load

2020-08-11 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23995:
---
Attachment: HIVE-23995.06.patch
Status: Patch Available  (was: In Progress)

> Don't set location for managed tables in case of repl load
> --
>
> Key: HIVE-23995
> URL: https://issues.apache.org/jira/browse/HIVE-23995
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23995.01.patch, HIVE-23995.02.patch, 
> HIVE-23995.03.patch, HIVE-23995.04.patch, HIVE-23995.05.patch, 
> HIVE-23995.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23995) Don't set location for managed tables in case of repl load

2020-08-11 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23995:
---
Status: In Progress  (was: Patch Available)

> Don't set location for managed tables in case of repl load
> --
>
> Key: HIVE-23995
> URL: https://issues.apache.org/jira/browse/HIVE-23995
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23995.01.patch, HIVE-23995.02.patch, 
> HIVE-23995.03.patch, HIVE-23995.04.patch, HIVE-23995.05.patch, 
> HIVE-23995.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-23982) TestStatsReplicationScenariosMigrationNoAutogather is flaky

2020-08-11 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi resolved HIVE-23982.

Resolution: Won't Fix

> TestStatsReplicationScenariosMigrationNoAutogather is flaky
> ---
>
> Key: HIVE-23982
> URL: https://issues.apache.org/jira/browse/HIVE-23982
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Arko Sharma
>Priority: Major
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/148/testReport/junit/org.apache.hadoop.hive.ql.parse/TestStatsReplicationScenariosMigrationNoAutogather/Testing___split_16___Archive___testRetryFailure/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23982) TestStatsReplicationScenariosMigrationNoAutogather is flaky

2020-08-11 Thread Aasha Medhi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175478#comment-17175478
 ] 

Aasha Medhi commented on HIVE-23982:


Removing migration code and test as part of HIVE-23982

> TestStatsReplicationScenariosMigrationNoAutogather is flaky
> ---
>
> Key: HIVE-23982
> URL: https://issues.apache.org/jira/browse/HIVE-23982
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Arko Sharma
>Priority: Major
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/148/testReport/junit/org.apache.hadoop.hive.ql.parse/TestStatsReplicationScenariosMigrationNoAutogather/Testing___split_16___Archive___testRetryFailure/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24021) Read insert-only tables truncated by Impala correctly

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24021?focusedWorklogId=469170&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469170
 ]

ASF GitHub Bot logged work on HIVE-24021:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 11:34
Start Date: 11/Aug/20 11:34
Worklog Time Spent: 10m 
  Work Description: klcopp closed pull request #1384:
URL: https://github.com/apache/hive/pull/1384


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469170)
Time Spent: 1.5h  (was: 1h 20m)

> Read insert-only tables truncated by Impala correctly
> -
>
> Key: HIVE-24021
> URL: https://issues.apache.org/jira/browse/HIVE-24021
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Impala truncates insert-only tables by writing a base directory containing an 
> empty file named "_empty". (Like Hive should, see HIVE-20137) Generally in 
> Hive a file name beginning with an underscore connotes a temporary file that 
> isn't supposed to be read by operations that didn't create it.
>  Before HIVE-23495, getAcidState listed each directory in the table 
> (HdfsUtils#listLocatedStatus) – and filtered out directories with names 
> beginning with an underscore or period as they are presumably temporary. This 
> allowed files called "_empty" to be read, since Hive checked the directory 
> name and not the file name.
>  After HIVE-23495, we recursively list each file in the table 
> (AcidUtils#getHdfsDirSnapshots) with a filter that doesn't accept files with 
> names beginning with an underscore or period as they are presumably 
> temporary. As a result Hive reads the table data as if the truncate operation 
> had not happened.
> Since performance in getAcidState is important, probably the best solution is 
> to make an exception in the filter and accept files with the name "_empty".
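
An illustrative sketch of that exception, assuming Hadoop's PathFilter interface; the class name is hypothetical and this is not the actual AcidUtils filter:

{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

// Hypothetical filter: hide "_"/"." temporary files but keep Impala's "_empty" marker.
public class HiddenFileFilterWithEmptyException implements PathFilter {
  @Override
  public boolean accept(Path path) {
    String name = path.getName();
    if ("_empty".equals(name)) {
      return true; // the truncate marker must stay visible so the empty base is read
    }
    return !name.startsWith("_") && !name.startsWith(".");
  }
}
{code}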



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24021) Read insert-only tables truncated by Impala correctly

2020-08-11 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage resolved HIVE-24021.
--
Resolution: Won't Fix

Discussed with Impala devs, they're willing to drop the underscore from the 
file name.

> Read insert-only tables truncated by Impala correctly
> -
>
> Key: HIVE-24021
> URL: https://issues.apache.org/jira/browse/HIVE-24021
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Impala truncates insert-only tables by writing a base directory containing an 
> empty file named "_empty". (Like Hive should, see HIVE-20137) Generally in 
> Hive a file name beginning with an underscore connotes a temporary file that 
> isn't supposed to be read by operations that didn't create it.
>  Before HIVE-23495, getAcidState listed each directory in the table 
> (HdfsUtils#listLocatedStatus) – and filtered out directories with names 
> beginning with an underscore or period as they are presumably temporary. This 
> allowed files called "_empty" to be read, since Hive checked the directory 
> name and not the file name.
>  After HIVE-23495, we recursively list each file in the table 
> (AcidUtils#getHdfsDirSnapshots) with a filter that doesn't accept files with 
> names beginning with an underscore or period as they are presumably 
> temporary. As a result Hive reads the table data as if the truncate operation 
> had not happened.
> Since performance in getAcidState is important, probably the best solution is 
> to make an exception in the filter and accept files with the name "_empty".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-23986) flaky TestStatsReplicationScenariosMigration.testMetadataOnlyDump

2020-08-11 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi resolved HIVE-23986.

Resolution: Won't Fix

> flaky TestStatsReplicationScenariosMigration.testMetadataOnlyDump
> -
>
> Key: HIVE-23986
> URL: https://issues.apache.org/jira/browse/HIVE-23986
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Arko Sharma
>Priority: Major
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/143/testReport/junit/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23986) flaky TestStatsReplicationScenariosMigration.testMetadataOnlyDump

2020-08-11 Thread Aasha Medhi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175468#comment-17175468
 ] 

Aasha Medhi commented on HIVE-23986:


We are removing migration code as part of HIVE-23995 

> flaky TestStatsReplicationScenariosMigration.testMetadataOnlyDump
> -
>
> Key: HIVE-23986
> URL: https://issues.apache.org/jira/browse/HIVE-23986
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Arko Sharma
>Priority: Major
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/143/testReport/junit/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23849) Hive skips the creation of ColumnAccessInfo when creating a view

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23849?focusedWorklogId=469169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469169
 ]

ASF GitHub Bot logged work on HIVE-23849:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 11:19
Start Date: 11/Aug/20 11:19
Worklog Time Spent: 10m 
  Work Description: bmaidics opened a new pull request #1387:
URL: https://github.com/apache/hive/pull/1387


   …g a view
   
   ### What changes were proposed in this pull request?
   Hive is skipping the creation of ColumnAccessInfo at view creation when CBO 
is disabled
   
   
   
   ### Why are the changes needed?
   In a secure environment, Hive is failing with AuthorizationException when 
creating a view (if cbo is disabled)
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   
   ### How was this patch tested?
   Run some qtests locally
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469169)
Time Spent: 2h 50m  (was: 2h 40m)

> Hive skips the creation of ColumnAccessInfo when creating a view
> 
>
> Key: HIVE-23849
> URL: https://issues.apache.org/jira/browse/HIVE-23849
> Project: Hive
>  Issue Type: Bug
>Reporter: Barnabas Maidics
>Assignee: Barnabas Maidics
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When creating a view, Hive skips the creation of ColumnAccessInfo that should 
> be created at [step 8|#L12601]. This causes an authorization error. 
> Currently, this issue is "hidden" when CBO is enabled. By introducing 
> HIVE-14496, CalcitePlanner creates this ColumnAccessInfo at [step 
> 2|https://github.com/apache/hive/blob/11e069b277fd1a18899b8ca1d2926fcbe73f17f2/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L12459].
>  But after turning off CBO, the issue is still there. 
> I think the return statement in [step 
> 5|https://github.com/apache/hive/blob/11e069b277fd1a18899b8ca1d2926fcbe73f17f2/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L12574]
>  is not necessary.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24014) Need to delete DumpDirectoryCleanerTask

2020-08-11 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24014:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the patch [~^sharma] and review [~aasha]

> Need to delete DumpDirectoryCleanerTask
> ---
>
> Key: HIVE-24014
> URL: https://issues.apache.org/jira/browse/HIVE-24014
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24014.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> With the newer implementation, every dump operation cleans up the dump 
> directory previously consumed by the load operation, so at most one dump 
> directory exists per policy. Also, the dump directory base location is now a 
> policy-level config, so DumpDirCleanerTask is no longer effective.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22126) hive-exec packaging should shade guava

2020-08-11 Thread zhengchenyu (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175462#comment-17175462
 ] 

zhengchenyu commented on HIVE-22126:


[~euigeun_chung] I think it's not a version problem. If we want to shade Guava, 
we need to shade Guava away from every other component as well (for example 
Hadoop and Spark), not just make it work on one specific version. 

> hive-exec packaging should shade guava
> --
>
> Key: HIVE-22126
> URL: https://issues.apache.org/jira/browse/HIVE-22126
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Eugene Chung
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22126.01.patch, HIVE-22126.02.patch, 
> HIVE-22126.03.patch, HIVE-22126.04.patch, HIVE-22126.05.patch, 
> HIVE-22126.06.patch, HIVE-22126.07.patch, HIVE-22126.08.patch, 
> HIVE-22126.09.patch, HIVE-22126.09.patch, HIVE-22126.09.patch, 
> HIVE-22126.09.patch, HIVE-22126.09.patch
>
>
> The ql/pom.xml includes complete guava library into hive-exec.jar 
> https://github.com/apache/hive/blob/master/ql/pom.xml#L990 This causes 
> problems for downstream clients of Hive which have hive-exec.jar in their 
> classpath, since they are pinned to the same Guava version as Hive. 
> We should shade guava classes so that other components which depend on 
> hive-exec can independently use a different version of guava as needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23849) Hive skips the creation of ColumnAccessInfo when creating a view

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23849?focusedWorklogId=469166&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469166
 ]

ASF GitHub Bot logged work on HIVE-23849:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 11:12
Start Date: 11/Aug/20 11:12
Worklog Time Spent: 10m 
  Work Description: bmaidics closed pull request #1386:
URL: https://github.com/apache/hive/pull/1386


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469166)
Time Spent: 2h 40m  (was: 2.5h)

> Hive skips the creation of ColumnAccessInfo when creating a view
> 
>
> Key: HIVE-23849
> URL: https://issues.apache.org/jira/browse/HIVE-23849
> Project: Hive
>  Issue Type: Bug
>Reporter: Barnabas Maidics
>Assignee: Barnabas Maidics
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> When creating a view, Hive skips the creation of ColumnAccessInfo that should 
> be created at [step 8|#L12601]. This causes an authorization error. 
> Currently, this issue is "hidden" when CBO is enabled. By introducing 
> HIVE-14496, CalcitePlanner creates this ColumnAccessInfo at [step 
> 2|https://github.com/apache/hive/blob/11e069b277fd1a18899b8ca1d2926fcbe73f17f2/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L12459].
>  But after turning off CBO, the issue is still there. 
> I think the return statement in [step 
> 5|https://github.com/apache/hive/blob/11e069b277fd1a18899b8ca1d2926fcbe73f17f2/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L12574]
>  is not necessary.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-22126) hive-exec packaging should shade guava

2020-08-11 Thread Eugene Chung (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175459#comment-17175459
 ] 

Eugene Chung edited comment on HIVE-22126 at 8/11/20, 11:12 AM:


[~zhengchenyu] Hive 4.0 is using calcite-core-1.21.0. You should check your 
classpath.


was (Author: euigeun_chung):
[~zhengchenyu] Hive 4.0 is using calcite-core-1.21.0.

> hive-exec packaging should shade guava
> --
>
> Key: HIVE-22126
> URL: https://issues.apache.org/jira/browse/HIVE-22126
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Eugene Chung
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22126.01.patch, HIVE-22126.02.patch, 
> HIVE-22126.03.patch, HIVE-22126.04.patch, HIVE-22126.05.patch, 
> HIVE-22126.06.patch, HIVE-22126.07.patch, HIVE-22126.08.patch, 
> HIVE-22126.09.patch, HIVE-22126.09.patch, HIVE-22126.09.patch, 
> HIVE-22126.09.patch, HIVE-22126.09.patch
>
>
> The ql/pom.xml includes complete guava library into hive-exec.jar 
> https://github.com/apache/hive/blob/master/ql/pom.xml#L990 This causes 
> problems for downstream clients of Hive which have hive-exec.jar in their 
> classpath, since they are pinned to the same Guava version as Hive. 
> We should shade guava classes so that other components which depend on 
> hive-exec can independently use a different version of guava as needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22126) hive-exec packaging should shade guava

2020-08-11 Thread Eugene Chung (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175459#comment-17175459
 ] 

Eugene Chung commented on HIVE-22126:
-

[~zhengchenyu] Hive 4.0 is using calcite-core-1.21.0.

> hive-exec packaging should shade guava
> --
>
> Key: HIVE-22126
> URL: https://issues.apache.org/jira/browse/HIVE-22126
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Eugene Chung
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22126.01.patch, HIVE-22126.02.patch, 
> HIVE-22126.03.patch, HIVE-22126.04.patch, HIVE-22126.05.patch, 
> HIVE-22126.06.patch, HIVE-22126.07.patch, HIVE-22126.08.patch, 
> HIVE-22126.09.patch, HIVE-22126.09.patch, HIVE-22126.09.patch, 
> HIVE-22126.09.patch, HIVE-22126.09.patch
>
>
> The ql/pom.xml includes complete guava library into hive-exec.jar 
> https://github.com/apache/hive/blob/master/ql/pom.xml#L990 This causes 
> problems for downstream clients of Hive which have hive-exec.jar in their 
> classpath, since they are pinned to the same Guava version as Hive. 
> We should shade guava classes so that other components which depend on 
> hive-exec can independently use a different version of guava as needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values

2020-08-11 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175458#comment-17175458
 ] 

Renukaprasad C commented on HIVE-23927:
---

[~gopalv]
Shall we change the input unit to milliseconds for the double datatype as well? 
Changing this may break compatibility for existing users. Please advise. Thank 
you.

> Cast to Timestamp generates different output for Integer & Float values 
> 
>
> Key: HIVE-23927
> URL: https://issues.apache.org/jira/browse/HIVE-23927
> Project: Hive
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Priority: Major
>
> A double value is treated as seconds and converted to milliseconds internally, 
> whereas an integer value is treated as milliseconds, so the two produce 
> different output (a standalone illustration follows the query output below).
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getTimestamp(Object,
>  PrimitiveObjectInspector, boolean) handles integral and decimal values 
> differently, which causes the issue.
> 0: jdbc:hive2://localhost:1> select cast(1.204135216E9 as timestamp) 
> Double2TimeStamp, cast(1204135216 as timestamp) Int2TimeStamp from abc 
> tablesample(1 rows);
> OK
> INFO  : Compiling 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14): 
> select cast(1.204135216E9 as timestamp) Double2TimeStamp, cast(1204135216 as 
> timestamp) Int2TimeStamp from abc tablesample(1 rows)
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:double2timestamp, type:timestamp, 
> comment:null), FieldSchema(name:int2timestamp, type:timestamp, 
> comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14); 
> Time taken: 0.175 seconds
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Executing 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14): 
> select cast(1.204135216E9 as timestamp) Double2TimeStamp, cast(1204135216 as 
> timestamp) Int2TimeStamp from abc tablesample(1 rows)
> INFO  : Completed executing 
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14); 
> Time taken: 0.001 seconds
> INFO  : OK
> INFO  : Concurrency mode is disabled, not creating a lock manager
> ++--+
> |double2timestamp|  int2timestamp   |
> ++--+
> | 2008-02-27 18:00:16.0  | 1970-01-14 22:28:55.216  |
> ++--+
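
A standalone illustration of the unit mismatch using plain java.time (not the Hive serde code); the printed instants correspond to the two timestamps in the query result above, modulo the session time zone:

{code:java}
import java.time.Instant;

public class TimestampCastDemo {
  public static void main(String[] args) {
    long value = 1_204_135_216L;
    // Double path: the value is interpreted as seconds since the epoch.
    System.out.println(Instant.ofEpochSecond(value)); // 2008-02-27T18:00:16Z
    // Integer path: the same value is interpreted as milliseconds since the epoch.
    System.out.println(Instant.ofEpochMilli(value));  // 1970-01-14T22:28:55.216Z
  }
}
{code}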



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-22126) hive-exec packaging should shade guava

2020-08-11 Thread zhengchenyu (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175452#comment-17175452
 ] 

zhengchenyu edited comment on HIVE-22126 at 8/11/20, 11:02 AM:
---

When I run the program, I get an AbstractMethodError. I think 
HiveAggregateFactoryImpl's createAggregate no longer implements 
AggregateFactory: com.google.common.collect.ImmutableList gets relocated to 
org.apache.hive.com.google.common.collect.ImmutableList, so the call throws 
AbstractMethodError. 

Note: to shade the whole Guava jar, I removed the Guava lib from the lib dir 
and used maven-shade-plugin in the main pom.xml to shade all of Guava. 

The error stack is below: 
{code:java}
2020-08-11T18:26:51,434 ERROR [0ae58217-4908-4697-a9e7-a57f279a22a0 main] 
parse.CalcitePlanner: CBO failed, skipping CBO.
java.lang.RuntimeException: java.lang.AbstractMethodError: 
org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveAggregateFactoryImpl.createAggregate(Lorg/apache/calcite/rel/RelNode;ZLorg/apache/calcite/util/ImmutableBitSet;Lcom/google/common/collect/ImmutableList;Ljava/util/List;)Lorg/apache/calcite/rel/RelNode;
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:1539)
 ~[hive-exec-3.1.2.jar:3.1.2]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1417)
 ~[hive-exec-3.1.2.jar:3.1.2]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430)
 ~[hive-exec-3.1.2.jar:3.1.2]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450)
 ~[hive-exec-3.1.2.jar:3.1.2]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164)
 ~[hive-exec-3.1.2.jar:3.1.2]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
 ~[hive-exec-3.1.2.jar:3.1.2]
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
 ~[hive-exec-3.1.2.jar:3.1.2]
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659) 
~[hive-exec-3.1.2.jar:3.1.2]
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826) 
~[hive-exec-3.1.2.jar:3.1.2]
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773) 
~[hive-exec-3.1.2.jar:3.1.2]
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768) 
~[hive-exec-3.1.2.jar:3.1.2]
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
 ~[hive-exec-3.1.2.jar:3.1.2]
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214) 
~[hive-exec-3.1.2.jar:3.1.2]
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) 
~[hive-cli-3.1.2.jar:3.1.2]
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) 
~[hive-cli-3.1.2.jar:3.1.2]
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) 
~[hive-cli-3.1.2.jar:3.1.2]
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) 
~[hive-cli-3.1.2.jar:3.1.2]
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) 
~[hive-cli-3.1.2.jar:3.1.2]
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) 
~[hive-cli-3.1.2.jar:3.1.2]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_241]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_241]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_241]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_241]
at org.apache.hadoop.util.RunJar.run(RunJar.java:323) 
~[hadoop-common-3.2.1.jar:?]
at org.apache.hadoop.util.RunJar.main(RunJar.java:236) 
~[hadoop-common-3.2.1.jar:?]
Caused by: java.lang.AbstractMethodError: 
org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveAggregateFactoryImpl.createAggregate(Lorg/apache/calcite/rel/RelNode;ZLorg/apache/calcite/util/ImmutableBitSet;Lcom/google/common/collect/ImmutableList;Ljava/util/List;)Lorg/apache/calcite/rel/RelNode;
at org.apache.calcite.tools.RelBuilder.aggregate(RelBuilder.java:1267) 
~[calcite-core-1.16.0.jar:1.16.0]
at 
org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(RelFieldTrimmer.java:886) 
~[calcite-core-1.16.0.jar:1.16.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_241]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_241]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_241]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_241]
at org.apache.calcite.util

[jira] [Commented] (HIVE-22126) hive-exec packaging should shade guava

2020-08-11 Thread zhengchenyu (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175452#comment-17175452
 ] 

zhengchenyu commented on HIVE-22126:


When I run the program, I get an AbstractMethodError. I think 
HiveAggregateFactoryImpl's createAggregate no longer implements 
AggregateFactory: com.google.common.collect.ImmutableList gets relocated to 
org.apache.hive.com.google.common.collect.ImmutableList, so the call throws 
AbstractMethodError. 

Note: to shade the whole Guava jar, I removed the Guava lib from the lib dir 
and used maven-shade-plugin in the main pom.xml to shade all of Guava. 

The error stack is below: 

{code}

2020-08-11T18:26:51,434 ERROR [0ae58217-4908-4697-a9e7-a57f279a22a0 main] parse.CalcitePlanner: CBO failed, skipping CBO.
java.lang.RuntimeException: java.lang.AbstractMethodError: org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveAggregateFactoryImpl.createAggregate(Lorg/apache/calcite/rel/RelNode;ZLorg/apache/calcite/util/ImmutableBitSet;Lcom/google/common/collect/ImmutableList;Ljava/util/List;)Lorg/apache/calcite/rel/RelNode;
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:1539) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1417) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) ~[hive-cli-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) ~[hive-cli-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) ~[hive-cli-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) ~[hive-cli-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) ~[hive-cli-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) ~[hive-cli-3.1.2.jar:3.1.2]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_241]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_241]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_241]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_241]
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323) ~[hadoop-common-3.2.1.jar:?]
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236) ~[hadoop-common-3.2.1.jar:?]
Caused by: java.lang.AbstractMethodError: org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveAggregateFactoryImpl.createAggregate(Lorg/apache/calcite/rel/RelNode;ZLorg/apache/calcite/util/ImmutableBitSet;Lcom/google/common/collect/ImmutableList;Ljava/util/List;)Lorg/apache/calcite/rel/RelNode;
    at org.apache.calcite.tools.RelBuilder.aggregate(RelBuilder.java:1267) ~[calcite-core-1.16.0.jar:1.16.0]
    at org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(RelFieldTrimmer.java:886) ~[calcite-core-1.16.0.jar:1.16.0]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_241]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_241]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_241]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_241]
    at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:524) ~[calcite-core-1.16.0.jar:1.16.0]
    at org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:273) ~[cal

[jira] [Commented] (HIVE-18454) Incorrect rownum estimation in joins

2020-08-11 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-18454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175446#comment-17175446
 ] 

Renukaprasad C commented on HIVE-18454:
---

[~kgyrtkirk],
I tried to reproduce the issue with the provided queries, but I couldn't on 
branch 3.1.0. Is there a specific scenario, or was it already handled as part of 
some other issue? Thanks in advance.

> Incorrect rownum estimation in joins
> 
>
> Key: HIVE-18454
> URL: https://issues.apache.org/jira/browse/HIVE-18454
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Priority: Major
>
> row counts seem to be off the charts... sometimes ~12 rows are estimated when 
> the table has only 10 rows
> {code:java}
> create table s (x int);
> insert into s values
> (1),(2),(3),(4),(5),
> (6),(7),(8),(9),(10);
> create table tu(id_uv int,id_uw int,u int);
> create table tv(id_uv int,v int);
> create table tw(id_uw int,w int);
> from s
> insert overwrite table tu
> select x,x,x 
> where x<=6 or x=10
> insert overwrite table tv
> select x,x  
> where x<=3 or x=10
> insert overwrite table tw
> select x,x  
> ;
> set hive.explain.user=true;
> explain analyze
> select sum(u*v*w) from tu
> join tv on (tu.id_uv=tv.id_uv)
> join tw on (tu.id_uw=tw.id_uw)
> where w>9 and u>1 and v>3;
> desc formatted tv;
> {code}
> explain analyze output:
> {code:java}
> | Plan optimized by CBO. |
> ||
> | Vertex dependency in root stage|
> | Map 1 <- Map 3 (BROADCAST_EDGE), Map 4 (BROADCAST_EDGE) |
> | Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)|
> ||
> | Stage-0|
> |   Fetch Operator   |
> | limit:-1   |
> | Stage-1|
> |   Reducer 2|
> |   File Output Operator [FS_21] |
> | Group By Operator [GBY_19] (rows=1/1 width=8) |
> |   Output:["_col0"],aggregations:["sum(VALUE._col0)"] |
> | <-Map 1 [CUSTOM_SIMPLE_EDGE]   |
> |   PARTITION_ONLY_SHUFFLE [RS_18]   |
> | Group By Operator [GBY_17] (rows=1/1 width=8) |
> |   Output:["_col0"],aggregations:["sum(_col0)"] |
> |   Select Operator [SEL_15] (rows=48400/1 width=5) |
> | Output:["_col0"]   |
> | Map Join Operator [MAPJOIN_31] (rows=48400/1 width=5) |
> |   
> Conds:MAPJOIN_30._col1=RS_13._col0(Inner),HybridGraceHashJoin:true,Output:["_col2","_col4","_col6"]
>  |
> | <-Map 4 [BROADCAST_EDGE]   |
> |   BROADCAST [RS_13]|
> | PartitionCols:_col0|
> | Select Operator [SEL_8] (rows=3/1 width=3) |
> |   Output:["_col0","_col1"] |
> |   Filter Operator [FIL_29] (rows=3/1 width=3) |
> | predicate:((w > 9) and id_uw is not null) |
> | TableScan [TS_6] (rows=10/10 width=3) |
> |   
> default@tw,tw,Tbl:COMPLETE,Col:NONE,Output:["id_uw","w"] |
> | <-Map Join Operator [MAPJOIN_30] (rows=44000/1 width=5) |
> | 
> Conds:SEL_2._col0=RS_10._col0(Inner),HybridGraceHashJoin:true,Output:["_col1","_col2","_col4"]
>  |
> |   <-Map 3 [BROADCAST_EDGE] |
> | BROADCAST [RS_10]  |
> |   PartitionCols:_col0  |
> |   Select Operator [SEL_5] (rows=1632/1 width=3) |
> | Output:["_col0","_col1"]   |
> | Filter Operator [FIL_28] (rows=1632/1 width=3) |
> |   predicate:((v > 3) and id_uv is not null) |
> |   TableScan [TS_3] (rows=4898/4 width=3) |
> | 
> default@tv,tv,Tbl:COMPLETE,Col:NONE,Output:["id_uv","v"] |
> |   <-Select Operator [SEL_2] (rows=4/6 width=5) |
> |   Output:["_col0","_col1","_col2"] |
> |   Filter Operator [FIL_27] (rows=4/6 width=5) |
> | predicate:((u > 1) and id_uv is not null and id_uw 
> is not null) |
> | TableScan [TS_0] (rows=12/7 width=5) |
> |   
> default@tu,tu,Tbl:COMPLETE,Col:NONE,Output:["id_uv","id_uw","u"] |
> {code}



--
This message was sen

[jira] [Work logged] (HIVE-23849) Hive skips the creation of ColumnAccessInfo when creating a view

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23849?focusedWorklogId=469139&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469139
 ]

ASF GitHub Bot logged work on HIVE-23849:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 09:36
Start Date: 11/Aug/20 09:36
Worklog Time Spent: 10m 
  Work Description: bmaidics opened a new pull request #1386:
URL: https://github.com/apache/hive/pull/1386


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469139)
Time Spent: 2.5h  (was: 2h 20m)

> Hive skips the creation of ColumnAccessInfo when creating a view
> 
>
> Key: HIVE-23849
> URL: https://issues.apache.org/jira/browse/HIVE-23849
> Project: Hive
>  Issue Type: Bug
>Reporter: Barnabas Maidics
>Assignee: Barnabas Maidics
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When creating a view, Hive skips the creation of ColumnAccessInfo that should 
> be created at [step 8|#L12601]. This causes an authorization error. 
> Currently, this issue is "hidden" when CBO is enabled. By introducing 
> HIVE-14496, CalcitePlanner creates this ColumnAccessInfo at [step 
> 2|https://github.com/apache/hive/blob/11e069b277fd1a18899b8ca1d2926fcbe73f17f2/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L12459].
>  But after turning off CBO, the issue is still there. 
> I think the return statement in [step 
> 5|https://github.com/apache/hive/blob/11e069b277fd1a18899b8ca1d2926fcbe73f17f2/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L12574]
>  is not necessary.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23995) Don't set location for managed tables in case of repl load

2020-08-11 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23995:
---
Attachment: HIVE-23995.05.patch
Status: Patch Available  (was: In Progress)

> Don't set location for managed tables in case of repl load
> --
>
> Key: HIVE-23995
> URL: https://issues.apache.org/jira/browse/HIVE-23995
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23995.01.patch, HIVE-23995.02.patch, 
> HIVE-23995.03.patch, HIVE-23995.04.patch, HIVE-23995.05.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23995) Don't set location for managed tables in case of repl load

2020-08-11 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23995:
---
Status: In Progress  (was: Patch Available)

> Don't set location for managed tables in case of repl load
> --
>
> Key: HIVE-23995
> URL: https://issues.apache.org/jira/browse/HIVE-23995
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23995.01.patch, HIVE-23995.02.patch, 
> HIVE-23995.03.patch, HIVE-23995.04.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24001) Don't cache MapWork in tez/ObjectCache during query-based compaction

2020-08-11 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage resolved HIVE-24001.
--
Resolution: Fixed

Submitted to master. Thanks [~kuczoram] for the review!

> Don't cache MapWork in tez/ObjectCache during query-based compaction
> 
>
> Key: HIVE-24001
> URL: https://issues.apache.org/jira/browse/HIVE-24001
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Query-based major compaction can fail intermittently with the following issue:
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: One writer is 
> supposed to handle only one bucket. We saw these 2 different buckets: 1 and 6
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder.evaluate(GenericUDFValidateAcidSortOrder.java:77)
> {code}
> This is consistently preceded in the application log with:
> {code:java}
>  [INFO] [TezChild] |tez.ObjectCache|: Found 
> hive_20200804185133_f04cca69-fa30-4f1b-a5fe-80fc2d749f48_Map 1__MAP_PLAN__ in 
> cache with value: org.apache.hadoop.hive.ql.plan.MapWork@74652101
> {code}
> Alternatively, when MapRecordProcessor doesn't find mapWork in 
> tez/ObjectCache (but instead caches mapWork), major compaction succeeds.
> The failure happens because, if MapWork is reused, 
> GenericUDFValidateAcidSortOrder (which is called during compaction) is also 
> reused on splits belonging to two different buckets, which produces an error.
> The solution is to avoid storing MapWork in the ObjectCache during query-based 
> compaction.
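
A minimal sketch of the conditional-caching pattern the fix describes, with made-up names; this is not the actual Hive/Tez ObjectCache API, just an illustration of "rebuild instead of cache" for compaction work:

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for the plan cache: when the work belongs to a
// query-based compaction, rebuild the plan every time so per-bucket state
// (e.g. in GenericUDFValidateAcidSortOrder) is never shared across splits.
public class ConditionalPlanCache<T> {
  private final ConcurrentHashMap<String, T> cache = new ConcurrentHashMap<>();

  public T retrieve(String key, boolean queryBasedCompaction, Supplier<T> builder) {
    if (queryBasedCompaction) {
      return builder.get(); // always a fresh plan for compaction tasks
    }
    return cache.computeIfAbsent(key, k -> builder.get());
  }
}
{code}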



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24001) Don't cache MapWork in tez/ObjectCache during query-based compaction

2020-08-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24001?focusedWorklogId=469116&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-469116
 ]

ASF GitHub Bot logged work on HIVE-24001:
-

Author: ASF GitHub Bot
Created on: 11/Aug/20 08:38
Start Date: 11/Aug/20 08:38
Worklog Time Spent: 10m 
  Work Description: klcopp merged pull request #1368:
URL: https://github.com/apache/hive/pull/1368


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 469116)
Time Spent: 40m  (was: 0.5h)

> Don't cache MapWork in tez/ObjectCache during query-based compaction
> 
>
> Key: HIVE-24001
> URL: https://issues.apache.org/jira/browse/HIVE-24001
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Query-based major compaction can fail intermittently with the following issue:
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: One writer is 
> supposed to handle only one bucket. We saw these 2 different buckets: 1 and 6
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder.evaluate(GenericUDFValidateAcidSortOrder.java:77)
> {code}
> This is consistently preceded in the application log with:
> {code:java}
>  [INFO] [TezChild] |tez.ObjectCache|: Found 
> hive_20200804185133_f04cca69-fa30-4f1b-a5fe-80fc2d749f48_Map 1__MAP_PLAN__ in 
> cache with value: org.apache.hadoop.hive.ql.plan.MapWork@74652101
> {code}
> Alternatively, when MapRecordProcessor doesn't find mapWork in 
> tez/ObjectCache (but instead caches mapWork), major compaction succeeds.
> The failure happens because, if MapWork is reused, 
> GenericUDFValidateAcidSortOrder (which is called during compaction) is also 
> reused on splits belonging to two different buckets, which produces an error.
> The solution is to avoid storing MapWork in the ObjectCache during query-based 
> compaction.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)