date:20210118

[jira] [Commented] (HIVE-24650) hiveserver2 memory usage is extremely high, GC unable to recycle

2021-01-18 Thread Adrian Wang (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267692#comment-17267692
 ] 

Adrian Wang commented on HIVE-24650:


As far as I can tell, this could be a Calcite issue. Could you try `set 
hive.cbo.enable=false` and check what happens?

> hiveserver2 memory usage is extremely high, GC unable to recycle
> 
>
> Key: HIVE-24650
> URL: https://issues.apache.org/jira/browse/HIVE-24650
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
> Environment: hive3.1.0
>Reporter: zhaojk
>Priority: Major
> Attachments: 1.png, 2.png, 3.png
>
>
> HDP's HiveServer2 is using 80GB of memory (heap is configured with 74GB). 
> When the memory is full, Full GC runs frequently but the memory cannot be 
> reclaimed, resulting in a service exception. Analyze memory usage.
> GC config:
> export HADOOP_OPTS="$HADOOP_OPTS 
> -Xloggc:\{{hive_log_dir}}/hiveserver2-gc-%t.log -XX:ConcGCThreads=30 
> -XX:ParallelGCThreads=30 -XX:+UseG1GC -XX:G1HeapRegionSize=8M 
> -XX:+UseStringDeduplication -XX:MaxGCPauseMillis=1000 
> -XX:InitiatingHeapOccupancyPercent=40 -XX:G1ReservePercent=15 
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCCause 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M 
> -XX:+HeapDumpOnOutOfMemoryError 
> -XX:HeapDumpPath=/home/hive/hs2_heapdump.hprof 
> -Dhive.log.dir=\{{hive_log_dir}} -Dhive.log.file=hiveserver2.log
> The details are at
> [https://blog.csdn.net/Small_codeing/article/details/112601226]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24558:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch, HIVE-24558.07.patch, HIVE-24558.08.patch, 
> HIVE-24558.09.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Aasha Medhi (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267671#comment-17267671
 ] 

Aasha Medhi commented on HIVE-24558:


Thank you for the review [~pkumarsinha]. Committed to master.

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch, HIVE-24558.07.patch, HIVE-24558.08.patch, 
> HIVE-24558.09.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24558?focusedWorklogId=537624=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537624
 ]

ASF GitHub Bot logged work on HIVE-24558:
-

Author: ASF GitHub Bot
Created on: 19/Jan/21 05:50
Start Date: 19/Jan/21 05:50
Worklog Time Spent: 10m 
  Work Description: aasha merged pull request #1804:
URL: https://github.com/apache/hive/pull/1804


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 537624)
Time Spent: 50m  (was: 40m)

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch, HIVE-24558.07.patch, HIVE-24558.08.patch, 
> HIVE-24558.09.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Pravin Sinha (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267667#comment-17267667
 ] 

Pravin Sinha commented on HIVE-24558:
-

+1 LGTM

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch, HIVE-24558.07.patch, HIVE-24558.08.patch, 
> HIVE-24558.09.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24558:
---
Status: In Progress  (was: Patch Available)

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch, HIVE-24558.07.patch, HIVE-24558.08.patch, 
> HIVE-24558.09.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24558:
---
Attachment: HIVE-24558.09.patch
Status: Patch Available  (was: In Progress)

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch, HIVE-24558.07.patch, HIVE-24558.08.patch, 
> HIVE-24558.09.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>





[jira] [Assigned] (HIVE-24639) Raises SemanticException other than ClassCastException when filter has non-boolean expressions

2021-01-18 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-24639:
--

Assignee: Zhihua Deng

> Raises SemanticException other than ClassCastException when filter has 
> non-boolean expressions
> --
>
> Key: HIVE-24639
> URL: https://issues.apache.org/jira/browse/HIVE-24639
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Sometimes we see a ClassCastException in filters when fetching some rows of a 
> table or executing the query. GenericUDFOPOr, GenericUDFOPAnd and 
> FilterOperator assume that the output of their conditions is a boolean, but 
> this is not guaranteed. For example: 
> _select * from ccn_table where src + 1;_ 
> will throw a ClassCastException:
> {code:java}
> Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Boolean
> at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:125)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
> at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:173)
> at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:153)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:553)
> ...{code}
> We should validate the filter during analysis instead of at runtime and 
> produce more meaningful messages.
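
The idea of rejecting a non-boolean filter before execution can be sketched as below. This is only an illustration, not Hive's analyzer: the class FilterTypeCheckSketch, the method validateFilterType and the exception FilterTypeException are hypothetical names, and the result type is passed in as a plain Class object rather than a Hive type descriptor.

```java
public class FilterTypeCheckSketch {

    /** Hypothetical stand-in for an analysis-time error; not a Hive class. */
    static class FilterTypeException extends RuntimeException {
        FilterTypeException(String message) {
            super(message);
        }
    }

    /**
     * Rejects a filter expression whose result type is not boolean.
     * Intended to run while the query is analyzed, before any row is processed.
     */
    static void validateFilterType(String expression, Class<?> resultType) {
        if (resultType != Boolean.class) {
            throw new FilterTypeException("Filter expression '" + expression
                + "' must be boolean, but its type is " + resultType.getSimpleName());
        }
    }

    public static void main(String[] args) {
        validateFilterType("src > 1", Boolean.class);  // accepted
        try {
            validateFilterType("src + 1", Integer.class);
        } catch (FilterTypeException e) {
            // The message names the expression and its type, unlike a bare ClassCastException.
            System.out.println(e.getMessage());
        }
    }
}
```

With a check of this kind the user sees which expression is at fault, instead of a cast failure raised from inside an operator.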




[jira] [Work logged] (HIVE-24404) Hive getUserName close db makes client operations lost metaStoreClient connection

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24404?focusedWorklogId=537580=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537580
 ]

ASF GitHub Bot logged work on HIVE-24404:
-

Author: ASF GitHub Bot
Created on: 19/Jan/21 01:32
Start Date: 19/Jan/21 01:32
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1685:
URL: https://github.com/apache/hive/pull/1685#issuecomment-762545173


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 537580)
Time Spent: 0.5h  (was: 20m)

> Hive getUserName close db makes client operations lost metaStoreClient 
> connection
> -
>
> Key: HIVE-24404
> URL: https://issues.apache.org/jira/browse/HIVE-24404
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.3.7
> Environment: os: centos 7
> spark: 3.0.1
> hive: 2.3.7
>Reporter: Lichuanliang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When I use Spark to execute a drop partition SQL statement, I always 
> encounter a lost metastore connection warning.
>  Spark SQL:
> {code:java}
> alter table mydb.some_table drop if exists partition(dt = '2020-11-12',hh = 
> '17');
> {code}
> Execution log:
> {code:java}
> 20/11/12 19:37:57 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
> 20/11/12 19:37:57 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
> 20/11/12 19:37:57 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect (1 of 1) after 1s. listPartitionsWithAuthInfo
> org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
>  at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
>  at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:185)
>  at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:116)
>  at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:70)
>  at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
>  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_partitions_ps_with_auth(ThriftHiveMetastore.java:2562)
>  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_ps_with_auth(ThriftHiveMetastore.java:2549)
>  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsWithAuthInfo(HiveMetaStoreClient.java:1209)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
>  at com.sun.proxy.$Proxy32.listPartitionsWithAuthInfo(Unknown Source)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2336)
>  at com.sun.proxy.$Proxy32.listPartitionsWithAuthInfo(Unknown Source)
>  at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:2555)
>  at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:2581)
>  at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$dropPartitions$2(HiveClientImpl.scala:628)
>  at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
>  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>  at

[jira] [Work logged] (HIVE-24371) Ranger Replication fallback to updateIfExists

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24371?focusedWorklogId=537583=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537583
 ]

ASF GitHub Bot logged work on HIVE-24371:
-

Author: ASF GitHub Bot
Created on: 19/Jan/21 01:32
Start Date: 19/Jan/21 01:32
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1663:
URL: https://github.com/apache/hive/pull/1663


   





Issue Time Tracking
---

Worklog Id: (was: 537583)
Time Spent: 0.5h  (was: 20m)

> Ranger Replication fallback to updateIfExists
> -
>
> Key: HIVE-24371
> URL: https://issues.apache.org/jira/browse/HIVE-24371
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24371.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Ranger Replication fallback to updateIfExists
> Add dummy resource as workaround while creating the deny policy to avoid it 
> from overriding the actual policy




[jira] [Work logged] (HIVE-24363) Current order of transactional event listeners is prone to deadlock in backend DB connections

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24363?focusedWorklogId=537579=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537579
 ]

ASF GitHub Bot logged work on HIVE-24363:
-

Author: ASF GitHub Bot
Created on: 19/Jan/21 01:32
Start Date: 19/Jan/21 01:32
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1657:
URL: https://github.com/apache/hive/pull/1657


   





Issue Time Tracking
---

Worklog Id: (was: 537579)
Time Spent: 2h 20m  (was: 2h 10m)

> Current order of transactional event listeners is prone to deadlock in 
> backend DB connections
> -
>
> Key: HIVE-24363
> URL: https://issues.apache.org/jira/browse/HIVE-24363
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24363.01.patch, HIVE-24363.02.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently the AcidEventListener is added to the end of the list of 
> transactional event listeners. When DbNotificationListener is configured in 
> 'hive.metastore.transactional.event.listeners', the final list will be:
> {"DbNotificationListener" , "AcidEventListener"}
> This results in backend DB lock acquisition in this order:
> {code:java}
> lock(a) {
>   // perform some op on a
>   lock(b) {
>     // perform some op on b
>   }
> }
> {code}
> On the other hand, some HMS APIs, for example commit_txn(), call the 
> TxnHandler method directly, followed by DbNotificationListener processing. 
> This results in lock acquisition in the reverse order:
> {code:java}
> lock(b) {
>   // perform some op on b
>   lock(a) {
>     // perform some op on a
>   }
> }
> {code}
> Note: 'a' and 'b' above are backend DB locks, not JVM locks.
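
The usual remedy for this kind of lock-order inversion is to make every code path take the locks in the same order. The sketch below illustrates that with JVM locks purely as an analogy, since the locks in the issue are backend DB locks. The class LockOrderSketch and its members are hypothetical names and are not taken from the Hive patch.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {
    // Stand-ins for the two backend resources 'a' and 'b' (analogy only).
    static final ReentrantLock lockA = new ReentrantLock();
    static final ReentrantLock lockB = new ReentrantLock();
    static int completed = 0;

    /** Every caller takes 'a' first and 'b' second, so no cycle can form. */
    static void updateBoth() {
        lockA.lock();
        try {
            lockB.lock();
            try {
                completed++;  // guarded by both locks
            } finally {
                lockB.unlock();
            }
        } finally {
            lockA.unlock();
        }
    }

    /** Runs two concurrent callers and returns how many finished. */
    static int runTwoPaths() {
        completed = 0;
        Thread listenerPath = new Thread(LockOrderSketch::updateBoth);
        Thread commitPath = new Thread(LockOrderSketch::updateBoth);
        listenerPath.start();
        commitPath.start();
        try {
            listenerPath.join();
            commitPath.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println("completed paths: " + runTwoPaths());
    }
}
```

If one of the two paths took 'b' before 'a' instead, each thread could end up holding one lock while waiting for the other, which is the deadlock pattern described above.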




[jira] [Work logged] (HIVE-24169) HiveServer2 UDF cache

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24169?focusedWorklogId=537581=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537581
 ]

ASF GitHub Bot logged work on HIVE-24169:
-

Author: ASF GitHub Bot
Created on: 19/Jan/21 01:32
Start Date: 19/Jan/21 01:32
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1503:
URL: https://github.com/apache/hive/pull/1503#issuecomment-762545197


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 537581)
Time Spent: 1h 40m  (was: 1.5h)

> HiveServer2 UDF cache
> -
>
> Key: HIVE-24169
> URL: https://issues.apache.org/jira/browse/HIVE-24169
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> UDFs are cached per session. This optional feature can help speed up UDF 
> access in S3 scenarios.




[jira] [Work logged] (HIVE-24370) Make the GetPartitionsProjectionSpec generic and add builder methods for tables and partitions in HiveMetaStoreClient

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24370?focusedWorklogId=537578=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537578
 ]

ASF GitHub Bot logged work on HIVE-24370:
-

Author: ASF GitHub Bot
Created on: 19/Jan/21 01:32
Start Date: 19/Jan/21 01:32
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1664:
URL: https://github.com/apache/hive/pull/1664#issuecomment-762545183


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 537578)
Time Spent: 2h 10m  (was: 2h)

> Make the GetPartitionsProjectionSpec generic and add builder methods for 
> tables and partitions in HiveMetaStoreClient
> -
>
> Key: HIVE-24370
> URL: https://issues.apache.org/jira/browse/HIVE-24370
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Narayanan Venkateswaran
>Assignee: Narayanan Venkateswaran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> HIVE-20306 defines a projection struct called GetPartitionsProjectionSpec. 
> Although it has Partition in its name, this is a fairly generic struct 
> with nothing specific to partitions. It should be renamed to a more generic 
> name (GetProjectionSpec ?) and builder methods of this class for tables and 
> partitions must be added to HiveMetaStoreClient.




[jira] [Work logged] (HIVE-22981) DataFileReader is not closed in AvroGenericRecordReader#extractWriterTimezoneFromMetadata

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22981?focusedWorklogId=537577=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537577
 ]

ASF GitHub Bot logged work on HIVE-22981:
-

Author: ASF GitHub Bot
Created on: 19/Jan/21 01:32
Start Date: 19/Jan/21 01:32
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1686:
URL: https://github.com/apache/hive/pull/1686#issuecomment-762545169


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 537577)
Time Spent: 0.5h  (was: 20m)

> DataFileReader is not closed in 
> AvroGenericRecordReader#extractWriterTimezoneFromMetadata
> -
>
> Key: HIVE-22981
> URL: https://issues.apache.org/jira/browse/HIVE-22981
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22981.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The method looks like:
> {code}
>   private ZoneId extractWriterTimezoneFromMetadata(JobConf job, FileSplit split,
>       GenericDatumReader gdr) throws IOException {
>     if (job == null || gdr == null || split == null || split.getPath() == null) {
>       return null;
>     }
>     try {
>       DataFileReader dataFileReader =
>           new DataFileReader(new FsInput(split.getPath(), job), gdr);
>       [...return...]
>       }
>     } catch (IOException e) {
>       // Can't access metadata, carry on.
>     }
>     return null;
>   }
> {code}
> The DataFileReader is never closed, which can cause a memory leak. We need a 
> try-with-resources here.
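
A minimal sketch of the try-with-resources shape suggested above. To stay self-contained it does not use Avro: MetadataReader is a hypothetical stand-in for DataFileReader, and the key "example.key" and the returned value are made up for illustration.

```java
import java.io.Closeable;
import java.io.IOException;

public class TryWithResourcesSketch {

    /** Hypothetical stand-in for a reader such as DataFileReader. */
    static class MetadataReader implements Closeable {
        static int openCount = 0;  // tracks readers that are still open

        MetadataReader() {
            openCount++;
        }

        String getMeta(String key) throws IOException {
            return "UTC";  // made-up metadata value
        }

        @Override
        public void close() {
            openCount--;
        }
    }

    /** Reads one metadata value; the reader is closed on every exit path. */
    static String readTimezone() {
        try (MetadataReader reader = new MetadataReader()) {
            return reader.getMeta("example.key");
        } catch (IOException e) {
            // Can't access metadata, carry on.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(readTimezone() + ", open readers: " + MetadataReader.openCount);
    }
}
```

The reader is closed whether the method returns normally or getMeta throws, which is the property the original method lacks.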




[jira] [Work logged] (HIVE-24353) performance: Refactor TimestampTZ parsing

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24353?focusedWorklogId=537571=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537571
 ]

ASF GitHub Bot logged work on HIVE-24353:
-

Author: ASF GitHub Bot
Created on: 19/Jan/21 01:01
Start Date: 19/Jan/21 01:01
Worklog Time Spent: 10m 
  Work Description: VPriesnitz commented on pull request #1650:
URL: https://github.com/apache/hive/pull/1650#issuecomment-762537083


   Is there anything needed from my side to get this PR merged? 





Issue Time Tracking
---

Worklog Id: (was: 537571)
Time Spent: 50m  (was: 40m)

> performance: Refactor TimestampTZ parsing
> -
>
> Key: HIVE-24353
> URL: https://issues.apache.org/jira/browse/HIVE-24353
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vincenz Priesnitz
>Assignee: Vincenz Priesnitz
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I found that for datasets that contain a lot of timestamps (without 
> timezones) Hive spends the majority of its time in TimestampTZUtil.parse, in 
> particular constructing stack traces for the try-catch blocks. 
> When parsing TimestampTZ we currently use a fallback chain with several 
> try-catch blocks. For a common timestamp string without a timezone, we 
> currently throw and catch 2 exceptions, and actually parse the string twice. 
> I propose a refactor that parses the string once and then expresses the 
> fallback chain with queries to the parsed TemporalAccessor. 
>  
> Update: I added a PR that resolves this issue: 
> [https://github.com/apache/hive/pull/1650] 
>  
>  
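
The parse-once approach described above can be sketched with java.time as follows. This is not the code from the pull request: the class ParseOnceSketch, the formatter layout (date, optional time, optional zone) and the midnight and default-zone fallbacks are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.TemporalAccessor;
import java.time.temporal.TemporalQueries;

public class ParseOnceSketch {
    // Assumed format for illustration: date, optional time, optional zone.
    static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
        .append(DateTimeFormatter.ISO_LOCAL_DATE)
        .optionalStart().appendLiteral(' ').append(DateTimeFormatter.ISO_LOCAL_TIME).optionalEnd()
        .optionalStart().appendLiteral(' ').appendZoneOrOffsetId().optionalEnd()
        .toFormatter();

    /** Parses the text once, then fills missing parts by querying the result. */
    static ZonedDateTime parse(String text, ZoneId defaultZone) {
        TemporalAccessor parsed = FORMATTER.parse(text);
        // Each query returns null when that part was absent; no exception is thrown.
        LocalDate date = parsed.query(TemporalQueries.localDate());
        LocalTime time = parsed.query(TemporalQueries.localTime());
        ZoneId zone = parsed.query(TemporalQueries.zone());
        return ZonedDateTime.of(
            date,
            time != null ? time : LocalTime.MIDNIGHT,
            zone != null ? zone : defaultZone);
    }

    public static void main(String[] args) {
        ZoneId fallback = ZoneId.of("Asia/Kolkata");
        System.out.println(parse("2021-01-18 05:50:00", fallback));
        System.out.println(parse("2021-01-18 05:50:00 UTC", fallback));
        System.out.println(parse("2021-01-18", fallback));
    }
}
```

The fallback chain becomes a few null checks on the parsed result, so a timestamp without a zone no longer pays for exception construction.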




[jira] [Updated] (HIVE-24654) Table level replication support for Atlas metadata

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24654:
--
Labels: pull-request-available  (was: )

> Table level replication support for Atlas metadata
> --
>
> Key: HIVE-24654
> URL: https://issues.apache.org/jira/browse/HIVE-24654
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24654.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Covers mainly Atlas export API payload change required to support table level 
> replication




[jira] [Work logged] (HIVE-24654) Table level replication support for Atlas metadata

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24654?focusedWorklogId=537551=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537551
 ]

ASF GitHub Bot logged work on HIVE-24654:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 21:44
Start Date: 18/Jan/21 21:44
Worklog Time Spent: 10m 
  Work Description: pkumarsinha opened a new pull request #1883:
URL: https://github.com/apache/hive/pull/1883


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 537551)
Remaining Estimate: 0h
Time Spent: 10m

> Table level replication support for Atlas metadata
> --
>
> Key: HIVE-24654
> URL: https://issues.apache.org/jira/browse/HIVE-24654
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
> Attachments: HIVE-24654.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Covers mainly Atlas export API payload change required to support table level 
> replication




[jira] [Updated] (HIVE-24654) Table level replication support for Atlas metadata

2021-01-18 Thread Pravin Sinha (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha updated HIVE-24654:

Attachment: HIVE-24654.01.patch

> Table level replication support for Atlas metadata
> --
>
> Key: HIVE-24654
> URL: https://issues.apache.org/jira/browse/HIVE-24654
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
> Attachments: HIVE-24654.01.patch
>
>
> Covers mainly Atlas export API payload change required to support table level 
> replication




[jira] [Updated] (HIVE-24654) Table level replication support for Atlas metadata

2021-01-18 Thread Pravin Sinha (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha updated HIVE-24654:

Status: Patch Available  (was: Open)

> Table level replication support for Atlas metadata
> --
>
> Key: HIVE-24654
> URL: https://issues.apache.org/jira/browse/HIVE-24654
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>
> Covers mainly Atlas export API payload change required to support table level 
> replication




[jira] [Assigned] (HIVE-24654) Table level replication support for Atlas metadata

2021-01-18 Thread Pravin Sinha (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha reassigned HIVE-24654:
---


> Table level replication support for Atlas metadata
> --
>
> Key: HIVE-24654
> URL: https://issues.apache.org/jira/browse/HIVE-24654
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>
> Covers mainly Atlas export API payload change required to support table level 
> replication




[jira] [Updated] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24558:
---
Status: In Progress  (was: Patch Available)

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch, HIVE-24558.07.patch, HIVE-24558.08.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24558:
---
Attachment: HIVE-24558.08.patch
Status: Patch Available  (was: In Progress)

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch, HIVE-24558.07.patch, HIVE-24558.08.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24558:
---
Status: In Progress  (was: Patch Available)

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch, HIVE-24558.07.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24558:
---
Attachment: HIVE-24558.07.patch
Status: Patch Available  (was: In Progress)

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch, HIVE-24558.07.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24534) Prevent comparisons between characters and decimals types when strict checks enabled

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24534?focusedWorklogId=537443=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537443
 ]

ASF GitHub Bot logged work on HIVE-24534:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 15:09
Start Date: 18/Jan/21 15:09
Worklog Time Spent: 10m 
  Work Description: zabetak commented on pull request #1780:
URL: https://github.com/apache/hive/pull/1780#issuecomment-762308202


   Closing and reopening to retrigger tests.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 537443)
Time Spent: 2h 20m  (was: 2h 10m)

> Prevent comparisons between characters and decimals types when strict checks 
> enabled
> 
>
> Key: HIVE-24534
> URL: https://issues.apache.org/jira/browse/HIVE-24534
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> When we compare decimal and character types, implicit conversions take place 
> that can lead to unexpected and surprising results. 
> {code:sql}
> create table t_str (str_col string);
> insert into t_str values ('1208925742523269458163819');
> select * from t_str where str_col=1208925742523269479013976;
> {code}
> The SELECT query returns one row even though the filtering value is not the 
> same as the one present in the string column of the table. The problem is that 
> both values are converted to doubles and, due to loss of precision, they 
> are deemed equal.
> Even if we change the implicit conversion to use another type (HIVE-24528), 
> there are always some cases that may lead to unexpected results. 
> The goal of this issue is to prevent comparisons between decimal and 
> character types when hive.strict.checks.type.safety is enabled and throw an 
> error instead. 
>  
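
The precision loss described above can be reproduced outside Hive. What follows is a minimal Java sketch under stated assumptions, not Hive's actual comparison code: the class and method names are invented for illustration. It compares the two values from the example once as doubles and once as exact decimals.

```java
import java.math.BigDecimal;

class DecimalStringComparison {
    // Mimics the lossy path: both operands are converted to double before comparing.
    static boolean equalAsDoubles(String columnValue, String literal) {
        return Double.parseDouble(columnValue) == Double.parseDouble(literal);
    }

    // Exact path: both operands keep every digit.
    static boolean equalAsDecimals(String columnValue, String literal) {
        return new BigDecimal(columnValue).compareTo(new BigDecimal(literal)) == 0;
    }

    public static void main(String[] args) {
        String stored = "1208925742523269458163819"; // value inserted into str_col
        String filter = "1208925742523269479013976"; // literal used in the WHERE clause
        System.out.println("equal as doubles:  " + equalAsDoubles(stored, filter));
        System.out.println("equal as decimals: " + equalAsDecimals(stored, filter));
    }
}
```

A double carries 53 bits of significand, so distinct integers of this magnitude can map to the same double, which is the kind of silent match the strict check is meant to reject.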



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24534) Prevent comparisons between characters and decimals types when strict checks enabled

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24534?focusedWorklogId=537444=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537444
 ]

ASF GitHub Bot logged work on HIVE-24534:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 15:09
Start Date: 18/Jan/21 15:09
Worklog Time Spent: 10m 
  Work Description: zabetak opened a new pull request #1780:
URL: https://github.com/apache/hive/pull/1780


   ### What changes were proposed in this pull request?
   Throw an error when `hive.strict.checks.type.safety=true` and the query 
contains a comparison between decimal and character types.
   
   ### Why are the changes needed?
   To fail fast and avoid unexpected query results. Examples in the JIRA.
   
   
   ### Does this PR introduce _any_ user-facing change?
   Queries relying on comparisons between decimal and character types will fail.
   
   ### How was this patch tested?
   `mvn clean test -Dtest=TestNegativeCliDriver 
-Dqfile="strict_type_decimal_char_00.q,strict_type_decimal_char_01.q,strict_type_decimal_string_00.q,strict_type_decimal_string_01.q,strict_type_decimal_string_02.q,strict_type_decimal_varchar_00.q,strict_type_decimal_varchar_01.q,strict_type_decimal_varchar_02.q,strict_type_decimal_varchar_03.q,strict_type_decimal_varchar_04.q,strict_type_decimal_varchar_05.q,strict_type_decimal_varchar_06.q,strict_type_decimal_varchar_07.q,strict_type_decimal_varchar_08.q"
 -Dtest.output.overwrite`
   `mvn test -Dtest=TestDecimalStringValidation`





Issue Time Tracking
---

Worklog Id: (was: 537444)
Time Spent: 2.5h  (was: 2h 20m)

> Prevent comparisons between characters and decimals types when strict checks 
> enabled
> 
>
> Key: HIVE-24534
> URL: https://issues.apache.org/jira/browse/HIVE-24534
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When we compare decimal and character types, implicit conversions take place 
> that can lead to unexpected and surprising results. 
> {code:sql}
> create table t_str (str_col string);
> insert into t_str values ('1208925742523269458163819');
> select * from t_str where str_col=1208925742523269479013976;
> {code}
> The SELECT query returns one row even though the filtering value is not the 
> same as the one present in the string column of the table. The problem is that 
> both values are converted to doubles and, due to loss of precision, they 
> are deemed equal.
> Even if we change the implicit conversion to use another type (HIVE-24528), 
> there are always some cases that may lead to unexpected results. 
> The goal of this issue is to prevent comparisons between decimal and 
> character types when hive.strict.checks.type.safety is enabled and throw an 
> error instead. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24534) Prevent comparisons between characters and decimals types when strict checks enabled

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24534?focusedWorklogId=537442=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537442
 ]

ASF GitHub Bot logged work on HIVE-24534:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 15:08
Start Date: 18/Jan/21 15:08
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #1780:
URL: https://github.com/apache/hive/pull/1780


   





Issue Time Tracking
---

Worklog Id: (was: 537442)
Time Spent: 2h 10m  (was: 2h)

> Prevent comparisons between characters and decimals types when strict checks 
> enabled
> 
>
> Key: HIVE-24534
> URL: https://issues.apache.org/jira/browse/HIVE-24534
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When we compare decimal and character types, implicit conversions take place 
> that can lead to unexpected and surprising results. 
> {code:sql}
> create table t_str (str_col string);
> insert into t_str values ('1208925742523269458163819');
> select * from t_str where str_col=1208925742523269479013976;
> {code}
> The SELECT query returns one row even though the filtering value is not the 
> same as the one present in the string column of the table. The problem is that 
> both values are converted to doubles and, due to loss of precision, they 
> are deemed equal.
> Even if we change the implicit conversion to use another type (HIVE-24528), 
> there are always some cases that may lead to unexpected results. 
> The goal of this issue is to prevent comparisons between decimal and 
> character types when hive.strict.checks.type.safety is enabled and throw an 
> error instead. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24601) Control CBO fallback behavior via property

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24601?focusedWorklogId=537433=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537433
 ]

ASF GitHub Bot logged work on HIVE-24601:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 15:03
Start Date: 18/Jan/21 15:03
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #1875:
URL: https://github.com/apache/hive/pull/1875#discussion_r559627202



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -1806,6 +1806,12 @@ private static void 
populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal
 
 // CBO related
 HIVE_CBO_ENABLED("hive.cbo.enable", true, "Flag to control enabling Cost 
Based Optimizations using Calcite framework."),
+HIVE_CBO_FALLBACK_STRATEGY("hive.cbo.fallback.strategy", "CONSERVATIVE", 
new StringSet("NEVER, CONSERVATIVE, ALWAYS, TEST"),

Review comment:
   Good catch @kgyrtkirk ! Fixed in 
https://github.com/apache/hive/pull/1875/commits/a0ed3355de858b9aa49ae6989966451b04029353
 and added tests in 
https://github.com/apache/hive/pull/1875/commits/bb7c60ff43257f6a9de4db125a71e6f2328017ee
 and 
https://github.com/apache/hive/pull/1875/commits/2d28db68a1830c2c4c9212d3650cd78db5375a4e.







Issue Time Tracking
---

Worklog Id: (was: 537433)
Time Spent: 0.5h  (was: 20m)

> Control CBO fallback behavior via property
> --
>
> Key: HIVE-24601
> URL: https://issues.apache.org/jira/browse/HIVE-24601
> Project: Hive
>  Issue Type: Improvement
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When the CBO optimizer fails, there is a fallback mechanism (HIVE-7413) that 
> retries the query using the legacy Hive optimizer. 
> There are use cases where this behavior is not desirable, notably for the 
> tests (HIVE-16058) but also for end users who would like to disable the 
> fallback mechanism to avoid running problematic queries without realizing it.
> The goal of this issue is to introduce a dedicated Hive property controlling 
> this behavior, {{hive.cbo.fallback.enable}}, for both tests and production. 
> The default value should be true, and tests should run with this property set 
> to false. 
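
The behavior described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the code from the pull request: the class, the `plan` method, and the two planner suppliers are hypothetical, and the boolean parameter stands in for the value of the proposed property.

```java
import java.util.function.Supplier;

class CboFallbackSketch {
    // Tries the CBO planner first. If it fails, the legacy planner is used
    // only when fallback is enabled; otherwise the CBO failure is surfaced.
    static <T> T plan(Supplier<T> cboPlanner, Supplier<T> legacyPlanner, boolean fallbackEnabled) {
        try {
            return cboPlanner.get();
        } catch (RuntimeException cboFailure) {
            if (!fallbackEnabled) {
                throw cboFailure; // tests would run in this mode to catch CBO bugs
            }
            return legacyPlanner.get();
        }
    }
}
```

With the flag set to false, a CBO failure reaches the caller instead of being hidden by a successful legacy plan, which is the reason the description wants tests to run that way.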



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (HIVE-24473) Make Hive buildable with HBase 2.x GA versions

2021-01-18 Thread Istvan Toth (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267328#comment-17267328
 ] 

Istvan Toth commented on HIVE-24473:


No, it wouldn't make a difference. 
The public HBase maven artifacts are compiled with Hadoop2, so they will clash 
with any Hadoop 3.x libraries.

> Make Hive buildable with HBase 2.x GA versions
> --
>
> Key: HIVE-24473
> URL: https://issues.apache.org/jira/browse/HIVE-24473
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive currently builds with a 2.0.0 pre-release.
> Unfortunately the HBase project doesn't provide public maven artifacts that 
> are binary compatible with Hadoop 3.1, so unless we add a build step that 
> recompiles HBase with Hadoop3, we cannot update to a GA HBase 2 version.
> We should at least make sure that Hive can be built with GA HBase 2 releases.
> -Update HBase to more recent version.-
> -We cannot use anything later than 2.2.4 because of HBASE-22394-
> -So the options are 2.1.10 and 2.2.4-
> -I suggest 2.1.10 because it's a chronologically later release, and it 
> maximises compatibility with HBase server deployments.-
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24652) If compactor worker times out, compaction is not cleared from queue

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24652?focusedWorklogId=537424=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537424
 ]

ASF GitHub Bot logged work on HIVE-24652:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 14:47
Start Date: 18/Jan/21 14:47
Worklog Time Spent: 10m 
  Work Description: klcopp commented on pull request #1881:
URL: https://github.com/apache/hive/pull/1881#issuecomment-762295707


   Note: The compaction's txn will stay open if it is open when the Worker 
times out. But AcidHouseKeeperService should close it when the txn times out.





Issue Time Tracking
---

Worklog Id: (was: 537424)
Time Spent: 20m  (was: 10m)

> If compactor worker times out, compaction is not cleared from queue
> ---
>
> Key: HIVE-24652
> URL: https://issues.apache.org/jira/browse/HIVE-24652
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If a worker times out (that is, takes longer than the value of 
> hive.compactor.worker.timeout), the corresponding entry is not cleared 
> from the COMPACTION_QUEUE table. The entry is left in state "working" or 
> "initiated", which means that compaction can't be run again on the 
> table/partition.
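
One way to picture the intended fix is a worker that always moves its queue entry out of the "working" state, even when it times out. The Java sketch below is an assumption-laden illustration and not the patch itself: an in-memory map stands in for the COMPACTION_QUEUE table, and the class name, method name, and the "failed" and "succeeded" state labels are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CompactionTimeoutSketch {
    // Runs one compaction with a time limit and records the outcome in the queue,
    // so the entry never stays in "working" after the worker gives up.
    static String runWithTimeout(Map<Long, String> queue, long id,
                                 Callable<Void> compaction, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        queue.put(id, "working");
        Future<Void> future = executor.submit(compaction);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            queue.put(id, "succeeded");
        } catch (TimeoutException | ExecutionException e) {
            future.cancel(true);      // interrupt the stuck compaction
            queue.put(id, "failed");  // free the table/partition for a retry
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            queue.put(id, "failed");
        } finally {
            executor.shutdownNow();
        }
        return queue.get(id);
    }
}
```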



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-24653) Race condition between compactor marker generation and get splits

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24653:
--
Labels: pull-request-available  (was: )

> Race condition between compactor marker generation and get splits
> -
>
> Key: HIVE-24653
> URL: https://issues.apache.org/jira/browse/HIVE-24653
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a rare scenario it's possible that the compactor moves the files to the 
> final location before creating the compactor marker, so they can be fetched by 
> get splits before the marker is created.
> 2020-09-14 04:55:25,978 [ERROR] ORC_GET_SPLITS #4 |io.AcidUtils|: Failed to 
> read 
> hdfs://host/warehouse/tablespace/managed/hive/database.db/table/partition=x/base_0011535/_metadata_acid:
>  No content to map to Object due to end of input
> java.io.EOFException: No content to map to Object due to end of input
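
A common way to avoid this kind of race is to write the marker while the directory is still in a staging location and only then move the directory into place. The Java sketch below illustrates that ordering on a local filesystem with `java.nio.file`. It is not the Hive patch and does not use the HDFS API; the class name, method name, and marker content are assumptions, and only the `_metadata_acid` file name is taken from the log above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class MarkerBeforeMoveSketch {
    // Publishes a staged directory so that readers never see it without its marker.
    static Path publish(Path stagingDir, Path finalDir,
                        String markerName, String markerContent) throws IOException {
        // 1. Write the complete marker while the directory is still invisible to readers.
        Files.write(stagingDir.resolve(markerName),
                markerContent.getBytes(StandardCharsets.UTF_8));
        // 2. Only then expose the directory, in a single atomic rename.
        return Files.move(stagingDir, finalDir, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The atomic move assumes staging and final locations are on the same filesystem and that the final directory does not exist yet.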



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24653) Race condition between compactor marker generation and get splits

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24653?focusedWorklogId=537420=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537420
 ]

ASF GitHub Bot logged work on HIVE-24653:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 14:43
Start Date: 18/Jan/21 14:43
Worklog Time Spent: 10m 
  Work Description: asinkovits opened a new pull request #1882:
URL: https://github.com/apache/hive/pull/1882


   …t splits
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   In a rare scenario it's possible that the compactor moves the files to the 
final location before creating the compactor marker, so they can be fetched by 
get splits before the marker is created.
   
   2020-09-14 04:55:25,978 [ERROR] ORC_GET_SPLITS #4 |io.AcidUtils|: Failed to 
read 
hdfs://host/warehouse/tablespace/managed/hive/database.db/table/partition=x/base_0011535/_metadata_acid:
 No content to map to Object due to end of input
   java.io.EOFException: No content to map to Object due to end of input
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 537420)
Remaining Estimate: 0h
Time Spent: 10m

> Race condition between compactor marker generation and get splits
> -
>
> Key: HIVE-24653
> URL: https://issues.apache.org/jira/browse/HIVE-24653
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a rare scenario it's possible that the compactor moves the files to the 
> final location before creating the compactor marker, so they can be fetched by 
> get splits before the marker is created.
> 2020-09-14 04:55:25,978 [ERROR] ORC_GET_SPLITS #4 |io.AcidUtils|: Failed to 
> read 
> hdfs://host/warehouse/tablespace/managed/hive/database.db/table/partition=x/base_0011535/_metadata_acid:
>  No content to map to Object due to end of input
> java.io.EOFException: No content to map to Object due to end of input



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-24652) If compactor worker times out, compaction is not cleared from queue

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24652:
--
Labels: pull-request-available  (was: )

> If compactor worker times out, compaction is not cleared from queue
> ---
>
> Key: HIVE-24652
> URL: https://issues.apache.org/jira/browse/HIVE-24652
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a worker times out (that is, takes longer than the value of 
> hive.compactor.worker.timeout), the corresponding entry is not cleared 
> from the COMPACTION_QUEUE table. The entry is left in state "working" or 
> "initiated", which means that compaction can't be run again on the 
> table/partition.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24652) If compactor worker times out, compaction is not cleared from queue

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24652?focusedWorklogId=537419=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537419
 ]

ASF GitHub Bot logged work on HIVE-24652:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 14:42
Start Date: 18/Jan/21 14:42
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #1881:
URL: https://github.com/apache/hive/pull/1881


   ### What changes were proposed in this pull request?
   
   ### Why are the changes needed?
   
   ### Does this PR introduce _any_ user-facing change?
   
   See HIVE-24652
   
   ### How was this patch tested?
   Unit test
   





Issue Time Tracking
---

Worklog Id: (was: 537419)
Remaining Estimate: 0h
Time Spent: 10m

> If compactor worker times out, compaction is not cleared from queue
> ---
>
> Key: HIVE-24652
> URL: https://issues.apache.org/jira/browse/HIVE-24652
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a worker times out (that is, takes longer than the value of 
> hive.compactor.worker.timeout), the corresponding entry is not cleared 
> from the COMPACTION_QUEUE table. The entry is left in state "working" or 
> "initiated", which means that compaction can't be run again on the 
> table/partition.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Assigned] (HIVE-24653) Race condition between compactor marker generation and get splits

2021-01-18 Thread Antal Sinkovits (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits reassigned HIVE-24653:
--


> Race condition between compactor marker generation and get splits
> -
>
> Key: HIVE-24653
> URL: https://issues.apache.org/jira/browse/HIVE-24653
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Minor
>
> In a rare scenario it's possible that the compactor moves the files to the 
> final location before creating the compactor marker, so they can be fetched by 
> get splits before the marker is created.
> 2020-09-14 04:55:25,978 [ERROR] ORC_GET_SPLITS #4 |io.AcidUtils|: Failed to 
> read 
> hdfs://host/warehouse/tablespace/managed/hive/database.db/table/partition=x/base_0011535/_metadata_acid:
>  No content to map to Object due to end of input
> java.io.EOFException: No content to map to Object due to end of input



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24624) Repl Load should detect the compatible staging dir

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24624?focusedWorklogId=537410=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537410
 ]

ASF GitHub Bot logged work on HIVE-24624:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 14:14
Start Date: 18/Jan/21 14:14
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #1855:
URL: https://github.com/apache/hive/pull/1855#discussion_r559594849



##
File path: 
ql/src/test/org/apache/hadoop/hive/ql/parse/repl/load/TestDumpMetaData.java
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.parse.repl.load;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.parse.repl.DumpType;
+import org.apache.hadoop.hive.ql.parse.repl.dump.Utils;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import static org.junit.Assert.*;

Review comment:
   Don't use * in the import.







Issue Time Tracking
---

Worklog Id: (was: 537410)
Time Spent: 40m  (was: 0.5h)

> Repl Load should detect the compatible staging dir
> --
>
> Key: HIVE-24624
> URL: https://issues.apache.org/jira/browse/HIVE-24624
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pratyushotpal Madhukar
>Assignee: Pratyushotpal Madhukar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24624.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When pointed to a staging dir, Repl load in CDP should be able to detect 
> whether the staging dir has the dump structure in a compatible format.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24624) Repl Load should detect the compatible staging dir

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24624?focusedWorklogId=537407=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537407
 ]

ASF GitHub Bot logged work on HIVE-24624:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 14:13
Start Date: 18/Jan/21 14:13
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #1855:
URL: https://github.com/apache/hive/pull/1855#discussion_r559594298



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java
##
@@ -359,6 +359,10 @@ private void analyzeReplLoad(ASTNode ast) throws 
SemanticException {
   if (loadPath != null) {
 DumpMetaData dmd = new DumpMetaData(loadPath, conf);
 
+if (!dmd.isVersionCompatible()) {
+  throw new SemanticException("Dump version: " + dmd.getHiveVersion() +
+  ". Versions older than 3 are not supported.");

Review comment:
   Don't hardcode 3. Use the constant.
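
   The suggestion can be illustrated with a small Java sketch in which the minimum version lives in one named constant shared by the check and the error message. This is not the code from the pull request: the class name, the constant name, and the version parsing are assumptions made for the example.

```java
class DumpVersionCheckSketch {
    // Hypothetical constant: the single place where the minimum supported major version is defined.
    static final int MIN_SUPPORTED_MAJOR_VERSION = 3;

    // Returns true when the major part of a version string such as "3.1.2" is new enough.
    static boolean isVersionCompatible(String hiveVersion) {
        if (hiveVersion == null || hiveVersion.isEmpty()) {
            return false;
        }
        try {
            int major = Integer.parseInt(hiveVersion.split("\\.")[0]);
            return major >= MIN_SUPPORTED_MAJOR_VERSION;
        } catch (NumberFormatException e) {
            return false; // an unparseable version is treated as incompatible
        }
    }

    // Builds the error text from the same constant, so the message cannot drift from the check.
    static String unsupportedMessage(String hiveVersion) {
        return "Dump version: " + hiveVersion + ". Versions older than "
                + MIN_SUPPORTED_MAJOR_VERSION + " are not supported.";
    }
}
```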







Issue Time Tracking
---

Worklog Id: (was: 537407)
Time Spent: 0.5h  (was: 20m)

> Repl Load should detect the compatible staging dir
> --
>
> Key: HIVE-24624
> URL: https://issues.apache.org/jira/browse/HIVE-24624
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pratyushotpal Madhukar
>Assignee: Pratyushotpal Madhukar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24624.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When pointed to a staging dir, Repl load in CDP should be able to detect 
> whether the staging dir has the dump structure in a compatible format.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (HIVE-24473) Make Hive buildable with HBase 2.x GA versions

2021-01-18 Thread Zoltan Haindrich (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267306#comment-17267306
 ] 

Zoltan Haindrich commented on HIVE-24473:
-

I understand your goal. Would it be better if Hive used hadoop-3.2?

> Make Hive buildable with HBase 2.x GA versions
> --
>
> Key: HIVE-24473
> URL: https://issues.apache.org/jira/browse/HIVE-24473
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive currently builds with a 2.0.0 pre-release.
> Unfortunately the HBase project doesn't provide public maven artifacts that 
> are binary compatible with Hadoop 3.1, so unless we add a build step that 
> recompiles HBase with Hadoop3, we cannot update to a GA HBase 2 version.
> We should at least make sure that Hive can be built with GA HBase 2 releases.
> -Update HBase to more recent version.-
> -We cannot use anything later than 2.2.4 because of HBASE-22394-
> -So the options are 2.1.10 and 2.2.4-
> -I suggest 2.1.10 because it's a chronologically later release, and it 
> maximises compatibility with HBase server deployments.-
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Assigned] (HIVE-24652) If compactor worker times out, compaction is not cleared from queue

2021-01-18 Thread Karen Coppage (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage reassigned HIVE-24652:



> If compactor worker times out, compaction is not cleared from queue
> ---
>
> Key: HIVE-24652
> URL: https://issues.apache.org/jira/browse/HIVE-24652
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>
> If a worker times out (that is, takes longer than the value of 
> hive.compactor.worker.timeout), the corresponding entry is not cleared 
> from the COMPACTION_QUEUE table. The entry is left in state "working" or 
> "initiated", which means that compaction can't be run again on the 
> table/partition.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Pravin Sinha (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267291#comment-17267291
 ] 

Pravin Sinha commented on HIVE-24558:
-

+1

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>





[jira] [Resolved] (HIVE-24623) Wrong FS error during dump for table-level replication when staging is remote.

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi resolved HIVE-24623.

Resolution: Fixed

> Wrong FS error during dump for table-level replication when staging is remote.
> --
>
> Key: HIVE-24623
> URL: https://issues.apache.org/jira/browse/HIVE-24623
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24623.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-24623) Wrong FS error during dump for table-level replication when staging is remote.

2021-01-18 Thread Aasha Medhi (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267288#comment-17267288
 ] 

Aasha Medhi commented on HIVE-24623:


Thank you for the patch [~^sharma] Committed to master

> Wrong FS error during dump for table-level replication when staging is remote.
> --
>
> Key: HIVE-24623
> URL: https://issues.apache.org/jira/browse/HIVE-24623
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24623.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-24623) Wrong FS error during dump for table-level replication when staging is remote.

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24623?focusedWorklogId=537390=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537390
 ]

ASF GitHub Bot logged work on HIVE-24623:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 13:39
Start Date: 18/Jan/21 13:39
Worklog Time Spent: 10m 
  Work Description: aasha merged pull request #1854:
URL: https://github.com/apache/hive/pull/1854


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 537390)
Time Spent: 20m  (was: 10m)

> Wrong FS error during dump for table-level replication when staging is remote.
> --
>
> Key: HIVE-24623
> URL: https://issues.apache.org/jira/browse/HIVE-24623
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24623.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Resolved] (HIVE-24597) Replication with timestamp type partition failing in HA case with same NS

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi resolved HIVE-24597.

Resolution: Fixed

> Replication with timestamp type partition failing in HA case with same NS
> -
>
> Key: HIVE-24597
> URL: https://issues.apache.org/jira/browse/HIVE-24597
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24597.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-24597) Replication with timestamp type partition failing in HA case with same NS

2021-01-18 Thread Aasha Medhi (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267286#comment-17267286
 ] 

Aasha Medhi commented on HIVE-24597:


Committed to master. Thank you for the patch [~^sharma]

> Replication with timestamp type partition failing in HA case with same NS
> -
>
> Key: HIVE-24597
> URL: https://issues.apache.org/jira/browse/HIVE-24597
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24597.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-24597) Replication with timestamp type partition failing in HA case with same NS

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24597?focusedWorklogId=537389=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537389
 ]

ASF GitHub Bot logged work on HIVE-24597:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 13:36
Start Date: 18/Jan/21 13:36
Worklog Time Spent: 10m 
  Work Description: aasha merged pull request #1838:
URL: https://github.com/apache/hive/pull/1838


   





Issue Time Tracking
---

Worklog Id: (was: 537389)
Time Spent: 20m  (was: 10m)

> Replication with timestamp type partition failing in HA case with same NS
> -
>
> Key: HIVE-24597
> URL: https://issues.apache.org/jira/browse/HIVE-24597
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24597.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24558:
---
Attachment: HIVE-24558.06.patch
Status: Patch Available  (was: In Progress)

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-24558) Handle update in table level regular expression.

2021-01-18 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24558:
---
Status: In Progress  (was: Patch Available)

> Handle update in table level regular expression.
> 
>
> Key: HIVE-24558
> URL: https://issues.apache.org/jira/browse/HIVE-24558
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24558.01.patch, HIVE-24558.02.patch, 
> HIVE-24558.03.patch, HIVE-24558.04.patch, HIVE-24558.05.patch, 
> HIVE-24558.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-24601) Control CBO fallback behavior via property

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24601?focusedWorklogId=537377=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537377
 ]

ASF GitHub Bot logged work on HIVE-24601:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 13:04
Start Date: 18/Jan/21 13:04
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1875:
URL: https://github.com/apache/hive/pull/1875#discussion_r559550634



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -1806,6 +1806,12 @@ private static void 
populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal
 
 // CBO related
 HIVE_CBO_ENABLED("hive.cbo.enable", true, "Flag to control enabling Cost 
Based Optimizations using Calcite framework."),
+HIVE_CBO_FALLBACK_STRATEGY("hive.cbo.fallback.strategy", "CONSERVATIVE", 
new StringSet("NEVER, CONSERVATIVE, ALWAYS, TEST"),

Review comment:
   this seems like the stringset has only one value which is `NEVER, 
CONSERVATIVE, ALWAYS, TEST`
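
A minimal sketch of the point raised in this review comment, assuming the validator takes its allowed values as varargs (which is what the comment implies). `AllowedValuesSketch` is a hypothetical stand-in written for illustration, not Hive's `StringSet` class: passing one comma-joined string yields a single allowed value, while passing separate arguments yields four.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a varargs-based "allowed values" validator.
// It is NOT Hive's StringSet; it only illustrates the varargs pitfall.
class AllowedValuesSketch {
  private final Set<String> allowed = new LinkedHashSet<>();

  AllowedValuesSketch(String... values) {
    // Each argument becomes exactly one allowed value; nothing is split on commas.
    allowed.addAll(Arrays.asList(values));
  }

  int size() {
    return allowed.size();
  }

  boolean accepts(String value) {
    return allowed.contains(value);
  }

  public static void main(String[] args) {
    // One argument: the whole comma-joined text is the only allowed value.
    AllowedValuesSketch joined =
        new AllowedValuesSketch("NEVER, CONSERVATIVE, ALWAYS, TEST");
    // Four arguments: each strategy name is an allowed value of its own.
    AllowedValuesSketch separate =
        new AllowedValuesSketch("NEVER", "CONSERVATIVE", "ALWAYS", "TEST");

    System.out.println(joined.size() + " " + joined.accepts("CONSERVATIVE"));
    System.out.println(separate.size() + " " + separate.accepts("CONSERVATIVE"));
  }
}
```

Under that assumption, the default value "CONSERVATIVE" would be rejected by the joined form and accepted by the separated form.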







Issue Time Tracking
---

Worklog Id: (was: 537377)
Time Spent: 20m  (was: 10m)

> Control CBO fallback behavior via property
> --
>
> Key: HIVE-24601
> URL: https://issues.apache.org/jira/browse/HIVE-24601
> Project: Hive
>  Issue Type: Improvement
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When the CBO optimizer fails there is a fallback mechanism (HIVE-7413) that will 
> retry processing the query using the legacy Hive optimizer. 
> There are use-cases where this behavior is not desirable, notably for the 
> tests (HIVE-16058) but also for end users who would like to disable the 
> fall-back mechanism to avoid running problematic queries without realizing.
> The goal of this issue is to introduce a dedicated Hive property controlling 
> this behavior, {{hive.cbo.fallback.enable}}, for both tests and production. 
> The default value should be true and tests should run with this property set 
> to false. 
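
The behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not Hive's planner code: `CboFallbackSketch`, `plan`, and the two suppliers are hypothetical names, and the boolean parameter stands in for the proposed `hive.cbo.fallback.enable` property.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a guarded fallback; none of these names are Hive APIs.
class CboFallbackSketch {

  // Try the cost-based path first. If it fails, either retry with the legacy
  // path (fallback enabled) or surface the original failure (fallback disabled).
  static String plan(Supplier<String> cbo, Supplier<String> legacy, boolean fallbackEnabled) {
    try {
      return cbo.get();
    } catch (RuntimeException e) {
      if (!fallbackEnabled) {
        throw e; // tests would run in this mode so CBO failures are not hidden
      }
      return legacy.get();
    }
  }

  public static void main(String[] args) {
    Supplier<String> failingCbo = () -> {
      throw new IllegalStateException("CBO failed");
    };
    Supplier<String> legacy = () -> "legacy plan";

    // Fallback enabled: the failure is masked and the legacy plan is used.
    System.out.println(plan(failingCbo, legacy, true));

    // Fallback disabled: the CBO failure reaches the caller.
    try {
      plan(failingCbo, legacy, false);
    } catch (IllegalStateException e) {
      System.out.println("failure surfaced: " + e.getMessage());
    }
  }
}
```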



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24486) Enhance operator merge logic to also consider going thru RS operators

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24486?focusedWorklogId=537366=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537366
 ]

ASF GitHub Bot logged work on HIVE-24486:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 12:39
Start Date: 18/Jan/21 12:39
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1840:
URL: https://github.com/apache/hive/pull/1840#discussion_r559538641



##
File path: ql/src/test/results/clientpositive/llap/auto_join12.q.out
##
@@ -128,6 +108,16 @@ STAGE PLANS:
 Map-reduce partition columns: _col0 (type: string)
 Statistics: Num rows: 166 Data size: 29548 Basic 
stats: COMPLETE Column stats: COMPLETE
 value expressions: _col1 (type: string)
+Select Operator
+  expressions: key (type: string)
+  outputColumnNames: _col0
+  Statistics: Num rows: 166 Data size: 14442 Basic stats: 
COMPLETE Column stats: COMPLETE
+  Reduce Output Operator
+key expressions: _col0 (type: string)

Review comment:
   yes, even if the tree is doing the same, the RS is retained; we could 
probably improve on this later on...







Issue Time Tracking
---

Worklog Id: (was: 537366)
Time Spent: 0.5h  (was: 20m)

> Enhance operator merge logic to also consider going thru RS operators
> -
>
> Key: HIVE-24486
> URL: https://issues.apache.org/jira/browse/HIVE-24486
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: o.1.full.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> the targeted situation looks like this:
> {code}
> OP1 -> RS1.1 -> JOIN1.1
> OP1 -> RS1.2 -> JOIN1.2 
> OP2 -> RS2.1 -> JOIN1.1 -> RS3.1 
> OP2 -> RS2.2 -> JOIN1.2 -> RS3.2 
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (HIVE-24649) Optimise Hive::addWriteNotificationLog for large data inserts

2021-01-18 Thread Peter Vary (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267194#comment-17267194
 ] 

Peter Vary commented on HIVE-24649:
---

[~anishek]: If the transaction is not committed then we will have partitions 
created for the table, but since the writes are not committed the readers 
will know that they should not read them and will handle this as an empty 
partition, so no data corruption should happen. Having extra empty partitions is 
not nice, but the next query will just not create them again.

> Optimise Hive::addWriteNotificationLog for large data inserts
> -
>
> Key: HIVE-24649
> URL: https://issues.apache.org/jira/browse/HIVE-24649
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
>
> When loading dynamic partitions with a large dataset, a lot of time is spent in 
> "Hive::loadDynamicPartitions --> addWriteNotificationLog".
> Though it is for the same table, it ends up loading table and partition 
> details for every partition and writes to the notification log.
> Also, "Partition" details may already be present in the {{PartitionDetails}} 
> object in {{Hive::loadDynamicPartitions}}. This is unnecessarily recomputed 
> in {{HiveMetaStore::add_write_notification_log}}
>  
> Lines of interest:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3028
> https://github.com/apache/hive/blob/89073a94354f0cc14ec4ae0a43e05aae29276b4d/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8500
>  
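
The cost pattern described in the issue can be illustrated with a small sketch. Everything here is hypothetical (`WriteNotificationSketch`, `loadTable`, `log`); a counter stands in for the metastore lookup, and the code only contrasts a per-partition table load with a single shared load. It is not the change proposed for `Hive::addWriteNotificationLog`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: counts how often the (expensive) table lookup runs.
class WriteNotificationSketch {
  int tableLookups = 0;

  // Stand-in for loading table details from the metastore.
  String loadTable(String name) {
    tableLookups++;
    return name;
  }

  // Stand-in for writing one entry to the notification log.
  void log(String table, String partition) {
    // intentionally empty in this sketch
  }

  // Pattern described in the issue: table details are loaded once per partition.
  void notifyPerPartition(String table, List<String> partitions) {
    for (String partition : partitions) {
      log(loadTable(table), partition);
    }
  }

  // Alternative: load the table once and reuse it for every partition.
  void notifyWithSharedTable(String table, List<String> partitions) {
    String loaded = loadTable(table);
    for (String partition : partitions) {
      log(loaded, partition);
    }
  }

  public static void main(String[] args) {
    List<String> partitions = Arrays.asList("p=1", "p=2", "p=3");

    WriteNotificationSketch naive = new WriteNotificationSketch();
    naive.notifyPerPartition("t", partitions);

    WriteNotificationSketch shared = new WriteNotificationSketch();
    shared.notifyWithSharedTable("t", partitions);

    System.out.println(naive.tableLookups + " vs " + shared.tableLookups);
  }
}
```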



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (HIVE-24569) LLAP daemon leaks file descriptors/log4j appenders

2021-01-18 Thread Stamatis Zampetakis (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267193#comment-17267193
 ] 

Stamatis Zampetakis commented on HIVE-24569:


[~prasanth_j] indicated in the PR that removing the '.done' suffix in log files 
(which happens with the current PR) may cause some problems with respect to 
yarn log aggregation and tez ui. I searched a bit around the history of this 
naming pattern and it seems that indeed there are some dependencies with yarn 
and tez (relevant issues are HIVE-14224, HIVE-14225, SLIDER-116, TEZ-3629) so 
using plainly the {{IdlePurgePolicy}} may not be possible.

> LLAP daemon leaks file descriptors/log4j appenders
> --
>
> Key: HIVE-24569
> URL: https://issues.apache.org/jira/browse/HIVE-24569
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: llap-appender-gc-roots.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> With HIVE-9756 query logs in LLAP are directed to different files (file per 
> query) using a Log4j2 routing appender. Without a purge policy in place, 
> appenders are created dynamically by the routing appender, one for each 
> query, and remain in memory forever. The dynamic appenders write to files so 
> each appender holds on to a file descriptor. 
> Further work HIVE-14224 has mitigated the issue by introducing a custom 
> purging policy (LlapRoutingAppenderPurgePolicy) which deletes the dynamic 
> appenders (and closes the respective files) when the query is completed 
> (org.apache.hadoop.hive.llap.daemon.impl.QueryTracker#handleLogOnQueryCompletion).
>  
> However, in the presence of multiple threads appending to the logs there are 
> race conditions. In an internal Hive cluster the number of file descriptors 
> started going up approx one descriptor leaking per query. After some 
> debugging it turns out that one thread (running the 
> QueryTracker#handleLogOnQueryCompletion) signals that the query has finished 
> and thus the purge policy should get rid of the respective appender (and 
> close the file) while another (Task-Executor-0) attempts to append another 
> log message for the same query. The initial appender is closed after the 
> request from the query tracker but a new one is created to accommodate the 
> message from the task executor and the latter is never removed thus creating 
> a leak. 
> Similar leaks have been identified and fixed for HS2 with the most similar 
> one being that described 
> [here|https://issues.apache.org/jira/browse/HIVE-22753?focusedCommentId=17021041=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17021041].
>  
> The problem relies on the timing of threads so it may not manifest in all 
> versions between 2.2.0 and 4.0.0. Usually the leak can be seen either via 
> lsof (or other similar command) with the following output:
> {noformat}
> # 1494391 is the PID of the LLAP daemon process
> ls -ltr /proc/1494391/fd
> ...
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 978 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121724_66ce273d-54a9-4dcd-a9fb-20cb5691cef7-dag_1608659125567_0008_194.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 977 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121804_ce53eeb5-c73f-4999-b7a4-b4dd04d4e4de-dag_1608659125567_0008_197.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 974 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224122002_1693bd7d-2f0e-4673-a8d1-b7cb14a02204-dag_1608659125567_0008_204.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 989 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121909_6a56218f-06c7-4906-9907-4b6dd824b100-dag_1608659125567_0008_201.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 984 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121754_78ef49a0-bc23-478f-9a16-87fa25e7a287-dag_1608659125567_0008_196.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 983 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121855_e65b9ebf-b2ec-4159-9570-1904442b7048-dag_1608659125567_0008_200.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 981 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121818_e9051ae3-1316-46af-aabb-22c53ed2fda7-dag_1608659125567_0008_198.log
> lrwx-- 1

[jira] [Commented] (HIVE-24569) LLAP daemon leaks file descriptors/log4j appenders

2021-01-18 Thread Stamatis Zampetakis (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267187#comment-17267187
 ] 

Stamatis Zampetakis commented on HIVE-24569:


It turns out that the race condition that leads to the leak of file 
descriptors/appenders is already logged in HIVE-14300.
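
For readers unfamiliar with the race, the interleaving described in the quoted issue can be replayed deterministically in a few lines. This is a simplified model, not Log4j2 or LLAP code: `AppenderLeakSketch` and its methods are hypothetical, a map entry stands for an open appender (and its file descriptor), and the two "threads" are replayed in sequence to show the problematic order of events.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a routing appender that creates per-query appenders on demand.
class AppenderLeakSketch {
  // queryId -> appender; each entry stands for one open file descriptor.
  final Map<String, String> appenders = new HashMap<>();

  // Appending creates the per-query appender if it does not exist yet.
  void append(String queryId, String message) {
    appenders.computeIfAbsent(queryId, id -> "appender-for-" + id);
  }

  // Purging closes and removes the appender once the query is reported complete.
  void purge(String queryId) {
    appenders.remove(queryId);
  }

  public static void main(String[] args) {
    AppenderLeakSketch routing = new AppenderLeakSketch();

    routing.append("q1", "task started");  // task executor logs during the query
    routing.purge("q1");                   // query tracker signals completion
    routing.append("q1", "late message");  // task executor logs once more

    // The late message re-created an appender that nothing will purge again.
    System.out.println("open appenders: " + routing.appenders.size());
  }
}
```

In this model the second appender for `q1` stays in the map indefinitely, which corresponds to the one-descriptor-per-query growth reported in the issue.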

> LLAP daemon leaks file descriptors/log4j appenders
> --
>
> Key: HIVE-24569
> URL: https://issues.apache.org/jira/browse/HIVE-24569
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: llap-appender-gc-roots.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> With HIVE-9756 query logs in LLAP are directed to different files (file per 
> query) using a Log4j2 routing appender. Without a purge policy in place, 
> appenders are created dynamically by the routing appender, one for each 
> query, and remain in memory forever. The dynamic appenders write to files so 
> each appender holds on to a file descriptor. 
> Further work HIVE-14224 has mitigated the issue by introducing a custom 
> purging policy (LlapRoutingAppenderPurgePolicy) which deletes the dynamic 
> appenders (and closes the respective files) when the query is completed 
> (org.apache.hadoop.hive.llap.daemon.impl.QueryTracker#handleLogOnQueryCompletion).
>  
> However, in the presence of multiple threads appending to the logs there are 
> race conditions. In an internal Hive cluster the number of file descriptors 
> started going up approx one descriptor leaking per query. After some 
> debugging it turns out that one thread (running the 
> QueryTracker#handleLogOnQueryCompletion) signals that the query has finished 
> and thus the purge policy should get rid of the respective appender (and 
> close the file) while another (Task-Executor-0) attempts to append another 
> log message for the same query. The initial appender is closed after the 
> request from the query tracker but a new one is created to accommodate the 
> message from the task executor and the latter is never removed thus creating 
> a leak. 
> Similar leaks have been identified and fixed for HS2 with the most similar 
> one being that described 
> [here|https://issues.apache.org/jira/browse/HIVE-22753?focusedCommentId=17021041=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17021041].
>  
> The problem relies on the timing of threads so it may not manifest in all 
> versions between 2.2.0 and 4.0.0. Usually the leak can be seen either via 
> lsof (or other similar command) with the following output:
> {noformat}
> # 1494391 is the PID of the LLAP daemon process
> ls -ltr /proc/1494391/fd
> ...
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 978 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121724_66ce273d-54a9-4dcd-a9fb-20cb5691cef7-dag_1608659125567_0008_194.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 977 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121804_ce53eeb5-c73f-4999-b7a4-b4dd04d4e4de-dag_1608659125567_0008_197.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 974 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224122002_1693bd7d-2f0e-4673-a8d1-b7cb14a02204-dag_1608659125567_0008_204.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 989 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121909_6a56218f-06c7-4906-9907-4b6dd824b100-dag_1608659125567_0008_201.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 984 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121754_78ef49a0-bc23-478f-9a16-87fa25e7a287-dag_1608659125567_0008_196.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 983 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121855_e65b9ebf-b2ec-4159-9570-1904442b7048-dag_1608659125567_0008_200.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 981 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121818_e9051ae3-1316-46af-aabb-22c53ed2fda7-dag_1608659125567_0008_198.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 980 -> 
> /hadoop/yarn/log/application_1608659125567_0006/container_e04_1608659125567_0006_01_02/hive_20201224121744_fcf37921-4351-4368-95ee-b5be2592d89a-dag_1608659125567_0008_195.log
> lrwx-- 1 hive hadoop 64 Dec 24 12:08 979 -> 
>

[jira] [Work logged] (HIVE-24534) Prevent comparisons between characters and decimals types when strict checks enabled

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24534?focusedWorklogId=537340=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537340
 ]

ASF GitHub Bot logged work on HIVE-24534:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 11:22
Start Date: 18/Jan/21 11:22
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #1780:
URL: https://github.com/apache/hive/pull/1780#discussion_r559496100



##
File path: ql/src/test/results/clientpositive/llap/avrotblsjoin.q.out
##
@@ -71,9 +71,9 @@ POSTHOOK: Input: _dummy_database@_dummy_table
 POSTHOOK: Output: default@table1_1
 POSTHOOK: Lineage: table1_1.col1 SCRIPT []
 POSTHOOK: Lineage: table1_1.col2 SCRIPT []
-WARNING: Comparing a bigint and a string may result in a loss of precision.
-WARNING: Comparing a bigint and a string may result in a loss of precision.
-WARNING: Comparing a bigint and a string may result in a loss of precision.
+WARNING: Comparing string and bigint may result in loss of information.

Review comment:
   I changed the message to `loss of information` since precision may not 
make sense for certain types such as string. I don't have a strong preference 
though so I can restore the old message.







Issue Time Tracking
---

Worklog Id: (was: 537340)
Time Spent: 2h  (was: 1h 50m)

> Prevent comparisons between characters and decimals types when strict checks 
> enabled
> 
>
> Key: HIVE-24534
> URL: https://issues.apache.org/jira/browse/HIVE-24534
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> When we compare decimal and character types implicit conversions take place 
> that can lead to unexpected and surprising results. 
> {code:sql}
> create table t_str (str_col string);
> insert into t_str values ('1208925742523269458163819');
> select * from t_str where str_col=1208925742523269479013976;
> {code}
> The SELECT query returns one row even though the filtering value is not the same 
> as the one present in the string column of the table. The problem is that 
> both types are converted to doubles and due to loss of precision the values 
> are deemed equal.
> Even if we change the implicit conversion to use another type (HIVE-24528) 
> there are always some cases that may lead to unexpected results. 
> The goal of this issue is to prevent comparisons between decimal and 
> character types when hive.strict.checks.type.safety is enabled and throw an 
> error. 
>  
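
The precision loss described above can be reproduced outside Hive with plain Java. The sketch below assumes that the conversion to double behaves like `Double.parseDouble` (whether Hive's conversion path is identical is an assumption); `DoubleComparisonSketch` and its helper methods are illustrative names. Both literals from the issue are just below 2^80, where adjacent doubles are 2^27 apart, so they round to the same double even though they differ as exact decimals.

```java
import java.math.BigDecimal;

// Illustrative sketch: compares the two values from the issue as doubles and as decimals.
class DoubleComparisonSketch {

  // Comparison after converting both sides to double (lossy for 25-digit values).
  static boolean equalAsDoubles(String a, String b) {
    return Double.parseDouble(a) == Double.parseDouble(b);
  }

  // Exact comparison using arbitrary-precision decimals.
  static boolean equalAsDecimals(String a, String b) {
    return new BigDecimal(a).compareTo(new BigDecimal(b)) == 0;
  }

  public static void main(String[] args) {
    String stored = "1208925742523269458163819";  // value inserted into str_col
    String filter = "1208925742523269479013976";  // literal used in the WHERE clause

    System.out.println(equalAsDoubles(stored, filter));   // prints "true"
    System.out.println(equalAsDecimals(stored, filter));  // prints "false"
  }
}
```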



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24534) Prevent comparisons between characters and decimals types when strict checks enabled

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24534?focusedWorklogId=537322=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537322
 ]

ASF GitHub Bot logged work on HIVE-24534:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 10:28
Start Date: 18/Jan/21 10:28
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on a change in pull request #1780:
URL: https://github.com/apache/hive/pull/1780#discussion_r559458095



##
File path: ql/src/test/results/clientpositive/llap/avrotblsjoin.q.out
##
@@ -71,9 +71,9 @@ POSTHOOK: Input: _dummy_database@_dummy_table
 POSTHOOK: Output: default@table1_1
 POSTHOOK: Lineage: table1_1.col1 SCRIPT []
 POSTHOOK: Lineage: table1_1.col2 SCRIPT []
-WARNING: Comparing a bigint and a string may result in a loss of precision.
-WARNING: Comparing a bigint and a string may result in a loss of precision.
-WARNING: Comparing a bigint and a string may result in a loss of precision.
+WARNING: Comparing string and bigint may result in loss of information.

Review comment:
   Maybe `a loss of precision` is more user friendly compared to 
`information`.

##
File path: 
ql/src/test/results/clientpositive/llap/partition_wise_fileformat2.q.out
##
@@ -123,31 +123,6 @@ POSTHOOK: Input: default@partition_test_partitioned@dt=102
100
100
100
-238val_238 102

Review comment:
   the result of this qfile seems unrelated to the changes...







Issue Time Tracking
---

Worklog Id: (was: 537322)
Time Spent: 1h 50m  (was: 1h 40m)

> Prevent comparisons between characters and decimals types when strict checks 
> enabled
> 
>
> Key: HIVE-24534
> URL: https://issues.apache.org/jira/browse/HIVE-24534
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When we compare decimal and character types implicit conversions take place 
> that can lead to unexpected and surprising results. 
> {code:sql}
> create table t_str (str_col string);
> insert into t_str values ('1208925742523269458163819');
> select * from t_str where str_col=1208925742523269479013976;
> {code}
> The SELECT query returns one row even though the filtering value is not the same 
> as the one present in the string column of the table. The problem is that 
> both types are converted to doubles and due to loss of precision the values 
> are deemed equal.
> Even if we change the implicit conversion to use another type (HIVE-24528) 
> there are always some cases that may lead to unexpected results. 
> The goal of this issue is to prevent comparisons between decimal and 
> character types when hive.strict.checks.type.safety is enabled and throw an 
> error. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (HIVE-24534) Prevent comparisons between characters and decimals types when strict checks enabled

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24534?focusedWorklogId=537317=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537317
 ]

ASF GitHub Bot logged work on HIVE-24534:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 10:14
Start Date: 18/Jan/21 10:14
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #1780:
URL: https://github.com/apache/hive/pull/1780#issuecomment-762142694


   +1





Issue Time Tracking
---

Worklog Id: (was: 537317)
Time Spent: 1h 40m  (was: 1.5h)

> Prevent comparisons between characters and decimals types when strict checks 
> enabled
> 
>
> Key: HIVE-24534
> URL: https://issues.apache.org/jira/browse/HIVE-24534
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When we compare decimal and character types implicit conversions take place 
> that can lead to unexpected and surprising results. 
> {code:sql}
> create table t_str (str_col string);
> insert into t_str values ('1208925742523269458163819');
> select * from t_str where str_col=1208925742523269479013976;
> {code}
> The SELECT query returns one row even though the filtering value is not the 
> same as the one present in the string column of the table. The problem is that 
> both types are converted to doubles, and due to loss of precision the values 
> are deemed equal.
> Even if we change the implicit conversion to use another type (HIVE-24528), 
> there will always be some cases that may lead to unexpected results. 
> The goal of this issue is to prevent comparisons between decimal and 
> character types when hive.strict.checks.type.safety is enabled and to throw 
> an error instead. 
>  
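
The precision loss described above can be reproduced outside Hive. The following is a minimal sketch in Python, not Hive code: it assumes that Python's 64-bit `float` behaves like the double that both operands are converted to, and it uses `decimal.Decimal` as a stand-in for an exact comparison. The variable names are illustrative only.

```python
from decimal import Decimal

stored = "1208925742523269458163819"    # value held in str_col
literal = "1208925742523269479013976"   # literal used in the WHERE clause

# Comparison through doubles: both values round to the same 64-bit float,
# because a double keeps only about 15 to 17 significant decimal digits.
equal_as_double = float(stored) == float(literal)

# Comparison through exact decimals keeps every digit.
equal_as_decimal = Decimal(stored) == Decimal(literal)

print("equal as double: ", equal_as_double)
print("equal as decimal:", equal_as_decimal)
```

The double comparison reports the two values as equal, which matches the one-row result in the report, while the exact comparison tells them apart.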




[jira] [Work logged] (HIVE-14165) Remove Hive file listing during split computation

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-14165?focusedWorklogId=537316=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537316
 ]

ASF GitHub Bot logged work on HIVE-14165:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 10:06
Start Date: 18/Jan/21 10:06
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on a change in pull request #1866:
URL: https://github.com/apache/hive/pull/1866#discussion_r559448989



##
File path: ql/src/test/queries/clientpositive/exim_04_evolved_parts.q
##
@@ -15,7 +15,7 @@ alter table exim_employee_n12 clustered by (emp_sex, 
emp_dept) sorted by (emp_id
 alter table exim_employee_n12 add partition (emp_country='in', emp_state='tn');
 
 alter table exim_employee_n12 set fileformat 
-   inputformat  "org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat" 
+   inputformat  "org.apache.hadoop.hive.ql.io.RCFileInputFormat"

Review comment:
   BucketizedHiveInputFormat cannot be used this way. It is an internal 
input format used by map joins, and it requires either the has.map.work or 
the has.reduce.work property to be set.
   This test was passing only because the partitions in this table are empty, so 
InputFormat.getSplits was never called for them in the FetchOperator. It is 
impossible to read from a table that contains data and was set up like this.







Issue Time Tracking
---

Worklog Id: (was: 537316)
Time Spent: 0.5h  (was: 20m)

> Remove Hive file listing during split computation
> -
>
> Key: HIVE-14165
> URL: https://issues.apache.org/jira/browse/HIVE-14165
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Abdullah Yousufi
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14165.02.patch, HIVE-14165.03.patch, 
> HIVE-14165.04.patch, HIVE-14165.05.patch, HIVE-14165.06.patch, 
> HIVE-14165.07.patch, HIVE-14165.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The Hive side listing in FetchOperator.java is unnecessary, since Hadoop's 
> FileInputFormat.java will list the files during split computation anyway to 
> determine their size. One way to remove this is to catch the 
> InvalidInputFormat exception thrown by FileInputFormat#getSplits() on the 
> Hive side instead of doing the file listing beforehand.
> For S3 select queries on partitioned tables, this results in a 2x speedup.




[jira] [Work logged] (HIVE-14165) Remove Hive file listing during split computation

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-14165?focusedWorklogId=537314=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537314
 ]

ASF GitHub Bot logged work on HIVE-14165:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 09:56
Start Date: 18/Jan/21 09:56
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1866:
URL: https://github.com/apache/hive/pull/1866#discussion_r559442547



##
File path: ql/src/test/queries/clientpositive/exim_04_evolved_parts.q
##
@@ -15,7 +15,7 @@ alter table exim_employee_n12 clustered by (emp_sex, 
emp_dept) sorted by (emp_id
 alter table exim_employee_n12 add partition (emp_country='in', emp_state='tn');
 
 alter table exim_employee_n12 set fileformat 
-   inputformat  "org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat" 
+   inputformat  "org.apache.hadoop.hive.ql.io.RCFileInputFormat"

Review comment:
   Why do we need this change?







Issue Time Tracking
---

Worklog Id: (was: 537314)
Time Spent: 20m  (was: 10m)

> Remove Hive file listing during split computation
> -
>
> Key: HIVE-14165
> URL: https://issues.apache.org/jira/browse/HIVE-14165
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Abdullah Yousufi
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14165.02.patch, HIVE-14165.03.patch, 
> HIVE-14165.04.patch, HIVE-14165.05.patch, HIVE-14165.06.patch, 
> HIVE-14165.07.patch, HIVE-14165.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The Hive side listing in FetchOperator.java is unnecessary, since Hadoop's 
> FileInputFormat.java will list the files during split computation anyway to 
> determine their size. One way to remove this is to catch the 
> InvalidInputFormat exception thrown by FileInputFormat#getSplits() on the 
> Hive side instead of doing the file listing beforehand.
> For S3 select queries on partitioned tables, this results in a 2x speedup.




[jira] [Work logged] (HIVE-24633) Support CTE with column labels

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24633?focusedWorklogId=537313=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537313
 ]

ASF GitHub Bot logged work on HIVE-24633:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 09:54
Start Date: 18/Jan/21 09:54
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1865:
URL: https://github.com/apache/hive/pull/1865#discussion_r559440784



##
File path: ql/src/test/queries/clientpositive/cte_8.q
##
@@ -0,0 +1,34 @@
+set hive.cli.print.header=true;
+
+create table t1(int_col int, bigint_col bigint);
+
+insert into t1 values(1, 2), (3, 4);
+
+explain cbo
+with cte1(a, b) as (select int_col x, bigint_col y from t1)
+select a, b from cte1;
+
+with cte1(a, b) as (select int_col x, bigint_col y from t1)
+select a, b from cte1;
+
+with cte1(a) as (select int_col x, bigint_col y from t1)

Review comment:
   Our behavior is the same as Postgres: an ambiguous column reference is only 
allowed when the main query has `select *`.
   If the main query has an explicit reference to the ambiguous column in any 
clause, the query won't compile.
   Added test cases for both scenarios.
   
   The difference between Hive and Postgres is the final column names in the 
result set:
   ```
   with cte1(a) as (select int_col x, bigint_col a from t1)
   select * from cte1;
   ``` 
   Postgres: `a, a`
   Hive: `cte1.a, cte1._col1`
   
   After some research I found that we alter the column name to its internal 
name because CBO cannot handle ambiguous column names: 
[HIVE-19770](https://issues.apache.org/jira/browse/HIVE-19770)







Issue Time Tracking
---

Worklog Id: (was: 537313)
Time Spent: 40m  (was: 0.5h)

> Support CTE with column labels
> --
>
> Key: HIVE-24633
> URL: https://issues.apache.org/jira/browse/HIVE-24633
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> with cte1(a, b) as (select int_col x, bigint_col y from t1)
> select a, b from cte1{code}
> {code}
> a b
> 1 2
> 3 4
> {code}
> {code}
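> <query expression> ::=
>   [ <with clause> ] <query expression body>
>   [ <order by clause> ] [ <result offset clause> ] [ <fetch first clause> ]
> <with clause> ::=
>   WITH [ RECURSIVE ] <with list>
> <with list> ::=
>   <with list element> [ { <comma> <with list element> }... ]
> <with list element> ::=
>   <query name> [ <left paren> <with column list> <right paren> ]
>   AS <table subquery> [ <search or cycle clause> ]
> <with column list> ::=
>   <column name list>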
> {code}
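
The labelled CTE from the description can be tried without a Hive cluster. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine, which is an assumption made purely for illustration: SQLite is not Hive, and its handling of ambiguous or duplicate column names may differ from both Hive and Postgres. Only the table and query text come from the issue.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table t1(int_col int, bigint_col bigint)")
conn.execute("insert into t1 values (1, 2), (3, 4)")

# The column list after the CTE name relabels x and y as a and b.
cur = conn.execute(
    "with cte1(a, b) as (select int_col x, bigint_col y from t1) "
    "select a, b from cte1 order by a"
)
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()

print(column_names)
print(rows)
```

The outer query can refer to `a` and `b` even though the inner select aliases its columns as `x` and `y`, which is the behaviour the issue asks Hive to support.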




[jira] [Work logged] (HIVE-24633) Support CTE with column labels

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24633?focusedWorklogId=537312=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537312
 ]

ASF GitHub Bot logged work on HIVE-24633:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 09:53
Start Date: 18/Jan/21 09:53
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1865:
URL: https://github.com/apache/hive/pull/1865#discussion_r559440784



##
File path: ql/src/test/queries/clientpositive/cte_8.q
##
@@ -0,0 +1,34 @@
+set hive.cli.print.header=true;
+
+create table t1(int_col int, bigint_col bigint);
+
+insert into t1 values(1, 2), (3, 4);
+
+explain cbo
+with cte1(a, b) as (select int_col x, bigint_col y from t1)
+select a, b from cte1;
+
+with cte1(a, b) as (select int_col x, bigint_col y from t1)
+select a, b from cte1;
+
+with cte1(a) as (select int_col x, bigint_col y from t1)

Review comment:
   Our behavior is the same as Postgres: an ambiguous column reference is only 
allowed when the main query has `select *`.
   If the main query has an explicit reference to the ambiguous column, the 
query won't compile.
   Added test cases for both scenarios.
   
   The difference between Hive and Postgres is the final column names in the 
result set:
   ```
   with cte1(a) as (select int_col x, bigint_col a from t1)
   select * from cte1;
   ``` 
   Postgres: `a, a`
   Hive: `cte1.a, cte1._col1`
   
   After some research I found that we alter the column name to its internal 
name because CBO cannot handle ambiguous column names: 
[HIVE-19770](https://issues.apache.org/jira/browse/HIVE-19770)







Issue Time Tracking
---

Worklog Id: (was: 537312)
Time Spent: 0.5h  (was: 20m)

> Support CTE with column labels
> --
>
> Key: HIVE-24633
> URL: https://issues.apache.org/jira/browse/HIVE-24633
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code}
> with cte1(a, b) as (select int_col x, bigint_col y from t1)
> select a, b from cte1{code}
> {code}
> a b
> 1 2
> 3 4
> {code}
> {code}
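> <query expression> ::=
>   [ <with clause> ] <query expression body>
>   [ <order by clause> ] [ <result offset clause> ] [ <fetch first clause> ]
> <with clause> ::=
>   WITH [ RECURSIVE ] <with list>
> <with list> ::=
>   <with list element> [ { <comma> <with list element> }... ]
> <with list element> ::=
>   <query name> [ <left paren> <with column list> <right paren> ]
>   AS <table subquery> [ <search or cycle clause> ]
> <with column list> ::=
>   <column name list>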
> {code}




[jira] [Resolved] (HIVE-24629) Invoke optional output committer in TezProcessor

2021-01-18 Thread Peter Vary (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary resolved HIVE-24629.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Pushed to master.

Thanks for the patch [~Marton Bod] and [~abstractdog] for the review

> Invoke optional output committer in TezProcessor
> 
>
> Key: HIVE-24629
> URL: https://issues.apache.org/jira/browse/HIVE-24629
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In order to enable Hive to write to Iceberg tables, we need to use an output 
> committer which will fire at the end of each Tez task execution (commitTask) 
> and after the execution of each vertex (commitOutput/commitJob). This 
> output committer will issue a commit containing the written-out data files to 
> the Iceberg table, replacing its previous snapshot pointer with a new one.
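
The two-phase commit flow described above can be pictured with a toy model. This is a sketch under stated assumptions, not the Hive, Tez, or Iceberg API: `ToyTable`, `ToyCommitter`, `commit_task`, and `commit_job` are hypothetical names, and a snapshot is modelled as a plain set of file names.

```python
class ToyTable:
    """Hypothetical table: a list of snapshots plus a pointer to the current one."""

    def __init__(self):
        self.snapshots = [frozenset()]   # snapshot 0 holds no data files
        self.current = 0


class ToyCommitter:
    """Hypothetical committer mirroring the commitTask / commitJob split."""

    def __init__(self, table):
        self.table = table
        self.pending = []

    def commit_task(self, task_files):
        # Called at the end of each task: record the files that task wrote.
        self.pending.extend(task_files)

    def commit_job(self):
        # Called once the vertex has finished: publish all recorded files
        # in one new snapshot and move the pointer to it.
        base = self.table.snapshots[self.table.current]
        self.table.snapshots.append(base | frozenset(self.pending))
        self.table.current = len(self.table.snapshots) - 1
        self.pending = []


table = ToyTable()
committer = ToyCommitter(table)
committer.commit_task(["part-0.parquet"])
committer.commit_task(["part-1.parquet"])
visible_before = sorted(table.snapshots[table.current])
committer.commit_job()
visible_after = sorted(table.snapshots[table.current])
print(visible_before, visible_after)
```

In this model the files written by individual tasks become visible only when the job-level commit swaps the snapshot pointer.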




[jira] [Work logged] (HIVE-24629) Invoke optional output committer in TezProcessor

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24629?focusedWorklogId=537308=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537308
 ]

ASF GitHub Bot logged work on HIVE-24629:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 09:47
Start Date: 18/Jan/21 09:47
Worklog Time Spent: 10m 
  Work Description: pvary merged pull request #1857:
URL: https://github.com/apache/hive/pull/1857


   





Issue Time Tracking
---

Worklog Id: (was: 537308)
Time Spent: 50m  (was: 40m)

> Invoke optional output committer in TezProcessor
> 
>
> Key: HIVE-24629
> URL: https://issues.apache.org/jira/browse/HIVE-24629
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In order to enable Hive to write to Iceberg tables, we need to use an output 
> committer which will fire at the end of each Tez task execution (commitTask) 
> and after the execution of each vertex (commitOutput/commitJob). This 
> output committer will issue a commit containing the written-out data files to 
> the Iceberg table, replacing its previous snapshot pointer with a new one.




[jira] [Resolved] (HIVE-24644) QueryResultCache parses the query twice

2021-01-18 Thread Krisztian Kasa (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-24644.
---
Resolution: Fixed

Pushed to master. Thanks [~jcamachorodriguez], [~vavramenko] for review.

> QueryResultCache parses the query twice
> ---
>
> Key: HIVE-24644
> URL: https://issues.apache.org/jira/browse/HIVE-24644
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The query result cache looks up results by a query text in which all table 
> references are fully resolved.
> To generate this query text, the current implementation 
> * transforms the AST back to a String
> * parses the String generated in the previous step
> * traverses the new AST and replaces the table references with the fully 
> qualified ones
> * transforms the new AST back to a String, which becomes the cache key
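
The steps listed above can be sketched with a toy model to show where the second parse happens. Everything here is hypothetical and chosen for illustration: the "AST" is just a list of tokens, and `parse`, `unparse`, `qualify`, and both `cache_key_*` functions are invented names, not Hive classes. The single-pass variant is one possible way to avoid the extra parse and is not claimed to be what the actual patch does.

```python
parse_calls = 0


def parse(text):
    """Toy parser: the 'AST' is a list of whitespace-separated tokens."""
    global parse_calls
    parse_calls += 1
    return text.split()


def unparse(ast):
    """Turn the toy AST back into query text."""
    return " ".join(ast)


def qualify(ast, current_db):
    """Prefix unqualified table names that directly follow FROM or JOIN."""
    out = []
    for i, tok in enumerate(ast):
        follows_table_kw = i > 0 and ast[i - 1].lower() in ("from", "join")
        if follows_table_kw and "." not in tok:
            tok = f"{current_db}.{tok}"
        out.append(tok)
    return out


def cache_key_with_reparse(ast, current_db):
    # Steps from the description: AST -> text -> parse again -> qualify -> text.
    reparsed = parse(unparse(ast))
    return unparse(qualify(reparsed, current_db))


def cache_key_single_pass(ast, current_db):
    # Hypothetical alternative: qualify a copy of the AST we already have.
    return unparse(qualify(list(ast), current_db))


ast = parse("select key from src join dim on src.key = dim.key")
calls_before = parse_calls
key_a = cache_key_with_reparse(ast, "default")
extra_parses = parse_calls - calls_before
key_b = cache_key_single_pass(ast, "default")
print(key_a)
print("extra parses:", extra_parses, "same key:", key_a == key_b)
```

Both variants produce the same key in this model; the difference is only that the first one calls the parser a second time.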




[jira] [Work logged] (HIVE-24644) QueryResultCache parses the query twice

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24644?focusedWorklogId=537294=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537294
 ]

ASF GitHub Bot logged work on HIVE-24644:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 09:20
Start Date: 18/Jan/21 09:20
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged pull request #1874:
URL: https://github.com/apache/hive/pull/1874


   





Issue Time Tracking
---

Worklog Id: (was: 537294)
Time Spent: 20m  (was: 10m)

> QueryResultCache parses the query twice
> ---
>
> Key: HIVE-24644
> URL: https://issues.apache.org/jira/browse/HIVE-24644
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The query result cache looks up results by a query text in which all table 
> references are fully resolved.
> To generate this query text, the current implementation 
> * transforms the AST back to a String
> * parses the String generated in the previous step
> * traverses the new AST and replaces the table references with the fully 
> qualified ones
> * transforms the new AST back to a String, which becomes the cache key




[jira] [Commented] (HIVE-24651) write into new added column for partitioned table failed

2021-01-18 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267109#comment-17267109
 ] 

Zhihua Deng commented on HIVE-24651:


Maybe you should use CASCADE when you add a new column to a partitioned 
table: 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Add/ReplaceColumns

> write into new added column for partitioned table failed
> 
>
> Key: HIVE-24651
> URL: https://issues.apache.org/jira/browse/HIVE-24651
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.2
> Environment: hive version: 3.1.2
>Reporter: Spongebob
>Priority: Major
>
> The target table T1 has a partition 'p1'. After I added a new column col2 and 
> ran insert overwrite on T1 partition 'p1', col2 shows NULL values.




[jira] [Resolved] (HIVE-2301) Throw error when attempting to create a column with the same name as a partition column

2021-01-18 Thread Zoltan Matyus (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Matyus resolved HIVE-2301.
-
Resolution: Done

> Throw error when attempting to create a column with the same name as a 
> partition column
> ---
>
> Key: HIVE-2301
> URL: https://issues.apache.org/jira/browse/HIVE-2301
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.8.0
>Reporter: Paul Yang
>Assignee: Zoltan Matyus
>Priority: Minor
>
> If an alter table is run to rename a column to the same name as a partition 
> column, the alter will succeed. However, subsequent operations on that table 
> will fail.
> {code}
> hive> create table tmp_pyang_test (key string) partitioned by (ds string);
> OK
> Time taken: 4.773 seconds
> hive> alter table tmp_pyang_test replace columns (ds string);
> OK
> Time taken: 1.254 seconds
> hive> describe tmp_pyang_test;
> FAILED: Error in metadata: Partition column name ds conflicts with table 
> columns.
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive>
> {code}
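
The check requested in the issue title amounts to rejecting a column list that reuses a partition column name before the metadata is changed. The following is a minimal sketch in Python, not Hive's implementation: `check_columns` is a hypothetical helper, the error text is modelled on the message in the transcript above, and comparing names case-insensitively is an assumption.

```python
def check_columns(columns, partition_columns):
    """Raise if any regular column shares a name with a partition column."""
    # Assumption: identifiers are compared case-insensitively.
    taken = {name.lower() for name in partition_columns}
    clashes = sorted(name for name in columns if name.lower() in taken)
    if clashes:
        raise ValueError(
            "Partition column name " + ", ".join(clashes)
            + " conflicts with table columns."
        )


partition_columns = ["ds"]

# Original table definition: (key string) partitioned by (ds string).
check_columns(["key"], partition_columns)

# The REPLACE COLUMNS (ds string) from the report should be refused up front.
try:
    check_columns(["ds"], partition_columns)
    rejected = False
    message = ""
except ValueError as err:
    rejected = True
    message = str(err)

print(rejected, message)
```

Failing at ALTER time keeps the table usable, instead of letting the alter succeed and breaking later statements such as DESCRIBE.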



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Assigned] (HIVE-2301) Throw error when attempting to create a column with the same name as a partition column

2021-01-18 Thread Zoltan Matyus (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Matyus reassigned HIVE-2301:
---

Assignee: Zoltan Matyus

> Throw error when attempting to create a column with the same name as a 
> partition column
> ---
>
> Key: HIVE-2301
> URL: https://issues.apache.org/jira/browse/HIVE-2301
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.8.0
>Reporter: Paul Yang
>Assignee: Zoltan Matyus
>Priority: Minor
>
> If an alter table is run to rename a column to the same name as a partition 
> column, the alter will succeed. However, subsequent operations on that table 
> will fail.
> {code}
> hive> create table tmp_pyang_test (key string) partitioned by (ds string);
> OK
> Time taken: 4.773 seconds
> hive> alter table tmp_pyang_test replace columns (ds string);
> OK
> Time taken: 1.254 seconds
> hive> describe tmp_pyang_test;
> FAILED: Error in metadata: Partition column name ds conflicts with table 
> columns.
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive>
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work started] (HIVE-2301) Throw error when attempting to create a column with the same name as a partition column

2021-01-18 Thread Zoltan Matyus (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-2301 started by Zoltan Matyus.
---
> Throw error when attempting to create a column with the same name as a 
> partition column
> ---
>
> Key: HIVE-2301
> URL: https://issues.apache.org/jira/browse/HIVE-2301
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.8.0
>Reporter: Paul Yang
>Assignee: Zoltan Matyus
>Priority: Minor
>
> If an alter table is run to rename a column to the same name as a partition 
> column, the alter will succeed. However, subsequent operations on that table 
> will fail.
> {code}
> hive> create table tmp_pyang_test (key string) partitioned by (ds string);
> OK
> Time taken: 4.773 seconds
> hive> alter table tmp_pyang_test replace columns (ds string);
> OK
> Time taken: 1.254 seconds
> hive> describe tmp_pyang_test;
> FAILED: Error in metadata: Partition column name ds conflicts with table 
> columns.
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask
> hive>
> {code}




[jira] [Commented] (HIVE-24626) LLAP: reader threads could be starvated if all IO elevator threads are busy to enqueue to another readers with full queue

2021-01-18 Thread Jira



[ 
https://issues.apache.org/jira/browse/HIVE-24626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267090#comment-17267090
 ] 

László Bodor commented on HIVE-24626:
-

PR merged, thanks for the review [~prasanth_j], [~pgaref]

> LLAP: reader threads could be starvated if all IO elevator threads are busy 
> to enqueue to another readers with full queue
> -
>
> Key: HIVE-24626
> URL: https://issues.apache.org/jira/browse/HIVE-24626
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: executor_stack_cache_none_12_io_threads.log
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The root cause is that the readers cannot queue.offer items to full queues, 
> which belong to consumers that are blocked on other consumers. 
> Scenario is like below:
> {code}
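> ------------------------------------------------------------------------------------------
> VERTICES      MODE     STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ------------------------------------------------------------------------------------------
> Map 2 ..      llap    RUNNING      3          2        1        0       0       0
> Map 1         llap    RUNNING    676          0      119      557       0       0
> Map 3         llap    RUNNING    108          0       21       87       0      21
> Reducer 4     llap     INITED      1          0        0        1       0       0
> Map 5         llap     INITED    108          0        0      108       0       0
> Reducer 6     llap     INITED      4          0        0        4       0       0
> Reducer 7     llap     INITED      1          0        0        1       0       0
> ------------------------------------------------------------------------------------------
> VERTICES: 00/07  [>>--] 0%    ELAPSED TIME: 3489.83 s
> ------------------------------------------------------------------------------------------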
> {code}
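
The root cause stated above, a small pool of IO threads stuck offering to a full queue while another reader waits for data, can be imitated in a few lines. This is a sketch under loud assumptions and not LLAP code: Python's `queue.Queue` stands in for the bounded `ArrayBlockingQueue`, a single-worker `ThreadPoolExecutor` stands in for the IO elevator thread pool, and `offer` and `enqueue_with_retries` are hypothetical helpers.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def offer(q, item, timeout):
    """Rough analogue of a timed offer: False if the queue stays full."""
    try:
        q.put(item, timeout=timeout)
        return True
    except queue.Full:
        return False


def enqueue_with_retries(q, item, attempts, timeout):
    """Keep retrying the offer; the calling IO thread is busy the whole time."""
    for _ in range(attempts):
        if offer(q, item, timeout):
            return True
    return False


full_q = queue.Queue(maxsize=1)    # belongs to a consumer that never polls
full_q.put("old batch")
ready_q = queue.Queue(maxsize=1)   # belongs to a reader that is ready to poll

with ThreadPoolExecutor(max_workers=1) as io_pool:   # only one "IO thread"
    stuck = io_pool.submit(enqueue_with_retries, full_q, "new batch", 10, 0.05)
    queued = io_pool.submit(enqueue_with_retries, ready_q, "wanted batch", 1, 0.05)
    try:
        ready_q.get(timeout=0.1)   # the ready reader polls for its data
        starved = False
    except queue.Empty:
        starved = True             # its producer is still waiting for a thread

stuck_result = stuck.result()
queued_result = queued.result()
print(starved, stuck_result, queued_result)
```

The ready reader times out even though its own queue has room, because the only IO thread spends its time retrying against the full queue; its batch arrives only after that thread gives up.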
> Map2 is MAPJOINed to Map1. In an LLAP daemon, the forever running Map2 task 
> is blocked on nextCvb:
> {code}
> "TezTR-886270_0_1_0_1_0" #154 daemon prio=5 os_prio=0 tid=0x7f1b88348000 
> nid=0x147 waiting on condition [0x7f0ce005d000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x7f0de8025e00> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>   at 
> java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
>   at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.nextCvb(LlapRecordReader.java:517)
>   at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:372)
>   at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:82)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:362)
>   at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>   at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:117)
>   at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:115)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:437)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at

[jira] [Resolved] (HIVE-24626) LLAP: reader threads could be starvated if all IO elevator threads are busy to enqueue to another readers with full queue

2021-01-18 Thread Jira



 [ 
https://issues.apache.org/jira/browse/HIVE-24626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved HIVE-24626.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> LLAP: reader threads could be starvated if all IO elevator threads are busy 
> to enqueue to another readers with full queue
> -
>
> Key: HIVE-24626
> URL: https://issues.apache.org/jira/browse/HIVE-24626
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: executor_stack_cache_none_12_io_threads.log
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The root cause is that the readers cannot queue.offer items to full queues, 
> which belong to consumers that are blocked on other consumers. 
> Scenario is like below:
> {code}
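> ------------------------------------------------------------------------------------------
> VERTICES      MODE     STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ------------------------------------------------------------------------------------------
> Map 2 ..      llap    RUNNING      3          2        1        0       0       0
> Map 1         llap    RUNNING    676          0      119      557       0       0
> Map 3         llap    RUNNING    108          0       21       87       0      21
> Reducer 4     llap     INITED      1          0        0        1       0       0
> Map 5         llap     INITED    108          0        0      108       0       0
> Reducer 6     llap     INITED      4          0        0        4       0       0
> Reducer 7     llap     INITED      1          0        0        1       0       0
> ------------------------------------------------------------------------------------------
> VERTICES: 00/07  [>>--] 0%    ELAPSED TIME: 3489.83 s
> ------------------------------------------------------------------------------------------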
> {code}
> Map2 is MAPJOINed to Map1. In an LLAP daemon, the forever running Map2 task 
> is blocked on nextCvb:
> {code}
> "TezTR-886270_0_1_0_1_0" #154 daemon prio=5 os_prio=0 tid=0x7f1b88348000 
> nid=0x147 waiting on condition [0x7f0ce005d000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x7f0de8025e00> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>   at 
> java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
>   at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.nextCvb(LlapRecordReader.java:517)
>   at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:372)
>   at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:82)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:362)
>   at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>   at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:117)
>   at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:115)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:437)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>

[jira] [Work logged] (HIVE-24626) LLAP: reader threads could be starvated if all IO elevator threads are busy to enqueue to another readers with full queue

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24626?focusedWorklogId=537287=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537287
 ]

ASF GitHub Bot logged work on HIVE-24626:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 08:53
Start Date: 18/Jan/21 08:53
Worklog Time Spent: 10m 
  Work Description: abstractdog merged pull request #1868:
URL: https://github.com/apache/hive/pull/1868


   





Issue Time Tracking
---

Worklog Id: (was: 537287)
Time Spent: 40m  (was: 0.5h)

> LLAP: reader threads could be starvated if all IO elevator threads are busy 
> to enqueue to another readers with full queue
> -
>
> Key: HIVE-24626
> URL: https://issues.apache.org/jira/browse/HIVE-24626
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: executor_stack_cache_none_12_io_threads.log
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The root cause is that readers cannot queue.offer items to full queues that 
> belong to consumers which are themselves blocked on other consumers. 
> The scenario looks like this:
> {code}
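> ------------------------------------------------------------------------------
> VERTICES    MODE   STATUS   TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ------------------------------------------------------------------------------
> Map 2 ..    llap   RUNNING      3          2        1        0       0       0
> Map 1       llap   RUNNING    676          0      119      557       0       0
> Map 3       llap   RUNNING    108          0       21       87       0      21
> Reducer 4   llap   INITED       1          0        0        1       0       0
> Map 5       llap   INITED     108          0        0      108       0       0
> Reducer 6   llap   INITED       4          0        0        4       0       0
> Reducer 7   llap   INITED       1          0        0        1       0       0
> ------------------------------------------------------------------------------
> VERTICES: 00/07  [>>--] 0%    ELAPSED TIME: 3489.83 s
> ------------------------------------------------------------------------------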
> {code}
> Map2 is MAPJOINed to Map1. In an LLAP daemon, the forever running Map2 task 
> is blocked on nextCvb:
> {code}
> "TezTR-886270_0_1_0_1_0" #154 daemon prio=5 os_prio=0 tid=0x7f1b88348000 nid=0x147 waiting on condition [0x7f0ce005d000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x7f0de8025e00> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>   at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
>   at org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.nextCvb(LlapRecordReader.java:517)
>   at org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:372)
>   at org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:82)
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:362)
>   at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>   at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:117)
>   at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>   at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:115)
>   at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>   at 
>
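
The starvation described in the quoted report can be reproduced in miniature. The following is only a sketch under stated assumptions, not LLAP code: the class, method, and queue names are hypothetical, and a plain fixed thread pool stands in for the IO elevator threads. One reader keeps retrying offer() on a full queue whose consumer never drains it, so with a single IO thread a second reader never gets scheduled, even though its consumer is waiting.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Hive or LLAP.
class IoStarvationSketch {

    /**
     * Returns true if a batch reaches the consumer that is actively polling,
     * false if the poll times out because no IO thread was free to serve it.
     */
    static boolean waitingConsumerGetsBatch(int ioThreads, long pollTimeoutMs)
            throws InterruptedException {
        ExecutorService ioPool = Executors.newFixedThreadPool(ioThreads);
        // Queue of a consumer that never drains (it is blocked on another consumer).
        BlockingQueue<String> stuckQueue = new ArrayBlockingQueue<>(1);
        stuckQueue.put("batch-0"); // capacity 1, so the queue is now full
        // Queue of a consumer that is polling for data right now.
        BlockingQueue<String> waitingQueue = new ArrayBlockingQueue<>(1);
        try {
            // Reader 1 keeps retrying offer() on the full queue and never frees its thread.
            ioPool.execute(() -> {
                try {
                    while (!stuckQueue.offer("batch-1", 10, TimeUnit.MILLISECONDS)) {
                        // queue still full, retry
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Reader 2 could make progress immediately, but it needs an IO thread first.
            ioPool.execute(() -> waitingQueue.offer("batch-A"));
            return waitingQueue.poll(pollTimeoutMs, TimeUnit.MILLISECONDS) != null;
        } finally {
            ioPool.shutdownNow(); // interrupts reader 1 so the JVM can exit
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("1 IO thread:  " + waitingConsumerGetsBatch(1, 200));
        System.out.println("2 IO threads: " + waitingConsumerGetsBatch(2, 2000));
    }
}
```

With one pool thread the waiting consumer's poll should time out, and with two it should receive its batch. Adding threads only moves the limit, though: the same stall returns once every thread is retrying against a full queue.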

[jira] [Resolved] (HIVE-24630) clean up multiple parseDelta implementation in AcidUtils

2021-01-18 Thread Peter Varga (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Varga resolved HIVE-24630.

Fix Version/s: 4.0.0
   Resolution: Fixed

> clean up multiple parseDelta implementation in AcidUtils
> 
>
> Key: HIVE-24630
> URL: https://issues.apache.org/jira/browse/HIVE-24630
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> * Remove code duplication
> * Use ParsedDeltaLight everywhere where rawformat is not used, because 
> parsing that is cheaper




[jira] [Work logged] (HIVE-24630) clean up multiple parseDelta implementation in AcidUtils

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24630?focusedWorklogId=537272&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537272
 ]

ASF GitHub Bot logged work on HIVE-24630:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 08:30
Start Date: 18/Jan/21 08:30
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged pull request #1862:
URL: https://github.com/apache/hive/pull/1862


   





Issue Time Tracking
---

Worklog Id: (was: 537272)
Time Spent: 20m  (was: 10m)

> clean up multiple parseDelta implementation in AcidUtils
> 
>
> Key: HIVE-24630
> URL: https://issues.apache.org/jira/browse/HIVE-24630
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> * Remove code duplication
> * Use ParsedDeltaLight everywhere where rawformat is not used, because 
> parsing that is cheaper




[jira] [Updated] (HIVE-24640) Add error message in hive proto logger for failed queries.

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24640:
--
Labels: pull-request-available  (was: )

> Add error message in hive proto logger for failed queries.
> --
>
> Key: HIVE-24640
> URL: https://issues.apache.org/jira/browse/HIVE-24640
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Harish JP
>Assignee: Harish JP
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-24640.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The error message is missing from the failed event in HiveProtoLoggingHook; 
> extract it from the context when available and add it to the event.




[jira] [Commented] (HIVE-24649) Optimise Hive::addWriteNotificationLog for large data inserts

2021-01-18 Thread Anishek Agarwal (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267074#comment-17267074
 ] 

Anishek Agarwal commented on HIVE-24649:


Actually, the method has a few failure points, for example:


{code:java}

getMSC().addDynamicPartitions(
    parentSession.getTxnMgr().getCurrentTxnId(), writeId,
    tbl.getDbName(), tbl.getTableName(), partNames,
    AcidUtils.toDataOperationType(operation));

{code}

If it fails, the whole state is not reverted: there are now changes in the 
partitions and respective tables in the RDBMS, but not in the txn_components 
tables. How does this affect other txns' read view, etc.? cc [~pvary]
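
The failure mode raised in this comment can be sketched as follows. This is only an illustration under stated assumptions, not Hive or metastore code: the class, its methods, and the two in-memory sets (standing in for the partition tables and for txn_components) are all hypothetical. The first method leaves the two stores out of step when the second write throws. The second method shows one possible way to compensate, which is not something the comment itself proposes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; these names are not Hive or metastore APIs.
class PartialFailureSketch {
    final Set<String> partitions = new HashSet<>();     // stands in for the partition tables
    final Set<String> txnComponents = new HashSet<>();  // stands in for txn_components

    /** Second write; can be told to fail to simulate an error partway through. */
    private void register(String name, boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated failure while registering " + name);
        }
        txnComponents.add(name);
    }

    /** Two writes with no surrounding transaction: the first survives if the second throws. */
    void loadPartition(String name, boolean failSecondWrite) {
        partitions.add(name);
        register(name, failSecondWrite);
    }

    /** Same sequence, but the first write is undone when the second one fails. */
    void loadPartitionWithUndo(String name, boolean failSecondWrite) {
        boolean added = partitions.add(name);
        try {
            register(name, failSecondWrite);
        } catch (RuntimeException e) {
            if (added) {
                partitions.remove(name); // compensate only for what this call wrote
            }
            throw e;
        }
    }

    /** True when both stores agree on which partitions exist. */
    boolean isConsistent() {
        return partitions.equals(txnComponents);
    }
}
```

After a failed loadPartition call, isConsistent() returns false because the partition is recorded without its transaction component. After a failed loadPartitionWithUndo call it returns true.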



> Optimise Hive::addWriteNotificationLog for large data inserts
> -
>
> Key: HIVE-24649
> URL: https://issues.apache.org/jira/browse/HIVE-24649
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
>
> When loading dynamic partitions with a large dataset, a lot of time is spent in 
> "Hive::loadDynamicPartitions --> addWriteNotificationLog".
> Though it is all for the same table, it ends up loading table and partition 
> details for every partition and writes to the notification log.
> Also, "Partition" details may already be present in the {{PartitionDetails}} 
> object in {{Hive::loadDynamicPartitions}}. They are unnecessarily recomputed 
> in {{HiveMetaStore::add_write_notification_log}}.
>  
> Lines of interest:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3028
> https://github.com/apache/hive/blob/89073a94354f0cc14ec4ae0a43e05aae29276b4d/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8500
>  
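
The cost pattern in the quoted description, and the obvious way around it, can be sketched like this. It is a sketch only, not the Hive or HiveMetaStore code: the class and its methods are hypothetical, and a counter stands in for the expensive table lookup. The first method repeats the lookup for every partition. The second looks the table up once and reuses the result, producing the same log entries.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the Hive or HiveMetaStore code.
class NotificationLogSketch {
    private int tableLoads = 0;

    /** Stands in for an expensive metadata lookup. */
    private String loadTable(String tableName) {
        tableLoads++;
        return "meta(" + tableName + ")";
    }

    /** Stands in for writing one notification log entry. */
    private void writeEntry(String tableMeta, String partition, StringBuilder log) {
        log.append(tableMeta).append('/').append(partition).append('\n');
    }

    /** Looks the table up again for every partition. Returns the number of lookups. */
    int logPerPartition(String table, List<String> partitions, StringBuilder log) {
        tableLoads = 0;
        for (String p : partitions) {
            writeEntry(loadTable(table), p, log);
        }
        return tableLoads;
    }

    /** Looks the table up once and reuses it. Returns the number of lookups. */
    int logWithSharedTable(String table, List<String> partitions, StringBuilder log) {
        tableLoads = 0;
        String meta = loadTable(table);
        for (String p : partitions) {
            writeEntry(meta, p, log);
        }
        return tableLoads;
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("ds=1", "ds=2", "ds=3");
        NotificationLogSketch s = new NotificationLogSketch();
        System.out.println("per partition: " + s.logPerPartition("t", parts, new StringBuilder()));
        System.out.println("shared table:  " + s.logWithSharedTable("t", parts, new StringBuilder()));
    }
}
```

For N partitions the first variant performs N lookups and the second performs one. Whether the real code path can be restructured this way depends on details the description does not give.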




[jira] [Work logged] (HIVE-24640) Add error message in hive proto logger for failed queries.

2021-01-18 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-24640?focusedWorklogId=537264&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-537264
 ]

ASF GitHub Bot logged work on HIVE-24640:
-

Author: ASF GitHub Bot
Created on: 18/Jan/21 08:14
Start Date: 18/Jan/21 08:14
Worklog Time Spent: 10m 
  Work Description: harishjp opened a new pull request #1880:
URL: https://github.com/apache/hive/pull/1880


   





Issue Time Tracking
---

Worklog Id: (was: 537264)
Remaining Estimate: 0h
Time Spent: 10m

> Add error message in hive proto logger for failed queries.
> --
>
> Key: HIVE-24640
> URL: https://issues.apache.org/jira/browse/HIVE-24640
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Harish JP
>Assignee: Harish JP
>Priority: Minor
> Attachments: HIVE-24640.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The error message is missing from the failed event in HiveProtoLoggingHook; 
> extract it from the context when available and add it to the event.



