[jira] [Resolved] (HIVE-26434) HMS thrift method recv_get_table_objects_by_name exception list is broken

2022-07-27 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun resolved HIVE-26434.
-
Fix Version/s: 2.3.10
 Assignee: Cheng Pan
   Resolution: Fixed

> HMS thrift method recv_get_table_objects_by_name exception list is broken
> -
>
> Key: HIVE-26434
> URL: https://issues.apache.org/jira/browse/HIVE-26434
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.9
>Reporter: Cheng Pan
>Assignee: Cheng Pan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.3.10
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HIVE-15062 accidentally removed the HMS thrift method 
> recv_get_table_objects_by_name's exception list; after HIVE-24608, some 
> integration tests consistently failed because of it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26434) HMS thrift method recv_get_table_objects_by_name exception list is broken

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26434?focusedWorklogId=795909&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795909
 ]

ASF GitHub Bot logged work on HIVE-26434:
-

Author: ASF GitHub Bot
Created on: 28/Jul/22 04:43
Start Date: 28/Jul/22 04:43
Worklog Time Spent: 10m 
  Work Description: sunchao commented on PR #3483:
URL: https://github.com/apache/hive/pull/3483#issuecomment-1197654025

   Thanks, merged




Issue Time Tracking
---

Worklog Id: (was: 795909)
Time Spent: 50m  (was: 40m)

> HMS thrift method recv_get_table_objects_by_name exception list is broken
> -
>
> Key: HIVE-26434
> URL: https://issues.apache.org/jira/browse/HIVE-26434
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.9
>Reporter: Cheng Pan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HIVE-15062 accidentally removed the HMS thrift method 
> recv_get_table_objects_by_name's exception list; after HIVE-24608, some 
> integration tests consistently failed because of it.





[jira] [Work logged] (HIVE-26434) HMS thrift method recv_get_table_objects_by_name exception list is broken

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26434?focusedWorklogId=795908&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795908
 ]

ASF GitHub Bot logged work on HIVE-26434:
-

Author: ASF GitHub Bot
Created on: 28/Jul/22 04:43
Start Date: 28/Jul/22 04:43
Worklog Time Spent: 10m 
  Work Description: sunchao merged PR #3483:
URL: https://github.com/apache/hive/pull/3483




Issue Time Tracking
---

Worklog Id: (was: 795908)
Time Spent: 40m  (was: 0.5h)

> HMS thrift method recv_get_table_objects_by_name exception list is broken
> -
>
> Key: HIVE-26434
> URL: https://issues.apache.org/jira/browse/HIVE-26434
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.9
>Reporter: Cheng Pan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> HIVE-15062 accidentally removed the HMS thrift method 
> recv_get_table_objects_by_name's exception list; after HIVE-24608, some 
> integration tests consistently failed because of it.





[jira] [Work logged] (HIVE-26435) HMS Summary

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26435?focusedWorklogId=795899&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795899
 ]

ASF GitHub Bot logged work on HIVE-26435:
-

Author: ASF GitHub Bot
Created on: 28/Jul/22 03:48
Start Date: 28/Jul/22 03:48
Worklog Time Spent: 10m 
  Work Description: ruyiZheng opened a new pull request, #3484:
URL: https://github.com/apache/hive/pull/3484

   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 795899)
Remaining Estimate: 0h
Time Spent: 10m

> HMS Summary
> ---
>
> Key: HIVE-26435
> URL: https://issues.apache.org/jira/browse/HIVE-26435
> Project: Hive
>  Issue Type: New Feature
>Reporter: Ruyi Zheng
>Assignee: Naveen Gangam
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive Metastore currently lacks visibility into its metadata. This work 
> includes enhancing the Hive Metatool to include an option (JSON, CONSOLE) to 
> print a summary (catalog name, database name, table name, partition column 
> count, number of rows, table type, file type, compression type, total data 
> size, etc.). 





[jira] [Updated] (HIVE-26435) HMS Summary

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26435:
--
Labels: pull-request-available  (was: )

> HMS Summary
> ---
>
> Key: HIVE-26435
> URL: https://issues.apache.org/jira/browse/HIVE-26435
> Project: Hive
>  Issue Type: New Feature
>Reporter: Ruyi Zheng
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive Metastore currently lacks visibility into its metadata. This work 
> includes enhancing the Hive Metatool to include an option (JSON, CONSOLE) to 
> print a summary (catalog name, database name, table name, partition column 
> count, number of rows, table type, file type, compression type, total data 
> size, etc.). 





[jira] [Work logged] (HIVE-26434) HMS thrift method recv_get_table_objects_by_name exception list is broken

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26434?focusedWorklogId=795873&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795873
 ]

ASF GitHub Bot logged work on HIVE-26434:
-

Author: ASF GitHub Bot
Created on: 28/Jul/22 01:50
Start Date: 28/Jul/22 01:50
Worklog Time Spent: 10m 
  Work Description: pan3793 commented on PR #3483:
URL: https://github.com/apache/hive/pull/3483#issuecomment-1197559113

   The Jenkins report[1] shows this PR recovers ~60 tests compared to the 
latest branch-2.3[2].
   
   [1] 
http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3483/1/tests
   [2] 
http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/branch-2.3/26/tests/




Issue Time Tracking
---

Worklog Id: (was: 795873)
Time Spent: 0.5h  (was: 20m)

> HMS thrift method recv_get_table_objects_by_name exception list is broken
> -
>
> Key: HIVE-26434
> URL: https://issues.apache.org/jira/browse/HIVE-26434
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.9
>Reporter: Cheng Pan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-15062 accidentally removed the HMS thrift method 
> recv_get_table_objects_by_name's exception list; after HIVE-24608, some 
> integration tests consistently failed because of it.





[jira] [Work logged] (HIVE-26079) Upgrade protobuf to 3.16.1

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26079?focusedWorklogId=795858&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795858
 ]

ASF GitHub Bot logged work on HIVE-26079:
-

Author: ASF GitHub Bot
Created on: 28/Jul/22 00:22
Start Date: 28/Jul/22 00:22
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on PR #3309:
URL: https://github.com/apache/hive/pull/3309#issuecomment-1197513602

   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 795858)
Time Spent: 1.5h  (was: 1h 20m)

> Upgrade protobuf to 3.16.1
> --
>
> Key: HIVE-26079
> URL: https://issues.apache.org/jira/browse/HIVE-26079
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Upgrade com.google.protobuf:protobuf-java from 2.5.0 to 3.16.1 to fix 
> CVE-2021-22569





[jira] [Assigned] (HIVE-26435) HMS Summary

2022-07-27 Thread Ruyi Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruyi Zheng reassigned HIVE-26435:
-

Assignee: Naveen Gangam

> HMS Summary
> ---
>
> Key: HIVE-26435
> URL: https://issues.apache.org/jira/browse/HIVE-26435
> Project: Hive
>  Issue Type: New Feature
>Reporter: Ruyi Zheng
>Assignee: Naveen Gangam
>Priority: Major
>
> Hive Metastore currently lacks visibility into its metadata. This work 
> includes enhancing the Hive Metatool to include an option (JSON, CONSOLE) to 
> print a summary (catalog name, database name, table name, partition column 
> count, number of rows, table type, file type, compression type, total data 
> size, etc.). 





[jira] [Work logged] (HIVE-1626) stop using java.util.Stack

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-1626?focusedWorklogId=795795&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795795
 ]

ASF GitHub Bot logged work on HIVE-1626:


Author: ASF GitHub Bot
Created on: 27/Jul/22 18:43
Start Date: 27/Jul/22 18:43
Worklog Time Spent: 10m 
  Work Description: cmunkey commented on PR #3441:
URL: https://github.com/apache/hive/pull/3441#issuecomment-1197224233

   java.util.Stack uses equals() for search, so maybe it is appropriate to use 
equals here for indexOf().
   
   public synchronized int search(Object o)




Issue Time Tracking
---

Worklog Id: (was: 795795)
Time Spent: 1h 50m  (was: 1h 40m)

> stop using java.util.Stack
> --
>
> Key: HIVE-1626
> URL: https://issues.apache.org/jira/browse/HIVE-1626
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.7.0
>Reporter: John Sichi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-1626.2.patch, HIVE-1626.2.patch, HIVE-1626.3.patch, 
> HIVE-1626.3.patch, HIVE-1626.3.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We currently use Stack as part of the generic node walking library.  Stack 
> should not be used for this since its inheritance from Vector incurs 
> superfluous synchronization overhead.
> Most projects end up adding an ArrayStack implementation and using that 
> instead.
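
A minimal sketch of the swap discussed above: since Java 6, `java.util.ArrayDeque` is the standard unsynchronized replacement for `Stack` (its Javadoc recommends it for stack usage), avoiding Vector's per-call locking. Class and element names below are illustrative, not Hive's node-walking code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ArrayStackDemo {
    // Unsynchronized LIFO traversal stack: ArrayDeque push/pop avoid the
    // monitor acquisition that every java.util.Stack call inherits from Vector.
    public static String popOrder() {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("TS");   // e.g. a TableScan node
        stack.push("FIL");  // a Filter node
        stack.push("SEL");  // a Select node
        StringBuilder sb = new StringBuilder();
        while (!stack.isEmpty()) {
            sb.append(stack.pop()).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(popOrder()); // LIFO order: SEL FIL TS
    }
}
```

Note that `ArrayDeque` has no random-access `get(int)`, which is why walkers that peek into the middle of the stack (as discussed in the review comments) end up adding a small ArrayStack of their own.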





[jira] [Work logged] (HIVE-1626) stop using java.util.Stack

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-1626?focusedWorklogId=795792&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795792
 ]

ASF GitHub Bot logged work on HIVE-1626:


Author: ASF GitHub Bot
Created on: 27/Jul/22 18:41
Start Date: 27/Jul/22 18:41
Worklog Time Spent: 10m 
  Work Description: cmunkey commented on PR #3441:
URL: https://github.com/apache/hive/pull/3441#issuecomment-1197221879

   Maybe call it RandomUpdateStack instead of MyArrayStack. 




Issue Time Tracking
---

Worklog Id: (was: 795792)
Time Spent: 1h 40m  (was: 1.5h)

> stop using java.util.Stack
> --
>
> Key: HIVE-1626
> URL: https://issues.apache.org/jira/browse/HIVE-1626
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.7.0
>Reporter: John Sichi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-1626.2.patch, HIVE-1626.2.patch, HIVE-1626.3.patch, 
> HIVE-1626.3.patch, HIVE-1626.3.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We currently use Stack as part of the generic node walking library.  Stack 
> should not be used for this since its inheritance from Vector incurs 
> superfluous synchronization overhead.
> Most projects end up adding an ArrayStack implementation and using that 
> instead.





[jira] [Work logged] (HIVE-26434) HMS thrift method recv_get_table_objects_by_name exception list is broken

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26434?focusedWorklogId=795783&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795783
 ]

ASF GitHub Bot logged work on HIVE-26434:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 18:08
Start Date: 27/Jul/22 18:08
Worklog Time Spent: 10m 
  Work Description: pan3793 commented on PR #3483:
URL: https://github.com/apache/hive/pull/3483#issuecomment-1197119589

   cc @sunchao @wangyum 
   
   BTW, I checked that Spark is not affected by this issue, because it catches 
all exceptions thrown by `recv_get_table_objects_by_name`:
   
   
https://github.com/apache/spark/blob/47f0303944abb11d3018186bc125113772eff8ef/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L410-L419




Issue Time Tracking
---

Worklog Id: (was: 795783)
Time Spent: 20m  (was: 10m)

> HMS thrift method recv_get_table_objects_by_name exception list is broken
> -
>
> Key: HIVE-26434
> URL: https://issues.apache.org/jira/browse/HIVE-26434
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.9
>Reporter: Cheng Pan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HIVE-15062 accidentally removed the HMS thrift method 
> recv_get_table_objects_by_name's exception list; after HIVE-24608, some 
> integration tests consistently failed because of it.





[jira] [Work logged] (HIVE-26434) HMS thrift method recv_get_table_objects_by_name exception list is broken

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26434?focusedWorklogId=795778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795778
 ]

ASF GitHub Bot logged work on HIVE-26434:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 18:04
Start Date: 27/Jul/22 18:04
Worklog Time Spent: 10m 
  Work Description: pan3793 opened a new pull request, #3483:
URL: https://github.com/apache/hive/pull/3483

   
   
   
   ### What changes were proposed in this pull request?
   
   Fix HMS thrift method recv_get_table_objects_by_name's exception list
   
   ### Why are the changes needed?
   
   [HIVE-15062](https://issues.apache.org/jira/browse/HIVE-15062) accidentally 
removed the HMS thrift method recv_get_table_objects_by_name's exception list; 
after [HIVE-24608](https://issues.apache.org/jira/browse/HIVE-24608), some 
integration tests consistently failed because of it.
   
   e.g. `org.apache.hive.metastore.TestSetUGIOnOnlyClient#testSimpleTable`
   
   ```
   org.apache.thrift.TApplicationException: Internal error processing get_table_objects_by_name
       at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_objects_by_name(ThriftHiveMetastore.java:1544)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_objects_by_name(ThriftHiveMetastore.java:1530)
       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTableObjectsByName(HiveMetaStoreClient.java:1363)
       at org.apache.hadoop.hive.metastore.TestHiveMetaStore.testSimpleTable(TestHiveMetaStore.java:1478)
   ```
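
   For background, in Thrift only exceptions declared in a method's `throws` 
clause can be serialized back to the caller; anything undeclared surfaces on 
the client as the generic `TApplicationException: Internal error processing ...` 
seen in the trace above. The restored declaration would look roughly like the 
following (an approximation of the hive_metastore.thrift signature, not the 
exact committed patch):

   ```
   // Illustrative Thrift IDL; names follow hive_metastore.thrift, but this
   // sketch is an approximation, not the committed change.
   list<Table> get_table_objects_by_name(1:string dbname, 2:list<string> tbl_names)
       throws (1:MetaException o1, 2:InvalidOperationException o2, 3:UnknownDBException o3)
   ```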
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. With this patch, thrift exceptions can be propagated from HMS to the 
client correctly.
   
   ### How was this patch tested?
   
   Existing UT




Issue Time Tracking
---

Worklog Id: (was: 795778)
Remaining Estimate: 0h
Time Spent: 10m

> HMS thrift method recv_get_table_objects_by_name exception list is broken
> -
>
> Key: HIVE-26434
> URL: https://issues.apache.org/jira/browse/HIVE-26434
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.9
>Reporter: Cheng Pan
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-15062 accidentally removed the HMS thrift method 
> recv_get_table_objects_by_name's exception list; after HIVE-24608, some 
> integration tests consistently failed because of it.





[jira] [Updated] (HIVE-26434) HMS thrift method recv_get_table_objects_by_name exception list is broken

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26434:
--
Labels: pull-request-available  (was: )

> HMS thrift method recv_get_table_objects_by_name exception list is broken
> -
>
> Key: HIVE-26434
> URL: https://issues.apache.org/jira/browse/HIVE-26434
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.9
>Reporter: Cheng Pan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-15062 accidentally removed the HMS thrift method 
> recv_get_table_objects_by_name's exception list; after HIVE-24608, some 
> integration tests consistently failed because of it.





[jira] [Work logged] (HIVE-26433) HivePrivilegeObject's objectName is NULL when JdbcStorageHandler is used with METASTORE type

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26433?focusedWorklogId=795729&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795729
 ]

ASF GitHub Bot logged work on HIVE-26433:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 15:49
Start Date: 27/Jul/22 15:49
Worklog Time Spent: 10m 
  Work Description: deniskuzZ opened a new pull request, #3482:
URL: https://github.com/apache/hive/pull/3482

   …Handler is used with METASTORE type
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 795729)
Remaining Estimate: 0h
Time Spent: 10m

> HivePrivilegeObject's objectName is NULL when JdbcStorageHandler is used with 
> METASTORE type
> 
>
> Key: HIVE-26433
> URL: https://issues.apache.org/jira/browse/HIVE-26433
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-26433) HivePrivilegeObject's objectName is NULL when JdbcStorageHandler is used with METASTORE type

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26433:
--
Labels: pull-request-available  (was: )

> HivePrivilegeObject's objectName is NULL when JdbcStorageHandler is used with 
> METASTORE type
> 
>
> Key: HIVE-26433
> URL: https://issues.apache.org/jira/browse/HIVE-26433
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-26428) Limit usage of LLAP BPWrapper to threads of IO threadpools

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26428:
--
Labels: pull-request-available  (was: )

> Limit usage of LLAP BPWrapper to threads of IO threadpools
> --
>
> Key: HIVE-26428
> URL: https://issues.apache.org/jira/browse/HIVE-26428
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> BPWrapper is used in the LRFU cache eviction policy to decrease the time spent 
> waiting for the lock on the heap. This is done by adding a buffer as a 
> threadlocal and accumulating CacheableBuffer instances there before trying to 
> acquire a lock. This works well when we have threads from pools such as 
> IO-Elevator threads or OrcEncode threads.
> For ephemeral threads there is no advantage to doing this, as the buffers in 
> threadlocals may never reach the heap or list structures of LRFU, thereby 
> also making evictions less efficient. This can happen e.g. when 
> LLAPCacheAwareFS is used with Parquet, where we're using the Tez threads for 
> both execution and IO.
> We should disable BPWrappers for such cases.





[jira] [Work logged] (HIVE-26428) Limit usage of LLAP BPWrapper to threads of IO threadpools

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26428?focusedWorklogId=795696&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795696
 ]

ASF GitHub Bot logged work on HIVE-26428:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 14:58
Start Date: 27/Jul/22 14:58
Worklog Time Spent: 10m 
  Work Description: szlta opened a new pull request, #3481:
URL: https://github.com/apache/hive/pull/3481

   BPWrapper is used in the LRFU cache eviction policy to decrease the time 
spent waiting for the lock on the heap. This is done by adding a buffer as a 
threadlocal and accumulating CacheableBuffer instances there before trying to 
acquire a lock. This works well when we have threads from pools such as 
IO-Elevator threads or OrcEncode threads.
   
   For ephemeral threads there is no advantage to doing this, as the buffers in 
threadlocals may never reach the heap or list structures of LRFU, thereby also 
making evictions less efficient. This can happen e.g. when LLAPCacheAwareFS is 
used with Parquet, where we're using the Tez threads for both execution and IO.
   
   We should disable BPWrappers for such cases.
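
   The thread-local batching described above can be sketched as follows. This 
is an illustrative stand-in (class and method names are invented, not Hive's 
LRFU code): items are parked in a thread-local buffer and the shared lock is 
taken only when the buffer fills, which also shows why ephemeral threads are a 
problem — their buffered items never reach the shared structure unless the 
buffer fills or is explicitly flushed.

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch, not Hive's actual LRFU policy: amortize lock acquisition
// by batching items in a thread-local buffer before touching the shared heap.
public class BatchedQueue {
    private static final int BATCH = 4;
    private final Object heapLock = new Object();
    private final List<String> heap = new ArrayList<>();
    private final ThreadLocal<List<String>> local =
            ThreadLocal.withInitial(ArrayList::new);

    public void offer(String item) {
        List<String> buf = local.get();
        buf.add(item);
        if (buf.size() >= BATCH) {   // take the lock once per BATCH items
            flush();
        }
    }

    public void flush() {            // ephemeral threads must call this explicitly
        List<String> buf = local.get();
        synchronized (heapLock) {
            heap.addAll(buf);
        }
        buf.clear();
    }

    public int heapSize() {
        synchronized (heapLock) {
            return heap.size();
        }
    }
}
```

   An ephemeral thread that exits before its buffer reaches BATCH items leaves 
those items invisible to eviction — the behavior the patch avoids by restricting 
BPWrapper to threads of the IO threadpools.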




Issue Time Tracking
---

Worklog Id: (was: 795696)
Remaining Estimate: 0h
Time Spent: 10m

> Limit usage of LLAP BPWrapper to threads of IO threadpools
> --
>
> Key: HIVE-26428
> URL: https://issues.apache.org/jira/browse/HIVE-26428
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> BPWrapper is used in the LRFU cache eviction policy to decrease the time spent 
> waiting for the lock on the heap. This is done by adding a buffer as a 
> threadlocal and accumulating CacheableBuffer instances there before trying to 
> acquire a lock. This works well when we have threads from pools such as 
> IO-Elevator threads or OrcEncode threads.
> For ephemeral threads there is no advantage to doing this, as the buffers in 
> threadlocals may never reach the heap or list structures of LRFU, thereby 
> also making evictions less efficient. This can happen e.g. when 
> LLAPCacheAwareFS is used with Parquet, where we're using the Tez threads for 
> both execution and IO.
> We should disable BPWrappers for such cases.





[jira] [Work logged] (HIVE-26414) Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26414?focusedWorklogId=795674&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795674
 ]

ASF GitHub Bot logged work on HIVE-26414:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 14:23
Start Date: 27/Jul/22 14:23
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r931126498


##
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##
@@ -485,6 +493,27 @@ private void clearLocksAndHB() {
 stopHeartbeat();
   }
 
+  private void cleanupOutputDir(Context ctx) throws MetaException {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      Table destinationTable = ctx.getDestinationTable();
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+              destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+              destinationTable.getTTable(), conf));
+
+          rqst.putToProperties(META_TABLE_LOCATION, destinationTable.getSd().getLocation());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);
Review Comment:
   > btw, would it be hard to create a completionHook similar to Iceberg one?
   
   We could create one but it would include failures only within Query 
execution.
   Anything done after query execution (post execution activities) will not be 
within its scope, which is why I disregarded the Hook approach.
   
   The hooks are used as part of the finally block here - 
   
https://github.com/apache/hive/blob/b197ed86029f07696e326acb5878f86c286e9e1a/ql/src/java/org/apache/hadoop/hive/ql/Executor.java#L118
   
   Cleanup will then be dependent on a HiveConf - `hive.query.lifetime.hooks`. 





Issue Time Tracking
---

Worklog Id: (was: 795674)
Time Spent: 6.5h  (was: 6h 20m)

> Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data
> ---
>
> Key: HIVE-26414
> URL: https://issues.apache.org/jira/browse/HIVE-26414
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> When a CTAS query fails before creation of the table and after writing the 
> data, the data is present in the directory and is currently not cleaned up by 
> the cleaner or any other mechanism. This is because the cleaner requires a 
> table corresponding to what it is cleaning. To handle such a situation, we 
> can directly pass the relevant information to the cleaner so that such 
> uncommitted data is deleted.





[jira] [Work logged] (HIVE-26414) Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26414?focusedWorklogId=795673&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795673
 ]

ASF GitHub Bot logged work on HIVE-26414:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 14:21
Start Date: 27/Jul/22 14:21
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r931126498


##
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##
@@ -485,6 +493,27 @@ private void clearLocksAndHB() {
 stopHeartbeat();
   }
 
+  private void cleanupOutputDir(Context ctx) throws MetaException {
+    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+      Table destinationTable = ctx.getDestinationTable();
+      if (destinationTable != null) {
+        try {
+          CompactionRequest rqst = new CompactionRequest(
+              destinationTable.getDbName(), destinationTable.getTableName(), CompactionType.MAJOR);
+          rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+              destinationTable.getTTable(), conf));
+
+          rqst.putToProperties(META_TABLE_LOCATION, destinationTable.getSd().getLocation());
+          rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+          TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   > btw, would it be hard to create a completionHook similar to Iceberg one?
   
   We could create one but it would include failures only within Query 
execution.
   Anything done after query execution (post execution activities like 
releasing locks) will not be within its scope, which is why I disregarded the 
Hook approach.
   
   The hooks are used as part of the finally block here - 
   
https://github.com/apache/hive/blob/b197ed86029f07696e326acb5878f86c286e9e1a/ql/src/java/org/apache/hadoop/hive/ql/Executor.java#L118
   
   Cleanup will then be dependent on a HiveConf - `hive.query.lifetime.hooks`. 





Issue Time Tracking
---

Worklog Id: (was: 795673)
Time Spent: 6h 20m  (was: 6h 10m)

> Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data
> ---
>
> Key: HIVE-26414
> URL: https://issues.apache.org/jira/browse/HIVE-26414
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> When a CTAS query fails before creation of the table and after writing the 
> data, the data is present in the directory and is currently not cleaned up by 
> the cleaner or any other mechanism. This is because the cleaner requires a 
> table corresponding to what it is cleaning. To handle such a situation, we 
> can directly pass the relevant information to the cleaner so that such 
> uncommitted data is deleted.





[jira] [Resolved] (HIVE-26288) NPE in CompactionTxnHandler.markFailed()

2022-07-27 Thread Zsolt Miskolczi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zsolt Miskolczi resolved HIVE-26288.

Fix Version/s: 4.0.0-alpha-2
   Resolution: Fixed

> NPE in CompactionTxnHandler.markFailed()
> 
>
> Key: HIVE-26288
> URL: https://issues.apache.org/jira/browse/HIVE-26288
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: László Végh
>Assignee: Zsolt Miskolczi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Unhandled exceptions from 
> IMetaStoreClient.findNextCompact(FindNextCompactRequest) are handled 
> incorrectly in the worker. In these cases the CompactionInfo remains null, but 
> the catch block passes it to CompactionTxnHandler.markFailed(), which causes 
> an NPE.
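
A minimal sketch of the guard the resolution implies (the class below is a 
hypothetical stand-in, not Hive's CompactionTxnHandler API): check for a null 
CompactionInfo before attempting to mark it failed.

```java
// Hedged sketch of the failure pattern; names mirror the issue, but these
// classes are illustrative stand-ins, not Hive's real API.
public class MarkFailedGuard {
    static class CompactionInfo {
        final String table;
        CompactionInfo(String table) { this.table = table; }
    }

    // Before the fix, a catch block could pass a null CompactionInfo here
    // and dereferencing it would throw an NPE; the guard skips that case.
    static String markFailed(CompactionInfo ci) {
        if (ci == null) {
            return "no compaction to mark failed";
        }
        return "marked failed: " + ci.table;
    }
}
```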





[jira] [Assigned] (HIVE-26396) The trunc function has a problem with precision interception and the result has many 0

2022-07-27 Thread Simhadri G (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri G reassigned HIVE-26396:
-

Assignee: Simhadri G  (was: Xuedong Luan)

> The trunc function has a problem with precision interception and the result 
> has many 0
> --
>
> Key: HIVE-26396
> URL: https://issues.apache.org/jira/browse/HIVE-26396
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.3
> Environment: CDP7.1.7 ,RedHat7.6
>Reporter: phZhou
>Assignee: Simhadri G
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.2.0, 4.0.0, 4.0.0-alpha-2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The trunc function has a problem with precision truncation and the result is 
> padded with many trailing zeros. The return value is wrong when the data is 
> of decimal type, while it is displayed correctly when the data is of double 
> type. The test is as follows:
> 1. Execute on beeline:
>  SELECT trunc(15.8963,3);
> +------------+
> |    _c0     |
> +------------+
> | 15.896000  |
> +------------+
> 1 row selected (0.074 seconds)
> The expected return value is "15.896".
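For reference, the truncation itself can be reproduced with plain BigDecimal arithmetic. This is only an illustrative sketch of the scale behavior, not the actual GenericUDFTrunc implementation: truncating to scale 3 yields 15.896, and the padded output reported above corresponds to the result being rescaled back to the column's wider decimal scale.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class TruncScaleDemo {
    // Truncate toward zero at the given scale (what trunc(x, 3) should show).
    static BigDecimal trunc(BigDecimal value, int scale) {
        return value.setScale(scale, RoundingMode.DOWN);
    }

    public static void main(String[] args) {
        BigDecimal in = new BigDecimal("15.8963");
        System.out.println(trunc(in, 3));              // 15.896
        // Widening the scale again reintroduces the trailing zeros seen above:
        System.out.println(trunc(in, 3).setScale(9));  // 15.896000000
    }
}
```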





[jira] [Commented] (HIVE-26396) The trunc function has a problem with precision interception and the result has many 0

2022-07-27 Thread Simhadri G (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571929#comment-17571929
 ] 

Simhadri G commented on HIVE-26396:
---

Hi [~leoluan],

I have already merged the fix for this change in Apache Hive.

[GitHub Pull Request #3463|https://github.com/apache/hive/pull/3463] 

 

May I please know why it was reassigned?

Thanks!

> The trunc function has a problem with precision interception and the result 
> has many 0
> --
>
> Key: HIVE-26396
> URL: https://issues.apache.org/jira/browse/HIVE-26396
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.3
> Environment: CDP7.1.7 ,RedHat7.6
>Reporter: phZhou
>Assignee: Xuedong Luan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.2.0, 4.0.0, 4.0.0-alpha-2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The trunc function has a problem with precision truncation and the result is 
> padded with many trailing zeros. The return value is wrong when the data is 
> of decimal type, while it is displayed correctly when the data is of double 
> type. The test is as follows:
> 1. Execute on beeline:
>  SELECT trunc(15.8963,3);
> +------------+
> |    _c0     |
> +------------+
> | 15.896000  |
> +------------+
> 1 row selected (0.074 seconds)
> The expected return value is "15.896".





[jira] [Work logged] (HIVE-26425) Skip SSL cert verification for downloading JWKS in HS2

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26425?focusedWorklogId=795658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795658
 ]

ASF GitHub Bot logged work on HIVE-26425:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 13:36
Start Date: 27/Jul/22 13:36
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on PR #3473:
URL: https://github.com/apache/hive/pull/3473#issuecomment-1196773584

   Hi @hsnusonic, could you please elaborate a little bit on why we need this 
in tests while others don't?




Issue Time Tracking
---

Worklog Id: (was: 795658)
Time Spent: 1h 20m  (was: 1h 10m)

> Skip SSL cert verification for downloading JWKS in HS2
> --
>
> Key: HIVE-26425
> URL: https://issues.apache.org/jira/browse/HIVE-26425
> Project: Hive
>  Issue Type: New Feature
>Reporter: Yu-Wen Lai
>Assignee: Yu-Wen Lai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In a dev/test/staging environment, we would probably use letsencrypt staging 
> certificate for a token generation service. However, its certificate is not 
> accepted by JVM by default. To ease JWT testing in those kind of 
> environments, we can introduce a property to disable the certificate 
> verification just for JWKS downloads.
> Ref: https://letsencrypt.org/docs/staging-environment/





[jira] [Assigned] (HIVE-26396) The trunc function has a problem with precision interception and the result has many 0

2022-07-27 Thread Xuedong Luan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuedong Luan reassigned HIVE-26396:
---

Assignee: Xuedong Luan  (was: Simhadri G)

> The trunc function has a problem with precision interception and the result 
> has many 0
> --
>
> Key: HIVE-26396
> URL: https://issues.apache.org/jira/browse/HIVE-26396
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.3
> Environment: CDP7.1.7 ,RedHat7.6
>Reporter: phZhou
>Assignee: Xuedong Luan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.2.0, 4.0.0, 4.0.0-alpha-2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The trunc function has a problem with precision truncation and the result is 
> padded with many trailing zeros. The return value is wrong when the data is 
> of decimal type, while it is displayed correctly when the data is of double 
> type. The test is as follows:
> 1. Execute on beeline:
>  SELECT trunc(15.8963,3);
> +------------+
> |    _c0     |
> +------------+
> | 15.896000  |
> +------------+
> 1 row selected (0.074 seconds)
> The expected return value is "15.896".





[jira] [Work logged] (HIVE-25779) Deduplicate SerDe Info

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25779?focusedWorklogId=795649&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795649
 ]

ASF GitHub Bot logged work on HIVE-25779:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 13:01
Start Date: 27/Jul/22 13:01
Worklog Time Spent: 10m 
  Work Description: hsnusonic commented on code in PR #3221:
URL: https://github.com/apache/hive/pull/3221#discussion_r931034395


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java:
##
@@ -5322,16 +5380,25 @@ private void copyMSD(MStorageDescriptor newSd, 
MStorageDescriptor oldSd) {
 oldSd.setCD(newSd.getCD());
   }
 }
+// If the serde info of the old SD != the serde info of the new one,
+// then we set the serde info of the old SD as the new serde info.
+SerDeInfo oldSerDeInfo = null;
+SerDeInfo newSerDeInfo = null;
+try {
+  oldSerDeInfo = convertToSerDeInfo(oldSd.getSerDeInfo(), true);
+  newSerDeInfo = convertToSerDeInfo(newSd.getSerDeInfo(), true);
+} catch (MetaException e) {
+  LOG.debug("convertToSerDeInfo shouldn't throw MetaException when 
allowNull is set to true");
+}
+if (oldSerDeInfo == null || !oldSerDeInfo.equals(newSerDeInfo)) {
+  oldSd.setSerDeInfo(newSd.getSerDeInfo());

Review Comment:
   They are different types, SerDeInfo and MSerDeInfo. `newSd.getSerDeInfo()` 
actually returns MSerDeInfo.





Issue Time Tracking
---

Worklog Id: (was: 795649)
Time Spent: 4h  (was: 3h 50m)

> Deduplicate SerDe Info
> --
>
> Key: HIVE-25779
> URL: https://issues.apache.org/jira/browse/HIVE-25779
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Yu-Wen Lai
>Assignee: Yu-Wen Lai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> The proposal is that we can reuse serde info in the same way we reuse column 
> descriptors (HIVE-2246).
> Currently, we store the metadata for partitions as PARTITIONS (N partitions) 
> -> SDS (N locations) -> SERDES (N entries). However, all the SERDES for the 
> partitions in a table are the same unless we explicitly specify otherwise. 
> That is, each storage descriptor has an associated and exclusive serde info, 
> but the partitions' serde infos are mostly just the same as the table's. By 
> reusing the serde info, we can save some database storage and improve the 
> query performance from HMS to the backend database.
> For backward compatibility, we also need to introduce a config for this 
> feature, because there will be issues if an old HMS instance and a new HMS 
> instance with this feature are running together. With this feature, we will 
> need to check whether other storage descriptors reference a serde before 
> deleting it, whereas an old instance will just delete it.
> The other thing we need to take care of is custom serdes. If a partition's 
> serde is modified, we need to create a new record in SERDES so that we don't 
> interfere with other partitions.
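The sharing scheme described above can be sketched as a reference-counted lookup, analogous to the column-descriptor reuse from HIVE-2246. All names below are made up for illustration and do not match the actual HMS code:

```java
import java.util.HashMap;
import java.util.Map;

public class SerdeDedupSketch {
    // serde "content" (here just the serialization lib) -> shared row id
    static final Map<String, Long> serdeIdByKey = new HashMap<>();
    static final Map<Long, Integer> refCount = new HashMap<>();
    static long nextId = 1;

    // A storage descriptor acquiring a serde reuses an existing row if one
    // with identical content already exists.
    static long acquireSerde(String serializationLib) {
        long id = serdeIdByKey.computeIfAbsent(serializationLib, k -> nextId++);
        refCount.merge(id, 1, Integer::sum);
        return id;
    }

    // Deleting must first check that nothing else references the row --
    // the backward-compatibility concern noted above.
    static void releaseSerde(long id) {
        if (refCount.merge(id, -1, Integer::sum) == 0) {
            refCount.remove(id);
            serdeIdByKey.values().removeIf(v -> v == id);
        }
    }

    public static void main(String[] args) {
        long table = acquireSerde("LazySimpleSerDe");
        long part = acquireSerde("LazySimpleSerDe"); // partition reuses the row
        System.out.println(table == part);           // true: one SERDES entry
        releaseSerde(part);                          // still referenced by table
        System.out.println(serdeIdByKey.isEmpty());  // false
        releaseSerde(table);                         // last reference gone
        System.out.println(serdeIdByKey.isEmpty());  // true
    }
}
```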





[jira] [Assigned] (HIVE-26432) Improve LlapCacheAwareFs by caching file status information

2022-07-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-26432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita reassigned HIVE-26432:
-


> Improve LlapCacheAwareFs by caching file status information
> ---
>
> Key: HIVE-26432
> URL: https://issues.apache.org/jira/browse/HIVE-26432
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>
> The current implementation of LlapCacheAwareFs is used to wrap the 
> InputStreams of reads of non-ORC formatted files, if set up to utilize LLAP 
> caching.
> File content is cached by the calculated file ID and the required offsets 
> within the file. This is later served from the cache; however, 
> LlapCacheAwareFs, acting as a FileSystem, sometimes receives listStatus / 
> getFileStatus calls too, which are only proxied to the original FS. If such 
> an operation on the original FS is slow, e.g. listing on S3, performance 
> will be impacted. (This is not the case with how ORC is integrated into the 
> LLAP cache, as it is not acting as a FS.)
> I propose we cache the file status information too, besides the content.
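The proposal amounts to memoizing the proxied status lookups. A minimal sketch with stand-in types (not the Hadoop FileSystem API) that caches per-path results so only the first call hits the slow backing store:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class FileStatusCacheSketch {
    // Stand-in for org.apache.hadoop.fs.FileStatus.
    static class Status {
        final long length;
        Status(long length) { this.length = length; }
    }

    static final AtomicInteger backingCalls = new AtomicInteger();
    static final Map<String, Status> cache = new ConcurrentHashMap<>();

    // Stand-in for the call proxied to the original FS (e.g. an S3 listing).
    static Status slowGetFileStatus(String path) {
        backingCalls.incrementAndGet();
        return new Status(42L);
    }

    // Cached variant: repeated calls for the same path are served locally.
    static Status getFileStatus(String path) {
        return cache.computeIfAbsent(path, FileStatusCacheSketch::slowGetFileStatus);
    }

    public static void main(String[] args) {
        getFileStatus("s3a://bucket/warehouse/t/part-00000.parquet");
        getFileStatus("s3a://bucket/warehouse/t/part-00000.parquet"); // cache hit
        System.out.println(backingCalls.get()); // 1
    }
}
```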





[jira] [Work logged] (HIVE-25779) Deduplicate SerDe Info

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25779?focusedWorklogId=795646&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795646
 ]

ASF GitHub Bot logged work on HIVE-25779:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 12:50
Start Date: 27/Jul/22 12:50
Worklog Time Spent: 10m 
  Work Description: hsnusonic commented on code in PR #3221:
URL: https://github.com/apache/hive/pull/3221#discussion_r931021355


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:
##
@@ -2635,10 +2634,9 @@ private void dropPartitionsByPartitionIds(List 
partitionIdList) throws Met
 for (Object[] fields : sqlResult) {
   sdIdList.add(MetastoreDirectSqlUtils.extractSqlLong(fields[0]));
   Long colId = MetastoreDirectSqlUtils.extractSqlLong(fields[1]);
-  if (!columnDescriptorIdList.contains(colId)) {
-columnDescriptorIdList.add(colId);
-  }
-  serdeIdList.add(MetastoreDirectSqlUtils.extractSqlLong(fields[2]));
+  columnDescriptorIdSet.add(colId);
+  Long serdeId = MetastoreDirectSqlUtils.extractSqlLong(fields[2]);
+  serdeIdSet.add(serdeId);

Review Comment:
   Sure, will do.
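The diff discussed above replaces a `contains`-guarded list with a set. A small sketch of why: a `Set` dedupes implicitly with constant-time lookups, while `List.contains` scans linearly per element (names below are illustrative only):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class IdDedupSketch {
    public static void main(String[] args) {
        List<Long> rawColumnDescriptorIds = List.of(10L, 11L, 10L, 12L, 11L);

        // Before: O(n) scan per element to keep the list unique.
        List<Long> viaList = new ArrayList<>();
        for (Long id : rawColumnDescriptorIds) {
            if (!viaList.contains(id)) {
                viaList.add(id);
            }
        }

        // After: a Set dedupes implicitly; LinkedHashSet also keeps
        // first-seen order, so results stay deterministic.
        Set<Long> viaSet = new LinkedHashSet<>(rawColumnDescriptorIds);

        System.out.println(viaList); // [10, 11, 12]
        System.out.println(viaSet);  // [10, 11, 12]
    }
}
```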





Issue Time Tracking
---

Worklog Id: (was: 795646)
Time Spent: 3h 50m  (was: 3h 40m)

> Deduplicate SerDe Info
> --
>
> Key: HIVE-25779
> URL: https://issues.apache.org/jira/browse/HIVE-25779
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Yu-Wen Lai
>Assignee: Yu-Wen Lai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The proposal is that we can reuse serde info in the same way we reuse column 
> descriptors (HIVE-2246).
> Currently, we store the metadata for partitions as PARTITIONS (N partitions) 
> -> SDS (N locations) -> SERDES (N entries). However, all the SERDES for the 
> partitions in a table are the same unless we explicitly specify otherwise. 
> That is, each storage descriptor has an associated and exclusive serde info, 
> but the partitions' serde infos are mostly just the same as the table's. By 
> reusing the serde info, we can save some database storage and improve the 
> query performance from HMS to the backend database.
> For backward compatibility, we also need to introduce a config for this 
> feature, because there will be issues if an old HMS instance and a new HMS 
> instance with this feature are running together. With this feature, we will 
> need to check whether other storage descriptors reference a serde before 
> deleting it, whereas an old instance will just delete it.
> The other thing we need to take care of is custom serdes. If a partition's 
> serde is modified, we need to create a new record in SERDES so that we don't 
> interfere with other partitions.





[jira] [Work logged] (HIVE-25779) Deduplicate SerDe Info

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25779?focusedWorklogId=795645&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795645
 ]

ASF GitHub Bot logged work on HIVE-25779:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 12:49
Start Date: 27/Jul/22 12:49
Worklog Time Spent: 10m 
  Work Description: hsnusonic commented on code in PR #3221:
URL: https://github.com/apache/hive/pull/3221#discussion_r931020343


##
standalone-metastore/metastore-server/src/main/resources/package.jdo:
##
@@ -433,7 +434,7 @@
   
 
   
-  

Review Comment:
   Only the "dependent" property is removed. This patch makes serDeInfo be 
shared between StorageDescriptors, so it is not "dependent" any more.





Issue Time Tracking
---

Worklog Id: (was: 795645)
Time Spent: 3h 40m  (was: 3.5h)

> Deduplicate SerDe Info
> --
>
> Key: HIVE-25779
> URL: https://issues.apache.org/jira/browse/HIVE-25779
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Yu-Wen Lai
>Assignee: Yu-Wen Lai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The proposal is that we can reuse serde info in the same way we reuse column 
> descriptors (HIVE-2246).
> Currently, we store the metadata for partitions as PARTITIONS (N partitions) 
> -> SDS (N locations) -> SERDES (N entries). However, all the SERDES for the 
> partitions in a table are the same unless we explicitly specify otherwise. 
> That is, each storage descriptor has an associated and exclusive serde info, 
> but the partitions' serde infos are mostly just the same as the table's. By 
> reusing the serde info, we can save some database storage and improve the 
> query performance from HMS to the backend database.
> For backward compatibility, we also need to introduce a config for this 
> feature, because there will be issues if an old HMS instance and a new HMS 
> instance with this feature are running together. With this feature, we will 
> need to check whether other storage descriptors reference a serde before 
> deleting it, whereas an old instance will just delete it.
> The other thing we need to take care of is custom serdes. If a partition's 
> serde is modified, we need to create a new record in SERDES so that we don't 
> interfere with other partitions.





[jira] [Work logged] (HIVE-1626) stop using java.util.Stack

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-1626?focusedWorklogId=795634&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795634
 ]

ASF GitHub Bot logged work on HIVE-1626:


Author: ASF GitHub Bot
Created on: 27/Jul/22 12:27
Start Date: 27/Jul/22 12:27
Worklog Time Spent: 10m 
  Work Description: pudidic commented on PR #3441:
URL: https://github.com/apache/hive/pull/3441#issuecomment-1196663673

   @cmunkey
   
   I tried to minimize indexed accessors of Stack, exposing push/pop/peek only. 
Because there already are several use cases of Stack#get(int), I implemented 
ArrayStack#get(int) as well.
   
   However, LevelOrderWalker has a unique usage of Stack. LevelOrderWalker 
calls stack.add(0, element) and stack.remove(0) to add/remove elements at the 
start of the stack. They are the inverse of stack.push/pop, which add/remove 
at the end. The rest of the code still calls Stack.push/pop, not the inverse. 
I tried to replace stack.add(0, element) with stack.push and stack.remove(0) 
with stack.pop, but it behaved very differently. The inverse push/pop is not 
what I wanted to expose. So I implemented a subclass in LevelOrderWalker to 
isolate its use from the others. Maybe I need to document it to make its 
purpose clear.
   
   I'll replace == with equals in indexOf(). Thank you for the advice.
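The mixed usage described above — push/pop at the end plus add(0, e)/remove(0) at the front — is effectively a double-ended queue. A hedged sketch with java.util.ArrayDeque (not the ArrayStack subclass from the PR) showing the two ends explicitly:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class LevelOrderDequeSketch {
    public static void main(String[] args) {
        // On java.util.Stack (a Vector), push/pop work at the END, while
        // add(0, e)/remove(0) work at the FRONT. An ArrayDeque names both
        // ends explicitly and avoids Vector's per-call synchronization.
        Deque<String> deque = new ArrayDeque<>();

        deque.addLast("a");     // Stack.push("a")
        deque.addLast("b");     // Stack.push("b")
        deque.addFirst("head"); // Stack.add(0, "head") -- LevelOrderWalker's pattern

        System.out.println(deque.pollFirst()); // head  (Stack.remove(0))
        System.out.println(deque.pollLast());  // b     (Stack.pop())
    }
}
```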




Issue Time Tracking
---

Worklog Id: (was: 795634)
Time Spent: 1.5h  (was: 1h 20m)

> stop using java.util.Stack
> --
>
> Key: HIVE-1626
> URL: https://issues.apache.org/jira/browse/HIVE-1626
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.7.0
>Reporter: John Sichi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-1626.2.patch, HIVE-1626.2.patch, HIVE-1626.3.patch, 
> HIVE-1626.3.patch, HIVE-1626.3.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We currently use Stack as part of the generic node walking library.  Stack 
> should not be used for this since its inheritance from Vector incurs 
> superfluous synchronization overhead.
> Most projects end up adding an ArrayStack implementation and using that 
> instead.





[jira] [Work logged] (HIVE-26288) NPE in CompactionTxnHandler.markFailed()

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26288?focusedWorklogId=795631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795631
 ]

ASF GitHub Bot logged work on HIVE-26288:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 12:17
Start Date: 27/Jul/22 12:17
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged PR #3451:
URL: https://github.com/apache/hive/pull/3451




Issue Time Tracking
---

Worklog Id: (was: 795631)
Time Spent: 0.5h  (was: 20m)

> NPE in CompactionTxnHandler.markFailed()
> 
>
> Key: HIVE-26288
> URL: https://issues.apache.org/jira/browse/HIVE-26288
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: László Végh
>Assignee: Zsolt Miskolczi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Unhandled exceptions from 
> IMetaStoreClient.findNextCompact(FindNextCompactRequest) are handled 
> incorrectly in the worker. In these cases the CompactionInfo remains null, 
> but the catch block passes it to CompactionTxnHandler.markFailed(), which 
> causes an NPE.





[jira] [Work logged] (HIVE-26414) Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26414?focusedWorklogId=795575&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795575
 ]

ASF GitHub Bot logged work on HIVE-26414:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 08:55
Start Date: 27/Jul/22 08:55
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r930799014


##
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##
@@ -485,6 +493,27 @@ private void clearLocksAndHB() {
 stopHeartbeat();
   }
 
+  private void cleanupOutputDir(Context ctx) throws MetaException {
+if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+  Table destinationTable = ctx.getDestinationTable();
+  if (destinationTable != null) {
+try {
+  CompactionRequest rqst = new CompactionRequest(
+  destinationTable.getDbName(), 
destinationTable.getTableName(), CompactionType.MAJOR);
+  
rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+  destinationTable.getTTable(), conf));
+
+  rqst.putToProperties(META_TABLE_LOCATION, 
destinationTable.getSd().getLocation());
+  rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+  TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   btw, would it be hard to create a completionHook similar to Iceberg one?





Issue Time Tracking
---

Worklog Id: (was: 795575)
Time Spent: 6h 10m  (was: 6h)

> Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data
> ---
>
> Key: HIVE-26414
> URL: https://issues.apache.org/jira/browse/HIVE-26414
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> When a CTAS query fails after writing the data but before the table is 
> created, the data remains in the directory and is currently not cleaned up 
> by the cleaner or any other mechanism. This is because the cleaner requires 
> a table corresponding to what it is cleaning. To handle such a situation, we 
> can directly pass the relevant information to the cleaner so that such 
> uncommitted data is deleted.





[jira] [Work logged] (HIVE-26414) Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26414?focusedWorklogId=795571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795571
 ]

ASF GitHub Bot logged work on HIVE-26414:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 08:53
Start Date: 27/Jul/22 08:53
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r930796903


##
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##
@@ -485,6 +493,27 @@ private void clearLocksAndHB() {
 stopHeartbeat();
   }
 
+  private void cleanupOutputDir(Context ctx) throws MetaException {
+if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.TXN_CTAS_X_LOCK)) {
+  Table destinationTable = ctx.getDestinationTable();
+  if (destinationTable != null) {
+try {
+  CompactionRequest rqst = new CompactionRequest(
+  destinationTable.getDbName(), 
destinationTable.getTableName(), CompactionType.MAJOR);
+  
rqst.setRunas(TxnUtils.findUserToRunAs(destinationTable.getSd().getLocation(),
+  destinationTable.getTTable(), conf));
+
+  rqst.putToProperties(META_TABLE_LOCATION, 
destinationTable.getSd().getLocation());
+  rqst.putToProperties(IF_PURGE, Boolean.toString(true));
+  TxnStore txnHandler = TxnUtils.getTxnStore(conf);

Review Comment:
   it would be expensive to create a new txnStore every time





Issue Time Tracking
---

Worklog Id: (was: 795571)
Time Spent: 6h  (was: 5h 50m)

> Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data
> ---
>
> Key: HIVE-26414
> URL: https://issues.apache.org/jira/browse/HIVE-26414
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> When a CTAS query fails after writing the data but before the table is 
> created, the data remains in the directory and is currently not cleaned up 
> by the cleaner or any other mechanism. This is because the cleaner requires 
> a table corresponding to what it is cleaning. To handle such a situation, we 
> can directly pass the relevant information to the cleaner so that such 
> uncommitted data is deleted.





[jira] [Work logged] (HIVE-26414) Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26414?focusedWorklogId=795567&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795567
 ]

ASF GitHub Bot logged work on HIVE-26414:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 08:51
Start Date: 27/Jul/22 08:51
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r930794617


##
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##
@@ -598,6 +627,16 @@ public void rollbackTxn() throws LockException {
 }
   }
 
+  @Override
+  public void rollbackTxn(Context ctx) throws LockException {
+try {
+  cleanupOutputDir(ctx);
+} catch (TException e) {
+  throw new 
LockException(ErrorMsg.METASTORE_COMMUNICATION_FAILED.getMsg(), e);
+}
+rollbackTxn();

Review Comment:
   we should call rollback first, and then try the cleanup; otherwise, if the 
cleanup fails, we'll never mark the txn as aborted
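The ordering concern in this comment can be illustrated with a small sketch (hypothetical method names, not the DbTxnManager code): aborting first guarantees the txn state is recorded even if the cleanup step throws.

```java
import java.util.ArrayList;
import java.util.List;

public class RollbackOrderSketch {
    static final List<String> events = new ArrayList<>();

    static void rollbackTxn() { events.add("txn marked aborted"); }

    // Simulate a cleanup step that fails (e.g. the metastore call throws).
    static void cleanupOutputDir() { throw new RuntimeException("cleanup failed"); }

    // Abort first, then attempt cleanup: a cleanup failure can no longer
    // leave the transaction open.
    static void rollbackTxnWithCleanup() {
        rollbackTxn();
        try {
            cleanupOutputDir();
        } catch (RuntimeException e) {
            events.add("cleanup failed, but txn is already aborted");
        }
    }

    public static void main(String[] args) {
        rollbackTxnWithCleanup();
        events.forEach(System.out::println);
    }
}
```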





Issue Time Tracking
---

Worklog Id: (was: 795567)
Time Spent: 5h 50m  (was: 5h 40m)

> Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data
> ---
>
> Key: HIVE-26414
> URL: https://issues.apache.org/jira/browse/HIVE-26414
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> When a CTAS query fails after writing the data but before the table is 
> created, the data remains in the directory and is currently not cleaned up 
> by the cleaner or any other mechanism. This is because the cleaner requires 
> a table corresponding to what it is cleaning. To handle such a situation, we 
> can directly pass the relevant information to the cleaner so that such 
> uncommitted data is deleted.





[jira] [Work started] (HIVE-26115) LLAP cache utilization for Iceberg Parquet files

2022-07-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-26115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-26115 started by Ádám Szita.
-
> LLAP cache utilization for Iceberg Parquet files
> 
>
> Key: HIVE-26115
> URL: https://issues.apache.org/jira/browse/HIVE-26115
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2022-04-05 at 10.08.27 AM.png, Screenshot 
> 2022-04-05 at 10.08.35 AM.png, Screenshot 2022-04-05 at 10.08.50 AM.png, 
> Screenshot 2022-04-05 at 10.09.03 AM.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Originally: 
> Parquet footer is read 3 times when reading iceberg data
> !Screenshot 2022-04-05 at 10.08.27 AM.png|width=627,height=331!
> Here is the breakup of 3 footer reads per file.
> !Screenshot 2022-04-05 at 10.08.35 AM.png|width=1109,height=500!
>  
>  
> !Screenshot 2022-04-05 at 10.08.50 AM.png|width=1067,height=447!
>  
>  
> !Screenshot 2022-04-05 at 10.09.03 AM.png|width=827,height=303!
>  
> HIVE-25827 already talks about the initial 2 footer reads per file.





[jira] [Work logged] (HIVE-26414) Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26414?focusedWorklogId=795565&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795565
 ]

ASF GitHub Bot logged work on HIVE-26414:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 08:48
Start Date: 27/Jul/22 08:48
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r930790946


##
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java:
##
@@ -161,7 +161,7 @@ void replTableWriteIdState(String validWriteIdList, String 
dbName, String tableN
* @throws LockException if there is no current transaction or the
* transaction has already been committed or aborted.
*/
-  void rollbackTxn() throws LockException;
+  void rollbackTxn(Context ctx) throws LockException;

Review Comment:
   true, however, Impala could still be using an old API
   





Issue Time Tracking
---

Worklog Id: (was: 795565)
Time Spent: 5h 40m  (was: 5.5h)

> Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data
> ---
>
> Key: HIVE-26414
> URL: https://issues.apache.org/jira/browse/HIVE-26414
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> When a CTAS query fails after writing the data but before the table is 
> created, the data remains in the directory and is currently not cleaned up 
> by the cleaner or any other mechanism. This is because the cleaner requires 
> a table corresponding to what it is cleaning. To handle such a situation, we 
> can directly pass the relevant information to the cleaner so that such 
> uncommitted data is deleted.





[jira] [Work logged] (HIVE-26115) LLAP cache utilization for Iceberg Parquet files

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26115?focusedWorklogId=795561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795561
 ]

ASF GitHub Bot logged work on HIVE-26115:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 08:39
Start Date: 27/Jul/22 08:39
Worklog Time Spent: 10m 
  Work Description: szlta opened a new pull request, #3480:
URL: https://github.com/apache/hive/pull/3480

   This change makes LLAP caching available for Parquet formatted tables stored 
with Iceberg. 
   For Parquet we can only rely on the LlapCacheAwareFs, which is a wrapper 
that allows InputStreams opened from files to be swapped for cache-buffer 
reading streams.
   
   I also refactored the FileID generation to remove some code duplication, as 
it was previously invoked from the ORC, Serde and Parquet readers. 




Issue Time Tracking
---

Worklog Id: (was: 795561)
Remaining Estimate: 0h
Time Spent: 10m

> LLAP cache utilization for Iceberg Parquet files
> 
>
> Key: HIVE-26115
> URL: https://issues.apache.org/jira/browse/HIVE-26115
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Ádám Szita
>Priority: Major
> Attachments: Screenshot 2022-04-05 at 10.08.27 AM.png, Screenshot 
> 2022-04-05 at 10.08.35 AM.png, Screenshot 2022-04-05 at 10.08.50 AM.png, 
> Screenshot 2022-04-05 at 10.09.03 AM.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Originally: 
> Parquet footer is read 3 times when reading iceberg data
> !Screenshot 2022-04-05 at 10.08.27 AM.png|width=627,height=331!
> Here is the breakup of 3 footer reads per file.
> !Screenshot 2022-04-05 at 10.08.35 AM.png|width=1109,height=500!
>  
>  
> !Screenshot 2022-04-05 at 10.08.50 AM.png|width=1067,height=447!
>  
>  
> !Screenshot 2022-04-05 at 10.09.03 AM.png|width=827,height=303!
>  
> HIVE-25827 already talks about the initial 2 footer reads per file.





[jira] [Updated] (HIVE-26115) LLAP cache utilization for Iceberg Parquet files

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26115:
--
Labels: pull-request-available  (was: )

> LLAP cache utilization for Iceberg Parquet files
> 
>
> Key: HIVE-26115
> URL: https://issues.apache.org/jira/browse/HIVE-26115
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2022-04-05 at 10.08.27 AM.png, Screenshot 
> 2022-04-05 at 10.08.35 AM.png, Screenshot 2022-04-05 at 10.08.50 AM.png, 
> Screenshot 2022-04-05 at 10.09.03 AM.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Originally: 
> Parquet footer is read 3 times when reading iceberg data
> !Screenshot 2022-04-05 at 10.08.27 AM.png|width=627,height=331!
> Here is the breakup of 3 footer reads per file.
> !Screenshot 2022-04-05 at 10.08.35 AM.png|width=1109,height=500!
>  
>  
> !Screenshot 2022-04-05 at 10.08.50 AM.png|width=1067,height=447!
>  
>  
> !Screenshot 2022-04-05 at 10.09.03 AM.png|width=827,height=303!
>  
> HIVE-25827 already talks about the initial 2 footer reads per file.





[jira] [Work logged] (HIVE-26288) NPE in CompactionTxnHandler.markFailed()

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26288?focusedWorklogId=795546&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795546
 ]

ASF GitHub Bot logged work on HIVE-26288:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 07:55
Start Date: 27/Jul/22 07:55
Worklog Time Spent: 10m 
  Work Description: InvisibleProgrammer commented on code in PR #3451:
URL: https://github.com/apache/hive/pull/3451#discussion_r930738558


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java:
##
@@ -600,6 +602,10 @@ public CompactorMR getMrCompactor() {
   }
 
   private void markFailed(CompactionInfo ci, String errorMessage) {
+    if (ci == null) {
+      LOG.warn("CompactionInfo client was null. Could not mark failed: {}", ci);

Review Comment:
   Removed the parameter. Now it is: `LOG.warn("CompactionInfo client was null. Could not mark failed");`





Issue Time Tracking
---

Worklog Id: (was: 795546)
Time Spent: 20m  (was: 10m)

> NPE in CompactionTxnHandler.markFailed()
> 
>
> Key: HIVE-26288
> URL: https://issues.apache.org/jira/browse/HIVE-26288
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: László Végh
>Assignee: Zsolt Miskolczi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Unhandled exceptions from 
> IMetaStoreClient.findNextCompaction(FindNextCompactRequest) are handled 
> incorrectly in the worker. In these cases the CompactionInfo remains null, 
> but the catch block passes it to CompactionTxnHandler.markFailed(), which 
> causes an NPE.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26431) Use correct schema for iceberg time travel queries

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26431?focusedWorklogId=795534&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795534
 ]

ASF GitHub Bot logged work on HIVE-26431:
-

Author: ASF GitHub Bot
Created on: 27/Jul/22 06:47
Start Date: 27/Jul/22 06:47
Worklog Time Spent: 10m 
  Work Description: lcspinter opened a new pull request, #3479:
URL: https://github.com/apache/hive/pull/3479

   
   
   ### What changes were proposed in this pull request?
   For time travel queries we currently always use the latest schema in 
IcebergSerDe, however since schemas are now versioned, we could use the schema 
which was active at the time in the past.
   
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   
   ### How was this patch tested?
   Unit test, manual test
   
   




Issue Time Tracking
---

Worklog Id: (was: 795534)
Remaining Estimate: 0h
Time Spent: 10m

> Use correct schema for iceberg time travel queries
> --
>
> Key: HIVE-26431
> URL: https://issues.apache.org/jira/browse/HIVE-26431
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For time travel queries we currently always use the latest schema in 
> IcebergSerDe; however, since schemas are now versioned, we could use the 
> schema that was active at that point in the past.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26431) Use correct schema for iceberg time travel queries

2022-07-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26431:
--
Labels: pull-request-available  (was: )

> Use correct schema for iceberg time travel queries
> --
>
> Key: HIVE-26431
> URL: https://issues.apache.org/jira/browse/HIVE-26431
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For time travel queries we currently always use the latest schema in 
> IcebergSerDe; however, since schemas are now versioned, we could use the 
> schema that was active at that point in the past.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26431) Use correct schema for iceberg time travel queries

2022-07-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-26431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér reassigned HIVE-26431:



> Use correct schema for iceberg time travel queries
> --
>
> Key: HIVE-26431
> URL: https://issues.apache.org/jira/browse/HIVE-26431
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>
> For time travel queries we currently always use the latest schema in 
> IcebergSerDe; however, since schemas are now versioned, we could use the 
> schema that was active at that point in the past.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)