[jira] [Updated] (HIVE-24467) ConditionalTask removal of unselected tasks has a thread-safety problem

2021-07-20 Thread Xi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Chen updated HIVE-24467:
---
Affects Version/s: 1.1.0
   3.1.2

> ConditionalTask removal of unselected tasks has a thread-safety problem
> ---
>
> Key: HIVE-24467
> URL: https://issues.apache.org/jira/browse/HIVE-24467
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0, 2.3.4, 3.1.2
>Reporter: guojh
>Assignee: guojh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When Hive executes jobs in parallel (controlled by the "hive.exec.parallel" 
> parameter), ConditionalTasks remove their unselected tasks concurrently. 
> Because of thread-safety issues, some tasks may not be removed from the 
> dependent task tree. This is a very serious bug, because it causes some 
> stage tasks to never trigger execution.
> In our production cluster, a query ran three conditional tasks in parallel. 
> After applying the patch from HIVE-21638, we found that Stage-3 was missing 
> and never submitted to the runnable list because its parent Stage-31 was not 
> done. But Stage-31 should have been removed, since it was not selected.
> The stage dependencies are below:
> {code:java}
> STAGE DEPENDENCIES:
>   Stage-41 is a root stage
>   Stage-26 depends on stages: Stage-41
>   Stage-25 depends on stages: Stage-26 , consists of Stage-39, Stage-40, 
> Stage-2
>   Stage-39 has a backup stage: Stage-2
>   Stage-23 depends on stages: Stage-39
>   Stage-3 depends on stages: Stage-2, Stage-12, Stage-16, Stage-20, Stage-23, 
> Stage-24, Stage-27, Stage-28, Stage-31, Stage-32, Stage-35, Stage-36
>   Stage-8 depends on stages: Stage-3 , consists of Stage-5, Stage-4, Stage-6
>   Stage-5
>   Stage-0 depends on stages: Stage-5, Stage-4, Stage-7
>   Stage-51 depends on stages: Stage-0
>   Stage-4
>   Stage-6
>   Stage-7 depends on stages: Stage-6
>   Stage-40 has a backup stage: Stage-2
>   Stage-24 depends on stages: Stage-40
>   Stage-2
>   Stage-44 is a root stage
>   Stage-30 depends on stages: Stage-44
>   Stage-29 depends on stages: Stage-30 , consists of Stage-42, Stage-43, 
> Stage-12
>   Stage-42 has a backup stage: Stage-12
>   Stage-27 depends on stages: Stage-42
>   Stage-43 has a backup stage: Stage-12
>   Stage-28 depends on stages: Stage-43
>   Stage-12
>   Stage-47 is a root stage
>   Stage-34 depends on stages: Stage-47
>   Stage-33 depends on stages: Stage-34 , consists of Stage-45, Stage-46, 
> Stage-16
>   Stage-45 has a backup stage: Stage-16
>   Stage-31 depends on stages: Stage-45
>   Stage-46 has a backup stage: Stage-16
>   Stage-32 depends on stages: Stage-46
>   Stage-16
>   Stage-50 is a root stage
>   Stage-38 depends on stages: Stage-50
>   Stage-37 depends on stages: Stage-38 , consists of Stage-48, Stage-49, 
> Stage-20
>   Stage-48 has a backup stage: Stage-20
>   Stage-35 depends on stages: Stage-48
>   Stage-49 has a backup stage: Stage-20
>   Stage-36 depends on stages: Stage-49
>   Stage-20
> {code}
> The stage task execution log is below. Stage-33 is a conditional task that 
> consists of Stage-45, Stage-46, and Stage-16. Stage-16 is launched, so 
> Stage-45 and Stage-46 should be removed from the dependency tree. Stage-31 
> is a child of Stage-45 and a parent of Stage-3, so Stage-31 should be 
> removed too. But as seen in the log below, Stage-31 is still in the parent 
> list of Stage-3, which should not happen.
> {code:java}
> 2020-12-03T01:09:50,939  INFO [HiveServer2-Background-Pool: Thread-87372] 
> ql.Driver: Launching Job 1 out of 17
> 2020-12-03T01:09:50,940  INFO [HiveServer2-Background-Pool: Thread-87372] 
> ql.Driver: Starting task [Stage-26:MAPRED] in parallel
> 2020-12-03T01:09:50,941  INFO [HiveServer2-Background-Pool: Thread-87372] 
> ql.Driver: Launching Job 2 out of 17
> 2020-12-03T01:09:50,943  INFO [HiveServer2-Background-Pool: Thread-87372] 
> ql.Driver: Starting task [Stage-30:MAPRED] in parallel
> 2020-12-03T01:09:50,943  INFO [HiveServer2-Background-Pool: Thread-87372] 
> ql.Driver: Launching Job 3 out of 17
> 2020-12-03T01:09:50,943  INFO [HiveServer2-Background-Pool: Thread-87372] 
> ql.Driver: Starting task [Stage-34:MAPRED] in parallel
> 2020-12-03T01:09:50,944  INFO [HiveServer2-Background-Pool: Thread-87372] 
> ql.Driver: Launching Job 4 out of 17
> 2020-12-03T01:09:50,944  INFO [HiveServer2-Background-Pool: Thread-87372] 
> ql.Driver: Starting task [Stage-38:MAPRED] in parallel
> 2020-12-03T01:10:32,946  INFO [HiveServer2-Background-Pool: Thread-87372] 
> ql.Driver: Starting task [Stage-29:CONDITIONAL] in parallel
> 2020-12-03T01:10:32,946  INFO [HiveServer2-Background-Pool: Thread-87372] 
> ql.Driver: Starting task [Stage-33:CONDITIONAL] in parallel
> 2020-12-03T01:10:32,946  INFO 
> {code}
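For illustration only (not from the ticket): a minimal Java sketch of this kind 
of race, assuming the parent-task lists are plain ArrayLists that several 
ConditionalTasks prune concurrently. The Stage class and method names below are 
hypothetical stand-ins, not Hive's actual Task API.

{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a task dependency tree; not Hive's actual Task class.
class Stage {
  final String id;
  final List<Stage> parentTasks = new ArrayList<>(); // ArrayList is not thread-safe

  Stage(String id) { this.id = id; }
}

public class ConditionalRemovalRace {
  // Unsynchronized removal: two ConditionalTasks pruning the same child's
  // parent list concurrently can corrupt the list or lose a removal, leaving
  // a stale parent (e.g. Stage-31) that blocks the child (e.g. Stage-3).
  static void removeNotSelected(Stage child, Stage notSelectedParent) {
    child.parentTasks.remove(notSelectedParent);
  }

  // One possible remedy (a sketch, not the ticket's patch): serialize
  // mutations of the shared list.
  static void removeNotSelectedSafely(Stage child, Stage notSelectedParent) {
    synchronized (child.parentTasks) {
      child.parentTasks.remove(notSelectedParent);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Stage stage3 = new Stage("Stage-3");
    Stage stage31 = new Stage("Stage-31");
    Stage stage45 = new Stage("Stage-45");
    stage3.parentTasks.add(stage31);
    stage3.parentTasks.add(stage45);

    Thread t1 = new Thread(() -> removeNotSelected(stage3, stage31));
    Thread t2 = new Thread(() -> removeNotSelected(stage3, stage45));
    t1.start(); t2.start();
    t1.join(); t2.join();

    // With the unsynchronized version this may print 1 instead of 0.
    System.out.println("remaining parents of Stage-3: " + stage3.parentTasks.size());
  }
}
{code}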

[jira] [Work logged] (HIVE-25358) Remove reviewer pattern

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25358?focusedWorklogId=625839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625839
 ]

ASF GitHub Bot logged work on HIVE-25358:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 23:06
Start Date: 20/Jul/21 23:06
Worklog Time Spent: 10m 
  Work Description: jcamachor opened a new pull request #2506:
URL: https://github.com/apache/hive/pull/2506


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625839)
Remaining Estimate: 0h
Time Spent: 10m

> Remove reviewer pattern
> ---
>
> Key: HIVE-25358
> URL: https://issues.apache.org/jira/browse/HIVE-25358
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24448) Support case-sensitivity for tables in REMOTE database.

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24448?focusedWorklogId=625854&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625854
 ]

ASF GitHub Bot logged work on HIVE-24448:
-

Author: ASF GitHub Bot
Created on: 21/Jul/21 00:08
Start Date: 21/Jul/21 00:08
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2246:
URL: https://github.com/apache/hive/pull/2246#issuecomment-883785199


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625854)
Time Spent: 0.5h  (was: 20m)

> Support case-sensitivity for tables in REMOTE database.
> ---
>
> Key: HIVE-24448
> URL: https://issues.apache.org/jira/browse/HIVE-24448
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Naveen Gangam
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hive tables are case-insensitive: any case specified in user queries is 
> converted to lower case for query planning, and all of the HMS metadata is 
> also persisted with lower-case names.
> However, some REMOTE data sources support case-sensitive table names.
> So the HiveServer2 query planner needs to preserve the user-provided case 
> when calling HMS APIs, so that HMS can fetch the metadata from the remote 
> data source.
> We now see something like this
> {noformat}
> 2020-11-25T16:45:36,402  WARN [HiveServer2-Handler-Pool: Thread-76] 
> thrift.ThriftCLIService: Error executing statement: 
> org.apache.hive.service.cli.HiveSQLException: Error while compiling 
> statement: FAILED: RuntimeException 
> MetaException(message:org.apache.hadoop.hive.serde2.SerDeException 
> org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Error 
> while trying to get column names: Table 'hive1.txns' doesn't exist)
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:365)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:277) 
> ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:560)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:545)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at sun.reflect.GeneratedMethodAccessor68.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_231]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_231]
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_231]
>   at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_231]
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>  ~[hadoop-common-3.1.0.jar:?]
>   at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at com.sun.proxy.$Proxy43.executeStatementAsync(Unknown Source) ~[?:?]
>   at 
> {noformat}
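The planner change described above can be sketched as follows; the helper below 
is a hypothetical illustration, not Hive's actual planner code:

{code:java}
import java.util.Locale;

// Hypothetical sketch: native Hive tables are case-insensitive (HMS stores
// lower-case names), but a REMOTE database may front a case-sensitive source,
// so the user-provided case must be preserved for its tables.
public class TableNameCaseSketch {
  static String planningName(boolean isRemoteDatabase, String userTableName) {
    return isRemoteDatabase
        ? userTableName                           // pass through as typed
        : userTableName.toLowerCase(Locale.ROOT); // classic Hive behavior
  }

  public static void main(String[] args) {
    System.out.println(planningName(false, "TXNS")); // txns
    System.out.println(planningName(true, "TXNS"));  // TXNS
  }
}
{code}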

[jira] [Work logged] (HIVE-25091) Implement connector provider for MSSQL and Oracle

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25091?focusedWorklogId=625852&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625852
 ]

ASF GitHub Bot logged work on HIVE-25091:
-

Author: ASF GitHub Bot
Created on: 21/Jul/21 00:08
Start Date: 21/Jul/21 00:08
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2248:
URL: https://github.com/apache/hive/pull/2248#issuecomment-883785183


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625852)
Time Spent: 40m  (was: 0.5h)

> Implement connector provider for MSSQL and Oracle
> -
>
> Key: HIVE-25091
> URL: https://issues.apache.org/jira/browse/HIVE-25091
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Provide an implementation of Connector provider for MSSQL and Oracle



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25103) Update row.serde exclude defaults

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25103?focusedWorklogId=625853&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625853
 ]

ASF GitHub Bot logged work on HIVE-25103:
-

Author: ASF GitHub Bot
Created on: 21/Jul/21 00:08
Start Date: 21/Jul/21 00:08
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #2262:
URL: https://github.com/apache/hive/pull/2262


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625853)
Time Spent: 0.5h  (was: 20m)

> Update row.serde exclude defaults
> -
>
> Key: HIVE-25103
> URL: https://issues.apache.org/jira/browse/HIVE-25103
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-16222 introduced the row.serde.inputformat.excludes setting to disable 
> row.serde for specific NON-vectorized formats.
> Since MapredParquetInputFormat is now natively vectorized, it should be 
> removed from that list.
> Even when hive.vectorized.use.vectorized.input.format is DISABLED, the 
> Vectorizer will not vectorize in row-deserialize mode if the input format is 
> natively vectorized, so it is safe to remove.
> Configuration precedence for controlling vectorization:
> 1. hive.vectorized.use.vectorized.input.format
> 2. hive.vectorized.use.vector.serde.deserialize
> 3. hive.vectorized.use.row.serde.deserialize
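For illustration, a minimal sketch of setting those three controls in the 
precedence order above (the values are arbitrary examples, not recommendations):

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

// Illustrative only: the three vectorization controls, in the precedence
// order listed above.
public class VectorizationConfSketch {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    // 1. Checked first: natively vectorized input formats.
    conf.set("hive.vectorized.use.vectorized.input.format", "true");
    // 2. Next: vectorized serde deserialization.
    conf.set("hive.vectorized.use.vector.serde.deserialize", "false");
    // 3. Last: row-serde deserialization, which is subject to the excludes
    //    list discussed above.
    conf.set("hive.vectorized.use.row.serde.deserialize", "false");
    System.out.println(conf.get("hive.vectorized.use.vectorized.input.format"));
  }
}
{code}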



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25115) Compaction queue entries may accumulate in "ready for cleaning" state

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25115?focusedWorklogId=625850&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625850
 ]

ASF GitHub Bot logged work on HIVE-25115:
-

Author: ASF GitHub Bot
Created on: 21/Jul/21 00:08
Start Date: 21/Jul/21 00:08
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2274:
URL: https://github.com/apache/hive/pull/2274#issuecomment-883785153


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625850)
Time Spent: 2h 40m  (was: 2.5h)

> Compaction queue entries may accumulate in "ready for cleaning" state
> -
>
> Key: HIVE-25115
> URL: https://issues.apache.org/jira/browse/HIVE-25115
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> If the Cleaner does not delete any files, the compaction queue entry is 
> thrown back to the queue and remains in "ready for cleaning" state.
> Problem: if 2 compactions run on the same table and enter "ready for 
> cleaning" state at the same time, only one cleaning will remove the obsolete 
> files; the other entry will remain in the queue in "ready for cleaning" state.
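A toy model of the situation described above, for illustration only (this is 
not Hive's actual compaction code; the entry names and queue shape are made up):

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

// Two "ready for cleaning" entries for the same table: the first clean
// removes all obsolete files, so the second never deletes anything and is
// re-queued indefinitely.
public class ReadyForCleaningModel {
  public static void main(String[] args) {
    Deque<String> readyForCleaning = new ArrayDeque<>();
    readyForCleaning.add("compaction-1(tableA)");
    readyForCleaning.add("compaction-2(tableA)");

    int obsoleteFiles = 5;
    for (int round = 0; round < 3 && !readyForCleaning.isEmpty(); round++) {
      String entry = readyForCleaning.poll();
      if (obsoleteFiles > 0) {
        obsoleteFiles = 0;           // first cleaner removes everything
        System.out.println(entry + ": cleaned");
      } else {
        readyForCleaning.add(entry); // nothing deleted -> re-queued
        System.out.println(entry + ": re-queued, stuck in ready-for-cleaning");
      }
    }
  }
}
{code}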



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25134) NPE in TestHiveCli.java

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25134?focusedWorklogId=625851&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625851
 ]

ASF GitHub Bot logged work on HIVE-25134:
-

Author: ASF GitHub Bot
Created on: 21/Jul/21 00:08
Start Date: 21/Jul/21 00:08
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2307:
URL: https://github.com/apache/hive/pull/2307#issuecomment-883785124


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625851)
Time Spent: 40m  (was: 0.5h)

> NPE in TestHiveCli.java
> ---
>
> Key: HIVE-25134
> URL: https://issues.apache.org/jira/browse/HIVE-25134
> Project: Hive
>  Issue Type: Test
>  Components: Beeline, Test
>Reporter: gaozhan ding
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code:java}
> @Before
> public void setup() throws IOException, URISyntaxException {
>   System.setProperty("datanucleus.schema.autoCreateAll", "true");
>   cli = new HiveCli();
>   initFromFile();
>   redirectOutputStream();
> }
> {code}
> In *setup()*, *initFromFile()* may access *err* before initialization
>  
>  
> {code:java}
> [ERROR] org.apache.hive.beeline.cli.TestHiveCli.testSetPromptValue  Time 
> elapsed: 1.167 s  <<< ERROR!
> java.lang.NullPointerException
> at 
> org.apache.hive.beeline.cli.TestHiveCli.executeCMD(TestHiveCli.java:249)
> at 
> org.apache.hive.beeline.cli.TestHiveCli.initFromFile(TestHiveCli.java:315)
> at org.apache.hive.beeline.cli.TestHiveCli.setup(TestHiveCli.java:288)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> at 
> org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> {code}
>  
>  
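One possible fix, sketched below under the assumption that 
redirectOutputStream() is what initializes the err stream that executeCMD() 
writes to: call it before initFromFile(). This mirrors the quoted setup() 
above with the two calls reordered.

{code:java}
@Before
public void setup() throws IOException, URISyntaxException {
  System.setProperty("datanucleus.schema.autoCreateAll", "true");
  cli = new HiveCli();
  redirectOutputStream(); // initialize the output/err streams first
  initFromFile();         // runs executeCMD(), which writes to err
}
{code}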



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25281) Add optional fields to enable returning filemetadata for tables and partitions

2021-07-20 Thread Vihang Karajgaonkar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar resolved HIVE-25281.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Add optional fields to enable returning filemetadata for tables and partitions
> --
>
> Key: HIVE-25281
> URL: https://issues.apache.org/jira/browse/HIVE-25281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The hive_metastore.thrift interface defines the fields for Table and 
> Partition objects. Certain SQL engines, such as Impala, use Table and 
> Partition from the HMS and then augment them with additional metadata useful 
> for the engine itself, e.g. file metadata. It would be good to add support 
> for such fields in the thrift definition itself. These fields will be 
> optional for now, so that HMS itself doesn't need to support them yet, but 
> this can be supported in the future depending on which SQL engine is talking 
> to HMS.
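As an illustration of why optional fields are backward-compatible, a hedged 
Java sketch assuming the standard Thrift-generated accessors for the proposed 
fileMetadata field (the field name comes from the pull request; the guard 
pattern is generic Thrift codegen, not confirmed API):

{code:java}
import org.apache.hadoop.hive.metastore.api.Table;

// Sketch only: Thrift generates is-set guards for optional fields, so clients
// and HMS versions that never populate fileMetadata are unaffected.
public class FileMetadataSketch {
  static void useFileMetadataIfPresent(Table table) {
    if (table.isSetFileMetadata()) {        // false unless HMS filled it in
      Object fm = table.getFileMetadata();  // engine-specific use goes here
      System.out.println("got file metadata: " + fm);
    }
  }
}
{code}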



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25281) Add optional fields to enable returning filemetadata for tables and partitions

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25281?focusedWorklogId=625843&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625843
 ]

ASF GitHub Bot logged work on HIVE-25281:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 23:20
Start Date: 20/Jul/21 23:20
Worklog Time Spent: 10m 
  Work Description: vihangk1 merged pull request #2425:
URL: https://github.com/apache/hive/pull/2425


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625843)
Time Spent: 40m  (was: 0.5h)

> Add optional fields to enable returning filemetadata for tables and partitions
> --
>
> Key: HIVE-25281
> URL: https://issues.apache.org/jira/browse/HIVE-25281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The hive_metastore.thrift interface defines the fields for Table and 
> Partition objects. Certain SQL engines, such as Impala, use Table and 
> Partition from the HMS and then augment them with additional metadata useful 
> for the engine itself, e.g. file metadata. It would be good to add support 
> for such fields in the thrift definition itself. These fields will be 
> optional for now, so that HMS itself doesn't need to support them yet, but 
> this can be supported in the future depending on which SQL engine is talking 
> to HMS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25281) Add optional fields to enable returning filemetadata for tables and partitions

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25281?focusedWorklogId=625841&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625841
 ]

ASF GitHub Bot logged work on HIVE-25281:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 23:13
Start Date: 20/Jul/21 23:13
Worklog Time Spent: 10m 
  Work Description: vihangk1 commented on pull request #2425:
URL: https://github.com/apache/hive/pull/2425#issuecomment-883765338


   There are no code changes in the latest commit (only some comment 
realignments). I am going to merge the patch without waiting for the precommit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625841)
Time Spent: 0.5h  (was: 20m)

> Add optional fields to enable returning filemetadata for tables and partitions
> --
>
> Key: HIVE-25281
> URL: https://issues.apache.org/jira/browse/HIVE-25281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The hive_metastore.thrift interface defines the fields for Table and 
> Partition objects. Certain SQL engines, such as Impala, use Table and 
> Partition from the HMS and then augment them with additional metadata useful 
> for the engine itself, e.g. file metadata. It would be good to add support 
> for such fields in the thrift definition itself. These fields will be 
> optional for now, so that HMS itself doesn't need to support them yet, but 
> this can be supported in the future depending on which SQL engine is talking 
> to HMS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25358) Remove reviewer pattern

2021-07-20 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-25358:
--

Assignee: Jesus Camacho Rodriguez  (was: Panagiotis Garefalakis)

> Remove reviewer pattern
> ---
>
> Key: HIVE-25358
> URL: https://issues.apache.org/jira/browse/HIVE-25358
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25358) Remove reviewer pattern

2021-07-20 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-25358:
---
Reporter: Jesus Camacho Rodriguez  (was: Panagiotis Garefalakis)

> Remove reviewer pattern
> ---
>
> Key: HIVE-25358
> URL: https://issues.apache.org/jira/browse/HIVE-25358
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25358) Remove reviewer pattern

2021-07-20 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-25358:
--


> Remove reviewer pattern
> ---
>
> Key: HIVE-25358
> URL: https://issues.apache.org/jira/browse/HIVE-25358
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25351) stddev(), stddev_pop() with CBO enabled returning null

2021-07-20 Thread Pritha Dawn (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritha Dawn reassigned HIVE-25351:
--

Assignee: Pritha Dawn  (was: Pritha Dawn)

> stddev(), stddev_pop() with CBO enabled returning null
> --
>
> Key: HIVE-25351
> URL: https://issues.apache.org/jira/browse/HIVE-25351
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Pritha Dawn
>Priority: Blocker
>
> *script used to repro*
> create table cbo_test (key string, v1 double, v2 decimal(30,2), v3 
> decimal(30,2));
> insert into cbo_test values ("00140006375905", 10230.72, 
> 10230.72, 10230.69), ("00140006375905", 10230.72, 10230.72, 
> 10230.69), ("00140006375905", 10230.72, 10230.72, 10230.69), 
> ("00140006375905", 10230.72, 10230.72, 10230.69), 
> ("00140006375905", 10230.72, 10230.72, 10230.69), 
> ("00140006375905", 10230.72, 10230.72, 10230.69);
> select stddev(v1), stddev(v2), stddev(v3) from cbo_test;
> *Enable CBO*
> ++
> |  Explain   |
> ++
> | Plan optimized by CBO. |
> ||
> | Vertex dependency in root stage|
> | Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)|
> ||
> | Stage-0|
> |   Fetch Operator   |
> | limit:-1   |
> | Stage-1|
> |   Reducer 2 vectorized |
> |   File Output Operator [FS_13] |
> | Select Operator [SEL_12] (rows=1 width=24) |
> |   Output:["_col0","_col1","_col2"] |
> |   Group By Operator [GBY_11] (rows=1 width=72) |
> | 
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"],aggregations:["sum(VALUE._col0)","sum(VALUE._col1)","count(VALUE._col2)","sum(VALUE._col3)","sum(VALUE._col4)","count(VALUE._col5)","sum(VALUE._col6)","sum(VALUE._col7)","count(VALUE._col8)"]
>  |
> |   <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized  |
> | PARTITION_ONLY_SHUFFLE [RS_10] |
> |   Group By Operator [GBY_9] (rows=1 width=72) |
> | 
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"],aggregations:["sum(_col3)","sum(_col0)","count(_col0)","sum(_col5)","sum(_col4)","count(_col1)","sum(_col7)","sum(_col6)","count(_col2)"]
>  |
> | Select Operator [SEL_8] (rows=6 width=232) |
> |   
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7"] |
> |   TableScan [TS_0] (rows=6 width=232) |
> | default@cbo_test,cbo_test, ACID 
> table,Tbl:COMPLETE,Col:COMPLETE,Output:["v1","v2","v3"] |
> ||
> ++
> *Query Result* 
> _c0   _c1 _c2
> 0.0   NaN NaN
> *Disable CBO*
> ++
> |  Explain   |
> ++
> | Vertex dependency in root stage|
> | Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)|
> ||
> | Stage-0|
> |   Fetch Operator   |
> | limit:-1   |
> | Stage-1|
> |   Reducer 2 vectorized |
> |   File Output Operator [FS_11] |
> | Group By Operator [GBY_10] (rows=1 width=24) |
> |   
> Output:["_col0","_col1","_col2"],aggregations:["stddev(VALUE._col0)","stddev(VALUE._col1)","stddev(VALUE._col2)"]
>  |
> | <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized|
> |   PARTITION_ONLY_SHUFFLE [RS_9]|
> | Group By Operator [GBY_8] (rows=1 width=240) |
> |   
> Output:["_col0","_col1","_col2"],aggregations:["stddev(v1)","stddev(v2)","stddev(v3)"]
>  |
> |   Select Operator [SEL_7] (rows=6 width=232) |
> | Output:["v1","v2","v3"]|
> | TableScan [TS_0] (rows=6 width=232) |
> |   default@cbo_test,cbo_test, ACID 
> table,Tbl:COMPLETE,Col:COMPLETE,Output:["v1","v2","v3"] |
> |
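The plans above show that with CBO enabled the stddev calls are rewritten into 
sum/sum-of-squares/count aggregates. As a hedged illustration of the likely 
numeric failure mode (our inference, not a diagnosis confirmed in the ticket), 
the single-pass variance formula cancels catastrophically for large, 
near-constant values:

{code:java}
// Illustration only: after a sum/sum-of-squares rewrite, variance is computed
// in a single pass. For a constant column the true variance is 0, but the
// subtraction below may leave a tiny rounding residue; whenever that residue
// is negative, Math.sqrt returns NaN.
public class StddevCancellationSketch {
  public static void main(String[] args) {
    double v = 10230.72;
    int n = 6;
    double sum = 0, sumSq = 0;
    for (int i = 0; i < n; i++) { sum += v; sumSq += v * v; }
    double variance = (sumSq - sum * sum / n) / n; // mathematically 0 here
    System.out.println(variance);            // tiny residue, sign depends on rounding
    System.out.println(Math.sqrt(variance)); // NaN if the residue is negative
  }
}
{code}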

[jira] [Work logged] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?focusedWorklogId=625786&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625786
 ]

ASF GitHub Bot logged work on HIVE-25357:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 20:13
Start Date: 20/Jul/21 20:13
Worklog Time Spent: 10m 
  Work Description: kuczoram commented on pull request #2502:
URL: https://github.com/apache/hive/pull/2502#issuecomment-883664458


   Closing this PR, as the problematic commit got reverted, so we will have time 
to fix the tests properly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625786)
Time Spent: 1h 20m  (was: 1h 10m)

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?focusedWorklogId=625787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625787
 ]

ASF GitHub Bot logged work on HIVE-25357:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 20:13
Start Date: 20/Jul/21 20:13
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on pull request #2502:
URL: https://github.com/apache/hive/pull/2502#issuecomment-883664839


   Looking at the history of HiveIcebergMetaHook changes: 
https://github.com/apache/hive/commits/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
   It's almost certainly a conflict with this: 
https://issues.apache.org/jira/browse/HIVE-25256
   My patch got merged yesterday (changing the metahook), the stats generation 
changed the metahook too but its green run was 5 days old, then it was merged 
too -> BOOM!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625787)
Time Spent: 1.5h  (was: 1h 20m)

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?focusedWorklogId=625785&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625785
 ]

ASF GitHub Bot logged work on HIVE-25357:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 20:12
Start Date: 20/Jul/21 20:12
Worklog Time Spent: 10m 
  Work Description: kuczoram closed pull request #2502:
URL: https://github.com/apache/hive/pull/2502


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625785)
Time Spent: 1h 10m  (was: 1h)

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?focusedWorklogId=625784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625784
 ]

ASF GitHub Bot logged work on HIVE-25276:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 20:11
Start Date: 20/Jul/21 20:11
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #2505:
URL: https://github.com/apache/hive/pull/2505#issuecomment-883663529


   Thanks @kuczoram for noticing and taking care of this!!! 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625784)
Time Spent: 5h 40m  (was: 5.5h)

> Enable automatic statistics generation for Iceberg tables
> -
>
> Key: HIVE-25276
> URL: https://issues.apache.org/jira/browse/HIVE-25276
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> During inserts we should calculate the column statistics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?focusedWorklogId=625783&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625783
 ]

ASF GitHub Bot logged work on HIVE-25276:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 20:10
Start Date: 20/Jul/21 20:10
Worklog Time Spent: 10m 
  Work Description: pvary merged pull request #2505:
URL: https://github.com/apache/hive/pull/2505


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625783)
Time Spent: 5.5h  (was: 5h 20m)

> Enable automatic statistics generation for Iceberg tables
> -
>
> Key: HIVE-25276
> URL: https://issues.apache.org/jira/browse/HIVE-25276
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> During inserts we should calculate the column statistics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?focusedWorklogId=625782&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625782
 ]

ASF GitHub Bot logged work on HIVE-25357:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 20:08
Start Date: 20/Jul/21 20:08
Worklog Time Spent: 10m 
  Work Description: kuczoram commented on pull request #2502:
URL: https://github.com/apache/hive/pull/2502#issuecomment-883661890


   > > Are you ok to revert it?
   > 
   > Sure. Revert it, and later we will try to find out what caused the 
conflict. The change had a green run before the merge, so there should be some 
other conflicting change, and the 2 of them are causing the issues together.
   > 
   > But if we know the last one we definitely want to revert it
   
   Ok, created a PR for the revert:
   https://github.com/apache/hive/pull/2505
   
   Yes, most probably there is some interference with a previous PR. We can 
find out which one and fix it later.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625782)
Time Spent: 1h  (was: 50m)

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?focusedWorklogId=625780&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625780
 ]

ASF GitHub Bot logged work on HIVE-25276:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 20:06
Start Date: 20/Jul/21 20:06
Worklog Time Spent: 10m 
  Work Description: kuczoram opened a new pull request #2505:
URL: https://github.com/apache/hive/pull/2505


   Reverts apache/hive#2419
   Unfortunately this commit broke the build because of a checkstyle issue:
   
http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/771/pipeline/
   It most probably caused some additional test failures as well.
   Reverting this commit until we find out exactly what caused the issue, as 
the PR for this commit had green runs before.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625780)
Time Spent: 5h 20m  (was: 5h 10m)

> Enable automatic statistics generation for Iceberg tables
> -
>
> Key: HIVE-25276
> URL: https://issues.apache.org/jira/browse/HIVE-25276
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> During inserts we should calculate the column statistics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?focusedWorklogId=625777&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625777
 ]

ASF GitHub Bot logged work on HIVE-25357:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 20:01
Start Date: 20/Jul/21 20:01
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #2502:
URL: https://github.com/apache/hive/pull/2502#issuecomment-883657350


   > Are you ok to revert it?
   
   Sure. Revert it, and later we will try to find out what caused the conflict. 
The change had a green run before the merge, so there should be some other 
conflicting change, and the 2 of them are causing the issues together. 
   
   But if we know the last one we definitely want to revert it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625777)
Time Spent: 50m  (was: 40m)

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?focusedWorklogId=625776&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625776
 ]

ASF GitHub Bot logged work on HIVE-25357:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 19:58
Start Date: 20/Jul/21 19:58
Worklog Time Spent: 10m 
  Work Description: kuczoram edited a comment on pull request #2502:
URL: https://github.com/apache/hive/pull/2502#issuecomment-883654124


   > Do we know what caused the issue? Can we just revert the change?
   
   Most probably this one: https://issues.apache.org/jira/browse/HIVE-25276
   The build failure is definitely because of this patch, as the build failed 
on that commit:
   
http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/771/pipeline/
   Are you ok to revert it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625776)
Time Spent: 40m  (was: 0.5h)

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?focusedWorklogId=625775&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625775
 ]

ASF GitHub Bot logged work on HIVE-25357:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 19:55
Start Date: 20/Jul/21 19:55
Worklog Time Spent: 10m 
  Work Description: kuczoram commented on pull request #2502:
URL: https://github.com/apache/hive/pull/2502#issuecomment-883654124


   > Do we know what caused the issue? Can we just revert the change?
   
   Most probably this one: https://issues.apache.org/jira/browse/HIVE-25276
   The build failure is definitely because of this patch, as the build failed 
on that commit. The test failures are also very probably related.
   Are you ok to revert it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625775)
Time Spent: 0.5h  (was: 20m)

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2021-07-20 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384456#comment-17384456
 ] 

Peter Vary commented on HIVE-25357:
---

Can we just revert the offending commit? 

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?focusedWorklogId=625773&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625773
 ]

ASF GitHub Bot logged work on HIVE-25357:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 19:51
Start Date: 20/Jul/21 19:51
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #2502:
URL: https://github.com/apache/hive/pull/2502#issuecomment-883651739


   Do we know what caused the issue? Can we just revert the change? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625773)
Time Spent: 20m  (was: 10m)

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25281) Add optional fields to enable returning filemetadata for tables and partitions

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25281?focusedWorklogId=625763&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625763
 ]

ASF GitHub Bot logged work on HIVE-25281:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 19:25
Start Date: 20/Jul/21 19:25
Worklog Time Spent: 10m 
  Work Description: kishendas commented on a change in pull request #2425:
URL: https://github.com/apache/hive/pull/2425#discussion_r673413945



##
File path: 
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##
@@ -610,6 +637,8 @@ struct Partition {
   10: optional i64 writeId=-1,
   11: optional bool isStatsCompliant,
   12: optional ColumnStatistics colStats // column statistics for partition

Review comment:
   Comma is missing after colStats. Does this still work fine when you 
generate the code?

##
File path: 
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##
@@ -610,6 +637,8 @@ struct Partition {
   10: optional i64 writeId=-1,
   11: optional bool isStatsCompliant,
   12: optional ColumnStatistics colStats // column statistics for partition
+  13: optional FileMetadata fileMetadata  // optional serialized file-metadata 
useful
+// for certain execution engines

Review comment:
   Please align the comment

##
File path: 
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##
@@ -595,6 +619,9 @@ struct Table {
   24: optional list requiredWriteCapabilities
   25: optional i64 id, // id of the table. It will be ignored 
if set. It's only for
 // read purposed
+  26: optional FileMetadata fileMetadata // optional serialized file-metadata 
for this table
+  // for certain execution engines

Review comment:
   Do you want to align this comment with the previous line, so that it's easy 
to read?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625763)
Time Spent: 20m  (was: 10m)

> Add optional fields to enable returning filemetadata for tables and partitions
> --
>
> Key: HIVE-25281
> URL: https://issues.apache.org/jira/browse/HIVE-25281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The hive_metastore.thrift interface defines the fields for Table and 
> Partition objects. Certain SQL engines like Impala use Table and Partition 
> from the HMS and then augment it to include additional metadata useful for 
> the engine itself e.g file metadata. It would be good to add support for such 
> fields in the thrift definition itself. These fields currently will be 
> optional fields so that HMS itself doesn't really need to support it for now, 
> but this can be supported in future depending on which SQL engine is talking 
> to HMS.
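
Because the new fields are optional, standard Thrift codegen gives callers isSet checks, so engines that never populate them see no change. A hedged sketch of the consuming side (the field name follows the proposed patch; the client handle and flow are assumptions):

{code:java}
// Hedged sketch of consuming the proposed optional field. Assumes standard
// Thrift codegen for "optional FileMetadata fileMetadata" on Table, which
// yields isSetFileMetadata()/getFileMetadata(); "client" is a hypothetical
// IMetaStoreClient-style handle.
Table table = client.getTable("db", "tbl");
if (table.isSetFileMetadata()) {
  FileMetadata fm = table.getFileMetadata();
  // engine-specific use of the serialized file metadata goes here
} else {
  // HMS did not populate the field -- fall back to listing files directly
}
{code}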



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2021-07-20 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25357:
-
Summary: Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg 
test issues to unblock the pre-commit tests  (was: Fix the checkstyle issue in 
HiveIcebergMetaHook an ]which breaks the build)

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook an ]which breaks the build

2021-07-20 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25357:
-
Summary: Fix the checkstyle issue in HiveIcebergMetaHook an ]which breaks 
the build  (was: Fix the checkstyle issue in HiveIcebergMetaHook which breaks 
the build)

> Fix the checkstyle issue in HiveIcebergMetaHook an ]which breaks the build
> --
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25351) stddev(), stddev_pop() with CBO enabled returning null

2021-07-20 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-25351:
---

Assignee: Pritha Dawn  (was: Ashish Sharma)

> stddev(), stddev_pop() with CBO enabled returning null
> --
>
> Key: HIVE-25351
> URL: https://issues.apache.org/jira/browse/HIVE-25351
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Pritha Dawn
>Priority: Blocker
>
> *script used to repro*
> create table cbo_test (key string, v1 double, v2 decimal(30,2), v3 
> decimal(30,2));
> insert into cbo_test values ("00140006375905", 10230.72, 
> 10230.72, 10230.69), ("00140006375905", 10230.72, 10230.72, 
> 10230.69), ("00140006375905", 10230.72, 10230.72, 10230.69), 
> ("00140006375905", 10230.72, 10230.72, 10230.69), 
> ("00140006375905", 10230.72, 10230.72, 10230.69), 
> ("00140006375905", 10230.72, 10230.72, 10230.69);
> select stddev(v1), stddev(v2), stddev(v3) from cbo_test;
> *Enable CBO*
> ++
> |  Explain   |
> ++
> | Plan optimized by CBO. |
> ||
> | Vertex dependency in root stage|
> | Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)|
> ||
> | Stage-0|
> |   Fetch Operator   |
> | limit:-1   |
> | Stage-1|
> |   Reducer 2 vectorized |
> |   File Output Operator [FS_13] |
> | Select Operator [SEL_12] (rows=1 width=24) |
> |   Output:["_col0","_col1","_col2"] |
> |   Group By Operator [GBY_11] (rows=1 width=72) |
> | 
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"],aggregations:["sum(VALUE._col0)","sum(VALUE._col1)","count(VALUE._col2)","sum(VALUE._col3)","sum(VALUE._col4)","count(VALUE._col5)","sum(VALUE._col6)","sum(VALUE._col7)","count(VALUE._col8)"]
>  |
> |   <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized  |
> | PARTITION_ONLY_SHUFFLE [RS_10] |
> |   Group By Operator [GBY_9] (rows=1 width=72) |
> | 
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"],aggregations:["sum(_col3)","sum(_col0)","count(_col0)","sum(_col5)","sum(_col4)","count(_col1)","sum(_col7)","sum(_col6)","count(_col2)"]
>  |
> | Select Operator [SEL_8] (rows=6 width=232) |
> |   
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7"] |
> |   TableScan [TS_0] (rows=6 width=232) |
> | default@cbo_test,cbo_test, ACID 
> table,Tbl:COMPLETE,Col:COMPLETE,Output:["v1","v2","v3"] |
> ||
> ++
> *Query Result* 
> _c0   _c1 _c2
> 0.0   NaN NaN
> *Disable CBO*
> ++
> |  Explain   |
> ++
> | Vertex dependency in root stage|
> | Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)|
> ||
> | Stage-0|
> |   Fetch Operator   |
> | limit:-1   |
> | Stage-1|
> |   Reducer 2 vectorized |
> |   File Output Operator [FS_11] |
> | Group By Operator [GBY_10] (rows=1 width=24) |
> |   
> Output:["_col0","_col1","_col2"],aggregations:["stddev(VALUE._col0)","stddev(VALUE._col1)","stddev(VALUE._col2)"]
>  |
> | <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized|
> |   PARTITION_ONLY_SHUFFLE [RS_9]|
> | Group By Operator [GBY_8] (rows=1 width=240) |
> |   
> Output:["_col0","_col1","_col2"],aggregations:["stddev(v1)","stddev(v2)","stddev(v3)"]
>  |
> |   Select Operator [SEL_7] (rows=6 width=232) |
> | Output:["v1","v2","v3"]|
> | TableScan [TS_0] (rows=6 width=232) |
> |   default@cbo_test,cbo_test, ACID 
> table,Tbl:COMPLETE,Col:COMPLETE,Output:["v1","v2","v3"] |
> |

[jira] [Assigned] (HIVE-24201) WorkloadManager can support delayed move if destination pool does not have enough sessions

2021-07-20 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-24201:
---

Assignee: Pritha Dawn  (was: Pritha Dawn)

> WorkloadManager can support delayed move if destination pool does not have 
> enough sessions
> --
>
> Key: HIVE-24201
> URL: https://issues.apache.org/jira/browse/HIVE-24201
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, llap
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Pritha Dawn
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: DelayedMoveDesign.pdf
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> To reproduce, create a resource plan with a move trigger, like the one below:
> {code:java}
> ++
> |line|
> ++
> | experiment[status=DISABLED,parallelism=null,defaultPool=default] |
> |  +  default[allocFraction=0.888,schedulingPolicy=null,parallelism=1] |
> |  |  mapped for default |
> |  +  pool2[allocFraction=0.1,schedulingPolicy=fair,parallelism=1] |
> |  |  trigger t1: if (ELAPSED_TIME > 20) { MOVE TO pool1 } |
> |  |  mapped for users: abcd   |
> |  +  pool1[allocFraction=0.012,schedulingPolicy=null,parallelism=1] |
> |  |  mapped for users: efgh   |
>  
> {code}
> Now, run two queries in pool1 and pool2 using different users. The query 
> running in pool2 will try to move to pool1 and will get killed because 
> pool1 will not have a session to handle the query.
> Currently, the workload management move trigger kills the query being moved 
> to a different pool if the destination pool does not have enough capacity. We 
> could have a "delayed move" configuration which lets the query run in the 
> source pool as long as possible while the destination pool is full. It would 
> attempt the move to the destination pool only when there is a claim upon the 
> source pool or capacity becomes available in the destination pool. If the 
> destination pool is not full, a delayed move behaves as a normal move, i.e. 
> the move happens immediately.
>  
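
The proposal boils down to a small decision function: move immediately when the destination has capacity, defer while the source pool is uncontended, and only fall back to the old behavior when neither holds. A simplified sketch of that logic (all names hypothetical; the real code would live in WorkloadManager):

{code:java}
// Simplified decision logic for "delayed move" (all names hypothetical;
// the real implementation would live inside WorkloadManager).
enum MoveAction { MOVE_NOW, DELAY, KILL }

static MoveAction onMoveTrigger(boolean destHasCapacity,
                                boolean sourceHasClaim,
                                boolean delayedMoveEnabled) {
  if (destHasCapacity) {
    return MoveAction.MOVE_NOW;      // normal move, unchanged behavior
  }
  if (delayedMoveEnabled && !sourceHasClaim) {
    return MoveAction.DELAY;         // keep running in the source pool
  }
  // Destination is full and the source session is claimed: attempt the
  // move anyway if delayed move is on; without it, this is the point
  // where the query gets killed today.
  return delayedMoveEnabled ? MoveAction.MOVE_NOW : MoveAction.KILL;
}
{code}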



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-20071) Migrate to jackson 2.x and prevent usage

2021-07-20 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-20071.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

merged into master. Thank you Krisztian for reviewing the changes!

> Migrate to jackson 2.x and prevent usage
> 
>
> Key: HIVE-20071
> URL: https://issues.apache.org/jira/browse/HIVE-20071
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: restrict_usage.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> there are still some places where jackson 1.x is being used, even though it is 
> not even referenced from hive's pom.xml files
> {code}
> git grep -E 'import org.codehaus.jackson'|wc -l
> 106
> {code}
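
The migration itself is largely mechanical, since the org.codehaus.jackson packages map onto com.fasterxml.jackson equivalents; the part this ticket adds is preventing new 1.x usage from creeping back in. A minimal example of the package swap (both APIs shown exist as written):

{code:java}
// Jackson 1.x (the usage this ticket removes):
//   import org.codehaus.jackson.map.ObjectMapper;
// Jackson 2.x equivalent:
import com.fasterxml.jackson.databind.ObjectMapper;

public class JacksonExample {
  public static void main(String[] args) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    String json = mapper.writeValueAsString(
        java.util.Collections.singletonMap("k", "v"));
    System.out.println(json);  // prints {"k":"v"}
  }
}
{code}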



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-20071) Migrate to jackson 2.x and prevent usage

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20071?focusedWorklogId=625697&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625697
 ]

ASF GitHub Bot logged work on HIVE-20071:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 17:17
Start Date: 20/Jul/21 17:17
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #2464:
URL: https://github.com/apache/hive/pull/2464


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625697)
Time Spent: 20m  (was: 10m)

> Migrate to jackson 2.x and prevent usage
> 
>
> Key: HIVE-20071
> URL: https://issues.apache.org/jira/browse/HIVE-20071
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: restrict_usage.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> there are still some places where jackson 1.x is being used, even though it is 
> not even referenced from hive's pom.xml files
> {code}
> git grep -E 'import org.codehaus.jackson'|wc -l
> 106
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25356) JDBCSplitFilterAboveJoinRule's onMatch method throws exception

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25356?focusedWorklogId=625663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625663
 ]

ASF GitHub Bot logged work on HIVE-25356:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 16:16
Start Date: 20/Jul/21 16:16
Worklog Time Spent: 10m 
  Work Description: soumyakanti3578 opened a new pull request #2504:
URL: https://github.com/apache/hive/pull/2504


   JDBCSplitFilterAboveJoinRule's onMatch method throws an exception because the 
wrong rel is assigned to the HiveJdbcConverter conv


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625663)
Remaining Estimate: 0h
Time Spent: 10m

> JDBCSplitFilterAboveJoinRule's onMatch method throws exception 
> ---
>
> Key: HIVE-25356
> URL: https://issues.apache.org/jira/browse/HIVE-25356
> Project: Hive
>  Issue Type: Bug
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{In the line below, call.rel(0) doesn't return a HiveJdbcConverter.}}
> final HiveJdbcConverter conv = call.rel(0);
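
In a Calcite rule, call.rel(i) returns the i-th node of the matched operand tree in the order the operands were declared, so an index that does not line up with the declaration produces a ClassCastException at runtime. A minimal sketch of the pattern (this mirrors the public Calcite API, not Hive's exact rule definition):

{code:java}
// Sketch of a Calcite rule where operand indexes must line up with the
// declared operand tree (illustrative, not Hive's actual rule).
import org.apache.calcite.plan.RelOptRule;
import org.apache.calcite.plan.RelOptRuleCall;
import org.apache.calcite.rel.core.Filter;
import org.apache.calcite.rel.core.Join;

public class SplitFilterRuleSketch extends RelOptRule {
  public SplitFilterRuleSketch() {
    // Operand tree: Filter (index 0) over Join (index 1).
    super(operand(Filter.class, operand(Join.class, any())));
  }

  @Override public void onMatch(RelOptRuleCall call) {
    Filter filter = call.rel(0);  // matches the outer operand
    Join join = call.rel(1);      // call.rel(0) here would throw a CCE
    // ... rule logic ...
  }
}
{code}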



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25356) JDBCSplitFilterAboveJoinRule's onMatch method throws exception

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25356:
--
Labels: pull-request-available  (was: )

> JDBCSplitFilterAboveJoinRule's onMatch method throws exception 
> ---
>
> Key: HIVE-25356
> URL: https://issues.apache.org/jira/browse/HIVE-25356
> Project: Hive
>  Issue Type: Bug
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{In the line below, call.rel(0) doesn't return a HiveJdbcConverter.}}
> final HiveJdbcConverter conv = call.rel(0);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25356) JDBCSplitFilterAboveJoinRule's onMatch method throws exception

2021-07-20 Thread Soumyakanti Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumyakanti Das updated HIVE-25356:
---
Summary: JDBCSplitFilterAboveJoinRule's onMatch method throws exception   
(was: JDBCSplitFilterAboveJoinRule's onMatch method throws exception because 
wrong rel is assigned to HiveJdbcConverter conv)

> JDBCSplitFilterAboveJoinRule's onMatch method throws exception 
> ---
>
> Key: HIVE-25356
> URL: https://issues.apache.org/jira/browse/HIVE-25356
> Project: Hive
>  Issue Type: Bug
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>
> {{In the line below, call.rel(0) doesn't return a HiveJdbcConverter.}}
> final HiveJdbcConverter conv = call.rel(0);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=625659&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625659
 ]

ASF GitHub Bot logged work on HIVE-25345:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 16:03
Start Date: 20/Jul/21 16:03
Worklog Time Spent: 10m 
  Work Description: lcspinter commented on a change in pull request #2493:
URL: https://github.com/apache/hive/pull/2493#discussion_r673259608



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/metrics/AcidMetricService.java
##
@@ -85,36 +85,113 @@ public void run() {
 
   private void collectMetrics() throws MetaException {
 ShowCompactResponse currentCompactions = txnHandler.showCompact(new 
ShowCompactRequest());
-updateMetricsFromShowCompact(currentCompactions);
+updateMetricsFromShowCompact(currentCompactions, conf);
 updateDBMetrics();
   }
 
   private void updateDBMetrics() throws MetaException {
 MetricsInfo metrics = txnHandler.getMetricsInfo();
 
Metrics.getOrCreateGauge(NUM_TXN_TO_WRITEID).set(metrics.getTxnToWriteIdCount());
+if (metrics.getTxnToWriteIdCount() >=
+MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.COMPACTOR_TXN_TO_WRITEID_RECORD_THRESHOLD_WARNING) &&
+metrics.getTxnToWriteIdCount() <
+MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.COMPACTOR_TXN_TO_WRITEID_RECORD_THRESHOLD_ERROR)) {
+  LOG.warn("An excessive amount of (" + metrics.getTxnToWriteIdCount() + 
") Hive ACID metadata found in " +
+  "TXN_TO_WRITEID table, which can cause serious performance 
degradation.");
+} else if (metrics.getTxnToWriteIdCount() >=
+MetastoreConf.getIntVar(conf, 
MetastoreConf.ConfVars.COMPACTOR_TXN_TO_WRITEID_RECORD_THRESHOLD_ERROR)) {
+  LOG.error("An excessive amount of (" + metrics.getTxnToWriteIdCount() + 
") Hive ACID metadata found in " +
+  "TXN_TO_WRITEID table, which can cause serious performance 
degradation.");
+}
 
Metrics.getOrCreateGauge(NUM_COMPLETED_TXN_COMPONENTS).set(metrics.getCompletedTxnsCount());
-
+if (metrics.getCompletedTxnsCount() >=
+MetastoreConf.getIntVar(conf,
+
MetastoreConf.ConfVars.COMPACTOR_COMPLETED_TXN_COMPONENTS_RECORD_THRESHOLD_WARNING)
 &&
+metrics.getCompletedTxnsCount() <
+MetastoreConf.getIntVar(conf,
+
MetastoreConf.ConfVars.COMPACTOR_COMPLETED_TXN_COMPONENTS_RECORD_THRESHOLD_ERROR))
 {
+  LOG.warn("An excessive amount of (" + metrics.getCompletedTxnsCount() + 
") Hive ACID metadata found in " +
+  "COMPLETED_TXN_COMPONENTS table, which can cause serious performance 
degradation.");
+} else if (metrics.getCompletedTxnsCount() >= MetastoreConf.getIntVar(conf,
+
MetastoreConf.ConfVars.COMPACTOR_COMPLETED_TXN_COMPONENTS_RECORD_THRESHOLD_ERROR))
 {
+  LOG.error("An excessive amount of (" + metrics.getCompletedTxnsCount() + 
") Hive ACID metadata found in " +
+  "COMPLETED_TXN_COMPONENTS table, which can cause serious performance 
degradation.");
+}
 
Metrics.getOrCreateGauge(NUM_OPEN_REPL_TXNS).set(metrics.getOpenReplTxnsCount());
 
Metrics.getOrCreateGauge(OLDEST_OPEN_REPL_TXN_ID).set(metrics.getOldestOpenReplTxnId());
 
Metrics.getOrCreateGauge(OLDEST_OPEN_REPL_TXN_AGE).set(metrics.getOldestOpenReplTxnAge());
+if (metrics.getOldestOpenReplTxnAge() >=
+MetastoreConf.getTimeVar(conf, 
MetastoreConf.ConfVars.COMPACTOR_OLDEST_REPLICATION_OPENTXN_THRESHOLD_WARNING,
+TimeUnit.SECONDS) && metrics.getOldestOpenReplTxnAge() <
+MetastoreConf.getTimeVar(conf, 
MetastoreConf.ConfVars.COMPACTOR_OLDEST_REPLICATION_OPENTXN_THRESHOLD_ERROR,
+TimeUnit.SECONDS)) {
+  LOG.warn("A replication transaction has been open for " + 
metrics.getOldestOpenReplTxnAge() + " seconds. " +
+  "Before you abort a transaction that was created by replication, and 
which has been open a long time, " +
+  "make sure that the hive.repl.txn.timeout threshold has expired.");
+} else if (metrics.getOldestOpenReplTxnAge() >=
+MetastoreConf.getTimeVar(conf, 
MetastoreConf.ConfVars.COMPACTOR_OLDEST_REPLICATION_OPENTXN_THRESHOLD_ERROR,
+TimeUnit.SECONDS)) {
+  LOG.error("A replication transaction has been open for " + 
metrics.getOldestOpenReplTxnAge() + " seconds. " +
+  "Before you abort a transaction that was created by replication, and 
which has been open a long time, " +
+  "make sure that the hive.repl.txn.timeout threshold has expired.");
+}
 
Metrics.getOrCreateGauge(NUM_OPEN_NON_REPL_TXNS).set(metrics.getOpenNonReplTxnsCount());
 
Metrics.getOrCreateGauge(OLDEST_OPEN_NON_REPL_TXN_ID).set(metrics.getOldestOpenNonReplTxnId());
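
The warn/error banding repeated throughout the diff follows a single pattern; expressed as a standalone helper it looks like the sketch below (hypothetical shape -- the actual patch inlines each check):

{code:java}
// Hypothetical helper expressing the warn/error threshold pattern used
// repeatedly in the diff (the actual patch inlines these checks).
static void logIfExcessive(long count, long warnAt, long errorAt,
                           String tableName, org.slf4j.Logger log) {
  String msg = "An excessive amount of (" + count + ") Hive ACID metadata "
      + "found in " + tableName + " table, which can cause serious "
      + "performance degradation.";
  if (count >= errorAt) {
    log.error(msg);
  } else if (count >= warnAt) {
    log.warn(msg);
  }
}
{code}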
 

[jira] [Work logged] (HIVE-24235) Drop and recreate table during MR compaction leaves behind base/delta directory

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24235?focusedWorklogId=625654&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625654
 ]

ASF GitHub Bot logged work on HIVE-24235:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 15:57
Start Date: 20/Jul/21 15:57
Worklog Time Spent: 10m 
  Work Description: deniskuzZ opened a new pull request #2503:
URL: https://github.com/apache/hive/pull/2503


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625654)
Time Spent: 40m  (was: 0.5h)

> Drop and recreate table during MR compaction leaves behind base/delta 
> directory
> ---
>
> Key: HIVE-24235
> URL: https://issues.apache.org/jira/browse/HIVE-24235
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If a table is dropped and recreated during MR compaction, the table directory 
> and a base (or delta, if minor compaction) directory could be created, with 
> or without data, while the table "does not exist".
> E.g.
> {code:java}
> create table c (i int) stored as orc tblproperties 
> ("NO_AUTO_COMPACTION"="true", "transactional"="true");
> insert into c values (9);
> insert into c values (9);
> alter table c compact 'major';
> While compaction job is running: {
> drop table c;
> create table c (i int) stored as orc tblproperties 
> ("NO_AUTO_COMPACTION"="true", "transactional"="true");
> }
> {code}
> The table directory should be empty, but table directory could look like this 
> after the job is finished:
> {code:java}
> Oct  6 14:23 c/base_002_v101/._orc_acid_version.crc
> Oct  6 14:23 c/base_002_v101/.bucket_0.crc
> Oct  6 14:23 c/base_002_v101/_orc_acid_version
> Oct  6 14:23 c/base_002_v101/bucket_0
> {code}
> or perhaps just: 
> {code:java}
> Oct  6 14:23 c/base_002_v101/._orc_acid_version.crc
> Oct  6 14:23 c/base_002_v101/_orc_acid_version
> {code}
> Insert another row and you have:
> {code:java}
> Oct  6 14:33 base_002_v101/
> Oct  6 14:33 base_002_v101/._orc_acid_version.crc
> Oct  6 14:33 base_002_v101/.bucket_0.crc
> Oct  6 14:33 base_002_v101/_orc_acid_version
> Oct  6 14:33 base_002_v101/bucket_0
> Oct  6 14:35 delta_001_001_/._orc_acid_version.crc
> Oct  6 14:35 delta_001_001_/.bucket_0_0.crc
> Oct  6 14:35 delta_001_001_/_orc_acid_version
> Oct  6 14:35 delta_001_001_/bucket_0_0
> {code}
> Selecting from the table will result in this error because the highest valid 
> writeId for this table is 1:
> {code:java}
> thrift.ThriftCLIService: Error fetching results: 
> org.apache.hive.service.cli.HiveSQLException: Unable to get the next row set
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:482)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> ...
> Caused by: java.io.IOException: java.lang.RuntimeException: ORC split 
> generation failed with exception: java.io.IOException: Not enough history 
> available for (1,x).  Oldest available base: 
> .../warehouse/b/base_004_v092
> {code}
> Solution: Resolve the table again after compaction is finished; compare the 
> id with the table id from when compaction began. If the ids do not match, 
> abort the compaction's transaction.
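
The fix reduces to a compare-and-abort on the table id. A hedged sketch of the guard (names are assumptions, not the committed compactor code):

{code:java}
// Hypothetical sketch of the proposed guard (names are assumptions, not
// the committed Hive code). The compactor captures the table id before
// the MR job starts and re-checks it once the job finishes.
static boolean shouldCommitCompaction(long tableIdAtStart,
                                      Long tableIdAfterCompaction) {
  if (tableIdAfterCompaction == null
      || tableIdAfterCompaction != tableIdAtStart) {
    // null => table was dropped; different id => dropped and recreated.
    // The produced base/delta belongs to the old incarnation, so the
    // compaction's transaction must be aborted instead of committed.
    return false;
  }
  return true;
}
{code}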



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook which breaks the build

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25357:
--
Labels: pull-request-available  (was: )

> Fix the checkstyle issue in HiveIcebergMetaHook which breaks the build
> --
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook which breaks the build

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?focusedWorklogId=625648&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625648
 ]

ASF GitHub Bot logged work on HIVE-25357:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 15:46
Start Date: 20/Jul/21 15:46
Worklog Time Spent: 10m 
  Work Description: kuczoram opened a new pull request #2502:
URL: https://github.com/apache/hive/pull/2502


   …aks the build
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625648)
Remaining Estimate: 0h
Time Spent: 10m

> Fix the checkstyle issue in HiveIcebergMetaHook which breaks the build
> --
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook which breaks the build

2021-07-20 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora reassigned HIVE-25357:



> Fix the checkstyle issue in HiveIcebergMetaHook which breaks the build
> --
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
> Fix For: 4.0.0
>
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25353) Incremental rebuild of partitioned insert only MV in presence of delete operations

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25353:
--
Labels: pull-request-available  (was: )

> Incremental rebuild of partitioned insert only MV in presence of delete 
> operations
> --
>
> Key: HIVE-25353
> URL: https://issues.apache.org/jira/browse/HIVE-25353
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25353) Incremental rebuild of partitioned insert only MV in presence of delete operations

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25353?focusedWorklogId=625634&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625634
 ]

ASF GitHub Bot logged work on HIVE-25353:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 14:59
Start Date: 20/Jul/21 14:59
Worklog Time Spent: 10m 
  Work Description: kasakrisz opened a new pull request #2501:
URL: https://github.com/apache/hive/pull/2501


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625634)
Remaining Estimate: 0h
Time Spent: 10m

> Incremental rebuild of partitioned insert only MV in presence of delete 
> operations
> --
>
> Key: HIVE-25353
> URL: https://issues.apache.org/jira/browse/HIVE-25353
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25356) JDBCSplitFilterAboveJoinRule's onMatch method throws exception because wrong rel is assigned to HiveJdbcConverter conv

2021-07-20 Thread Soumyakanti Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumyakanti Das reassigned HIVE-25356:
--


> JDBCSplitFilterAboveJoinRule's onMatch method throws exception because wrong 
> rel is assigned to HiveJdbcConverter conv
> --
>
> Key: HIVE-25356
> URL: https://issues.apache.org/jira/browse/HIVE-25356
> Project: Hive
>  Issue Type: Bug
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>
> {{In the line below, call.rel(0) doesn't return a HiveJdbcConverter.}}
> final HiveJdbcConverter conv = call.rel(0);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary resolved HIVE-25276.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Pushed to master.

Thanks for the review [~Marton Bod] and [~szita]!

> Enable automatic statistics generation for Iceberg tables
> -
>
> Key: HIVE-25276
> URL: https://issues.apache.org/jira/browse/HIVE-25276
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> During inserts we should calculate the column statistics as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?focusedWorklogId=625609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625609
 ]

ASF GitHub Bot logged work on HIVE-25276:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 13:52
Start Date: 20/Jul/21 13:52
Worklog Time Spent: 10m 
  Work Description: pvary merged pull request #2419:
URL: https://github.com/apache/hive/pull/2419


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625609)
Time Spent: 5h 10m  (was: 5h)

> Enable automatic statistics generation for Iceberg tables
> -
>
> Key: HIVE-25276
> URL: https://issues.apache.org/jira/browse/HIVE-25276
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> During inserts we should calculate the column statistics as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?focusedWorklogId=625608&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625608
 ]

ASF GitHub Bot logged work on HIVE-25276:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 13:49
Start Date: 20/Jul/21 13:49
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2419:
URL: https://github.com/apache/hive/pull/2419#discussion_r673140199



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -233,15 +237,21 @@ public void 
preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
   preAlterTableProperties.tableLocation = sd.getLocation();
   preAlterTableProperties.format = sd.getInputFormat();
   preAlterTableProperties.schema = schema(catalogProperties, hmsTable);
-  preAlterTableProperties.spec = spec(conf, 
preAlterTableProperties.schema, catalogProperties, hmsTable);
   preAlterTableProperties.partitionKeys = hmsTable.getPartitionKeys();
 
   context.getProperties().put(HiveMetaHook.ALLOW_PARTITION_KEY_CHANGE, 
"true");
   // If there are partition keys specified remove them from the HMS table 
and add them to the column list
-  if (hmsTable.isSetPartitionKeys()) {
+  if (hmsTable.isSetPartitionKeys() && 
!hmsTable.getPartitionKeys().isEmpty()) {
+List spec = 
PartitionTransform.getPartitionTransformSpec(hmsTable.getPartitionKeys());
+if (!SessionStateUtil.addResource(conf, 
hive_metastoreConstants.PARTITION_TRANSFORM_SPEC, spec)) {
+  throw new MetaException("Query state attached to Session state must 
be not null. " +
+  "Partition transform metadata cannot be saved.");
+}
 hmsTable.getSd().getCols().addAll(hmsTable.getPartitionKeys());
 hmsTable.setPartitionKeysIsSet(false);
   }
+  preAlterTableProperties.spec = spec(conf, 
preAlterTableProperties.schema, hmsTable);

Review comment:
   Right, thanks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625608)
Time Spent: 5h  (was: 4h 50m)

> Enable automatic statistics generation for Iceberg tables
> -
>
> Key: HIVE-25276
> URL: https://issues.apache.org/jira/browse/HIVE-25276
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> During inserts we should calculate the column statistics as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25338) AIOBE in conv UDF if input is empty

2021-07-20 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384268#comment-17384268
 ] 

Stamatis Zampetakis commented on HIVE-25338:


[~nareshpr] After having a second look at the PR I have kind of changed my mind. The 
CONV function is mostly known from MySQL, and I think the implementation so far 
tends to stay as close to the MySQL version as possible.

I did a quick test in MySQL (version 8.0) and it returns the following results:

{noformat}
mysql> SELECT CONV('',10,2);
+---+
| CONV('',10,2) |
+---+
| NULL  |
+---+
mysql> SELECT CONV('4?:+',10,2);
+---+
| CONV('4?:+',10,2) |
+---+
| 100   |
+---+
mysql> SELECT CONV('1?:+0',10,2);
++
| CONV('1?:+0',10,2) |
++
| 1  |
++
mysql> SELECT CONV('*10?:+2',10,2);
+--+
| CONV('*10?:+2',10,2) |
+--+
| 0|
+--+
1 row in set, 1 warning (0.00 sec)

Warning (Code 1292): Truncated incorrect DECIMAL value: '*10?:+2'
{noformat}
So for the empty string it returns NULL, and for invalid literals it parses the 
literal up to the first illegal character.

Based on the above I would suggest returning NULL for the empty string, as you 
had it initially. Sorry for the back and forth.
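
A guard at the top of the UDF is enough to reproduce the MySQL behaviour for the empty string; a simplified sketch (not the exact UDFConv code, which operates on Text/bytes rather than a String):

{code:java}
// Simplified sketch of the empty-input guard (UDFConv itself operates on
// Text/bytes; this standalone version just shows the behaviour).
public static String conv(String num, int fromBase, int toBase) {
  if (num == null || num.isEmpty()) {
    return null;  // MySQL returns NULL for CONV('', ...)
  }
  // The existing logic indexes into the input (num.charAt(0) and friends),
  // which is what threw ArrayIndexOutOfBoundsException for "".
  return Long.toString(Long.parseLong(num, fromBase), toBase);
}
{code}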



> AIOBE in conv UDF if input is empty
> ---
>
> Key: HIVE-25338
> URL: https://issues.apache.org/jira/browse/HIVE-25338
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Repro
> {code:java}
> create table test (a string);
> insert into test values ("");
> select conv(a,16,10) from test;{code}
> Exception trace:
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>  at org.apache.hadoop.hive.ql.udf.UDFConv.evaluate(UDFConv.java:160){code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25138) Auto disable scheduled queries after repeated failures

2021-07-20 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-25138.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

merged into master. Thank you Krisztian for reviewing!

> Auto disable scheduled queries after repeated failures
> --
>
> Key: HIVE-25138
> URL: https://issues.apache.org/jira/browse/HIVE-25138
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=625549&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625549
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:25
Start Date: 20/Jul/21 12:25
Worklog Time Spent: 10m 
  Work Description: lcspinter commented on a change in pull request #2495:
URL: https://github.com/apache/hive/pull/2495#discussion_r672320320



##
File path: ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java
##
@@ -604,4 +605,14 @@ private void assertOneTxn() throws Exception {
 Assert.assertEquals(TestTxnDbUtil.queryToString(hiveConf, "select * from 
TXNS"), 1,
 TestTxnDbUtil.countQueryAgent(hiveConf, "select count(*) from TXNS"));
   }
+
+  @Test public void testWritesToDisabledCompactionTableCtas() throws Exception 
{

Review comment:
   nit: put the method declaration on a new line




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625549)
Time Spent: 3h  (was: 2h 50m)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
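
The NPE comes from dereferencing a table object that does not exist yet at writeid-allocation time, so the fix is a null check that skips the metric. A hedged sketch of the shape of that guard (event and helper names are assumptions, not the committed patch):

{code:java}
// Hypothetical shape of the guard in HMSMetricsListener.onAllocWriteId
// (not the committed patch; event/helper names are assumptions).
public void onAllocWriteId(AllocWriteIdEvent event) throws MetaException {
  Table table = lookupTable(event);  // hypothetical helper; null for CTAS,
                                     // where the writeid is allocated
                                     // before the table object exists
  if (table == null) {
    return;  // nothing to inspect -- skip the metric instead of NPE-ing
  }
  if ("true".equalsIgnoreCase(
      table.getParameters().get("no_auto_compaction"))) {
    incrementWritesToDisabledCompactionTables();  // hypothetical counter
  }
}
{code}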



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=625526&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625526
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:22
Start Date: 20/Jul/21 12:22
Worklog Time Spent: 10m 
  Work Description: klcopp closed pull request #2495:
URL: https://github.com/apache/hive/pull/2495






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625526)
Time Spent: 2h 50m  (was: 2h 40m)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24945) PTF: Support vectorization for lead/lag functions

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24945?focusedWorklogId=625525&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625525
 ]

ASF GitHub Bot logged work on HIVE-24945:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:22
Start Date: 20/Jul/21 12:22
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on pull request #2278:
URL: https://github.com/apache/hive/pull/2278#issuecomment-883205928


   @ramesh0201 : rebased and fixed patch passed precommit testing, PR changes 
are in 
https://github.com/apache/hive/pull/2278/commits/81592ac9909d1e2875dfdcb2e2e86f1f597062b3


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625525)
Time Spent: 1h 50m  (was: 1h 40m)

> PTF: Support vectorization for lead/lag functions
> -
>
> Key: HIVE-24945
> URL: https://issues.apache.org/jira/browse/HIVE-24945
> Project: Hive
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25115) Compaction queue entries may accumulate in "ready for cleaning" state

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25115?focusedWorklogId=625521&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625521
 ]

ASF GitHub Bot logged work on HIVE-25115:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:21
Start Date: 20/Jul/21 12:21
Worklog Time Spent: 10m 
  Work Description: klcopp commented on a change in pull request #2277:
URL: https://github.com/apache/hive/pull/2277#discussion_r672892362



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -282,15 +282,12 @@ private ValidReaderWriteIdList 
getValidCleanerWriteIdList(CompactionInfo ci, Tab
 assert rsp != null && rsp.getTblValidWriteIdsSize() == 1;
 ValidReaderWriteIdList validWriteIdList =
 
TxnCommonUtils.createValidReaderWriteIdList(rsp.getTblValidWriteIds().get(0));
-boolean delayedCleanupEnabled = 
conf.getBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
-if (delayedCleanupEnabled) {
-  /*
-   * If delayed cleanup enabled, we need to filter the obsoletes dir list, 
to only remove directories that were made obsolete by this compaction
-   * If we have a higher retentionTime it is possible for a second 
compaction to run on the same partition. Cleaning up the first compaction
-   * should not touch the newer obsolete directories to not to violate the 
retentionTime for those.
-   */
-  validWriteIdList = 
validWriteIdList.updateHighWatermark(ci.highestWriteId);
-}
+/*
+ * We need to filter the obsoletes dir list, to only remove directories 
that were made obsolete by this compaction
+ * If we have a higher retentionTime it is possible for a second 
compaction to run on the same partition. Cleaning up the first compaction
+ * should not touch the newer obsolete directories to not to violate the 
retentionTime for those.
+ */
+validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);

Review comment:
   How do we know that ci.highestWriteId's txn <= the min open txn the 
cleaner uses, if MIN_HISTORY_LEVEL is still used?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -282,15 +282,12 @@ private ValidReaderWriteIdList 
getValidCleanerWriteIdList(CompactionInfo ci, Tab
 assert rsp != null && rsp.getTblValidWriteIdsSize() == 1;
 ValidReaderWriteIdList validWriteIdList =
 
TxnCommonUtils.createValidReaderWriteIdList(rsp.getTblValidWriteIds().get(0));
-boolean delayedCleanupEnabled = 
conf.getBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
-if (delayedCleanupEnabled) {
-  /*
-   * If delayed cleanup enabled, we need to filter the obsoletes dir list, 
to only remove directories that were made obsolete by this compaction
-   * If we have a higher retentionTime it is possible for a second 
compaction to run on the same partition. Cleaning up the first compaction
-   * should not touch the newer obsolete directories to not to violate the 
retentionTime for those.
-   */
-  validWriteIdList = 
validWriteIdList.updateHighWatermark(ci.highestWriteId);
-}
+/*
+ * We need to filter the obsoletes dir list, to only remove directories 
that were made obsolete by this compaction
+ * If we have a higher retentionTime it is possible for a second 
compaction to run on the same partition. Cleaning up the first compaction
+ * should not touch the newer obsolete directories to not to violate the 
retentionTime for those.
+ */
+validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);

Review comment:
   Yes, ci.highestWriteId = the highest write id that was compacted.
   So if we have this after compaction:
   delta_1_1
   delta_2_2
   delta_3_3
   base_3
   ci.highestWriteId=3, so the cleaner will remove (assuming MIN_HISTORY_LEVEL 
is still being used) : 
   delta_1_1
   delta_2_2
   delta_3_3
   But how do we know those can be removed?
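
The high-watermark filter speaks to exactly this question: only directories whose highest write id falls at or below ci.highestWriteId were made obsolete by this compaction, so only those are cleaned; anything above the watermark belongs to a newer compaction's window. A schematic sketch of the filtering effect (simplified naming; real ACID directory names are parsed by AcidUtils, not by string splitting):

{code:java}
// Schematic sketch of the watermark filter's effect (hypothetical names;
// real ACID directory names are parsed by AcidUtils).
static java.util.List<String> removableDirs(
    java.util.List<String> obsoleteDirs, long highestCompactedWriteId) {
  java.util.List<String> result = new java.util.ArrayList<>();
  for (String dir : obsoleteDirs) {
    long maxWriteId = parseMaxWriteId(dir);  // e.g. 3 for "delta_2_3"
    if (maxWriteId <= highestCompactedWriteId) {
      result.add(dir);  // made obsolete by *this* compaction -> clean it
    }
    // Dirs above the watermark fall in a newer compaction's window and
    // must survive so that its retention time is honored.
  }
  return result;
}

static long parseMaxWriteId(String dir) {
  String[] parts = dir.split("_");  // simplified "delta_min_max" naming
  return Long.parseLong(parts[parts.length - 1]);
}
{code}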




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625521)
Time Spent: 2.5h  (was: 2h 20m)

> Compaction queue entries may accumulate in "ready for cleaning" state
> -
>
> Key: HIVE-25115
> URL: https://issues.apache.org/jira/browse/HIVE-25115
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  

[jira] [Work logged] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?focusedWorklogId=625511&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625511
 ]

ASF GitHub Bot logged work on HIVE-25276:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:20
Start Date: 20/Jul/21 12:20
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2419:
URL: https://github.com/apache/hive/pull/2419#discussion_r672435549



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -233,15 +237,21 @@ public void 
preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
   preAlterTableProperties.tableLocation = sd.getLocation();
   preAlterTableProperties.format = sd.getInputFormat();
   preAlterTableProperties.schema = schema(catalogProperties, hmsTable);
-  preAlterTableProperties.spec = spec(conf, 
preAlterTableProperties.schema, catalogProperties, hmsTable);
   preAlterTableProperties.partitionKeys = hmsTable.getPartitionKeys();
 
   context.getProperties().put(HiveMetaHook.ALLOW_PARTITION_KEY_CHANGE, 
"true");
   // If there are partition keys specified remove them from the HMS table 
and add them to the column list
-  if (hmsTable.isSetPartitionKeys()) {
+  if (hmsTable.isSetPartitionKeys() && 
!hmsTable.getPartitionKeys().isEmpty()) {
+List spec = 
PartitionTransform.getPartitionTransformSpec(hmsTable.getPartitionKeys());
+if (!SessionStateUtil.addResource(conf, 
hive_metastoreConstants.PARTITION_TRANSFORM_SPEC, spec)) {
+  throw new MetaException("Query state attached to Session state must 
be not null. " +
+  "Partition transform metadata cannot be saved.");
+}
 hmsTable.getSd().getCols().addAll(hmsTable.getPartitionKeys());
 hmsTable.setPartitionKeysIsSet(false);
   }
+  preAlterTableProperties.spec = spec(conf, 
preAlterTableProperties.schema, hmsTable);

Review comment:
   This is moved from line 236. We need it to be set, but we have to do it 
after we get the correct spec.

##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -233,15 +237,21 @@ public void 
preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
   preAlterTableProperties.tableLocation = sd.getLocation();
   preAlterTableProperties.format = sd.getInputFormat();
   preAlterTableProperties.schema = schema(catalogProperties, hmsTable);
-  preAlterTableProperties.spec = spec(conf, 
preAlterTableProperties.schema, catalogProperties, hmsTable);
   preAlterTableProperties.partitionKeys = hmsTable.getPartitionKeys();
 
   context.getProperties().put(HiveMetaHook.ALLOW_PARTITION_KEY_CHANGE, 
"true");
   // If there are partition keys specified remove them from the HMS table 
and add them to the column list
-  if (hmsTable.isSetPartitionKeys()) {
+  if (hmsTable.isSetPartitionKeys() && 
!hmsTable.getPartitionKeys().isEmpty()) {
+List spec = 
PartitionTransform.getPartitionTransformSpec(hmsTable.getPartitionKeys());
+if (!SessionStateUtil.addResource(conf, 
hive_metastoreConstants.PARTITION_TRANSFORM_SPEC, spec)) {

Review comment:
   This is for migrating non-Iceberg tables to Iceberg tables. Previously we 
just depended on the partition cols; from now on we need to have the data in 
the `SessionState` instead, so we put it there.

##
File path: 
iceberg/iceberg-handler/src/test/results/positive/vectorized_iceberg_read.q.out
##
@@ -129,17 +129,17 @@ Stage-0
 Stage-1
   Reducer 2 vectorized
   File Output Operator [FS_11]
-Select Operator [SEL_10] (rows=1 width=564)
+Select Operator [SEL_10] (rows=1 width=372)

Review comment:
   TBH I am not sure, but I expect it has something to do with the new 
statistics.

##
File path: 
ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezOutputCommitter.java
##
@@ -122,6 +122,7 @@ private IDriver getDriverWithCommitter(String 
committerClass) {
 conf.setVar(HiveConf.ConfVars.HIVE_AUTHORIZATION_MANAGER,
 
"org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory");
 conf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, false);
+conf.setBoolVar(HiveConf.ConfVars.HIVESTATSCOLAUTOGATHER, false);

Review comment:
   Otherwise the tests fail: with stats turned on we generate 2 tasks instead 
of 1 (the execution plans change and contain an extra stage).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Work logged] (HIVE-25115) Compaction queue entries may accumulate in "ready for cleaning" state

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25115?focusedWorklogId=625506&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625506
 ]

ASF GitHub Bot logged work on HIVE-25115:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:19
Start Date: 20/Jul/21 12:19
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2277:
URL: https://github.com/apache/hive/pull/2277#discussion_r672900959



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -282,15 +282,12 @@ private ValidReaderWriteIdList 
getValidCleanerWriteIdList(CompactionInfo ci, Tab
 assert rsp != null && rsp.getTblValidWriteIdsSize() == 1;
 ValidReaderWriteIdList validWriteIdList =
 
TxnCommonUtils.createValidReaderWriteIdList(rsp.getTblValidWriteIds().get(0));
-boolean delayedCleanupEnabled = 
conf.getBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
-if (delayedCleanupEnabled) {
-  /*
-   * If delayed cleanup enabled, we need to filter the obsoletes dir list, 
to only remove directories that were made obsolete by this compaction
-   * If we have a higher retentionTime it is possible for a second 
compaction to run on the same partition. Cleaning up the first compaction
-   * should not touch the newer obsolete directories to not to violate the 
retentionTime for those.
-   */
-  validWriteIdList = 
validWriteIdList.updateHighWatermark(ci.highestWriteId);
-}
+/*
+ * We need to filter the obsoletes dir list, to only remove directories 
that were made obsolete by this compaction
+ * If we have a higher retentionTime it is possible for a second 
compaction to run on the same partition. Cleaning up the first compaction
+ * should not touch the newer obsolete directories to not to violate the 
retentionTime for those.
+ */
+validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);

Review comment:
   Not sure I got the question, but highestWriteId is recorded at the time the 
compaction txn starts, so it captures all open txns that have to be ignored.
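
   A minimal, self-contained sketch of that filtering idea (the write ids and 
names below are illustrative, not from the PR): capping the high watermark at 
ci.highestWriteId means this cleaner run only treats directories at or below 
that write id as removable.
{code:java}
// Illustrative sketch, not the actual Cleaner code.
public class HighWatermarkSketch {
  public static void main(String[] args) {
    long highestWriteId = 3L;  // recorded when the compaction txn started
    long[] obsoleteDeltaWriteIds = {1L, 2L, 3L, 4L, 5L};

    for (long writeId : obsoleteDeltaWriteIds) {
      // delta_4_4 and delta_5_5 were obsoleted by a later compaction; this
      // run leaves them alone so it cannot violate that run's retentionTime.
      boolean removable = writeId <= highestWriteId;
      System.out.println("delta_" + writeId + "_" + writeId + " removable: " + removable);
    }
  }
}
{code}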

##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -282,15 +282,12 @@ private ValidReaderWriteIdList 
getValidCleanerWriteIdList(CompactionInfo ci, Tab
 assert rsp != null && rsp.getTblValidWriteIdsSize() == 1;
 ValidReaderWriteIdList validWriteIdList =
 
TxnCommonUtils.createValidReaderWriteIdList(rsp.getTblValidWriteIds().get(0));
-boolean delayedCleanupEnabled = 
conf.getBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
-if (delayedCleanupEnabled) {
-  /*
-   * If delayed cleanup enabled, we need to filter the obsoletes dir list, 
to only remove directories that were made obsolete by this compaction
-   * If we have a higher retentionTime it is possible for a second 
compaction to run on the same partition. Cleaning up the first compaction
-   * should not touch the newer obsolete directories to not to violate the 
retentionTime for those.
-   */
-  validWriteIdList = 
validWriteIdList.updateHighWatermark(ci.highestWriteId);
-}
+/*
+ * We need to filter the obsoletes dir list, to only remove directories 
that were made obsolete by this compaction
+ * If we have a higher retentionTime it is possible for a second 
compaction to run on the same partition. Cleaning up the first compaction
+ * should not touch the newer obsolete directories to not to violate the 
retentionTime for those.
+ */
+validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);

Review comment:
   Not sure I got the question, but highestWriteId is recorded at the time the 
compaction txn starts, so it records the writeId high watermark and all open 
txns below it that have to be ignored.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -282,15 +282,12 @@ private ValidReaderWriteIdList 
getValidCleanerWriteIdList(CompactionInfo ci, Tab
 assert rsp != null && rsp.getTblValidWriteIdsSize() == 1;
 ValidReaderWriteIdList validWriteIdList =
 
TxnCommonUtils.createValidReaderWriteIdList(rsp.getTblValidWriteIds().get(0));
-boolean delayedCleanupEnabled = 
conf.getBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
-if (delayedCleanupEnabled) {
-  /*
-   * If delayed cleanup enabled, we need to filter the obsoletes dir list, 
to only remove directories that were made obsolete by this compaction
-   * If we have a higher retentionTime it is possible for a second 
compaction to run on the same partition. Cleaning up the first compaction
-   * should not touch the newer obsolete directories to not to violate the 
retentionTime for those.
-   */
-  validWriteIdList = 

[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=625497&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625497
 ]

ASF GitHub Bot logged work on HIVE-25345:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:18
Start Date: 20/Jul/21 12:18
Worklog Time Spent: 10m 
  Work Description: lcspinter opened a new pull request #2493:
URL: https://github.com/apache/hive/pull/2493


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625497)
Time Spent: 1h  (was: 50m)

> Add logging based on new compaction metrics
> ---
>
> Key: HIVE-25345
> URL: https://issues.apache.org/jira/browse/HIVE-25345
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=625500&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625500
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:18
Start Date: 20/Jul/21 12:18
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #2495:
URL: https://github.com/apache/hive/pull/2495






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625500)
Time Spent: 2h 40m  (was: 2.5h)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
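> The actual patch is not shown in this digest; below is a hypothetical sketch 
> of the kind of guard the description implies, with illustrative names rather 
> than the real HMSMetricsListener code.
> {code:java}
> import java.util.Map;
> 
> // Hypothetical sketch of the guard described above.
> public class WriteIdMetricSketch {
>   static int writesToDisabledCompactionTables = 0;
> 
>   static void onAllocWriteId(Map<String, String> tblProperties) {
>     if (tblProperties == null) {
>       // CTAS: the writeid is allocated before the table object is created,
>       // so there are no tblproperties to inspect yet -- skip the metric.
>       return;
>     }
>     if ("true".equalsIgnoreCase(tblProperties.get("no_auto_compaction"))) {
>       writesToDisabledCompactionTables++;
>     }
>   }
> }
> {code}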



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25136) Remove MetaExceptions From RawStore First Cut

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25136?focusedWorklogId=625487&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625487
 ]

ASF GitHub Bot logged work on HIVE-25136:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:16
Start Date: 20/Jul/21 12:16
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2290:
URL: https://github.com/apache/hive/pull/2290#issuecomment-882944956


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625487)
Time Spent: 50m  (was: 40m)

> Remove MetaExceptions From RawStore First Cut
> -
>
> Key: HIVE-25136
> URL: https://issues.apache.org/jira/browse/HIVE-25136
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=625469&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625469
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:14
Start Date: 20/Jul/21 12:14
Worklog Time Spent: 10m 
  Work Description: klcopp commented on pull request #2495:
URL: https://github.com/apache/hive/pull/2495#issuecomment-883238490


   moved to https://github.com/apache/hive/pull/2497


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625469)
Time Spent: 2.5h  (was: 2h 20m)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25338) AIOBE in conv UDF if input is empty

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25338?focusedWorklogId=625467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625467
 ]

ASF GitHub Bot logged work on HIVE-25338:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:13
Start Date: 20/Jul/21 12:13
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #2485:
URL: https://github.com/apache/hive/pull/2485#issuecomment-882475619


   The changes look good to me. +1.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625467)
Time Spent: 40m  (was: 0.5h)

> AIOBE in conv UDF if input is empty
> ---
>
> Key: HIVE-25338
> URL: https://issues.apache.org/jira/browse/HIVE-25338
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Repro
> {code:java}
> create table test (a string);
> insert into test values ("");
> select conv(a,16,10) from test;{code}
> Exception trace:
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>  at org.apache.hadoop.hive.ql.udf.UDFConv.evaluate(UDFConv.java:160){code}
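> The real UDFConv.evaluate() works on raw bytes with its own sign and overflow 
> handling; the standalone sketch below (illustrative, not the actual fix) only 
> shows the boundary check that prevents the AIOBE on an empty string.
> {code:java}
> // Minimal sketch of the empty-input guard, assuming plain long arithmetic.
> public class ConvSketch {
>   static String conv(String num, int fromBase, int toBase) {
>     if (num == null || num.isEmpty()) {
>       return null;  // previously the first byte was read unconditionally
>     }
>     long value = Long.parseLong(num, fromBase);
>     return Long.toString(value, toBase).toUpperCase();
>   }
> 
>   public static void main(String[] args) {
>     System.out.println(conv("", 16, 10));    // null instead of AIOBE
>     System.out.println(conv("ff", 16, 10));  // 255
>   }
> }
> {code}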



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?focusedWorklogId=625465&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625465
 ]

ASF GitHub Bot logged work on HIVE-25276:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:13
Start Date: 20/Jul/21 12:13
Worklog Time Spent: 10m 
  Work Description: szlta commented on a change in pull request #2419:
URL: https://github.com/apache/hive/pull/2419#discussion_r672325104



##
File path: 
ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezOutputCommitter.java
##
@@ -122,6 +122,7 @@ private IDriver getDriverWithCommitter(String 
committerClass) {
 conf.setVar(HiveConf.ConfVars.HIVE_AUTHORIZATION_MANAGER,
 
"org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory");
 conf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, false);
+conf.setBoolVar(HiveConf.ConfVars.HIVESTATSCOLAUTOGATHER, false);

Review comment:
   Why is this required here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625465)
Time Spent: 4h 40m  (was: 4.5h)

> Enable automatic statistics generation for Iceberg tables
> -
>
> Key: HIVE-25276
> URL: https://issues.apache.org/jira/browse/HIVE-25276
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> During inserts we should calculate the column statistics



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24037) Parallelize hash table constructions in map joins

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24037?focusedWorklogId=625457&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625457
 ]

ASF GitHub Bot logged work on HIVE-24037:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:12
Start Date: 20/Jul/21 12:12
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #2004:
URL: https://github.com/apache/hive/pull/2004


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625457)
Time Spent: 3h  (was: 2h 50m)

> Parallelize hash table constructions in map joins
> -
>
> Key: HIVE-24037
> URL: https://issues.apache.org/jira/browse/HIVE-24037
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Parallelize hash table constructions in map joins



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25325) Add TRUNCATE TABLE support for Hive Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25325?focusedWorklogId=625428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625428
 ]

ASF GitHub Bot logged work on HIVE-25325:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:08
Start Date: 20/Jul/21 12:08
Worklog Time Spent: 10m 
  Work Description: kuczoram commented on a change in pull request #2471:
URL: https://github.com/apache/hive/pull/2471#discussion_r672339586



##
File path: 
iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##
@@ -1313,6 +1313,186 @@ public void testScanTableCaseInsensitive() throws 
IOException {
 Assert.assertArrayEquals(new Object[] {1L, "Bob", "Green"}, rows.get(1));
   }
 
+  @Test
+  public void testTruncateTable() throws IOException, TException, 
InterruptedException {
+// Create an Iceberg table with some records in it then execute a truncate 
table command.
+// Then check if the data is deleted and the table statistics are reset to 
0.
+String databaseName = "default";
+String tableName = "customers";
+Table icebergTable = testTables.createTable(shell, tableName, 
HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA,
+fileFormat, HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS);
+testTruncateTable(databaseName, tableName, icebergTable, 
HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS,
+HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, true, false);
+  }
+
+  @Test
+  public void testTruncateEmptyTable() throws IOException, TException, 
InterruptedException {
+// Create an empty Iceberg table and execute a truncate table command on 
it.
+String databaseName = "default";
+String tableName = "customers";
+String fullTableName = databaseName + "." + tableName;

Review comment:
   Thanks, I changed that.

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -3360,42 +3360,50 @@ public CmRecycleResponse cm_recycle(final 
CmRecycleRequest request) throws MetaE
   public void truncate_table(final String dbName, final String tableName, 
List<String> partNames)
   throws NoSuchObjectException, MetaException {
 // Deprecated path, won't work for txn tables.
-truncateTableInternal(dbName, tableName, partNames, null, -1);
+truncateTableInternal(dbName, tableName, partNames, null, -1, null);
   }
 
   @Override
   public TruncateTableResponse truncate_table_req(TruncateTableRequest req)
   throws MetaException, TException {
 truncateTableInternal(req.getDbName(), req.getTableName(), 
req.getPartNames(),
-req.getValidWriteIdList(), req.getWriteId());
+req.getValidWriteIdList(), req.getWriteId(), 
req.getEnvironmentContext());
 return new TruncateTableResponse();
   }
 
   private void truncateTableInternal(String dbName, String tableName, 
List<String> partNames,
- String validWriteIds, long writeId) 
throws MetaException, NoSuchObjectException {
+ String validWriteIds, long writeId, 
EnvironmentContext context) throws MetaException, NoSuchObjectException {
 boolean isSkipTrash = false, needCmRecycle = false;
 try {
   String[] parsedDbName = parseDbName(dbName, conf);
   Table tbl = get_table_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME], tableName);
 
-  boolean truncateFiles = !TxnUtils.isTransactionalTable(tbl) ||
-  !MetastoreConf.getBoolVar(getConf(), 
MetastoreConf.ConfVars.TRUNCATE_ACID_USE_BASE);
-
-  if (truncateFiles) {
-isSkipTrash = MetaStoreUtils.isSkipTrash(tbl.getParameters());
-Database db = get_database_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME]);
-needCmRecycle = ReplChangeManager.shouldEnableCm(db, tbl);
+  boolean skipDataDeletion = false;
+  if (context != null && context.getProperties() != null
+  && context.getProperties().get("truncateSkipDataDeletion") != null) {

Review comment:
   We can do that. Fixed it.

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -3360,42 +3360,50 @@ public CmRecycleResponse cm_recycle(final 
CmRecycleRequest request) throws MetaE
   public void truncate_table(final String dbName, final String tableName, 
List<String> partNames)
   throws NoSuchObjectException, MetaException {
 // Deprecated path, won't work for txn tables.
-truncateTableInternal(dbName, tableName, partNames, null, -1);
+truncateTableInternal(dbName, tableName, partNames, null, -1, null);
   }
 
   @Override
   public TruncateTableResponse truncate_table_req(TruncateTableRequest req)
   throws MetaException, TException {
 truncateTableInternal(req.getDbName(), 

[jira] [Work logged] (HIVE-25277) Slow Hive partition deletion for Cloud object stores with expensive ListFiles

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25277?focusedWorklogId=625387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625387
 ]

ASF GitHub Bot logged work on HIVE-25277:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:04
Start Date: 20/Jul/21 12:04
Worklog Time Spent: 10m 
  Work Description: coufon commented on pull request #2421:
URL: https://github.com/apache/hive/pull/2421#issuecomment-882711397


   > Zoltan Haindrich Haymant Mangla Naveen Gangam may you take a look and 
merge this PR?
   
   Thank you @medb! @kgyrtkirk @hmangla98 @nrg4878 a friendly ping, could you 
please take a look? Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625387)
Time Spent: 2h  (was: 1h 50m)

> Slow Hive partition deletion for Cloud object stores with expensive ListFiles
> -
>
> Key: HIVE-25277
> URL: https://issues.apache.org/jira/browse/HIVE-25277
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: All Versions
>Reporter: Zhou Fang
>Assignee: Zhou Fang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Deleting a Hive partition is slow when using a Cloud object store as the 
> warehouse, where ListFiles is expensive. A root cause is that the recursive 
> parent dir deletion is very inefficient: there are many duplicated calls to 
> isEmpty (ListFiles is called at the end). This fix sorts the parents to 
> delete according to path length, always processing the longest one first 
> (e.g., a/b/c is always handled before a/b). As a result, each parent path 
> needs to be checked only once.
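> A small self-contained sketch of that ordering idea (illustrative, not the 
> actual patch):
> {code:java}
> import java.util.ArrayList;
> import java.util.Arrays;
> import java.util.Comparator;
> import java.util.List;
> 
> public class ParentDeletionOrderSketch {
>   public static void main(String[] args) {
>     List<String> parents = new ArrayList<>(Arrays.asList("a/b", "a/b/c", "a"));
>     // Deeper (longer) paths first: a/b/c is deleted before a/b is inspected,
>     // so each expensive isEmpty()/ListFiles check is definitive and runs once.
>     parents.sort(Comparator.comparingInt(String::length).reversed());
>     System.out.println(parents);  // [a/b/c, a/b, a]
>   }
> }
> {code}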



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25325) Add TRUNCATE TABLE support for Hive Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25325?focusedWorklogId=625396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625396
 ]

ASF GitHub Bot logged work on HIVE-25325:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:04
Start Date: 20/Jul/21 12:04
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2471:
URL: https://github.com/apache/hive/pull/2471#discussion_r672353809



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -3360,42 +3360,50 @@ public CmRecycleResponse cm_recycle(final 
CmRecycleRequest request) throws MetaE
   public void truncate_table(final String dbName, final String tableName, 
List<String> partNames)
   throws NoSuchObjectException, MetaException {
 // Deprecated path, won't work for txn tables.
-truncateTableInternal(dbName, tableName, partNames, null, -1);
+truncateTableInternal(dbName, tableName, partNames, null, -1, null);
   }
 
   @Override
   public TruncateTableResponse truncate_table_req(TruncateTableRequest req)
   throws MetaException, TException {
 truncateTableInternal(req.getDbName(), req.getTableName(), 
req.getPartNames(),
-req.getValidWriteIdList(), req.getWriteId());
+req.getValidWriteIdList(), req.getWriteId(), 
req.getEnvironmentContext());
 return new TruncateTableResponse();
   }
 
   private void truncateTableInternal(String dbName, String tableName, 
List<String> partNames,
- String validWriteIds, long writeId) 
throws MetaException, NoSuchObjectException {
+ String validWriteIds, long writeId, 
EnvironmentContext context) throws MetaException, NoSuchObjectException {
 boolean isSkipTrash = false, needCmRecycle = false;
 try {
   String[] parsedDbName = parseDbName(dbName, conf);
   Table tbl = get_table_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME], tableName);
 
-  boolean truncateFiles = !TxnUtils.isTransactionalTable(tbl) ||
-  !MetastoreConf.getBoolVar(getConf(), 
MetastoreConf.ConfVars.TRUNCATE_ACID_USE_BASE);
-
-  if (truncateFiles) {
-isSkipTrash = MetaStoreUtils.isSkipTrash(tbl.getParameters());
-Database db = get_database_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME]);
-needCmRecycle = ReplChangeManager.shouldEnableCm(db, tbl);
+  boolean skipDataDeletion = false;
+  if (context != null && context.getProperties() != null

Review comment:
   One minor thing, this should be `Optional.ofNullable(context)` if you 
want to guard against the `context` being null as well
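
   A sketch of the suggested chain; the EnvironmentContext inner class below is 
a stand-in for the Thrift class of the same name used in the diff.
{code:java}
import java.util.Map;
import java.util.Optional;

public class TruncateFlagSketch {
  static boolean skipDataDeletion(EnvironmentContext context) {
    // Tolerates a null context, a null properties map, or a missing key.
    return Optional.ofNullable(context)
        .map(EnvironmentContext::getProperties)
        .map(props -> props.get("truncateSkipDataDeletion"))
        .map(Boolean::parseBoolean)
        .orElse(false);
  }

  static class EnvironmentContext {
    Map<String, String> properties;
    Map<String, String> getProperties() { return properties; }
  }
}
{code}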

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -3360,42 +3360,50 @@ public CmRecycleResponse cm_recycle(final 
CmRecycleRequest request) throws MetaE
   public void truncate_table(final String dbName, final String tableName, 
List<String> partNames)
   throws NoSuchObjectException, MetaException {
 // Deprecated path, won't work for txn tables.
-truncateTableInternal(dbName, tableName, partNames, null, -1);
+truncateTableInternal(dbName, tableName, partNames, null, -1, null);
   }
 
   @Override
   public TruncateTableResponse truncate_table_req(TruncateTableRequest req)
   throws MetaException, TException {
 truncateTableInternal(req.getDbName(), req.getTableName(), 
req.getPartNames(),
-req.getValidWriteIdList(), req.getWriteId());
+req.getValidWriteIdList(), req.getWriteId(), 
req.getEnvironmentContext());
 return new TruncateTableResponse();
   }
 
   private void truncateTableInternal(String dbName, String tableName, 
List<String> partNames,
- String validWriteIds, long writeId) 
throws MetaException, NoSuchObjectException {
+ String validWriteIds, long writeId, 
EnvironmentContext context) throws MetaException, NoSuchObjectException {
 boolean isSkipTrash = false, needCmRecycle = false;
 try {
   String[] parsedDbName = parseDbName(dbName, conf);
   Table tbl = get_table_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME], tableName);
 
-  boolean truncateFiles = !TxnUtils.isTransactionalTable(tbl) ||
-  !MetastoreConf.getBoolVar(getConf(), 
MetastoreConf.ConfVars.TRUNCATE_ACID_USE_BASE);
-
-  if (truncateFiles) {
-isSkipTrash = MetaStoreUtils.isSkipTrash(tbl.getParameters());
-Database db = get_database_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME]);
-needCmRecycle = ReplChangeManager.shouldEnableCm(db, tbl);
+  boolean skipDataDeletion = false;
+  if (context != null && context.getProperties() != null

Review comment:
   In that case, if you get a null at any point during the map 

[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=625378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625378
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 12:02
Start Date: 20/Jul/21 12:02
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #2497:
URL: https://github.com/apache/hive/pull/2497


   Tests: Unit tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625378)
Time Spent: 2h 20m  (was: 2h 10m)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25346) cleanTxnToWriteIdTable breaks SNAPSHOT isolation

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25346?focusedWorklogId=625341&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625341
 ]

ASF GitHub Bot logged work on HIVE-25346:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 11:57
Start Date: 20/Jul/21 11:57
Worklog Time Spent: 10m 
  Work Description: zchovan opened a new pull request #2494:
URL: https://github.com/apache/hive/pull/2494


   Change-Id: I5f832626e7a38834441c38cdde20d57006d11a11
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625341)
Time Spent: 1h  (was: 50m)

> cleanTxnToWriteIdTable breaks SNAPSHOT isolation
> 
>
> Key: HIVE-25346
> URL: https://issues.apache.org/jira/browse/HIVE-25346
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25346) cleanTxnToWriteIdTable breaks SNAPSHOT isolation

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25346?focusedWorklogId=625329&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625329
 ]

ASF GitHub Bot logged work on HIVE-25346:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 11:55
Start Date: 20/Jul/21 11:55
Worklog Time Spent: 10m 
  Work Description: zchovan commented on pull request #2494:
URL: https://github.com/apache/hive/pull/2494#issuecomment-882487842


   Initial check to see how much this breaks write_set stuff in tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625329)
Time Spent: 50m  (was: 40m)

> cleanTxnToWriteIdTable breaks SNAPSHOT isolation
> 
>
> Key: HIVE-25346
> URL: https://issues.apache.org/jira/browse/HIVE-25346
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-24467) ConditionalTask remove tasks that not selected exists thread safety problem

2021-07-20 Thread Xi Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384052#comment-17384052
 ] 

Xi Chen edited comment on HIVE-24467 at 7/20/21, 11:55 AM:
---

We also ran into this problem, and the result is lost data.

Our query is a dynamic INSERT OVERWRITE with UNION ALL, in the form of:
{code:java}
INSERT OVERWRITE TABLE dest_table PARTITION(...)
SELECT ... FROM
(
  SELECT ... FROM table_a JOIN table_b ...
  UNION ALL
  SELECT ... FROM table_a WHERE ...
  UNION ALL
  SELECT ... FROM table_c JOIN table_d ...
  UNION ALL
  SELECT ... FROM table_c WHERE ...
) mid
JOIN table_e ...;{code}
The stage dependencies are:
{code:java}
STAGE DEPENDENCIES:
  Stage-5 is a root stage
  Stage-6 depends on stages: Stage-5
  Stage-22 depends on stages: Stage-6 , consists of Stage-26, Stage-1
  Stage-26 has a backup stage: Stage-1
  Stage-21 depends on stages: Stage-26
  Stage-25 depends on stages: Stage-1, Stage-12, Stage-21, Stage-23
  Stage-20 depends on stages: Stage-25
  Stage-0 depends on stages: Stage-20
  Stage-1
  Stage-14 is a root stage
  Stage-15 depends on stages: Stage-14
  Stage-24 depends on stages: Stage-15 , consists of Stage-27, Stage-12
  Stage-27 has a backup stage: Stage-12
  Stage-23 depends on stages: Stage-27
  Stage-12
{code}
The problem is triggered in this way (a sketch of one possible fix follows this 
list):
 # Both Stage-22 and Stage-24 are ConditionalTasks and contain a mapjoin.
 # Their dependent tasks Stage-6 and Stage-15 have similar input data sizes and 
finish at the same time.
 # Thus the two ConditionalTasks start at the same time and hit this race 
condition, so the backup stages Stage-1 and Stage-12 are not correctly removed 
from Stage-25's dependency list.
 # Then Stage-25, Stage-20 and Stage-0 never trigger.
 # Stage-0 is a MoveTask, so the data is totally lost and yet the query 
succeeds!
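
A minimal sketch of the race and one way to serialize the pruning. The Task 
class below is a simplified stand-in, not Hive's 
org.apache.hadoop.hive.ql.exec.Task, and the shared lock is an assumption about 
what a fix could look like, not the actual HIVE-24467 patch:
{code:java}
import java.util.ArrayList;
import java.util.List;

class Task {
  final String name;
  final List<Task> parentTasks = new ArrayList<>();
  Task(String name) { this.name = name; }
}

class ConditionalPruneSketch {
  private static final Object PRUNE_LOCK = new Object();

  // Two ConditionalTasks call this concurrently for Stage-1 and Stage-12.
  // Without the shared lock, concurrent list mutation can lose a removal,
  // so the child (Stage-25) keeps a phantom parent and never becomes runnable.
  static void removeNotSelected(Task backup, Task child, List<Task> runnable) {
    synchronized (PRUNE_LOCK) {
      child.parentTasks.remove(backup);
      if (child.parentTasks.isEmpty()) {
        runnable.add(child);
      }
    }
  }
}
{code}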

The output of the Hive query that lost data:
{code:java}
Execution completed successfully
MapredLocal task succeeded
Launching Job 7 out of 9
Launching Job 8 out of 9
Number of reduce tasks is set to 0 since there's no reduce operator
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1573148374965_11083286, Tracking URL = 
http://xxx:8088/proxy/application_1573148374965_11083286/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1573148374965_11083286
Starting Job = job_1573148374965_11083287, Tracking URL = 
http://xxx:8088/proxy/application_1573148374965_11083287/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1573148374965_11083287
Hadoop job information for Stage-23: number of mappers: 2; number of reducers: 0
2021-07-20 17:45:11,543 Stage-23 map = 0%,  reduce = 0%
Hadoop job information for Stage-21: number of mappers: 3; number of reducers: 0
2021-07-20 17:45:18,975 Stage-21 map = 0%,  reduce = 0%
2021-07-20 17:45:31,609 Stage-23 map = 100%,  reduce = 0%, Cumulative CPU 12.67 
sec
MapReduce Total cumulative CPU time: 12 seconds 670 msec
Ended Job = job_1573148374965_11083286
2021-07-20 17:45:34,268 Stage-21 map = 33%,  reduce = 0%, Cumulative CPU 7.66 
sec
2021-07-20 17:45:46,381 Stage-21 map = 67%,  reduce = 0%, Cumulative CPU 19.27 
sec
2021-07-20 17:45:55,458 Stage-21 map = 100%,  reduce = 0%, Cumulative CPU 34.15 
sec
MapReduce Total cumulative CPU time: 34 seconds 150 msec
Ended Job = job_1573148374965_11083287
MapReduce Jobs Launched:
Stage-Stage-5: Map: 3  Reduce: 1   Cumulative CPU: 23.07 sec   HDFS Read: 43256 
HDFS Write: 772914 SUCCESS
Stage-Stage-14: Map: 1  Reduce: 1   Cumulative CPU: 7.13 sec   HDFS Read: 10353 
HDFS Write: 136 SUCCESS
Stage-Stage-6: Map: 1  Reduce: 1   Cumulative CPU: 8.32 sec   HDFS Read: 780428 
HDFS Write: 670621 SUCCESS
Stage-Stage-15: Map: 1  Reduce: 1   Cumulative CPU: 5.54 sec   HDFS Read: 7686 
HDFS Write: 136 SUCCESS
Stage-Stage-23: Map: 2   Cumulative CPU: 12.67 sec   HDFS Read: 23467580 HDFS 
Write: 4413 SUCCESS
Stage-Stage-21: Map: 3   Cumulative CPU: 34.15 sec   HDFS Read: 70350367 HDFS 
Write: 5286120 SUCCESS
Total MapReduce CPU Time Spent: 5 minutes 3 seconds 860 msec
OK
Time taken: 356.283 seconds

{code}
The job is executed every hour, and about 5% of runs hit this problem.

By contrast, the output of a normal job is:
{code:java}
Launching Job 9 out of 9
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1573148374965_11075024, Tracking URL = 
http://xxx8088/proxy/application_1573148374965_11075024/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1573148374965_11075024
Hadoop job information for Stage-20: number of mappers: 11; number of reducers: 0
2021-07-20 11:13:27,084 Stage-20 map = 0%,  reduce = 0%
2021-07-20 11:13:36,254 Stage-20 map = 9%,  reduce = 0%, Cumulative CPU 4.73 sec
2021-07-20 11:13:39,293 Stage-20 map = 27%,  reduce = 0%, Cumulative CPU 18.27 
sec
2021-07-20 11:13:42,348 Stage-20 map = 45%,  reduce = 0%, Cumulative CPU 34.42 
sec
2021-07-20 


[jira] [Work logged] (HIVE-25325) Add TRUNCATE TABLE support for Hive Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25325?focusedWorklogId=625313&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625313
 ]

ASF GitHub Bot logged work on HIVE-25325:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 11:52
Start Date: 20/Jul/21 11:52
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2471:
URL: https://github.com/apache/hive/pull/2471#discussion_r672453546



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -3360,42 +3360,50 @@ public CmRecycleResponse cm_recycle(final 
CmRecycleRequest request) throws MetaE
   public void truncate_table(final String dbName, final String tableName, 
List<String> partNames)
   throws NoSuchObjectException, MetaException {
 // Deprecated path, won't work for txn tables.
-truncateTableInternal(dbName, tableName, partNames, null, -1);
+truncateTableInternal(dbName, tableName, partNames, null, -1, null);
   }
 
   @Override
   public TruncateTableResponse truncate_table_req(TruncateTableRequest req)
   throws MetaException, TException {
 truncateTableInternal(req.getDbName(), req.getTableName(), 
req.getPartNames(),
-req.getValidWriteIdList(), req.getWriteId());
+req.getValidWriteIdList(), req.getWriteId(), 
req.getEnvironmentContext());
 return new TruncateTableResponse();
   }
 
   private void truncateTableInternal(String dbName, String tableName, 
List<String> partNames,
- String validWriteIds, long writeId) 
throws MetaException, NoSuchObjectException {
+ String validWriteIds, long writeId, 
EnvironmentContext context) throws MetaException, NoSuchObjectException {
 boolean isSkipTrash = false, needCmRecycle = false;
 try {
   String[] parsedDbName = parseDbName(dbName, conf);
   Table tbl = get_table_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME], tableName);
 
-  boolean truncateFiles = !TxnUtils.isTransactionalTable(tbl) ||
-  !MetastoreConf.getBoolVar(getConf(), 
MetastoreConf.ConfVars.TRUNCATE_ACID_USE_BASE);
-
-  if (truncateFiles) {
-isSkipTrash = MetaStoreUtils.isSkipTrash(tbl.getParameters());
-Database db = get_database_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME]);
-needCmRecycle = ReplChangeManager.shouldEnableCm(db, tbl);
+  boolean skipDataDeletion = false;
+  if (context != null && context.getProperties() != null
+  && context.getProperties().get("truncateSkipDataDeletion") != null) {
+skipDataDeletion = 
Boolean.parseBoolean(context.getProperties().get("truncateSkipDataDeletion"));
   }
-  // This is not transactional
-  for (Path location : getLocationsForTruncate(getMS(), 
parsedDbName[CAT_NAME],
-  parsedDbName[DB_NAME], tableName, tbl, partNames)) {
-FileSystem fs = location.getFileSystem(getConf());
+
+  if (!skipDataDeletion) {
+boolean truncateFiles = !TxnUtils.isTransactionalTable(tbl)
+|| !MetastoreConf.getBoolVar(getConf(), 
MetastoreConf.ConfVars.TRUNCATE_ACID_USE_BASE);
+
 if (truncateFiles) {
-  truncateDataFiles(location, fs, isSkipTrash, needCmRecycle);
-} else {
-  // For Acid tables we don't need to delete the old files, only write 
an empty baseDir.
-  // Compaction and cleaner will take care of the rest
-  addTruncateBaseFile(location, writeId, fs);
+  isSkipTrash = MetaStoreUtils.isSkipTrash(tbl.getParameters());
+  Database db = get_database_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME]);
+  needCmRecycle = ReplChangeManager.shouldEnableCm(db, tbl);
+}
+// This is not transactional
+for (Path location : getLocationsForTruncate(getMS(), 
parsedDbName[CAT_NAME], parsedDbName[DB_NAME], tableName,
+tbl, partNames)) {
+  FileSystem fs = location.getFileSystem(getConf());
+  if (truncateFiles) {

Review comment:
   Oops, I missed line 3395, so this check is not embedded. Sorry :(




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625313)
Time Spent: 3h 40m  (was: 3.5h)

> Add TRUNCATE TABLE support for Hive Iceberg tables
> --
>
> Key: HIVE-25325
> URL: https://issues.apache.org/jira/browse/HIVE-25325
> Project: Hive
>  Issue Type: Improvement
>

[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=625309&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625309
 ]

ASF GitHub Bot logged work on HIVE-25345:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 11:52
Start Date: 20/Jul/21 11:52
Worklog Time Spent: 10m 
  Work Description: klcopp commented on a change in pull request #2493:
URL: https://github.com/apache/hive/pull/2493#discussion_r672868711



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -3194,6 +3194,15 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
     "Age of table/partition's oldest aborted transaction when compaction will be triggered. " +
     "Default time unit is: hours. Set to a negative number to disable."),
 
+    HIVE_COMPACTOR_ACTIVE_DELTA_DIR_THRESHOLD("hive.compactor.active.delta.dir.threshold", 200,
+        "Number of active delta directories under a given table/partition."),

Review comment:
   I think the descriptions here should reflect that these are thresholds, and that logging will happen if they are exceeded.

##
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
##
@@ -432,6 +432,88 @@ public static ConfVars getMetaConf(String name) {
     COMPACTOR_RUN_AS_USER("metastore.compactor.run.as.user", "hive.compactor.run.as.user", "",
         "Specify the user to run compactor Initiator and Worker as. If empty string, defaults to table/partition " +
         "directory owner."),
+    COMPACTOR_OLDEST_REPLICATION_OPENTXN_THRESHOLD_WARNING(

Review comment:
   For all of these: Instead of: "after which a warning should be raised" I 
think it would be clearer to say: "after which a warning will be logged"
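
For illustration, the warn/error pair these configs drive could read like the sketch below. The metric and config names are taken from the hunks in this review, but the shape (checking the error level first, which avoids a two-sided range check) is only a suggestion:

{code:java}
// Sketch: two-level threshold logging for a single compaction metric.
long value = metrics.getTxnToWriteIdCount();
int warnAt = MetastoreConf.getIntVar(conf,
    MetastoreConf.ConfVars.COMPACTOR_TXN_TO_WRITEID_RECORD_THRESHOLD_WARNING);
int errorAt = MetastoreConf.getIntVar(conf,
    MetastoreConf.ConfVars.COMPACTOR_TXN_TO_WRITEID_RECORD_THRESHOLD_ERROR);
if (value >= errorAt) {
  LOG.error("TXN_TO_WRITE_ID row count " + value + " exceeded the error threshold " + errorAt);
} else if (value >= warnAt) {
  LOG.warn("TXN_TO_WRITE_ID row count " + value + " exceeded the warning threshold " + warnAt);
}
{code}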

##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSMetricsListener.java
##
@@ -101,7 +101,14 @@ public void onAllocWriteId(AllocWriteIdEvent allocWriteIdEvent, Connection dbCon
     Table table = getTable(allocWriteIdEvent);
 
     if (MetaStoreUtils.isNoAutoCompactSet(table.getParameters())) {
-      Metrics.getOrCreateGauge(MetricsConstants.WRITES_TO_DISABLED_COMPACTION_TABLE).incrementAndGet();
+      int noAutoCompactSet =
+          Metrics.getOrCreateGauge(MetricsConstants.WRITES_TO_DISABLED_COMPACTION_TABLE).incrementAndGet();
+      if (noAutoCompactSet >=
+          MetastoreConf.getIntVar(getConf(),
+              MetastoreConf.ConfVars.COMPACTOR_NUMBER_OF_DISABLED_COMPACTION_TABLES_THRESHOLD)) {
+        LOGGER.warn("Number of tables where the compaction is turned off is: " + noAutoCompactSet);

Review comment:
   This might be clearer: "There has been a write to a table where 
auto-compaction is disabled (tblproperties ("no_auto_compact"="true"))...
   And definitely log the db and table name, so users can find it and re-enable 
auto-compaction.
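
Something along those lines, as a sketch (the exact wording is up to the author; getDbName()/getTableName() are available on the HMS Table object already fetched in onAllocWriteId()):

{code:java}
LOGGER.warn("There has been a write to table " + table.getDbName() + "." + table.getTableName() +
    " where auto-compaction is disabled (tblproperties (\"no_auto_compact\"=\"true\"))." +
    " Total tables with disabled compaction so far: " + noAutoCompactSet);
{code}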

##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/metrics/AcidMetricService.java
##
@@ -85,36 +85,113 @@ public void run() {
 
   private void collectMetrics() throws MetaException {
     ShowCompactResponse currentCompactions = txnHandler.showCompact(new ShowCompactRequest());
-    updateMetricsFromShowCompact(currentCompactions);
+    updateMetricsFromShowCompact(currentCompactions, conf);
     updateDBMetrics();
   }
 
   private void updateDBMetrics() throws MetaException {
     MetricsInfo metrics = txnHandler.getMetricsInfo();
     Metrics.getOrCreateGauge(NUM_TXN_TO_WRITEID).set(metrics.getTxnToWriteIdCount());
+    if (metrics.getTxnToWriteIdCount() >=
+        MetastoreConf.getIntVar(conf, MetastoreConf.ConfVars.COMPACTOR_TXN_TO_WRITEID_RECORD_THRESHOLD_WARNING) &&
+        metrics.getTxnToWriteIdCount() <
+        MetastoreConf.getIntVar(conf, MetastoreConf.ConfVars.COMPACTOR_TXN_TO_WRITEID_RECORD_THRESHOLD_ERROR)) {
+      LOG.warn("An excessive amount of (" + metrics.getTxnToWriteIdCount() + ") Hive ACID metadata found in " +
+          "TXN_TO_WRITEID table, which can cause serious performance degradation.");
+    } else if (metrics.getTxnToWriteIdCount() >=
+        MetastoreConf.getIntVar(conf, MetastoreConf.ConfVars.COMPACTOR_TXN_TO_WRITEID_RECORD_THRESHOLD_ERROR)) {
+      LOG.error("An excessive amount of (" + metrics.getTxnToWriteIdCount() + ") Hive ACID metadata found in " +
+          "TXN_TO_WRITEID table, which can cause serious performance degradation.");
+    }
 
     Metrics.getOrCreateGauge(NUM_COMPLETED_TXN_COMPONENTS).set(metrics.getCompletedTxnsCount());
-
+    if (metrics.getCompletedTxnsCount() >=
+        MetastoreConf.getIntVar(conf,
+            MetastoreConf.ConfVars.COMPACTOR_COMPLETED_TXN_COMPONENTS_RECORD_THRESHOLD_WARNING) &&
+

[jira] [Work logged] (HIVE-25350) Replication fails for external tables on setting owner/groups

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25350?focusedWorklogId=625304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625304
 ]

ASF GitHub Bot logged work on HIVE-25350:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 11:51
Start Date: 20/Jul/21 11:51
Worklog Time Spent: 10m 
  Work Description: ayushtkn opened a new pull request #2498:
URL: https://github.com/apache/hive/pull/2498


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625304)
Time Spent: 0.5h  (was: 20m)

> Replication fails for external tables on setting owner/groups
> -
>
> Key: HIVE-25350
> URL: https://issues.apache.org/jira/browse/HIVE-25350
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> DirCopyTask tries to preserve user/group permissions, irrespective of whether 
> they have been specified to be preserved or not.
> Changing the user/group requires superuser privileges, hence the task fails.
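
A minimal sketch of the guard the fix implies, assuming Hadoop FileSystem semantics (the preserveUserGroup flag and the surrounding variables are illustrative, not the actual DirCopyTask code):

{code:java}
// Sketch: only chown the replicated directory when preservation was explicitly
// requested, since FileSystem.setOwner() needs superuser privileges on the target.
if (preserveUserGroup) {
  FileStatus srcStatus = srcFs.getFileStatus(srcPath);
  dstFs.setOwner(dstPath, srcStatus.getOwner(), srcStatus.getGroup());
}
{code}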



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25190) BytesColumnVector fails when the aggregate size is > 1gb

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25190?focusedWorklogId=625300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625300
 ]

ASF GitHub Bot logged work on HIVE-25190:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 11:51
Start Date: 20/Jul/21 11:51
Worklog Time Spent: 10m 
  Work Description: omalley closed pull request #2408:
URL: https://github.com/apache/hive/pull/2408


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625300)
Time Spent: 2h 50m  (was: 2h 40m)

> BytesColumnVector fails when the aggregate size is > 1gb
> 
>
> Key: HIVE-25190
> URL: https://issues.apache.org/jira/browse/HIVE-25190
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Currently, BytesColumnVector will allocate a buffer for small values (< 1mb), 
> but fail with:
> {code:java}
> new RuntimeException("Overflow of newLength. smallBuffer.length="
> + smallBuffer.length + ", nextElemLength=" + nextElemLength);
> {code}
> if the aggregate size of the buffer crosses over 1gb. 
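
For context, a minimal, self-contained sketch of the arithmetic behind that exception, assuming the usual grow-by-doubling strategy (the field names are illustrative, not the actual BytesColumnVector internals):

{code:java}
public class OverflowSketch {
  public static void main(String[] args) {
    int bufferLength = 1 << 30;        // the shared buffer has already grown to 1 GiB
    int nextElemLength = 16;           // even a tiny next value forces another doubling
    int newLength = bufferLength * 2;  // 2^31 overflows a signed int to a negative value
    if (newLength < bufferLength + nextElemLength) { // true, because newLength is negative
      throw new RuntimeException("Overflow of newLength. smallBuffer.length="
          + bufferLength + ", nextElemLength=" + nextElemLength);
    }
  }
}
{code}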



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24555) Improve Hive Class Logging/Error Handling

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24555?focusedWorklogId=625264&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625264
 ]

ASF GitHub Bot logged work on HIVE-24555:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 11:46
Start Date: 20/Jul/21 11:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1801:
URL: https://github.com/apache/hive/pull/1801


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625264)
Time Spent: 1h 20m  (was: 1h 10m)

> Improve Hive Class Logging/Error Handling
> -
>
> Key: HIVE-24555
> URL: https://issues.apache.org/jira/browse/HIVE-24555
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> * Do not log-and-throw
> * Pass relevant error message
> * Do not pass the message from the caught exception up the chain, it will be 
> passed up as part of the 'caused by' chain
> * StringBuffer/StringBuilder
> * Use anchors in DEBUG logging



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25349) Skip password authentication when a trusted header is present in the Http request

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25349?focusedWorklogId=625253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625253
 ]

ASF GitHub Bot logged work on HIVE-25349:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 11:45
Start Date: 20/Jul/21 11:45
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera opened a new pull request #2496:
URL: https://github.com/apache/hive/pull/2496


   …http request
   
   
   
   ### What changes were proposed in this pull request?
   Skip password-based authentication when a trusted proxy header is present in the HTTP request.
   
   
   
   ### Why are the changes needed?
   We don't need to authenticate again, since this trusted header indicates that the user was authenticated beforehand, e.g. by the Knox service.
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   
   ### How was this patch tested?
   Local machine, remote cluster.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625253)
Time Spent: 0.5h  (was: 20m)

> Skip password authentication when a trusted header is present in the Http 
> request
> -
>
> Key: HIVE-25349
> URL: https://issues.apache.org/jira/browse/HIVE-25349
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, HiveServer2
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available, security-review-needed
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Whenever a trusted header is present in the HTTP servlet request, skip the 
> password-based authentication, since the user is pre-authenticated, and extract 
> the user name from the Authorization header.
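
A minimal sketch of the described flow, with an assumed header name and a simplified user extraction (neither is the actual HiveServer2 code):

{code:java}
import java.util.Base64;
import javax.servlet.http.HttpServletRequest;

final class TrustedHeaderAuthSketch {
  private static final String TRUSTED_HEADER = "X-Trusted-Proxy"; // hypothetical name

  /** Returns the user name if pre-authenticated; otherwise signals the normal path. */
  static String authenticate(HttpServletRequest request) {
    if (request.getHeader(TRUSTED_HEADER) != null) {
      // Pre-authenticated upstream (e.g. by Knox): skip password checks and
      // take the user name from the standard Basic Authorization header.
      String auth = request.getHeader("Authorization");  // "Basic base64(user:pass)"
      String decoded = new String(Base64.getDecoder().decode(auth.substring("Basic ".length())));
      return decoded.substring(0, decoded.indexOf(':'));
    }
    throw new SecurityException("no trusted header: fall back to password authentication");
  }
}
{code}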



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25190) BytesColumnVector fails when the aggregate size is > 1gb

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25190?focusedWorklogId=625257&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625257
 ]

ASF GitHub Bot logged work on HIVE-25190:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 11:45
Start Date: 20/Jul/21 11:45
Worklog Time Spent: 10m 
  Work Description: omalley commented on pull request #2408:
URL: https://github.com/apache/hive/pull/2408#issuecomment-882883266


   Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625257)
Time Spent: 2h 40m  (was: 2.5h)

> BytesColumnVector fails when the aggregate size is > 1gb
> 
>
> Key: HIVE-25190
> URL: https://issues.apache.org/jira/browse/HIVE-25190
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Currently, BytesColumnVector will allocate a buffer for small values (< 1mb), 
> but fail with:
> {code:java}
> new RuntimeException("Overflow of newLength. smallBuffer.length="
> + smallBuffer.length + ", nextElemLength=" + nextElemLength);
> {code}
> if the aggregate size of the buffer crosses over 1gb. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22626) Fix Replication related tests

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22626?focusedWorklogId=625238&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625238
 ]

ASF GitHub Bot logged work on HIVE-22626:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 11:44
Start Date: 20/Jul/21 11:44
Worklog Time Spent: 10m 
  Work Description: aasha merged pull request #2452:
URL: https://github.com/apache/hive/pull/2452


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625238)
Time Spent: 1.5h  (was: 1h 20m)

> Fix Replication related tests
> -
>
> Key: HIVE-22626
> URL: https://issues.apache.org/jira/browse/HIVE-22626
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Zoltan Haindrich
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Attachments: qalogs.tgz
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> For TestStatsReplicationScenariosACIDNoAutogather:
> this test is running "alone", but it still sometimes runs for more than 
> 40m, which results in a timeout.
>  A jira search reveals that this was pretty common: 
>  
> [https://issues.apache.org/jira/issues/?jql=text%20~%20%22TestStatsReplicationScenariosACIDNoAutogather%22%20order%20by%20updated%20desc]
> from the hive logs:
>  * it seems like after a few minutes this test starts there is an exception:
> {code:java}
> 2019-12-10T22:43:19,594 DEBUG [Finalizer] metastore.HiveMetaStoreClient: 
> Unable to shutdown metastore client. Will try closing transport directly.
> org.apache.thrift.transport.TTransportException: java.net.SocketException: 
> Socket closed
> at 
> org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
>  ~[libthrift-0.9.3-1.jar:0.9.3-1]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73) 
> ~[libthrift-0.9.3-1.jar:0.9.3-1]
> at 
> org.apache.thrift.TServiceClient.sendBaseOneway(TServiceClient.java:66) 
> ~[libthrift-0.9.3-1.jar:0.9.3-1]
> at 
> com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:436)
>  ~[libfb303-0.9.3.jar:?]
> at 
> com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:430) 
> ~[libfb303-0.9.3.jar:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:776)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_102]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_102]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_102]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102]
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
>  [hive-standalone-metastore-common-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at com.sun.proxy.$Proxy62.close(Unknown Source) [?:?]
> at org.apache.hadoop.hive.ql.metadata.Hive.close(Hive.java:542) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.metadata.Hive.finalize(Hive.java:514) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.lang.System$2.invokeFinalize(System.java:1270) [?:1.8.0_102]
> at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98) 
> [?:1.8.0_102]
> at java.lang.ref.Finalizer.access$100(Finalizer.java:34) [?:1.8.0_102]
> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210) 
> [?:1.8.0_102]
> Caused by: java.net.SocketException: Socket closed
> at 
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116) 
> ~[?:1.8.0_102]
> at java.net.SocketOutputStream.write(SocketOutputStream.java:153) 
> ~[?:1.8.0_102]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[?:1.8.0_102]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[?:1.8.0_102]
> at 
> org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:159)
>  ~[libthrift-0.9.3-1.jar:0.9.3-1]
> {code}
>  * after that some NoSuchObjectExceptions follow
>  * and then some replications seems to happen
> I don't fully understand this; I'll attach the logs...



--
This message was sent by Atlassian Jira

[jira] [Commented] (HIVE-24467) ConditionalTask remove tasks that not selected exists thread safety problem

2021-07-20 Thread Xi Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384052#comment-17384052
 ] 

Xi Chen commented on HIVE-24467:


We also ran into this problem, and the result is lost data.

Our query is a dynamic INSERT OVERWRITE with UNION ALL, of the form:
{code:java}
INSERT OVERWRITE TABLE dest_table PARTITION(...)
SELECT ... FROM
(
  SELECT ... FROM table_a JOIN table_b ...
  UNION ALL
  SELECT ... FROM table_a WHERE ...
  UNION ALL
  SELECT ... FROM table_c JOIN table_d ...
  UNION ALL
  SELECT ... FROM table_c WHERE ...
) mid
JOIN table_e ...;{code}
The stage dependencies are:
{code:java}
STAGE DEPENDENCIES:
  Stage-5 is a root stage
  Stage-6 depends on stages: Stage-5
  Stage-22 depends on stages: Stage-6 , consists of Stage-26, Stage-1
  Stage-26 has a backup stage: Stage-1
  Stage-21 depends on stages: Stage-26
  Stage-25 depends on stages: Stage-1, Stage-12, Stage-21, Stage-23
  Stage-20 depends on stages: Stage-25
  Stage-0 depends on stages: Stage-20
  Stage-1
  Stage-14 is a root stage
  Stage-15 depends on stages: Stage-14
  Stage-24 depends on stages: Stage-15 , consists of Stage-27, Stage-12
  Stage-27 has a backup stage: Stage-12
  Stage-23 depends on stages: Stage-27
  Stage-12
{code}
The problem is triggered in this way:
 # Both Stage-22 and Stage-24 are ConditionalTasks and contain a mapjoin. 
 # Their dependent tasks Stage-6 and Stage-15 have similar input data sizes and 
finish at the same time. 
 # Thus the two ConditionalTasks start at the same time and hit this race 
condition, so the backup stages Stage-1 and Stage-12 are not correctly removed 
from Stage-25's dependency list. 
 # Then Stage-25, Stage-20 and Stage-0 never trigger. 
 # Stage-0 is a MoveTask, so the data is totally lost while the query still 
succeeds! (A sketch of the race follows below.)
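
To make the race concrete, here is a self-contained sketch (illustrative names, not the actual ConditionalTask internals): two resolver threads prune their unselected backup stages from the same unsynchronized dependency list, and the interleaved removals can leave a stale parent behind.

{code:java}
import java.util.ArrayList;
import java.util.List;

public class ConditionalTaskRaceSketch {
  static class StageTask {
    final String name;
    final List<StageTask> parentTasks = new ArrayList<>(); // plain ArrayList: not thread-safe
    StageTask(String name) { this.name = name; }
  }

  public static void main(String[] args) throws InterruptedException {
    StageTask stage25 = new StageTask("Stage-25");
    StageTask stage1 = new StageTask("Stage-1");   // backup stage of one ConditionalTask
    StageTask stage12 = new StageTask("Stage-12"); // backup stage of the other
    stage25.parentTasks.add(stage1);
    stage25.parentTasks.add(stage12);

    // Two ConditionalTasks resolve at the same time; each removes its own
    // unselected backup stage from the shared parent list without locking.
    Thread t1 = new Thread(() -> stage25.parentTasks.remove(stage1));
    Thread t2 = new Thread(() -> stage25.parentTasks.remove(stage12));
    t1.start();
    t2.start();
    t1.join();
    t2.join();

    // Concurrent ArrayList.remove() calls can interleave so that one removal is
    // lost; the stale parent then keeps Stage-25 (and Stage-20, Stage-0) out of
    // the runnable list forever. Run this in a loop to observe a non-zero size.
    System.out.println("parents left on Stage-25: " + stage25.parentTasks.size());
  }
}
{code}
Synchronizing on the shared task graph while pruning (or using a concurrent collection) avoids the lost update.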

The last stdout of the hive query that lost data is:

!image-2021-07-20-18-22-20-218.png!
While the normal output should be:

!image-2021-07-20-18-24-10-716.png!

> ConditionalTask remove tasks that not selected exists thread safety problem
> ---
>
> Key: HIVE-24467
> URL: https://issues.apache.org/jira/browse/HIVE-24467
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.4
>Reporter: guojh
>Assignee: guojh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When hive execute jobs in parallel(control by “hive.exec.parallel” 
> parameter), ConditionalTasks  remove the tasks that not selected in parallel, 
> because there are thread safety issues, some task may not remove from the 
> dependent task tree. This is a very serious bug, which causes some stage task 
> not trigger execution.
> In our production cluster, the query run three conditional task in parallel, 
> after apply the patch of HIVE-21638, we found Stage-3 is miss and not submit 
> to runnable list for his parent Stage-31 is not done. But Stage-31 should 
> removed for it not selected.
> Stage dependencies is below:
> {code:java}
> STAGE DEPENDENCIES:
>   Stage-41 is a root stage
>   Stage-26 depends on stages: Stage-41
>   Stage-25 depends on stages: Stage-26 , consists of Stage-39, Stage-40, 
> Stage-2
>   Stage-39 has a backup stage: Stage-2
>   Stage-23 depends on stages: Stage-39
>   Stage-3 depends on stages: Stage-2, Stage-12, Stage-16, Stage-20, Stage-23, 
> Stage-24, Stage-27, Stage-28, Stage-31, Stage-32, Stage-35, Stage-36
>   Stage-8 depends on stages: Stage-3 , consists of Stage-5, Stage-4, Stage-6
>   Stage-5
>   Stage-0 depends on stages: Stage-5, Stage-4, Stage-7
>   Stage-51 depends on stages: Stage-0
>   Stage-4
>   Stage-6
>   Stage-7 depends on stages: Stage-6
>   Stage-40 has a backup stage: Stage-2
>   Stage-24 depends on stages: Stage-40
>   Stage-2
>   Stage-44 is a root stage
>   Stage-30 depends on stages: Stage-44
>   Stage-29 depends on stages: Stage-30 , consists of Stage-42, Stage-43, 
> Stage-12
>   Stage-42 has a backup stage: Stage-12
>   Stage-27 depends on stages: Stage-42
>   Stage-43 has a backup stage: Stage-12
>   Stage-28 depends on stages: Stage-43
>   Stage-12
>   Stage-47 is a root stage
>   Stage-34 depends on stages: Stage-47
>   Stage-33 depends on stages: Stage-34 , consists of Stage-45, Stage-46, 
> Stage-16
>   Stage-45 has a backup stage: Stage-16
>   Stage-31 depends on stages: Stage-45
>   Stage-46 has a backup stage: Stage-16
>   Stage-32 depends on stages: Stage-46
>   Stage-16
>   Stage-50 is a root stage
>   Stage-38 depends on stages: Stage-50
>   Stage-37 depends on stages: Stage-38 , consists of Stage-48, Stage-49, 
> Stage-20
>   Stage-48 has a backup stage: Stage-20
>   Stage-35 depends on stages: Stage-48
>   Stage-49 has a backup stage: Stage-20
>   Stage-36 depends on 

[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=625157&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625157
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:14
Start Date: 20/Jul/21 10:14
Worklog Time Spent: 10m 
  Work Description: lcspinter commented on a change in pull request #2495:
URL: https://github.com/apache/hive/pull/2495#discussion_r672320320



##
File path: ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java
##
@@ -604,4 +605,14 @@ private void assertOneTxn() throws Exception {
     Assert.assertEquals(TestTxnDbUtil.queryToString(hiveConf, "select * from TXNS"), 1,
         TestTxnDbUtil.countQueryAgent(hiveConf, "select count(*) from TXNS"));
   }
+
+  @Test public void testWritesToDisabledCompactionTableCtas() throws Exception {

Review comment:
   nit: method declaration in new line




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625157)
Time Spent: 2h 10m  (was: 2h)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
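
One shape the skip could take, based on the description above (a sketch with simplified listener internals, not necessarily the committed patch):

{code:java}
// Sketch only: the real onAllocWriteId() signature has more parameters.
public void onAllocWriteId(AllocWriteIdEvent allocWriteIdEvent, Connection dbConn) throws MetaException {
  Table table = getTable(allocWriteIdEvent);
  if (table == null) {
    // CTAS: the write id is allocated before the table object exists, so there
    // are no tblproperties to check; skip the metric and avoid the NPE.
    return;
  }
  if (MetaStoreUtils.isNoAutoCompactSet(table.getParameters())) {
    Metrics.getOrCreateGauge(MetricsConstants.WRITES_TO_DISABLED_COMPACTION_TABLE).incrementAndGet();
  }
}
{code}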



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=625138&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625138
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:11
Start Date: 20/Jul/21 10:11
Worklog Time Spent: 10m 
  Work Description: klcopp closed pull request #2495:
URL: https://github.com/apache/hive/pull/2495






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625138)
Time Spent: 2h  (was: 1h 50m)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25115) Compaction queue entries may accumulate in "ready for cleaning" state

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25115?focusedWorklogId=625135&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625135
 ]

ASF GitHub Bot logged work on HIVE-25115:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:10
Start Date: 20/Jul/21 10:10
Worklog Time Spent: 10m 
  Work Description: klcopp commented on a change in pull request #2277:
URL: https://github.com/apache/hive/pull/2277#discussion_r672892362



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -282,15 +282,12 @@ private ValidReaderWriteIdList getValidCleanerWriteIdList(CompactionInfo ci, Tab
     assert rsp != null && rsp.getTblValidWriteIdsSize() == 1;
     ValidReaderWriteIdList validWriteIdList =
         TxnCommonUtils.createValidReaderWriteIdList(rsp.getTblValidWriteIds().get(0));
-    boolean delayedCleanupEnabled = conf.getBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
-    if (delayedCleanupEnabled) {
-      /*
-       * If delayed cleanup enabled, we need to filter the obsoletes dir list, to only remove directories that were made obsolete by this compaction
-       * If we have a higher retentionTime it is possible for a second compaction to run on the same partition. Cleaning up the first compaction
-       * should not touch the newer obsolete directories to not to violate the retentionTime for those.
-       */
-      validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);
-    }
+    /*
+     * We need to filter the obsoletes dir list, to only remove directories that were made obsolete by this compaction
+     * If we have a higher retentionTime it is possible for a second compaction to run on the same partition. Cleaning up the first compaction
+     * should not touch the newer obsolete directories to not to violate the retentionTime for those.
+     */
+    validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);

Review comment:
   How do we know that ci.highestWriteId's txn <= the min open txn the 
cleaner uses, if MIN_HISTORY_LEVEL is still used?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -282,15 +282,12 @@ private ValidReaderWriteIdList getValidCleanerWriteIdList(CompactionInfo ci, Tab
     assert rsp != null && rsp.getTblValidWriteIdsSize() == 1;
     ValidReaderWriteIdList validWriteIdList =
         TxnCommonUtils.createValidReaderWriteIdList(rsp.getTblValidWriteIds().get(0));
-    boolean delayedCleanupEnabled = conf.getBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
-    if (delayedCleanupEnabled) {
-      /*
-       * If delayed cleanup enabled, we need to filter the obsoletes dir list, to only remove directories that were made obsolete by this compaction
-       * If we have a higher retentionTime it is possible for a second compaction to run on the same partition. Cleaning up the first compaction
-       * should not touch the newer obsolete directories to not to violate the retentionTime for those.
-       */
-      validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);
-    }
+    /*
+     * We need to filter the obsoletes dir list, to only remove directories that were made obsolete by this compaction
+     * If we have a higher retentionTime it is possible for a second compaction to run on the same partition. Cleaning up the first compaction
+     * should not touch the newer obsolete directories to not to violate the retentionTime for those.
+     */
+    validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);

Review comment:
   Yes, ci.highestWriteId = the highest write id that was compacted.
   So if we have this after compaction:
   delta_1_1
   delta_2_2
   delta_3_3
   base_3
   ci.highestWriteId=3, so the cleaner will remove (assuming MIN_HISTORY_LEVEL 
is still being used) : 
   delta_1_1
   delta_2_2
   delta_3_3
   But how do we know those can be removed?
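
For reference, a small sketch of what updateHighWatermark() buys the cleaner. The constructor arguments follow ValidReaderWriteIdList's public signature (table name, exception write ids, aborted bits, high watermark); the delta names are illustrative:

{code:java}
import java.util.BitSet;
import org.apache.hadoop.hive.common.ValidReaderWriteIdList;

public class HighWatermarkSketch {
  public static void main(String[] args) {
    // All write ids up to 10 are valid for this table; no open/aborted exceptions.
    ValidReaderWriteIdList writeIds =
        new ValidReaderWriteIdList("db.tbl", new long[0], new BitSet(), 10);

    // The compaction recorded highestWriteId = 3, so the cleaner lowers the
    // high watermark before collecting obsolete directories.
    ValidReaderWriteIdList filtered = writeIds.updateHighWatermark(3);

    System.out.println(filtered.isWriteIdValid(2)); // true:  delta_2_2 was covered by the compaction
    System.out.println(filtered.isWriteIdValid(4)); // false: delta_4_4 is newer and must survive
  }
}
{code}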




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625135)
Time Spent: 2h 10m  (was: 2h)

> Compaction queue entries may accumulate in "ready for cleaning" state
> -
>
> Key: HIVE-25115
> URL: https://issues.apache.org/jira/browse/HIVE-25115
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time 

[jira] [Work logged] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?focusedWorklogId=625136&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625136
 ]

ASF GitHub Bot logged work on HIVE-25276:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:10
Start Date: 20/Jul/21 10:10
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2419:
URL: https://github.com/apache/hive/pull/2419#discussion_r672164500



##
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -233,15 +237,21 @@ public void preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
   preAlterTableProperties.tableLocation = sd.getLocation();
   preAlterTableProperties.format = sd.getInputFormat();
   preAlterTableProperties.schema = schema(catalogProperties, hmsTable);
-  preAlterTableProperties.spec = spec(conf, preAlterTableProperties.schema, catalogProperties, hmsTable);
   preAlterTableProperties.partitionKeys = hmsTable.getPartitionKeys();
 
   context.getProperties().put(HiveMetaHook.ALLOW_PARTITION_KEY_CHANGE, "true");
   // If there are partition keys specified remove them from the HMS table and add them to the column list
-  if (hmsTable.isSetPartitionKeys()) {
+  if (hmsTable.isSetPartitionKeys() && !hmsTable.getPartitionKeys().isEmpty()) {
+    List spec = PartitionTransform.getPartitionTransformSpec(hmsTable.getPartitionKeys());
+    if (!SessionStateUtil.addResource(conf, hive_metastoreConstants.PARTITION_TRANSFORM_SPEC, spec)) {
+      throw new MetaException("Query state attached to Session state must be not null. " +
+          "Partition transform metadata cannot be saved.");
+    }
     hmsTable.getSd().getCols().addAll(hmsTable.getPartitionKeys());
     hmsTable.setPartitionKeysIsSet(false);
   }
+  preAlterTableProperties.spec = spec(conf, preAlterTableProperties.schema, hmsTable);

Review comment:
   Is this needed here? Or if we use it only for the validation inside 
`spec()`, then maybe remove the local variable, or better yet, extract the 
validation logic into a different method we can call here?

##
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -233,15 +237,21 @@ public void preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
   preAlterTableProperties.tableLocation = sd.getLocation();
   preAlterTableProperties.format = sd.getInputFormat();
   preAlterTableProperties.schema = schema(catalogProperties, hmsTable);
-  preAlterTableProperties.spec = spec(conf, preAlterTableProperties.schema, catalogProperties, hmsTable);
   preAlterTableProperties.partitionKeys = hmsTable.getPartitionKeys();
 
   context.getProperties().put(HiveMetaHook.ALLOW_PARTITION_KEY_CHANGE, "true");
   // If there are partition keys specified remove them from the HMS table and add them to the column list
-  if (hmsTable.isSetPartitionKeys()) {
+  if (hmsTable.isSetPartitionKeys() && !hmsTable.getPartitionKeys().isEmpty()) {
+    List spec = PartitionTransform.getPartitionTransformSpec(hmsTable.getPartitionKeys());
+    if (!SessionStateUtil.addResource(conf, hive_metastoreConstants.PARTITION_TRANSFORM_SPEC, spec)) {

Review comment:
   Where was this stored before we started saving it into the session conf?

##
File path: 
iceberg/iceberg-handler/src/test/results/positive/vectorized_iceberg_read.q.out
##
@@ -129,17 +129,17 @@ Stage-0
 Stage-1
   Reducer 2 vectorized
   File Output Operator [FS_11]
-Select Operator [SEL_10] (rows=1 width=564)
+Select Operator [SEL_10] (rows=1 width=372)

Review comment:
   Out of curiosity: do you know why the width has decreased so much?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625136)
Time Spent: 4.5h  (was: 4h 20m)

> Enable automatic statistics generation for Iceberg tables
> -
>
> Key: HIVE-25276
> URL: https://issues.apache.org/jira/browse/HIVE-25276
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> During inserts we should calculate the column statistics



--
This 

[jira] [Work logged] (HIVE-24945) PTF: Support vectorization for lead/lag functions

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24945?focusedWorklogId=625137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625137
 ]

ASF GitHub Bot logged work on HIVE-24945:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:11
Start Date: 20/Jul/21 10:11
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on pull request #2278:
URL: https://github.com/apache/hive/pull/2278#issuecomment-883205928


   @ramesh0201 : rebased and fixed patch passed precommit testing, PR changes 
are in 
https://github.com/apache/hive/pull/2278/commits/81592ac9909d1e2875dfdcb2e2e86f1f597062b3


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625137)
Time Spent: 1h 40m  (was: 1.5h)

> PTF: Support vectorization for lead/lag functions
> -
>
> Key: HIVE-24945
> URL: https://issues.apache.org/jira/browse/HIVE-24945
> Project: Hive
>  Issue Type: Sub-task
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?focusedWorklogId=625124&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625124
 ]

ASF GitHub Bot logged work on HIVE-25276:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:09
Start Date: 20/Jul/21 10:09
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2419:
URL: https://github.com/apache/hive/pull/2419#discussion_r672435549



##
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -233,15 +237,21 @@ public void preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
   preAlterTableProperties.tableLocation = sd.getLocation();
   preAlterTableProperties.format = sd.getInputFormat();
   preAlterTableProperties.schema = schema(catalogProperties, hmsTable);
-  preAlterTableProperties.spec = spec(conf, preAlterTableProperties.schema, catalogProperties, hmsTable);
   preAlterTableProperties.partitionKeys = hmsTable.getPartitionKeys();
 
   context.getProperties().put(HiveMetaHook.ALLOW_PARTITION_KEY_CHANGE, "true");
   // If there are partition keys specified remove them from the HMS table and add them to the column list
-  if (hmsTable.isSetPartitionKeys()) {
+  if (hmsTable.isSetPartitionKeys() && !hmsTable.getPartitionKeys().isEmpty()) {
+    List spec = PartitionTransform.getPartitionTransformSpec(hmsTable.getPartitionKeys());
+    if (!SessionStateUtil.addResource(conf, hive_metastoreConstants.PARTITION_TRANSFORM_SPEC, spec)) {
+      throw new MetaException("Query state attached to Session state must be not null. " +
+          "Partition transform metadata cannot be saved.");
+    }
     hmsTable.getSd().getCols().addAll(hmsTable.getPartitionKeys());
     hmsTable.setPartitionKeysIsSet(false);
   }
+  preAlterTableProperties.spec = spec(conf, preAlterTableProperties.schema, hmsTable);

Review comment:
   This is moved from line 236. We need it to be set, but we have to do it after we have the correct spec.

##
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -233,15 +237,21 @@ public void preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
   preAlterTableProperties.tableLocation = sd.getLocation();
   preAlterTableProperties.format = sd.getInputFormat();
   preAlterTableProperties.schema = schema(catalogProperties, hmsTable);
-  preAlterTableProperties.spec = spec(conf, preAlterTableProperties.schema, catalogProperties, hmsTable);
   preAlterTableProperties.partitionKeys = hmsTable.getPartitionKeys();
 
   context.getProperties().put(HiveMetaHook.ALLOW_PARTITION_KEY_CHANGE, "true");
   // If there are partition keys specified remove them from the HMS table and add them to the column list
-  if (hmsTable.isSetPartitionKeys()) {
+  if (hmsTable.isSetPartitionKeys() && !hmsTable.getPartitionKeys().isEmpty()) {
+    List spec = PartitionTransform.getPartitionTransformSpec(hmsTable.getPartitionKeys());
+    if (!SessionStateUtil.addResource(conf, hive_metastoreConstants.PARTITION_TRANSFORM_SPEC, spec)) {

Review comment:
   This is for migrating non-Iceberg tables to Iceberg tables. Previously we just depended on the partition cols; from now on we need to have the data in the `SessionState` instead, so we put it there.

##
File path: 
iceberg/iceberg-handler/src/test/results/positive/vectorized_iceberg_read.q.out
##
@@ -129,17 +129,17 @@ Stage-0
 Stage-1
   Reducer 2 vectorized
   File Output Operator [FS_11]
-Select Operator [SEL_10] (rows=1 width=564)
+Select Operator [SEL_10] (rows=1 width=372)

Review comment:
   TBH I am not sure, but I expect that has something to do with the new 
statistics

##
File path: ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezOutputCommitter.java
##
@@ -122,6 +122,7 @@ private IDriver getDriverWithCommitter(String committerClass) {
     conf.setVar(HiveConf.ConfVars.HIVE_AUTHORIZATION_MANAGER,
         "org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory");
     conf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, false);
+    conf.setBoolVar(HiveConf.ConfVars.HIVESTATSCOLAUTOGATHER, false);

Review comment:
   Otherwise the tests fail, because with stats turned on we generate 2 tasks instead of 1 (the execution plans change and contain an extra stage).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, 

[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=625114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625114
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:08
Start Date: 20/Jul/21 10:08
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #2495:
URL: https://github.com/apache/hive/pull/2495






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625114)
Time Spent: 1h 50m  (was: 1h 40m)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25115) Compaction queue entries may accumulate in "ready for cleaning" state

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25115?focusedWorklogId=625120&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625120
 ]

ASF GitHub Bot logged work on HIVE-25115:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:08
Start Date: 20/Jul/21 10:08
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2277:
URL: https://github.com/apache/hive/pull/2277#discussion_r672900959



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -282,15 +282,12 @@ private ValidReaderWriteIdList getValidCleanerWriteIdList(CompactionInfo ci, Tab
     assert rsp != null && rsp.getTblValidWriteIdsSize() == 1;
     ValidReaderWriteIdList validWriteIdList =
         TxnCommonUtils.createValidReaderWriteIdList(rsp.getTblValidWriteIds().get(0));
-    boolean delayedCleanupEnabled = conf.getBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
-    if (delayedCleanupEnabled) {
-      /*
-       * If delayed cleanup enabled, we need to filter the obsoletes dir list, to only remove directories that were made obsolete by this compaction
-       * If we have a higher retentionTime it is possible for a second compaction to run on the same partition. Cleaning up the first compaction
-       * should not touch the newer obsolete directories to not to violate the retentionTime for those.
-       */
-      validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);
-    }
+    /*
+     * We need to filter the obsoletes dir list, to only remove directories that were made obsolete by this compaction
+     * If we have a higher retentionTime it is possible for a second compaction to run on the same partition. Cleaning up the first compaction
+     * should not touch the newer obsolete directories to not to violate the retentionTime for those.
+     */
+    validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);

Review comment:
   Not sure I got the question, but highestWriteId is recorded at the time when the compaction txn starts, so it records all open txns that have to be ignored.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -282,15 +282,12 @@ private ValidReaderWriteIdList getValidCleanerWriteIdList(CompactionInfo ci, Tab
     assert rsp != null && rsp.getTblValidWriteIdsSize() == 1;
     ValidReaderWriteIdList validWriteIdList =
         TxnCommonUtils.createValidReaderWriteIdList(rsp.getTblValidWriteIds().get(0));
-    boolean delayedCleanupEnabled = conf.getBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
-    if (delayedCleanupEnabled) {
-      /*
-       * If delayed cleanup enabled, we need to filter the obsoletes dir list, to only remove directories that were made obsolete by this compaction
-       * If we have a higher retentionTime it is possible for a second compaction to run on the same partition. Cleaning up the first compaction
-       * should not touch the newer obsolete directories to not to violate the retentionTime for those.
-       */
-      validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);
-    }
+    /*
+     * We need to filter the obsoletes dir list, to only remove directories that were made obsolete by this compaction
+     * If we have a higher retentionTime it is possible for a second compaction to run on the same partition. Cleaning up the first compaction
+     * should not touch the newer obsolete directories to not to violate the retentionTime for those.
+     */
+    validWriteIdList = validWriteIdList.updateHighWatermark(ci.highestWriteId);

Review comment:
   Not sure I got the question, but highestWriteId is recorded at the time when the compaction txn starts, so it records the writeid hwm and all open txns below it that have to be ignored.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -282,15 +282,12 @@ private ValidReaderWriteIdList getValidCleanerWriteIdList(CompactionInfo ci, Tab
     assert rsp != null && rsp.getTblValidWriteIdsSize() == 1;
     ValidReaderWriteIdList validWriteIdList =
         TxnCommonUtils.createValidReaderWriteIdList(rsp.getTblValidWriteIds().get(0));
-    boolean delayedCleanupEnabled = conf.getBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
-    if (delayedCleanupEnabled) {
-      /*
-       * If delayed cleanup enabled, we need to filter the obsoletes dir list, to only remove directories that were made obsolete by this compaction
-       * If we have a higher retentionTime it is possible for a second compaction to run on the same partition. Cleaning up the first compaction
-       * should not touch the newer obsolete directories to not to violate the retentionTime for those.
-       */
-      validWriteIdList = 

[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=625111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625111
 ]

ASF GitHub Bot logged work on HIVE-25345:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:07
Start Date: 20/Jul/21 10:07
Worklog Time Spent: 10m 
  Work Description: lcspinter opened a new pull request #2493:
URL: https://github.com/apache/hive/pull/2493


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625111)
Time Spent: 40m  (was: 0.5h)

> Add logging based on new compaction metrics
> ---
>
> Key: HIVE-25345
> URL: https://issues.apache.org/jira/browse/HIVE-25345
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25336) Use single call to get tables in DropDatabaseAnalyzer

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25336?focusedWorklogId=625105&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625105
 ]

ASF GitHub Bot logged work on HIVE-25336:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:06
Start Date: 20/Jul/21 10:06
Worklog Time Spent: 10m 
  Work Description: aasha merged pull request #2481:
URL: https://github.com/apache/hive/pull/2481


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625105)
Time Spent: 0.5h  (was: 20m)

> Use single call to get tables in DropDatabaseAnalyzer
> -
>
> Key: HIVE-25336
> URL: https://issues.apache.org/jira/browse/HIVE-25336
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Optimise 
> org.apache.hadoop.hive.ql.ddl.database.drop.DropDatabaseAnalyzer.analyzeInternal(DropDatabaseAnalyzer.java:61),
>  where it fetches the full table objects one by one. Move to a single call. This 
> could save around 20+ seconds when a large number of tables is present.
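
For illustration, a minimal sketch of the single-call idea (getAllTables and 
getTableObjectsByName are IMetaStoreClient methods; the wrapper itself is 
hypothetical, not the actual patch):

{code:java}
import java.util.List;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Table;

public class BulkTableFetchSketch {
  static List<Table> tablesOf(IMetaStoreClient msc, String dbName) throws Exception {
    List<String> names = msc.getAllTables(dbName);
    // Before: one metastore round trip per table.
    //   for (String n : names) { tables.add(msc.getTable(dbName, n)); }
    // After: a single bulk call for the whole database.
    return msc.getTableObjectsByName(dbName, names);
  }
}
{code}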



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25276) Enable automatic statistics generation for Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25276?focusedWorklogId=625098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625098
 ]

ASF GitHub Bot logged work on HIVE-25276:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:05
Start Date: 20/Jul/21 10:05
Worklog Time Spent: 10m 
  Work Description: szlta commented on a change in pull request #2419:
URL: https://github.com/apache/hive/pull/2419#discussion_r672325104



##
File path: 
ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezOutputCommitter.java
##
@@ -122,6 +122,7 @@ private IDriver getDriverWithCommitter(String 
committerClass) {
 conf.setVar(HiveConf.ConfVars.HIVE_AUTHORIZATION_MANAGER,
 
"org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory");
 conf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, false);
+conf.setBoolVar(HiveConf.ConfVars.HIVESTATSCOLAUTOGATHER, false);

Review comment:
   Why is this required here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625098)
Time Spent: 4h 10m  (was: 4h)

> Enable automatic statistics generation for Iceberg tables
> -
>
> Key: HIVE-25276
> URL: https://issues.apache.org/jira/browse/HIVE-25276
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> During inserts we should calculate the column statistics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25209) SELECT query with SUM function producing unexpected result

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25209?focusedWorklogId=625079&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625079
 ]

ASF GitHub Bot logged work on HIVE-25209:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:03
Start Date: 20/Jul/21 10:03
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #2360:
URL: https://github.com/apache/hive/pull/2360


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625079)
Time Spent: 1h 40m  (was: 1.5h)

> SELECT query with SUM function producing unexpected result
> --
>
> Key: HIVE-25209
> URL: https://issues.apache.org/jira/browse/HIVE-25209
> Project: Hive
>  Issue Type: Bug
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Hive: SELECT query with SUM function producing unexpected result
> Problem Statement:
> {noformat}
> SELECT SUM(1) FROM t1;
>  result: 0
> SELECT SUM(agg0) FROM (
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE NOT (t1.c0) UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE (t1.c0) IS NULL
> ) as asdf;
>  result: null {noformat}
> Steps to reproduce:
> {noformat}
> DROP DATABASE IF EXISTS db5 CASCADE;
> CREATE DATABASE db5;
> use db5;
> CREATE TABLE IF NOT EXISTS t1(c0 boolean, c1 boolean);
> SELECT SUM(1) FROM t1;
> -- result: 0
> SELECT SUM(agg0) FROM (
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE NOT (t1.c0) UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE (t1.c0) IS NULL
> ) as asdf;
> -- result: null {noformat}
> Observations:
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 = t1.c1; will also result in null.
> For comparison, both Postgres and Impala return null for both of the queries 
> above.
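
As a quick illustration of the aggregate semantics being compared against 
(plain Java standing in for SQL; this is not the Hive fix itself):

{code:java}
public class SumSemanticsSketch {
  public static void main(String[] args) {
    long[] matched = {};      // the WHERE clause matched no rows
    Long sum = null;          // SQL SUM starts out as NULL, not 0
    for (long v : matched) {
      sum = (sum == null) ? v : sum + v;
    }
    // Prints null: SUM over zero rows must stay NULL, which is what
    // Postgres and Impala return for both queries above.
    System.out.println(sum);
  }
}
{code}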



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=625081&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625081
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:03
Start Date: 20/Jul/21 10:03
Worklog Time Spent: 10m 
  Work Description: klcopp commented on pull request #2495:
URL: https://github.com/apache/hive/pull/2495#issuecomment-883238490


   moved to https://github.com/apache/hive/pull/2497


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625081)
Time Spent: 1h 40m  (was: 1.5h)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created, so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
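
A hedged sketch of the skip described above (the method shape and counter 
name are assumptions, not the actual HMSMetricsListener code):

{code:java}
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class AllocWriteIdMetricSketch {
  private final AtomicLong writesToNoAutoCompactTables = new AtomicLong();

  // During CTAS the writeId is allocated before the table object exists,
  // so the listener receives no tblproperties and must bail out instead
  // of dereferencing null.
  void onAllocWriteId(Map<String, String> tblProperties /* null for CTAS */) {
    if (tblProperties == null) {
      return; // skip the metric: nothing to inspect yet
    }
    if (Boolean.parseBoolean(tblProperties.get("no_auto_compaction"))) {
      writesToNoAutoCompactTables.incrementAndGet();
    }
  }
}
{code}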



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25338) AIOBE in conv UDF if input is empty

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25338?focusedWorklogId=625078&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625078
 ]

ASF GitHub Bot logged work on HIVE-25338:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:02
Start Date: 20/Jul/21 10:02
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #2485:
URL: https://github.com/apache/hive/pull/2485#issuecomment-882475619


   The changes look good to me. +1. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625078)
Time Spent: 0.5h  (was: 20m)

> AIOBE in conv UDF if input is empty
> ---
>
> Key: HIVE-25338
> URL: https://issues.apache.org/jira/browse/HIVE-25338
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Repro
> {code:java}
> create table test (a string);
> insert into test values ("");
> select conv(a,16,10) from test;{code}
> Exception trace:
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>  at org.apache.hadoop.hive.ql.udf.UDFConv.evaluate(UDFConv.java:160){code}
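
A minimal sketch of the guard the fix calls for (illustrative only; this is 
a simplified stand-in, not the UDFConv patch, and it ignores values outside 
the long range):

{code:java}
public class ConvGuardSketch {
  static String conv(String input, int fromBase, int toBase) {
    byte[] num = input.getBytes();
    if (num.length == 0) {
      return null; // empty input: nothing to convert, no num[0] access
    }
    boolean negative = (num[0] == '-'); // the access that threw the AIOBE
    long value = Long.parseLong(negative ? input.substring(1) : input, fromBase);
    return Long.toString(negative ? -value : value, toBase).toUpperCase();
  }

  public static void main(String[] args) {
    System.out.println(conv("", 16, 10));   // null instead of an exception
    System.out.println(conv("ff", 16, 10)); // 255
  }
}
{code}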



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24037) Parallelize hash table constructions in map joins

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24037?focusedWorklogId=625064&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625064
 ]

ASF GitHub Bot logged work on HIVE-24037:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:01
Start Date: 20/Jul/21 10:01
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #2004:
URL: https://github.com/apache/hive/pull/2004


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625064)
Time Spent: 2h 50m  (was: 2h 40m)

> Parallelize hash table constructions in map joins
> -
>
> Key: HIVE-24037
> URL: https://issues.apache.org/jira/browse/HIVE-24037
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Parallelize hash table constructions in map joins



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25325) Add TRUNCATE TABLE support for Hive Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25325?focusedWorklogId=625062&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625062
 ]

ASF GitHub Bot logged work on HIVE-25325:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 10:00
Start Date: 20/Jul/21 10:00
Worklog Time Spent: 10m 
  Work Description: kuczoram commented on a change in pull request #2471:
URL: https://github.com/apache/hive/pull/2471#discussion_r672339586



##
File path: 
iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##
@@ -1313,6 +1313,186 @@ public void testScanTableCaseInsensitive() throws 
IOException {
 Assert.assertArrayEquals(new Object[] {1L, "Bob", "Green"}, rows.get(1));
   }
 
+  @Test
+  public void testTruncateTable() throws IOException, TException, 
InterruptedException {
+// Create an Iceberg table with some records in it then execute a truncate 
table command.
+// Then check if the data is deleted and the table statistics are reset to 
0.
+String databaseName = "default";
+String tableName = "customers";
+Table icebergTable = testTables.createTable(shell, tableName, 
HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA,
+fileFormat, HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS);
+testTruncateTable(databaseName, tableName, icebergTable, 
HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS,
+HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, true, false);
+  }
+
+  @Test
+  public void testTruncateEmptyTable() throws IOException, TException, 
InterruptedException {
+// Create an empty Iceberg table and execute a truncate table command on 
it.
+String databaseName = "default";
+String tableName = "customers";
+String fullTableName = databaseName + "." + tableName;

Review comment:
   Thanks, I changed that.

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -3360,42 +3360,50 @@ public CmRecycleResponse cm_recycle(final 
CmRecycleRequest request) throws MetaE
   public void truncate_table(final String dbName, final String tableName, 
List<String> partNames)
   throws NoSuchObjectException, MetaException {
 // Deprecated path, won't work for txn tables.
-truncateTableInternal(dbName, tableName, partNames, null, -1);
+truncateTableInternal(dbName, tableName, partNames, null, -1, null);
   }
 
   @Override
   public TruncateTableResponse truncate_table_req(TruncateTableRequest req)
   throws MetaException, TException {
 truncateTableInternal(req.getDbName(), req.getTableName(), 
req.getPartNames(),
-req.getValidWriteIdList(), req.getWriteId());
+req.getValidWriteIdList(), req.getWriteId(), 
req.getEnvironmentContext());
 return new TruncateTableResponse();
   }
 
   private void truncateTableInternal(String dbName, String tableName, 
List<String> partNames,
- String validWriteIds, long writeId) 
throws MetaException, NoSuchObjectException {
+ String validWriteIds, long writeId, 
EnvironmentContext context) throws MetaException, NoSuchObjectException {
 boolean isSkipTrash = false, needCmRecycle = false;
 try {
   String[] parsedDbName = parseDbName(dbName, conf);
   Table tbl = get_table_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME], tableName);
 
-  boolean truncateFiles = !TxnUtils.isTransactionalTable(tbl) ||
-  !MetastoreConf.getBoolVar(getConf(), 
MetastoreConf.ConfVars.TRUNCATE_ACID_USE_BASE);
-
-  if (truncateFiles) {
-isSkipTrash = MetaStoreUtils.isSkipTrash(tbl.getParameters());
-Database db = get_database_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME]);
-needCmRecycle = ReplChangeManager.shouldEnableCm(db, tbl);
+  boolean skipDataDeletion = false;
+  if (context != null && context.getProperties() != null
+  && context.getProperties().get("truncateSkipDataDeletion") != null) {

Review comment:
   We can do that. Fixed it.

[jira] [Work logged] (HIVE-25325) Add TRUNCATE TABLE support for Hive Iceberg tables

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25325?focusedWorklogId=625012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625012
 ]

ASF GitHub Bot logged work on HIVE-25325:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 09:53
Start Date: 20/Jul/21 09:53
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2471:
URL: https://github.com/apache/hive/pull/2471#discussion_r672353809



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -3360,42 +3360,50 @@ public CmRecycleResponse cm_recycle(final 
CmRecycleRequest request) throws MetaE
   public void truncate_table(final String dbName, final String tableName, 
List<String> partNames)
   throws NoSuchObjectException, MetaException {
 // Deprecated path, won't work for txn tables.
-truncateTableInternal(dbName, tableName, partNames, null, -1);
+truncateTableInternal(dbName, tableName, partNames, null, -1, null);
   }
 
   @Override
   public TruncateTableResponse truncate_table_req(TruncateTableRequest req)
   throws MetaException, TException {
 truncateTableInternal(req.getDbName(), req.getTableName(), 
req.getPartNames(),
-req.getValidWriteIdList(), req.getWriteId());
+req.getValidWriteIdList(), req.getWriteId(), 
req.getEnvironmentContext());
 return new TruncateTableResponse();
   }
 
   private void truncateTableInternal(String dbName, String tableName, 
List<String> partNames,
- String validWriteIds, long writeId) 
throws MetaException, NoSuchObjectException {
+ String validWriteIds, long writeId, 
EnvironmentContext context) throws MetaException, NoSuchObjectException {
 boolean isSkipTrash = false, needCmRecycle = false;
 try {
   String[] parsedDbName = parseDbName(dbName, conf);
   Table tbl = get_table_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME], tableName);
 
-  boolean truncateFiles = !TxnUtils.isTransactionalTable(tbl) ||
-  !MetastoreConf.getBoolVar(getConf(), 
MetastoreConf.ConfVars.TRUNCATE_ACID_USE_BASE);
-
-  if (truncateFiles) {
-isSkipTrash = MetaStoreUtils.isSkipTrash(tbl.getParameters());
-Database db = get_database_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME]);
-needCmRecycle = ReplChangeManager.shouldEnableCm(db, tbl);
+  boolean skipDataDeletion = false;
+  if (context != null && context.getProperties() != null

Review comment:
   One minor thing, this should be `Optional.ofNullable(context)` if you 
want to guard against the `context` being null as well

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -3360,42 +3360,50 @@ public CmRecycleResponse cm_recycle(final 
CmRecycleRequest request) throws MetaE
   public void truncate_table(final String dbName, final String tableName, 
List<String> partNames)
   throws NoSuchObjectException, MetaException {
 // Deprecated path, won't work for txn tables.
-truncateTableInternal(dbName, tableName, partNames, null, -1);
+truncateTableInternal(dbName, tableName, partNames, null, -1, null);
   }
 
   @Override
   public TruncateTableResponse truncate_table_req(TruncateTableRequest req)
   throws MetaException, TException {
 truncateTableInternal(req.getDbName(), req.getTableName(), 
req.getPartNames(),
-req.getValidWriteIdList(), req.getWriteId());
+req.getValidWriteIdList(), req.getWriteId(), 
req.getEnvironmentContext());
 return new TruncateTableResponse();
   }
 
   private void truncateTableInternal(String dbName, String tableName, 
List<String> partNames,
- String validWriteIds, long writeId) 
throws MetaException, NoSuchObjectException {
+ String validWriteIds, long writeId, 
EnvironmentContext context) throws MetaException, NoSuchObjectException {
 boolean isSkipTrash = false, needCmRecycle = false;
 try {
   String[] parsedDbName = parseDbName(dbName, conf);
   Table tbl = get_table_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME], tableName);
 
-  boolean truncateFiles = !TxnUtils.isTransactionalTable(tbl) ||
-  !MetastoreConf.getBoolVar(getConf(), 
MetastoreConf.ConfVars.TRUNCATE_ACID_USE_BASE);
-
-  if (truncateFiles) {
-isSkipTrash = MetaStoreUtils.isSkipTrash(tbl.getParameters());
-Database db = get_database_core(parsedDbName[CAT_NAME], 
parsedDbName[DB_NAME]);
-needCmRecycle = ReplChangeManager.shouldEnableCm(db, tbl);
+  boolean skipDataDeletion = false;
+  if (context != null && context.getProperties() != null

Review comment:
   In that case, if you get a null at any point during the map chain, the 
whole expression simply evaluates to an empty Optional.
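
A sketch of the null-safe chain being suggested (the property key and the 
EnvironmentContext accessor come from the diff above; the wrapper method is 
made up):

{code:java}
import java.util.Optional;
import org.apache.hadoop.hive.metastore.api.EnvironmentContext;

public class TruncateFlagSketch {
  static boolean skipDataDeletion(EnvironmentContext context) {
    // A null context, null properties map, or missing key all collapse the
    // chain to an empty Optional, so the orElse default applies.
    return Optional.ofNullable(context)
        .map(EnvironmentContext::getProperties)
        .map(props -> props.get("truncateSkipDataDeletion"))
        .map(Boolean::parseBoolean)
        .orElse(false);
  }
}
{code}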

[jira] [Work logged] (HIVE-25277) Slow Hive partition deletion for Cloud object stores with expensive ListFiles

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25277?focusedWorklogId=625001&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625001
 ]

ASF GitHub Bot logged work on HIVE-25277:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 09:52
Start Date: 20/Jul/21 09:52
Worklog Time Spent: 10m 
  Work Description: coufon commented on pull request #2421:
URL: https://github.com/apache/hive/pull/2421#issuecomment-882711397


   > Zoltan Haindrich Haymant Mangla Naveen Gangam may you take a look and 
merge this PR?
   
   Thank you @medb! @kgyrtkirk @hmangla98 @nrg4878 a friendly ping, could you 
please take a look? Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 625001)
Time Spent: 1h 50m  (was: 1h 40m)

> Slow Hive partition deletion for Cloud object stores with expensive ListFiles
> -
>
> Key: HIVE-25277
> URL: https://issues.apache.org/jira/browse/HIVE-25277
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: All Versions
>Reporter: Zhou Fang
>Assignee: Zhou Fang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Deleting a Hive partition is slow when using a Cloud object store as the 
> warehouse, because ListFiles is expensive there. A root cause is that the 
> recursive parent dir deletion is very inefficient: there are many duplicated 
> calls to isEmpty (each of which ends up calling ListFiles). This fix sorts 
> the parents to delete by path size and always processes the longest one 
> first (e.g., a/b/c is handled before a/b). As a result, each parent path 
> needs to be checked only once.
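
A sketch of the ordering trick in the description (plain java.util code 
standing in for the actual metastore change):

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ParentDirOrderSketch {
  public static void main(String[] args) {
    List<String> parents = new ArrayList<>(Arrays.asList("a/b", "a", "a/b/c"));
    // Deepest parents first: a/b/c is checked (and deleted if empty)
    // before a/b, so each directory's emptiness is probed exactly once.
    parents.sort(Comparator.comparingInt(String::length).reversed());
    System.out.println(parents); // [a/b/c, a/b, a]
  }
}
{code}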



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=624994&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-624994
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 09:51
Start Date: 20/Jul/21 09:51
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #2497:
URL: https://github.com/apache/hive/pull/2497


   Tests: Unit tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 624994)
Time Spent: 1.5h  (was: 1h 20m)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created, so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25256?focusedWorklogId=624984&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-624984
 ]

ASF GitHub Bot logged work on HIVE-25256:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 09:50
Start Date: 20/Jul/21 09:50
Worklog Time Spent: 10m 
  Work Description: szlta merged pull request #2463:
URL: https://github.com/apache/hive/pull/2463


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 624984)
Time Spent: 2.5h  (was: 2h 20m)

> Support ALTER TABLE CHANGE COLUMN for Iceberg
> -
>
> Key: HIVE-25256
> URL: https://issues.apache.org/jira/browse/HIVE-25256
> Project: Hive
>  Issue Type: New Feature
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> In order to provide support for renaming/changing the data type of a single 
> column, we should add alter table change column support for Iceberg tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25346) cleanTxnToWriteIdTable breaks SNAPSHOT isolation

2021-07-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25346?focusedWorklogId=624950&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-624950
 ]

ASF GitHub Bot logged work on HIVE-25346:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 09:45
Start Date: 20/Jul/21 09:45
Worklog Time Spent: 10m 
  Work Description: zchovan opened a new pull request #2494:
URL: https://github.com/apache/hive/pull/2494


   Change-Id: I5f832626e7a38834441c38cdde20d57006d11a11
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 624950)
Time Spent: 40m  (was: 0.5h)

> cleanTxnToWriteIdTable breaks SNAPSHOT isolation
> 
>
> Key: HIVE-25346
> URL: https://issues.apache.org/jira/browse/HIVE-25346
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

