[jira] [Commented] (HIVE-24167) TPC-DS query 14 fails while generating plan for the filter

2024-01-23 Thread okumin (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810178#comment-17810178
 ] 

okumin commented on HIVE-24167:
---

Let me take this over, as we plan to use CTE materialization with Hive 4. I know 
the root cause and am close to a fix.

> TPC-DS query 14 fails while generating plan for the filter
> --
>
> Key: HIVE-24167
> URL: https://issues.apache.org/jira/browse/HIVE-24167
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Stamatis Zampetakis
>Assignee: okumin
>Priority: Major
>  Labels: hive-4.1.0-must
>
> TPC-DS query 14 (cbo_query14.q and query4.q) fails with an NPE on the metastore 
> with the partitioned TPC-DS 30TB dataset while generating the plan for the 
> filter.
> The problem can be reproduced using the PR in HIVE-23965.
> The current stack trace shows that the NPE occurs while trying to display a 
> debug message, but even if this line didn't exist it would fail again later on.
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10867)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11765)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11622)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11649)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11622)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11649)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11635)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlanForSubQueryPredicate(SemanticAnalyzer.java:3375)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3473)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10819)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11765)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11622)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11625)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11625)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11649)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11622)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11649)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11635)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12417)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:718)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12519)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
> at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
> at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:173)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:414)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:363)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:357)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:129)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:231)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355)
> ...
> {noformat}

[jira] [Assigned] (HIVE-24167) TPC-DS query 14 fails while generating plan for the filter

2024-01-23 Thread okumin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

okumin reassigned HIVE-24167:
-

Assignee: okumin  (was: Zoltan Haindrich)

> TPC-DS query 14 fails while generating plan for the filter
> --
>
> Key: HIVE-24167
> URL: https://issues.apache.org/jira/browse/HIVE-24167
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Stamatis Zampetakis
>Assignee: okumin
>Priority: Major
>  Labels: hive-4.1.0-must
>
> TPC-DS query 14 (cbo_query14.q and query4.q) fails with an NPE on the metastore 
> with the partitioned TPC-DS 30TB dataset while generating the plan for the 
> filter.
> The problem can be reproduced using the PR in HIVE-23965.
> The current stack trace shows that the NPE occurs while trying to display a 
> debug message, but even if this line didn't exist it would fail again later on.
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10867)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11765)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11622)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11649)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11622)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11649)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11635)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlanForSubQueryPredicate(SemanticAnalyzer.java:3375)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3473)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10819)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11765)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11622)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11625)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11625)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11649)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11622)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11649)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11635)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12417)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:718)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12519)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
> at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
> at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:173)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:414)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:363)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:357)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:129)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:231)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:740)
> ...
> {noformat}

[jira] [Assigned] (HIVE-28023) CVE fixes for Apache hive JDBC driver

2024-01-23 Thread Hongdan Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongdan Zhu reassigned HIVE-28023:
--

Assignee: Hongdan Zhu

> CVE fixes for Apache hive JDBC driver
> -
>
> Key: HIVE-28023
> URL: https://issues.apache.org/jira/browse/HIVE-28023
> Project: Hive
>  Issue Type: Improvement
>Reporter: Hongdan Zhu
>Assignee: Hongdan Zhu
>Priority: Major
>
> A fix for:
> Apache Hive Driver : hive-jdbc-3.1.0-SNAPSHOT-standalone.jar - 
> CVE-2022-25857



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-28023) CVE fixes for Apache hive JDBC driver

2024-01-23 Thread Hongdan Zhu (Jira)
Hongdan Zhu created HIVE-28023:
--

 Summary: CVE fixes for Apache hive JDBC driver
 Key: HIVE-28023
 URL: https://issues.apache.org/jira/browse/HIVE-28023
 Project: Hive
  Issue Type: Improvement
Reporter: Hongdan Zhu


A fix for:

Apache Hive Driver : hive-jdbc-3.1.0-SNAPSHOT-standalone.jar - 
CVE-2022-25857





[jira] [Updated] (HIVE-27980) Hive Iceberg Compaction: add support for OPTIMIZE TABLE syntax

2024-01-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27980:
--
Labels: pull-request-available  (was: )

> Hive Iceberg Compaction: add support for OPTIMIZE TABLE syntax
> --
>
> Key: HIVE-27980
> URL: https://issues.apache.org/jira/browse/HIVE-27980
> Project: Hive
>  Issue Type: New Feature
>Reporter: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>
> Presently Hive Iceberg supports major compaction using the Hive ACID syntax below.
> {code:java}
> ALTER TABLE name COMPACT MAJOR [AND WAIT] {code}
> Add support for OPTIMIZE TABLE syntax. Example:
> {code:java}
> OPTIMIZE TABLE name
> REWRITE DATA [USING BIN_PACK]
> [ ( { FILE_SIZE_THRESHOLD | MIN_INPUT_FILES } =  [, ... ] ) ]
> WHERE category = 'c1' {code}
> This syntax will be in line with Impala's.
> Also, the OPTIMIZE command is not limited to compaction; it also supports other 
> table maintenance operations.
>  





[jira] [Updated] (HIVE-28000) Hive QL : "not in" clause gives incorrect results when type coercion cannot take place.

2024-01-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28000:
--
Labels: pull-request-available  (was: )

>  Hive QL : "not in" clause gives incorrect results when type coercion cannot 
> take place.
> 
>
> Key: HIVE-28000
> URL: https://issues.apache.org/jira/browse/HIVE-28000
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Anmol Sundaram
>Priority: Major
>  Labels: pull-request-available
> Attachments: not_in_examples.q
>
>
> There are certain scenarios where the "not in" clause gives incorrect results 
> when type coercion cannot take place. 
> These occur when the IN clause contains at least one operand that cannot be 
> type-coerced to the type of the column to which the IN clause is applied. 
>  
> Please refer to the attached query examples for more details. 





[jira] [Work started] (HIVE-28021) Attempting to create a table with a percent symbol fails

2024-01-23 Thread Tim Thorpe (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-28021 started by Tim Thorpe.
-
> Attempting to create a table with a percent symbol fails
> 
>
> Key: HIVE-28021
> URL: https://issues.apache.org/jira/browse/HIVE-28021
> Project: Hive
>  Issue Type: Bug
>  Components: Iceberg integration
>Affects Versions: 4.0.0-beta-1
>Reporter: Tim Thorpe
>Assignee: Tim Thorpe
>Priority: Minor
>  Labels: pull-request-available
>
> This occurred while attempting to test creating a table 
> "[|]#&%_@"."[|]#&%_@"
> The stack trace is as follows:
>  
> {code:java}
> java.util.UnknownFormatConversionException: Conversion = '_'
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.util.UnknownFormatConversionException: Conversion = '_'
>         at 
> org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1383) 
> ~[hive-exec-4.0.0-beta-1.jar:4.0.0-beta-1]
>         at 
> org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1388) 
> ~[hive-exec-4.0.0-beta-1.jar:4.0.0-beta-1]
>         at 
> org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1278) 
> ~[hive-exec-4.0.0-beta-1.jar:4.0.0-beta-1]
>         …
> Caused by: java.util.UnknownFormatConversionException: Conversion = '_'
>         at java.util.Formatter.checkText(Formatter.java:2590) ~[?:1.8.0]
>         at java.util.Formatter.parse(Formatter.java:2566) ~[?:1.8.0]
>         at java.util.Formatter.format(Formatter.java:2512) ~[?:1.8.0]
>         at java.util.Formatter.format(Formatter.java:2466) ~[?:1.8.0]
>         at java.lang.String.format(String.java:4268) ~[?:2.9 (05-29-2023)]
>         at 
> org.apache.iceberg.relocated.com.google.common.util.concurrent.ThreadFactoryBuilder.format(ThreadFactoryBuilder.java:186)
>  ~[hive-iceberg-handler-4.0.0-beta-1.jar:4.0.0-beta-1]
>         at 
> org.apache.iceberg.relocated.com.google.common.util.concurrent.ThreadFactoryBuilder.setNameFormat(ThreadFactoryBuilder.java:73)
>  ~[hive-iceberg-handler-4.0.0-beta-1.jar:4.0.0-beta-1]
> at 
> org.apache.iceberg.hive.MetastoreLock.&lt;init&gt;(MetastoreLock.java:129) 
> ~[hive-iceberg-handler-4.0.0-beta-1.jar:4.0.0-beta-1] {code}
>  
>  
> This was fixed by making a change to 
> [https://github.com/apache/hive/blob/branch-4.0.0-beta-1/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/MetastoreLock.java#L129]
>  
> {code:java}
> -.setNameFormat("iceberg-hive-lock-heartbeat-" + 
> fullName + "-%d")
> +.setNameFormat("iceberg-hive-lock-heartbeat-" + 
> fullName.replace("%", "%%") + "-%d"){code}
>  
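The failure above can be reproduced outside Hive with plain java.util.Formatter, since ThreadFactoryBuilder.setNameFormat ultimately calls String.format on the supplied pattern. Below is a standalone sketch; the class name and variables are illustrative, not from the Iceberg code:

```java
import java.util.UnknownFormatConversionException;

public class PercentEscapeDemo {
    public static void main(String[] args) {
        // Hypothetical fully qualified table name from the report; the '%' trips Formatter.
        String fullName = "\"[|]#&%_@\".\"[|]#&%_@\"";

        // Unescaped: '%_' is not a valid format conversion, so String.format throws.
        boolean threw = false;
        try {
            String.format("iceberg-hive-lock-heartbeat-" + fullName + "-%d", 1);
        } catch (UnknownFormatConversionException e) {
            threw = true;
        }
        System.out.println("unescaped throws: " + threw);

        // Escaped: doubling each '%' turns it into a literal percent sign,
        // mirroring the fullName.replace("%", "%%") change proposed above.
        String name = String.format(
                "iceberg-hive-lock-heartbeat-" + fullName.replace("%", "%%") + "-%d", 1);
        System.out.println(name);
    }
}
```
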





[jira] [Updated] (HIVE-28022) Authorization fails for nested view + with clause + union all

2024-01-23 Thread Taraka Rama Rao Lethavadla (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taraka Rama Rao Lethavadla updated HIVE-28022:
--
Summary: Authorization fails for nested view + with clause +  union all   
(was: Authorization fails for nested view with a union all )

> Authorization fails for nested view + with clause +  union all 
> ---
>
> Key: HIVE-28022
> URL: https://issues.apache.org/jira/browse/HIVE-28022
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Taraka Rama Rao Lethavadla
>Priority: Major
>
> Test Case:
> set 
> hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider;
> {noformat}
> create database table_db;
> create database view_db_outer;
> create database view_db_inner;
> create database view_db_inner_inner;{noformat}
> {noformat}
> create table table_db.test_tbl(col1 string);
> create view view_db_outer.outer_view1 as select col1 from table_db.test_tbl;
> create view view_db_outer.outer_view2 as select col1 from table_db.test_tbl;
> create view view_db_inner.inner_view as with wct as (select ov1.col1 from 
> view_db_outer.outer_view1 ov1 union all select ov2.col1 from 
> view_db_outer.outer_view2 ov2) select * from wct;
> create view view_db_inner_inner.inner_inner_view as select * from 
> view_db_inner.inner_view;{noformat}
> Enable authorization
> {code:java}
> set hive.security.authorization.enabled=true; {code}
> Grant permissions to the final view
> {code:java}
> grant select on table view_db_inner_inner.inner_inner_view to user 
> hive_test_user;{code}
>  select * from view_db_inner_inner.inner_inner_view; -- fails with an 
> authorization exception:
> {noformat}
> ql.Driver: Authorization failed:No privilege 'Select' found for inputs { 
> database:view_db_outer, table:outer_view1, columnName:col1}. Use SHOW GRANT 
> to get more details.{noformat}





[jira] [Created] (HIVE-28022) Authorization fails for nested view with a union all

2024-01-23 Thread Taraka Rama Rao Lethavadla (Jira)
Taraka Rama Rao Lethavadla created HIVE-28022:
-

 Summary: Authorization fails for nested view with a union all 
 Key: HIVE-28022
 URL: https://issues.apache.org/jira/browse/HIVE-28022
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Taraka Rama Rao Lethavadla


Test Case:
set 
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider;
{noformat}
create database table_db;
create database view_db_outer;
create database view_db_inner;
create database view_db_inner_inner;{noformat}
{noformat}
create table table_db.test_tbl(col1 string);
create view view_db_outer.outer_view1 as select col1 from table_db.test_tbl;
create view view_db_outer.outer_view2 as select col1 from table_db.test_tbl;
create view view_db_inner.inner_view as with wct as (select ov1.col1 from 
view_db_outer.outer_view1 ov1 union all select ov2.col1 from 
view_db_outer.outer_view2 ov2) select * from wct;
create view view_db_inner_inner.inner_inner_view as select * from 
view_db_inner.inner_view;{noformat}
Enable authorization
{code:java}
set hive.security.authorization.enabled=true; {code}
Grant permissions to the final view

{code:java}
grant select on table view_db_inner_inner.inner_inner_view to user 
hive_test_user;{code}
 select * from view_db_inner_inner.inner_inner_view; -- fails with an 
authorization exception:
{noformat}
ql.Driver: Authorization failed:No privilege 'Select' found for inputs { 
database:view_db_outer, table:outer_view1, columnName:col1}. Use SHOW GRANT to 
get more details.{noformat}





[jira] [Assigned] (HIVE-28021) Attempting to create a table with a percent symbol fails

2024-01-23 Thread Tim Thorpe (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Thorpe reassigned HIVE-28021:
-

Assignee: Tim Thorpe

> Attempting to create a table with a percent symbol fails
> 
>
> Key: HIVE-28021
> URL: https://issues.apache.org/jira/browse/HIVE-28021
> Project: Hive
>  Issue Type: Bug
>  Components: Iceberg integration
>Affects Versions: 4.0.0-beta-1
>Reporter: Tim Thorpe
>Assignee: Tim Thorpe
>Priority: Minor
>  Labels: pull-request-available
>





[jira] [Updated] (HIVE-28021) Attempting to create a table with a percent symbol fails

2024-01-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28021:
--
Labels: pull-request-available  (was: )

> Attempting to create a table with a percent symbol fails
> 
>
> Key: HIVE-28021
> URL: https://issues.apache.org/jira/browse/HIVE-28021
> Project: Hive
>  Issue Type: Bug
>  Components: Iceberg integration
>Affects Versions: 4.0.0-beta-1
>Reporter: Tim Thorpe
>Priority: Minor
>  Labels: pull-request-available
>





[jira] [Commented] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-23 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810013#comment-17810013
 ] 

Denys Kuzmenko commented on HIVE-28015:
---

(y)

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Affects Versions: 4.0.0
>Reporter: Denys Kuzmenko
>Assignee: Butao Zhang
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables, however, it 
> just ignores them. This succeeds:
> {code:java}
> create table ice_pk (i int, j int, primary key(i)) stored by iceberg;
> {code}





[jira] [Created] (HIVE-28021) Attempting to create a table with a percent symbol fails

2024-01-23 Thread Tim Thorpe (Jira)
Tim Thorpe created HIVE-28021:
-

 Summary: Attempting to create a table with a percent symbol fails
 Key: HIVE-28021
 URL: https://issues.apache.org/jira/browse/HIVE-28021
 Project: Hive
  Issue Type: Bug
  Components: Iceberg integration
Affects Versions: 4.0.0-beta-1
Reporter: Tim Thorpe


This occurred while attempting to test creating a table 
"[|]#&%_@"."[|]#&%_@"
The stack trace is as follows:

 
{code:java}
java.util.UnknownFormatConversionException: Conversion = '_'
org.apache.hadoop.hive.ql.metadata.HiveException: 
java.util.UnknownFormatConversionException: Conversion = '_'
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1383) 
~[hive-exec-4.0.0-beta-1.jar:4.0.0-beta-1]
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1388) 
~[hive-exec-4.0.0-beta-1.jar:4.0.0-beta-1]
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1278) 
~[hive-exec-4.0.0-beta-1.jar:4.0.0-beta-1]
        …
Caused by: java.util.UnknownFormatConversionException: Conversion = '_'
        at java.util.Formatter.checkText(Formatter.java:2590) ~[?:1.8.0]
        at java.util.Formatter.parse(Formatter.java:2566) ~[?:1.8.0]
        at java.util.Formatter.format(Formatter.java:2512) ~[?:1.8.0]
        at java.util.Formatter.format(Formatter.java:2466) ~[?:1.8.0]
        at java.lang.String.format(String.java:4268) ~[?:2.9 (05-29-2023)]
        at 
org.apache.iceberg.relocated.com.google.common.util.concurrent.ThreadFactoryBuilder.format(ThreadFactoryBuilder.java:186)
 ~[hive-iceberg-handler-4.0.0-beta-1.jar:4.0.0-beta-1]
        at 
org.apache.iceberg.relocated.com.google.common.util.concurrent.ThreadFactoryBuilder.setNameFormat(ThreadFactoryBuilder.java:73)
 ~[hive-iceberg-handler-4.0.0-beta-1.jar:4.0.0-beta-1]
        at org.apache.iceberg.hive.MetastoreLock.&lt;init&gt;(MetastoreLock.java:129) 
~[hive-iceberg-handler-4.0.0-beta-1.jar:4.0.0-beta-1] {code}
 

 

This was fixed by making a change to 
[https://github.com/apache/hive/blob/branch-4.0.0-beta-1/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/MetastoreLock.java#L129]

 
{code:java}
-.setNameFormat("iceberg-hive-lock-heartbeat-" + 
fullName + "-%d")
+.setNameFormat("iceberg-hive-lock-heartbeat-" + 
fullName.replace("%", "%%") + "-%d"){code}
 





[jira] [Commented] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-23 Thread Butao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809983#comment-17809983
 ] 

Butao Zhang commented on HIVE-28015:


Got it, thanks! I will explore implementing this via the Hive primary key syntax.

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Affects Versions: 4.0.0
>Reporter: Denys Kuzmenko
>Assignee: Butao Zhang
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables, however, it 
> just ignores them. This succeeds:
> {code:java}
> create table ice_pk (i int, j int, primary key(i)) stored by iceberg;
> {code}





[jira] [Assigned] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-23 Thread Butao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Butao Zhang reassigned HIVE-28015:
--

Assignee: Butao Zhang

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Affects Versions: 4.0.0
>Reporter: Denys Kuzmenko
>Assignee: Butao Zhang
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables, however, it 
> just ignores them. This succeeds:
> {code:java}
> create table ice_pk (i int, j int, primary key(i)) stored by iceberg;
> {code}





[jira] [Comment Edited] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-23 Thread Butao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809762#comment-17809762
 ] 

Butao Zhang edited comment on HIVE-28015 at 1/23/24 2:59 PM:
-

Spark-Iceberg uses the *alter set* syntax below to add identifier-field-ids; should 
we do the same as Spark?

[https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table--set-identifier-fields]

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id}} -- single column

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data}} -- multiple 
columns

 

Or using *primary key( i )* syntax like your example?

*create table ice_pk (i int, j int, primary key( i )) stored by iceberg;*


was (Author: zhangbutao):
Spark-iceberg uses this *alter set*  syntax to add identifier-field-ids, should 
we also do like spark?

[https://iceberg.apache.org/docs/latest/spark-ddl/#alter-table--set-identifier-fields]

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id}} -- single column

{{ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS id, data}} -- multiple 
columns

 

Or using *primary key(i)* syntax like your example?

*create table ice_pk (i int, j int, primary key(i)) stored as iceberg;*

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Affects Versions: 4.0.0
>Reporter: Denys Kuzmenko
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables, however, it 
> just ignores them. This succeeds:
> {code:java}
> create table ice_pk (i int, j int, primary key(i)) stored by iceberg;
> {code}





[jira] [Updated] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-23 Thread Butao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Butao Zhang updated HIVE-28015:
---
Description: 
Some writer engines require primary keys on a table so that they can use them 
for writing equality deletes (only the PK cols are written to the eq-delete 
files).

Hive currently doesn't reject setting PKs for Iceberg tables, however, it just 
ignores them. This succeeds:
{code:java}
create table ice_pk (i int, j int, primary key(i)) stored by iceberg;
{code}

  was:
Some writer engines require primary keys on a table so that they can use them 
for writing equality deletes (only the PK cols are written to the eq-delete 
files).

Hive currently doesn't reject setting PKs for Iceberg tables, however, it just 
ignores them. This succeeds:

{code}
create table ice_pk (i int, j int, primary key(i)) stored as iceberg;
{code}


> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Affects Versions: 4.0.0
>Reporter: Denys Kuzmenko
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables, however, it 
> just ignores them. This succeeds:
> {code:java}
> create table ice_pk (i int, j int, primary key(i)) stored by iceberg;
> {code}





[jira] [Comment Edited] (HIVE-27942) Missing aux jar errors during LLAP launch

2024-01-23 Thread Butao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809965#comment-17809965
 ] 

Butao Zhang edited comment on HIVE-27942 at 1/23/24 2:30 PM:
-

Merged into master branch.

Thanks [~shubh_init] for your contribution!

Thanks for your review!  [~ayushtkn] [~akshatm]


was (Author: zhangbutao):
Merged into master branch.

Thanks [~shubh_init] for your contribution!

Thanks for your review!  [~ayushsaxena] [~akshatm]

> Missing aux jar errors during LLAP launch 
> --
>
> Key: HIVE-27942
> URL: https://issues.apache.org/jira/browse/HIVE-27942
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, llap
>Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: Shubham Sharma
>Assignee: Shubham Sharma
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> While launching LLAP, the following errors are prompted because the dbcp and 
> pool classes are not updated in the default aux classes
> {code:java}
> 2023-04-18T10:31:14,423 ERROR [llap-pkg-2] service.AsyncTaskCopyAuxJars: 
> Cannot find a jar for [org.apache.commons.dbcp.BasicDataSourceFactory] due to 
> an exception (org.apache.commons.dbcp.BasicDataSourceFactory); not packaging 
> the jar
> 2023-04-18T10:31:14,424 ERROR [llap-pkg-2] service.AsyncTaskCopyAuxJars: 
> Cannot find a jar for [org.apache.commons.pool.impl.GenericObjectPool] due to 
> an exception (org.apache.commons.pool.impl.GenericObjectPool); not packaging 
> the jar{code}





[jira] [Resolved] (HIVE-27942) Missing aux jar errors during LLAP launch

2024-01-23 Thread Butao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Butao Zhang resolved HIVE-27942.

Fix Version/s: 4.1.0
   Resolution: Fixed

Merged into master branch.

Thanks [~shubh_init] for your contribution!

Thanks for your review!  [~ayushsaxena] [~akshatm]

> Missing aux jar errors during LLAP launch 
> --
>
> Key: HIVE-27942
> URL: https://issues.apache.org/jira/browse/HIVE-27942
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, llap
>Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: Shubham Sharma
>Assignee: Shubham Sharma
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> While launching LLAP, the following errors are prompted because the dbcp and 
> pool classes are not updated in the default aux classes
> {code:java}
> 2023-04-18T10:31:14,423 ERROR [llap-pkg-2] service.AsyncTaskCopyAuxJars: 
> Cannot find a jar for [org.apache.commons.dbcp.BasicDataSourceFactory] due to 
> an exception (org.apache.commons.dbcp.BasicDataSourceFactory); not packaging 
> the jar
> 2023-04-18T10:31:14,424 ERROR [llap-pkg-2] service.AsyncTaskCopyAuxJars: 
> Cannot find a jar for [org.apache.commons.pool.impl.GenericObjectPool] due to 
> an exception (org.apache.commons.pool.impl.GenericObjectPool); not packaging 
> the jar{code}





[jira] [Commented] (HIVE-28016) Iceberg: NULL column values handling in COW mode

2024-01-23 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809946#comment-17809946
 ] 

Denys Kuzmenko commented on HIVE-28016:
---

Merged to master.
[~kkasa], thanks for the review!


> Iceberg: NULL column values handling in COW mode
> 
>
> Key: HIVE-28016
> URL: https://issues.apache.org/jira/browse/HIVE-28016
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Resolved] (HIVE-28016) Iceberg: NULL column values handling in COW mode

2024-01-23 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko resolved HIVE-28016.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

> Iceberg: NULL column values handling in COW mode
> 
>
> Key: HIVE-28016
> URL: https://issues.apache.org/jira/browse/HIVE-28016
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>






[jira] [Updated] (HIVE-28020) Iceberg: Upgrade iceberg version to 1.4.3

2024-01-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28020:
--
Labels: pull-request-available  (was: )

> Iceberg: Upgrade iceberg version to 1.4.3
> -
>
> Key: HIVE-28020
> URL: https://issues.apache.org/jira/browse/HIVE-28020
> Project: Hive
>  Issue Type: Task
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>
> Iceberg version 1.4.3 has been released. 
> [https://github.com/apache/iceberg/releases/tag/apache-iceberg-1.4.3] 





[jira] [Created] (HIVE-28020) Iceberg: Upgrade iceberg version to 1.4.3

2024-01-23 Thread Simhadri Govindappa (Jira)
Simhadri Govindappa created HIVE-28020:
--

 Summary: Iceberg: Upgrade iceberg version to 1.4.3
 Key: HIVE-28020
 URL: https://issues.apache.org/jira/browse/HIVE-28020
 Project: Hive
  Issue Type: Task
Reporter: Simhadri Govindappa
Assignee: Simhadri Govindappa


Iceberg version 1.4.3 has been released. 
[https://github.com/apache/iceberg/releases/tag/apache-iceberg-1.4.3] 





[jira] [Updated] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-23 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-28015:
--
Affects Version/s: 4.0.0

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Affects Versions: 4.0.0
>Reporter: Denys Kuzmenko
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables, however, it 
> just ignores them. This succeeds:
> {code}
> create table ice_pk (i int, j int, primary key(i)) stored as iceberg;
> {code}





[jira] [Updated] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-23 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-28015:
--
Component/s: Iceberg integration

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Denys Kuzmenko
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables, however, it 
> just ignores them. This succeeds:
> {code}
> create table ice_pk (i int, j int, primary key(i)) stored as iceberg;
> {code}





[jira] [Commented] (HIVE-28015) Iceberg: Add identifier-field-ids support in Hive

2024-01-23 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809913#comment-17809913
 ] 

Denys Kuzmenko commented on HIVE-28015:
---

[~zhangbutao] I think we should try to follow the Hive syntax:
https://issues.apache.org/jira/secure/attachment/12803522/AddingPKFKconstraints.pdf

> Iceberg: Add identifier-field-ids support in Hive
> -
>
> Key: HIVE-28015
> URL: https://issues.apache.org/jira/browse/HIVE-28015
> Project: Hive
>  Issue Type: Improvement
>Reporter: Denys Kuzmenko
>Priority: Major
>
> Some writer engines require primary keys on a table so that they can use them 
> for writing equality deletes (only the PK cols are written to the eq-delete 
> files).
> Hive currently doesn't reject setting PKs for Iceberg tables, however, it 
> just ignores them. This succeeds:
> {code}
> create table ice_pk (i int, j int, primary key(i)) stored as iceberg;
> {code}





[jira] [Commented] (HIVE-27775) DirectSQL and JDO results are different when fetching partitions by timestamp in DST shift

2024-01-23 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809883#comment-17809883
 ] 

Stamatis Zampetakis commented on HIVE-27775:


To help advance the resolution of this ticket, let's focus on the issue reported 
in the description. There are surely more bugs around this, but let's handle 
them one at a time.

The bug shows up in the tests because they use the {{VerifyingObjectStore}}. The 
{{VerifyingObjectStore}} is not used in production, which is why I am asking 
whether there is a way for this bug to manifest without using this 
implementation. Typically, if direct SQL returns the correct result, we never 
go through JDO. Is there a way (via config or otherwise) to bypass the direct 
SQL mechanism and use JDO exclusively?
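For reference, one possible sketch: the metastore exposes a switch that disables 
the direct SQL fast path and forces the JDO/ORM path (assuming the 
{{hive.metastore.try.direct.sql}} key; standalone-metastore builds may use the 
{{metastore.try.direct.sql}} alias):

{code:xml}
<!-- hive-site.xml / metastore-site.xml: force the JDO/ORM path by disabling
     the direct SQL shortcut in ObjectStore -->
<property>
  <name>hive.metastore.try.direct.sql</name>
  <value>false</value>
</property>
{code}

With this set, partition-fetching calls such as getPartitionsByExpr should go 
through JDO only, which may let the wrong result surface without 
{{VerifyingObjectStore}}.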

 

> DirectSQL and JDO results are different when fetching partitions by timestamp 
> in DST shift
> --
>
> Key: HIVE-27775
> URL: https://issues.apache.org/jira/browse/HIVE-27775
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Zhihua Deng
>Priority: Critical
>  Labels: pull-request-available
>
> DirectSQL and JDO results are different when fetching partitions by timestamp 
> in DST shift.
> {code:sql}
> --! qt:timezone:Europe/Paris
> CREATE EXTERNAL TABLE payments (card string) PARTITIONED BY(txn_datetime 
> TIMESTAMP) STORED AS ORC;
> INSERT into payments VALUES('---', '2023-03-26 02:30:00');
> SELECT * FROM payments WHERE txn_datetime = '2023-03-26 02:30:00';
> {code}
> The '2023-03-26 02:30:00' is a timestamp that in Europe/Paris timezone falls 
> exactly in the middle of the DST shift. In this particular timezone this date 
> time never really exists since we are jumping directly from 02:00:00 to 
> 03:00:00. However, the TIMESTAMP data type in Hive is timezone agnostic 
> (https://cwiki.apache.org/confluence/display/Hive/Different+TIMESTAMP+types) 
> so it is a perfectly valid timestamp that can be inserted in a table and we 
> must be able to recover it back.
> For the SELECT query above, partition pruning kicks in and calls the 
> ObjectStore#getPartitionsByExpr method in order to fetch the respective 
> partitions matching the timestamp from HMS.
> The tests however reveal that DirectSQL and JDO paths are not returning the 
> same results leading to an exception when VerifyingObjectStore is used. 
> According to the error below DirectSQL is able to recover one partition from 
> HMS (expected) while JDO/ORM returns empty (not expected).
> {noformat}
> 2023-10-06T03:51:19,406 ERROR [80252df4-3fdc-4971-badf-ad67ce8567c7 main] 
> metastore.VerifyingObjectStore: Lists are not the same size: SQL 1, ORM 0
> 2023-10-06T03:51:19,409 ERROR [80252df4-3fdc-4971-badf-ad67ce8567c7 main] 
> metastore.RetryingHMSHandler: MetaException(message:Lists are not the same 
> size: SQL 1, ORM 0)
>   at 
> org.apache.hadoop.hive.metastore.VerifyingObjectStore.verifyLists(VerifyingObjectStore.java:148)
>   at 
> org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:88)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
>   at com.sun.proxy.$Proxy57.getPartitionsByExpr(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HMSHandler.get_partitions_spec_by_expr(HMSHandler.java:7330)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:98)
>   at 
> org.apache.hadoop.hive.metastore.AbstractHMSHandlerProxy.invoke(AbstractHMSHandlerProxy.java:82)
>   at com.sun.proxy.$Proxy59.get_partitions_spec_by_expr(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionsSpecByExprInternal(HiveMetaStoreClient.java:2472)
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreClientWithLocalCache.getPartitionsSpecByExprInternal(HiveMetaStoreClientWithLocalCache.java:396)
>   at 
> 

[jira] [Commented] (HIVE-28003) Fix Batch execution issues

2024-01-23 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-28003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809881#comment-17809881
 ] 

László Végh commented on HIVE-28003:


[~aturoczy] Unfortunately the CI does not run the tests against "real" HMS 
backend databases like PostgreSQL or MySQL, and some issues affect only those 
DBs.

> Fix Batch execution issues
> --
>
> Key: HIVE-28003
> URL: https://issues.apache.org/jira/browse/HIVE-28003
> Project: Hive
>  Issue Type: Task
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>
> PostgreSQL and MySQL JDBC drivers return -2 instead of the number of affected 
> rows in batch execution mode. Therefore, the result policy check needs to be 
> removed for batch executions.





[jira] [Updated] (HIVE-28019) Fix query type information in proto files for load, export, import and explain queries

2024-01-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28019:
--
Labels: pull-request-available  (was: )

> Fix query type information in proto files for load, export, import and 
> explain queries
> --
>
> Key: HIVE-28019
> URL: https://issues.apache.org/jira/browse/HIVE-28019
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>
> Certain query types, like LOAD, EXPORT, IMPORT, and EXPLAIN, did not produce 
> the right Hive operation type





[jira] [Updated] (HIVE-28019) Fix query type information in proto files for load, export, import and explain queries

2024-01-23 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-28019:

Summary: Fix query type information in proto files for load, export, import 
and explain queries  (was: Wrong query type information in proto files for 
load, export, import and explain queries)

> Fix query type information in proto files for load, export, import and 
> explain queries
> --
>
> Key: HIVE-28019
> URL: https://issues.apache.org/jira/browse/HIVE-28019
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Certain query types, like LOAD, EXPORT, IMPORT, and EXPLAIN, did not produce 
> the right Hive operation type





[jira] [Created] (HIVE-28019) Wrong query type information in proto files for load, export, import and explain queries

2024-01-23 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-28019:
---

 Summary: Wrong query type information in proto files for load, 
export, import and explain queries
 Key: HIVE-28019
 URL: https://issues.apache.org/jira/browse/HIVE-28019
 Project: Hive
  Issue Type: Task
  Components: HiveServer2
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


Certain query types, like LOAD, EXPORT, IMPORT, and EXPLAIN, did not produce 
the right Hive operation type





[jira] [Commented] (HIVE-12778) Having with count distinct doesn't work for special combination

2024-01-23 Thread archon gum (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809805#comment-17809805
 ] 

archon gum commented on HIVE-12778:
---

It seems MR has this issue; using Spark with CBO enabled works for me:
{code:sql}
set hive.execution.engine=spark;
set hive.cbo.enable=true; {code}
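For those who cannot switch engines, a common rewrite (a sketch, not an 
official fix) is to compute the distinct count in an inner query and filter in 
the outer query, so the HAVING clause no longer references a count(distinct) 
that is absent from the SELECT list:

{code:sql}
-- hypothetical rewrite of the failing query from the description
select t.id
from (
  select x.id, count(distinct x.value) as cnt
  from table_subquery_having_problem x
  group by x.id
) t
where t.cnt > 1;
{code}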

> Having with count distinct doesn't work for special combination
> ---
>
> Key: HIVE-12778
> URL: https://issues.apache.org/jira/browse/HIVE-12778
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0, 1.2.1
>Reporter: Peter Brejcak
>Priority: Major
>
> There is a problem with the combination of count(distinct ) in the HAVING 
> clause without count(distinct ) in the SELECT clause.
> The first case returns the error *FAILED: SemanticException [Error 10002]: 
> Line Invalid column reference* (unexpected).
> If I add count(distinct ) to the SELECT clause, the result is OK (expected).
> Please run the code to see it.
> Steps to reproduce:
> {code}
> create table table_subquery_having_problem (id int, value int);
> insert into table table_subquery_having_problem values (1,1);
> insert into table table_subquery_having_problem values (1,2);
> insert into table table_subquery_having_problem values (1,3);
> insert into table table_subquery_having_problem values (1,4);
> insert into table table_subquery_having_problem values (1,5);
> insert into table table_subquery_having_problem values (1,6);
> insert into table table_subquery_having_problem values (1,7);
> insert into table table_subquery_having_problem values (1,8);
> insert into table table_subquery_having_problem values (1,9);
> select x.id from table_subquery_having_problem x
> group by x.id
> having count(distinct x.value)>1;  -- result is ERROR
> select x.id, count(distinct x.value) from table_subquery_having_problem x
> group by x.id
> having count(distinct x.value)>1; --result is OK
> {code}


