[jira] [Commented] (HIVE-26376) Hive Metastore connection leak (OOM Error)

2022-07-08 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564447#comment-17564447
 ] 

Ayush Saxena commented on HIVE-26376:
-

Hmm, I need to explore this. I was trying to find whether there is any related 
Jira in Hadoop; I found a couple of them, and HDFS-3545 looked close to the Hive 
case as well, so it may be related. I am also not very sure about the auth setup.

One more question: is the hive user that we get via 
UserGroupInformation.getCurrentUser() the user with which the HMS service 
started, or the one from the client (given that we are not using impersonation)? 
If it is not the end client's user, then the Subject shouldn't change, right?

And one more doubt as well: this is what we saw in Hive replication. The 
FileSystem was cached, and we were closing the FileSystem after launching a 
DistCp job for data copy. Since both threads used the same cached FileSystem, 
when one thread closed it, the other thread started getting "FileSystem closed" 
exceptions during the cleanup task after the MR jobs, under race conditions. So 
this is also something we should take care of, so that we don't land in such a 
situation: the same cached FileSystem shouldn't be used in more than one place, 
otherwise closing it in one place will break the other as well.
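The hazard described above can be sketched in plain Java. This is a minimal stand-in for Hadoop's FileSystem cache, not its real API; the names (FsCacheSketch, CachedFs, FsCache) are invented for illustration. Two "threads" obtain the same cached instance, so close() by one invalidates it for the other:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a FileSystem cache: both callers get the same
// cached instance for the same key, so close() by one breaks the other.
public class FsCacheSketch {
    static class CachedFs {
        private boolean closed = false;
        void close() { closed = true; }
        String listStatus() throws IOException {
            if (closed) throw new IOException("FileSystem closed"); // what the other thread sees
            return "ok";
        }
    }

    static class FsCache {
        private final Map<String, CachedFs> cache = new HashMap<>();
        // like FileSystem.get(...): returns the cached instance for the same key
        synchronized CachedFs get(String key) {
            return cache.computeIfAbsent(key, k -> new CachedFs());
        }
    }

    public static String demo() {
        FsCache cache = new FsCache();
        CachedFs distcpThread = cache.get("hdfs://nn:8020#hive");
        CachedFs cleanupThread = cache.get("hdfs://nn:8020#hive"); // same instance
        distcpThread.close();                  // thread 1 closes after its DistCp job
        try {
            return cleanupThread.listStatus(); // thread 2's cleanup now fails
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In real Hadoop, requesting an uncached instance (e.g. via FileSystem.newInstance, or disabling the cache with `fs.<scheme>.impl.disable.cache`) is the usual way to avoid this kind of sharing when one caller needs to close its handle.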

> Hive Metastore connection leak (OOM Error)
> --
>
> Key: HIVE-26376
> URL: https://issues.apache.org/jira/browse/HIVE-26376
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: !Screenshot 2022-07-07 at 11.52.33 AM.png!
>Reporter: Ranith Sardar
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2022-07-07 at 11.52.33 AM.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive version: 3.1.2
> The Hive Metastore heap size is 14 GB. A memory leak develops after 4-5 days, 
> and the metastore eventually throws an OOM error.
> If we disable the configuration, the memory leak disappears.
> In a heap dump of size 3.5 GB, a large number of FileSystem objects (> 9k 
> instances) are retained, occupying most of the heap space. A snapshot from 
> Eclipse MAT is attached.
> Below is part of the stack trace for the OOM error:
> {code:java}
> at 
> org.apache.hadoop.hive.common.FileUtils.getFileStatusOrNull(Lorg/apache/hadoop/fs/FileSystem;Lorg/apache/hadoop/fs/Path;)Lorg/apache/hadoop/fs/FileStatus;
>  (FileUtils.java:801)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/fs/Path;Ljava/util/EnumSet;)V
>  (StorageBasedAuthorizationProvider.java:371)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:346)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/hive/metastore/api/Database;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:154)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.authorizeReadDatabase(Lorg/apache/hadoop/hive/metastore/events/PreReadDatabaseEvent;)V
>  (AuthorizationPreEventListener.java:208)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.onEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (AuthorizationPreEventListener.java:153)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (HiveMetaStore.java:3221)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(Ljava/lang/String;)Lorg/apache/hadoop/hive/metastore/api/Database;
>  (HiveMetaStore.java:1352){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26376) Hive Metastore connection leak (OOM Error)

2022-07-08 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564366#comment-17564366
 ] 

Stamatis Zampetakis commented on HIVE-26376:


[~ayushtkn] I am not that familiar with the user authentication logic, but even 
if we are not doing impersonation and we are connecting with the current user 
(UserGroupInformation.getCurrentUser()), I think we may still have multiple Java 
objects representing the same user. If, for instance, we have Kerberos 
authentication, then each time we get a new {{Subject}} for the "hive" user we 
are going to have a new {{UserGroupInformation}} instance. I am not 100% sure 
about this, so please correct me if I am wrong.
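A minimal, self-contained model of this scenario, assuming (as in Hadoop's UGI) that equality is based on Subject identity rather than on the user name, might look like the following. All class names here are stand-ins, not Hadoop's real API:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal model of why a FileSystem-style cache can grow without bound when the
// cache key wraps a UserGroupInformation whose equals/hashCode are based on
// Subject *identity*: a fresh Subject per Kerberos login means a fresh cache
// entry for the very same "hive" user.
public class UgiCacheLeakSketch {
    static class Subject { }                 // stand-in for javax.security.auth.Subject

    static class Ugi {
        final String user;
        final Subject subject;
        Ugi(String user, Subject subject) { this.user = user; this.subject = subject; }
        @Override public boolean equals(Object o) {
            // identity comparison of the Subject, not value equality of the user name
            return o instanceof Ugi && ((Ugi) o).subject == subject;
        }
        @Override public int hashCode() { return System.identityHashCode(subject); }
    }

    public static int cacheSizeAfterLogins(int logins) {
        Map<Ugi, Object> cache = new HashMap<>();
        for (int i = 0; i < logins; i++) {
            // each login yields a new Subject for the same "hive" user
            Ugi hive = new Ugi("hive", new Subject());
            cache.putIfAbsent(hive, new Object());
        }
        return cache.size(); // one entry per login, not one per user
    }

    public static void main(String[] args) {
        System.out.println(cacheSizeAfterLogins(5));
    }
}
```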



[jira] [Resolved] (HIVE-26371) Constant propagation does not evaluate constraint expressions at merge when CBO is enabled

2022-07-08 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-26371.
---
Resolution: Fixed

Pushed to master. Thanks [~zabetak] for review.

> Constant propagation does not evaluate constraint expressions at merge when 
> CBO is enabled
> --
>
> Key: HIVE-26371
> URL: https://issues.apache.org/jira/browse/HIVE-26371
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Prior to HIVE-23089, check and not-null constraint violations could be 
> detected faster when merging.
> {code:java}
> CREATE TABLE t_target(
> name string CHECK (length(name)<=20),
> age int,
> gpa double CHECK (gpa BETWEEN 0.0 AND 4.0))
> stored as orc TBLPROPERTIES ('transactional'='true');
> CREATE TABLE t_source(
> name string,
> age int,
> gpa double);
> insert into t_source(name, age, gpa) values ('student1', 16, null);
> insert into t_target(name, age, gpa) values ('student1', 16, 2.0);
> merge into t_target using t_source source on source.age=t_target.age when 
> matched then update set gpa=6;
> {code}
> Currently CBO cannot handle constraint checks when merging, so the filter 
> operator with the {{enforce_constraint}} call is added to the Hive operator 
> plan after CBO has succeeded, and the {{ConstantPropagate}} optimization is 
> called only from TezCompiler with {{ConstantPropagateOption.SHORTCUT}}. 
> With this option {{ConstantPropagate}} does not evaluate deterministic 
> functions.
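The SHORTCUT limitation can be illustrated with a toy constant folder. This only loosely mirrors Hive's ConstantPropagateOption.SHORTCUT; the expression grammar and class name are invented for the example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy constant folder contrasting a SHORTCUT-style pass, which leaves
// deterministic function calls alone, with a full pass that evaluates them
// over constant arguments (as in the CHECK (length(name) <= 20) constraint).
public class ConstantFoldSketch {
    private static final Pattern LEN_CMP =
            Pattern.compile("length\\('([^']*)'\\) <= (\\d+)");

    static String fold(String expr, boolean shortcut) {
        Matcher m = LEN_CMP.matcher(expr);
        if (!m.matches()) return expr;  // not a recognized constant expression
        if (shortcut) return expr;      // SHORTCUT: deterministic fn left for runtime
        // full mode: evaluate the deterministic function over the constant
        return m.group(1).length() <= Integer.parseInt(m.group(2)) ? "true" : "false";
    }

    public static void main(String[] args) {
        System.out.println(fold("length('student1') <= 20", true));  // unchanged
        System.out.println(fold("length('student1') <= 20", false)); // folded
    }
}
```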



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26371) Constant propagation does not evaluate constraint expressions at merge when CBO is enabled

2022-07-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26371?focusedWorklogId=788938=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-788938
 ]

ASF GitHub Bot logged work on HIVE-26371:
-

Author: ASF GitHub Bot
Created on: 08/Jul/22 11:08
Start Date: 08/Jul/22 11:08
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged PR #3415:
URL: https://github.com/apache/hive/pull/3415




Issue Time Tracking
---

Worklog Id: (was: 788938)
Time Spent: 20m  (was: 10m)



[jira] [Resolved] (HIVE-26373) ClassCastException when reading timestamps from HBase table with Avro data

2022-07-08 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved HIVE-26373.

Fix Version/s: 4.0.0-alpha-2
   Resolution: Fixed

Fixed in 
[https://github.com/apache/hive/commit/97d7630bca10e96229519ab397f5cf122e5622e3].
Thanks for the PR, [~soumyakanti.das]!

> ClassCastException when reading timestamps from HBase table with Avro data
> --
>
> Key: HIVE-26373
> URL: https://issues.apache.org/jira/browse/HIVE-26373
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Consider an HBase table (e.g., HiveAvroTable) that has a column with Avro data 
> and timestamps nested under complex/struct types.
> {code:sql}
> CREATE EXTERNAL TABLE hbase_avro_table(
> `key` string COMMENT '',
> `data_frv4` struct<`id`:string, `dischargedate`:struct<`value`:timestamp>>)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.hbase.HBaseSerDe'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> 'serialization.format'='1',
> 'hbase.columns.mapping' = ':key,data:frV4',
> 'data.frV4.serialization.type'='avro',
> 'data.frV4.avro.schema.url'='path/to/avro/schema/for/column/filename.avsc'
> )
> TBLPROPERTIES (
> 'hbase.table.name' = 'HiveAvroTable',
> 'hbase.struct.autogenerate'='true');
> {code}
> Any attempt to read the timestamp value from the nested struct leads to a 
> {{ClassCastException}}.
> {code:sql}
> select data_frV4.dischargedate.value from hbase_avro_table;
> {code}
> Below you can find the stack trace for the previous query:
> {noformat}
> 2022-07-05T08:40:51,572 ERROR [LocalJobRunner Map Task Executor #0] 
> mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:148)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.type.Timestamp cannot be cast to 
> org.apache.hadoop.hive.serde2.lazy.LazyPrimitive
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.AbstractPrimitiveLazyObjectInspector.getPrimitiveWritableObject(AbstractPrimitiveLazyObjectInspector.java:40)
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyTimestampObjectInspector.getPrimitiveWritableObject(LazyTimestampObjectInspector.java:29)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:308)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
> at 
> org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1059)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:937)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:937)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
> ... 11 more 
> {noformat}
> The problem starts in the {{toLazyObject}} method of 
> {*}AvroLazyObjectInspector.java{*}, when 
> [this|https://github.com/apache/hive/blob/53009126f6fe7ccf24cf052fd6c156542f38b19d/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroLazyObjectInspector.java#L347]
>  condition returns false for {*}Timestamp{*}, preventing the conversion of 
> *Timestamp* to *LazyTimestamp* 
> [here|https://github.com/apache/hive/blob/53009126f6fe7ccf24cf052fd6c156542f38b19d/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java#L132].
> The solution is to return {{true}} for Timestamps in the {{isPrimitive}} 
> method.

[jira] [Commented] (HIVE-26373) ClassCastException when reading timestamps from HBase table with Avro data

2022-07-08 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564233#comment-17564233
 ] 

Stamatis Zampetakis commented on HIVE-26373:


I updated the summary and description to better reflect the problem.


[jira] [Updated] (HIVE-26373) ClassCastException when reading timestamps from HBase table with Avro data

2022-07-08 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26373:
---
Description: 
Consider an HBase table (e.g., HiveAvroTable) that has a column with Avro data 
and timestamps nested under complex/struct types.
{code:sql}
CREATE EXTERNAL TABLE hbase_avro_table(
`key` string COMMENT '',
`data_frv4` struct<`id`:string, `dischargedate`:struct<`value`:timestamp>>)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
'serialization.format'='1',
'hbase.columns.mapping' = ':key,data:frV4',
'data.frV4.serialization.type'='avro',
'data.frV4.avro.schema.url'='path/to/avro/schema/for/column/filename.avsc'
)
TBLPROPERTIES (
'hbase.table.name' = 'HiveAvroTable',
'hbase.struct.autogenerate'='true');
{code}
Any attempt to read the timestamp value from the nested struct leads to a 
{{ClassCastException}}.
{code:sql}
select data_frV4.dischargedate.value from hbase_avro_table;
{code}
Below you can find the stack trace for the previous query:
{noformat}
2022-07-05T08:40:51,572 ERROR [LocalJobRunner Map Task Executor #0] 
mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
Error while processing row
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:148)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.common.type.Timestamp cannot be cast to 
org.apache.hadoop.hive.serde2.lazy.LazyPrimitive
at 
org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.AbstractPrimitiveLazyObjectInspector.getPrimitiveWritableObject(AbstractPrimitiveLazyObjectInspector.java:40)
at 
org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyTimestampObjectInspector.getPrimitiveWritableObject(LazyTimestampObjectInspector.java:29)
at 
org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:308)
at 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
at 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
at 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
at 
org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1059)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:937)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:937)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
... 11 more 
{noformat}
The problem starts in the {{toLazyObject}} method of 
{*}AvroLazyObjectInspector.java{*}, when 
[this|https://github.com/apache/hive/blob/53009126f6fe7ccf24cf052fd6c156542f38b19d/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroLazyObjectInspector.java#L347]
 condition returns false for {*}Timestamp{*}, preventing the conversion of 
*Timestamp* to *LazyTimestamp* 
[here|https://github.com/apache/hive/blob/53009126f6fe7ccf24cf052fd6c156542f38b19d/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java#L132].

The solution is to return {{true}} for Timestamps in the {{isPrimitive}} method.

  was:
For Avro data where the schema has nested struct with a Timestamp datatype, we 
get the following ClassCastException:
{code:java}
2022-07-05T08:40:51,572 ERROR [LocalJobRunner Map Task Executor #0] 
mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
Error while processing row
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
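The bug pattern described above, an isPrimitive-style check that misses a type so the raw object escapes to a consumer that casts it to the lazy wrapper, can be sketched in plain Java. Class and method names here are illustrative, not Hive's exact API:

```java
// Sketch: a lazy-object factory only wraps values its isPrimitive() check
// recognizes; a type missing from the check (Timestamp) reaches a consumer
// that casts to the lazy wrapper type and fails at runtime.
public class IsPrimitiveSketch {
    static class Timestamp { }
    static class LazyTimestamp {
        final Timestamp t;
        LazyTimestamp(Timestamp t) { this.t = t; }
    }

    static boolean isPrimitive(Class<?> c, boolean patched) {
        // unpatched: a fixed list that omits Timestamp; patched: Timestamp included
        return c == String.class || c == Integer.class
                || (patched && c == Timestamp.class);
    }

    static Object toLazyObject(Object value, boolean patched) {
        if (value instanceof Timestamp && isPrimitive(value.getClass(), patched)) {
            return new LazyTimestamp((Timestamp) value); // proper conversion
        }
        return value; // raw Timestamp leaks through unconverted
    }

    public static String read(boolean patched) {
        Object o = toLazyObject(new Timestamp(), patched);
        try {
            LazyTimestamp lazy = (LazyTimestamp) o; // what the object inspector effectively does
            return "ok";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(read(false)); // unpatched path fails
        System.out.println(read(true));  // patched path converts cleanly
    }
}
```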
 

[jira] [Commented] (HIVE-26376) Hive Metastore connection leak (OOM Error)

2022-07-08 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564232#comment-17564232
 ] 

Ayush Saxena commented on HIVE-26376:
-

[~zabetak] 

This FileSystem isn't per user, is it? In that case there will always be just 
one instance, irrespective of the number of calls, right?


{code:java}
final FileSystem fs = path.getFileSystem(conf);
{code}

When it is created per user, it is closed in the finally block of this method:

{code:java}
public static void checkFileAccessWithImpersonation(final FileSystem fs, final 
FileStatus stat, final FsAction action,
  final String user) throws Exception 
{code}



 



[jira] [Updated] (HIVE-26373) ClassCastException when reading timestamps from HBase table with Avro data

2022-07-08 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26373:
---
Summary: ClassCastException when reading timestamps from HBase table with 
Avro data  (was: ClassCastException while inserting Avro data into Hbase table 
for nested struct with Timestamp)

> ClassCastException when reading timestamps from HBase table with Avro data
> --
>
> Key: HIVE-26373
> URL: https://issues.apache.org/jira/browse/HIVE-26373
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> For Avro data where the schema has nested struct with a Timestamp datatype, 
> we get the following ClassCastException:
> {code:java}
> 2022-07-05T08:40:51,572 ERROR [LocalJobRunner Map Task Executor #0] 
> mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:148)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.type.Timestamp cannot be cast to 
> org.apache.hadoop.hive.serde2.lazy.LazyPrimitive
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.AbstractPrimitiveLazyObjectInspector.getPrimitiveWritableObject(AbstractPrimitiveLazyObjectInspector.java:40)
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyTimestampObjectInspector.getPrimitiveWritableObject(LazyTimestampObjectInspector.java:29)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:308)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
> at 
> org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1059)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:937)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:937)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
> ... 11 more {code}
> The problem starts in the {{toLazyObject}} method of 
> {*}AvroLazyObjectInspector.java{*}, when 
> [this|https://github.com/apache/hive/blob/53009126f6fe7ccf24cf052fd6c156542f38b19d/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroLazyObjectInspector.java#L347]
>  condition returns false for {*}Timestamp{*}, preventing the conversion of 
> *Timestamp* to *LazyTimestamp* 
> [here|https://github.com/apache/hive/blob/53009126f6fe7ccf24cf052fd6c156542f38b19d/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java#L132].
> The fix is to return {{true}} for timestamps in the {{isPrimitive}} method.
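The failure mode above can be sketched with a small toy model (illustrative names only; this is not Hive's actual class hierarchy, and the real `isPrimitive` check inspects Hive object inspectors rather than Java classes): a type predicate that does not recognize timestamps lets the raw value through unwrapped, and a downstream cast to the lazy wrapper then fails.

```java
import java.time.Instant;

// Hypothetical, simplified model of the bug: values are only wrapped when the
// isPrimitive check recognizes their type; timestamps fell through unwrapped,
// so a later cast to the Lazy wrapper threw ClassCastException.
class IsPrimitiveSketch {
    // Stand-ins for Hive's LazyPrimitive/LazyTimestamp (names are illustrative).
    interface LazyPrimitive { Object writable(); }
    record LazyTimestamp(Instant ts) implements LazyPrimitive {
        public Object writable() { return ts; }
    }

    // Buggy predicate: the timestamp type is missing, so it is never wrapped.
    static boolean isPrimitiveBuggy(Class<?> c) {
        return c == String.class || Number.class.isAssignableFrom(c);
    }

    // Fixed predicate: recognize the timestamp type as primitive too.
    static boolean isPrimitiveFixed(Class<?> c) {
        return isPrimitiveBuggy(c) || c == Instant.class;
    }

    static Object toLazyObject(Object value,
                               java.util.function.Predicate<Class<?>> isPrimitive) {
        if (isPrimitive.test(value.getClass()) && value instanceof Instant i) {
            return new LazyTimestamp(i);   // analogous to Timestamp -> LazyTimestamp
        }
        return value;  // unwrapped: a later cast to LazyPrimitive would fail
    }

    public static void main(String[] args) {
        Object raw = Instant.now();
        Object buggy = toLazyObject(raw, IsPrimitiveSketch::isPrimitiveBuggy);
        Object fixed = toLazyObject(raw, IsPrimitiveSketch::isPrimitiveFixed);
        System.out.println(buggy instanceof LazyPrimitive); // false: cast would throw
        System.out.println(fixed instanceof LazyPrimitive); // true: serializer is happy
    }
}
```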



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26373) ClassCastException while inserting Avro data into Hbase table for nested struct with Timestamp

2022-07-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26373?focusedWorklogId=788932&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-788932
 ]

ASF GitHub Bot logged work on HIVE-26373:
-

Author: ASF GitHub Bot
Created on: 08/Jul/22 10:40
Start Date: 08/Jul/22 10:40
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #3418: HIVE-26373: 
ClassCastException while inserting Avro data into Hbase table for nested struct 
with Timestamp
URL: https://github.com/apache/hive/pull/3418




Issue Time Tracking
---

Worklog Id: (was: 788932)
Time Spent: 1h 10m  (was: 1h)

> ClassCastException while inserting Avro data into Hbase table for nested 
> struct with Timestamp
> --
>
> Key: HIVE-26373
> URL: https://issues.apache.org/jira/browse/HIVE-26373
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> For Avro data whose schema has a nested struct with a Timestamp datatype, 
> we get the following ClassCastException:
> {code:java}
> 2022-07-05T08:40:51,572 ERROR [LocalJobRunner Map Task Executor #0] 
> mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:148)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.type.Timestamp cannot be cast to 
> org.apache.hadoop.hive.serde2.lazy.LazyPrimitive
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.AbstractPrimitiveLazyObjectInspector.getPrimitiveWritableObject(AbstractPrimitiveLazyObjectInspector.java:40)
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyTimestampObjectInspector.getPrimitiveWritableObject(LazyTimestampObjectInspector.java:29)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:308)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
> at 
> org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1059)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:937)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:937)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
> ... 11 more {code}
> The problem starts in the {{toLazyObject}} method of 
> {*}AvroLazyObjectInspector.java{*}, when 
> [this|https://github.com/apache/hive/blob/53009126f6fe7ccf24cf052fd6c156542f38b19d/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroLazyObjectInspector.java#L347]
>  condition returns false for {*}Timestamp{*}, preventing the conversion of 
> *Timestamp* to *LazyTimestamp* 
> [here|https://github.com/apache/hive/blob/53009126f6fe7ccf24cf052fd6c156542f38b19d/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java#L132].
> The fix is to return {{true}} for timestamps in the {{isPrimitive}} method.



--
This message 

[jira] [Comment Edited] (HIVE-24083) hcatalog error in Hadoop 3.3.0: authentication type needed

2022-07-08 Thread Anmol Sundaram (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564223#comment-17564223
 ] 

Anmol Sundaram edited comment on HIVE-24083 at 7/8/22 10:35 AM:


I was able to get it running with the following (temporary?) fix:

 
{code:java}
diff --git 
a/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
 
b/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
index d183b2e61b..5e5c4132f4 100644
--- 
a/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
+++ 
b/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
@@ -38,6 +38,8 @@
import org.apache.hadoop.hive.shims.Utils;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authentication.client.PseudoAuthenticator;
+import org.apache.hadoop.security.authentication.server.AuthenticationFilter;
+import 
org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler;
import 
org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.util.GenericOptionsParser;
@@ -269,6 +271,11 @@ public FilterHolder makeAuthFilter() throws IOException {
authFilter.setInitParameter("dfs.web.authentication.kerberos.keytab",
conf.kerberosKeytab());
}
+
+ authFilter.setInitParameter(AuthenticationFilter.AUTH_TYPE, 
UserGroupInformation.isSecurityEnabled() ?
+ KerberosAuthenticationHandler.TYPE :
+ PseudoAuthenticationHandler.TYPE);
+
return authFilter;
}{code}


was (Author: JIRAUSER288438):
I was able to get it run by the following (temporary?) fix : 
diff --git 
a/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
 
b/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
index d183b2e61b..5e5c4132f4 100644
--- 
a/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
+++ 
b/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
@@ -38,6 +38,8 @@
 import org.apache.hadoop.hive.shims.Utils;
 import org.apache.hadoop.security.UserGroupInformation;
 import org.apache.hadoop.security.authentication.client.PseudoAuthenticator;
+import org.apache.hadoop.security.authentication.server.AuthenticationFilter;
+import 
org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler;
 import 
org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler;
 import org.apache.hadoop.security.SecurityUtil;
 import org.apache.hadoop.util.GenericOptionsParser;
@@ -269,6 +271,11 @@ public FilterHolder makeAuthFilter() throws IOException \{
   authFilter.setInitParameter("dfs.web.authentication.kerberos.keytab",
 conf.kerberosKeytab());
 }
+
+authFilter.setInitParameter(AuthenticationFilter.AUTH_TYPE, 
UserGroupInformation.isSecurityEnabled() ?
+KerberosAuthenticationHandler.TYPE :
+PseudoAuthenticationHandler.TYPE);
+
 return authFilter;
   }

> hcatalog error in Hadoop 3.3.0: authentication type needed
> --
>
> Key: HIVE-24083
> URL: https://issues.apache.org/jira/browse/HIVE-24083
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 3.1.2
>Reporter: Javier J. Salmeron Garcia
>Priority: Minor
>
> Using Hive 3.1.2, webhcat fails to start in Hadoop 3.3.0 with the following 
> error:
> ```
> javax.servlet.ServletException: Authentication type must be specified: 
> simple|kerberos
> ```
> I tried Hadoop 3.2.1 with the exact same settings and it starts without issues:
>  
> ```
> webhcat: /tmp/hadoop-3.2.1//bin/hadoop jar 
> /opt/bitnami/hadoop/hive/hcatalog/sbin/../share/webhcat/svr/lib/hive-webhcat-3.1.2.jar
>  org.apache.hive.hcatalog.templeton.Main
> webhcat: starting ... started.
> webhcat: done
> ```
>  
> I can provide more logs if needed. Detected authentication settings:
>  
> ```
> hadoop.http.authentication.simple.anonymous.allowed=true
> hadoop.http.authentication.type=simple
> hadoop.security.authentication=simple
> ipc.client.fallback-to-simple-auth-allowed=false
> yarn.timeline-service.http-authentication.simple.anonymous.allowed=true
> yarn.timeline-service.http-authentication.type=simple
> ```
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-24083) hcatalog error in Hadoop 3.3.0: authentication type needed

2022-07-08 Thread Anmol Sundaram (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564223#comment-17564223
 ] 

Anmol Sundaram commented on HIVE-24083:
---

I was able to get it running with the following (temporary?) fix:
diff --git 
a/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
 
b/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
index d183b2e61b..5e5c4132f4 100644
--- 
a/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
+++ 
b/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java
@@ -38,6 +38,8 @@
 import org.apache.hadoop.hive.shims.Utils;
 import org.apache.hadoop.security.UserGroupInformation;
 import org.apache.hadoop.security.authentication.client.PseudoAuthenticator;
+import org.apache.hadoop.security.authentication.server.AuthenticationFilter;
+import 
org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler;
 import 
org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler;
 import org.apache.hadoop.security.SecurityUtil;
 import org.apache.hadoop.util.GenericOptionsParser;
@@ -269,6 +271,11 @@ public FilterHolder makeAuthFilter() throws IOException {
   authFilter.setInitParameter("dfs.web.authentication.kerberos.keytab",
 conf.kerberosKeytab());
 }
+
+authFilter.setInitParameter(AuthenticationFilter.AUTH_TYPE, 
UserGroupInformation.isSecurityEnabled() ?
+KerberosAuthenticationHandler.TYPE :
+PseudoAuthenticationHandler.TYPE);
+
 return authFilter;
   }

> hcatalog error in Hadoop 3.3.0: authentication type needed
> --
>
> Key: HIVE-24083
> URL: https://issues.apache.org/jira/browse/HIVE-24083
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 3.1.2
>Reporter: Javier J. Salmeron Garcia
>Priority: Minor
>
> Using Hive 3.1.2, webhcat fails to start in Hadoop 3.3.0 with the following 
> error:
> ```
> javax.servlet.ServletException: Authentication type must be specified: 
> simple|kerberos
> ```
> I tried Hadoop 3.2.1 with the exact same settings and it starts without issues:
>  
> ```
> webhcat: /tmp/hadoop-3.2.1//bin/hadoop jar 
> /opt/bitnami/hadoop/hive/hcatalog/sbin/../share/webhcat/svr/lib/hive-webhcat-3.1.2.jar
>  org.apache.hive.hcatalog.templeton.Main
> webhcat: starting ... started.
> webhcat: done
> ```
>  
> I can provide more logs if needed. Detected authentication settings:
>  
> ```
> hadoop.http.authentication.simple.anonymous.allowed=true
> hadoop.http.authentication.type=simple
> hadoop.security.authentication=simple
> ipc.client.fallback-to-simple-auth-allowed=false
> yarn.timeline-service.http-authentication.simple.anonymous.allowed=true
> yarn.timeline-service.http-authentication.type=simple
> ```
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26376) Hive Metastore connection leak (OOM Error)

2022-07-08 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564220#comment-17564220
 ] 

Stamatis Zampetakis commented on HIVE-26376:


[~RANith] Thanks for reporting this. I believe the problem starts from the fact 
that there are several places where we do not close FileSystem objects, so they 
keep accumulating in the CACHE. In this case the class to blame seems to be the 
{{StorageBasedAuthorizationProvider}}. I created a PR with a proposed fix; can 
you please test the patch and let us know if it solves your problem?
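The accumulation mechanism can be sketched with a toy model of the cache (illustrative only; this is not Hadoop's actual `FileSystem.CACHE` code, though the real cache is similarly keyed by scheme, authority, and user, and only sheds entries when someone calls `close()`):

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of a process-wide filesystem cache: every
// distinct (scheme, authority, user) key creates a cached instance that is
// never evicted unless it is explicitly closed.
class FsCacheSketch {
    record Key(String scheme, String authority, String user) {}
    static final Map<Key, Object> CACHE = new ConcurrentHashMap<>();

    static Object get(URI uri, String user) {
        Key k = new Key(uri.getScheme(), uri.getAuthority(), user);
        // computeIfAbsent mirrors the cache hit/miss behavior of FileSystem.get
        return CACHE.computeIfAbsent(k, key -> new Object() /* "FS instance" */);
    }

    public static void main(String[] args) {
        URI hdfs = URI.create("hdfs://nn:8020/warehouse");
        for (int i = 0; i < 9000; i++) {
            get(hdfs, "user-" + i);   // e.g. one authorization check per end user
        }
        // Nothing ever closes these, so the cache grows without bound,
        // matching the > 9k retained instances seen in the heap dump.
        System.out.println(CACHE.size());
    }
}
```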

> Hive Metastore connection leak (OOM Error)
> --
>
> Key: HIVE-26376
> URL: https://issues.apache.org/jira/browse/HIVE-26376
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: !Screenshot 2022-07-07 at 11.52.33 AM.png!
>Reporter: Ranith Sardar
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2022-07-07 at 11.52.33 AM.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive version: 3.1.2
> The Hive Metastore heap size is 14 GB; a memory leak builds up over 4-5 days 
> and the metastore then throws an OOM error.
> If we disable the configuration, the memory leak disappears.
> In a 3.5 GB heap dump, a large number of FileSystem objects (> 9k instances) 
> are retained and occupy most of the heap space. A snapshot from Eclipse MAT 
> is attached.
> Below is part of the stack trace for the OOM error:
> {code:java}
> at 
> org.apache.hadoop.hive.common.FileUtils.getFileStatusOrNull(Lorg/apache/hadoop/fs/FileSystem;Lorg/apache/hadoop/fs/Path;)Lorg/apache/hadoop/fs/FileStatus;
>  (FileUtils.java:801)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/fs/Path;Ljava/util/EnumSet;)V
>  (StorageBasedAuthorizationProvider.java:371)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:346)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/hive/metastore/api/Database;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:154)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.authorizeReadDatabase(Lorg/apache/hadoop/hive/metastore/events/PreReadDatabaseEvent;)V
>  (AuthorizationPreEventListener.java:208)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.onEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (AuthorizationPreEventListener.java:153)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (HiveMetaStore.java:3221)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(Ljava/lang/String;)Lorg/apache/hadoop/hive/metastore/api/Database;
>  (HiveMetaStore.java:1352){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26376) Hive Metastore connection leak (OOM Error)

2022-07-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26376?focusedWorklogId=788922&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-788922
 ]

ASF GitHub Bot logged work on HIVE-26376:
-

Author: ASF GitHub Bot
Created on: 08/Jul/22 10:26
Start Date: 08/Jul/22 10:26
Worklog Time Spent: 10m 
  Work Description: zabetak opened a new pull request, #3424:
URL: https://github.com/apache/hive/pull/3424

   ### What changes were proposed in this pull request?
   Close filesystem references in StorageBasedAuthorizationProvider after use.
   
   ### Why are the changes needed?
   To prevent leaking filesystem objects and hitting OOM errors.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Manual tests.
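A minimal sketch of the close-after-use pattern the PR describes, using a stand-in `AutoCloseable` instead of the real `org.apache.hadoop.fs.FileSystem` (assumption: the fix releases each filesystem reference after the authorization check rather than holding it indefinitely):

```java
// Hedged sketch: each permission check obtains its own handle and closes it
// in try-with-resources, so no instances accumulate. The Resource type is a
// stand-in for a filesystem handle, not Hadoop's API.
class CloseAfterUseSketch {
    static int open = 0;  // counts handles that were opened but never closed

    static class Resource implements AutoCloseable {
        Resource() { open++; }
        boolean exists(String path) { return path.startsWith("/warehouse"); }
        @Override public void close() { open--; }
    }

    static boolean checkPermissions(String path) {
        // Each call gets a private handle and releases it, so nothing leaks.
        try (Resource fs = new Resource()) {
            return fs.exists(path);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            checkPermissions("/warehouse/db" + i);
        }
        System.out.println(open);  // every handle was closed after use
    }
}
```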
   




Issue Time Tracking
---

Worklog Id: (was: 788922)
Remaining Estimate: 0h
Time Spent: 10m

> Hive Metastore connection leak (OOM Error)
> --
>
> Key: HIVE-26376
> URL: https://issues.apache.org/jira/browse/HIVE-26376
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: !Screenshot 2022-07-07 at 11.52.33 AM.png!
>Reporter: Ranith Sardar
>Assignee: Stamatis Zampetakis
>Priority: Major
> Attachments: Screenshot 2022-07-07 at 11.52.33 AM.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive version: 3.1.2
> The Hive Metastore heap size is 14 GB; a memory leak builds up over 4-5 days 
> and the metastore then throws an OOM error.
> If we disable the configuration, the memory leak disappears.
> In a 3.5 GB heap dump, a large number of FileSystem objects (> 9k instances) 
> are retained and occupy most of the heap space. A snapshot from Eclipse MAT 
> is attached.
> Below is part of the stack trace for the OOM error:
> {code:java}
> at 
> org.apache.hadoop.hive.common.FileUtils.getFileStatusOrNull(Lorg/apache/hadoop/fs/FileSystem;Lorg/apache/hadoop/fs/Path;)Lorg/apache/hadoop/fs/FileStatus;
>  (FileUtils.java:801)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/fs/Path;Ljava/util/EnumSet;)V
>  (StorageBasedAuthorizationProvider.java:371)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:346)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/hive/metastore/api/Database;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:154)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.authorizeReadDatabase(Lorg/apache/hadoop/hive/metastore/events/PreReadDatabaseEvent;)V
>  (AuthorizationPreEventListener.java:208)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.onEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (AuthorizationPreEventListener.java:153)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (HiveMetaStore.java:3221)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(Ljava/lang/String;)Lorg/apache/hadoop/hive/metastore/api/Database;
>  (HiveMetaStore.java:1352){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26376) Hive Metastore connection leak (OOM Error)

2022-07-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26376:
--
Labels: pull-request-available  (was: )

> Hive Metastore connection leak (OOM Error)
> --
>
> Key: HIVE-26376
> URL: https://issues.apache.org/jira/browse/HIVE-26376
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: !Screenshot 2022-07-07 at 11.52.33 AM.png!
>Reporter: Ranith Sardar
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2022-07-07 at 11.52.33 AM.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive version: 3.1.2
> The Hive Metastore heap size is 14 GB; a memory leak builds up over 4-5 days 
> and the metastore then throws an OOM error.
> If we disable the configuration, the memory leak disappears.
> In a 3.5 GB heap dump, a large number of FileSystem objects (> 9k instances) 
> are retained and occupy most of the heap space. A snapshot from Eclipse MAT 
> is attached.
> Below is part of the stack trace for the OOM error:
> {code:java}
> at 
> org.apache.hadoop.hive.common.FileUtils.getFileStatusOrNull(Lorg/apache/hadoop/fs/FileSystem;Lorg/apache/hadoop/fs/Path;)Lorg/apache/hadoop/fs/FileStatus;
>  (FileUtils.java:801)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/fs/Path;Ljava/util/EnumSet;)V
>  (StorageBasedAuthorizationProvider.java:371)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:346)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/hive/metastore/api/Database;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:154)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.authorizeReadDatabase(Lorg/apache/hadoop/hive/metastore/events/PreReadDatabaseEvent;)V
>  (AuthorizationPreEventListener.java:208)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.onEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (AuthorizationPreEventListener.java:153)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (HiveMetaStore.java:3221)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(Ljava/lang/String;)Lorg/apache/hadoop/hive/metastore/api/Database;
>  (HiveMetaStore.java:1352){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26376) Hive Metastore connection leak (OOM Error)

2022-07-08 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis reassigned HIVE-26376:
--

Assignee: Stamatis Zampetakis

> Hive Metastore connection leak (OOM Error)
> --
>
> Key: HIVE-26376
> URL: https://issues.apache.org/jira/browse/HIVE-26376
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: !Screenshot 2022-07-07 at 11.52.33 AM.png!
>Reporter: Ranith Sardar
>Assignee: Stamatis Zampetakis
>Priority: Major
> Attachments: Screenshot 2022-07-07 at 11.52.33 AM.png
>
>
> Hive version: 3.1.2
> The Hive Metastore heap size is 14 GB; a memory leak builds up over 4-5 days 
> and the metastore then throws an OOM error.
> If we disable the configuration, the memory leak disappears.
> In a 3.5 GB heap dump, a large number of FileSystem objects (> 9k instances) 
> are retained and occupy most of the heap space. A snapshot from Eclipse MAT 
> is attached.
> Below is part of the stack trace for the OOM error:
> {code:java}
> at 
> org.apache.hadoop.hive.common.FileUtils.getFileStatusOrNull(Lorg/apache/hadoop/fs/FileSystem;Lorg/apache/hadoop/fs/Path;)Lorg/apache/hadoop/fs/FileStatus;
>  (FileUtils.java:801)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/fs/Path;Ljava/util/EnumSet;)V
>  (StorageBasedAuthorizationProvider.java:371)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:346)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/hive/metastore/api/Database;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:154)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.authorizeReadDatabase(Lorg/apache/hadoop/hive/metastore/events/PreReadDatabaseEvent;)V
>  (AuthorizationPreEventListener.java:208)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.onEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (AuthorizationPreEventListener.java:153)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (HiveMetaStore.java:3221)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(Ljava/lang/String;)Lorg/apache/hadoop/hive/metastore/api/Database;
>  (HiveMetaStore.java:1352){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26376) Hive Metastore connection leak (OOM Error)

2022-07-08 Thread Ranith Sardar (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564210#comment-17564210
 ] 

Ranith Sardar commented on HIVE-26376:
--

Yes, [~asolimando] [~ayushtkn]: the fs.hdfs.impl.disable.cache property was 
disabled at the HDFS level.
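For reference, the property mentioned above is a standard Hadoop client setting; a minimal configuration snippet is shown below (whether it was set to true or false in this deployment is not stated precisely in the thread, so treat this as an illustration, not the reported configuration):

```xml
<!-- core-site.xml: setting this to true bypasses the process-wide FileSystem
     cache, so FileSystem.get() returns a fresh instance each time. This is a
     workaround only: callers then become responsible for closing every
     instance themselves, or the leak simply moves out of the cache. -->
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
```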

> Hive Metastore connection leak (OOM Error)
> --
>
> Key: HIVE-26376
> URL: https://issues.apache.org/jira/browse/HIVE-26376
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: !Screenshot 2022-07-07 at 11.52.33 AM.png!
>Reporter: Ranith Sardar
>Priority: Major
> Attachments: Screenshot 2022-07-07 at 11.52.33 AM.png
>
>
> Hive version: 3.1.2
> The Hive Metastore heap size is 14 GB; a memory leak builds up over 4-5 days 
> and the metastore then throws an OOM error.
> If we disable the configuration, the memory leak disappears.
> In a 3.5 GB heap dump, a large number of FileSystem objects (> 9k instances) 
> are retained and occupy most of the heap space. A snapshot from Eclipse MAT 
> is attached.
> Below is part of the stack trace for the OOM error:
> {code:java}
> at 
> org.apache.hadoop.hive.common.FileUtils.getFileStatusOrNull(Lorg/apache/hadoop/fs/FileSystem;Lorg/apache/hadoop/fs/Path;)Lorg/apache/hadoop/fs/FileStatus;
>  (FileUtils.java:801)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/fs/Path;Ljava/util/EnumSet;)V
>  (StorageBasedAuthorizationProvider.java:371)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/fs/Path;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:346)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(Lorg/apache/hadoop/hive/metastore/api/Database;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;[Lorg/apache/hadoop/hive/ql/security/authorization/Privilege;)V
>  (StorageBasedAuthorizationProvider.java:154)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.authorizeReadDatabase(Lorg/apache/hadoop/hive/metastore/events/PreReadDatabaseEvent;)V
>  (AuthorizationPreEventListener.java:208)
>   at 
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.onEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (AuthorizationPreEventListener.java:153)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(Lorg/apache/hadoop/hive/metastore/events/PreEventContext;)V
>  (HiveMetaStore.java:3221)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(Ljava/lang/String;)Lorg/apache/hadoop/hive/metastore/api/Database;
>  (HiveMetaStore.java:1352){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26337) Duplicate and single event log under NOTIFICATION_LOG for drop-partition unlike add-partition event which overloads the metadata table

2022-07-08 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564178#comment-17564178
 ] 

Ayush Saxena commented on HIVE-26337:
-

Hmm, that isn't good. If it is true, this would hurt Hive replication 
performance as well.

Do you plan to raise a PR with the fix? Let me know if you face any issues 
there, I can try to help...

> Duplicate and single event log under NOTIFICATION_LOG for drop-partition 
> unlike add-partition event which overloads the metadata table
> --
>
> Key: HIVE-26337
> URL: https://issues.apache.org/jira/browse/HIVE-26337
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
> Environment: HDInsight 4.1.8
> Hive 3.1.2
>Reporter: Sindhu Subhas
>Priority: Major
>
> A separate event is generated for each dropped partition under NOTIFICATION_LOG, 
> whereas a single batched event is generated for add_partition during msck sync 
> partitions as part of partition management discovery.
> Say Hive is to add 5 new partitions and remove 100 partitions in the SQL 
> Server-backed metastore: one add_partition event entry is generated, whereas 100 
> drop_partition events are generated under NOTIFICATION_LOG. This overloads 
> the table and also the indexes associated with it.
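The imbalance described in the report can be illustrated with a minimal, purely hypothetical simulation (the class and method names below are invented for illustration and are not Hive code): adding partitions fires one batched event per call, while dropping fires one event per partition.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation, not Hive code: shows why dropping N partitions
// writes N rows to NOTIFICATION_LOG while adding N partitions writes one.
public class NotificationLogSketch {
    static final List<String> log = new ArrayList<>();

    // add_partitions is batched: a single ADD_PARTITION event for the call
    static void addPartitions(List<String> parts) {
        log.add("ADD_PARTITION[" + parts.size() + " partitions]");
    }

    // drop_partition fires once per partition: one event each
    static void dropPartitions(List<String> parts) {
        for (String p : parts) {
            log.add("DROP_PARTITION[" + p + "]");
        }
    }

    public static void main(String[] args) {
        List<String> added = new ArrayList<>();
        for (int i = 0; i < 5; i++) added.add("day=a" + i);
        List<String> dropped = new ArrayList<>();
        for (int i = 0; i < 100; i++) dropped.add("day=d" + i);

        addPartitions(added);    // 1 log row for 5 partitions
        dropPartitions(dropped); // 100 log rows for 100 partitions
        System.out.println(log.size()); // prints 101
    }
}
```

Batching the drop side the same way the add side is batched would reduce the 105-partition example above to 2 log rows instead of 101.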





[jira] [Commented] (HIVE-24484) Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

2022-07-08 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564175#comment-17564175
 ] 

Ayush Saxena commented on HIVE-24484:
-

Moving to 3.3.1 & clubbing it with the Tez upgrade to 0.10.2, since both need 
to go together for a green run. Changed the title to reflect that.

We have the tests sorted now in the PR; once we get an official Tez release, 
most probably within 2 weeks, we will go ahead and commit the PR.

 

> Upgrade Hadoop to 3.3.1 And Tez to 0.10.2 
> --
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 13.05h
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24484) Upgrade Hadoop to 3.3.1 And Tez to 0.10.2

2022-07-08 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HIVE-24484:

Summary: Upgrade Hadoop to 3.3.1 And Tez to 0.10.2   (was: Upgrade Hadoop 
to 3.3.1)

> Upgrade Hadoop to 3.3.1 And Tez to 0.10.2 
> --
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 13.05h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (HIVE-26378) Improve error message for masking over complex data types

2022-07-08 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HIVE-26378.
-
Fix Version/s: 4.0.0-alpha-2
   Resolution: Fixed

> Improve error message for masking over complex data types
> -
>
> Key: HIVE-26378
> URL: https://issues.apache.org/jira/browse/HIVE-26378
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Security
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current error when applying column masking over (unsupported) complex 
> data types could be improved to be more explicit.
> Currently, the thrown error is as follows:
> {noformat}
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.parse.SemanticException:org.apache.hadoop.hive.ql.parse.ParseException:
>  line 1:57 cannot recognize input near 'map' '<' 'string' in primitive type 
> specification
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:10370)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10486)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:219)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:465)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1224)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1218)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
> ... 15 more
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.parse.ParseException:line 1:57 cannot recognize 
> input near 'map' '<' 'string' in primitive type specification
> at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:214)
> at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:171)
> at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:10368)
> {noformat}
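One way to make the failure more explicit, sketched here with invented names (this is not the actual Hive patch), is to reject complex column types up front with a descriptive message instead of letting the masking rewrite surface a raw ParseException:

```java
// Hypothetical sketch of the improvement; class and method names are
// assumed for illustration and do not exist in Hive.
public class MaskingTypeCheck {
    static boolean isPrimitive(String typeName) {
        // complex types in Hive start with map<, array<, struct<, uniontype<
        String t = typeName.trim().toLowerCase();
        return !(t.startsWith("map<") || t.startsWith("array<")
                || t.startsWith("struct<") || t.startsWith("uniontype<"));
    }

    static void checkMaskable(String column, String typeName) {
        if (!isPrimitive(typeName)) {
            // fail early with an explicit message, before any AST rewrite
            throw new UnsupportedOperationException(
                "Column masking is not supported for column '" + column
                + "' of complex type " + typeName);
        }
    }

    public static void main(String[] args) {
        checkMaskable("name", "string"); // primitive type: accepted
        try {
            checkMaskable("props", "map<string,string>");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the early check is that the user sees which column and which type caused the rejection, rather than a parser error at an offset inside a rewritten query string.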





[jira] [Commented] (HIVE-26378) Improve error message for masking over complex data types

2022-07-08 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564166#comment-17564166
 ] 

Ayush Saxena commented on HIVE-26378:
-

Committed to master.

Thanx [~asolimando] for the contribution!!!

> Improve error message for masking over complex data types
> -
>
> Key: HIVE-26378
> URL: https://issues.apache.org/jira/browse/HIVE-26378
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Security
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current error when applying column masking over (unsupported) complex 
> data types could be improved to be more explicit.
> Currently, the thrown error is as follows:
> {noformat}
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.parse.SemanticException:org.apache.hadoop.hive.ql.parse.ParseException:
>  line 1:57 cannot recognize input near 'map' '<' 'string' in primitive type 
> specification
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:10370)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10486)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:219)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:465)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1224)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1218)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
> ... 15 more
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.parse.ParseException:line 1:57 cannot recognize 
> input near 'map' '<' 'string' in primitive type specification
> at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:214)
> at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:171)
> at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:10368)
> {noformat}





[jira] [Work logged] (HIVE-26378) Improve error message for masking over complex data types

2022-07-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26378?focusedWorklogId=788890&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-788890
 ]

ASF GitHub Bot logged work on HIVE-26378:
-

Author: ASF GitHub Bot
Created on: 08/Jul/22 08:52
Start Date: 08/Jul/22 08:52
Worklog Time Spent: 10m 
  Work Description: ayushtkn merged PR #3421:
URL: https://github.com/apache/hive/pull/3421




Issue Time Tracking
---

Worklog Id: (was: 788890)
Time Spent: 20m  (was: 10m)

> Improve error message for masking over complex data types
> -
>
> Key: HIVE-26378
> URL: https://issues.apache.org/jira/browse/HIVE-26378
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Security
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current error when applying column masking over (unsupported) complex 
> data types could be improved to be more explicit.
> Currently, the thrown error is as follows:
> {noformat}
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.parse.SemanticException:org.apache.hadoop.hive.ql.parse.ParseException:
>  line 1:57 cannot recognize input near 'map' '<' 'string' in primitive type 
> specification
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:10370)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10486)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:219)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:465)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1224)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1218)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
> ... 15 more
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.parse.ParseException:line 1:57 cannot recognize 
> input near 'map' '<' 'string' in primitive type specification
> at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:214)
> at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:171)
> at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:10368)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26380) Fix NPE when reading a struct field with null value from iceberg table

2022-07-08 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-26380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér reassigned HIVE-26380:



> Fix NPE when reading a struct field with null value from iceberg table
> --
>
> Key: HIVE-26380
> URL: https://issues.apache.org/jira/browse/HIVE-26380
> Project: Hive
>  Issue Type: Bug
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>
> When reading a map that contains a null struct value, an NPE is raised:
> {code:java}
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector.getStructFieldData(IcebergRecordObjectInspector.java:75)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator._evaluate(ExprNodeFieldEvaluator.java:94)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
>  at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFStruct.evaluate(GenericUDFStruct.java:70)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator._evaluate(ExprNodeFieldEvaluator.java:79)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
>  at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) 
> {code}
>  
>  
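A plausible defensive fix, shown here as a standalone sketch with assumed names rather than the actual Iceberg code, is to have the field accessor return null when the struct value itself is null instead of dereferencing it:

```java
import java.util.List;

// Hypothetical sketch, not the Iceberg patch: a null struct should make
// every field read as null rather than throw a NullPointerException.
public class StructFieldSketch {
    // stands in for Iceberg's StructField, holding a field position
    record Field(int position) {}

    static Object getStructFieldData(List<Object> record, Field field) {
        if (record == null) {
            return null; // whole struct is null: field value is null too
        }
        return record.get(field.position());
    }

    public static void main(String[] args) {
        // null struct no longer raises an NPE
        System.out.println(getStructFieldData(null, new Field(0))); // prints null
        // non-null struct still resolves the field by position
        System.out.println(getStructFieldData(List.of("a", "b"), new Field(1))); // prints b
    }
}
```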





[jira] [Comment Edited] (HIVE-24066) Hive query on parquet data should identify if column is not present in file schema and show NULL value instead of Exception

2022-07-08 Thread Seonguk Kim (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17563720#comment-17563720
 ] 

Seonguk Kim edited comment on HIVE-24066 at 7/8/22 6:25 AM:


Null check support for `context.os` would be useful 
(i.e., a null check for a struct column that does not exist in the file).


was (Author: JIRAUSER292443):
It would be useful if null check for context.os works.

(null check for struct column that not exists in file)
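The null handling suggested in the comment can be sketched, with purely hypothetical names standing in for Hive's Parquet read path, as a lookup that yields NULL when the requested nested column is absent from the file schema:

```java
import java.util.Map;

// Hypothetical sketch, not Hive's DataWritableReadSupport: when a requested
// nested column is missing from a file's schema, return null instead of
// failing with "Primitive type ... doesn't match type ...".
public class MissingColumnSketch {
    // fileSchema maps struct name -> its fields; a file written without
    // context.os simply has no "os" entry.
    static Object readNested(Map<String, Map<String, Object>> fileSchema,
                             String struct, String field) {
        Map<String, Object> s = fileSchema.get(struct);
        if (s == null || !s.containsKey(field)) {
            return null; // column not in this file: read it as NULL
        }
        return s.get(field);
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> schema =
                Map.of("app", Map.of("name", "myapp")); // no "os" struct
        System.out.println(readNested(schema, "os", "name"));  // prints null
        System.out.println(readNested(schema, "app", "name")); // prints myapp
    }
}
```

Since all the table's columns are nullable, treating a missing struct as an all-null struct is consistent with how flat missing columns already behave under schema evolution.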

> Hive query on parquet data should identify if column is not present in file 
> schema and show NULL value instead of Exception
> ---
>
> Key: HIVE-24066
> URL: https://issues.apache.org/jira/browse/HIVE-24066
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.5, 3.1.2
>Reporter: Jainik Vora
>Priority: Major
> Attachments: day_01.snappy.parquet
>
>
> I created a hive table containing columns with struct data type 
>   
> {code:java}
> CREATE EXTERNAL TABLE test_dwh.sample_parquet_table (
>   `context` struct<
> `app`: struct<
> `build`: string,
> `name`: string,
> `namespace`: string,
> `version`: string
> >,
> `device`: struct<
> `adtrackingenabled`: boolean,
> `advertisingid`: string,
> `id`: string,
> `manufacturer`: string,
> `model`: string,
> `type`: string
> >,
> `locale`: string,
> `library`: struct<
> `name`: string,
> `version`: string
> >,
> `os`: struct<
> `name`: string,
> `version`: string
> >,
> `screen`: struct<
> `height`: bigint,
> `width`: bigint
> >,
> `network`: struct<
> `carrier`: string,
> `cellular`: boolean,
> `wifi`: boolean
>  >,
> `timezone`: string,
> `userAgent`: string
> >
> ) PARTITIONED BY (day string)
> STORED as PARQUET
> LOCATION 's3://xyz/events'{code}
>  
>  All columns are nullable, hence the Parquet files read by the table don't 
> always contain all columns. If any file in a partition lacks the 
> "context.os" struct and "context.os.name" is queried, Hive throws an 
> exception as below. The same applies to "context.screen".
>   
> {code:java}
> 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 
> main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with 
> exception java.io.IOException:java.lang.RuntimeException: Primitive type 
> osshould not doesn't match typeos[name]
> 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 
> main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with 
> exception java.io.IOException:java.lang.RuntimeException: Primitive type 
> osshould not doesn't match typeos[name]java.io.IOException: 
> java.lang.RuntimeException: Primitive type osshould not doesn't match 
> typeos[name] 
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2208)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
>   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> Caused by: java.lang.RuntimeException: Primitive type osshould not doesn't 
> match typeos[name] 
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.projectLeafTypes(DataWritableReadSupport.java:330)
>  
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.projectLeafTypes(DataWritableReadSupport.java:322)
>  
>   at 
>