[jira] [Commented] (HIVE-22758) Create database with permission error when doas set to true

2020-08-31 Thread Chiran Ravani (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188126#comment-17188126
 ] 

Chiran Ravani commented on HIVE-22758:
--

[~ngangam] Uploaded a combined fix for HIVE-20001 and HIVE-22758 and created a 
PR: https://github.com/apache/hive/pull/1451

> Create database with permission error when doas set to true
> ---
>
> Key: HIVE-22758
> URL: https://issues.apache.org/jira/browse/HIVE-22758
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Chiran Ravani
>Assignee: Chiran Ravani
>Priority: Critical
> Attachments: HIVE-22758.1.patch, HIVE-22758.2.patch
>
>
> With doAs set to true, running create database with an external location 
> fails with permission denied: the hive service user (the user HMS runs as) 
> lacks write access on the target directory.
> Steps to reproduce the issue:
> 1. Enable run-as-end-user (set doAs to true).
> 2. Connect to Hive as a user other than admin, e.g. chiran.
> 3. Create a database with an external location:
> {code}
> create database externaldbexample location '/user/chiran/externaldbexample'
> {code}
> The statement fails because the hive service user does not have write access 
> on HDFS, as shown below.
> {code}
> > create database externaldbexample location '/user/chiran/externaldbexample';
> INFO  : Compiling 
> command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d): 
> create database externaldbexample location '/user/chiran/externaldbexample'
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d); 
> Time taken: 1.377 seconds
> INFO  : Executing 
> command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d): 
> create database externaldbexample location '/user/chiran/externaldbexample'
> INFO  : Starting task [Stage-0:DDL] in serial mode
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:java.lang.reflect.UndeclaredThrowableException)
> INFO  : Completed executing 
> command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d); 
> Time taken: 0.238 seconds
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:java.lang.reflect.UndeclaredThrowableException) 
> (state=08S01,code=1)
> {code}
> From Hive Metastore service log, below is seen.
> {code}
> 2020-01-22T04:36:27,870 WARN  [pool-6-thread-6]: metastore.ObjectStore 
> (ObjectStore.java:getDatabase(1010)) - Failed to get database 
> hive.externaldbexample, returning NoSuchObjectException
> 2020-01-22T04:36:27,898 INFO  [pool-6-thread-6]: metastore.HiveMetaStore 
> (HiveMetaStore.java:run(1339)) - Creating database path in managed directory 
> hdfs://c470-node2.squadron.support.hortonworks.com:8020/user/chiran/externaldbexample
> 2020-01-22T04:36:27,903 INFO  [pool-6-thread-6]: utils.FileUtils 
> (FileUtils.java:mkdir(170)) - Creating directory if it doesn't exist: 
> hdfs://namenodeaddress:8020/user/chiran/externaldbexample
> 2020-01-22T04:36:27,932 ERROR [pool-6-thread-6]: utils.MetaStoreUtils 
> (MetaStoreUtils.java:logAndThrowMetaException(169)) - Got exception: 
> org.apache.hadoop.security.AccessControlException Permission denied: 
> user=hive, access=WRITE, inode="/user/chiran":chiran:chiran:drwxr-xr-x
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1859)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1843)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1802)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:59)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3150)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1126)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:707)
> at 
> 
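The AccessControlException above is HDFS enforcing its POSIX-style permission model: mkdir requires WRITE on the parent directory, and /user/chiran is chiran:chiran drwxr-xr-x, so the hive service user only matches the "other" class (r-x). A minimal illustrative sketch of that check (not HDFS source code; the user and group names are taken from the log, and the group "hadoop" for the hive user is an assumption):

```python
# Illustrative sketch of a POSIX-style WRITE check on a parent directory,
# mirroring why FSPermissionChecker rejects mkdir for user=hive on
# /user/chiran (chiran:chiran drwxr-xr-x = 0o755). Not HDFS source code.

def can_mkdir(user, group, parent_owner, parent_group, parent_mode):
    if user == parent_owner:
        bits = (parent_mode >> 6) & 0o7   # owner class
    elif group == parent_group:
        bits = (parent_mode >> 3) & 0o7   # group class
    else:
        bits = parent_mode & 0o7          # other class
    return bool(bits & 0o2)               # WRITE bit required for mkdir

# hive falls into "other" (r-x): denied. chiran owns the directory: allowed.
assert not can_mkdir("hive", "hadoop", "chiran", "chiran", 0o755)
assert can_mkdir("chiran", "chiran", "chiran", "chiran", 0o755)
```

This is why the reported fix makes HMS create the directory as the end user under doAs, instead of as the hive service user.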

[jira] [Work logged] (HIVE-20001) With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20001?focusedWorklogId=476986&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476986
 ]

ASF GitHub Bot logged work on HIVE-20001:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 03:53
Start Date: 01/Sep/20 03:53
Worklog Time Spent: 10m 
  Work Description: cravani opened a new pull request #1451:
URL: https://github.com/apache/hive/pull/1451


   Fix for "With doas set to true, running select query as hrt_qa user on 
external table fails due to permission denied to read 
/warehouse/tablespace/managed directory" (HIVE-20001) and "Create database 
with permission error when doas set to true" (HIVE-22758).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476986)
Time Spent: 1h  (was: 50m)

> With doas set to true, running select query as hrt_qa user on external table 
> fails due to permission denied to read /warehouse/tablespace/managed 
> directory.
> 
>
> Key: HIVE-20001
> URL: https://issues.apache.org/jira/browse/HIVE-20001
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HIVE-20001.1.patch, HIVE-20001.1.patch, 
> HIVE-20001.2.patch, HIVE-20001.3.patch, HIVE-20001.4.patch, HIVE-20001.5.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Hive: With doas set to true, running select query as hrt_qa user on external 
> table fails due to permission denied to read /warehouse/tablespace/managed 
> directory.
> Steps: 
> 1. Create an external table.
> 2. Set doAs to true.
> 3. Run select count(*) as user hrt_qa.
> Table creation query:
> {code}
> beeline -n hrt_qa -p pwd -u 
> "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit"
>  --outputformat=tsv -e "drop table if exists test_table purge;
> create external table test_table(id int, age int) row format delimited fields 
> terminated by '|' stored as textfile;
> load data inpath '/tmp/table1.dat' overwrite into table test_table;
> {code}
> The select count(*) query fails:
> {code}
> beeline -n hrt_qa -p pwd -u 
> "jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit"
>  --outputformat=tsv -e "select count(*) from test_table where age>30 and 
> id<10100;"
> 2018-06-22 10:22:29,328|INFO|Thread-126|machine.py:111 - 
> tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Class path contains 
> multiple SLF4J bindings.
> 2018-06-22 10:22:29,330|INFO|Thread-126|machine.py:111 - 
> tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: See 
> http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> 2018-06-22 10:22:29,335|INFO|Thread-126|machine.py:111 - 
> tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|SLF4J: Actual binding is of 
> type [org.apache.logging.slf4j.Log4jLoggerFactory]
> 2018-06-22 10:22:31,408|INFO|Thread-126|machine.py:111 - 
> tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Format tsv is deprecated, 
> please use tsv2
> 2018-06-22 10:22:31,529|INFO|Thread-126|machine.py:111 - 
> tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|Connecting to 
> jdbc:hive2://ctr-e138-1518143905142-375925-01-06.hwx.site:2181,ctr-e138-1518143905142-375925-01-05.hwx.site:2181,ctr-e138-1518143905142-375925-01-07.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com;transportMode=http;httpPath=cliservice;ssl=true;sslTrustStore=/etc/security/serverKeys/hivetruststore.jks;trustStorePassword=changeit
> 2018-06-22 10:22:32,031|INFO|Thread-126|machine.py:111 - 
> tee_pipe()||b3a493ec-99be-483e-91fe-4b701ec27ebc|18/06/22 10:22:32 

[jira] [Updated] (HIVE-24094) cast is not null, the results are different in cbo is true and false

2020-08-31 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24094:

Component/s: CBO

> cast is not null, the results are different in cbo is true and false 
> -
>
> Key: HIVE-24094
> URL: https://issues.apache.org/jira/browse/HIVE-24094
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-08-31-10-01-26-250.png, 
> image-2020-08-31-10-02-39-154.png
>
>
> 1.CREATE TABLE IF NOT EXISTS testa
> ( 
>  SEARCHWORD STRING, 
>  COUNT_NUM BIGINT, 
>  WORDS STRING 
> ) 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\27' 
> STORED AS TEXTFILE; 
> 2.insert into testa values('searchword', 1, 'a');
> 3.set hive.cbo.enable=false;
> 4.SELECT 
> CASE 
>  WHEN CAST(searchword as bigint) IS NOT NULL THEN CAST(CAST(searchword as 
> bigint) as String) 
>  ELSE searchword 
> END AS WORDS, 
> searchword FROM testa;
> !image-2020-08-31-10-01-26-250.png!
> 5.set hive.cbo.enable=true;
> 6.SELECT 
> CASE 
>  WHEN CAST(searchword as bigint) IS NOT NULL THEN CAST(CAST(searchword as 
> bigint) as String) 
>  ELSE searchword 
> END AS WORDS, 
> searchword FROM testa;
> !image-2020-08-31-10-02-39-154.png!
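The CASE expression above hinges on CAST(searchword AS BIGINT) returning NULL for non-numeric strings, so the ELSE branch should yield the original string; the report is that CBO folds this differently. A sketch of the expected (non-CBO) semantics, illustrative only and not Hive internals:

```python
# Sketch of the intended SQL semantics: CAST(string AS BIGINT) yields
# NULL for non-numeric input, so the CASE falls through to the original
# string. Illustrative only; not Hive's expression evaluator.

def cast_bigint(s):
    try:
        return int(s)
    except ValueError:
        return None  # SQL NULL

def words(searchword):
    v = cast_bigint(searchword)
    return str(v) if v is not None else searchword

assert cast_bigint("searchword") is None
assert words("searchword") == "searchword"  # non-numeric: pass through
assert words("42") == "42"                  # numeric: round-trips via bigint
```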



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24099) unix_timestamp,intersect,except throws NPE

2020-08-31 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24099:

Component/s: CBO

> unix_timestamp,intersect,except throws NPE
> --
>
> Key: HIVE-24099
> URL: https://issues.apache.org/jira/browse/HIVE-24099
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-10-22-07-549.png, 
> image-2020-09-01-10-26-14-062.png, image-2020-09-01-10-27-23-916.png
>
>
> unix_timestamp,intersect,except throws NPE when cbo is false and 
> optimize.constant.propagation is false
> Steps to reproduce:
> 1. unix_timestamp:
>       set hive.cbo.enable=true;
>       set hive.optimize.constant.propagation=false;
>       create table test_pt(idx string, namex string) 
> partitioned by(pt_dt string) stored as orc;
> explain extended select count(1) from test_pt where pt_dt 
> = unix_timestamp();
> !image-2020-09-01-10-22-07-549.png!
> 2. intersect:
> create table t1(id int, name string, score int);
> create table t2(id int, name string, score int);
> insert into t1 values(1,'xiaoming', 98);
> insert into t2 values(2,'xiaohong', 95);
> select id from t1 intersect select id from t2;
> !image-2020-09-01-10-26-14-062.png!
> 3. except:
> select id from t1 except select id from t2;
>   !image-2020-09-01-10-27-23-916.png!
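For reference, the expected results of those set operations on the repro tables (a sketch of SQL INTERSECT/EXCEPT DISTINCT semantics, not Hive's rewrite): t1 contains id 1 and t2 contains id 2, so INTERSECT is empty and EXCEPT returns only 1, rather than throwing an NPE.

```python
# Expected set-operation results for the repro tables. Illustrative of
# SQL INTERSECT/EXCEPT DISTINCT semantics only; not Hive internals.

t1_ids = [1]  # insert into t1 values(1,'xiaoming', 98)
t2_ids = [2]  # insert into t2 values(2,'xiaohong', 95)

intersect = sorted(set(t1_ids) & set(t2_ids))
except_ = sorted(set(t1_ids) - set(t2_ids))

assert intersect == []   # no common ids
assert except_ == [1]    # ids in t1 but not t2
```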



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24101) Invalid table alias order by columns(>=2) if cbo is false

2020-08-31 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24101:

Description: 
create table a
(
item_id string, 
stru_area_id string
)partitioned by ( PT_DT string) stored as orc;

create table b
(
CREATE_ORG_ID string,
PROMOTION_ID string,
PROMOTION_STATUS string
) partitioned by (pt_dt string) stored as orc;

create table c
(
STRU_ID string,
SUP_STRU string
) partitioned by(pt_dt string) stored as orc;

set hive.cbo.enable=false;

explain
insert into table a partition( PT_DT = '2020-08-22' )
(item_id , stru_area_id)
select 
 '' ITEM_ID , T.STRU_ID STRU_AREA_ID 
from ( 
 select 
 STRU_ID STRU_ID ,T0.STRU_ID STRU_ID_BRANCH 
 from c T0 
) T
inner join ( 
 select 
 CREATE_ORG_ID
 from b TT 
) TIV
on ( STRU_ID_BRANCH = CREATE_ORG_ID ) 
group by T.STRU_ID
order by 1,2;

!image-2020-09-01-11-29-50-729.png!

If the "order by 1,2" clause is removed, the query succeeds.
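Positional ORDER BY should bind "1,2" to the select list's output columns, which after the group-by carry internal names like _col0/_col1; the failure suggests the resolver instead looks up the alias T. A sketch of the intended resolution (illustrative; the internal column names follow the _colN convention seen in Hive's error messages):

```python
# Sketch of positional ORDER BY resolution, not Hive's SemanticAnalyzer:
# 1-based positions in "order by 1,2" map to the select list's output
# columns (internally named _col0, _col1 after the group-by).

select_list = ["_col0", "_col1"]  # '' AS ITEM_ID, T.STRU_ID AS STRU_AREA_ID

def resolve_order_by(positions, outputs):
    # SQL positions are 1-based; outputs are the select list's columns
    return [outputs[p - 1] for p in positions]

assert resolve_order_by([1, 2], select_list) == ["_col0", "_col1"]
```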

 

> Invalid table alias order by columns(>=2) if cbo is false
> -
>
> Key: HIVE-24101
> URL: https://issues.apache.org/jira/browse/HIVE-24101
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-11-29-50-729.png
>
>
> create table a
> (
> item_id string, 
> stru_area_id string
> )partitioned by ( PT_DT string) stored as orc;
> create table b
> (
> CREATE_ORG_ID string,
> PROMOTION_ID string,
> PROMOTION_STATUS string
> ) partitioned by (pt_dt string) stored as orc;
> create table c
> (
> STRU_ID string,
> SUP_STRU string
> ) partitioned by(pt_dt string) stored as orc;
> set hive.cbo.enable=false;
> explain
> insert into table a partition( PT_DT = '2020-08-22' )
> (item_id , stru_area_id)
> select 
>  '' ITEM_ID , T.STRU_ID STRU_AREA_ID 
> from ( 
>  select 
>  STRU_ID STRU_ID ,T0.STRU_ID STRU_ID_BRANCH 
>  from c T0 
> ) T
> inner join ( 
>  select 
>  CREATE_ORG_ID
>  from b TT 
> ) TIV
> on ( STRU_ID_BRANCH = CREATE_ORG_ID ) 
> group by T.STRU_ID
> order by 1,2;
> !image-2020-09-01-11-29-50-729.png!
> If the "order by 1,2" clause is removed, the query succeeds.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24101) Invalid table alias order by columns(>=2) if cbo is false

2020-08-31 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24101:

Attachment: image-2020-09-01-11-29-50-729.png

> Invalid table alias order by columns(>=2) if cbo is false
> -
>
> Key: HIVE-24101
> URL: https://issues.apache.org/jira/browse/HIVE-24101
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-11-29-50-729.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-20001) With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20001?focusedWorklogId=476980&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476980
 ]

ASF GitHub Bot logged work on HIVE-20001:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 03:27
Start Date: 01/Sep/20 03:27
Worklog Time Spent: 10m 
  Work Description: cravani closed pull request #1450:
URL: https://github.com/apache/hive/pull/1450


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476980)
Time Spent: 50m  (was: 40m)

> With doas set to true, running select query as hrt_qa user on external table 
> fails due to permission denied to read /warehouse/tablespace/managed 
> directory.
> 
>
> Key: HIVE-20001
> URL: https://issues.apache.org/jira/browse/HIVE-20001
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HIVE-20001.1.patch, HIVE-20001.1.patch, 
> HIVE-20001.2.patch, HIVE-20001.3.patch, HIVE-20001.4.patch, HIVE-20001.5.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Hive: With doas set to true, running select query as hrt_qa user on external 
> table fails due to permission denied to read /warehouse/tablespace/managed 
> directory.

[jira] [Work logged] (HIVE-20001) With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20001?focusedWorklogId=476979&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476979
 ]

ASF GitHub Bot logged work on HIVE-20001:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 03:26
Start Date: 01/Sep/20 03:26
Worklog Time Spent: 10m 
  Work Description: cravani commented on pull request #1450:
URL: https://github.com/apache/hive/pull/1450#issuecomment-684173629


   Raised by mistake, please ignore.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476979)
Time Spent: 40m  (was: 0.5h)

> With doas set to true, running select query as hrt_qa user on external table 
> fails due to permission denied to read /warehouse/tablespace/managed 
> directory.
> 
>
> Key: HIVE-20001
> URL: https://issues.apache.org/jira/browse/HIVE-20001
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HIVE-20001.1.patch, HIVE-20001.1.patch, 
> HIVE-20001.2.patch, HIVE-20001.3.patch, HIVE-20001.4.patch, HIVE-20001.5.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Hive: With doas set to true, running select query as hrt_qa user on external 
> table fails due to permission denied to read /warehouse/tablespace/managed 
> directory.

[jira] [Work logged] (HIVE-20001) With doas set to true, running select query as hrt_qa user on external table fails due to permission denied to read /warehouse/tablespace/managed directory.

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20001?focusedWorklogId=476978&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476978
 ]

ASF GitHub Bot logged work on HIVE-20001:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 03:21
Start Date: 01/Sep/20 03:21
Worklog Time Spent: 10m 
  Work Description: cravani opened a new pull request #1450:
URL: https://github.com/apache/hive/pull/1450


   Fix for "With doas set to true, running select query as hrt_qa user on 
external table fails due to permission denied to read 
/warehouse/tablespace/managed directory" (HIVE-20001) and "Create database 
with permission error when doas set to true" (HIVE-22758).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476978)
Time Spent: 0.5h  (was: 20m)

> With doas set to true, running select query as hrt_qa user on external table 
> fails due to permission denied to read /warehouse/tablespace/managed 
> directory.
> 
>
> Key: HIVE-20001
> URL: https://issues.apache.org/jira/browse/HIVE-20001
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HIVE-20001.1.patch, HIVE-20001.1.patch, 
> HIVE-20001.2.patch, HIVE-20001.3.patch, HIVE-20001.4.patch, HIVE-20001.5.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hive: With doas set to true, running select query as hrt_qa user on external 
> table fails due to permission denied to read /warehouse/tablespace/managed 
> directory.

[jira] [Updated] (HIVE-24100) Syntax compile failure occurs when INSERT table column Order by is greater than 2 columns when CBO is false

2020-08-31 Thread LuGuangMing (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LuGuangMing updated HIVE-24100:
---
Description: 
Executing the following SQL fails to compile:
{code:java}
set hive.cbo.enable=false;

-- create tabls --
create table tab_1
(
item_id string, 
stru_area_id string
)partitioned by ( PT_DT string) stored as orc;

create table tab_2
(
CREATE_ORG_ID string,
PROMOTION_ID  string,
PROMOTION_STATUS string
) partitioned by (pt_dt string) stored as orc;

create table tab_3
(
STRU_ID string,
SUP_STRU string
) partitioned by(pt_dt string) stored as orc;

-- execution --
explain
insert into table tab_1 partition(PT_DT = '2020-08-22')
(item_id , stru_area_id)
select '123' ITEM_ID , T.STRU_ID STRU_AREA_ID 
from ( 
  select 
  T0.STRU_ID STRU_ID ,T0.STRU_ID STRU_ID_BRANCH 
  from  tab_3 T0 
) T
inner join ( 
  select 
  TT.CREATE_ORG_ID
  from  tab_2 TT 
) TIV
on (T.STRU_ID_BRANCH = TIV.CREATE_ORG_ID) 
group by T.STRU_ID
order by 1,2;
{code}
{code:java}
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: 
FAILED: SemanticException [Error 10004]: Line 5:28 Invalid table alias or 
column reference 'T': (possible column names are: _col0, _col1)
 at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:341)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:215)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:316)
 ~[hive-service-3.1.0.jar:3.1.0]
 at org.apache.hive.service.cli.operation.Operation.run(Operation.java:253) 
~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:684)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:670)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:342)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1144)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:1280)
 ~[hive-service-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[hive-exec-3.1.0.jar:3.1.0]
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:648)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_201]
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_201]
 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Line 5:28 Invalid 
table alias or column reference 'T': (possible column names are: _col0, _col1)
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:12689)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12629)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12597)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12575)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:8482)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:10616)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10515)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11434)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11304)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12090)
 ~[hive-exec-3.1.0.jar:3.1.0]
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12180)
 

[jira] [Updated] (HIVE-22758) Create database with permission error when doas set to true

2020-08-31 Thread Chiran Ravani (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chiran Ravani updated HIVE-22758:
-
Attachment: HIVE-22758.2.patch

> Create database with permission error when doas set to true
> ---
>
> Key: HIVE-22758
> URL: https://issues.apache.org/jira/browse/HIVE-22758
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Chiran Ravani
>Assignee: Chiran Ravani
>Priority: Critical
> Attachments: HIVE-22758.1.patch, HIVE-22758.2.patch
>
>
> With doAs set to true, running CREATE DATABASE on an external location fails 
> with permission denied, because the hive user (the user HMS runs as) lacks 
> write access on the directory.
> Steps to reproduce the issue:
> 1. Set Hive run-as-end-user (doAs) to true.
> 2. Connect to Hive as some user other than admin, e.g. chiran
> 3. Create a database with external location
> {code}
> create database externaldbexample location '/user/chiran/externaldbexample'
> {code}
> The statement fails because the hive service user does not have write access 
> on HDFS, as shown below.
> {code}
> > create database externaldbexample location '/user/chiran/externaldbexample';
> INFO  : Compiling 
> command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d): 
> create database externaldbexample location '/user/chiran/externaldbexample'
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d); 
> Time taken: 1.377 seconds
> INFO  : Executing 
> command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d): 
> create database externaldbexample location '/user/chiran/externaldbexample'
> INFO  : Starting task [Stage-0:DDL] in serial mode
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:java.lang.reflect.UndeclaredThrowableException)
> INFO  : Completed executing 
> command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d); 
> Time taken: 0.238 seconds
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:java.lang.reflect.UndeclaredThrowableException) 
> (state=08S01,code=1)
> {code}
> From Hive Metastore service log, below is seen.
> {code}
> 2020-01-22T04:36:27,870 WARN  [pool-6-thread-6]: metastore.ObjectStore 
> (ObjectStore.java:getDatabase(1010)) - Failed to get database 
> hive.externaldbexample, returning NoSuchObjectException
> 2020-01-22T04:36:27,898 INFO  [pool-6-thread-6]: metastore.HiveMetaStore 
> (HiveMetaStore.java:run(1339)) - Creating database path in managed directory 
> hdfs://c470-node2.squadron.support.hortonworks.com:8020/user/chiran/externaldbexample
> 2020-01-22T04:36:27,903 INFO  [pool-6-thread-6]: utils.FileUtils 
> (FileUtils.java:mkdir(170)) - Creating directory if it doesn't exist: 
> hdfs://namenodeaddress:8020/user/chiran/externaldbexample
> 2020-01-22T04:36:27,932 ERROR [pool-6-thread-6]: utils.MetaStoreUtils 
> (MetaStoreUtils.java:logAndThrowMetaException(169)) - Got exception: 
> org.apache.hadoop.security.AccessControlException Permission denied: 
> user=hive, access=WRITE, inode="/user/chiran":chiran:chiran:drwxr-xr-x
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1859)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1843)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1802)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:59)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3150)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1126)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:707)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> 

[jira] [Updated] (HIVE-24099) unix_timestamp,intersect,except throws NPE

2020-08-31 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24099:

Attachment: image-2020-09-01-10-27-23-916.png

> unix_timestamp,intersect,except throws NPE
> --
>
> Key: HIVE-24099
> URL: https://issues.apache.org/jira/browse/HIVE-24099
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-10-22-07-549.png, 
> image-2020-09-01-10-26-14-062.png, image-2020-09-01-10-27-23-916.png
>
>
> unix_timestamp,intersect,except throws NPE when cbo is false and 
> optimize.constant.propagation is false
> reproduced problems:
>  1. unix_timestamp:
>       set hive.cbo.enable=true;
>       set hive.optimize.constant.propagation=false;
>       create table test_pt(idx string, namex string) partitioned by(pt_dt string) stored as orc;
> explain extended select count(1) from test_pt where pt_dt = unix_timestamp();
> !image-2020-09-01-10-22-07-549.png!
> 2. intersect
> create table t1(id int, name string, score int);
> create table t2(id int, name string, score int);
> insert into t1 values(1,'xiaoming', 98);
> insert into t2 values(2,'xiaohong', 95);
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24099) unix_timestamp,intersect,except throws NPE

2020-08-31 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24099:

Description: 
unix_timestamp,intersect,except throws NPE when cbo is false and 
optimize.constant.propagation is false

reproduced problems:
 1. unix_timestamp:
      set hive.cbo.enable=true;
      set hive.optimize.constant.propagation=false;
      create table test_pt(idx string, namex string) partitioned by(pt_dt string) stored as orc;

explain extended select count(1) from test_pt where pt_dt = unix_timestamp();

!image-2020-09-01-10-22-07-549.png!

2. intersect

 create table t1(id int, name string, score int);

create table t2(id int, name string, score int);

insert into t1 values(1,'xiaoming', 98);

insert into t2 values(2,'xiaohong', 95);

select id from t1 intersect select id from t2;

!image-2020-09-01-10-26-14-062.png!

3.except 

select id from t1 except select id from t2;

  !image-2020-09-01-10-27-23-916.png!

  was:
unix_timestamp,intersect,except throws NPE when cbo is false and 
optimize.constant.propagation is false

reproduced problems:
 1. unix_timestamp:
      set hive.cbo.enable=true;
      set hive.optimize.constant.propagation=false;
      create table test_pt(idx string, namex string) partitioned by(pt_dt string) stored as orc;

explain extended select count(1) from test_pt where pt_dt = unix_timestamp();

!image-2020-09-01-10-22-07-549.png!

2. intersect

 create table t1(id int, name string, score int);

create table t2(id int, name string, score int);

insert into t1 values(1,'xiaoming', 98);

insert into t2 values(2,'xiaohong', 95);

 


> unix_timestamp,intersect,except throws NPE
> --
>
> Key: HIVE-24099
> URL: https://issues.apache.org/jira/browse/HIVE-24099
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-10-22-07-549.png, 
> image-2020-09-01-10-26-14-062.png, image-2020-09-01-10-27-23-916.png
>
>
> unix_timestamp,intersect,except throws NPE when cbo is false and 
> optimize.constant.propagation is false
> reproduced problems:
>  1. unix_timestamp:
>       set hive.cbo.enable=true;
>       set hive.optimize.constant.propagation=false;
>       create table test_pt(idx string, namex string) partitioned by(pt_dt string) stored as orc;
> explain extended select count(1) from test_pt where pt_dt = unix_timestamp();
> !image-2020-09-01-10-22-07-549.png!
> 2. intersect
> create table t1(id int, name string, score int);
> create table t2(id int, name string, score int);
> insert into t1 values(1,'xiaoming', 98);
> insert into t2 values(2,'xiaohong', 95);
> select id from t1 intersect select id from t2;
> !image-2020-09-01-10-26-14-062.png!
> 3.except 
> select id from t1 except select id from t2;
>   !image-2020-09-01-10-27-23-916.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24099) unix_timestamp,intersect,except throws NPE

2020-08-31 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24099:

Attachment: image-2020-09-01-10-26-14-062.png

> unix_timestamp,intersect,except throws NPE
> --
>
> Key: HIVE-24099
> URL: https://issues.apache.org/jira/browse/HIVE-24099
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-10-22-07-549.png, 
> image-2020-09-01-10-26-14-062.png
>
>
> unix_timestamp,intersect,except throws NPE when cbo is false and 
> optimize.constant.propagation is false
> reproduced problems:
>  1. unix_timestamp:
>       set hive.cbo.enable=true;
>       set hive.optimize.constant.propagation=false;
>       create table test_pt(idx string, namex string) partitioned by(pt_dt string) stored as orc;
> explain extended select count(1) from test_pt where pt_dt = unix_timestamp();
> !image-2020-09-01-10-22-07-549.png!
> 2. intersect
> create table t1(id int, name string, score int);
> create table t2(id int, name string, score int);
> insert into t1 values(1,'xiaoming', 98);
> insert into t2 values(2,'xiaohong', 95);
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24099) unix_timestamp,intersect,except throws NPE

2020-08-31 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24099:

Description: 
unix_timestamp,intersect,except throws NPE when cbo is false and 
optimize.constant.propagation is false

reproduced problems:
 1. unix_timestamp:
      set hive.cbo.enable=true;
      set hive.optimize.constant.propagation=false;
      create table test_pt(idx string, namex string) partitioned by(pt_dt string) stored as orc;

explain extended select count(1) from test_pt where pt_dt = unix_timestamp();

!image-2020-09-01-10-22-07-549.png!

2. intersect

 create table t1(id int, name string, score int);

create table t2(id int, name string, score int);

insert into t1 values(1,'xiaoming', 98);

insert into t2 values(2,'xiaohong', 95);

 

  was:
unix_timestamp,intersect,except throws NPE when cbo is false and 
optimize.constant.propagation is false


reproduced problems:
1. unix_timestamp:
     set hive.cbo.enable=true;
     set hive.optimize.constant.propagation=false;
     create table test_pt(idx string, namex string) partitioned by(pt_dt string) stored as orc;

explain extended select count(1) from test_pt where pt_dt = unix_timestamp();

!image-2020-09-01-10-22-07-549.png!

2.

 create table t1(id int, name string, score int);

 
 


> unix_timestamp,intersect,except throws NPE
> --
>
> Key: HIVE-24099
> URL: https://issues.apache.org/jira/browse/HIVE-24099
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-10-22-07-549.png
>
>
> unix_timestamp,intersect,except throws NPE when cbo is false and 
> optimize.constant.propagation is false
> reproduced problems:
>  1. unix_timestamp:
>       set hive.cbo.enable=true;
>       set hive.optimize.constant.propagation=false;
>       create table test_pt(idx string, namex string) partitioned by(pt_dt string) stored as orc;
> explain extended select count(1) from test_pt where pt_dt = unix_timestamp();
> !image-2020-09-01-10-22-07-549.png!
> 2. intersect
> create table t1(id int, name string, score int);
> create table t2(id int, name string, score int);
> insert into t1 values(1,'xiaoming', 98);
> insert into t2 values(2,'xiaohong', 95);
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24099) unix_timestamp,intersect,except throws NPE

2020-08-31 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24099:

Description: 
unix_timestamp,intersect,except throws NPE when cbo is false and 
optimize.constant.propagation is false


reproduced problems:
1. unix_timestamp:
     set hive.cbo.enable=true;
     set hive.optimize.constant.propagation=false;
     create table test_pt(idx string, namex string) partitioned by(pt_dt string) stored as orc;

explain extended select count(1) from test_pt where pt_dt = unix_timestamp();

!image-2020-09-01-10-22-07-549.png!

2.

 create table t1(id int, name string, score int);

 
 

> unix_timestamp,intersect,except throws NPE
> --
>
> Key: HIVE-24099
> URL: https://issues.apache.org/jira/browse/HIVE-24099
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-10-22-07-549.png
>
>
> unix_timestamp,intersect,except throws NPE when cbo is false and 
> optimize.constant.propagation is false
> reproduced problems:
> 1. unix_timestamp:
>      set hive.cbo.enable=true;
>      set hive.optimize.constant.propagation=false;
>      create table test_pt(idx string, namex string) partitioned by(pt_dt string) stored as orc;
> explain extended select count(1) from test_pt where pt_dt = unix_timestamp();
> !image-2020-09-01-10-22-07-549.png!
> 2.
> create table t1(id int, name string, score int);
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24099) unix_timestamp,intersect,except throws NPE

2020-08-31 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24099:

Attachment: image-2020-09-01-10-22-07-549.png

> unix_timestamp,intersect,except throws NPE
> --
>
> Key: HIVE-24099
> URL: https://issues.apache.org/jira/browse/HIVE-24099
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Attachments: image-2020-09-01-10-22-07-549.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23309) Lazy Initialization of Hadoop Shims

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23309:
--
Labels: pull-request-available  (was: )

> Lazy Initialization of Hadoop Shims
> ---
>
> Key: HIVE-23309
> URL: https://issues.apache.org/jira/browse/HIVE-23309
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23309.01.patch, HIVE-23309.02.patch, 
> HIVE-23309.03.patch, HIVE-23309.04.patch, HIVE-23309.05.patch, 
> HIVE-23309.06.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Initialize hadoop-shims only if CM is enabled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23561) FIX Arrow Decimal serialization for native VectorRowBatches

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23561:
--
Labels: pull-request-available  (was: )

> FIX Arrow Decimal serialization for native VectorRowBatches
> ---
>
> Key: HIVE-23561
> URL: https://issues.apache.org/jira/browse/HIVE-23561
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23561.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Arrow Serializer does not properly handle Decimal primitive values when 
> selected array is used.
> In more detail, decimalValueSetter should be setting the value at 
> *arrowIndex[i]* to the value at *hiveIndex[j]*; however, it currently uses 
> the _same_ index.
> https://github.com/apache/hive/blob/eac25e711ea750bc52f41da7ed3c32bfe36d4f67/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/Serializer.java#L926
> This works fine for cases where i == j (selected is not used) but returns 
> wrong decimal row values when i != j.
> This ticket fixes the inconsistency and adds tests with selected indexes for 
> all supported types.
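The bug pattern above can be illustrated independently of Arrow: when a batch carries a `selected` index array, the writer must read each source row through that mapping rather than positionally. A minimal sketch, with plain Java arrays standing in for the Hive and Arrow decimal vectors (class and method names hypothetical, not the actual Hive patch):

```java
// Illustrative only: long[] stands in for a column vector of unscaled decimals.
public class SelectedCopy {

    // Correct behavior: output slot i takes the source row selected[i].
    // The buggy variant described in the ticket effectively used hive[i],
    // which only coincides with the right answer when selected[i] == i.
    static long[] serialize(long[] hive, int[] selected) {
        long[] arrow = new long[selected.length];
        for (int i = 0; i < selected.length; i++) {
            arrow[i] = hive[selected[i]];
        }
        return arrow;
    }

    public static void main(String[] args) {
        long[] hive = {100, 200, 300, 400};
        int[] selected = {1, 3}; // only rows 1 and 3 survived filtering
        long[] out = serialize(hive, selected);
        System.out.println(out[0] + "," + out[1]); // 200,400
    }
}
```

With `selected = {1, 3}` the positional (buggy) copy would emit rows 0 and 1 instead, which is exactly the "wrong decimal row values when i != j" symptom described.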



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23742) Remove unintentional execution of TPC-DS query39 in qtests

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23742?focusedWorklogId=476939&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476939
 ]

ASF GitHub Bot logged work on HIVE-23742:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1160:
URL: https://github.com/apache/hive/pull/1160#issuecomment-684124139


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476939)
Time Spent: 20m  (was: 10m)

> Remove unintentional execution of TPC-DS query39 in qtests
> --
>
> Key: HIVE-23742
> URL: https://issues.apache.org/jira/browse/HIVE-23742
> Project: Hive
>  Issue Type: Task
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> TPC-DS queries under clientpositive/perf are meant only to check plan 
> regressions, so they should never actually be executed; thus the execution 
> part should be removed from query39.q and cbo_query39.q.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23582) LLAP: Make SplitLocationProvider impl pluggable

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23582?focusedWorklogId=476944&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476944
 ]

ASF GitHub Bot logged work on HIVE-23582:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1041:
URL: https://github.com/apache/hive/pull/1041#issuecomment-684124226


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476944)
Time Spent: 20m  (was: 10m)

> LLAP: Make SplitLocationProvider impl pluggable
> ---
>
> Key: HIVE-23582
> URL: https://issues.apache.org/jira/browse/HIVE-23582
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23582.1.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> LLAP uses the HostAffinitySplitLocationProvider implementation by default. For 
> non-ZooKeeper-based environments, a different split location provider may be 
> used. To facilitate that, make the SplitLocationProvider implementation class 
> pluggable.
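A reflection-based factory is the usual way such a provider is made pluggable in Hadoop-style code; a minimal generic sketch under that assumption (interface, class, and method names here are hypothetical, not the actual Hive change):

```java
public class ProviderFactory {

    // Stand-in for the SplitLocationProvider contract.
    public interface SplitLocationProvider {
        String locate(String split);
    }

    // Host-affinity-style default implementation.
    public static class DefaultProvider implements SplitLocationProvider {
        public String locate(String split) {
            return "host-for-" + split;
        }
    }

    // Load the configured implementation class by name; fall back to the
    // default when no class is configured.
    public static SplitLocationProvider create(String className) throws Exception {
        if (className == null || className.isEmpty()) {
            return new DefaultProvider();
        }
        return (SplitLocationProvider) Class.forName(className)
                .getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        SplitLocationProvider p = create(null); // no config: default impl
        System.out.println(p.locate("split-0")); // host-for-split-0
    }
}
```

The class name would typically come from a configuration property, so deployments without ZooKeeper can drop in their own implementation without patching LLAP.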



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23606) LLAP: Delay In DirectByteBuffer Clean Up For EncodedReaderImpl

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23606?focusedWorklogId=476941&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476941
 ]

ASF GitHub Bot logged work on HIVE-23606:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1057:
URL: https://github.com/apache/hive/pull/1057#issuecomment-684124206


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476941)
Time Spent: 20m  (was: 10m)

> LLAP: Delay In DirectByteBuffer Clean Up For EncodedReaderImpl
> --
>
> Key: HIVE-23606
> URL: https://issues.apache.org/jira/browse/HIVE-23606
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 3.0.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23606.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> DirectByteBuffers are only cleaned up when there is a full GC or when the 
> cleaner method of DirectByteBuffer is invoked manually. Since a full GC may 
> take some time to kick in, the native memory usage of the LLAP daemon process 
> might shoot up in the meanwhile, and this forces the YARN pmem monitor to 
> kill the container running the daemon.
> HIVE-16180 tried to solve this problem, but the code structure got messed up 
> after HIVE-15665.
> The IdentityHashMap (toRelease) is initialized in 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java#L409
>  but it is re-initialized inside the method getDataFromCacheAndDisk() 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java#L633
>  which makes it local to that method, so the original toRelease 
> IdentityHashMap remains empty.
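The field-shadowing pattern described above is easy to reproduce in isolation: re-declaring a collection inside a method hides the instance field, so a later cleanup loop over the field sees nothing. A minimal sketch (names hypothetical, not the actual EncodedReaderImpl code):

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class ShadowDemo {
    // Field meant to track buffers awaiting release.
    private final Map<Object, Boolean> toRelease = new IdentityHashMap<>();

    // Buggy: the local declaration shadows the field, so entries are lost.
    void readBuggy(Object buffer) {
        Map<Object, Boolean> toRelease = new IdentityHashMap<>(); // shadows field
        toRelease.put(buffer, true);
    }

    // Fixed: populate the field itself.
    void readFixed(Object buffer) {
        toRelease.put(buffer, true);
    }

    // What a cleanup path iterating the field would see.
    int pendingReleases() {
        return toRelease.size();
    }

    public static void main(String[] args) {
        ShadowDemo d = new ShadowDemo();
        d.readBuggy(new Object());
        System.out.println(d.pendingReleases()); // 0: nothing will be cleaned
        d.readFixed(new Object());
        System.out.println(d.pendingReleases()); // 1
    }
}
```

With the shadowed map, the buffers are never registered for release, which matches the symptom of native memory lingering until a full GC.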



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23475) Track MJ HashTable mem usage

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23475?focusedWorklogId=476947&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476947
 ]

ASF GitHub Bot logged work on HIVE-23475:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1023:
URL: https://github.com/apache/hive/pull/1023#issuecomment-684124253


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476947)
Time Spent: 20m  (was: 10m)

> Track MJ HashTable mem usage
> 
>
> Key: HIVE-23475
> URL: https://issues.apache.org/jira/browse/HIVE-23475
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23475.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21141) Fix some spell errors in Hive

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21141?focusedWorklogId=476950&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476950
 ]

ASF GitHub Bot logged work on HIVE-21141:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #519:
URL: https://github.com/apache/hive/pull/519#issuecomment-684124327


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476950)
Time Spent: 50m  (was: 40m)

> Fix some spell errors in Hive
> -
>
> Key: HIVE-21141
> URL: https://issues.apache.org/jira/browse/HIVE-21141
> Project: Hive
>  Issue Type: Bug
>Reporter: Bo Xu
>Assignee: Bo Xu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21141.1.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Fix some spell errors in Hive



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23561) FIX Arrow Decimal serialization for native VectorRowBatches

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23561?focusedWorklogId=476943&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476943
 ]

ASF GitHub Bot logged work on HIVE-23561:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1038:
URL: https://github.com/apache/hive/pull/1038#issuecomment-684124233


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476943)
Remaining Estimate: 0h
Time Spent: 10m

> FIX Arrow Decimal serialization for native VectorRowBatches
> ---
>
> Key: HIVE-23561
> URL: https://issues.apache.org/jira/browse/HIVE-23561
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-23561.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Arrow Serializer does not properly handle Decimal primitive values when 
> selected array is used.
> In more detail, decimalValueSetter should set the value at *arrowIndex[i]* to 
> the value at *hiveIndex[j]*; however, it currently uses the _same_ index!
> https://github.com/apache/hive/blob/eac25e711ea750bc52f41da7ed3c32bfe36d4f67/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/Serializer.java#L926
> This works fine for cases where i == j (selected is not used) but returns 
> wrong decimal row values when i != j.
> This ticket fixes this inconsistency and adds tests with selected indexes for 
> all supported types.
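The selected-array indexing mistake described above can be sketched in isolation. This is a minimal illustration, not Hive's actual Serializer code; the names `copySelected`, `hiveVector`, and `arrowValues` are hypothetical.

```java
public class SelectedCopyDemo {
    // Copy only the rows named in `selected` into a dense output array.
    static long[] copySelected(long[] hiveVector, int[] selected) {
        long[] arrowValues = new long[selected.length];
        for (int i = 0; i < selected.length; i++) {
            // Correct: read from the *selected* hive index, write densely at i.
            // The bug pattern described above reads hiveVector[i] instead.
            arrowValues[i] = hiveVector[selected[i]];
        }
        return arrowValues;
    }

    public static void main(String[] args) {
        long[] batch = {10, 20, 30, 40};
        int[] selected = {1, 3}; // rows 1 and 3 survive a filter
        long[] out = copySelected(batch, selected);
        // out == {20, 40}; the buggy same-index read would give {10, 20}
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

When no filter is applied, `selected[i] == i` and both variants agree, which is why the bug only shows up with a selected vector in use.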





[jira] [Work logged] (HIVE-23611) Mandate fully qualified absolute path for external table base dir during REPL operation

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23611?focusedWorklogId=476938&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476938
 ]

ASF GitHub Bot logged work on HIVE-23611:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1120:
URL: https://github.com/apache/hive/pull/1120#issuecomment-684124169


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476938)
Time Spent: 1h  (was: 50m)

> Mandate fully qualified absolute path for external table base dir during REPL 
> operation
> ---
>
> Key: HIVE-23611
> URL: https://issues.apache.org/jira/browse/HIVE-23611
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23611.01.patch, HIVE-23611.02.patch, 
> HIVE-23611.03.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23554) [LLAP] support ColumnVectorBatch with FilterContext as part of ReadPipeline

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23554?focusedWorklogId=476940&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476940
 ]

ASF GitHub Bot logged work on HIVE-23554:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1036:
URL: https://github.com/apache/hive/pull/1036#issuecomment-684124241


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476940)
Time Spent: 20m  (was: 10m)

> [LLAP] support ColumnVectorBatch with FilterContext as part of ReadPipeline
> ---
>
> Key: HIVE-23554
> URL: https://issues.apache.org/jira/browse/HIVE-23554
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23554.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently the readPipeline in LLAP supports consuming ColumnVectorBatches.
> As each batch can now be tied to a Filter (HIVE-22959, HIVE-23215), we 
> should update the pipeline to consume BatchWrappers of ColumnVectorBatch and 
> a Filter instead.





[jira] [Work logged] (HIVE-23309) Lazy Initialization of Hadoop Shims

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23309?focusedWorklogId=476946&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476946
 ]

ASF GitHub Bot logged work on HIVE-23309:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #999:
URL: https://github.com/apache/hive/pull/999#issuecomment-684124269


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476946)
Time Spent: 20m  (was: 10m)

> Lazy Initialization of Hadoop Shims
> ---
>
> Key: HIVE-23309
> URL: https://issues.apache.org/jira/browse/HIVE-23309
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-23309.01.patch, HIVE-23309.02.patch, 
> HIVE-23309.03.patch, HIVE-23309.04.patch, HIVE-23309.05.patch, 
> HIVE-23309.06.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Initialize hadoop-shims only if CM is enabled
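A minimal sketch of the lazy-initialization pattern implied above: defer the expensive shim load until change management (CM) actually asks for it. `LazyShimsDemo` and `lazy` are illustrative names under that assumption, not Hive's API.

```java
import java.util.function.Supplier;

public class LazyShimsDemo {
    // Wrap a factory so the value is built at most once, on first get().
    static <T> Supplier<T> lazy(Supplier<T> factory) {
        return new Supplier<T>() {
            private T value;
            @Override
            public synchronized T get() {
                if (value == null) {
                    value = factory.get(); // first call pays the cost
                }
                return value;
            }
        };
    }

    static int loadCount = 0;

    public static void main(String[] args) {
        Supplier<String> shims = lazy(() -> { loadCount++; return "hadoop-shims"; });
        boolean cmEnabled = false;
        if (cmEnabled) {
            shims.get();               // with CM off, shims are never loaded
        }
        System.out.println(loadCount); // 0
        shims.get();
        shims.get();
        System.out.println(loadCount); // 1: loaded once, on first demand
    }
}
```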





[jira] [Work logged] (HIVE-22360) MultiDelimitSerDe returns wrong results in last column when the loaded file has more columns than those in table schema

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22360?focusedWorklogId=476942&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476942
 ]

ASF GitHub Bot logged work on HIVE-22360:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #823:
URL: https://github.com/apache/hive/pull/823#issuecomment-684124292


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476942)
Time Spent: 0.5h  (was: 20m)

> MultiDelimitSerDe returns wrong results in last column when the loaded file 
> has more columns than those in table schema
> ---
>
> Key: HIVE-22360
> URL: https://issues.apache.org/jira/browse/HIVE-22360
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22360.1.patch, HIVE-22360.2.patch, 
> HIVE-22360.3.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Repro steps:
> Input file:
> {code}
> 1^,1^,^,0^,0^,0 
> 2^,1^,^,0^,1^,0 
> 3^,1^,^,0^,0^,0 
> 4^,1^,^,0^,1^,0
> {code}
> Queries:
> {code}
> CREATE TABLE  n2(colA int, colB tinyint, colC timestamp, colD smallint, colE 
> smallint) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="^,")STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/schaurasia/Documents/input_6_cols.csv' 
> OVERWRITE INTO TABLE n2;
>  select * from n2;
> // wrong last column results here.
> +--+--+--+--+--+
> | n2.cola  | n2.colb  | n2.colc  | n2.cold  | n2.cole  |
> +--+--+--+--+--+
> | 1| 1| NULL | 0| NULL |
> | 2| 1| NULL | 0| NULL |
> | 3| 1| NULL | 0| NULL |
> | 4| 1| NULL | 0| NULL |
> +--+--+--+--+--+
> {code}
> Cause:
> In multi-serde parsing, the total length calculation here: 
> https://github.com/apache/hive/blob/rel/release-3.1.2/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyStruct.java#L308
>  does not take extra fields into account.
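A toy model of the length accounting described above, under the assumption that the fix is to attribute only the first schema-width fields to columns; `parse` is a hypothetical helper, not the actual LazyStruct logic.

```java
import java.util.Arrays;

public class ExtraFieldsDemo {
    // Keep only the first `schemaCols` fields; extra trailing fields in the
    // file must not extend the byte range attributed to the last column.
    static String[] parse(String line, String delim, int schemaCols) {
        String[] all = line.split(java.util.regex.Pattern.quote(delim), -1);
        return Arrays.copyOf(all, schemaCols);
    }

    public static void main(String[] args) {
        // 6 fields in the file, 5 columns in the table schema
        String[] row = parse("1^,1^,^,0^,0^,0", "^,", 5);
        // last column is "0", not the overrun "0^,0"
        System.out.println(Arrays.toString(row));
    }
}
```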





[jira] [Work logged] (HIVE-23443) LLAP speculative task pre-emption seems to be not working

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23443?focusedWorklogId=476945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476945
 ]

ASF GitHub Bot logged work on HIVE-23443:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1012:
URL: https://github.com/apache/hive/pull/1012#issuecomment-684124264


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476945)
Time Spent: 1h 20m  (was: 1h 10m)

> LLAP speculative task pre-emption seems to be not working
> -
>
> Key: HIVE-23443
> URL: https://issues.apache.org/jira/browse/HIVE-23443
> Project: Hive
>  Issue Type: Bug
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23443.1.patch, HIVE-23443.2.patch, 
> HIVE-23443.3.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I think after HIVE-23210 we are getting a stable sort order and it is causing 
> pre-emption to not work in certain cases.
> {code:java}
> "attempt_1589167813851__119_01_08_0 
> (hive_20200511055921_89598f09-19f1-4969-ab7a-82e2dd796273-119/Map 1, started 
> at 2020-05-11 05:59:22, in preemption queue, can finish)", 
> "attempt_1589167813851_0008_84_01_08_1 
> (hive_20200511055928_7ae29ca3-e67d-4d1f-b193-05651023b503-84/Map 1, started 
> at 2020-05-11 06:00:23, in preemption queue, can finish)" {code}
> The scheduler only peeks at the pre-emption queue and checks whether the 
> head task is non-finishable. 
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskExecutorService.java#L420]
> In the above case, all tasks are speculative, but a state change does not 
> trigger pre-emption queue re-ordering, so peek() always returns a canFinish 
> task even though non-finishable tasks are in the queue. 
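The re-ordering hazard can be reproduced with a plain java.util.PriorityQueue, which likewise does not re-sort when an element's key mutates in place. This is a standalone sketch, not the TaskExecutorService code; the remove-and-re-insert at the end is one common fix pattern.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class PreemptionQueueDemo {
    static class Task {
        final String name;
        boolean canFinish;
        Task(String name, boolean canFinish) { this.name = name; this.canFinish = canFinish; }
    }

    public static void main(String[] args) {
        // Non-finishable tasks (canFinish == false) sort to the head, so they
        // are the preferred pre-emption victims.
        PriorityQueue<Task> q = new PriorityQueue<>(
                Comparator.comparing((Task t) -> t.canFinish));
        Task a = new Task("a", true);
        Task b = new Task("b", true);
        q.add(a);
        q.add(b);

        // The task's state flips, but the heap is never told about it.
        b.canFinish = false;
        System.out.println(q.peek().name); // may still be "a": stale ordering

        // One fix pattern: remove and re-insert on any state change so that
        // peek() reflects the current finishability.
        q.remove(b);
        q.add(b);
        System.out.println(q.peek().name); // "b": non-finishable head
    }
}
```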





[jira] [Work logged] (HIVE-23597) VectorizedOrcAcidRowBatchReader::ColumnizedDeleteEventRegistry reads delete delta directories multiple times

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23597?focusedWorklogId=476948&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476948
 ]

ASF GitHub Bot logged work on HIVE-23597:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:47
Start Date: 01/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1081:
URL: https://github.com/apache/hive/pull/1081#issuecomment-684124193


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476948)
Time Spent: 1.5h  (was: 1h 20m)

> VectorizedOrcAcidRowBatchReader::ColumnizedDeleteEventRegistry reads delete 
> delta directories multiple times
> 
>
> Key: HIVE-23597
> URL: https://issues.apache.org/jira/browse/HIVE-23597
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java#L1562]
> {code:java}
> try {
> final Path[] deleteDeltaDirs = getDeleteDeltaDirsFromSplit(orcSplit);
> if (deleteDeltaDirs.length > 0) {
>   int totalDeleteEventCount = 0;
>   for (Path deleteDeltaDir : deleteDeltaDirs) {
> {code}
>  
> Consider a directory layout like the following. This was created by having 
> simple set of "insert --> update --> select" queries.
>  
> {noformat}
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/base_001
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/base_002
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_003_003_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_004_004_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_005_005_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_006_006_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_007_007_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_008_008_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_009_009_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_010_010_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_011_011_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_012_012_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_013_013_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_003_003_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_004_004_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_005_005_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_006_006_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_007_007_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_008_008_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_009_009_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_010_010_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_011_011_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_012_012_
> 

[jira] [Work logged] (HIVE-22979) Support total file size in statistics annotation

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22979?focusedWorklogId=476937&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476937
 ]

ASF GitHub Bot logged work on HIVE-22979:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #941:
URL: https://github.com/apache/hive/pull/941#issuecomment-684124279


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476937)
Time Spent: 1h 10m  (was: 1h)

> Support total file size in statistics annotation
> 
>
> Key: HIVE-22979
> URL: https://issues.apache.org/jira/browse/HIVE-22979
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22979.1.patch, HIVE-22979.2.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Hive statistics annotation provides estimated statistics for each operator. 
> The data size provided in TableScanOperator is the raw data size (after 
> decompression and decoding), but some optimizations can be performed based 
> on the total file size on disk (scan cost estimation).





[jira] [Work logged] (HIVE-21624) LLAP: Cpu metrics at thread level is broken

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21624?focusedWorklogId=476936&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476936
 ]

ASF GitHub Bot logged work on HIVE-21624:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1030:
URL: https://github.com/apache/hive/pull/1030#issuecomment-684124249


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476936)
Time Spent: 20m  (was: 10m)

> LLAP: Cpu metrics at thread level is broken
> ---
>
> Key: HIVE-21624
> URL: https://issues.apache.org/jira/browse/HIVE-21624
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Nita Dembla
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21624.1.patch, HIVE-21624.2.patch, 
> HIVE-21624.3.patch, HIVE-21624.4.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> ExecutorThreadCPUTime and ExecutorThreadUserTime rely on thread MX bean CPU 
> metrics when available. At some point, the thread name which the metrics 
> publisher looks for changed, causing no metrics to be published for these 
> counters.  
> The above counters look for threads whose names start with 
> "ContainerExecutor", but the LLAP task executor thread name was changed to 
> "Task-Executor"
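The failure mode, matching executor threads by a hard-coded name prefix, can be shown with a small sketch; `matching` is an illustrative helper, not the actual metrics publisher.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ThreadNameFilterDemo {
    // A metrics publisher that selects executor threads by name prefix
    // silently reports nothing once the prefix drifts from the real names.
    static List<String> matching(List<String> threadNames, String prefix) {
        return threadNames.stream()
                .filter(n -> n.startsWith(prefix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> threads =
                List.of("Task-Executor-0", "Task-Executor-1", "IPC-Handler-0");
        // Stale prefix: zero matches, so no CPU metrics are published.
        System.out.println(matching(threads, "ContainerExecutor").size()); // 0
        // Current prefix: both executor threads are found.
        System.out.println(matching(threads, "Task-Executor").size());     // 2
    }
}
```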





[jira] [Work logged] (HIVE-23770) Druid filter translation unable to handle inverted between

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23770?focusedWorklogId=476925&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476925
 ]

ASF GitHub Bot logged work on HIVE-23770:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1190:
URL: https://github.com/apache/hive/pull/1190#issuecomment-684124111


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476925)
Time Spent: 0.5h  (was: 20m)

> Druid filter translation unable to handle inverted between
> --
>
> Key: HIVE-23770
> URL: https://issues.apache.org/jira/browse/HIVE-23770
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23770.1.patch, HIVE-23770.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Druid filter translation happens in Calcite and does not use the HiveBetween 
> inverted flag for translation; this misses a negation in the planned query
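A minimal model of the missed negation: a BETWEEN carrying an `inverted` flag must be wrapped in a NOT during translation. The types and names here are illustrative, not Calcite's or Druid's.

```java
public class BetweenTranslationDemo {
    // Evaluate `v BETWEEN lo AND hi`, honoring the inverted flag. A
    // translator that ignores `inverted` effectively always returns
    // the plain `between` branch and drops the negation.
    static boolean eval(long v, long lo, long hi, boolean inverted) {
        boolean between = lo <= v && v <= hi;
        return inverted ? !between : between; // the fix: honor the flag
    }

    public static void main(String[] args) {
        // x NOT BETWEEN 10 AND 20, for x = 5
        System.out.println(eval(5, 10, 20, true));  // true
        // ignoring `inverted` would have produced false here
        System.out.println(eval(5, 10, 20, false)); // false
    }
}
```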





[jira] [Work logged] (HIVE-23666) checkHashModeEfficiency is skipped when a groupby operator doesn't have a grouping set

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23666?focusedWorklogId=476935&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476935
 ]

ASF GitHub Bot logged work on HIVE-23666:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1103:
URL: https://github.com/apache/hive/pull/1103#issuecomment-684124182


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476935)
Time Spent: 20m  (was: 10m)

> checkHashModeEfficiency is skipped when a groupby operator doesn't have a 
> grouping set
> --
>
> Key: HIVE-23666
> URL: https://issues.apache.org/jira/browse/HIVE-23666
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23666.1.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> checkHashModeEfficiency is skipped when a groupby operator doesn't have a 
> grouping set





[jira] [Work logged] (HIVE-23585) Retrieve replication instance metrics details

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23585?focusedWorklogId=476933&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476933
 ]

ASF GitHub Bot logged work on HIVE-23585:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1100:
URL: https://github.com/apache/hive/pull/1100#issuecomment-684124189


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476933)
Time Spent: 0.5h  (was: 20m)

> Retrieve replication instance metrics details
> -
>
> Key: HIVE-23585
> URL: https://issues.apache.org/jira/browse/HIVE-23585
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23585.01.patch, HIVE-23585.02.patch, 
> HIVE-23585.03.patch, Replication Metrics.pdf
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23586) load data overwrite into bucket table failed

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23586?focusedWorklogId=476934&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476934
 ]

ASF GitHub Bot logged work on HIVE-23586:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1048:
URL: https://github.com/apache/hive/pull/1048#issuecomment-684124214


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476934)
Time Spent: 0.5h  (was: 20m)

> load data overwrite into bucket table failed
> 
>
> Key: HIVE-23586
> URL: https://issues.apache.org/jira/browse/HIVE-23586
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 4.0.0, 3.1.2
>Reporter: zhaolong
>Assignee: zhaolong
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HIVE-23586.01.patch, image-2020-06-01-21-40-21-726.png, 
> image-2020-06-01-21-41-28-732.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> load data overwrite into a bucketed table fails if the filename is not like 
> 00_0; instead, new data is inserted into the table.
>  
> for example:
> CREATE EXTERNAL TABLE IF NOT EXISTS test_hive2 (name string,account string) 
> PARTITIONED BY (logdate string) CLUSTERED BY (account) INTO 4 BUCKETS row 
> format delimited fields terminated by '|' STORED AS textfile;
>  load data inpath 'hdfs://hacluster/tmp/zltest' overwrite into table 
> default.test_hive2 partition (logdate='20200508');
>  !image-2020-06-01-21-40-21-726.png!
>  load data inpath 'hdfs://hacluster/tmp/zltest' overwrite into table 
> default.test_hive2 partition (logdate='20200508');// should overwrite but 
> insert new data
>  !image-2020-06-01-21-41-28-732.png!





[jira] [Work logged] (HIVE-23551) Acid: Update queries should treat dirCache as read-only in AcidUtils

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23551?focusedWorklogId=476932&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476932
 ]

ASF GitHub Bot logged work on HIVE-23551:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1047:
URL: https://github.com/apache/hive/pull/1047#issuecomment-684124220


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476932)
Time Spent: 20m  (was: 10m)

> Acid: Update queries should treat dirCache as read-only in AcidUtils
> 
>
> Key: HIVE-23551
> URL: https://issues.apache.org/jira/browse/HIVE-23551
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23551.1.patch, HIVE-23551.2.patch, 
> HIVE-23551.3.patch, HIVE-23551.4.patch, HIVE-23551.5.patch, HIVE-23551.6.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Update statements create delta folders at the end of the execution. When 
> {{insert overwrite}} followed by {{update}} is executed, it does not get any 
> open txns and ends up caching the {{base}} folder. However, the delta folder 
> which gets created at the end of the statement never makes it to the cache. 
> This creates wrong results.
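The staleness hazard can be sketched with a toy directory cache, assuming the fix is that update queries consult but never populate the cache; `DirCacheDemo` and its names are hypothetical, not AcidUtils.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DirCacheDemo {
    // Snapshot cache of ACID directory listings, keyed by table path.
    static final Map<String, List<String>> cache = new HashMap<>();

    static List<String> listDirs(String table, List<String> live, boolean readOnly) {
        if (readOnly) {
            return live; // update path: never cache a pre-delta snapshot
        }
        // Caching path: first caller pins whatever the listing was then.
        return cache.computeIfAbsent(table, k -> new ArrayList<>(live));
    }

    public static void main(String[] args) {
        List<String> live = new ArrayList<>(List.of("base_0000002"));
        listDirs("t", live, false);        // update caches [base_0000002]...
        live.add("delta_0000003_0000003"); // ...then writes its delta folder
        // Stale snapshot: the delta never made it into the cache.
        System.out.println(listDirs("t", live, false));
        // Read-only lookup sees the live listing, delta included.
        System.out.println(listDirs("t", live, true));
    }
}
```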





[jira] [Work logged] (HIVE-23688) Vectorization: IndexArrayOutOfBoundsException For map type column which includes null value

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23688?focusedWorklogId=476931&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476931
 ]

ASF GitHub Bot logged work on HIVE-23688:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1122:
URL: https://github.com/apache/hive/pull/1122#issuecomment-684124163


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476931)
Time Spent: 20m  (was: 10m)

> Vectorization: IndexArrayOutOfBoundsException For map type column which 
> includes null value
> ---
>
> Key: HIVE-23688
> URL: https://issues.apache.org/jira/browse/HIVE-23688
> Project: Hive
>  Issue Type: Bug
>  Components: Parquet, storage-api, Vectorization
>Affects Versions: All Versions
>Reporter: 范宜臻
>Assignee: 范宜臻
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.0.0, 4.0.0
>
> Attachments: HIVE-23688.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {color:#de350b}start{color} and {color:#de350b}length{color} are empty arrays 
> in MapColumnVector.values(BytesColumnVector) when the map's values contain 
> {color:#de350b}null{color}
> reproduce in master branch:
> {code:java}
> set hive.vectorized.execution.enabled=true; 
> CREATE TABLE parquet_map_type (id int, stringMap map<string,string>) 
> stored as parquet; 
> insert overwrite table parquet_map_type SELECT 1, MAP('k1', null, 'k2', 
> 'bar'); 
> select id, stringMap['k1'] from parquet_map_type group by 1,2;
> {code}
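Per the description above, the reader indexes into the child vector's start/length arrays even when those arrays were never populated because every map value is null. A minimal, self-contained sketch of the null-guard pattern (plain arrays standing in for Hive's BytesColumnVector fields; illustrative only, not the actual patch):

```java
// Hedged sketch: consult the null bitmap before touching the (possibly empty)
// start/length offset arrays. Field names mirror Hive's BytesColumnVector,
// but this is an illustration, not Hive code.
public class NullSafeMapRead {
    public static String readValue(boolean[] isNull, byte[][] vector,
                                   int[] start, int[] length, int row) {
        if (isNull[row]) {
            return null;              // skip the offset arrays entirely
        }
        return new String(vector[row], start[row], length[row]);
    }

    public static void main(String[] args) {
        // Row 0 holds null (offsets never populated), row 1 holds "bar".
        boolean[] isNull = {true, false};
        byte[][] vector = {null, "bar".getBytes()};
        int[] start = {0, 0};
        int[] length = {0, 3};
        System.out.println(readValue(isNull, vector, start, length, 0)); // null
        System.out.println(readValue(isNull, vector, start, length, 1)); // bar
    }
}
```

With the guard in place, a batch whose offset arrays are empty no longer throws ArrayIndexOutOfBoundsException for null slots.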
> query explain:
> {code:java}
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2 vectorized
>   File Output Operator [FS_12]
> Group By Operator [GBY_11] (rows=5 width=2)
>   Output:["_col0","_col1"],keys:KEY._col0, KEY._col1
> <-Map 1 [SIMPLE_EDGE] vectorized
>   SHUFFLE [RS_10]
> PartitionCols:_col0, _col1
> Group By Operator [GBY_9] (rows=10 width=2)
>   Output:["_col0","_col1"],keys:_col0, _col1
>   Select Operator [SEL_8] (rows=10 width=2)
> Output:["_col0","_col1"]
> TableScan [TS_0] (rows=10 width=2)
>   
> temp@parquet_map_type_fyz,parquet_map_type_fyz,Tbl:COMPLETE,Col:NONE,Output:["id","stringmap"]
> {code}
> runtime error:
> {code:java}
> Vertex failed, vertexName=Map 1, vertexId=vertex_1592040015150_0001_3_00, 
> diagnostics=[Task failed, taskId=task_1592040015150_0001_3_00_00, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
> failure ) : 
> attempt_1592040015150_0001_3_00_00_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row 
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>   at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>   at 

[jira] [Work logged] (HIVE-23735) Reducer misestimate for export command

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23735?focusedWorklogId=476926&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476926
 ]

ASF GitHub Bot logged work on HIVE-23735:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1165:
URL: https://github.com/apache/hive/pull/1165#issuecomment-684124135


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476926)
Time Spent: 20m  (was: 10m)

> Reducer misestimate for export command
> --
>
> Key: HIVE-23735
> URL: https://issues.apache.org/jira/browse/HIVE-23735
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23735.1.wip.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L6869
> {code}
> if (dest_tab.getNumBuckets() > 0) {
> ...
> }
> {code}
> For the "export" command, HS2 creates a dummy table, and for this table it 
> gets "1" as the number of buckets.
> {noformat}
> set hive.stats.autogather=false;
> export table sample_table to '/tmp/export/sampe_db/t1';
> {noformat}
> This skews the reducer estimate, and the query always ends up with '1' as 
> the number of reducer tasks. 
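The effect described above can be sketched as follows: once the bucket branch is taken, the data-size-based estimate is ignored, so a dummy table reporting one bucket always yields a single reducer. This is a hypothetical simplification, not the actual SemanticAnalyzer logic:

```java
// Hedged sketch of bucket-driven reducer selection: when the destination
// table is bucketed, the reducer count is forced to the bucket count and the
// data-size estimate is never consulted.
public class ReducerEstimate {
    public static int estimate(long totalSize, long bytesPerReducer,
                               int maxReducers, int numBuckets) {
        if (numBuckets > 0) {
            return numBuckets;        // bucket count wins, data size ignored
        }
        long byData = (totalSize + bytesPerReducer - 1) / bytesPerReducer; // ceil
        return (int) Math.min(maxReducers, Math.max(1, byData));
    }
}
```

For example, a 10 GB export that would otherwise be estimated at about 40 reducers (256 MB per reducer) is forced down to 1 when the dummy table reports a single bucket.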





[jira] [Work logged] (HIVE-23710) Add table meta cache limit when starting Hive server2

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23710?focusedWorklogId=476929&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476929
 ]

ASF GitHub Bot logged work on HIVE-23710:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1135:
URL: https://github.com/apache/hive/pull/1135#issuecomment-684124148


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476929)
Time Spent: 20m  (was: 10m)

> Add table meta cache limit when starting Hive server2
> -
>
> Key: HIVE-23710
> URL: https://issues.apache.org/jira/browse/HIVE-23710
> Project: Hive
>  Issue Type: Improvement
> Environment: Hive 2.3.6
>Reporter: Deegue
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23710.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When we start up HiveServer2, it connects to the metastore to fetch table 
> meta info database by database and caches it. If there are many tables in a 
> database, however, the call will exceed `hive.metastore.client.socket.timeout`.
> Then exception thrown like:
> {noformat}
> 2020-06-17T11:38:27,595  WARN [main] metastore.RetryingMetaStoreClient: 
> MetaStoreClient lost connection. Attempting to reconnect (1 of 1) after 1s. 
> getTableObjectsByName
> org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>  ~[hive-exec-2.3.6.jar:2.3.6]
>   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) 
> ~[hive-exec-2.3.6.jar:2.3.6]
>   at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) 
> ~[hive-exec-2.3.6.jar:2.3.6]
>   at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) 
> ~[hive-exec-2.3.6.jar:2.3.6]
>   at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
>  ~[hive-exec-2.3.6.jar:2.3.6]
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77) 
> ~[hive-exec-2.3.6.jar:2.3.6]
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_objects_by_name_req(ThriftHiveMetastore.java:1596)
>  ~[hive-exec-2.3.6.jar:2.3.6]
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_objects_by_name_req(ThriftHiveMetastore.java:1583)
>  ~[hive-exec-2.3.6.jar:2.3.6]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTableObjectsByName(HiveMetaStoreClient.java:1370)
>  ~[hive-exec-2.3.6.jar:2.3.6]
>   at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTableObjectsByName(SessionHiveMetaStoreClient.java:238)
>  ~[hive-exec-2.3.6.jar:2.3.6]
>   at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_121]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:206)
>  ~[hive-exec-2.3.6.jar:2.3.6]
>   at com.sun.proxy.$Proxy38.getTableObjectsByName(Unknown Source) ~[?:?]
>   at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_121]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2336)
>  ~[hive-exec-2.3.6.jar:2.3.6]
>   at com.sun.proxy.$Proxy38.getTableObjectsByName(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllTableObjects(Hive.java:1343) 
> ~[hive-exec-2.3.6.jar:2.3.6]
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.init(HiveMaterializedViewsRegistry.java:127)
>  ~[hive-exec-2.3.6.jar:2.3.6]
>   at 
> 
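One mitigation for the timeout above is to page through the table list in bounded batches so that no single getTableObjectsByName RPC has to return an entire large database within the socket timeout. A generic sketch of that batching pattern (the RPC is abstracted as a function argument; the batch size is an assumed tunable, not an existing Hive setting):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hedged sketch: fetch table objects in fixed-size batches so each metastore
// call stays small enough to complete within the client socket timeout.
public class BatchedTableFetch {
    public static <T> List<T> fetchAll(List<String> names, int batchSize,
                                       Function<List<String>, List<T>> fetcher) {
        List<T> out = new ArrayList<>();
        for (int i = 0; i < names.size(); i += batchSize) {
            int end = Math.min(i + batchSize, names.size());
            out.addAll(fetcher.apply(names.subList(i, end)));  // one small RPC
        }
        return out;
    }
}
```

The same pattern also bounds memory use when warming a table meta cache at startup.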

[jira] [Work logged] (HIVE-20817) Reading Timestamp datatype via HiveServer2 gives errors

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20817?focusedWorklogId=476930&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476930
 ]

ASF GitHub Bot logged work on HIVE-20817:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1179:
URL: https://github.com/apache/hive/pull/1179


   





Issue Time Tracking
---

Worklog Id: (was: 476930)
Time Spent: 50m  (was: 40m)

> Reading Timestamp datatype via HiveServer2 gives errors
> ---
>
> Key: HIVE-20817
> URL: https://issues.apache.org/jira/browse/HIVE-20817
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20817.01.patch, HIVE-20817.02.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> CREATE TABLE JdbcBasicRead ( empno int, desg string,empname string,doj 
> timestamp,Salary float,mgrid smallint, deptno tinyint ) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',';
> LOAD DATA LOCAL INPATH '/tmp/art_jdbc/hive/input/input_7columns.txt' 
> OVERWRITE INTO TABLE JdbcBasicRead;
> Sample Data.
> —
> 7369,M,SMITH,1980-12-17 17:07:29.234234,5000.00,7902,20
> 7499,X,ALLEN,1981-02-20 17:07:29.234234,1250.00,7698,30
> 7521,X,WARD,1981-02-22 17:07:29.234234,01600.57,7698,40
> 7566,M,JONES,1981-04-02 17:07:29.234234,02975.65,7839,10
> 7654,X,MARTIN,1981-09-28 17:07:29.234234,01250.00,7698,20
> 7698,M,BLAKE,1981-05-01 17:07:29.234234,2850.98,7839,30
> 7782,M,CLARK,1981-06-09 17:07:29.234234,02450.00,7839,20
> —
> Select statement: SELECT empno, desg, empname, doj, salary, mgrid, deptno 
> FROM JdbcBasicWrite
> {code}
> 2018-09-25T07:11:03,222 WARN [HiveServer2-Handler-Pool: Thread-83]: 
> thrift.ThriftCLIService (:()) - Error fetching results:
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.type.Timestamp cannot be cast to 
> java.sql.Timestamp
> at 
> org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:469)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:910)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source) ~[?:?]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ~[hadoop-common-3.1.1.3.0.1.0-187.jar:?]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at com.sun.proxy.$Proxy46.fetchResults(Unknown Source) ~[?:?]
> at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:564) 
> ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:786)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)
>  ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> 
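The stack trace above shows the root cause: Hive's internal org.apache.hadoop.hive.common.type.Timestamp is not a subclass of java.sql.Timestamp, so the cast fails and an explicit conversion is needed. A hedged sketch of such a bridge via epoch seconds and nanoseconds (the accessor shape is an assumption for illustration, not necessarily Hive's actual API):

```java
import java.sql.Timestamp;

// Hedged sketch: bridge an epoch-seconds/nanos representation (the kind of
// data Hive's internal Timestamp carries) to java.sql.Timestamp instead of
// casting between the two unrelated classes.
public class TimestampBridge {
    public static Timestamp toSqlTimestamp(long epochSeconds, int nanos) {
        Timestamp ts = new Timestamp(epochSeconds * 1000L);
        ts.setNanos(nanos);   // carries sub-millisecond precision across
        return ts;
    }
}
```

The conversion keeps nanosecond precision, which matters for values like `17:07:29.234234` in the sample data.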

[jira] [Work logged] (HIVE-23755) Fix Ranger Url extra slash

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23755?focusedWorklogId=476928&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476928
 ]

ASF GitHub Bot logged work on HIVE-23755:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1173:
URL: https://github.com/apache/hive/pull/1173#issuecomment-684124125


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476928)
Time Spent: 50m  (was: 40m)

> Fix Ranger Url extra slash
> --
>
> Key: HIVE-23755
> URL: https://issues.apache.org/jira/browse/HIVE-23755
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23755.01.patch, HIVE-23755.02.patch, 
> HIVE-23755.03.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23784) Fix Replication Metrics Sink to DB

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23784?focusedWorklogId=476927&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476927
 ]

ASF GitHub Bot logged work on HIVE-23784:
-

Author: ASF GitHub Bot
Created on: 01/Sep/20 00:46
Start Date: 01/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1193:
URL: https://github.com/apache/hive/pull/1193#issuecomment-684124104


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 476927)
Time Spent: 20m  (was: 10m)

> Fix Replication Metrics Sink to DB
> --
>
> Key: HIVE-23784
> URL: https://issues.apache.org/jira/browse/HIVE-23784
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23784.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (HIVE-24098) Bump Jetty to 9.4.31.v20200723 to get rid of Tomcat CVE warnings

2020-08-31 Thread Sai Hemanth Gantasala (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Hemanth Gantasala reassigned HIVE-24098:



> Bump Jetty to 9.4.31.v20200723 to get rid of Tomcat CVE warnings
> 
>
> Key: HIVE-24098
> URL: https://issues.apache.org/jira/browse/HIVE-24098
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>
> The Jetty jar has fixes for transitive CVEs (apache-jsp; see details below).
> When using the Apache JServ Protocol (AJP), care must be taken when trusting 
> incoming connections to Apache Tomcat. Tomcat treats AJP connections as 
> having higher trust than, for example, a similar HTTP connection. If such 
> connections are available to an attacker, they can be exploited in ways that 
> may be surprising. In Apache Tomcat 9.0.0.M1 to 9.0.0.30, 8.5.0 to 8.5.50, 
> and 7.0.0 to 7.0.99, Tomcat shipped with an AJP Connector enabled by default 
> that listened on all configured IP addresses. It was expected (and 
> recommended in the security guide) that this Connector would be disabled if 
> not required. This vulnerability report identified a mechanism that allowed:
> - returning arbitrary files from anywhere in the web application
> - processing any file in the web application as a JSP
> Further, if the web application allowed file upload and stored those files 
> within the web application (or the attacker was able to control the content of 
> the web application by some other means) then this, along with the ability to 
> process a file as a JSP, made remote code execution possible. It is important 
> to note that mitigation is only required if an AJP port is accessible to 
> untrusted users.
> So we need to upgrade Jetty to 9.4.30+ to get rid of the Tomcat CVE warnings.
>  * 
> [https://github.com/eclipse/jetty.project/commit/fedc7c65997d433bbdfc26fb3d861f8488f9c804]
>  * 
> [https://github.com/eclipse/jetty.project/commit/74a2ce7a4299014d0b8e4549961e7034ae24c3d1]
> There are also a bunch of other misc fixes:
> [https://github.com/eclipse/jetty.project/releases]





[jira] [Work logged] (HIVE-22622) Hive allows to create a struct with duplicate attribute names

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22622?focusedWorklogId=476848&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476848
 ]

ASF GitHub Bot logged work on HIVE-22622:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 21:28
Start Date: 31/Aug/20 21:28
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #1446:
URL: https://github.com/apache/hive/pull/1446#discussion_r480409116



##
File path: common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
##
@@ -471,6 +471,7 @@
   "Not an ordered-set aggregate function: {0}. WITHIN GROUP clause is 
not allowed.", true),
   WITHIN_GROUP_PARAMETER_MISMATCH(10422,
   "The number of hypothetical direct arguments ({0}) must match the 
number of ordering columns ({1})", true),
+  AMBIGUOUS_STRUCT_FIELD(10423, "Struct field is not unique: {0}", true),

Review comment:
   nit: Usually we use "field" for row types and "attribute" for struct 
types. Plus it reads more naturally if we inline the attribute name in the 
sentence: 
   
   `Attribute \"{0}\" specified more than once in structured type.` 







Issue Time Tracking
---

Worklog Id: (was: 476848)
Time Spent: 20m  (was: 10m)

> Hive allows to create a struct with duplicate attribute names
> -
>
> Key: HIVE-22622
> URL: https://issues.apache.org/jira/browse/HIVE-22622
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When you create a table with a struct that has the same attribute name twice, 
> Hive allows you to create it:
> create table test_struct( duplicateColumn struct<id:int,id:int>);
> You can insert data into it:
> insert into test_struct select named_struct("id",1,"id",1);
> But you cannot read it:
> select * from test_struct;
> Return : java.io.IOException: java.io.IOException: Error reading file: 
> hdfs://.../test_struct/delta_001_001_/bucket_0 ,
> We can create and insert, but reading the struct part of the table fails. We 
> can still read all other columns (if we have more than one) but not the 
> struct anymore.
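The natural fix direction, as the review comments above suggest, is to reject the duplicate at CREATE TABLE time instead of failing at read time. A minimal, standalone sketch of such a validation (illustrative; not the actual HIVE-22622 patch):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hedged sketch: detect a repeated attribute name in a struct definition so
// the error can surface at DDL time rather than when the data is read back.
public class StructFieldCheck {
    public static String findDuplicate(List<String> fieldNames) {
        Set<String> seen = new HashSet<>();
        for (String name : fieldNames) {
            // Hive identifiers are case-insensitive, so compare lowercased.
            if (!seen.add(name.toLowerCase())) {
                return name;          // first duplicate encountered
            }
        }
        return null;                  // all names unique
    }
}
```

A caller would raise the new AMBIGUOUS_STRUCT_FIELD error when a non-null duplicate is returned.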





[jira] [Work logged] (HIVE-24097) correct NPE exception in HiveMetastoreAuthorizer

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24097?focusedWorklogId=476835&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476835
 ]

ASF GitHub Bot logged work on HIVE-24097:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 21:08
Start Date: 31/Aug/20 21:08
Worklog Time Spent: 10m 
  Work Description: sam-an-cloudera opened a new pull request #1448:
URL: https://github.com/apache/hive/pull/1448


   ### What changes were proposed in this pull request?
   Corrects an NPE.
   
   
   ### Why are the changes needed?
   It fixes an NPE.
   
   
   ### Does this PR introduce _any_ user-facing change?
   no
   
   
   ### How was this patch tested?
   manual
   





Issue Time Tracking
---

Worklog Id: (was: 476835)
Remaining Estimate: 0h
Time Spent: 10m

> correct NPE exception in HiveMetastoreAuthorizer
> 
>
> Key: HIVE-24097
> URL: https://issues.apache.org/jira/browse/HIVE-24097
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Sam An
>Assignee: Sam An
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some testing, we found it's possible to hit an NPE if the preEventType 
> does not fall within the several event types the HMS currently checks. This 
> leaves the AuthzContext a null pointer. 
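The guard behind such a fix can be sketched as a null-safe fallback for pre-event types that build no context (class and method names here are illustrative, not the actual HiveMetaStoreAuthorizer code):

```java
// Hedged sketch: when no branch builds a context for the event type, skip the
// check instead of dereferencing a null context.
public class NullSafeAuthorize {
    static final class AuthzContext {
        final String op;
        AuthzContext(String op) { this.op = op; }
    }

    public static AuthzContext buildContext(String preEventType) {
        switch (preEventType) {
            case "CREATE_TABLE": return new AuthzContext("create");
            case "DROP_TABLE":   return new AuthzContext("drop");
            default:             return null;   // unhandled event type
        }
    }

    public static boolean authorize(String preEventType) {
        AuthzContext ctx = buildContext(preEventType);
        if (ctx == null) {
            return true;   // nothing to check for unhandled events -> no NPE
        }
        return !ctx.op.isEmpty();
    }
}
```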





[jira] [Updated] (HIVE-24097) correct NPE exception in HiveMetastoreAuthorizer

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24097:
--
Labels: pull-request-available  (was: )

> correct NPE exception in HiveMetastoreAuthorizer
> 
>
> Key: HIVE-24097
> URL: https://issues.apache.org/jira/browse/HIVE-24097
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Sam An
>Assignee: Sam An
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some testing, we found it's possible to hit an NPE if the preEventType 
> does not fall within the several event types the HMS currently checks. This 
> leaves the AuthzContext a null pointer. 





[jira] [Assigned] (HIVE-24097) correct NPE exception in HiveMetastoreAuthorizer

2020-08-31 Thread Sam An (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam An reassigned HIVE-24097:
-


> correct NPE exception in HiveMetastoreAuthorizer
> 
>
> Key: HIVE-24097
> URL: https://issues.apache.org/jira/browse/HIVE-24097
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Sam An
>Assignee: Sam An
>Priority: Minor
>
> In some testing, we found it's possible to hit an NPE if the preEventType 
> does not fall within the several event types the HMS currently checks. This 
> leaves the AuthzContext a null pointer. 





[jira] [Work logged] (HIVE-24059) Llap external client - Initial changes for running in cloud environment

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24059?focusedWorklogId=476799&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476799
 ]

ASF GitHub Bot logged work on HIVE-24059:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 20:14
Start Date: 31/Aug/20 20:14
Worklog Time Spent: 10m 
  Work Description: prasanthj commented on a change in pull request #1418:
URL: https://github.com/apache/hive/pull/1418#discussion_r480371903



##
File path: 
llap-common/src/java/org/apache/hadoop/hive/llap/security/DefaultJwtSharedSecretProvider.java
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.llap.security;
+
+import com.google.common.base.Preconditions;
+import io.jsonwebtoken.security.Keys;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.conf.HiveConf;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.CharBuffer;
+import java.nio.charset.StandardCharsets;
+import java.security.Key;
+
+import static 
org.apache.hadoop.hive.conf.HiveConf.ConfVars.LLAP_EXTERNAL_CLIENT_CLOUD_DEPLOYMENT_SETUP_ENABLED;
+import static 
org.apache.hadoop.hive.conf.HiveConf.ConfVars.LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET;
+
+/**
+ * Default implementation of {@link JwtSecretProvider}.
+ *
+ * 1. It first tries to get shared secret from conf {@link 
HiveConf.ConfVars#LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET}
+ * using {@link Configuration#getPassword(String)}.
+ *
+ * 2. If not found, it tries to read from env var {@link 
#LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_ENV_VAR}.
+ *
+ * If secret is not found even after 1) and 2), {@link #init(Configuration)} 
methods throws {@link NullPointerException}.
+ *
+ * It uses the same encryption and decryption secret which can be used to sign 
and verify JWT.
+ */
+public class DefaultJwtSharedSecretProvider implements JwtSecretProvider {
+
+  public static final String 
LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_ENV_VAR =
+  "LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_ENV_VAR";
+
+  private Key jwtEncryptionKey;
+
+  @Override public Key getEncryptionSecret() {
+return jwtEncryptionKey;
+  }
+
+  @Override public Key getDecryptionSecret() {
+return jwtEncryptionKey;
+  }
+
+  @Override public void init(final Configuration conf) {
+char[] sharedSecret;
+byte[] sharedSecretBytes = null;
+
+// try getting secret from conf first
+// if not found, get from env var - 
LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_ENV_VAR
+try {
+  sharedSecret = 
conf.getPassword(LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET.varname);
+} catch (IOException e) {
+  throw new RuntimeException("Unable to get password 
[hive.llap.external.client.cloud.jwt.shared.secret] - "
+  + e.getMessage(), e);
+}
+if (sharedSecret != null) {
+  ByteBuffer bb = 
StandardCharsets.UTF_8.encode(CharBuffer.wrap(sharedSecret));
+  sharedSecretBytes = new byte[bb.remaining()];
+  bb.get(sharedSecretBytes);
+} else {
+  String sharedSecredFromEnv = 
System.getenv(LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_ENV_VAR);
+  if (sharedSecredFromEnv != null) {
+sharedSecretBytes = sharedSecredFromEnv.getBytes();
+  }
+}
+
+Preconditions.checkNotNull(sharedSecretBytes,

Review comment:
   nit: IllegalStateException is more appropriate I guess. check state 
instead? 
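For context, Guava's Preconditions.checkNotNull throws NullPointerException while checkState throws IllegalStateException; the reviewer's point is that a missing secret is invalid configuration state, not a programming error. A Guava-free sketch of the suggested behavior (an assumed shape, not the actual change):

```java
// Hedged sketch of the suggested change: a missing secret means nothing was
// configured (invalid state), so signal IllegalStateException rather than NPE.
public class SecretCheck {
    public static byte[] requireSecret(byte[] sharedSecretBytes) {
        if (sharedSecretBytes == null) {
            throw new IllegalStateException(
                "JWT shared secret found neither in conf nor in env var");
        }
        return sharedSecretBytes;
    }
}
```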

##
File path: 
llap-common/src/java/org/apache/hadoop/hive/llap/security/DefaultJwtSharedSecretProvider.java
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or 

[jira] [Work logged] (HIVE-24022) Optimise HiveMetaStoreAuthorizer.createHiveMetaStoreAuthorizer

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24022?focusedWorklogId=476743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476743
 ]

ASF GitHub Bot logged work on HIVE-24022:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 18:27
Start Date: 31/Aug/20 18:27
Worklog Time Spent: 10m 
  Work Description: sam-an-cloudera commented on pull request #1385:
URL: https://github.com/apache/hive/pull/1385#issuecomment-683950735


   > @sam-an-cloudera initialValue() return type was Configuration and is now 
HiveConf which is a sub-class of Configuration. Was there a reason for this 
change? It would be more appropriate that the return type be Configuration. 
Thoughts?
   
   It was mainly to avoid having to cast. The config used in the 
createHiveMetastoreAuthorizer() method is required to be a HiveConf, in 
particular for the createHiveAuthorizer() call. If we keep it as Configuration, 
then down there we have to do a cast, which is not good style, so I changed the 
original to HiveConf.
   
   HiveConf hiveConf = tConfig.get();
   if (hiveConf == null) {
     // If initialValue() still returned Configuration, this line would need a cast.
     HiveConf hiveConf1 = new HiveConf(super.getConf(), HiveConf.class);
     tConfig.set(hiveConf1);
     hiveConf = hiveConf1;
   }





Issue Time Tracking
---

Worklog Id: (was: 476743)
Time Spent: 0.5h  (was: 20m)

> Optimise HiveMetaStoreAuthorizer.createHiveMetaStoreAuthorizer
> --
>
> Key: HIVE-24022
> URL: https://issues.apache.org/jira/browse/HIVE-24022
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Sam An
>Priority: Minor
>  Labels: performance, pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> For a table with 3000+ partitions, analyze table takes a lot longer, as 
> HiveMetaStoreAuthorizer tries to create a HiveConf for every partition request.
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L319]
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L447]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24072) HiveAggregateJoinTransposeRule may try to create an invalid transformation

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24072?focusedWorklogId=476710&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476710
 ]

ASF GitHub Bot logged work on HIVE-24072:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 17:37
Start Date: 31/Aug/20 17:37
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1432:
URL: https://github.com/apache/hive/pull/1432#discussion_r480282656



##
File path: ql/src/test/results/clientpositive/llap/groupby_join_pushdown.q.out
##
@@ -644,29 +646,18 @@ STAGE PLANS:
   Statistics: Num rows: 9173 Data size: 82188 Basic stats: 
COMPLETE Column stats: COMPLETE
   Group By Operator
 aggregations: max(_col0)
-keys: _col1 (type: bigint)
-minReductionHashAggr: 0.49994552
+keys: _col1 (type: bigint), _col0 (type: int)

Review comment:
   The aggregate call column is also part of the group by key now.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476710)
Time Spent: 40m  (was: 0.5h)

> HiveAggregateJoinTransposeRule may try to create an invalid transformation
> --
>
> Key: HIVE-24072
> URL: https://issues.apache.org/jira/browse/HIVE-24072
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> java.lang.AssertionError: 
> Cannot add expression of different type to set:
> set type is RecordType(INTEGER NOT NULL o_orderkey, DECIMAL(10, 0) 
> o_totalprice, DATE o_orderdate, INTEGER NOT NULL c_custkey, VARCHAR(25) 
> CHARACTER SET "UTF-16LE" c_name, DOUBLE $f5) NOT NULL
> expression type is RecordType(INTEGER NOT NULL o_orderkey, INTEGER NOT NULL 
> o_custkey, DECIMAL(10, 0) o_totalprice, DATE o_orderdate, INTEGER NOT NULL 
> c_custkey, DOUBLE $f1) NOT NULL
> set is rel#567:HiveAggregate.HIVE.[].any(input=HepRelVertex#490,group={2, 4, 
> 5, 6, 7},agg#0=sum($1))
> expression is HiveProject(o_orderkey=[$2], o_custkey=[$3], o_totalprice=[$4], 
> o_orderdate=[$5], c_custkey=[$6], $f1=[$1])
>   HiveJoin(condition=[=($2, $0)], joinType=[inner], algorithm=[none], 
> cost=[{2284.5 rows, 0.0 cpu, 0.0 io}])
> HiveAggregate(group=[{0}], agg#0=[sum($1)])
>   HiveProject(l_orderkey=[$0], l_quantity=[$4])
> HiveTableScan(table=[[tpch_0_001, lineitem]], table:alias=[l])
> HiveJoin(condition=[=($0, $6)], joinType=[inner], algorithm=[none], 
> cost=[{1.9115E15 rows, 0.0 cpu, 0.0 io}])
>   HiveJoin(condition=[=($4, $1)], joinType=[inner], algorithm=[none], 
> cost=[{1650.0 rows, 0.0 cpu, 0.0 io}])
> HiveProject(o_orderkey=[$0], o_custkey=[$1], o_totalprice=[$3], 
> o_orderdate=[$4])
>   HiveTableScan(table=[[tpch_0_001, orders]], table:alias=[orders])
> HiveProject(c_custkey=[$0], c_name=[$1])
>   HiveTableScan(table=[[tpch_0_001, customer]], 
> table:alias=[customer])
>   HiveProject($f0=[$0])
> HiveFilter(condition=[>($1, 3E2)])
>   HiveAggregate(group=[{0}], agg#0=[sum($4)])
> HiveTableScan(table=[[tpch_0_001, lineitem]], 
> table:alias=[lineitem])
>   at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:383)
>   at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:236)
>   at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveAggregateJoinTransposeRule.onMatch(HiveAggregateJoinTransposeRule.java:300)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24072) HiveAggregateJoinTransposeRule may try to create an invalid transformation

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24072?focusedWorklogId=476709&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476709
 ]

ASF GitHub Bot logged work on HIVE-24072:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 17:32
Start Date: 31/Aug/20 17:32
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1432:
URL: https://github.com/apache/hive/pull/1432#discussion_r480279852



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java
##
@@ -145,8 +145,7 @@ public void onMatch(RelOptRuleCall call) {
 int fieldCount = joinInput.getRowType().getFieldCount();
 final ImmutableBitSet fieldSet =
 ImmutableBitSet.range(offset, offset + fieldCount);
-final ImmutableBitSet belowAggregateKeyNotShifted =
-belowAggregateColumns.intersect(fieldSet);

Review comment:
   `belowAggregateColumns` includes the aggregate fields for all inputs 
together with the join keys (L120). The intersection is supposed to gather only 
those applicable to the current input in the loop; it should not remove column 2 
unless I am missing something... 
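The intersection step discussed above can be illustrated with `java.util.BitSet` (Calcite's `ImmutableBitSet` behaves analogously for this purpose; the column numbers below are made up for illustration):

```java
import java.util.BitSet;

public class IntersectDemo {
    // Build a BitSet containing bits [from, to), like ImmutableBitSet.range(from, to).
    static BitSet range(int from, int to) {
        BitSet bs = new BitSet();
        bs.set(from, to);
        return bs;
    }

    public static void main(String[] args) {
        // Suppose the below-aggregate columns across both join inputs are {0, 2, 5},
        // and the current input in the loop owns fields [0, 4).
        BitSet belowAggregateColumns = new BitSet();
        belowAggregateColumns.set(0);
        belowAggregateColumns.set(2);
        belowAggregateColumns.set(5);

        BitSet fieldSet = range(0, 4);

        // The intersection keeps only the columns belonging to this input: {0, 2}.
        BitSet intersection = (BitSet) belowAggregateColumns.clone();
        intersection.and(fieldSet);

        System.out.println(intersection); // prints "{0, 2}"
    }
}
```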





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476709)
Time Spent: 0.5h  (was: 20m)

> HiveAggregateJoinTransposeRule may try to create an invalid transformation
> --
>
> Key: HIVE-24072
> URL: https://issues.apache.org/jira/browse/HIVE-24072
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code}
> java.lang.AssertionError: 
> Cannot add expression of different type to set:
> set type is RecordType(INTEGER NOT NULL o_orderkey, DECIMAL(10, 0) 
> o_totalprice, DATE o_orderdate, INTEGER NOT NULL c_custkey, VARCHAR(25) 
> CHARACTER SET "UTF-16LE" c_name, DOUBLE $f5) NOT NULL
> expression type is RecordType(INTEGER NOT NULL o_orderkey, INTEGER NOT NULL 
> o_custkey, DECIMAL(10, 0) o_totalprice, DATE o_orderdate, INTEGER NOT NULL 
> c_custkey, DOUBLE $f1) NOT NULL
> set is rel#567:HiveAggregate.HIVE.[].any(input=HepRelVertex#490,group={2, 4, 
> 5, 6, 7},agg#0=sum($1))
> expression is HiveProject(o_orderkey=[$2], o_custkey=[$3], o_totalprice=[$4], 
> o_orderdate=[$5], c_custkey=[$6], $f1=[$1])
>   HiveJoin(condition=[=($2, $0)], joinType=[inner], algorithm=[none], 
> cost=[{2284.5 rows, 0.0 cpu, 0.0 io}])
> HiveAggregate(group=[{0}], agg#0=[sum($1)])
>   HiveProject(l_orderkey=[$0], l_quantity=[$4])
> HiveTableScan(table=[[tpch_0_001, lineitem]], table:alias=[l])
> HiveJoin(condition=[=($0, $6)], joinType=[inner], algorithm=[none], 
> cost=[{1.9115E15 rows, 0.0 cpu, 0.0 io}])
>   HiveJoin(condition=[=($4, $1)], joinType=[inner], algorithm=[none], 
> cost=[{1650.0 rows, 0.0 cpu, 0.0 io}])
> HiveProject(o_orderkey=[$0], o_custkey=[$1], o_totalprice=[$3], 
> o_orderdate=[$4])
>   HiveTableScan(table=[[tpch_0_001, orders]], table:alias=[orders])
> HiveProject(c_custkey=[$0], c_name=[$1])
>   HiveTableScan(table=[[tpch_0_001, customer]], 
> table:alias=[customer])
>   HiveProject($f0=[$0])
> HiveFilter(condition=[>($1, 3E2)])
>   HiveAggregate(group=[{0}], agg#0=[sum($4)])
> HiveTableScan(table=[[tpch_0_001, lineitem]], 
> table:alias=[lineitem])
>   at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:383)
>   at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:236)
>   at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveAggregateJoinTransposeRule.onMatch(HiveAggregateJoinTransposeRule.java:300)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23456) Upgrade Calcite version to 1.25.0

2020-08-31 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-23456:
---
Summary: Upgrade Calcite version to 1.25.0  (was: Upgrade Calcite version 
to 1.23.0)

> Upgrade Calcite version to 1.25.0
> -
>
> Key: HIVE-23456
> URL: https://issues.apache.org/jira/browse/HIVE-23456
> Project: Hive
>  Issue Type: Task
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
> Attachments: HIVE-23456.01.patch, HIVE-23456.02.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24022) Optimise HiveMetaStoreAuthorizer.createHiveMetaStoreAuthorizer

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24022?focusedWorklogId=476675&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476675
 ]

ASF GitHub Bot logged work on HIVE-24022:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 16:59
Start Date: 31/Aug/20 16:59
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on pull request #1385:
URL: https://github.com/apache/hive/pull/1385#issuecomment-683904667


   @sam-an-cloudera initialValue() return type was Configuration and is now 
HiveConf which is a sub-class of Configuration. Was there a reason for this 
change? It would be more appropriate that the return type be Configuration. 
Thoughts?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476675)
Time Spent: 20m  (was: 10m)

> Optimise HiveMetaStoreAuthorizer.createHiveMetaStoreAuthorizer
> --
>
> Key: HIVE-24022
> URL: https://issues.apache.org/jira/browse/HIVE-24022
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Sam An
>Priority: Minor
>  Labels: performance, pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> For a table with 3000+ partitions, analyze table takes a lot longer, as 
> HiveMetaStoreAuthorizer tries to create a HiveConf for every partition request.
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L319]
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L447]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23938) LLAP: JDK11 - some GC log file rotation related jvm arguments cannot be used anymore

2020-08-31 Thread Ashutosh Chauhan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17187862#comment-17187862
 ] 

Ashutosh Chauhan commented on HIVE-23938:
-

+1

> LLAP: JDK11 - some GC log file rotation related jvm arguments cannot be used 
> anymore
> 
>
> Key: HIVE-23938
> URL: https://issues.apache.org/jira/browse/HIVE-23938
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: gc_2020-07-27-13.log, gc_2020-07-29-12.jdk8.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/llap-server/bin/runLlapDaemon.sh#L55
> {code}
> JAVA_OPTS_BASE="-server -Djava.net.preferIPv4Stack=true -XX:+UseNUMA 
> -XX:+PrintGCDetails -verbose:gc -XX:+UseGCLogFileRotation 
> -XX:NumberOfGCLogFiles=4 -XX:GCLogFileSize=100M -XX:+PrintGCDateStamps"
> {code}
> on JDK11 I got something like:
> {code}
> + exec /usr/lib/jvm/jre-11-openjdk/bin/java -Dproc_llapdaemon -Xms32000m 
> -Xmx64000m -Dhttp.maxConnections=17 -XX:+UseG1GC -XX:+ResizeTLAB -XX:+UseNUMA 
> -XX:+AggressiveOpts -XX:MetaspaceSize=1024m 
> -XX:InitiatingHeapOccupancyPercent=80 -XX:MaxGCPauseMillis=200 
> -XX:+PreserveFramePointer -XX:AllocatePrefetchStyle=2 
> -Dhttp.maxConnections=10 -Dasync.profiler.home=/grid/0/async-profiler -server 
> -Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+PrintGCDetails -verbose:gc 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=4 -XX:GCLogFileSize=100M 
> -XX:+PrintGCDateStamps 
> -Xloggc:/grid/2/yarn/container-logs/application_1595375468459_0113/container_e26_1595375468459_0113_01_09/gc_2020-07-27-12.log
>  
> ... 
> org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon
> OpenJDK 64-Bit Server VM warning: Option AggressiveOpts was deprecated in 
> version 11.0 and will likely be removed in a future release.
> Unrecognized VM option 'UseGCLogFileRotation'
> Error: Could not create the Java Virtual Machine.
> Error: A fatal exception has occurred. Program will exit.
> {code}
> These are not valid in JDK11:
> {code}
> -XX:+UseGCLogFileRotation
> -XX:NumberOfGCLogFiles
> -XX:GCLogFileSize
> -XX:+PrintGCTimeStamps
> -XX:+PrintGCDateStamps
> {code}
> Instead, use something like:
> {code}
> -Xlog:gc*,safepoint:gc.log:time,uptime:filecount=4,filesize=100M
> {code}
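The old-vs-new flag mapping can be captured in a small helper (a sketch only; the method and class names are illustrative and not part of the actual runLlapDaemon.sh fix):

```java
public class GcLogOpts {
    // Unified logging (-Xlog, JEP 158/271) replaced the individual GC-log flags in JDK 9+.
    static String gcLogOpts(int javaMajorVersion, String logPath) {
        if (javaMajorVersion >= 9) {
            // JDK 9+: one -Xlog option covers details, timestamps, and rotation.
            return "-Xlog:gc*,safepoint:" + logPath + ":time,uptime:filecount=4,filesize=100M";
        }
        // JDK 8 and earlier: separate flags for log target, details, and rotation.
        return "-Xloggc:" + logPath
                + " -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
                + " -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=4 -XX:GCLogFileSize=100M";
    }

    public static void main(String[] args) {
        System.out.println(gcLogOpts(11, "gc.log"));
        System.out.println(gcLogOpts(8, "gc.log"));
    }
}
```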



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24064) Disable Materialized View Replication

2020-08-31 Thread Arko Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arko Sharma updated HIVE-24064:
---
Attachment: HIVE-24064.06.patch

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24064.01.patch, HIVE-24064.02.patch, 
> HIVE-24064.03.patch, HIVE-24064.04.patch, HIVE-24064.05.patch, 
> HIVE-24064.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24096) Abort failed compaction's txn on TException or IOException

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24096:
--
Labels: pull-request-available  (was: )

> Abort failed compaction's txn on TException or IOException
> --
>
> Key: HIVE-24096
> URL: https://issues.apache.org/jira/browse/HIVE-24096
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If compaction fails with a TException or IOException (e.g. an IOException from 
> [getAcidState|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java#L500])
>  thrown after the compaction txn has been opened, the compaction is marked 
> 'failed' but the compaction txn is never aborted.
> We should abort an open compaction txn upon TExceptions/IOExceptions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24096) Abort failed compaction's txn on TException or IOException

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24096?focusedWorklogId=476612&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476612
 ]

ASF GitHub Bot logged work on HIVE-24096:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 15:03
Start Date: 31/Aug/20 15:03
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #1447:
URL: https://github.com/apache/hive/pull/1447


   If compaction fails with a TException or IOException (e.g. an IOException from 
getAcidState) thrown after the compaction txn has been opened, the 
compaction is marked 'failed' but the compaction txn is never aborted.
   
   We should abort an open compaction txn upon TExceptions/IOExceptions.
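The proposed control flow can be sketched as follows (`TxnState` and `runCompaction` are hypothetical stand-ins for Hive's Worker/TxnStore machinery, and catching `IOException` alone stands in for the TException/IOException pair so the sketch compiles without Thrift):

```java
import java.io.IOException;

public class CompactionSketch {
    // Stand-in for the metastore side: tracks whether the open txn was aborted.
    static class TxnState {
        boolean open = true;
        boolean aborted = false;
        void abort() { open = false; aborted = true; }
        void commit() { open = false; }
    }

    // Runs one compaction attempt; on IOException (or, in Hive, TException)
    // the txn must be aborted, not just the compaction marked 'failed'.
    static boolean runCompaction(TxnState txn, boolean failWithIoError) {
        try {
            if (failWithIoError) {
                throw new IOException("simulated getAcidState failure");
            }
            txn.commit();
            return true;
        } catch (IOException e) {
            // Abort the open txn so it cannot linger after the failure.
            txn.abort();
            return false;
        }
    }

    public static void main(String[] args) {
        TxnState txn = new TxnState();
        boolean ok = runCompaction(txn, true);
        System.out.println(!ok && txn.aborted && !txn.open); // prints "true"
    }
}
```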



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476612)
Remaining Estimate: 0h
Time Spent: 10m

> Abort failed compaction's txn on TException or IOException
> --
>
> Key: HIVE-24096
> URL: https://issues.apache.org/jira/browse/HIVE-24096
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If compaction fails with a TException or IOException (e.g. an IOException from 
> [getAcidState|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java#L500])
>  thrown after the compaction txn has been opened, the compaction is marked 
> 'failed' but the compaction txn is never aborted.
> We should abort an open compaction txn upon TExceptions/IOExceptions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24096) Abort failed compaction's txn on TException or IOException

2020-08-31 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-24096:
-
Summary: Abort failed compaction's txn on TException or IOException  (was: 
Abort failed compaction's txn on TException|IOException)

> Abort failed compaction's txn on TException or IOException
> --
>
> Key: HIVE-24096
> URL: https://issues.apache.org/jira/browse/HIVE-24096
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>
> If compaction fails with a TException or IOException (e.g. an IOException from 
> [getAcidState|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java#L500])
>  thrown after the compaction txn has been opened, the compaction is marked 
> 'failed' but the compaction txn is never aborted.
> We should abort an open compaction txn upon TExceptions/IOExceptions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24096) Abort failed compaction's txn on TException|IOException

2020-08-31 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage reassigned HIVE-24096:



> Abort failed compaction's txn on TException|IOException
> ---
>
> Key: HIVE-24096
> URL: https://issues.apache.org/jira/browse/HIVE-24096
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>
> If compaction fails with a TException or IOException (e.g. an IOException from 
> [getAcidState|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java#L500])
>  thrown after the compaction txn has been opened, the compaction is marked 
> 'failed' but the compaction txn is never aborted.
> We should abort an open compaction txn upon TExceptions/IOExceptions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22622) Hive allows to create a struct with duplicate attribute names

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-22622:
--
Labels: pull-request-available  (was: )

> Hive allows to create a struct with duplicate attribute names
> -
>
> Key: HIVE-22622
> URL: https://issues.apache.org/jira/browse/HIVE-22622
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When you create a table with a struct that has the same attribute name twice, 
> Hive allows you to create it:
> create table test_struct( duplicateColumn struct<id:int,id:int>);
> You can insert data into it:
> insert into test_struct select named_struct("id",1,"id",1);
> But you cannot read it:
> select * from test_struct;
> It returns: java.io.IOException: java.io.IOException: Error reading file: 
> hdfs://.../test_struct/delta_001_001_/bucket_0
> We can create and insert, but reading fails on the struct part of the table. We 
> can still read all other columns (if we have more than one), but not the 
> struct anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22622) Hive allows to create a struct with duplicate attribute names

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22622?focusedWorklogId=476568&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476568
 ]

ASF GitHub Bot logged work on HIVE-22622:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 13:02
Start Date: 31/Aug/20 13:02
Worklog Time Spent: 10m 
  Work Description: kasakrisz opened a new pull request #1446:
URL: https://github.com/apache/hive/pull/1446


   ### What changes were proposed in this pull request?
   Add a check for duplicated struct field identifiers and throw a 
SemanticException with a customized error message when one is found.
   
   ### Why are the changes needed?
   Creating a table with a struct type column that has a duplicate field 
identifier, and inserting records into it, is allowed, but later, when querying 
the table, we cannot distinguish between attributes of the struct that share 
the same identifier.
   In some cases (depending on the table serde format) the query may fail. See 
the jira for details.
   
   ### Does this PR introduce _any_ user-facing change?
   Introduce new error code and message. Example:
   ```
   FAILED: SemanticException [Error 10423]: Struct field is not unique: id
   ```
   
   ### How was this patch tested?
   1. Create new negative test:
   ```
   mvn test -Dtest.output.overwrite -DskipSparkTests 
-Dtest=TestNegativeCliDriver -Dqfile=struct_field_uniqueness.q -pl itests/qtest 
-Pitests
   ```
   
   2. Reproduce query failure
   ```
   CREATE TABLE person
   (
   `id`  int,
   `address` struct<number:int, street:string, number:int>
   )
   ROW FORMAT SERDE
   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
   STORED AS INPUTFORMAT
   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
   OUTPUTFORMAT
   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat';
   
   INSERT INTO person
   VALUES (1, named_struct('number', 61, 'street', 'Terrasse', 'number', 62));
   INSERT INTO person
   VALUES (2, named_struct('number', 51, 'street', 'Terrasse', 'number', 52));
   
   SELECT address.number FROM person;
   ```
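A minimal sketch of the uniqueness check described above (the actual patch throws Hive's SemanticException with error code 10423; `IllegalArgumentException` is a stand-in here so the sketch is self-contained, and the case-insensitive comparison mirrors Hive's case-insensitive column names):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StructFieldCheck {
    // Reject duplicate field identifiers, case-insensitively.
    static void checkFieldUniqueness(List<String> fieldNames) {
        Set<String> seen = new HashSet<>();
        for (String name : fieldNames) {
            if (!seen.add(name.toLowerCase(Locale.ROOT))) {
                throw new IllegalArgumentException("Struct field is not unique: " + name);
            }
        }
    }

    public static void main(String[] args) {
        checkFieldUniqueness(Arrays.asList("number", "street")); // passes silently
        try {
            checkFieldUniqueness(Arrays.asList("number", "street", "number"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "Struct field is not unique: number"
        }
    }
}
```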



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476568)
Remaining Estimate: 0h
Time Spent: 10m

> Hive allows to create a struct with duplicate attribute names
> -
>
> Key: HIVE-22622
> URL: https://issues.apache.org/jira/browse/HIVE-22622
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Krisztian Kasa
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When you create a table with a struct that has the same attribute name twice, 
> Hive allows you to create it:
> create table test_struct( duplicateColumn struct<id:int,id:int>);
> You can insert data into it:
> insert into test_struct select named_struct("id",1,"id",1);
> But you cannot read it:
> select * from test_struct;
> It returns: java.io.IOException: java.io.IOException: Error reading file: 
> hdfs://.../test_struct/delta_001_001_/bucket_0
> We can create and insert, but reading fails on the struct part of the table. We 
> can still read all other columns (if we have more than one), but not the 
> struct anymore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24081?focusedWorklogId=476549&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476549
 ]

ASF GitHub Bot logged work on HIVE-24081:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 11:52
Start Date: 31/Aug/20 11:52
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1437:
URL: https://github.com/apache/hive/pull/1437#discussion_r480078957



##
File path: ql/src/test/queries/clientpositive/cte_mat_6.q
##
@@ -0,0 +1,81 @@
+set hive.optimize.cte.materialize.threshold=1;
+
+create table t0(col0 int);
+
+insert into t0(col0) values
+(1),(2),
+(100),(100),(100),
+(200),(200);
+
+-- CTE is referenced from scalar subquery in the select clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+-- disable cte materialization
+set hive.optimize.cte.materialize.threshold=-1;
+
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+
+-- enable cte materialization
+set hive.optimize.cte.materialize.threshold=1;
+
+-- CTE is referenced from scalar subquery in the where clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0
+from t0
+where t0.col0 > (select small_count from cte)
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0
+from t0
+where t0.col0 > (select small_count from cte)
+order by t0.col0;
+
+-- CTE is referenced from scalar subquery in the having clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, count(*)
+from t0
+group by col0
+having count(*) > (select small_count from cte)
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, count(*)
+from t0
+group by col0
+having count(*) > (select small_count from cte)
+order by t0.col0;
+
+-- mix full aggregate and non-full aggregate ctes
+explain
+with cte1 as (select col0 as k1 from t0 where col0 = '5'),
+ cte2 as (select count(*) as all_count from t0),
+ cte3 as (select col0 as k3, col0 + col0 as k3_2x, count(*) as key_count 
from t0 group by col0)
+select t0.col0, count(*)
+from t0
+join cte1 on t0.col0 = cte1.k1
+join cte3 on t0.col0 = cte3.k3
+group by col0
+having count(*) > (select all_count from cte2)

Review comment:
   @jcamachor 
   I just found that CTE materialization was enabled for CTAS in the past or at 
least it was tested. Please see
   
https://github.com/apache/hive/blob/master/ql/src/test/queries/clientpositive/cte_4.q
   Should we disable it anyway?
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476549)
Time Spent: 1h 20m  (was: 1h 10m)

> Enable pre-materializing CTEs referenced in scalar subqueries
> -
>
> Key: HIVE-24081
> URL: https://issues.apache.org/jira/browse/HIVE-24081
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> HIVE-11752 introduces materializing CTE based on config
> {code}
> hive.optimize.cte.materialize.threshold
> {code}
> Goal of this jira is
> * extending the implementation to support materializing CTE's referenced in 
> scalar subqueries
> * add a config to materialize CTEs with aggregate output only



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24064) Disable Materialized View Replication

2020-08-31 Thread Aasha Medhi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17187665#comment-17187665
 ] 

Aasha Medhi commented on HIVE-24064:


+1

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24064.01.patch, HIVE-24064.02.patch, 
> HIVE-24064.03.patch, HIVE-24064.04.patch, HIVE-24064.05.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24064) Disable Materialized View Replication

2020-08-31 Thread Arko Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arko Sharma updated HIVE-24064:
---
Attachment: HIVE-24064.05.patch

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24064.01.patch, HIVE-24064.02.patch, 
> HIVE-24064.03.patch, HIVE-24064.04.patch, HIVE-24064.05.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24081?focusedWorklogId=476457&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476457
 ]

ASF GitHub Bot logged work on HIVE-24081:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 09:02
Start Date: 31/Aug/20 09:02
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1437:
URL: https://github.com/apache/hive/pull/1437#discussion_r479970570



##
File path: ql/src/test/queries/clientpositive/cte_mat_6.q
##
@@ -0,0 +1,81 @@
+set hive.optimize.cte.materialize.threshold=1;
+
+create table t0(col0 int);
+
+insert into t0(col0) values
+(1),(2),
+(100),(100),(100),
+(200),(200);
+
+-- CTE is referenced from scalar subquery in the select clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+-- disable cte materialization
+set hive.optimize.cte.materialize.threshold=-1;
+
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+
+-- enable cte materialization
+set hive.optimize.cte.materialize.threshold=1;
+
+-- CTE is referenced from scalar subquery in the where clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0
+from t0
+where t0.col0 > (select small_count from cte)
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0
+from t0
+where t0.col0 > (select small_count from cte)
+order by t0.col0;
+
+-- CTE is referenced from scalar subquery in the having clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, count(*)
+from t0
+group by col0
+having count(*) > (select small_count from cte)
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, count(*)
+from t0
+group by col0
+having count(*) > (select small_count from cte)
+order by t0.col0;
+
+-- mix full aggregate and non-full aggregate ctes
+explain
+with cte1 as (select col0 as k1 from t0 where col0 = '5'),
+ cte2 as (select count(*) as all_count from t0),
+ cte3 as (select col0 as k3, col0 + col0 as k3_2x, count(*) as key_count from t0 group by col0)
+select t0.col0, count(*)
+from t0
+join cte1 on t0.col0 = cte1.k1
+join cte3 on t0.col0 = cte3.k3
+group by col0
+having count(*) > (select all_count from cte2)

Review comment:
   I added a check for CTAS.
   
   For `create materialized view`, CTE materialization is disabled here in 
   `SemanticAnalyzer.genResolvedParseTree`:
   ```
   // 5. Resolve Parse Tree
   // Materialization is allowed if it is not a view definition
   getMetaData(qb, createVwDesc == null && !forViewCreation);
   ```
   
https://github.com/apache/hive/blob/54aff33d8e1d659d295e1f53b88aad91ba8cc23e/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L12355





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476457)
Time Spent: 1h 10m  (was: 1h)

> Enable pre-materializing CTEs referenced in scalar subqueries
> -
>
> Key: HIVE-24081
> URL: https://issues.apache.org/jira/browse/HIVE-24081
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> HIVE-11752 introduces materializing CTE based on config
> {code}
> hive.optimize.cte.materialize.threshold
> {code}
> Goal of this jira is
> * extending the implementation to support materializing CTE's referenced in 
> scalar subqueries
> * add a config to materialize CTEs with aggregate output only





[jira] [Work logged] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24081?focusedWorklogId=476456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476456
 ]

ASF GitHub Bot logged work on HIVE-24081:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 09:02
Start Date: 31/Aug/20 09:02
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1437:
URL: https://github.com/apache/hive/pull/1437#discussion_r479994280



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java
##
@@ -677,6 +685,33 @@ public void setNoScanAnalyzeCommand(boolean 
isNoScanAnalyzeCommand) {
   public boolean hasInsertTables() {
 return this.insertIntoTables.size() > 0 || 
this.insertOverwriteTables.size() > 0;
   }
+
+  public boolean isFullyAggregate() throws SemanticException {

Review comment:
   Added javadocs. I think it can stay non-static until it is used from 
another location. It uses the `destToSelExpr` field located in `QBParseInfo`.
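As a rough, hypothetical illustration of the kind of check discussed here (the aggregate-function list and method names below are assumptions for the sketch, not Hive's `QBParseInfo` code, which walks the `destToSelExpr` ASTs): a query block is fully aggregate when every select expression of every destination is an aggregate call.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Hive's implementation: a query block is "fully
// aggregate" when every select expression of every destination resolves to
// an aggregate function call.
public class FullAggregateCheck {
  private static final Set<String> AGG_FUNCTIONS = Set.of("count", "sum", "min", "max", "avg");

  // True when the expression is a call to a known aggregate function.
  static boolean isAggregateCall(String expr) {
    int paren = expr.indexOf('(');
    return paren > 0 && AGG_FUNCTIONS.contains(expr.substring(0, paren).trim().toLowerCase());
  }

  // True when every select expression for every destination is an aggregate.
  static boolean isFullyAggregate(Map<String, List<String>> destToSelExpr) {
    return destToSelExpr.values().stream()
        .flatMap(List::stream)
        .allMatch(FullAggregateCheck::isAggregateCall);
  }

  public static void main(String[] args) {
    System.out.println(isFullyAggregate(Map.of("insclause-0", List.of("count(*)"))));         // true
    System.out.println(isFullyAggregate(Map.of("insclause-0", List.of("col0", "count(*)")))); // false
  }
}
```

A real implementation would inspect the select-expression ASTs rather than strings; the string matching above only conveys the shape of the check.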







Issue Time Tracking
---

Worklog Id: (was: 476456)
Time Spent: 1h  (was: 50m)

> Enable pre-materializing CTEs referenced in scalar subqueries
> -
>
> Key: HIVE-24081
> URL: https://issues.apache.org/jira/browse/HIVE-24081
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> HIVE-11752 introduces materializing CTE based on config
> {code}
> hive.optimize.cte.materialize.threshold
> {code}
> Goal of this jira is
> * extending the implementation to support materializing CTE's referenced in 
> scalar subqueries
> * add a config to materialize CTEs with aggregate output only





[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=476453&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476453
 ]

ASF GitHub Bot logged work on HIVE-23851:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 09:00
Start Date: 31/Aug/20 09:00
Worklog Time Spent: 10m 
  Work Description: shameersss1 commented on pull request #1271:
URL: https://github.com/apache/hive/pull/1271#issuecomment-683657845


   @kgyrtkirk Ping for review request!





Issue Time Tracking
---

Worklog Id: (was: 476453)
Time Spent: 2.5h  (was: 2h 20m)

> MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
> 
>
> Key: HIVE-23851
> URL: https://issues.apache.org/jira/browse/HIVE-23851
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> *Steps to reproduce:*
> # Create external table
> # Run msck command to sync all the partitions with metastore
> # Remove one of the partition path
> # Run msck repair with partition filtering
> *Stack Trace:*
> {code:java}
>  2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] 
> ppr.PartitionExpressionForMetastore: Failed to deserialize the expression
>  java.lang.IndexOutOfBoundsException: Index: 110, Size: 0
>  at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192]
>  at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192]
>  at 
> org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT]
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192]
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_192]
> {code}
> *Cause:*
> In the case of msck repair with partition filtering, we expect the expression 
> proxy class to be set to PartitionExpressionForMetastore ( 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78
>  ), whereas while dropping partitions we serialize the drop partition filter 
> expression as ( 
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589
>  ), which is incompatible with the deserialization happening in 
> PartitionExpressionForMetastore ( 
> 

[jira] [Work logged] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24081?focusedWorklogId=476452&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476452
 ]

ASF GitHub Bot logged work on HIVE-24081:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 08:59
Start Date: 31/Aug/20 08:59
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1437:
URL: https://github.com/apache/hive/pull/1437#discussion_r479992802



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java
##
@@ -457,4 +469,17 @@ public boolean hasTableDefined() {
 return !(aliases.size() == 1 && 
aliases.get(0).equals(SemanticAnalyzer.DUMMY_TABLE));
   }
 
+  public void addSubqExprAlias(ASTNode expressionTree, SemanticAnalyzer 
semanticAnalyzer) throws SemanticException {
+String alias = "__subexpr" + subQueryExpressionAliasCounter++;
+
+// Recursively do the first phase of semantic analysis for the subquery
+QBExpr qbexpr = new QBExpr(alias);
+
+ASTNode subqref = (ASTNode) expressionTree.getChild(1);
+semanticAnalyzer.doPhase1QBExpr(subqref, qbexpr, getId(), alias, 
isInsideView());

Review comment:
   This step parses the subquery only. It is necessary to collect 
references to the CTEs in subquery expressions.







Issue Time Tracking
---

Worklog Id: (was: 476452)
Time Spent: 50m  (was: 40m)

> Enable pre-materializing CTEs referenced in scalar subqueries
> -
>
> Key: HIVE-24081
> URL: https://issues.apache.org/jira/browse/HIVE-24081
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HIVE-11752 introduces materializing CTE based on config
> {code}
> hive.optimize.cte.materialize.threshold
> {code}
> Goal of this jira is
> * extending the implementation to support materializing CTE's referenced in 
> scalar subqueries
> * add a config to materialize CTEs with aggregate output only





[jira] [Work logged] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24081?focusedWorklogId=476451&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476451
 ]

ASF GitHub Bot logged work on HIVE-24081:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 08:58
Start Date: 31/Aug/20 08:58
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1437:
URL: https://github.com/apache/hive/pull/1437#discussion_r479992094



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -2591,6 +2591,10 @@ private static void 
populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
 HIVE_CTE_MATERIALIZE_THRESHOLD("hive.optimize.cte.materialize.threshold", 
-1,

Review comment:
   done







Issue Time Tracking
---

Worklog Id: (was: 476451)
Time Spent: 40m  (was: 0.5h)

> Enable pre-materializing CTEs referenced in scalar subqueries
> -
>
> Key: HIVE-24081
> URL: https://issues.apache.org/jira/browse/HIVE-24081
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> HIVE-11752 introduces materializing CTE based on config
> {code}
> hive.optimize.cte.materialize.threshold
> {code}
> Goal of this jira is
> * extending the implementation to support materializing CTE's referenced in 
> scalar subqueries
> * add a config to materialize CTEs with aggregate output only





[jira] [Work logged] (HIVE-24059) Llap external client - Initial changes for running in cloud environment

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24059?focusedWorklogId=476435&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476435
 ]

ASF GitHub Bot logged work on HIVE-24059:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 08:40
Start Date: 31/Aug/20 08:40
Worklog Time Spent: 10m 
  Work Description: ShubhamChaurasia commented on a change in pull request 
#1418:
URL: https://github.com/apache/hive/pull/1418#discussion_r479982596



##
File path: 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/ContainerRunnerImpl.java
##
@@ -342,6 +352,33 @@ public SubmitWorkResponseProto 
submitWork(SubmitWorkRequestProto request) throws
 .build();
   }
 
+  // if request is coming from llap external client, verify the JWT
+  // as of now, JWT contains applicationId

Review comment:
   added a comment in code which explains this - 
   
   In GenericUDTFGetSplits
   
   // 6. Generate JWT for external clients if it's a cloud deployment
   // we inject extClientAppId in JWT which is same as what fragment 
contains.
   // extClientAppId in JWT and in fragment are compared on LLAP when a 
fragment is submitted.
   // see method ContainerRunnerImpl#verifyJwtForExternalClient
   
   
   In ContainerRunnerImpl#verifyJwtForExternalClient
   
   
   // extClientAppId is injected in JWT and fragment request by initial 
get_splits() call.
 // so both of these - extClientAppIdFromJwt and 
extClientAppIdFromSplit should be equal eventually if the signed JWT is valid 
for this request.
 // In get_splits, this extClientAppId is obtained via 
LlapCoordinator#createExtClientAppId which generates a
 // application Id to be used by external clients.
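The flow described in these comments can be sketched with JDK crypto only (a simplified stand-in for the JWT signing used here; the token format and names below are illustrative, not Hive's code): get_splits signs the external client's application id with the shared secret, and the daemon verifies the signature and compares the embedded id with the one carried by the submitted fragment.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Minimal stdlib sketch (no JWT library) of the check described above:
// a token carrying the external client's application id is signed with a
// shared secret; the daemon recomputes the signature and compares the id
// inside the token with the id in the submitted fragment.
public class AppIdTokenCheck {
  static String sign(String payload, byte[] secret) throws Exception {
    Mac mac = Mac.getInstance("HmacSHA256");
    mac.init(new SecretKeySpec(secret, "HmacSHA256"));
    return Base64.getUrlEncoder().withoutPadding()
        .encodeToString(mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));
  }

  // True only if the signature verifies AND the app id matches the fragment's.
  static boolean verify(String token, String appIdFromFragment, byte[] secret) throws Exception {
    String[] parts = token.split("\\.");
    if (parts.length != 2 || !sign(parts[0], secret).equals(parts[1])) {
      return false; // tampered token or wrong shared secret
    }
    return parts[0].equals(appIdFromFragment);
  }

  public static void main(String[] args) throws Exception {
    byte[] secret = "shared-secret".getBytes(StandardCharsets.UTF_8);
    String token = "application_123." + sign("application_123", secret);
    System.out.println(verify(token, "application_123", secret)); // true
    System.out.println(verify(token, "application_999", secret)); // false
  }
}
```

The real implementation delegates signing and parsing to a JWT library; this sketch only shows why a valid signature plus a matching application id are both required before a fragment is accepted.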







Issue Time Tracking
---

Worklog Id: (was: 476435)
Time Spent: 1h 10m  (was: 1h)

> Llap external client - Initial changes for running in cloud environment
> ---
>
> Key: HIVE-24059
> URL: https://issues.apache.org/jira/browse/HIVE-24059
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Please see problem description in 
> https://issues.apache.org/jira/browse/HIVE-24058
> Initial changes include - 
> 1. Moving LLAP discovery logic from client side to server (HS2 / get_splits) 
> side.
> 2. Opening additional RPC port in LLAP Daemon.
> 3. JWT Based authentication on this port.
> cc [~prasanth_j] [~jdere] [~anishek] [~thejas]





[jira] [Work logged] (HIVE-24059) Llap external client - Initial changes for running in cloud environment

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24059?focusedWorklogId=476436&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476436
 ]

ASF GitHub Bot logged work on HIVE-24059:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 08:40
Start Date: 31/Aug/20 08:40
Worklog Time Spent: 10m 
  Work Description: ShubhamChaurasia commented on a change in pull request 
#1418:
URL: https://github.com/apache/hive/pull/1418#discussion_r479982693



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java
##
@@ -559,12 +566,22 @@ private SplitResult getSplits(JobConf job, TezWork work, 
Schema schema, Applicat
 // 4. Make location hints.
 SplitLocationInfo[] locations = makeLocationHints(hints.get(i));
 
+// 5. populate info about llap daemons(to help client submit request 
and read data)
+LlapDaemonInfo[] llapDaemonInfos = populateLlapDaemonInfos(job, 
locations);
+
+// 6. Generate JWT for external clients if it's a cloud deployment
+String jwt = "";
+if (LlapUtil.isCloudDeployment()) {
+  JwtHelper jwtHelper = new JwtHelper(SessionState.getSessionConf());
+  jwt = jwtHelper.buildJwtForLlap(applicationId);

Review comment:
   done

##
File path: 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/ContainerRunnerImpl.java
##
@@ -342,6 +352,33 @@ public SubmitWorkResponseProto 
submitWork(SubmitWorkRequestProto request) throws
 .build();
   }
 
+  // if request is coming from llap external client, verify the JWT
+  // as of now, JWT contains applicationId
+  private void verifyJwtForExternalClient(SubmitWorkRequestProto request, 
String applicationIdString,
+  String fragmentIdString) {
+LOG.info("Checking if request[{}] is from llap external client in a cloud 
based deployment", applicationIdString);
+if (request.getIsExternalClientRequest() && LlapUtil.isCloudDeployment()) {
+  LOG.info("Llap external client request - {}, verifying JWT", 
applicationIdString);
+  Preconditions.checkState(request.hasJwt(), "JWT not found in request, 
fragmentId: " + fragmentIdString);
+
+  JwtHelper jwtHelper = new JwtHelper(getConfig());
+  Jws<Claims> claimsJws;
+  try {
+claimsJws = jwtHelper.parseClaims(request.getJwt());
+  } catch (JwtException e) {
+LOG.error("Cannot verify JWT provided with the request, fragmentId: 
{}, {}", fragmentIdString, e);
+throw e;
+  }
+
+  String appIdInJwt = (String) 
claimsJws.getBody().get(JwtHelper.LLAP_EXT_CLIENT_APP_ID);
+  // this should never happen ideally.
+  Preconditions.checkState(appIdInJwt.equals(applicationIdString),

Review comment:
   done







Issue Time Tracking
---

Worklog Id: (was: 476436)
Time Spent: 1h 20m  (was: 1h 10m)

> Llap external client - Initial changes for running in cloud environment
> ---
>
> Key: HIVE-24059
> URL: https://issues.apache.org/jira/browse/HIVE-24059
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Please see problem description in 
> https://issues.apache.org/jira/browse/HIVE-24058
> Initial changes include - 
> 1. Moving LLAP discovery logic from client side to server (HS2 / get_splits) 
> side.
> 2. Opening additional RPC port in LLAP Daemon.
> 3. JWT Based authentication on this port.
> cc [~prasanth_j] [~jdere] [~anishek] [~thejas]





[jira] [Work logged] (HIVE-24059) Llap external client - Initial changes for running in cloud environment

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24059?focusedWorklogId=476430&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476430
 ]

ASF GitHub Bot logged work on HIVE-24059:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 08:33
Start Date: 31/Aug/20 08:33
Worklog Time Spent: 10m 
  Work Description: ShubhamChaurasia commented on a change in pull request 
#1418:
URL: https://github.com/apache/hive/pull/1418#discussion_r479979149



##
File path: 
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java
##
@@ -474,6 +478,10 @@ public void serviceStart() throws Exception {
   getConfig().setInt(ConfVars.LLAP_DAEMON_WEB_PORT.varname, 
webServices.getPort());
 }
 getConfig().setInt(ConfVars.LLAP_DAEMON_OUTPUT_SERVICE_PORT.varname, 
LlapOutputFormatService.get().getPort());
+if (LlapUtil.isCloudDeployment()) {
+  getConfig().setInt(ConfVars.LLAP_EXTERNAL_CLIENT_CLOUD_RPC_PORT.varname,

Review comment:
   done







Issue Time Tracking
---

Worklog Id: (was: 476430)
Time Spent: 1h  (was: 50m)

> Llap external client - Initial changes for running in cloud environment
> ---
>
> Key: HIVE-24059
> URL: https://issues.apache.org/jira/browse/HIVE-24059
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Please see problem description in 
> https://issues.apache.org/jira/browse/HIVE-24058
> Initial changes include - 
> 1. Moving LLAP discovery logic from client side to server (HS2 / get_splits) 
> side.
> 2. Opening additional RPC port in LLAP Daemon.
> 3. JWT Based authentication on this port.
> cc [~prasanth_j] [~jdere] [~anishek] [~thejas]





[jira] [Work logged] (HIVE-24059) Llap external client - Initial changes for running in cloud environment

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24059?focusedWorklogId=476428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476428
 ]

ASF GitHub Bot logged work on HIVE-24059:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 08:32
Start Date: 31/Aug/20 08:32
Worklog Time Spent: 10m 
  Work Description: ShubhamChaurasia commented on a change in pull request 
#1418:
URL: https://github.com/apache/hive/pull/1418#discussion_r479978466



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -4880,6 +4880,22 @@ private static void 
populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
 
LLAP_EXTERNAL_CLIENT_USE_HYBRID_CALENDAR("hive.llap.external.client.use.hybrid.calendar",
 false,
 "Whether to use hybrid calendar for parsing of data/timestamps."),
+
+// confs for llap-external-client cloud deployment
+
LLAP_EXTERNAL_CLIENT_CLOUD_RPC_PORT("hive.llap.external.client.cloud.rpc.port", 
30004,
+"The LLAP daemon RPC port for external clients when llap is running in 
cloud environment."),
+
LLAP_EXTERNAL_CLIENT_CLOUD_OUTPUT_SERVICE_PORT("hive.llap.external.client.cloud.output.service.port",
 30005,
+"LLAP output service port when llap is running in cloud 
environment"),
+LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_PROVIDER(
+"hive.llap.external.client.cloud.jwt.shared.secret.provider",
+
"org.apache.hadoop.hive.llap.security.ConfBasedJwtSharedSecretProvider",

Review comment:
   It's now changed to DefaultJwtSharedSecretProvider, which reads from 
conf.getPassword and, if that is not found, falls back to an ENV variable.
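A minimal sketch of that lookup order (names are illustrative, not Hive's exact API): prefer the credential resolved from configuration, fall back to the environment, and fail fast when neither is set.

```java
// Hedged sketch of the secret lookup order described above: a configured
// password (e.g. what conf.getPassword would return) wins; otherwise an
// environment variable is tried; otherwise initialization fails loudly.
public class JwtSecretLookup {
  static String resolveSecret(String fromConf, String envVarName) {
    if (fromConf != null && !fromConf.isEmpty()) {
      return fromConf; // configured credential wins
    }
    String fromEnv = System.getenv(envVarName); // fallback to the environment
    if (fromEnv == null || fromEnv.isEmpty()) {
      throw new IllegalStateException("JWT shared secret not configured");
    }
    return fromEnv;
  }

  public static void main(String[] args) {
    System.out.println(resolveSecret("from-conf", "LLAP_JWT_SECRET")); // prints "from-conf"
  }
}
```

Failing fast when no secret is available avoids silently signing tokens with an empty key.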

##
File path: 
llap-client/src/java/org/apache/hadoop/hive/llap/registry/LlapServiceInstance.java
##
@@ -47,6 +49,24 @@
*/
   public int getOutputFormatPort();
 
+  /**
+   * External host, usually needed in cloud envs where we cannot access 
internal host from outside
+   *
+   * @return
+   */
+  String getExternalHost();

Review comment:
   done

##
File path: 
llap-common/src/java/org/apache/hadoop/hive/llap/security/ConfBasedJwtSharedSecretProvider.java
##
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.llap.security;
+
+import com.google.common.base.Preconditions;
+import io.jsonwebtoken.security.Keys;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.conf.HiveConf;
+
+import java.security.Key;
+
+/**
+ * Default implementation of {@link JwtSecretProvider}.
+ * It uses the same encryption and decryption secret which can be used to sign 
and verify JWT.
+ */
+public class ConfBasedJwtSharedSecretProvider implements JwtSecretProvider {
+
+  private Key jwtEncryptionKey;
+
+  @Override public Key getEncryptionSecret() {
+return jwtEncryptionKey;
+  }
+
+  @Override public Key getDecryptionSecret() {
+return jwtEncryptionKey;
+  }
+
+  @Override public void init(final Configuration conf) {
+final String sharedSecret = HiveConf.getVar(conf, 
HiveConf.ConfVars.LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET);

Review comment:
   done







Issue Time Tracking
---

Worklog Id: (was: 476428)
Time Spent: 40m  (was: 0.5h)

> Llap external client - Initial changes for running in cloud environment
> ---
>
> Key: HIVE-24059
> URL: https://issues.apache.org/jira/browse/HIVE-24059
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining 

[jira] [Work logged] (HIVE-24059) Llap external client - Initial changes for running in cloud environment

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24059?focusedWorklogId=476429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476429
 ]

ASF GitHub Bot logged work on HIVE-24059:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 08:32
Start Date: 31/Aug/20 08:32
Worklog Time Spent: 10m 
  Work Description: ShubhamChaurasia commented on a change in pull request 
#1418:
URL: https://github.com/apache/hive/pull/1418#discussion_r479978966



##
File path: 
llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java
##
@@ -157,12 +153,12 @@ public LlapBaseInputFormat() {
 HiveConf.setVar(job, HiveConf.ConfVars.LLAP_ZK_REGISTRY_USER, 
llapSplit.getLlapUser());
 SubmitWorkInfo submitWorkInfo = 
SubmitWorkInfo.fromBytes(llapSplit.getPlanBytes());
 
-LlapServiceInstance serviceInstance = getServiceInstance(job, llapSplit);
-String host = serviceInstance.getHost();
-int llapSubmitPort = serviceInstance.getRpcPort();
+final LlapDaemonInfo llapDaemonInfo = llapSplit.getLlapDaemonInfos()[0];

Review comment:
   No, it does not as of now. The same has been validated in get_splits; added 
a comment about it in the new patch.







Issue Time Tracking
---

Worklog Id: (was: 476429)
Time Spent: 50m  (was: 40m)

> Llap external client - Initial changes for running in cloud environment
> ---
>
> Key: HIVE-24059
> URL: https://issues.apache.org/jira/browse/HIVE-24059
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Please see problem description in 
> https://issues.apache.org/jira/browse/HIVE-24058
> Initial changes include - 
> 1. Moving LLAP discovery logic from client side to server (HS2 / get_splits) 
> side.
> 2. Opening additional RPC port in LLAP Daemon.
> 3. JWT Based authentication on this port.
> cc [~prasanth_j] [~jdere] [~anishek] [~thejas]





[jira] [Work started] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries

2020-08-31 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24081 started by Krisztian Kasa.
-
> Enable pre-materializing CTEs referenced in scalar subqueries
> -
>
> Key: HIVE-24081
> URL: https://issues.apache.org/jira/browse/HIVE-24081
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-11752 introduces materializing CTE based on config
> {code}
> hive.optimize.cte.materialize.threshold
> {code}
> Goal of this jira is
> * extending the implementation to support materializing CTE's referenced in 
> scalar subqueries
> * add a config to materialize CTEs with aggregate output only





[jira] [Work logged] (HIVE-24059) Llap external client - Initial changes for running in cloud environment

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24059?focusedWorklogId=476426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476426
 ]

ASF GitHub Bot logged work on HIVE-24059:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 08:29
Start Date: 31/Aug/20 08:29
Worklog Time Spent: 10m 
  Work Description: ShubhamChaurasia commented on a change in pull request 
#1418:
URL: https://github.com/apache/hive/pull/1418#discussion_r479977303



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -4880,6 +4880,22 @@ private static void 
populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal
 
LLAP_EXTERNAL_CLIENT_USE_HYBRID_CALENDAR("hive.llap.external.client.use.hybrid.calendar",
 false,
 "Whether to use hybrid calendar for parsing of data/timestamps."),
+
+// confs for llap-external-client cloud deployment
+
LLAP_EXTERNAL_CLIENT_CLOUD_RPC_PORT("hive.llap.external.client.cloud.rpc.port", 
30004,
+"The LLAP daemon RPC port for external clients when llap is running in 
cloud environment."),
+
LLAP_EXTERNAL_CLIENT_CLOUD_OUTPUT_SERVICE_PORT("hive.llap.external.client.cloud.output.service.port",
 30005,
+"LLAP output service port when llap is running in cloud 
environment"),
+LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET_PROVIDER(
+"hive.llap.external.client.cloud.jwt.shared.secret.provider",
+
"org.apache.hadoop.hive.llap.security.ConfBasedJwtSharedSecretProvider",
+"Shared secret provider to be used to sign JWT"),
+
LLAP_EXTERNAL_CLIENT_CLOUD_JWT_SHARED_SECRET("hive.llap.external.client.cloud.jwt.shared.secret",
+"Let me give you this secret and you will get the access!",

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 476426)
Time Spent: 0.5h  (was: 20m)

> Llap external client - Initial changes for running in cloud environment
> ---
>
> Key: HIVE-24059
> URL: https://issues.apache.org/jira/browse/HIVE-24059
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Please see problem description in 
> https://issues.apache.org/jira/browse/HIVE-24058
> Initial changes include:
> 1. Moving LLAP discovery logic from the client side to the server (HS2 / get_splits) side.
> 2. Opening an additional RPC port in the LLAP Daemon.
> 3. JWT-based authentication on this port.
> cc [~prasanth_j] [~jdere] [~anishek] [~thejas]





[jira] [Work logged] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24081?focusedWorklogId=476418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476418
 ]

ASF GitHub Bot logged work on HIVE-24081:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 08:16
Start Date: 31/Aug/20 08:16
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1437:
URL: https://github.com/apache/hive/pull/1437#discussion_r479970570



##
File path: ql/src/test/queries/clientpositive/cte_mat_6.q
##
@@ -0,0 +1,81 @@
+set hive.optimize.cte.materialize.threshold=1;
+
+create table t0(col0 int);
+
+insert into t0(col0) values
+(1),(2),
+(100),(100),(100),
+(200),(200);
+
+-- CTE is referenced from scalar subquery in the select clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+-- disable cte materialization
+set hive.optimize.cte.materialize.threshold=-1;
+
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+
+-- enable cte materialization
+set hive.optimize.cte.materialize.threshold=1;
+
+-- CTE is referenced from scalar subquery in the where clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0
+from t0
+where t0.col0 > (select small_count from cte)
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0
+from t0
+where t0.col0 > (select small_count from cte)
+order by t0.col0;
+
+-- CTE is referenced from scalar subquery in the having clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, count(*)
+from t0
+group by col0
+having count(*) > (select small_count from cte)
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, count(*)
+from t0
+group by col0
+having count(*) > (select small_count from cte)
+order by t0.col0;
+
+-- mix full aggregate and non-full aggregate ctes
+explain
+with cte1 as (select col0 as k1 from t0 where col0 = '5'),
+ cte2 as (select count(*) as all_count from t0),
+ cte3 as (select col0 as k3, col0 + col0 as k3_2x, count(*) as key_count 
from t0 group by col0)
+select t0.col0, count(*)
+from t0
+join cte1 on t0.col0 = cte1.k1
+join cte3 on t0.col0 = cte3.k3
+group by col0
+having count(*) > (select all_count from cte2)

Review comment:
   I added a check for CTAS and will update the PR soon.
   For `create materialized view`, CTE materialization is disabled in 
   `SemanticAnalyzer.genResolvedParseTree`:
   ```
   // 5. Resolve Parse Tree
   // Materialization is allowed if it is not a view definition
   getMetaData(qb, createVwDesc == null && !forViewCreation);
   ```
   
https://github.com/apache/hive/blob/54aff33d8e1d659d295e1f53b88aad91ba8cc23e/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L12355







Issue Time Tracking
---

Worklog Id: (was: 476418)
Time Spent: 0.5h  (was: 20m)

> Enable pre-materializing CTEs referenced in scalar subqueries
> -
>
> Key: HIVE-24081
> URL: https://issues.apache.org/jira/browse/HIVE-24081
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-11752 introduced materializing CTEs based on the config
> {code}
> hive.optimize.cte.materialize.threshold
> {code}
> The goals of this jira are to:
> * extend the implementation to support materializing CTEs referenced in 
> scalar subqueries
> * add a config to materialize only CTEs with aggregate output





[jira] [Work logged] (HIVE-24091) Replace multiple constraints call with getAllTableConstraints api call in query planner

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24091?focusedWorklogId=476391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476391
 ]

ASF GitHub Bot logged work on HIVE-24091:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 07:24
Start Date: 31/Aug/20 07:24
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1444:
URL: https://github.com/apache/hive/pull/1444#discussion_r479944470



##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -5682,163 +5684,65 @@ public void dropConstraint(String dbName, String 
tableName, String constraintNam
 }
   }
 
-  /**
-   * Get all primary key columns associated with the table.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName) throws 
HiveException {
-return getPrimaryKeys(dbName, tblName, false);
-  }
-
-  /**
-   * Get primary key columns associated with the table that are available for 
optimization.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getReliablePrimaryKeys(String dbName, String tblName) 
throws HiveException {
-return getPrimaryKeys(dbName, tblName, true);
-  }
-
-  private PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName, boolean 
onlyReliable)
-  throws HiveException {
-PerfLogger perfLogger = SessionState.getPerfLogger();
-perfLogger.perfLogBegin(CLASS_NAME, PerfLogger.HIVE_GET_PK);
-try {
-  List primaryKeys = getMSC().getPrimaryKeys(new 
PrimaryKeysRequest(dbName, tblName));
-  if (onlyReliable && primaryKeys != null && !primaryKeys.isEmpty()) {

Review comment:
   nvm. Got it.







Issue Time Tracking
---

Worklog Id: (was: 476391)
Time Spent: 0.5h  (was: 20m)

> Replace multiple constraints call with getAllTableConstraints api call in 
> query planner
> ---
>
> Key: HIVE-24091
> URL: https://issues.apache.org/jira/browse/HIVE-24091
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In order to get all the constraints of a table, i.e. PrimaryKey, ForeignKey, 
> UniqueConstraint, NotNullConstraint, DefaultConstraint, and CheckConstraint, we 
> have to make 6 different metastore calls. Replace these calls with a single 
> getAllTableConstraints API call, which provides all the constraints at once.
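The intended effect can be sketched with a toy client that counts round trips; `FakeMetastore`, the string constraint kinds, and the string-keyed result map below are illustrative stand-ins for the real thrift client and constraint classes, not Hive's API:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConstraintFetch {
  static final List<String> KINDS = Arrays.asList(
      "PrimaryKey", "ForeignKey", "UniqueConstraint",
      "NotNullConstraint", "DefaultConstraint", "CheckConstraint");

  /** Toy metastore client that counts how many RPCs it serves. */
  static class FakeMetastore {
    int rpcCount = 0;

    // old style: one call per constraint kind
    List<String> getConstraint(String kind, String db, String table) {
      rpcCount++;
      return Arrays.asList(kind + ":" + db + "." + table);
    }

    // new style: everything in one call
    Map<String, List<String>> getAllTableConstraints(String db, String table) {
      rpcCount++;
      Map<String, List<String>> all = new HashMap<>();
      for (String kind : KINDS) {
        all.put(kind, Arrays.asList(kind + ":" + db + "." + table));
      }
      return all;
    }
  }

  /** Six round trips, one per constraint kind. */
  static Map<String, List<String>> fetchOldWay(FakeMetastore msc, String db, String table) {
    Map<String, List<String>> all = new HashMap<>();
    for (String kind : KINDS) {
      all.put(kind, msc.getConstraint(kind, db, table));
    }
    return all;
  }

  /** One round trip returning every constraint kind at once. */
  static Map<String, List<String>> fetchNewWay(FakeMetastore msc, String db, String table) {
    return msc.getAllTableConstraints(db, table);
  }

  public static void main(String[] args) {
    FakeMetastore msc = new FakeMetastore();
    fetchOldWay(msc, "db", "t");
    int oldCalls = msc.rpcCount;
    msc.rpcCount = 0;
    fetchNewWay(msc, "db", "t");
    System.out.println("old=" + oldCalls + " rpcs, new=" + msc.rpcCount + " rpc");
  }
}
```

Both paths return the same information; the batched call simply amortizes the per-RPC latency, which is what makes it attractive in the query planner.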





[jira] [Work logged] (HIVE-24091) Replace multiple constraints call with getAllTableConstraints api call in query planner

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24091?focusedWorklogId=476390&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476390
 ]

ASF GitHub Bot logged work on HIVE-24091:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 07:23
Start Date: 31/Aug/20 07:23
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #1444:
URL: https://github.com/apache/hive/pull/1444#discussion_r479944166



##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -5682,163 +5684,65 @@ public void dropConstraint(String dbName, String 
tableName, String constraintNam
 }
   }
 
-  /**
-   * Get all primary key columns associated with the table.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName) throws 
HiveException {
-return getPrimaryKeys(dbName, tblName, false);
-  }
-
-  /**
-   * Get primary key columns associated with the table that are available for 
optimization.
-   *
-   * @param dbName Database Name
-   * @param tblName Table Name
-   * @return Primary Key associated with the table.
-   * @throws HiveException
-   */
-  public PrimaryKeyInfo getReliablePrimaryKeys(String dbName, String tblName) 
throws HiveException {
-return getPrimaryKeys(dbName, tblName, true);
-  }
-
-  private PrimaryKeyInfo getPrimaryKeys(String dbName, String tblName, boolean 
onlyReliable)
-  throws HiveException {
-PerfLogger perfLogger = SessionState.getPerfLogger();
-perfLogger.perfLogBegin(CLASS_NAME, PerfLogger.HIVE_GET_PK);
-try {
-  List primaryKeys = getMSC().getPrimaryKeys(new 
PrimaryKeysRequest(dbName, tblName));
-  if (onlyReliable && primaryKeys != null && !primaryKeys.isEmpty()) {

Review comment:
   Is this being taken care of in getAllTableConstraints?







Issue Time Tracking
---

Worklog Id: (was: 476390)
Time Spent: 20m  (was: 10m)

> Replace multiple constraints call with getAllTableConstraints api call in 
> query planner
> ---
>
> Key: HIVE-24091
> URL: https://issues.apache.org/jira/browse/HIVE-24091
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In order to get all the constraints of a table, i.e. PrimaryKey, ForeignKey, 
> UniqueConstraint, NotNullConstraint, DefaultConstraint, and CheckConstraint, we 
> have to make 6 different metastore calls. Replace these calls with a single 
> getAllTableConstraints API call, which provides all the constraints at once.





[jira] [Work logged] (HIVE-24064) Disable Materialized View Replication

2020-08-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24064?focusedWorklogId=476349&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476349
 ]

ASF GitHub Bot logged work on HIVE-24064:
-

Author: ASF GitHub Bot
Created on: 31/Aug/20 06:35
Start Date: 31/Aug/20 06:35
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #1422:
URL: https://github.com/apache/hive/pull/1422#discussion_r479922802



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
##
@@ -542,6 +549,23 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData 
dmd, Path cmRoot, Hive
   if (ev.getEventId() <= resumeFrom) {
 continue;
   }
+
+  //disable materialized-view replication if not configured
+  if(!isMaterializedViewsReplEnabled()){
+String tblName = ev.getTableName();
+if(tblName != null) {
+  try {
+HiveWrapper.Tuple tabletuple = new HiveWrapper(hiveDb, 
dbName).table(tblName, conf);

Review comment:
   you can use hiveDb directly instead of using HiveWrapper
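The skip logic under review reduces to a small predicate: when materialized-view replication is disabled, table events whose target is a materialized view are dropped from the dump. The types below are simplified stand-ins for Hive's event and table classes, not the actual `ReplDumpTask` code:

```java
public class MvReplFilter {
  enum TableType { MANAGED_TABLE, EXTERNAL_TABLE, MATERIALIZED_VIEW }

  /** Simplified replication event: a table name (null for non-table events) and its type. */
  static class Event {
    final String tableName;
    final TableType tableType;

    Event(String tableName, TableType tableType) {
      this.tableName = tableName;
      this.tableType = tableType;
    }
  }

  /** Returns true when the event should be skipped from the incremental dump. */
  static boolean skipEvent(Event ev, boolean materializedViewsReplEnabled) {
    if (materializedViewsReplEnabled || ev.tableName == null) {
      return false; // nothing to filter
    }
    return ev.tableType == TableType.MATERIALIZED_VIEW;
  }

  public static void main(String[] args) {
    Event mv = new Event("mv1", TableType.MATERIALIZED_VIEW);
    System.out.println("skip mv when disabled: " + skipEvent(mv, false));
  }
}
```

The reviewer's point applies here too: the table type can be looked up from the metastore client directly, without going through an extra wrapper layer.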







Issue Time Tracking
---

Worklog Id: (was: 476349)
Time Spent: 40m  (was: 0.5h)

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24064.01.patch, HIVE-24064.02.patch, 
> HIVE-24064.03.patch, HIVE-24064.04.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>



