[jira] [Updated] (HIVE-24064) Disable Materialized View Replication

2020-08-24 Thread Arko Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arko Sharma updated HIVE-24064:
---
Attachment: HIVE-24064.02.patch

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24064.01.patch, HIVE-24064.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-20817) Reading Timestamp datatype via HiveServer2 gives errors

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20817?focusedWorklogId=474106&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-474106
 ]

ASF GitHub Bot logged work on HIVE-20817:
-

Author: ASF GitHub Bot
Created on: 25/Aug/20 00:40
Start Date: 25/Aug/20 00:40
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1179:
URL: https://github.com/apache/hive/pull/1179#issuecomment-679436133


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 474106)
Time Spent: 40m  (was: 0.5h)

> Reading Timestamp datatype via HiveServer2 gives errors
> ---
>
> Key: HIVE-20817
> URL: https://issues.apache.org/jira/browse/HIVE-20817
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20817.01.patch, HIVE-20817.02.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> CREATE TABLE JdbcBasicRead ( empno int, desg string,empname string,doj 
> timestamp,Salary float,mgrid smallint, deptno tinyint ) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',';
> LOAD DATA LOCAL INPATH '/tmp/art_jdbc/hive/input/input_7columns.txt' 
> OVERWRITE INTO TABLE JdbcBasicRead;
> Sample Data.
> —
> 7369,M,SMITH,1980-12-17 17:07:29.234234,5000.00,7902,20
> 7499,X,ALLEN,1981-02-20 17:07:29.234234,1250.00,7698,30
> 7521,X,WARD,1981-02-22 17:07:29.234234,01600.57,7698,40
> 7566,M,JONES,1981-04-02 17:07:29.234234,02975.65,7839,10
> 7654,X,MARTIN,1981-09-28 17:07:29.234234,01250.00,7698,20
> 7698,M,BLAKE,1981-05-01 17:07:29.234234,2850.98,7839,30
> 7782,M,CLARK,1981-06-09 17:07:29.234234,02450.00,7839,20
> —
> Select statement: SELECT empno, desg, empname, doj, salary, mgrid, deptno 
> FROM JdbcBasicWrite
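The trace that follows reports a failed cast between two unrelated timestamp classes. As a minimal sketch of the general remedy (convert explicitly instead of casting), the example below uses a hypothetical stand-in class, `EngineTimestamp`, in place of `org.apache.hadoop.hive.common.type.Timestamp`; it is not the fix applied in the attached patches, and the helper name `toSqlTimestamp` is an assumption made for illustration.

```java
import java.sql.Timestamp;

final class TimestampConversionSketch {
    /** Hypothetical stand-in for an engine-side timestamp type unrelated to java.sql.Timestamp. */
    static final class EngineTimestamp {
        private final String text; // e.g. "1980-12-17 17:07:29.234234"

        EngineTimestamp(String text) {
            this.text = text;
        }

        @Override
        public String toString() {
            return text;
        }
    }

    /** Converts explicitly; a direct cast between unrelated classes fails at run time. */
    static Timestamp toSqlTimestamp(Object value) {
        if (value instanceof Timestamp) {
            return (Timestamp) value;
        }
        if (value instanceof EngineTimestamp) {
            // Timestamp.valueOf accepts the JDBC escape format yyyy-mm-dd hh:mm:ss[.f...]
            return Timestamp.valueOf(value.toString());
        }
        throw new IllegalArgumentException("Unsupported timestamp type: " + value.getClass());
    }
}
```

Going through the textual form keeps the fractional seconds from the sample data; a real conversion would more likely use the engine type's own epoch and nanosecond accessors.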
> {code}
> 2018-09-25T07:11:03,222 WARN [HiveServer2-Handler-Pool: Thread-83]: 
> thrift.ThriftCLIService (:()) - Error fetching results:
> org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException: 
> org.apache.hadoop.hive.common.type.Timestamp cannot be cast to 
> java.sql.Timestamp
> at 
> org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:469)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:910)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source) ~[?:?]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ~[hadoop-common-3.1.1.3.0.1.0-187.jar:?]
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at com.sun.proxy.$Proxy46.fetchResults(Unknown Source) ~[?:?]
> at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:564) 
> ~[hive-service-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:786)
>  

[jira] [Work logged] (HIVE-23546) Skip authorization when user is a superuser

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23546?focusedWorklogId=474096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-474096
 ]

ASF GitHub Bot logged work on HIVE-23546:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 23:57
Start Date: 24/Aug/20 23:57
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1033:
URL: https://github.com/apache/hive/pull/1033


   





Issue Time Tracking
---

Worklog Id: (was: 474096)
Time Spent: 1h 10m  (was: 1h)

> Skip authorization when user is a superuser
> ---
>
> Key: HIVE-23546
> URL: https://issues.apache.org/jira/browse/HIVE-23546
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23546.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> If the current user is a superuser, there is no need to perform authorization. 
> Skipping it can speed up queries, especially DDL queries. For example, a 
> superuser may add partitions when the external data is ready, or show partitions 
> to check whether it is OK to take the workflow one step further in a busy Hive 
> cluster.
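
The short-circuit described above could look roughly like the following sketch. The `Authorizer` interface, the `SuperuserAwareAuthorizer` wrapper, and the way superusers are configured are all hypothetical names chosen for illustration; they are not Hive's authorization API and not the code in the attached patch.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical authorizer interface; not Hive's actual API. */
interface Authorizer {
    void checkPrivileges(String user, String operation);
}

/** Wraps another authorizer and skips the check for configured superusers. */
final class SuperuserAwareAuthorizer implements Authorizer {
    private final Set<String> superusers;
    private final Authorizer delegate;
    private int delegatedChecks = 0;

    SuperuserAwareAuthorizer(Set<String> superusers, Authorizer delegate) {
        this.superusers = new HashSet<>(superusers);
        this.delegate = delegate;
    }

    @Override
    public void checkPrivileges(String user, String operation) {
        if (superusers.contains(user)) {
            return; // superuser: skip the (possibly expensive) authorization call
        }
        delegatedChecks++;
        delegate.checkPrivileges(user, operation);
    }

    /** Number of checks that were actually forwarded to the delegate. */
    int delegatedChecks() {
        return delegatedChecks;
    }
}
```

Keeping the bypass in a wrapper leaves the underlying authorizer unchanged, so non-superusers follow exactly the same path as before.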





[jira] [Work logged] (HIVE-23546) Skip authorization when user is a superuser

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23546?focusedWorklogId=474093&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-474093
 ]

ASF GitHub Bot logged work on HIVE-23546:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 23:51
Start Date: 24/Aug/20 23:51
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 closed pull request #1033:
URL: https://github.com/apache/hive/pull/1033


   





Issue Time Tracking
---

Worklog Id: (was: 474093)
Time Spent: 1h  (was: 50m)

> Skip authorization when user is a superuser
> ---
>
> Key: HIVE-23546
> URL: https://issues.apache.org/jira/browse/HIVE-23546
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23546.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> If the current user is a superuser, there is no need to perform authorization. 
> Skipping it can speed up queries, especially DDL queries. For example, a 
> superuser may add partitions when the external data is ready, or show partitions 
> to check whether it is OK to take the workflow one step further in a busy Hive 
> cluster.





[jira] [Updated] (HIVE-24069) HiveHistory should log the task that ends abnormally

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24069:
--
Labels: pull-request-available  (was: )

> HiveHistory should log the task that ends abnormally
> 
>
> Key: HIVE-24069
> URL: https://issues.apache.org/jira/browse/HIVE-24069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a task returns with an exitVal not equal to 0, the Executor skips 
> marking the task return code and calling endTask. This can leave the 
> history log incomplete for such tasks.
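
A minimal sketch of the intended behavior, assuming a simplified history log: the outcome is recorded before any failure handling, so a non-zero exit value still reaches the log. `HistoryLog` and `TaskRunnerSketch` are hypothetical stand-ins and do not reproduce the HiveHistory or Executor APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

/** Hypothetical stand-in for a history log; not Hive's HiveHistory API. */
final class HistoryLog {
    final List<String> events = new ArrayList<>();

    void setReturnCode(String taskId, int code) {
        events.add(taskId + " rc=" + code);
    }

    void endTask(String taskId) {
        events.add(taskId + " end");
    }
}

final class TaskRunnerSketch {
    /** Runs a task and records its outcome whether or not it succeeded. */
    static int runAndRecord(String taskId, IntSupplier task, HistoryLog history) {
        int exitVal = task.getAsInt();
        // Record first; failure handling (if any) happens after the log is complete.
        history.setReturnCode(taskId, exitVal);
        history.endTask(taskId);
        return exitVal;
    }
}
```

The caller can still branch on the returned exit value; the only change is that the bookkeeping no longer depends on it being zero.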





[jira] [Work logged] (HIVE-24069) HiveHistory should log the task that ends abnormally

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24069?focusedWorklogId=474092&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-474092
 ]

ASF GitHub Bot logged work on HIVE-24069:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 23:49
Start Date: 24/Aug/20 23:49
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1429:
URL: https://github.com/apache/hive/pull/1429


   
   
   ### What changes were proposed in this pull request?
   HiveHistory logs the task that ends abnormally.
   
   
   
   ### Why are the changes needed?
   When a task returns with an exitVal not equal to 0, the Executor skips 
marking the task return code and calling endTask. This can leave the 
history log incomplete for such tasks.
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   
   ### How was this patch tested?
   
   
   





Issue Time Tracking
---

Worklog Id: (was: 474092)
Remaining Estimate: 0h
Time Spent: 10m

> HiveHistory should log the task that ends abnormally
> 
>
> Key: HIVE-24069
> URL: https://issues.apache.org/jira/browse/HIVE-24069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a task returns with an exitVal not equal to 0, the Executor skips 
> marking the task return code and calling endTask. This can leave the 
> history log incomplete for such tasks.





[jira] [Updated] (HIVE-24068) Add re-execution plugin for handling DAG submission failures

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24068:
--
Labels: pull-request-available  (was: )

> Add re-execution plugin for handling DAG submission failures
> 
>
> Key: HIVE-24068
> URL: https://issues.apache.org/jira/browse/HIVE-24068
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> DAG submission failures can also happen in environments where the AM container 
> died, causing DNS issues. DAG submissions are safe to retry because the DAG has not 
> started execution yet. There are retries at the getSession and submitDAG levels 
> individually, but some submitDAG failures also have to retry getSession, because the AM 
> could be unreachable. This can be handled in a re-execution plugin.





[jira] [Work logged] (HIVE-24068) Add re-execution plugin for handling DAG submission failures

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24068?focusedWorklogId=474081&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-474081
 ]

ASF GitHub Bot logged work on HIVE-24068:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 23:23
Start Date: 24/Aug/20 23:23
Worklog Time Spent: 10m 
  Work Description: prasanthj opened a new pull request #1428:
URL: https://github.com/apache/hive/pull/1428


   ### What changes were proposed in this pull request?
   DAG submission failures can also happen in environments where the AM container 
died, causing DNS issues. DAG submissions are safe to retry because the DAG has not 
started execution yet. There are retries at the getSession and submitDAG levels 
individually, but some submitDAG failures also have to retry getSession, because the AM 
could be unreachable. This can be handled in a re-execution plugin. This PR adds 
a new re-execution plugin for intermittent DAG submission failures. 
   
   ### Why are the changes needed?
   To make Hive resilient to environments with network/DNS issues.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. Adds the re-exec plugin as default option.
   
   
   ### How was this patch tested?
   Manually. Tez code was changed to explicitly throw UnknownHostException to 
simulate a DNS/network issue, and the change was tested to make sure the retry happens.
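
The retry idea in this pull request can be sketched as follows: the whole getSession plus submitDAG sequence is repeated, so a failed submission also triggers a fresh session lookup. This is only an illustration under stated assumptions. `Session`, `submitWithRetry`, and the attempt limit are hypothetical names, and treating `UnknownHostException` as the retryable failure simply mirrors the manual test described above; none of this is the plugin code from the PR.

```java
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

final class SubmitRetrySketch {
    /** Hypothetical session handle; submission may fail if the AM is unreachable. */
    interface Session {
        String submitDag(String dag) throws Exception;
    }

    /** Retries session acquisition and submission together, up to maxAttempts times. */
    static String submitWithRetry(Callable<Session> getSession, String dag, int maxAttempts)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        UnknownHostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Session session = getSession.call(); // re-acquired on every attempt
                return session.submitDag(dag);
            } catch (UnknownHostException e) {
                last = e; // retryable: the DAG has not started executing yet
            }
        }
        throw last; // all attempts failed with a retryable error
    }
}
```

Other exception types propagate immediately, which keeps the retry limited to the failure mode that is known to be safe to repeat.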





Issue Time Tracking
---

Worklog Id: (was: 474081)
Remaining Estimate: 0h
Time Spent: 10m

> Add re-execution plugin for handling DAG submission failures
> 
>
> Key: HIVE-24068
> URL: https://issues.apache.org/jira/browse/HIVE-24068
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> DAG submission failures can also happen in environments where the AM container 
> died, causing DNS issues. DAG submissions are safe to retry because the DAG has not 
> started execution yet. There are retries at the getSession and submitDAG levels 
> individually, but some submitDAG failures also have to retry getSession, because the AM 
> could be unreachable. This can be handled in a re-execution plugin.





[jira] [Updated] (HIVE-24068) Add re-execution plugin for handling DAG submission failures

2020-08-24 Thread Prasanth Jayachandran (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-24068:
-
Description: DAG submission failures can also happen in environments where 
the AM container died, causing DNS issues. DAG submissions are safe to retry because the 
DAG has not started execution yet. There are retries at the getSession and submitDAG 
levels individually, but some submitDAG failures also have to retry getSession, because 
the AM could be unreachable. This can be handled in a re-execution plugin.  (was: 
ReExecutionOverlayPlugin handles cases where there is a vertex failure. DAG 
submission failures can also happen in environments where the AM container died, 
causing DNS issues. DAG submissions are safe to retry because the DAG has not started 
execution yet.)

> Add re-execution plugin for handling DAG submission failures
> 
>
> Key: HIVE-24068
> URL: https://issues.apache.org/jira/browse/HIVE-24068
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>
> DAG submission failures can also happen in environments where the AM container 
> died, causing DNS issues. DAG submissions are safe to retry because the DAG has not 
> started execution yet. There are retries at the getSession and submitDAG levels 
> individually, but some submitDAG failures also have to retry getSession, because the AM 
> could be unreachable. This can be handled in a re-execution plugin.





[jira] [Updated] (HIVE-24068) Add re-execution plugin for handling DAG submission failures

2020-08-24 Thread Prasanth Jayachandran (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-24068:
-
Summary: Add re-execution plugin for handling DAG submission failures  
(was: ReExecutionOverlayPlugin can handle DAG submission failures as well)

> Add re-execution plugin for handling DAG submission failures
> 
>
> Key: HIVE-24068
> URL: https://issues.apache.org/jira/browse/HIVE-24068
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>
> ReExecutionOverlayPlugin handles cases where there is a vertex failure. DAG 
> submission failures can also happen in environments where the AM container died, 
> causing DNS issues. DAG submissions are safe to retry because the DAG has not 
> started execution yet.





[jira] [Work started] (HIVE-23649) Fix FindBug issues in hive-service-rpc

2020-08-24 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-23649 started by Mustafa Iman.
---
> Fix FindBug issues in hive-service-rpc
> --
>
> Key: HIVE-23649
> URL: https://issues.apache.org/jira/browse/HIVE-23649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23649) Fix FindBug issues in hive-service-rpc

2020-08-24 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mustafa Iman updated HIVE-23649:

Status: Patch Available  (was: In Progress)

> Fix FindBug issues in hive-service-rpc
> --
>
> Key: HIVE-23649
> URL: https://issues.apache.org/jira/browse/HIVE-23649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23302) Create HiveJdbcDatabaseAccessor for JDBC storage handler

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23302:
--
Labels: pull-request-available  (was: )

> Create HiveJdbcDatabaseAccessor for JDBC storage handler
> 
>
> Key: HIVE-23302
> URL: https://issues.apache.org/jira/browse/HIVE-23302
> Project: Hive
>  Issue Type: Bug
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{JdbcDatabaseAccessor}} associated with the storage handler makes some 
> SQL calls to the RDBMS through the JDBC connection. There is a 
> {{GenericJdbcDatabaseAccessor}} with a generic implementation that the 
> storage handler uses if there is no specific implementation for a certain 
> RDBMS.
> Currently, Hive uses the {{GenericJdbcDatabaseAccessor}}. Afaik the only 
> generic query that will not work is splitting the query based on offset and 
> limit, since the syntax for that query is different than the one accepted by 
> Hive. We should create a {{HiveJdbcDatabaseAccessor}} to override that query 
> and possibly fix any other existing incompatibilities.
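
As a rough sketch of the override described above, the example below contrasts a generic LIMIT/OFFSET clause with Hive's `LIMIT offset, rows` form. The class names `GenericAccessorSketch` and `HiveAccessorSketch`, the method name `addLimitAndOffset`, and the exact text the generic accessor emits are all assumptions made for illustration; the real accessor classes may build the query differently.

```java
/** Hypothetical base class standing in for a generic accessor; not Hive's real code. */
class GenericAccessorSketch {
    /** Generic form (assumed here): LIMIT rows OFFSET offset, as many RDBMSs accept. */
    String addLimitAndOffset(String sql, int limit, int offset) {
        return sql + " LIMIT " + limit + " OFFSET " + offset;
    }
}

/** Overrides the split query to use Hive's LIMIT [offset,] rows form. */
class HiveAccessorSketch extends GenericAccessorSketch {
    @Override
    String addLimitAndOffset(String sql, int limit, int offset) {
        if (offset == 0) {
            return sql + " LIMIT " + limit;
        }
        return sql + " LIMIT " + offset + ", " + limit;
    }
}
```

For example, `new HiveAccessorSketch().addLimitAndOffset("SELECT * FROM t", 10, 20)` yields `SELECT * FROM t LIMIT 20, 10`, so only the split query changes while the rest of the generic behavior is inherited.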





[jira] [Work logged] (HIVE-23302) Create HiveJdbcDatabaseAccessor for JDBC storage handler

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23302?focusedWorklogId=474061&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-474061
 ]

ASF GitHub Bot logged work on HIVE-23302:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 22:04
Start Date: 24/Aug/20 22:04
Worklog Time Spent: 10m 
  Work Description: jcamachor opened a new pull request #1427:
URL: https://github.com/apache/hive/pull/1427


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 474061)
Remaining Estimate: 0h
Time Spent: 10m

> Create HiveJdbcDatabaseAccessor for JDBC storage handler
> 
>
> Key: HIVE-23302
> URL: https://issues.apache.org/jira/browse/HIVE-23302
> Project: Hive
>  Issue Type: Bug
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{JdbcDatabaseAccessor}} associated with the storage handler makes some 
> SQL calls to the RDBMS through the JDBC connection. There is a 
> {{GenericJdbcDatabaseAccessor}} with a generic implementation that the 
> storage handler uses if there is no specific implementation for a certain 
> RDBMS.
> Currently, Hive uses the {{GenericJdbcDatabaseAccessor}}. Afaik the only 
> generic query that will not work is splitting the query based on offset and 
> limit, since the syntax for that query is different than the one accepted by 
> Hive. We should create a {{HiveJdbcDatabaseAccessor}} to override that query 
> and possibly fix any other existing incompatibilities.





[jira] [Updated] (HIVE-23302) Create HiveJdbcDatabaseAccessor for JDBC storage handler

2020-08-24 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-23302:
---
Status: Patch Available  (was: In Progress)

> Create HiveJdbcDatabaseAccessor for JDBC storage handler
> 
>
> Key: HIVE-23302
> URL: https://issues.apache.org/jira/browse/HIVE-23302
> Project: Hive
>  Issue Type: Bug
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> The {{JdbcDatabaseAccessor}} associated with the storage handler makes some 
> SQL calls to the RDBMS through the JDBC connection. There is a 
> {{GenericJdbcDatabaseAccessor}} with a generic implementation that the 
> storage handler uses if there is no specific implementation for a certain 
> RDBMS.
> Currently, Hive uses the {{GenericJdbcDatabaseAccessor}}. Afaik the only 
> generic query that will not work is splitting the query based on offset and 
> limit, since the syntax for that query is different than the one accepted by 
> Hive. We should create a {{HiveJdbcDatabaseAccessor}} to override that query 
> and possibly fix any other existing incompatibilities.





[jira] [Work started] (HIVE-23302) Create HiveJdbcDatabaseAccessor for JDBC storage handler

2020-08-24 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-23302 started by Jesus Camacho Rodriguez.
--
> Create HiveJdbcDatabaseAccessor for JDBC storage handler
> 
>
> Key: HIVE-23302
> URL: https://issues.apache.org/jira/browse/HIVE-23302
> Project: Hive
>  Issue Type: Bug
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> The {{JdbcDatabaseAccessor}} associated with the storage handler makes some 
> SQL calls to the RDBMS through the JDBC connection. There is a 
> {{GenericJdbcDatabaseAccessor}} with a generic implementation that the 
> storage handler uses if there is no specific implementation for a certain 
> RDBMS.
> Currently, Hive uses the {{GenericJdbcDatabaseAccessor}}. Afaik the only 
> generic query that will not work is splitting the query based on offset and 
> limit, since the syntax for that query is different than the one accepted by 
> Hive. We should create a {{HiveJdbcDatabaseAccessor}} to override that query 
> and possibly fix any other existing incompatibilities.





[jira] [Assigned] (HIVE-23302) Create HiveJdbcDatabaseAccessor for JDBC storage handler

2020-08-24 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-23302:
--

Assignee: Jesus Camacho Rodriguez

> Create HiveJdbcDatabaseAccessor for JDBC storage handler
> 
>
> Key: HIVE-23302
> URL: https://issues.apache.org/jira/browse/HIVE-23302
> Project: Hive
>  Issue Type: Bug
>  Components: StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> The {{JdbcDatabaseAccessor}} associated with the storage handler makes some 
> SQL calls to the RDBMS through the JDBC connection. There is a 
> {{GenericJdbcDatabaseAccessor}} with a generic implementation that the 
> storage handler uses if there is no specific implementation for a certain 
> RDBMS.
> Currently, Hive uses the {{GenericJdbcDatabaseAccessor}}. Afaik the only 
> generic query that will not work is splitting the query based on offset and 
> limit, since the syntax for that query is different than the one accepted by 
> Hive. We should create a {{HiveJdbcDatabaseAccessor}} to override that query 
> and possibly fix any other existing incompatibilities.





[jira] [Work logged] (HIVE-23649) Fix FindBug issues in hive-service-rpc

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23649?focusedWorklogId=474059&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-474059
 ]

ASF GitHub Bot logged work on HIVE-23649:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 21:56
Start Date: 24/Aug/20 21:56
Worklog Time Spent: 10m 
  Work Description: mustafaiman opened a new pull request #1426:
URL: https://github.com/apache/hive/pull/1426


   The entire org.apache.hive.service.rpc.thrift package consists of generated files. 
We should ignore these when running SpotBugs.
   
   Change-Id: I0ec78853b50e3720976daf52a2efbc200047b281
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 474059)
Remaining Estimate: 0h
Time Spent: 10m

> Fix FindBug issues in hive-service-rpc
> --
>
> Key: HIVE-23649
> URL: https://issues.apache.org/jira/browse/HIVE-23649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Mustafa Iman
>Priority: Major
> Attachments: spotbugsXml.xml
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23649) Fix FindBug issues in hive-service-rpc

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23649:
--
Labels: pull-request-available  (was: )

> Fix FindBug issues in hive-service-rpc
> --
>
> Key: HIVE-23649
> URL: https://issues.apache.org/jira/browse/HIVE-23649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (HIVE-23649) Fix FindBug issues in hive-service-rpc

2020-08-24 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mustafa Iman reassigned HIVE-23649:
---

Assignee: Mustafa Iman

> Fix FindBug issues in hive-service-rpc
> --
>
> Key: HIVE-23649
> URL: https://issues.apache.org/jira/browse/HIVE-23649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Mustafa Iman
>Priority: Major
> Attachments: spotbugsXml.xml
>
>






[jira] [Assigned] (HIVE-24068) ReExecutionOverlayPlugin can handle DAG submission failures as well

2020-08-24 Thread Prasanth Jayachandran (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-24068:



> ReExecutionOverlayPlugin can handle DAG submission failures as well
> ---
>
> Key: HIVE-24068
> URL: https://issues.apache.org/jira/browse/HIVE-24068
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>
> ReExecutionOverlayPlugin handles cases where there is a vertex failure. DAG 
> submission failures can also happen in environments where the AM container died, 
> causing DNS issues. DAG submissions are safe to retry because the DAG has not 
> started execution yet.





[jira] [Assigned] (HIVE-21025) LLAP IO fails on read if partition column is included in the table and the query has a predicate on the partition column

2020-08-24 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mustafa Iman reassigned HIVE-21025:
---

Assignee: (was: Mustafa Iman)

> LLAP IO fails on read if partition column is included in the table and the 
> query has a predicate on the partition column
> 
>
> Key: HIVE-21025
> URL: https://issues.apache.org/jira/browse/HIVE-21025
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.3.4
>Reporter: Eugene Koifman
>Priority: Major
>
> Hive doesn't officially support the case when a partitioning column is also 
> included in the data itself, though it works in some cases. Hive would never 
> write a data file with a partition column in it, but this can happen for 
> external tables where data is added by the end user.
> Consider improving validation (at least for schema-aware files) on read to 
> produce a better error than {{ArrayIndexOutOfBoundsException}}
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException 
> ], TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : attempt_1539023000868_24675_3_01_07_3:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:218)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:189)
> ... 15 more
> Caused by: java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:355)
> at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> ... 17 more
> Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException
> at org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.rethrowErrorIfAny(LlapRecordReader.java:355)
> at org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.nextCvb(LlapRecordReader.java:310)
> at org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:250)
> at 
> 

[jira] [Updated] (HIVE-24067) TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop

2020-08-24 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha updated HIVE-24067:

Status: Patch Available  (was: Open)

> TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop
> 
>
> Key: HIVE-24067
> URL: https://issues.apache.org/jira/browse/HIVE-24067
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24067.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In TestReplicationScenariosExclusiveReplica, the drop database operation on the 
> primary db fails with a wrong FS error because the ReplChangeManager is 
> associated with the replica FS.





[jira] [Updated] (HIVE-24067) TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop

2020-08-24 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha updated HIVE-24067:

Attachment: HIVE-24067.01.patch

> TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop
> 
>
> Key: HIVE-24067
> URL: https://issues.apache.org/jira/browse/HIVE-24067
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24067.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In TestReplicationScenariosExclusiveReplica, the drop database operation on the 
> primary db fails with a wrong FS error because the ReplChangeManager is 
> associated with the replica FS.





[jira] [Work logged] (HIVE-24067) TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24067?focusedWorklogId=474043=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-474043
 ]

ASF GitHub Bot logged work on HIVE-24067:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 20:41
Start Date: 24/Aug/20 20:41
Worklog Time Spent: 10m 
  Work Description: pkumarsinha opened a new pull request #1425:
URL: https://github.com/apache/hive/pull/1425


   … during DB drop
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 474043)
Remaining Estimate: 0h
Time Spent: 10m

> TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop
> 
>
> Key: HIVE-24067
> URL: https://issues.apache.org/jira/browse/HIVE-24067
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In TestReplicationScenariosExclusiveReplica, the drop database operation on the 
> primary db fails with a wrong FS error because the ReplChangeManager is 
> associated with the replica FS.





[jira] [Updated] (HIVE-24067) TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24067:
--
Labels: pull-request-available  (was: )

> TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop
> 
>
> Key: HIVE-24067
> URL: https://issues.apache.org/jira/browse/HIVE-24067
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In TestReplicationScenariosExclusiveReplica, the drop database operation on the 
> primary db fails with a wrong FS error because the ReplChangeManager is 
> associated with the replica FS.





[jira] [Assigned] (HIVE-24067) TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop

2020-08-24 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha reassigned HIVE-24067:
---


> TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop
> 
>
> Key: HIVE-24067
> URL: https://issues.apache.org/jira/browse/HIVE-24067
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>
> In TestReplicationScenariosExclusiveReplica, the drop database operation on the 
> primary db fails with a wrong FS error because the ReplChangeManager is 
> associated with the replica FS.





[jira] [Updated] (HIVE-24066) Hive query on parquet data should identify if column is not present in file schema and show NULL value instead of Exception

2020-08-24 Thread Jainik Vora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jainik Vora updated HIVE-24066:
---
Priority: Minor  (was: Trivial)

> Hive query on parquet data should identify if column is not present in file 
> schema and show NULL value instead of Exception
> ---
>
> Key: HIVE-24066
> URL: https://issues.apache.org/jira/browse/HIVE-24066
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.5
>Reporter: Jainik Vora
>Priority: Minor
>
> I created a hive table containing columns with struct data type 
>  
> {code:java}
> CREATE EXTERNAL TABLE abc_dwh.table_on_parquet (
>   `context` struct<`app`:struct<`build`:string, `name`:string, 
> `namespace`:string, `version`:string>, `screen`:struct<`height`:bigint, 
> `width`:bigint>, `timezone`:string>,
>   `messageid` string,
>   `timestamp` string,
>   `userid` string)
> PARTITIONED BY (year string, month string, day string, hour string)
> STORED as PARQUET
> LOCATION 's3://abc/xyz';
>   {code}
>  
> All columns are nullable, so the parquet files read by the table don't 
> always contain all columns. If any file in a partition lacks the 
> "context.app" struct and "context.app.version" is queried, Hive throws the 
> exception below. The same happens for "context.screen".
>  
> {code:java}
>  Caused by: java.io.IOException: java.lang.RuntimeException: Primitive type appshould not doesn't match typeapp[version]
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:379)
> at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
> ... 25 more
> Caused by: java.lang.RuntimeException: Primitive type appshould not doesn't match typeapp[version]
> at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.projectLeafTypes(DataWritableReadSupport.java:330)
> at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.projectLeafTypes(DataWritableReadSupport.java:322)
> at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.getProjectedSchema(DataWritableReadSupport.java:249)
> at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:379)
> at org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getSplit(ParquetRecordReaderBase.java:84)
> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:75)
> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:60)
> at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:75)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:376)
> ... 26 more
>  {code}
>  
> Querying context.app on its own returns NULL:
> {code:java}
> hive> select context.app from abc_dwh.table_on_parquet where year=2020 and 
> month='07' and day=26 and hour='03' limit 5;
> OK
> NULL
> NULL
> NULL
> NULL
> NULL
>   {code}
>  
> As a workaround, I tried querying "context.app.version" only if "context.app" 
> is not null, but that also gave the same error.  *To verify the case statement 
> for the null check, I ran the query below, which should produce "0" for every 
> row but instead produced "1".*  The distinct value of context.app for the partition is 
> NULL, which rules out differences caused by select with limit. Running the same query in 
> SparkSQL returns the correct result. 
> {code:java}
> hive> select case when context.app is null then 0 else 1 end status from 
> abc_dwh.table_on_parquet where year=2020 and month='07' and day=26 and 
> hour='03' limit 5;
> OK
> 1
> 1
> 1
> 1
> 1 {code}
> Hive Version used: 2.3.5-amzn-0 (on AWS EMR)





[jira] [Commented] (HIVE-3619) Hive JDBC driver should return a proper update-count of rows affected by query

2020-08-24 Thread Miklos Szurap (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183511#comment-17183511
 ] 

Miklos Szurap commented on HIVE-3619:
-

Linked HIVE-20218, which appears to have fixed this issue.

> Hive JDBC driver should return a proper update-count of rows affected by query
> --
>
> Key: HIVE-3619
> URL: https://issues.apache.org/jira/browse/HIVE-3619
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.9.0
>Reporter: Harsh J
>Priority: Minor
> Attachments: HIVE-3619.patch
>
>
> HiveStatement.java currently has an explicit 0 return:
> public int getUpdateCount() throws SQLException { return 0; }
> Ideally we ought to emit the exact number of rows affected by the query 
> statement itself.





[jira] [Updated] (HIVE-24031) Infinite planning time on syntactically big queries

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24031:
--
Labels: pull-request-available  (was: )

> Infinite planning time on syntactically big queries
> ---
>
> Key: HIVE-24031
> URL: https://issues.apache.org/jira/browse/HIVE-24031
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: ASTNode_getChildren_cost.png, 
> query_big_array_constructor.nps
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Syntactically big queries (~1 million tokens), such as the query shown below, 
> lead to very long (seemingly infinite) planning times.
> {code:sql}
> select posexplode(array('item1', 'item2', ..., 'item1M'));
> {code}





[jira] [Work logged] (HIVE-24031) Infinite planning time on syntactically big queries

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=473922=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473922
 ]

ASF GitHub Bot logged work on HIVE-24031:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 15:05
Start Date: 24/Aug/20 15:05
Worklog Time Spent: 10m 
  Work Description: zabetak opened a new pull request #1424:
URL: https://github.com/apache/hive/pull/1424


   ### What changes were proposed in this pull request?
   
   1. Drop the defensive copy of children inside ASTNode#getChildren.
   2. Protect clients from accidentally modifying the list by returning an
   unmodifiable collection.
   
   ### Why are the changes needed?
   Profiling shows that the vast majority of time is spent creating defensive
   copies of the node expression list inside ASTNode#getChildren.
   
   The method is called extensively from various places in the code
   especially those walking over the expression tree so it needs to be
   efficient.
   
   Most of the time, creating defensive copies is not necessary. In those
   cases (if any) where the list needs to be modified, clients should make
   a copy themselves.
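
The trade-off described here can be illustrated with a minimal sketch. `TreeNode` is a hypothetical stand-in, not Hive's `ASTNode`, and the method names are assumptions made for the example; only `Collections.unmodifiableList` is a real JDK call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a tree node; this is not Hive's ASTNode. */
final class TreeNode {
    private final List<TreeNode> children = new ArrayList<>();

    void addChild(TreeNode child) {
        children.add(child);
    }

    /** O(n) per call: allocates and fills a new list every time it is invoked. */
    List<TreeNode> getChildrenCopy() {
        return new ArrayList<>(children);
    }

    /** O(1) per call: a read-only wrapper over the same backing list. */
    List<TreeNode> getChildrenView() {
        return Collections.unmodifiableList(children);
    }
}
```

With the view, a tree walker that calls the getter on every visit no longer pays for a copy, and a caller that tries to mutate the result gets an `UnsupportedOperationException` instead of silently changing the tree. A caller that really needs a mutable list can wrap the view in `new ArrayList<>(...)` itself.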
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   The test was added in a separate branch since it is not meant to be 
committed upstream for the following reasons:
   
   - the query for reproducing the problem takes up a few MBs
   - requires some changes in the default configurations.
   
   If you want to run the test, run the following commands: 
   ```
   git checkout -b HIVE-24031-TEST master
   git pull g...@github.com:zabetak/hive.git HIVE-24031-PLUS-TEST
   mvn clean install -DskipTests
   cd itests
   mvn clean install -DskipTests
   cd qtest
   mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=big_query_with_array_constructor.q -Dtest.output.overwrite
   ```





Issue Time Tracking
---

Worklog Id: (was: 473922)
Remaining Estimate: 0h
Time Spent: 10m

> Infinite planning time on syntactically big queries
> ---
>
> Key: HIVE-24031
> URL: https://issues.apache.org/jira/browse/HIVE-24031
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: ASTNode_getChildren_cost.png, 
> query_big_array_constructor.nps
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Syntactically big queries (~1 million tokens), such as the query shown below, 
> lead to very long (seemingly infinite) planning times.
> {code:sql}
> select posexplode(array('item1', 'item2', ..., 'item1M'));
> {code}





[jira] [Updated] (HIVE-24065) Bloom filters can be cached after deserialization in VectorInBloomFilterColDynamicValue

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24065:
--
Labels: pull-request-available  (was: )

> Bloom filters can be cached after deserialization in 
> VectorInBloomFilterColDynamicValue
> ---
>
> Key: HIVE-24065
> URL: https://issues.apache.org/jira/browse/HIVE-24065
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2020-08-05-10-05-25-080.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The same bloom filter is loaded multiple times across tasks. It would be good to 
> check whether this can be optimised to avoid repeated deserialization.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24065) Bloom filters can be cached after deserialization in VectorInBloomFilterColDynamicValue

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24065?focusedWorklogId=473920=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473920
 ]

ASF GitHub Bot logged work on HIVE-24065:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 14:49
Start Date: 24/Aug/20 14:49
Worklog Time Spent: 10m 
  Work Description: abstractdog opened a new pull request #1423:
URL: https://github.com/apache/hive/pull/1423


   Change-Id: I311f131c03392618cc2dac186e7e53a48ede1eb4
   
   
   
   ### What changes were proposed in this pull request?
   As the title suggests, expensive bloom filter deserialization can be 
eliminated by caching the bloom filters. This way, only 1 filter instance per 
daemon (or container in container mode) will be present.
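
A minimal sketch of the caching idea, assuming a per-process map keyed by some filter identifier. `DeserializedCache`, its key type, and the loader function are hypothetical and do not reflect the classes touched by this pull request; `ConcurrentHashMap.computeIfAbsent` is the only real API relied on.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache; the key type and loader are illustrative assumptions. */
final class DeserializedCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> deserializer;

    DeserializedCache(Function<K, V> deserializer) {
        this.deserializer = deserializer;
    }

    /** Runs the deserializer at most once per key; later callers share the instance. */
    V get(K key) {
        return cache.computeIfAbsent(key, deserializer);
    }

    /** Drops an entry, for example when the owning query finishes. */
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

Because all tasks in the same process share one instance per key, the cached value has to be treated as read-only, and some eviction policy (here just `invalidate`) is needed so entries do not outlive the query that produced them.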
   
   
   ### Why are the changes needed?
   Performance improvement.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Tested on cluster.
   





Issue Time Tracking
---

Worklog Id: (was: 473920)
Remaining Estimate: 0h
Time Spent: 10m

> Bloom filters can be cached after deserialization in 
> VectorInBloomFilterColDynamicValue
> ---
>
> Key: HIVE-24065
> URL: https://issues.apache.org/jira/browse/HIVE-24065
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: image-2020-08-05-10-05-25-080.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The same bloom filter is loaded multiple times across tasks. It would be good to 
> check whether this can be optimised to avoid repeated deserialization.





[jira] [Updated] (HIVE-24065) Bloom filters can be cached after deserialization in VectorInBloomFilterColDynamicValue

2020-08-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24065:

Attachment: image-2020-08-05-10-05-25-080.png

> Bloom filters can be cached after deserialization in 
> VectorInBloomFilterColDynamicValue
> ---
>
> Key: HIVE-24065
> URL: https://issues.apache.org/jira/browse/HIVE-24065
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: image-2020-08-05-10-05-25-080.png
>
>
> The same bloom filter is loaded multiple times across tasks. It would be good to 
> check whether this can be optimised to avoid repeated deserialization.





[jira] [Commented] (HIVE-24065) Bloom filters can be cached after deserialization in VectorInBloomFilterColDynamicValue

2020-08-24 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-24065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183353#comment-17183353
 ] 

László Bodor commented on HIVE-24065:
-

The idea is to cache the bloom filters in order to eliminate repeated deserialization.

> Bloom filters can be cached after deserialization in 
> VectorInBloomFilterColDynamicValue
> ---
>
> Key: HIVE-24065
> URL: https://issues.apache.org/jira/browse/HIVE-24065
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: image-2020-08-05-10-05-25-080.png
>
>
> The same bloom filter is loaded multiple times across tasks. It would be good to 
> check whether this can be optimised to avoid repeated deserialization.





[jira] [Updated] (HIVE-24065) Bloom filters can be cached after deserialization in VectorInBloomFilterColDynamicValue

2020-08-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24065:

Description: The same bloom filter is loaded multiple times across tasks. It 
would be good to check whether this can be optimised to avoid repeated deserialization.

> Bloom filters can be cached after deserialization in 
> VectorInBloomFilterColDynamicValue
> ---
>
> Key: HIVE-24065
> URL: https://issues.apache.org/jira/browse/HIVE-24065
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> The same bloom filter is loaded multiple times across tasks. It would be good to 
> check whether this can be optimised to avoid repeated deserialization.





[jira] [Assigned] (HIVE-24065) Bloom filters can be cached after deserialization in VectorInBloomFilterColDynamicValue

2020-08-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-24065:
---

Assignee: László Bodor

> Bloom filters can be cached after deserialization in 
> VectorInBloomFilterColDynamicValue
> ---
>
> Key: HIVE-24065
> URL: https://issues.apache.org/jira/browse/HIVE-24065
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>






[jira] [Updated] (HIVE-24064) Disable Materialized View Replication

2020-08-24 Thread Arko Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arko Sharma updated HIVE-24064:
---
Attachment: HIVE-24064.01.patch
Status: Patch Available  (was: Open)

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24064.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24064) Disable Materialized View Replication

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24064?focusedWorklogId=473902=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473902
 ]

ASF GitHub Bot logged work on HIVE-24064:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 13:47
Start Date: 24/Aug/20 13:47
Worklog Time Spent: 10m 
  Work Description: ArkoSharma opened a new pull request #1422:
URL: https://github.com/apache/hive/pull/1422


   





Issue Time Tracking
---

Worklog Id: (was: 473902)
Remaining Estimate: 0h
Time Spent: 10m

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24064) Disable Materialized View Replication

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24064:
--
Labels: pull-request-available  (was: )

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (HIVE-24064) Disable Materialized View Replication

2020-08-24 Thread Arko Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arko Sharma reassigned HIVE-24064:
--

Assignee: Arko Sharma

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>






[jira] [Work logged] (HIVE-23880) Bloom filters can be merged in a parallel way in VectorUDAFBloomFilterMerge

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23880?focusedWorklogId=473884=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473884
 ]

ASF GitHub Bot logged work on HIVE-23880:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 12:49
Start Date: 24/Aug/20 12:49
Worklog Time Spent: 10m 
  Work Description: abstractdog opened a new pull request #1280:
URL: https://github.com/apache/hive/pull/1280


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
   For more details, please see 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   





Issue Time Tracking
---

Worklog Id: (was: 473884)
Time Spent: 8h 20m  (was: 8h 10m)

> Bloom filters can be merged in a parallel way in VectorUDAFBloomFilterMerge
> ---
>
> Key: HIVE-23880
> URL: https://issues.apache.org/jira/browse/HIVE-23880
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: lipwig-output3605036885489193068.svg
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Merging bloom filters in semijoin reduction can become the main bottleneck when 
> there is a large number of source mapper tasks (~1000, Map 1 in the example below) 
> and a large number of expected entries (50M) in the bloom filters.
> For example in TPCDS Q93:
> {code}
> select /*+ semi(store_returns, sr_item_sk, store_sales, 7000)*/
>        ss_customer_sk
>       ,sum(act_sales) sumsales
>   from (select ss_item_sk
>               ,ss_ticket_number
>               ,ss_customer_sk
>               ,case when sr_return_quantity is not null
>                     then (ss_quantity-sr_return_quantity)*ss_sales_price
>                     else (ss_quantity*ss_sales_price) end act_sales
>           from store_sales left outer join store_returns
>                on (sr_item_sk = ss_item_sk and sr_ticket_number = ss_ticket_number)
>               ,reason
>          where sr_reason_sk = r_reason_sk
>            and r_reason_desc = 'reason 66') t
>  group by ss_customer_sk
>  order by sumsales, ss_customer_sk
>  limit 100;
> {code}
> At the 10TB-30TB scale, 1-2 minutes of a 3-4 minute query runtime can be spent 
> merging bloom filters (Reducer 2), as in:  
> [^lipwig-output3605036885489193068.svg] 
> {code}
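> ------------------------------------------------------------------------------------
> VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ------------------------------------------------------------------------------------
> Map 3 ..      llap     SUCCEEDED      1          1        0        0       0       0
> Map 1 ..      llap     SUCCEEDED   1263       1263        0        0       0       0
> Reducer 2     llap       RUNNING      1          0        1        0       0       0
> Map 4         llap       RUNNING   6154          0      207     5947       0       0
> Reducer 5     llap        INITED     43          0        0       43       0       0
> Reducer 6     llap        INITED      1          0        0        1       0       0
> ------------------------------------------------------------------------------------
> VERTICES: 02/06  [>>--] 16%   ELAPSED TIME: 149.98 s
> ------------------------------------------------------------------------------------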
> {code}
> For example, 70M entries in a bloom filter lead to 436 465 696 bits, so 
> merging 1263 bloom filters means running ~1263 * 436 465 696 bitwise OR 
> operations, which is a very hot code path but can be parallelized.
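
The parallelization idea in the description can be sketched as follows. This is only a minimal illustration, assuming each bloom filter is available as a long[] bitset of equal length; the class and method names are hypothetical and this is not the VectorUDAFBloomFilterMerge code. Every output word depends only on the words at the same index in the inputs, so the index range can be split across threads without locking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

// Illustrative sketch only; names are hypothetical, not Hive's implementation.
class ParallelBloomMergeSketch {

  // ORs all source bitsets into a fresh result. The index range is split
  // across worker threads; each index is written by exactly one thread.
  static long[] mergeParallel(List<long[]> filters, int numWords) {
    long[] merged = new long[numWords];
    IntStream.range(0, numWords).parallel().forEach(i -> {
      long word = 0L;
      for (long[] filter : filters) {
        word |= filter[i];
      }
      merged[i] = word;
    });
    return merged;
  }

  // Single-threaded reference used to check the parallel result.
  static long[] mergeSequential(List<long[]> filters, int numWords) {
    long[] merged = new long[numWords];
    for (long[] filter : filters) {
      for (int i = 0; i < numWords; i++) {
        merged[i] |= filter[i];
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    List<long[]> filters = new ArrayList<>();
    filters.add(new long[] {1L, 0L, 8L});
    filters.add(new long[] {2L, 4L, 8L});
    long[] parallel = mergeParallel(filters, 3);
    long[] sequential = mergeSequential(filters, 3);
    System.out.println(Arrays.equals(parallel, sequential));
  }
}
```

Splitting by word index rather than by source filter keeps the writes disjoint, which is why no synchronization appears in the sketch.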



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23880) Bloom filters can be merged in a parallel way in VectorUDAFBloomFilterMerge

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23880?focusedWorklogId=473883=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473883
 ]

ASF GitHub Bot logged work on HIVE-23880:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 12:49
Start Date: 24/Aug/20 12:49
Worklog Time Spent: 10m 
  Work Description: abstractdog closed pull request #1280:
URL: https://github.com/apache/hive/pull/1280


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 473883)
Time Spent: 8h 10m  (was: 8h)

> Bloom filters can be merged in a parallel way in VectorUDAFBloomFilterMerge
> ---
>
> Key: HIVE-23880
> URL: https://issues.apache.org/jira/browse/HIVE-23880
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: lipwig-output3605036885489193068.svg
>
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> Merging bloom filters in semijoin reduction can become the main bottleneck 
> when there is a large number of source mapper tasks (~1000; Map 1 in the 
> example below) and a large number of expected entries (50M) in the bloom filters.
> For example in TPCDS Q93:
> {code}
> select /*+ semi(store_returns, sr_item_sk, store_sales, 7000)*/ ss_customer_sk
>       ,sum(act_sales) sumsales
>   from (select ss_item_sk
>               ,ss_ticket_number
>               ,ss_customer_sk
>               ,case when sr_return_quantity is not null then (ss_quantity-sr_return_quantity)*ss_sales_price
>                     else (ss_quantity*ss_sales_price) end act_sales
>         from store_sales left outer join store_returns on (sr_item_sk = ss_item_sk
>                                                            and sr_ticket_number = ss_ticket_number)
>             ,reason
>         where sr_reason_sk = r_reason_sk
>           and r_reason_desc = 'reason 66') t
>   group by ss_customer_sk
>   order by sumsales, ss_customer_sk
> limit 100;
> {code}
> At 10TB-30TB scale, 1-2 minutes of a 3-4 minute query runtime can be spent 
> merging bloom filters (Reducer 2), as in:  
> [^lipwig-output3605036885489193068.svg] 
> {code}
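> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 3 ..              llap     SUCCEEDED      1          1        0        0       0       0
> Map 1 ..              llap     SUCCEEDED   1263       1263        0        0       0       0
> Reducer 2             llap       RUNNING      1          0        1        0       0       0
> Map 4                 llap       RUNNING   6154          0      207     5947       0       0
> Reducer 5             llap        INITED     43          0        0       43       0       0
> Reducer 6             llap        INITED      1          0        0        1       0       0
> ----------------------------------------------------------------------------------------------
> VERTICES: 02/06  [>>--] 16%   ELAPSED TIME: 149.98 s
> ----------------------------------------------------------------------------------------------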
> {code}
> For example, 70M entries in a bloom filter lead to 436 465 696 bits, so 
> merging 1263 bloom filters means running ~1263 * 436 465 696 bitwise OR 
> operations, which is a very hot code path but can be parallelized.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-18284) NPE when inserting data with 'distribute by' clause with dynpart sort optimization

2020-08-24 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-18284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183160#comment-17183160
 ] 

Syed Shameerur Rahman commented on HIVE-18284:
--

[~jcamachorodriguez] Could you please review the PR?

> NPE when inserting data with 'distribute by' clause with dynpart sort 
> optimization
> --
>
> Key: HIVE-18284
> URL: https://issues.apache.org/jira/browse/HIVE-18284
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.3.1, 2.3.2
>Reporter: Aki Tanaka
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A Null Pointer Exception occurs when inserting data with a 'distribute by' 
> clause. The following query snippet reproduces the issue:
> *(non-vectorized , non-llap mode)*
> {code:java}
> create table table1 (col1 string, datekey int);
> insert into table1 values ('ROW1', 1), ('ROW2', 2), ('ROW3', 1);
> create table table2 (col1 string) partitioned by (datekey int);
> set hive.vectorized.execution.enabled=false;
> set hive.optimize.sort.dynamic.partition=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> insert into table table2
> PARTITION(datekey)
> select col1,
> datekey
> from table1
> distribute by datekey ;
> {code}
> The insert query runs without the error if I remove the Distribute By clause 
> or use a Cluster By clause instead.
> It seems that the issue happens because Distribute By does not guarantee 
> clustering or sorting properties on the distributed keys.
> FileSinkOperator removes the previous fsp, which might be re-used when we 
> use Distribute By.
> https://github.com/apache/hive/blob/branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L972
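
The suspected mechanism can be illustrated with a toy model. This is a hedged sketch, not FileSinkOperator: the class and method names are hypothetical, and the arrival order 1, 2, 1 is an assumption taken from the insert order of the reproduction rows. The sink keeps a writer only for the current partition key and removes it when the key changes, so a key that comes back later finds its writer gone.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only; names are hypothetical and this is not Hive's FileSinkOperator.
class SingleWriterSinkSketch {

  // Walks the partition keys in arrival order. Whenever the key changes, the
  // writer of the previous key is removed. Returns the keys that arrive again
  // after their writer has been removed.
  static List<Integer> keysSeenAfterRemoval(int[] partitionKeys) {
    Set<Integer> removed = new HashSet<>();
    List<Integer> revisited = new ArrayList<>();
    Integer current = null;
    for (int key : partitionKeys) {
      if (current == null || current != key) {
        if (current != null) {
          removed.add(current);
        }
        if (removed.contains(key)) {
          revisited.add(key);
        }
        current = key;
      }
    }
    return revisited;
  }

  public static void main(String[] args) {
    // Unsorted arrival, as Distribute By permits: key 1 returns after key 2.
    System.out.println(keysSeenAfterRemoval(new int[] {1, 2, 1}));
    // Sorted arrival, as Cluster By would give: no key is revisited.
    System.out.println(keysSeenAfterRemoval(new int[] {1, 1, 2}));
  }
}
```

In this model only the unsorted sequence revisits a removed key, which is consistent with the report that Cluster By avoids the error.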
> The following stack trace is logged.
> {code:java}
> Vertex failed, vertexName=Reducer 2, vertexId=vertex_1513111717879_0056_1_01, 
> diagnostics=[Task failed, taskId=task_1513111717879_0056_1_01_00, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
> failure ) : 
> attempt_1513111717879_0056_1_01_00_0:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) {"key":{},"value":{"_col0":"ROW3","_col1":1}}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{},"value":{"_col0":"ROW3","_col1":1}}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:365)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:250)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:317)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
>   ... 14 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:762)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:356)
>   ... 17 more
> 

[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=473832=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473832
 ]

ASF GitHub Bot logged work on HIVE-23851:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 11:01
Start Date: 24/Aug/20 11:01
Worklog Time Spent: 10m 
  Work Description: shameersss1 commented on pull request #1271:
URL: https://github.com/apache/hive/pull/1271#issuecomment-679059909


   @kgyrtkirk Could you please review the PR?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 473832)
Time Spent: 2h 20m  (was: 2h 10m)

> MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
> 
>
> Key: HIVE-23851
> URL: https://issues.apache.org/jira/browse/HIVE-23851
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> *Steps to reproduce:*
> # Create external table
> # Run msck command to sync all the partitions with metastore
> # Remove one of the partition paths
> # Run msck repair with partition filtering
> *Stack Trace:*
> {code:java}
>  2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] 
> ppr.PartitionExpressionForMetastore: Failed to deserialize the expression
>  java.lang.IndexOutOfBoundsException: Index: 110, Size: 0
>  at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192]
>  at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192]
>  at 
> org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT]
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192]
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_192]
> {code}
> *Cause:*
> In case of msck repair with partition filtering, we expect the expression proxy 
> class to be set to PartitionExpressionForMetastore ( 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78
>  ). While dropping partitions, we serialize the drop partition filter 
> expression as ( 
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589
>  ), which is incompatible with the deserialization happening in 
> PartitionExpressionForMetastore ( 
> 

[jira] [Commented] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions

2020-08-24 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183159#comment-17183159
 ] 

Syed Shameerur Rahman commented on HIVE-23851:
--

[~kgyrtkirk] Could you please review the PR?

> MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
> 
>
> Key: HIVE-23851
> URL: https://issues.apache.org/jira/browse/HIVE-23851
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> *Steps to reproduce:*
> # Create external table
> # Run msck command to sync all the partitions with metastore
> # Remove one of the partition paths
> # Run msck repair with partition filtering
> *Stack Trace:*
> {code:java}
>  2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] 
> ppr.PartitionExpressionForMetastore: Failed to deserialize the expression
>  java.lang.IndexOutOfBoundsException: Index: 110, Size: 0
>  at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192]
>  at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192]
>  at 
> org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80)
>  [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT]
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192]
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_192]
> {code}
> *Cause:*
> In case of msck repair with partition filtering, we expect the expression proxy 
> class to be set to PartitionExpressionForMetastore ( 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78
>  ). While dropping partitions, we serialize the drop partition filter 
> expression as ( 
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589
>  ), which is incompatible with the deserialization happening in 
> PartitionExpressionForMetastore ( 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpressionForMetastore.java#L52
>  ); hence the query fails with "Failed to deserialize the expression".
> *Solutions*:
> I can think of two approaches to this problem:
> # Since PartitionExpressionForMetastore is required only during the partition 
> pruning step, we can switch the expression proxy class back to 
> MsckPartitionExpressionProxy once the partition pruning step is done.
> # The other solution is to make the serialization of the msck drop partition 
> filter expression compatible with the one used by 
> PartitionExpressionForMetastore. We can do this via reflection, since the drop 
> partition serialization happens in the Msck class (standalone-metastore); this 
> way we can completely remove the need for class 

[jira] [Updated] (HIVE-23926) Flaky test TestTableLevelReplicationScenarios.testRenameTableScenariosWithReplacePolicyDMLOperattion

2020-08-24 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23926:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1. Merged to master. Thanks for the patch [~^sharma]

> Flaky test 
> TestTableLevelReplicationScenarios.testRenameTableScenariosWithReplacePolicyDMLOperattion
> 
>
> Key: HIVE-23926
> URL: https://issues.apache.org/jira/browse/HIVE-23926
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23926.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/123/testReport/org.apache.hadoop.hive.ql.parse/TestTableLevelReplicationScenarios/Testing___split_18___Archive___testRenameTableScenariosWithReplacePolicyDMLOperattion/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24032) Remove hadoop shims dependency and use FileSystem Api directly from standalone metastore

2020-08-24 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24032:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master. Thanks for the patch [~aasha] and the review [~pkumarsinha]!

> Remove hadoop shims dependency and use FileSystem Api directly from 
> standalone metastore
> 
>
> Key: HIVE-24032
> URL: https://issues.apache.org/jira/browse/HIVE-24032
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24032.01.patch, HIVE-24032.02.patch, 
> HIVE-24032.03.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Remove hadoop shims dependency from standalone metastore. 
> Rename hive.repl.data.copy.lazy hive conf to 
> hive.repl.run.data.copy.tasks.on.target



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23723) Limit operator pushdown through LOJ

2020-08-24 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-23723:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Limit operator pushdown through LOJ
> ---
>
> Key: HIVE-23723
> URL: https://issues.apache.org/jira/browse/HIVE-23723
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Limit operator (without an order by) can be pushed through SELECTS and LEFT 
> OUTER JOINs.
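
Why the pushdown is safe can be shown with a small model. This is a hedged sketch with hypothetical names, not Hive's optimizer rule. A left outer join emits at least one row for every left row, so the first n left rows are enough to produce n output rows. Without an ORDER BY any n rows are a valid answer. The limit on top of the join stays, because one left row may match several right rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not Hive's optimizer code.
class LimitThroughLeftOuterJoinSketch {

  // Nested-loop left outer join on equal integer keys. Unmatched left rows are
  // kept once with a null right side. Each output row is rendered "left|right".
  static List<String> leftOuterJoin(List<Integer> left, List<Integer> right) {
    List<String> out = new ArrayList<>();
    for (Integer l : left) {
      boolean matched = false;
      for (Integer r : right) {
        if (l.equals(r)) {
          out.add(l + "|" + r);
          matched = true;
        }
      }
      if (!matched) {
        out.add(l + "|null");
      }
    }
    return out;
  }

  // Returns at most the first n rows.
  static <T> List<T> limit(List<T> rows, int n) {
    return new ArrayList<>(rows.subList(0, Math.min(n, rows.size())));
  }

  // LIMIT n applied to the left input first, then again on top of the join.
  static List<String> limitPushedDown(List<Integer> left, List<Integer> right, int n) {
    return limit(leftOuterJoin(limit(left, n), right), n);
  }

  public static void main(String[] args) {
    List<Integer> left = Arrays.asList(1, 2, 3, 4);
    List<Integer> right = Arrays.asList(2, 2, 5);
    System.out.println(limit(leftOuterJoin(left, right), 2));
    System.out.println(limitPushedDown(left, right, 2));
  }
}
```

In this order-preserving model both plans return the same rows, while the pushed-down plan joins only two left rows instead of four.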



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23723) Limit operator pushdown through LOJ

2020-08-24 Thread Krisztian Kasa (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183131#comment-17183131
 ] 

Krisztian Kasa commented on HIVE-23723:
---

Pushed to master, thanks [~amagyar]!

> Limit operator pushdown through LOJ
> ---
>
> Key: HIVE-23723
> URL: https://issues.apache.org/jira/browse/HIVE-23723
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Limit operator (without an order by) can be pushed through SELECTS and LEFT 
> OUTER JOINs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24032) Remove hadoop shims dependency and use FileSystem Api directly from standalone metastore

2020-08-24 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24032:
---
Description: Remove hadoop shims dependency from standalone metastore.

> Remove hadoop shims dependency and use FileSystem Api directly from 
> standalone metastore
> 
>
> Key: HIVE-24032
> URL: https://issues.apache.org/jira/browse/HIVE-24032
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24032.01.patch, HIVE-24032.02.patch, 
> HIVE-24032.03.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Remove hadoop shims dependency from standalone metastore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23723) Limit operator pushdown through LOJ

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23723?focusedWorklogId=473820=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473820
 ]

ASF GitHub Bot logged work on HIVE-23723:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 10:12
Start Date: 24/Aug/20 10:12
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged pull request #1323:
URL: https://github.com/apache/hive/pull/1323


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 473820)
Time Spent: 1h 20m  (was: 1h 10m)

> Limit operator pushdown through LOJ
> ---
>
> Key: HIVE-23723
> URL: https://issues.apache.org/jira/browse/HIVE-23723
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Limit operator (without an order by) can be pushed through SELECTS and LEFT 
> OUTER JOINs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24032) Remove hadoop shims dependency and use FileSystem Api directly from standalone metastore

2020-08-24 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24032:
---
Description: 
Remove hadoop shims dependency from standalone metastore. 
Rename hive.repl.data.copy.lazy hive conf to 
hive.repl.run.data.copy.tasks.on.target

  was:Remove hadoop shims dependency from standalone metastore.


> Remove hadoop shims dependency and use FileSystem Api directly from 
> standalone metastore
> 
>
> Key: HIVE-24032
> URL: https://issues.apache.org/jira/browse/HIVE-24032
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24032.01.patch, HIVE-24032.02.patch, 
> HIVE-24032.03.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Remove hadoop shims dependency from standalone metastore. 
> Rename hive.repl.data.copy.lazy hive conf to 
> hive.repl.run.data.copy.tasks.on.target



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24063) SqlFunctionConverter#getHiveUDF handles cast before geting FunctionInfo

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24063:
--
Labels: pull-request-available  (was: )

> SqlFunctionConverter#getHiveUDF handles cast before geting FunctionInfo
> ---
>
> Key: HIVE-24063
> URL: https://issues.apache.org/jira/browse/HIVE-24063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When the current SqlOperator is a SqlCastFunction, 
> FunctionRegistry.getFunctionInfo returns null. 
> But when hive.allow.udf.load.on.demand is enabled, HiveServer2 also asks the 
> metastore for the function definition, and an exception stack trace like the 
> following appears in the HiveServer2 log:
> INFO exec.FunctionRegistry: Unable to look up default.cast in metastore
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> NoSuchObjectException(message:Function @hive#default.cast does not exist)
>  at org.apache.hadoop.hive.ql.metadata.Hive.getFunction(Hive.java:5495) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.Registry.getFunctionInfoFromMetastoreNoLock(Registry.java:788)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.Registry.getQualifiedFunctionInfo(Registry.java:657)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.Registry.getFunctionInfo(Registry.java:351) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:597)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.SqlFunctionConverter.getHiveUDF(SqlFunctionConverter.java:158)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:112)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:68)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:134)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:68)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:134)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:68)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:134)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] 
>  
> So it may be better to handle the explicit cast before getting the FunctionInfo 
> from the Registry. Even if there is no cast in the query, the method 
> handleExplicitCast returns null quickly when op.kind is not SqlKind.CAST.
>  
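
The proposed ordering can be sketched with a toy model. This is only an illustration: every name below is hypothetical, and none of it is the SqlFunctionConverter or Registry API. The lookup simulates a registry that falls back to a remote metastore call on a miss, and the resolver checks for a cast first so that fallback is never reached for casts.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; all names are hypothetical, not Hive's API.
class CastBeforeLookupSketch {
  enum Kind { CAST, OTHER }

  private final Map<String, String> builtins = new HashMap<>();
  int remoteLookups = 0; // counts simulated metastore round trips

  CastBeforeLookupSketch() {
    builtins.put("upper", "upper-udf"); // placeholder entry for illustration
  }

  // Simulates a registry lookup that falls back to a remote call on a miss.
  private String lookup(String name) {
    String udf = builtins.get(name);
    if (udf == null) {
      remoteLookups++;
    }
    return udf;
  }

  // Handles casts locally first and consults the registry only otherwise.
  String resolve(Kind kind, String name) {
    if (kind == Kind.CAST) {
      return "cast-handled-locally";
    }
    return lookup(name);
  }

  public static void main(String[] args) {
    CastBeforeLookupSketch sketch = new CastBeforeLookupSketch();
    sketch.resolve(Kind.CAST, "cast");
    System.out.println(sketch.remoteLookups);
  }
}
```

The check on the operator kind is a cheap comparison, so queries without casts pay almost nothing for it.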



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24063) SqlFunctionConverter#getHiveUDF handles cast before geting FunctionInfo

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24063?focusedWorklogId=473816=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473816
 ]

ASF GitHub Bot logged work on HIVE-24063:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 10:06
Start Date: 24/Aug/20 10:06
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1421:
URL: https://github.com/apache/hive/pull/1421


   …g FunctionInfo
   
   
   
   ### What changes were proposed in this pull request?
   SqlFunctionConverter#getHiveUDF handles cast before geting FunctionInfo
   
   
   
   ### Why are the changes needed?
   When hive.allow.udf.load.on.demand is enabled, another RPC call is made to 
the metastore for the cast definition when getting FunctionInfo, but there is 
no need to do this.
   
   
   
   ### Does this PR introduce _any_ user-facing change
   No
   
   
   
   ### How was this patch tested?
   Included tests
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 473816)
Remaining Estimate: 0h
Time Spent: 10m

> SqlFunctionConverter#getHiveUDF handles cast before geting FunctionInfo
> ---
>
> Key: HIVE-24063
> URL: https://issues.apache.org/jira/browse/HIVE-24063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When the current SqlOperator is a SqlCastFunction, 
> FunctionRegistry.getFunctionInfo returns null. 
> But when hive.allow.udf.load.on.demand is enabled, HiveServer2 also asks the 
> metastore for the function definition, and an exception stack trace like the 
> following appears in the HiveServer2 log:
> INFO exec.FunctionRegistry: Unable to look up default.cast in metastore
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> NoSuchObjectException(message:Function @hive#default.cast does not exist)
>  at org.apache.hadoop.hive.ql.metadata.Hive.getFunction(Hive.java:5495) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.Registry.getFunctionInfoFromMetastoreNoLock(Registry.java:788)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.Registry.getQualifiedFunctionInfo(Registry.java:657)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.Registry.getFunctionInfo(Registry.java:351) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:597)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.SqlFunctionConverter.getHiveUDF(SqlFunctionConverter.java:158)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:112)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:68)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:134)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:68)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:134)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:68)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.PartitionPrune$ExtractPartPruningPredicate.visitCall(PartitionPrune.java:134)
>  

[jira] [Assigned] (HIVE-24062) Combine all table constrains RDBMS calls in one SQL call

2020-08-24 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-24062:



> Combine all table constrains RDBMS calls in one SQL call
> 
>
> Key: HIVE-24062
> URL: https://issues.apache.org/jira/browse/HIVE-24062
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> A table can have 6 different types of constraints, namely 
> PrimaryKey, ForeignKey, UniqueConstraint, NotNullConstraint, DefaultConstraint, and CheckConstraint.
> Each constraint type has its own SQL query to fetch the information from the RDBMS, 
> which leads to 6 separate RDBMS calls. 
> The idea here is to have one complex query that fetches all the constraint 
> information at once and then filters the result set by constraint 
> type.
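
The "fetch once, then filter by type" step could be sketched roughly as below. This is an illustration only: the `Row` class, the `ConstraintType` enum, and `groupByType` are hypothetical names, and the rows are built in memory rather than read from a real metastore query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class ConstraintGroupingSketch {

    /** The six constraint kinds named in the issue description. */
    public enum ConstraintType { PRIMARY_KEY, FOREIGN_KEY, UNIQUE, NOT_NULL, DEFAULT, CHECK }

    /** Hypothetical shape of one row returned by the single combined query. */
    public static final class Row {
        public final ConstraintType type;
        public final String constraintName;
        public final String columnName;

        public Row(ConstraintType type, String constraintName, String columnName) {
            this.type = type;
            this.constraintName = constraintName;
            this.columnName = columnName;
        }
    }

    /** Splits one combined result set into per-type lists, preserving row order. */
    public static Map<ConstraintType, List<Row>> groupByType(List<Row> rows) {
        Map<ConstraintType, List<Row>> grouped = new EnumMap<>(ConstraintType.class);
        for (ConstraintType t : ConstraintType.values()) {
            grouped.put(t, new ArrayList<>()); // every type gets a list, even if empty
        }
        for (Row row : rows) {
            grouped.get(row.type).add(row);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row(ConstraintType.PRIMARY_KEY, "pk_t", "id"),
            new Row(ConstraintType.NOT_NULL, "nn_t_id", "id"),
            new Row(ConstraintType.NOT_NULL, "nn_t_name", "name"));
        Map<ConstraintType, List<Row>> grouped = groupByType(rows);
        for (ConstraintType t : ConstraintType.values()) {
            System.out.println(t + ": " + grouped.get(t).size());
        }
    }
}
```

Pre-populating every type with an empty list means callers do not need null checks for constraint types that a table does not use.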





[jira] [Commented] (HIVE-22352) Hive JDBC Storage Handler, simple select query failed with NPE if executed using Fetch Task

2020-08-24 Thread chenruotao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183003#comment-17183003
 ] 

chenruotao commented on HIVE-22352:
---

Maybe you should try setting hive.sql.query.fieldNames=id,date and 
hive.sql.query.fieldTypes=int,timestamp,

or rename the external table's columns from col1 to id and from col2 to date.

> Hive JDBC Storage Handler, simple select query failed with NPE if executed 
> using Fetch Task
> ---
>
> Key: HIVE-22352
> URL: https://issues.apache.org/jira/browse/HIVE-22352
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
> Environment: Hive-3.1
>Reporter: Rajkumar Singh
>Priority: Blocker
>
> Steps To Repro:
>  
> {code:java}
> // MySQL table
> CREATE TABLE `visitors` (
>   `id` bigint(20) unsigned NOT NULL,
>   `date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
> )
> // Hive table
> CREATE EXTERNAL TABLE `hive_visitors`(
>   `col1` bigint COMMENT 'from deserializer',
>   `col2` timestamp COMMENT 'from deserializer')
> ROW FORMAT SERDE 'org.apache.hive.storage.jdbc.JdbcSerDe'
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> WITH SERDEPROPERTIES ('serialization.format'='1')
> TBLPROPERTIES (
>   'bucketing_version'='2',
>   'hive.sql.database.type'='MYSQL',
>   'hive.sql.dbcp.maxActive'='1',
>   'hive.sql.dbcp.password'='hive',
>   'hive.sql.dbcp.username'='hive',
>   'hive.sql.jdbc.driver'='com.mysql.jdbc.Driver',
>   'hive.sql.jdbc.url'='jdbc:mysql://hostname/test',
>   'hive.sql.table'='visitors',
>   'transient_lastDdlTime'='1554910389')
> Query:
> select * from hive_visitors ;
> Exception:
> 2019-10-16T04:04:39,483 WARN  [HiveServer2-Handler-Pool: Thread-71]: 
> thrift.ThriftCLIService (:()) - Error fetching results: 
> org.apache.hive.service.cli.HiveSQLException: java.io.IOException: 
> java.lang.NullPointerException at 
> org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:478)
>  ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
>  ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:952)
>  ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
> sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source) ~[?:?] at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112] at java.lang.reflect.Method.invoke(Method.java:498) 
> ~[?:1.8.0_112] at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>  ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
> java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112] at 
> javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112] at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?] at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
> com.sun.proxy.$Proxy42.fetchResults(Unknown Source) ~[?:?] at 
> org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:565) 
> ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:792)
>  ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)
>  ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822)
>  ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
> org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
> org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
> ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>  ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>  ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
>