[jira] [Assigned] (HIVE-23254) Upgrade guava version in hive from 19.0 to 27.0-jre
    [ https://issues.apache.org/jira/browse/HIVE-23254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wenjun ma reassigned HIVE-23254:
--------------------------------

    Assignee: wenjun ma

> Upgrade guava version in hive from 19.0 to 27.0-jre
> ---------------------------------------------------
>
>                 Key: HIVE-23254
>                 URL: https://issues.apache.org/jira/browse/HIVE-23254
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Ankur Raj
>            Assignee: wenjun ma
>            Priority: Critical
>
> Upgrade guava version in hive from 19.0 to 27.0-jre.
> Hadoop has already upgraded it as part of
> [https://jira.apache.org/jira/browse/HADOOP-16213]
> Concern: https://nvd.nist.gov/vuln/detail/CVE-2018-10237
> Unbounded memory allocation in Google Guava 11.0 through 24.x before 24.1.1
> allows remote attackers to conduct denial of service attacks against servers
> that depend on this library and deserialize attacker-provided data, because
> the AtomicDoubleArray class (when serialized with Java serialization) and the
> CompoundOrdering class (when serialized with GWT serialization) perform eager
> allocation without appropriate checks on what a client has sent and whether
> the data size is reasonable.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
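The CVE quoted above boils down to sizing an allocation from an untrusted length field during deserialization. The following self-contained Java sketch illustrates that pattern and the kind of bound check that mitigates it; it is an illustration only, not Guava's actual AtomicDoubleArray code, and the class and method names (EagerAllocationDemo, readChecked, payload) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;

// Simplified sketch of the CVE-2018-10237 pattern: an array sized directly
// from a length field in the serialized stream, so a tiny malicious payload
// can trigger a huge eager allocation. Illustrative stand-in only.
public class EagerAllocationDemo {

    // Vulnerable shape: trust the claimed length and allocate immediately.
    static double[] readUnchecked(DataInputStream in) throws IOException {
        int length = in.readInt();
        double[] values = new double[length]; // attacker controls 'length'
        for (int i = 0; i < length; i++) {
            values[i] = in.readDouble();
        }
        return values;
    }

    // Hardened shape: bound the claimed length before allocating anything.
    static double[] readChecked(DataInputStream in, int maxElements) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > maxElements) {
            throw new InvalidObjectException("implausible array length: " + length);
        }
        double[] values = new double[length];
        for (int i = 0; i < length; i++) {
            values[i] = in.readDouble();
        }
        return values;
    }

    // Builds a stream whose claimed length may disagree with its contents.
    static byte[] payload(int claimedLength, double... elements) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(claimedLength);
        for (double e : elements) {
            out.writeDouble(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A well-formed payload round-trips.
        double[] ok = readChecked(
                new DataInputStream(new ByteArrayInputStream(payload(2, 1.5, 2.5))), 1024);
        System.out.println(ok.length); // prints 2

        // A 4-byte payload claiming Integer.MAX_VALUE elements is rejected
        // before any allocation happens.
        try {
            readChecked(new DataInputStream(
                    new ByteArrayInputStream(payload(Integer.MAX_VALUE))), 1024);
        } catch (InvalidObjectException e) {
            System.out.println("rejected"); // prints rejected
        }
    }
}
```

Guava 24.1.1 addressed the issue along these lines, by validating the deserialized element count against what the stream could plausibly contain.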
[jira] [Commented] (HIVE-23873) Querying Hive JDBCStorageHandler table fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160324#comment-17160324 ] Chiran Ravani commented on HIVE-23873: -- [~srahman] Yes, problem exists with Master branch too. Going by Code, it does not seem we are handling column name case conversion in Master branch. {code} 2020-07-18T03:44:39,678 INFO [83565622-bc0d-4dbd-b463-88188d46b64e main]: dao.GenericJdbcDatabaseAccessor (:()) - Query to execute is [select * from TESTHIVEJDBCSTORAGE] 2020-07-18T03:44:39,898 ERROR [83565622-bc0d-4dbd-b463-88188d46b64e main]: CliDriver (:()) - Failed with exception java.io.IOException:java.lang.NullPointerException java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:638) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:545) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:150) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:603) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:243) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:277) at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:862) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:798) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:717) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:318) at 
org.apache.hadoop.util.RunJar.main(RunJar.java:232) Caused by: java.lang.NullPointerException at org.apache.hive.storage.jdbc.JdbcSerDe.deserialize(JdbcSerDe.java:235) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:619) ... 17 more {code} > Querying Hive JDBCStorageHandler table fails with NPE > - > > Key: HIVE-23873 > URL: https://issues.apache.org/jira/browse/HIVE-23873 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 3.1.0, 3.1.1, 3.1.2 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Critical > Attachments: HIVE-23873.01.patch > > > Scenario is Hive table having same schema as table in Oracle, however when we > query the table with data it fails with NPE, below is the trace. > {code} > Caused by: java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:617) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) > ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > ... 
34 more > Caused by: java.lang.NullPointerException > at > org.apache.hive.storage.jdbc.JdbcSerDe.deserialize(JdbcSerDe.java:164) > ~[hive-jdbc-handler-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:598) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) >
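The comment on HIVE-23873 above attributes the NPE in JdbcSerDe.deserialize to missing column name case conversion: Hive column names are lower-case, while the remote database (Oracle in this report) typically reports upper-case names, so a row-map lookup keyed by the raw JDBC name returns null. A minimal Java sketch of the normalization idea, with hypothetical names (ColumnCaseExample, lookup, rowFromJdbc) since this is not the actual JdbcSerDe code:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: normalize column-name case on both sides of the
// lookup so a Hive column "id" still finds an Oracle column "ID".
public class ColumnCaseExample {

    static Object lookup(Map<String, Object> rowFromJdbc, String hiveColumn) {
        // Re-key the JDBC row with lower-cased column names.
        Map<String, Object> normalized = new HashMap<>();
        for (Map.Entry<String, Object> e : rowFromJdbc.entrySet()) {
            normalized.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        // Look up with the lower-cased Hive column name.
        return normalized.get(hiveColumn.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("ID", 1); // Oracle-style upper-case column name
        System.out.println(lookup(row, "id")); // prints 1 instead of null
    }
}
```

Without the normalization, `rowFromJdbc.get("id")` returns null for an upper-case key, which is exactly the shape of failure the stack trace shows.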
[jira] [Assigned] (HIVE-21335) NPE in ObjectStore.getObjectCount
    [ https://issues.apache.org/jira/browse/HIVE-21335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wenjun ma reassigned HIVE-21335:
--------------------------------

    Assignee: wenjun ma

> NPE in ObjectStore.getObjectCount
> ---------------------------------
>
>                 Key: HIVE-21335
>                 URL: https://issues.apache.org/jira/browse/HIVE-21335
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Adam Holley
>            Assignee: wenjun ma
>            Priority: Major
>
> In ObjectStore.getObjectCount() there is no null check on the result before
> calling intValue().
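The fix the report suggests is a null check before unboxing. A minimal stand-alone Java sketch, where runCountQuery is a hypothetical placeholder for the metastore query that can return null:

```java
// Minimal sketch of the null guard HIVE-21335 asks for. In the real
// ObjectStore.getObjectCount(), a JDOQL count query can yield null and the
// unconditional result.intValue() then throws a NullPointerException.
public class ObjectCountExample {

    // Stand-in for the metastore query; returns null to model the failure.
    static Long runCountQuery() {
        return null;
    }

    static int getObjectCount() {
        Long result = runCountQuery();
        // Guard before unboxing instead of calling result.intValue() directly.
        return result == null ? 0 : result.intValue();
    }

    public static void main(String[] args) {
        System.out.println(getObjectCount()); // prints 0
    }
}
```

Whether a null count should map to 0 or to an exception is a design choice for the actual patch; the essential point is that the unboxing must not happen unconditionally.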
[jira] [Assigned] (HIVE-22233) Wrong result with vectorized execution when column value is casted to TINYINT
    [ https://issues.apache.org/jira/browse/HIVE-22233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wenjun ma reassigned HIVE-22233:
--------------------------------

    Assignee: wenjun ma

> Wrong result with vectorized execution when column value is casted to TINYINT
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-22233
>                 URL: https://issues.apache.org/jira/browse/HIVE-22233
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 3.1.1, 2.3.4, 2.3.6
>            Reporter: Ganesha Shreedhara
>            Assignee: wenjun ma
>            Priority: Major
>
> Casting a column value to TINYINT gives an incorrect result when the
> vectorized mode of reduce-side GROUP BY query execution is enabled via the
> *hive.vectorized.execution.reduce.groupby.enabled* parameter (enabled by
> default). The issue occurs only when the sub query has SUM/COUNT aggregation
> operations inside an IF condition.
>
> *Steps to reproduce:*
> {code:java}
> create table test(id int);
> insert into test values (1);
> SELECT CAST(col AS TINYINT) col_cast FROM ( SELECT IF(SUM(1) > 0, 1, 0) col FROM test) x;
> {code}
>
> *Result:*
> {code:java}
> 0
> {code}
> *Expected result:*
> {code:java}
> 1
> {code}
>
> We get the expected result when the
> *hive.vectorized.execution.reduce.groupby.enabled* parameter is disabled.
> We also get the expected result when we don't CAST or don't have a SUM/COUNT
> aggregation in the IF condition.
> The following queries give the correct result even when
> hive.vectorized.execution.reduce.groupby.enabled is set:
> {code:java}
> SELECT CAST(col AS INT) col_cast FROM ( SELECT IF(SUM(1) > 0, 1, 0) col FROM test) x;
> SELECT col FROM ( SELECT IF(SUM(1) > 0, 1, 0) col FROM test) x;
> SELECT CAST(col AS TINYINT) col_cast FROM ( SELECT IF(2 > 1, 1, 0) col FROM test) x;
> SELECT CAST(col AS TINYINT) col_cast FROM ( SELECT IF(true, 1, 0) col FROM test) x;
> {code}
>
> The issue occurs only when *CAST(col AS TINYINT)* is used together with
> *IF(SUM(1) > 0, 1, 0)* or *IF(COUNT(1) > 0, 1, 0)* in the sub query.
[jira] [Assigned] (HIVE-22383) `alterPartitions` is invoked twice during dynamic partition load causing runtime delay
    [ https://issues.apache.org/jira/browse/HIVE-22383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wenjun ma reassigned HIVE-22383:
--------------------------------

    Assignee: wenjun ma

> `alterPartitions` is invoked twice during dynamic partition load causing
> runtime delay
> ------------------------------------------------------------------------
>
>                 Key: HIVE-22383
>                 URL: https://issues.apache.org/jira/browse/HIVE-22383
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: wenjun ma
>            Priority: Major
>              Labels: performance
>
> First invocation in {{Hive::loadDynamicPartitions}}:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2978
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2638
> Second invocation in {{BasicStatsTask::aggregateStats}}:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java#L335
> This leads to a significant delay in dynamic partition loading.
[jira] [Commented] (HIVE-23740) [Hive]delete from <table name>; without where clause not giving correct error msg
    [ https://issues.apache.org/jira/browse/HIVE-23740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160299#comment-17160299 ]

wenjun ma commented on HIVE-23740:
----------------------------------

Hi [~abhishek.akg], this should be by design. For an insert-only table, you can only insert into it and drop it.

> [Hive]delete from <table name>; without where clause not giving correct error
> msg
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-23740
>                 URL: https://issues.apache.org/jira/browse/HIVE-23740
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.1.0
>            Reporter: ABHISHEK KUMAR GUPTA
>            Assignee: wenjun ma
>            Priority: Minor
>
> Created a Hive table from Hive, inserted data, and fired delete from <table name>;
> CREATE TABLE insert_only (key int, value string) STORED AS ORC
> TBLPROPERTIES ("transactional"="true",
> "transactional_properties"="insert_only");
> INSERT INTO insert_only VALUES (13,'BAD'), (14,'SUCCESS');
> delete from insert_only;
> Error thrown:
> Error: Error while compiling statement: FAILED: SemanticException [Error
> 10414]: Attempt to do update or delete on table hive.insert_only that is
> insert-only transactional (state=42000,code=10414)
> Expectation:
> It should complain that the where clause is missing, because to delete all
> the content of a table Hive provides truncate table <table name>;
[jira] [Assigned] (HIVE-23740) [Hive]delete from <table name>; without where clause not giving correct error msg
    [ https://issues.apache.org/jira/browse/HIVE-23740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wenjun ma reassigned HIVE-23740:
--------------------------------

    Assignee: wenjun ma

> [Hive]delete from <table name>; without where clause not giving correct error
> msg
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-23740
>                 URL: https://issues.apache.org/jira/browse/HIVE-23740
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.1.0
>            Reporter: ABHISHEK KUMAR GUPTA
>            Assignee: wenjun ma
>            Priority: Minor
>
> Created a Hive table from Hive, inserted data, and fired delete from <table name>;
> CREATE TABLE insert_only (key int, value string) STORED AS ORC
> TBLPROPERTIES ("transactional"="true",
> "transactional_properties"="insert_only");
> INSERT INTO insert_only VALUES (13,'BAD'), (14,'SUCCESS');
> delete from insert_only;
> Error thrown:
> Error: Error while compiling statement: FAILED: SemanticException [Error
> 10414]: Attempt to do update or delete on table hive.insert_only that is
> insert-only transactional (state=42000,code=10414)
> Expectation:
> It should complain that the where clause is missing, because to delete all
> the content of a table Hive provides truncate table <table name>;
[jira] [Work logged] (HIVE-23351) Ranger Replication Scheduling
    [ https://issues.apache.org/jira/browse/HIVE-23351?focusedWorklogId=460626=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460626 ]

ASF GitHub Bot logged work on HIVE-23351:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jul/20 00:32
            Start Date: 18/Jul/20 00:32
    Worklog Time Spent: 10m
      Work Description: github-actions[bot] closed pull request #1004:
URL: https://github.com/apache/hive/pull/1004

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 460626)
    Time Spent: 3h 10m  (was: 3h)

> Ranger Replication Scheduling
> -----------------------------
>
>                 Key: HIVE-23351
>                 URL: https://issues.apache.org/jira/browse/HIVE-23351
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23351.01.patch, HIVE-23351.02.patch,
> HIVE-23351.03.patch, HIVE-23351.04.patch, HIVE-23351.05.patch,
> HIVE-23351.06.patch, HIVE-23351.07.patch, HIVE-23351.08.patch,
> HIVE-23351.09.patch, HIVE-23351.10.patch, HIVE-23351.10.patch,
> HIVE-23351.11.patch, HIVE-23351.12.patch
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
[jira] [Assigned] (HIVE-23876) Miss init NOTIFICATION_SEQUENCE on hive-schema-3.1.0.mssql.sql
    [ https://issues.apache.org/jira/browse/HIVE-23876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wenjun ma reassigned HIVE-23876:
--------------------------------

    Assignee: wenjun ma

> Miss init NOTIFICATION_SEQUENCE on hive-schema-3.1.0.mssql.sql
> --------------------------------------------------------------
>
>                 Key: HIVE-23876
>                 URL: https://issues.apache.org/jira/browse/HIVE-23876
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: wenjun ma
>            Assignee: wenjun ma
>            Priority: Major
>
> Missing:
> INSERT INTO NOTIFICATION_SEQUENCE (NNI_ID, NEXT_EVENT_ID) SELECT 1,1 FROM
> DUAL WHERE NOT EXISTS ( SELECT NEXT_EVENT_ID FROM NOTIFICATION_SEQUENCE);
> in hive-schema-3.1.0.mssql.sql.
> The other db schemas are OK.
[jira] [Updated] (HIVE-23876) Miss init NOTIFICATION_SEQUENCE on hive-schema-3.1.0.mssql.sql
    [ https://issues.apache.org/jira/browse/HIVE-23876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wenjun ma updated HIVE-23876:
-----------------------------
    Description: 
Miss
INSERT INTO NOTIFICATION_SEQUENCE (NNI_ID, NEXT_EVENT_ID) SELECT 1,1 FROM
DUAL WHERE NOT EXISTS ( SELECT NEXT_EVENT_ID FROM NOTIFICATION_SEQUENCE);
in hive-schema-3.1.0.mssql.sql
Others db schemas are OK.

  was:
Miss
INSERT INTO NOTIFICATION_SEQUENCE (NNI_ID, NEXT_EVENT_ID) SELECT 1,1 FROM
DUAL WHERE NOT EXISTS ( SELECT NEXT_EVENT_ID FROM NOTIFICATION_SEQUENCE);
in hive-schema-3.1.0.mssql.sql

> Miss init NOTIFICATION_SEQUENCE on hive-schema-3.1.0.mssql.sql
> --------------------------------------------------------------
>
>                 Key: HIVE-23876
>                 URL: https://issues.apache.org/jira/browse/HIVE-23876
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: wenjun ma
>            Priority: Major
>
> Miss
> INSERT INTO NOTIFICATION_SEQUENCE (NNI_ID, NEXT_EVENT_ID) SELECT 1,1 FROM
> DUAL WHERE NOT EXISTS ( SELECT NEXT_EVENT_ID FROM NOTIFICATION_SEQUENCE);
> in hive-schema-3.1.0.mssql.sql
> Others db schemas are OK.
[jira] [Work logged] (HIVE-23836) Make "cols" dependent so that it cascade deletes
[ https://issues.apache.org/jira/browse/HIVE-23836?focusedWorklogId=460624=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460624 ] ASF GitHub Bot logged work on HIVE-23836: - Author: ASF GitHub Bot Created on: 18/Jul/20 00:27 Start Date: 18/Jul/20 00:27 Worklog Time Spent: 10m Work Description: ashutoshc commented on pull request #1239: URL: https://github.com/apache/hive/pull/1239#issuecomment-660391990 +1 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460624) Time Spent: 0.5h (was: 20m) > Make "cols" dependent so that it cascade deletes > > > Key: HIVE-23836 > URL: https://issues.apache.org/jira/browse/HIVE-23836 > Project: Hive > Issue Type: Bug >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > {quote} > If you want the deletion of a persistent object to cause the deletion of > related objects then you need to mark the related fields in the mapping to be > "dependent". > {quote} > http://www.datanucleus.org/products/accessplatform/jdo/persistence.html#dependent_fields > http://www.datanucleus.org/products/datanucleus/jdo/persistence.html#_deleting_an_object > The database won't do it: > {code:sql|title=Derby Schema} > ALTER TABLE "APP"."COLUMNS_V2" ADD CONSTRAINT "COLUMNS_V2_FK1" FOREIGN KEY > ("CD_ID") REFERENCES "APP"."CDS" ("CD_ID") ON DELETE NO ACTION ON UPDATE NO > ACTION; > {code} > https://github.com/apache/hive/blob/65cf6957cf9432277a096f91b40985237274579f/standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql#L452 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-23707) Unable to create materialized views with transactions enabled with MySQL metastore
[ https://issues.apache.org/jira/browse/HIVE-23707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160283#comment-17160283 ] wenjun ma edited comment on HIVE-23707 at 7/18/20, 12:26 AM: - I can not reproduce this issue with the same version with the MS SQL database. can you try to create a new cluster to reproduce? was (Author: wenjunma003): I can not reproduce this issue with the same version with the MS SQL database. what're kinds of DB do you use it? can you try to create a new cluster to reproduce? > Unable to create materialized views with transactions enabled with MySQL > metastore > -- > > Key: HIVE-23707 > URL: https://issues.apache.org/jira/browse/HIVE-23707 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.1.2 >Reporter: Dustin Koupal >Assignee: wenjun ma >Priority: Blocker > > When attempting to create a materialized view with transactions enabled, we > get the following exception: > > {code:java} > ERROR : FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Failed to > generate new Mapping of type > org.datanucleus.store.rdbms.mapping.java.StringMapping, exception : JDBC type > CLOB declared for field > "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java > type java.lang.String cant be mapped for this datastore.ERROR : FAILED: > Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:Failed to generate new Mapping of type > org.datanucleus.store.rdbms.mapping.java.StringMapping, exception : JDBC type > CLOB declared for field > "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java > type java.lang.String cant be mapped for this datastore.JDBC type CLOB > declared for field > "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java > type java.lang.String cant be mapped for this > datastore.org.datanucleus.exceptions.NucleusException: JDBC type CLOB > declared for field > "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java > type java.lang.String cant be mapped for this datastore. at > org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.getDatastoreMappingClass(RDBMSMappingManager.java:1386) > at > org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.createDatastoreMapping(RDBMSMappingManager.java:1616) > at > org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.prepareDatastoreMapping(SingleFieldMapping.java:59) > at > org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.initialize(SingleFieldMapping.java:48) > at > org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.getMapping(RDBMSMappingManager.java:482) > at > org.datanucleus.store.rdbms.table.ClassTable.manageMembers(ClassTable.java:536) > at > org.datanucleus.store.rdbms.table.ClassTable.manageClass(ClassTable.java:442) > at > org.datanucleus.store.rdbms.table.ClassTable.initializeForClass(ClassTable.java:1270) > at > org.datanucleus.store.rdbms.table.ClassTable.initialize(ClassTable.java:276) > at > org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.initializeClassTables(RDBMSStoreManager.java:3279) > at > org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2889) > at > org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:119) > at > 
org.datanucleus.store.rdbms.RDBMSStoreManager.manageClasses(RDBMSStoreManager.java:1627) > at > org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:672) > at > org.datanucleus.store.rdbms.RDBMSStoreManager.getPropertiesForGenerator(RDBMSStoreManager.java:2088) > at > org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1271) > at > org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3760) > at > org.datanucleus.state.StateManagerImpl.setIdentity(StateManagerImpl.java:2267) > at > org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:484) > at > org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:120) > at > org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:218) > at > org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2079) > at > org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1923) > at > org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1778) > at >
[jira] [Commented] (HIVE-23707) Unable to create materialized views with transactions enabled with MySQL metastore
[ https://issues.apache.org/jira/browse/HIVE-23707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160283#comment-17160283 ] wenjun ma commented on HIVE-23707: -- I can not reproduce this issue with the same version with the MS SQL database. what're kinds of DB do you use it? can you try to create a new cluster to reproduce? > Unable to create materialized views with transactions enabled with MySQL > metastore > -- > > Key: HIVE-23707 > URL: https://issues.apache.org/jira/browse/HIVE-23707 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.1.2 >Reporter: Dustin Koupal >Assignee: wenjun ma >Priority: Blocker > > When attempting to create a materialized view with transactions enabled, we > get the following exception: > > {code:java} > ERROR : FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Failed to > generate new Mapping of type > org.datanucleus.store.rdbms.mapping.java.StringMapping, exception : JDBC type > CLOB declared for field > "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java > type java.lang.String cant be mapped for this datastore.ERROR : FAILED: > Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. > MetaException(message:Failed to generate new Mapping of type > org.datanucleus.store.rdbms.mapping.java.StringMapping, exception : JDBC type > CLOB declared for field > "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java > type java.lang.String cant be mapped for this datastore.JDBC type CLOB > declared for field > "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java > type java.lang.String cant be mapped for this > datastore.org.datanucleus.exceptions.NucleusException: JDBC type CLOB > declared for field > "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java > type java.lang.String cant be mapped for this datastore. 
at > org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.getDatastoreMappingClass(RDBMSMappingManager.java:1386) > at > org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.createDatastoreMapping(RDBMSMappingManager.java:1616) > at > org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.prepareDatastoreMapping(SingleFieldMapping.java:59) > at > org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.initialize(SingleFieldMapping.java:48) > at > org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.getMapping(RDBMSMappingManager.java:482) > at > org.datanucleus.store.rdbms.table.ClassTable.manageMembers(ClassTable.java:536) > at > org.datanucleus.store.rdbms.table.ClassTable.manageClass(ClassTable.java:442) > at > org.datanucleus.store.rdbms.table.ClassTable.initializeForClass(ClassTable.java:1270) > at > org.datanucleus.store.rdbms.table.ClassTable.initialize(ClassTable.java:276) > at > org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.initializeClassTables(RDBMSStoreManager.java:3279) > at > org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2889) > at > org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:119) > at > org.datanucleus.store.rdbms.RDBMSStoreManager.manageClasses(RDBMSStoreManager.java:1627) > at > org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:672) > at > org.datanucleus.store.rdbms.RDBMSStoreManager.getPropertiesForGenerator(RDBMSStoreManager.java:2088) > at > org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1271) > at > org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3760) > at > org.datanucleus.state.StateManagerImpl.setIdentity(StateManagerImpl.java:2267) > at > org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:484) > at > 
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:120) > at > org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:218) > at > org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2079) > at > org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1923) > at > org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1778) > at > org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217) > at > org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:724) > at > org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749) >
[jira] [Work logged] (HIVE-23855) TestQueryShutdownHooks is flaky
    [ https://issues.apache.org/jira/browse/HIVE-23855?focusedWorklogId=460620=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460620 ]

ASF GitHub Bot logged work on HIVE-23855:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jul/20 00:17
            Start Date: 18/Jul/20 00:17
    Worklog Time Spent: 10m
      Work Description: mustafaiman opened a new pull request #1277:
URL: https://github.com/apache/hive/pull/1277

Increased the timeout for the async query. The tests were not isolated very
well: the async query test did not clean up properly, and the leaked state
caused the sync test to fail. Cleanup is moved to @After so it always runs.

Change-Id: I669ba35c22020910f5e348003b1f05d8a7cde75d

Issue Time Tracking
-------------------

    Worklog Id:     (was: 460620)
    Time Spent: 0.5h  (was: 20m)

> TestQueryShutdownHooks is flaky
> -------------------------------
>
>                 Key: HIVE-23855
>                 URL: https://issues.apache.org/jira/browse/HIVE-23855
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zoltan Haindrich
>            Assignee: Mustafa Iman
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/100/
[jira] [Work logged] (HIVE-23855) TestQueryShutdownHooks is flaky
[ https://issues.apache.org/jira/browse/HIVE-23855?focusedWorklogId=460619=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460619 ] ASF GitHub Bot logged work on HIVE-23855: - Author: ASF GitHub Bot Created on: 18/Jul/20 00:16 Start Date: 18/Jul/20 00:16 Worklog Time Spent: 10m Work Description: mustafaiman closed pull request #1277: URL: https://github.com/apache/hive/pull/1277 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460619) Time Spent: 20m (was: 10m) > TestQueryShutdownHooks is flaky > --- > > Key: HIVE-23855 > URL: https://issues.apache.org/jira/browse/HIVE-23855 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > http://ci.hive.apache.org/job/hive-precommit/job/master/100/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23797) Throw exception when no metastore found in zookeeper
[ https://issues.apache.org/jira/browse/HIVE-23797?focusedWorklogId=460615=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460615 ] ASF GitHub Bot logged work on HIVE-23797: - Author: ASF GitHub Bot Created on: 17/Jul/20 23:58 Start Date: 17/Jul/20 23:58 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #1201: URL: https://github.com/apache/hive/pull/1201#issuecomment-660385890 @belugabehr Is there anything else to do to make the pr get through? thank you very much! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460615) Time Spent: 1h (was: 50m) > Throw exception when no metastore found in zookeeper > - > > Key: HIVE-23797 > URL: https://issues.apache.org/jira/browse/HIVE-23797 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > When enable service discovery for metastore, there is a chance that the > client may find no metastore uris available in zookeeper, such as during > metastores startup or the client wrongly configured the path. This results to > redundant retries and finally MetaException with "Unknown exception" message. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-20441) NPE in GenericUDF when hive.allow.udf.load.on.demand is set to true
[ https://issues.apache.org/jira/browse/HIVE-20441?focusedWorklogId=460611&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460611 ] ASF GitHub Bot logged work on HIVE-20441: - Author: ASF GitHub Bot Created on: 17/Jul/20 23:53 Start Date: 17/Jul/20 23:53 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #1242: URL: https://github.com/apache/hive/pull/1242#issuecomment-660384821 @kgyrtkirk @pvary could you please take a look? Thanks! Issue Time Tracking --- Worklog Id: (was: 460611) Time Spent: 1h 10m (was: 1h) > NPE in GenericUDF when hive.allow.udf.load.on.demand is set to true > > > Key: HIVE-20441 > URL: https://issues.apache.org/jira/browse/HIVE-20441 > Project: Hive > Issue Type: Bug > Components: CLI, HiveServer2 >Affects Versions: 1.2.1, 2.3.3 >Reporter: Hui Huang >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20441.1.patch, HIVE-20441.2.patch, > HIVE-20441.3.patch, HIVE-20441.4.patch, HIVE-20441.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > When hive.allow.udf.load.on.demand is set to true and HiveServer2 has been > started, a function newly created from other clients or HiveServer2 instances is > loaded from the metastore on first use.
> When the udf is used in a where clause, we get an NPE like: > {code:java} > Error executing statement: > org.apache.hive.service.cli.HiveSQLException: Error while compiling > statement: FAILED: NullPointerException null > at > org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:542) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:57) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [?:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [?:1.8.0_77] > at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77] > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:236) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1104)
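The `Caused by: java.lang.NullPointerException` at `ExprNodeGenericFuncDesc.newInstance` points to a function object that was never materialized when loaded on demand. The defensive pattern, sketched here in Python with a hypothetical registry class (the real fix lives in Hive's Java function-registry code), is to resolve the function explicitly and raise a descriptive error instead of letting a null flow into expression construction:

```python
class FunctionRegistry:
    """Toy stand-in for a session-level function registry with
    load-on-demand semantics (models hive.allow.udf.load.on.demand)."""

    def __init__(self, metastore_lookup=None):
        self._loaded = {}                          # functions known to this session
        self._metastore_lookup = metastore_lookup  # backing store, may be None

    def register(self, name, udf):
        self._loaded[name.lower()] = udf

    def get_udf(self, name):
        key = name.lower()
        if key not in self._loaded and self._metastore_lookup is not None:
            udf = self._metastore_lookup(key)      # first use: load on demand
            if udf is not None:
                self._loaded[key] = udf
        if key not in self._loaded:
            # Raise a clear semantic error rather than returning None, which
            # would surface later as the NPE in the stack trace above.
            raise LookupError("Invalid function " + name)
        return self._loaded[key]
```

For example, `FunctionRegistry(metastore_lookup={"myudf": abs}.get)` resolves `get_udf("MYUDF")` on first use, while an unknown name fails with a readable error instead of a NullPointerException deep inside compilation.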
[jira] [Work logged] (HIVE-23855) TestQueryShutdownHooks is flaky
[ https://issues.apache.org/jira/browse/HIVE-23855?focusedWorklogId=460573&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460573 ] ASF GitHub Bot logged work on HIVE-23855: - Author: ASF GitHub Bot Created on: 17/Jul/20 22:12 Start Date: 17/Jul/20 22:12 Worklog Time Spent: 10m Work Description: mustafaiman opened a new pull request #1277: URL: https://github.com/apache/hive/pull/1277 Increased the timeout for the async query. The tests were not well isolated: the async query test did not clean up properly, and its state leaked into the sync test, causing it to fail. Cleanup is moved to @After so it always runs. Change-Id: I669ba35c22020910f5e348003b1f05d8a7cde75d ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute Issue Time Tracking --- Worklog Id: (was: 460573) Remaining Estimate: 0h Time Spent: 10m > TestQueryShutdownHooks is flaky > --- > > Key: HIVE-23855 > URL: https://issues.apache.org/jira/browse/HIVE-23855 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Mustafa Iman >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > http://ci.hive.apache.org/job/hive-precommit/job/master/100/
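The cleanup-in-`@After` pattern from the pull request generalizes: teardown hooks run even when a test body fails, so shared state cannot leak from one test into the next. A sketch of the same idea in Python's `unittest` (the Hive test itself is JUnit; the class-level list below is a made-up stand-in for the leaked shutdown-hook state):

```python
import unittest

class ShutdownHookTest(unittest.TestCase):
    hooks = []  # class-level: shared across tests, like global shutdown hooks

    def tearDown(self):
        # Always runs, even when the test body fails -- the equivalent of
        # moving cleanup into a JUnit @After method so state cannot leak.
        type(self).hooks.clear()

    def test_async_query(self):
        self.hooks.append("async-hook")      # registers shared state
        self.assertEqual(len(self.hooks), 1)

    def test_sync_query(self):
        # Sees no leftover hook only because tearDown cleaned up above.
        self.assertEqual(len(self.hooks), 0)
```

Without the `tearDown`, `test_sync_query` would fail whenever it runs after `test_async_query`, which is exactly the flaky ordering dependence described in the issue.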
[jira] [Updated] (HIVE-23855) TestQueryShutdownHooks is flaky
[ https://issues.apache.org/jira/browse/HIVE-23855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23855: -- Labels: pull-request-available (was: ) > TestQueryShutdownHooks is flaky > --- > > Key: HIVE-23855 > URL: https://issues.apache.org/jira/browse/HIVE-23855 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > http://ci.hive.apache.org/job/hive-precommit/job/master/100/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23874) Add Debug Logging to HiveQueryResultSet
[ https://issues.apache.org/jira/browse/HIVE-23874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hunter Logan reassigned HIVE-23874: --- Assignee: Hunter Logan > Add Debug Logging to HiveQueryResultSet > --- > > Key: HIVE-23874 > URL: https://issues.apache.org/jira/browse/HIVE-23874 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Hunter Logan >Assignee: Hunter Logan >Priority: Minor > > Adding a debug message on this topic with handle, orientation, and fetch size > would be useful. > [https://github.com/apache/hive/blob/bc00454c194413753ac1d7067044ca78c77e1a34/jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java#L342] -- This message was sent by Atlassian Jira (v8.3.4#803005)
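The suggested message is a single debug line carrying the fetch handle, orientation, and fetch size. A sketch of the shape of such a statement in Python's `logging` (the actual change would be an SLF4J `LOG.debug` call in the Java `HiveQueryResultSet`; `fetch_next_batch` here is a made-up illustration, not a JDBC API):

```python
import logging

LOG = logging.getLogger("HiveQueryResultSet")

def fetch_next_batch(handle, orientation, fetch_size):
    # Parameterized logging: the message is only formatted when DEBUG is
    # enabled, so the hot fetch path pays almost nothing otherwise.
    LOG.debug("Fetching results: handle=%s, orientation=%s, fetchSize=%d",
              handle, orientation, fetch_size)
    return []  # placeholder for the actual fetch of rows
```

SLF4J's `{}` placeholders give the same lazy-formatting property in the Java version.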
[jira] [Work logged] (HIVE-23875) Add VSCode files to gitignore
[ https://issues.apache.org/jira/browse/HIVE-23875?focusedWorklogId=460519&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460519 ] ASF GitHub Bot logged work on HIVE-23875: - Author: ASF GitHub Bot Created on: 17/Jul/20 20:47 Start Date: 17/Jul/20 20:47 Worklog Time Spent: 10m Work Description: HunterL opened a new pull request #1276: URL: https://github.com/apache/hive/pull/1276 Added VSCode files to gitignore Issue Time Tracking --- Worklog Id: (was: 460519) Remaining Estimate: 0h Time Spent: 10m > Add VSCode files to gitignore > - > > Key: HIVE-23875 > URL: https://issues.apache.org/jira/browse/HIVE-23875 > Project: Hive > Issue Type: Improvement >Reporter: Hunter Logan >Assignee: Hunter Logan >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > gitignore currently includes Eclipse and Intellij specific files, should > include VSCode as well.
[jira] [Updated] (HIVE-23875) Add VSCode files to gitignore
[ https://issues.apache.org/jira/browse/HIVE-23875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23875: -- Labels: pull-request-available (was: ) > Add VSCode files to gitignore > - > > Key: HIVE-23875 > URL: https://issues.apache.org/jira/browse/HIVE-23875 > Project: Hive > Issue Type: Improvement >Reporter: Hunter Logan >Assignee: Hunter Logan >Priority: Trivial > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > gitignore currently includes Eclipse and Intellij specific files, should > include VSCode as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23875) Add VSCode files to gitignore
[ https://issues.apache.org/jira/browse/HIVE-23875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hunter Logan reassigned HIVE-23875: --- Assignee: Hunter Logan > Add VSCode files to gitignore > - > > Key: HIVE-23875 > URL: https://issues.apache.org/jira/browse/HIVE-23875 > Project: Hive > Issue Type: Improvement >Reporter: Hunter Logan >Assignee: Hunter Logan >Priority: Trivial > > gitignore currently includes Eclipse and Intellij specific files, should > include VSCode as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?focusedWorklogId=460467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460467 ] ASF GitHub Bot logged work on HIVE-23871: - Author: ASF GitHub Bot Created on: 17/Jul/20 18:29 Start Date: 17/Jul/20 18:29 Worklog Time Spent: 10m Work Description: pgaref edited a comment on pull request #1273: URL: https://github.com/apache/hive/pull/1273#issuecomment-660269331 Thanks for the review @mustafaiman ! Addressed your comments as part of the second commit -- I am still expecting some q.out differences so currently waiting for the qtests to run Issue Time Tracking --- Worklog Id: (was: 460467) Time Spent: 1h (was: 50m) > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: table1 > > Time Spent: 1h > Remaining Estimate: 0h > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does that for all Transactional Tables – not only ACID – causing > MicroManaged Tables to behave abnormally.
> MicroManaged (insert_only) tables may miss needed properties such as Storage > Desc Params – that may define how lines are delimited (like in the example > below): > To repro the issue: > {code:java} > CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; > LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans; > describe formatted delim_table_trans; > SELECT * FROM delim_table_trans; > {code} > Result: > {code:java} > Table Type: MANAGED_TABLE > Table Parameters: > bucketing_version 2 > numFiles 1 > numRows 0 > rawDataSize 0 > totalSize 72 > transactional true > transactional_properties insert_only > A masked pattern was here > > # Storage Information > SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > PREHOOK: query: SELECT * FROM delim_table_trans > PREHOOK: type: QUERY > PREHOOK: Input: default@delim_table_trans > A masked pattern was here > POSTHOOK: query: SELECT * FROM delim_table_trans > POSTHOOK: type: QUERY > POSTHOOK: Input: default@delim_table_trans > A masked pattern was here > NULL NULL NULL > NULL NULL NULL > NULL NULL NULL > NULL NULL NULL > NULL NULL NULL > NULL NULL NULL > {code}
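The all-NULL result above is exactly what delimiter loss produces: with the Storage Desc Params dropped, LazySimpleSerDe falls back to its default Ctrl-A (`\x01`) field delimiter, so each tab-separated line parses as a single field and every typed column becomes NULL. A toy illustration of the effect (a hypothetical parser, not Hive's SerDe):

```python
def parse_row(line, delim):
    """Split a text row on delim and coerce the fields to the declared
    (INT, STRING, INT) schema; an unparsable or missing field becomes
    None, mirroring Hive's NULL."""
    fields = line.split(delim)
    out = []
    for i, cast in enumerate((int, str, int)):
        try:
            out.append(cast(fields[i]))
        except (IndexError, ValueError):
            out.append(None)
    return tuple(out)

row = "1\tAcura\t4"
# With the stored field.delim the row parses into typed columns...
assert parse_row(row, "\t") == (1, "Acura", 4)
# ...but with the SerDe's default Ctrl-A delimiter the line is one big
# field, so every column comes back NULL -- the behaviour in the bug.
assert parse_row(row, "\x01") == (None, None, None)
```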
[jira] [Work logged] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?focusedWorklogId=460456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460456 ] ASF GitHub Bot logged work on HIVE-23871: - Author: ASF GitHub Bot Created on: 17/Jul/20 18:24 Start Date: 17/Jul/20 18:24 Worklog Time Spent: 10m Work Description: pgaref commented on pull request #1273: URL: https://github.com/apache/hive/pull/1273#issuecomment-660269331 Thanks for the review @mustafaiman ! Addressed your comments as part of the second commit -- I am still expecting some q.out differences so I am currently waiting for the qtests to run Issue Time Tracking --- Worklog Id: (was: 460456) Time Spent: 50m (was: 40m) > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: table1 > > Time Spent: 50m > Remaining Estimate: 0h > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does that for all Transactional Tables – not only ACID – causing > MicroManaged Tables to behave abnormally.
[jira] [Work logged] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?focusedWorklogId=460455&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460455 ] ASF GitHub Bot logged work on HIVE-23871: - Author: ASF GitHub Bot Created on: 17/Jul/20 18:23 Start Date: 17/Jul/20 18:23 Worklog Time Spent: 10m Work Description: pgaref commented on a change in pull request #1273: URL: https://github.com/apache/hive/pull/1273#discussion_r456604112 ## File path: ql/src/test/results/clientpositive/llap/load_micromanaged_delim.q.out ## @@ -0,0 +1,186 @@ + A masked pattern was here +PREHOOK: type: CREATETABLE + A masked pattern was here +PREHOOK: Output: database:default +PREHOOK: Output: default@delim_table_ext + A masked pattern was here +POSTHOOK: type: CREATETABLE + A masked pattern was here +POSTHOOK: Output: database:default +POSTHOOK: Output: default@delim_table_ext +PREHOOK: query: describe formatted delim_table_ext +PREHOOK: type: DESCTABLE +PREHOOK: Input: default@delim_table_ext +POSTHOOK: query: describe formatted delim_table_ext +POSTHOOK: type: DESCTABLE +POSTHOOK: Input: default@delim_table_ext +# col_name data_type comment +id int +name string +safety int + +# Detailed Table Information +Database: default + A masked pattern was here +Retention: 0 + A masked pattern was here +Table Type: EXTERNAL_TABLE +Table Parameters: + EXTERNAL TRUE + bucketing_version 2 + numFiles 1 + totalSize 52 + A masked pattern was here + +# Storage Information +SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +InputFormat: org.apache.hadoop.mapred.TextInputFormat +OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat +Compressed: No +Num Buckets: -1 +Bucket Columns: [] +Sort Columns: [] +Storage Desc Params: + field.delim \t + serialization.format \t +PREHOOK: query: SELECT * FROM delim_table_ext +PREHOOK: type: QUERY +PREHOOK: Input: default@delim_table_ext + A masked pattern was here +POSTHOOK: query: SELECT * FROM delim_table_ext +POSTHOOK: type: QUERY 
+POSTHOOK: Input: default@delim_table_ext + A masked pattern was here +1 Acura 4 +2 Toyota 3 +3 Tesla 5 +4 Honda 5 +11 Mazda 2 +PREHOOK: query: CREATE TABLE delim_table_micro(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE TBLPROPERTIES('transactional'='true', "transactional_properties"="insert_only") +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:default +PREHOOK: Output: default@delim_table_micro +POSTHOOK: query: CREATE TABLE delim_table_micro(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE TBLPROPERTIES('transactional'='true', "transactional_properties"="insert_only") +POSTHOOK: type: CREATETABLE +POSTHOOK: Output: database:default +POSTHOOK: Output: default@delim_table_micro + A masked pattern was here +PREHOOK: type: LOAD + A masked pattern was here +PREHOOK: Output: default@delim_table_micro + A masked pattern was here +POSTHOOK: type: LOAD + A masked pattern was here +POSTHOOK: Output: default@delim_table_micro +PREHOOK: query: describe formatted delim_table_micro +PREHOOK: type: DESCTABLE +PREHOOK: Input: default@delim_table_micro +POSTHOOK: query: describe formatted delim_table_micro +POSTHOOK: type: DESCTABLE +POSTHOOK: Input: default@delim_table_micro +# col_name data_type comment +id int +name string +safety int + +# Detailed Table Information +Database: default + A masked pattern was here +Retention: 0 + A masked pattern was here +Table Type:MANAGED_TABLE +Table Parameters: + bucketing_version 2 +
[jira] [Work logged] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?focusedWorklogId=460454&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460454 ] ASF GitHub Bot logged work on HIVE-23871: - Author: ASF GitHub Bot Created on: 17/Jul/20 18:22 Start Date: 17/Jul/20 18:22 Worklog Time Spent: 10m Work Description: pgaref commented on a change in pull request #1273: URL: https://github.com/apache/hive/pull/1273#discussion_r456603816 ## File path: ql/src/test/queries/clientpositive/load_micromanaged_delim.q ## @@ -0,0 +1,32 @@ +set hive.support.concurrency=true; +set hive.exec.dynamic.partition.mode=nonstrict; +set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; + + +dfs -mkdir ${system:test.tmp.dir}/delim_table; +dfs -mkdir ${system:test.tmp.dir}/delim_table_ext; +dfs -mkdir ${system:test.tmp.dir}/delim_table_trans; +dfs -cp ${system:hive.root}/data/files/table1 ${system:test.tmp.dir}/delim_table/; +dfs -cp ${system:hive.root}/data/files/table1 ${system:test.tmp.dir}/delim_table_ext/; +dfs -cp ${system:hive.root}/data/files/table1 ${system:test.tmp.dir}/delim_table_trans/; + +-- Checking that MicroManged and External tables have the same behaviour with delimited input files +-- External table +CREATE EXTERNAL TABLE delim_table_ext(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE LOCATION '${system:test.tmp.dir}/delim_table_ext/'; +describe formatted delim_table_ext; +SELECT * FROM delim_table_ext; + +-- SET hive.create.as.acid=true +-- SET hive.create.as.insert.only=true Review comment: Creates the same behaviour as the Table properties below but I agree it makes sense to remove it ## File path: data/files/table1 ## @@ -0,0 +1,5 @@ +1 Acura 4 Review comment: sure, done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking --- Worklog Id: (was: 460454) Time Spent: 0.5h (was: 20m) > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: table1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does that for all Transactional Tables – not only ACID – causing > MicroManaged Tables to behave abnormally.
[jira] [Work logged] (HIVE-23786) HMS Server side filter
[ https://issues.apache.org/jira/browse/HIVE-23786?focusedWorklogId=460434&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460434 ] ASF GitHub Bot logged work on HIVE-23786: - Author: ASF GitHub Bot Created on: 17/Jul/20 18:07 Start Date: 17/Jul/20 18:07 Worklog Time Spent: 10m Work Description: sam-an-cloudera commented on a change in pull request #1221: URL: https://github.com/apache/hive/pull/1221#discussion_r456596817 ## File path: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java ## @@ -416,7 +417,7 @@ public void testGetPartitionsByNamesNullDbName() throws Exception { createTable3PartCols1Part(client); client.getPartitionsByNames(null, TABLE_NAME, Lists.newArrayList("=2000/mm=01/dd=02")); fail("Should have thrown exception"); -} catch (NullPointerException | TTransportException e) { +} catch (NullPointerException | TTransportException | MetaException e) { Review comment: coalesce with above as 1 single test issue. ## File path: standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java ## @@ -427,7 +428,7 @@ public void testGetPartitionsByNamesNullTblName() throws Exception { createTable3PartCols1Part(client); client.getPartitionsByNames(DB_NAME, null, Lists.newArrayList("=2000/mm=01/dd=02")); fail("Should have thrown exception"); -} catch (NullPointerException | TTransportException e) { +} catch (NullPointerException | TTransportException | TProtocolException | MetaException e ) { Review comment: coalesce with above as 1 single test issue. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460434) Time Spent: 3h 20m (was: 3h 10m) > HMS Server side filter > -- > > Key: HIVE-23786 > URL: https://issues.apache.org/jira/browse/HIVE-23786 > Project: Hive > Issue Type: Improvement >Reporter: Sam An >Assignee: Sam An >Priority: Major > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > > HMS server side filter of results based on authorization. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23786) HMS Server side filter
[ https://issues.apache.org/jira/browse/HIVE-23786?focusedWorklogId=460432&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460432 ] ASF GitHub Bot logged work on HIVE-23786: - Author: ASF GitHub Bot Created on: 17/Jul/20 18:06 Start Date: 17/Jul/20 18:06 Worklog Time Spent: 10m Work Description: sam-an-cloudera commented on a change in pull request #1221: URL: https://github.com/apache/hive/pull/1221#discussion_r456596445 ## File path: ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientGetPartitionsTempTable.java ## @@ -123,13 +123,13 @@ public void testGetPartitionsByNamesEmptyParts() throws Exception { getClient().getPartitionsByNames(DB_NAME, TABLE_NAME, Lists.newArrayList("", "")); } - @Test(expected = MetaException.class) + @Test @Override public void testGetPartitionsByNamesNullDbName() throws Exception { super.testGetPartitionsByNamesNullDbName(); } - @Test(expected = MetaException.class) + @Test Review comment: coalesce with above Issue Time Tracking --- Worklog Id: (was: 460432) Time Spent: 3h 10m (was: 3h) > HMS Server side filter > -- > > Key: HIVE-23786 > URL: https://issues.apache.org/jira/browse/HIVE-23786 > Project: Hive > Issue Type: Improvement >Reporter: Sam An >Assignee: Sam An >Priority: Major > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > > HMS server side filter of results based on authorization.
[jira] [Work logged] (HIVE-23786) HMS Server side filter
[ https://issues.apache.org/jira/browse/HIVE-23786?focusedWorklogId=460431&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460431 ] ASF GitHub Bot logged work on HIVE-23786: - Author: ASF GitHub Bot Created on: 17/Jul/20 18:06 Start Date: 17/Jul/20 18:06 Worklog Time Spent: 10m Work Description: sam-an-cloudera commented on a change in pull request #1221: URL: https://github.com/apache/hive/pull/1221#discussion_r456596312 ## File path: ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientGetPartitionsTempTable.java ## @@ -123,13 +123,13 @@ public void testGetPartitionsByNamesEmptyParts() throws Exception { getClient().getPartitionsByNames(DB_NAME, TABLE_NAME, Lists.newArrayList("", "")); } - @Test(expected = MetaException.class) + @Test Review comment: I don't recall why we (Ramesh and I) made the change downstream here, but I will see if it can be reverted. I didn't change the HMS API per se; the MetaException could be thrown from getPartitionsByNames before my changes. Those are tests only. Anyway, let me see if I can change the tests back to the way they were, and if not, I will give justification. Issue Time Tracking --- Worklog Id: (was: 460431) Time Spent: 3h (was: 2h 50m) > HMS Server side filter > -- > > Key: HIVE-23786 > URL: https://issues.apache.org/jira/browse/HIVE-23786 > Project: Hive > Issue Type: Improvement >Reporter: Sam An >Assignee: Sam An >Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > > HMS server side filter of results based on authorization.
[jira] [Work logged] (HIVE-23716) Support Anti Join in Hive
[ https://issues.apache.org/jira/browse/HIVE-23716?focusedWorklogId=460429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460429 ] ASF GitHub Bot logged work on HIVE-23716: - Author: ASF GitHub Bot Created on: 17/Jul/20 18:02 Start Date: 17/Jul/20 18:02 Worklog Time Spent: 10m Work Description: maheshk114 commented on a change in pull request #1147: URL: https://github.com/apache/hive/pull/1147#discussion_r456594608 ## File path: ql/src/test/results/clientpositive/llap/antijoin.q.out ## @@ -0,0 +1,1007 @@ +PREHOOK: query: create table t1_n55 as select cast(key as int) key, value from src where key <= 10 +PREHOOK: type: CREATETABLE_AS_SELECT +PREHOOK: Input: default@src +PREHOOK: Output: database:default +PREHOOK: Output: default@t1_n55 +POSTHOOK: query: create table t1_n55 as select cast(key as int) key, value from src where key <= 10 +POSTHOOK: type: CREATETABLE_AS_SELECT +POSTHOOK: Input: default@src +POSTHOOK: Output: database:default +POSTHOOK: Output: default@t1_n55 +POSTHOOK: Lineage: t1_n55.key EXPRESSION [(src)src.FieldSchema(name:key, type:string, comment:default), ] +POSTHOOK: Lineage: t1_n55.value SIMPLE [(src)src.FieldSchema(name:value, type:string, comment:default), ] +PREHOOK: query: select * from t1_n55 sort by key +PREHOOK: type: QUERY +PREHOOK: Input: default@t1_n55 + A masked pattern was here +POSTHOOK: query: select * from t1_n55 sort by key +POSTHOOK: type: QUERY +POSTHOOK: Input: default@t1_n55 + A masked pattern was here +0 val_0 Review comment: All these new test cases were added from the failing test cases of a dry run with anti join enabled. I have manually verified that the resulting records are the same and the plan differences are as per the expected behavior. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460429) Time Spent: 50m (was: 40m) > Support Anti Join in Hive > -- > > Key: HIVE-23716 > URL: https://issues.apache.org/jira/browse/HIVE-23716 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23716.01.patch > > Time Spent: 50m > Remaining Estimate: 0h > > Currently Hive does not support anti join. A query requiring an anti join is > converted to a left outer join, with a null filter on the right-side join key > added to get the desired result. This causes: > # Extra computation — the left outer join projects redundant columns from the > right side, and extra filtering is done to remove the redundant rows. An anti > join avoids this by projecting only the required columns and rows from the > left-side table. > # Extra shuffle — with an anti join, the duplicate records moved to the join > node can be eliminated at the child node. This can significantly reduce data > movement when the number of distinct rows (join keys) is small relative to > the total. > # Extra memory usage — for a map-based anti join, a hash set is sufficient, > since only the key is needed to check whether a record matches the join > condition. A left join needs the key and the non-key columns as well, so a > hash table is required. > For a query like > {code:java} > select wr_order_number FROM web_returns LEFT JOIN web_sales ON > wr_order_number = ws_order_number WHERE ws_order_number IS NULL;{code} > The number of distinct ws_order_number values in the web_sales table in a > typical 10TB TPC-DS setup is just 10% of the total records. So when we > convert this query to an anti join, only 600 million rows are moved to the > join node instead of 7 billion. > In the current patch, just one conversion is done: the pattern > project->filter->left-join is converted to project->anti-join. This takes > care of subqueries with a “not exists” clause, which are converted first to > filter + left-join and then to anti join. Queries with “not in” are not > handled in the current patch. > On the execution side, both merge join and map join with vectorized execution > are supported for anti join. -- This message was sent by Atlassian Jira (v8.3.4#803005)
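The trade-off described in the issue above can be illustrated outside Hive. The following is a minimal Java sketch with invented class and method names (not Hive code), showing why an anti join only needs a hash set of right-side join keys, while the left-outer-join-plus-IS-NULL plan probes a map that carries right-side data which is then filtered away:

```java
import java.util.*;

public class AntiJoinSketch {
    // Left-outer-join + IS NULL filter: the right side is kept as a map of
    // key -> right-side row, the join produces matched rows, and a filter
    // then discards every left row that found a match.
    static List<String> viaLeftJoin(List<String> leftKeys, Map<String, String> rightByKey) {
        List<String> out = new ArrayList<>();
        for (String k : leftKeys) {
            if (rightByKey.get(k) == null) { // right columns projected, then filtered away
                out.add(k);
            }
        }
        return out;
    }

    // Anti join: a hash set of right-side keys suffices; no right-side
    // columns are projected and no post-join filter is needed.
    static List<String> viaAntiJoin(List<String> leftKeys, Set<String> rightKeys) {
        List<String> out = new ArrayList<>();
        for (String k : leftKeys) {
            if (!rightKeys.contains(k)) {
                out.add(k);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> returns = List.of("o1", "o2", "o3"); // stand-in for web_returns keys
        Map<String, String> sales = Map.of("o2", "row2"); // stand-in for web_sales
        System.out.println(viaLeftJoin(returns, sales));          // [o1, o3]
        System.out.println(viaAntiJoin(returns, sales.keySet())); // [o1, o3]
    }
}
```

Both plans return the same rows; the anti-join variant simply never materializes the right-side values, which mirrors the memory argument made in the description.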
[jira] [Work logged] (HIVE-23716) Support Anti Join in Hive
[ https://issues.apache.org/jira/browse/HIVE-23716?focusedWorklogId=460427&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460427 ] ASF GitHub Bot logged work on HIVE-23716: - Author: ASF GitHub Bot Created on: 17/Jul/20 18:01 Start Date: 17/Jul/20 18:01 Worklog Time Spent: 10m Work Description: maheshk114 commented on a change in pull request #1147: URL: https://github.com/apache/hive/pull/1147#discussion_r456593908 ## File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ## @@ -2162,7 +2162,8 @@ private static void populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal "Whether Hive enables the optimization about converting common join into mapjoin based on the input file size. \n" + "If this parameter is on, and the sum of size for n-1 of the tables/partitions for a n-way join is smaller than the\n" + "specified size, the join is directly converted to a mapjoin (there is no conditional task)."), - +HIVE_CONVERT_ANTI_JOIN("hive.auto.convert.anti.join", false, Review comment: Yes, I triggered a ptest run with this config enabled by default. There were some 26 failures. I analyzed them and made fixes to ensure that the results are the same in both cases and that the plan differences are as expected. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460427) Time Spent: 40m (was: 0.5h) > Support Anti Join in Hive > -- > > Key: HIVE-23716 > URL: https://issues.apache.org/jira/browse/HIVE-23716 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23716.01.patch > > Time Spent: 40m > Remaining Estimate: 0h > > Currently Hive does not support anti join. A query requiring an anti join is > converted to a left outer join, with a null filter on the right-side join key > added to get the desired result. This causes: > # Extra computation — the left outer join projects redundant columns from the > right side, and extra filtering is done to remove the redundant rows. An anti > join avoids this by projecting only the required columns and rows from the > left-side table. > # Extra shuffle — with an anti join, the duplicate records moved to the join > node can be eliminated at the child node. This can significantly reduce data > movement when the number of distinct rows (join keys) is small relative to > the total. > # Extra memory usage — for a map-based anti join, a hash set is sufficient, > since only the key is needed to check whether a record matches the join > condition. A left join needs the key and the non-key columns as well, so a > hash table is required. > For a query like > {code:java} > select wr_order_number FROM web_returns LEFT JOIN web_sales ON > wr_order_number = ws_order_number WHERE ws_order_number IS NULL;{code} > The number of distinct ws_order_number values in the web_sales table in a > typical 10TB TPC-DS setup is just 10% of the total records. So when we > convert this query to an anti join, only 600 million rows are moved to the > join node instead of 7 billion. > In the current patch, just one conversion is done: the pattern > project->filter->left-join is converted to project->anti-join. This takes > care of subqueries with a “not exists” clause, which are converted first to > filter + left-join and then to anti join. Queries with “not in” are not > handled in the current patch. > On the execution side, both merge join and map join with vectorized execution > are supported for anti join. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23716) Support Anti Join in Hive
[ https://issues.apache.org/jira/browse/HIVE-23716?focusedWorklogId=460416=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460416 ] ASF GitHub Bot logged work on HIVE-23716: - Author: ASF GitHub Bot Created on: 17/Jul/20 17:52 Start Date: 17/Jul/20 17:52 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1147: URL: https://github.com/apache/hive/pull/1147#discussion_r456588923 ## File path: ql/src/test/results/clientpositive/llap/antijoin.q.out ## @@ -0,0 +1,1007 @@ +PREHOOK: query: create table t1_n55 as select cast(key as int) key, value from src where key <= 10 +PREHOOK: type: CREATETABLE_AS_SELECT +PREHOOK: Input: default@src +PREHOOK: Output: database:default +PREHOOK: Output: default@t1_n55 +POSTHOOK: query: create table t1_n55 as select cast(key as int) key, value from src where key <= 10 +POSTHOOK: type: CREATETABLE_AS_SELECT +POSTHOOK: Input: default@src +POSTHOOK: Output: database:default +POSTHOOK: Output: default@t1_n55 +POSTHOOK: Lineage: t1_n55.key EXPRESSION [(src)src.FieldSchema(name:key, type:string, comment:default), ] +POSTHOOK: Lineage: t1_n55.value SIMPLE [(src)src.FieldSchema(name:value, type:string, comment:default), ] +PREHOOK: query: select * from t1_n55 sort by key +PREHOOK: type: QUERY +PREHOOK: Input: default@t1_n55 + A masked pattern was here +POSTHOOK: query: select * from t1_n55 sort by key +POSTHOOK: type: QUERY +POSTHOOK: Input: default@t1_n55 + A masked pattern was here +0 val_0 Review comment: How was the correctness of results verified? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460416) Time Spent: 0.5h (was: 20m) > Support Anti Join in Hive > -- > > Key: HIVE-23716 > URL: https://issues.apache.org/jira/browse/HIVE-23716 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23716.01.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently Hive does not support anti join. A query requiring an anti join is > converted to a left outer join, with a null filter on the right-side join key > added to get the desired result. This causes: > # Extra computation — the left outer join projects redundant columns from the > right side, and extra filtering is done to remove the redundant rows. An anti > join avoids this by projecting only the required columns and rows from the > left-side table. > # Extra shuffle — with an anti join, the duplicate records moved to the join > node can be eliminated at the child node. This can significantly reduce data > movement when the number of distinct rows (join keys) is small relative to > the total. > # Extra memory usage — for a map-based anti join, a hash set is sufficient, > since only the key is needed to check whether a record matches the join > condition. A left join needs the key and the non-key columns as well, so a > hash table is required. > For a query like > {code:java} > select wr_order_number FROM web_returns LEFT JOIN web_sales ON > wr_order_number = ws_order_number WHERE ws_order_number IS NULL;{code} > The number of distinct ws_order_number values in the web_sales table in a > typical 10TB TPC-DS setup is just 10% of the total records. So when we > convert this query to an anti join, only 600 million rows are moved to the > join node instead of 7 billion. > In the current patch, just one conversion is done: the pattern > project->filter->left-join is converted to project->anti-join. This takes > care of subqueries with a “not exists” clause, which are converted first to > filter + left-join and then to anti join. Queries with “not in” are not > handled in the current patch. > On the execution side, both merge join and map join with vectorized execution > are supported for anti join. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23716) Support Anti Join in Hive
[ https://issues.apache.org/jira/browse/HIVE-23716?focusedWorklogId=460414&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460414 ] ASF GitHub Bot logged work on HIVE-23716: - Author: ASF GitHub Bot Created on: 17/Jul/20 17:51 Start Date: 17/Jul/20 17:51 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1147: URL: https://github.com/apache/hive/pull/1147#discussion_r456588241 ## File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ## @@ -2162,7 +2162,8 @@ private static void populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal "Whether Hive enables the optimization about converting common join into mapjoin based on the input file size. \n" + "If this parameter is on, and the sum of size for n-1 of the tables/partitions for a n-way join is smaller than the\n" + "specified size, the join is directly converted to a mapjoin (there is no conditional task)."), - +HIVE_CONVERT_ANTI_JOIN("hive.auto.convert.anti.join", false, Review comment: @maheshk114 Have you run all the tests with this feature set to true by default? This change touches existing logic/code, and we should definitely run all the existing tests with this set to TRUE. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460414) Time Spent: 20m (was: 10m) > Support Anti Join in Hive > -- > > Key: HIVE-23716 > URL: https://issues.apache.org/jira/browse/HIVE-23716 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23716.01.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Currently Hive does not support anti join. A query requiring an anti join is > converted to a left outer join, with a null filter on the right-side join key > added to get the desired result. This causes: > # Extra computation — the left outer join projects redundant columns from the > right side, and extra filtering is done to remove the redundant rows. An anti > join avoids this by projecting only the required columns and rows from the > left-side table. > # Extra shuffle — with an anti join, the duplicate records moved to the join > node can be eliminated at the child node. This can significantly reduce data > movement when the number of distinct rows (join keys) is small relative to > the total. > # Extra memory usage — for a map-based anti join, a hash set is sufficient, > since only the key is needed to check whether a record matches the join > condition. A left join needs the key and the non-key columns as well, so a > hash table is required. > For a query like > {code:java} > select wr_order_number FROM web_returns LEFT JOIN web_sales ON > wr_order_number = ws_order_number WHERE ws_order_number IS NULL;{code} > The number of distinct ws_order_number values in the web_sales table in a > typical 10TB TPC-DS setup is just 10% of the total records. So when we > convert this query to an anti join, only 600 million rows are moved to the > join node instead of 7 billion. > In the current patch, just one conversion is done: the pattern > project->filter->left-join is converted to project->anti-join. This takes > care of subqueries with a “not exists” clause, which are converted first to > filter + left-join and then to anti join. Queries with “not in” are not > handled in the current patch. > On the execution side, both merge join and map join with vectorized execution > are supported for anti join. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23873) Querying Hive JDBCStorageHandler table fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17160110#comment-17160110 ] Syed Shameerur Rahman commented on HIVE-23873: -- [~chiran54321] Do you see the same issue with the Hive master branch? > Querying Hive JDBCStorageHandler table fails with NPE > - > > Key: HIVE-23873 > URL: https://issues.apache.org/jira/browse/HIVE-23873 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 3.1.0, 3.1.1, 3.1.2 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Critical > Attachments: HIVE-23873.01.patch > > > The scenario: a Hive table has the same schema as a table in Oracle, but > querying the table with data fails with an NPE; below is the trace. > {code} > Caused by: java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:617) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) > ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > ... 
34 more > Caused by: java.lang.NullPointerException > at > org.apache.hive.storage.jdbc.JdbcSerDe.deserialize(JdbcSerDe.java:164) > ~[hive-jdbc-handler-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:598) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) > ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > ... 34 more > {code} > The problem appears when the column names in Oracle are in upper case: Hive > forces table and column names to lower case during creation, so the user > runs into an NPE while fetching data. 
> While deserializing data, the input row's keys are the lower-case column > names, so the lookup fails to get the value: > https://github.com/apache/hive/blob/rel/release-3.1.2/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcSerDe.java#L136 > {code} > rowVal = ((ObjectWritable)value).get(); > {code} > Log snippet: > = > {code} > 2020-07-17T16:49:09,598 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 > HiveServer2-Handler-Pool: Thread-104]: dao.GenericJdbcDatabaseAccessor (:()) > - Query to execute is [select * from TESTHIVEJDBCSTORAGE] > 2020-07-17T16:49:10,642 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 > HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** ColumnKey = > ID > 2020-07-17T16:49:10,642 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 > HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** Blob value > = {fname=OW[class=class java.lang.String,value=Name1], id=OW[class=class > java.lang.Integer,value=1]} > {code} > A simple reproducer for this case: > = > 1. Create the table in Oracle: > {code} > create table TESTHIVEJDBCSTORAGE(ID INT, FNAME VARCHAR(20)); > {code} > 2. Insert dummy data: > {code} > Insert into TESTHIVEJDBCSTORAGE values (1, 'Name1'); > {code} > 3. Create the JDBCStorageHandler table in Hive: > {code} > CREATE EXTERNAL TABLE default.TESTHIVEJDBCSTORAGE_HIVE_TBL (ID INT, FNAME > VARCHAR(20)) > STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' > TBLPROPERTIES ( > "hive.sql.database.type" = "ORACLE", > "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver", > "hive.sql.jdbc.url" = "jdbc:oracle:thin:@orachehostname/XE", > "hive.sql.dbcp.username" = "chiran", > "hive.sql.dbcp.password" = "supersecurepassword", >
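The case mismatch described above (Hive's lower-case column names versus the database's upper-case row keys) suggests normalizing the keys before lookup. The following is a hypothetical Java illustration with invented names, not the actual HIVE-23873 patch:

```java
import java.util.*;

public class ColumnKeySketch {
    // Normalize the row's keys to lower case once, so a Hive column name
    // ("id") can find a value that the database returned under an
    // upper-case key ("ID"). Without this, get("id") returns null and the
    // caller dereferences it, producing the NPE from the report.
    static Map<String, Object> lowerCaseKeys(Map<String, Object> row) {
        Map<String, Object> normalized = new HashMap<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            normalized.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Stand-in for a row fetched from Oracle, keyed by upper-case names.
        Map<String, Object> oracleRow = new HashMap<>();
        oracleRow.put("ID", 1);
        oracleRow.put("FNAME", "Name1");

        Map<String, Object> row = lowerCaseKeys(oracleRow);
        System.out.println(row.get("id"));    // 1 (instead of null)
        System.out.println(row.get("fname")); // Name1
    }
}
```

A lookup that tries the exact key first and falls back to the lower-cased key would be an alternative design; either way the point is that the deserializer must not assume the database's key case matches Hive's.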
[jira] [Work logged] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?focusedWorklogId=460404=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460404 ] ASF GitHub Bot logged work on HIVE-23871: - Author: ASF GitHub Bot Created on: 17/Jul/20 17:32 Start Date: 17/Jul/20 17:32 Worklog Time Spent: 10m Work Description: mustafaiman commented on a change in pull request #1273: URL: https://github.com/apache/hive/pull/1273#discussion_r456570304 ## File path: data/files/table1 ## @@ -0,0 +1,5 @@ +1 Acura 4 Review comment: can we give the file a non generic name? ## File path: ql/src/test/results/clientpositive/llap/load_micromanaged_delim.q.out ## @@ -0,0 +1,186 @@ + A masked pattern was here +PREHOOK: type: CREATETABLE + A masked pattern was here +PREHOOK: Output: database:default +PREHOOK: Output: default@delim_table_ext + A masked pattern was here +POSTHOOK: type: CREATETABLE + A masked pattern was here +POSTHOOK: Output: database:default +POSTHOOK: Output: default@delim_table_ext +PREHOOK: query: describe formatted delim_table_ext +PREHOOK: type: DESCTABLE +PREHOOK: Input: default@delim_table_ext +POSTHOOK: query: describe formatted delim_table_ext +POSTHOOK: type: DESCTABLE +POSTHOOK: Input: default@delim_table_ext +# col_name data_type comment +id int +name string +safety int + +# Detailed Table Information +Database: default + A masked pattern was here +Retention: 0 + A masked pattern was here +Table Type:EXTERNAL_TABLE +Table Parameters: + EXTERNALTRUE + bucketing_version 2 + numFiles1 + totalSize 52 + A masked pattern was here + +# Storage Information +SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +InputFormat: org.apache.hadoop.mapred.TextInputFormat +OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat +Compressed:No +Num Buckets: -1 +Bucket Columns:[] +Sort Columns: [] +Storage Desc Params: + field.delim \t + serialization.format\t +PREHOOK: query: SELECT * FROM delim_table_ext +PREHOOK: type: QUERY +PREHOOK: Input: 
default@delim_table_ext + A masked pattern was here +POSTHOOK: query: SELECT * FROM delim_table_ext +POSTHOOK: type: QUERY +POSTHOOK: Input: default@delim_table_ext + A masked pattern was here +1 Acura 4 +2 Toyota 3 +3 Tesla 5 +4 Honda 5 +11 Mazda 2 +PREHOOK: query: CREATE TABLE delim_table_micro(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE TBLPROPERTIES('transactional'='true', "transactional_properties"="insert_only") +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:default +PREHOOK: Output: default@delim_table_micro +POSTHOOK: query: CREATE TABLE delim_table_micro(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE TBLPROPERTIES('transactional'='true', "transactional_properties"="insert_only") +POSTHOOK: type: CREATETABLE +POSTHOOK: Output: database:default +POSTHOOK: Output: default@delim_table_micro + A masked pattern was here +PREHOOK: type: LOAD + A masked pattern was here +PREHOOK: Output: default@delim_table_micro + A masked pattern was here +POSTHOOK: type: LOAD + A masked pattern was here +POSTHOOK: Output: default@delim_table_micro +PREHOOK: query: describe formatted delim_table_micro +PREHOOK: type: DESCTABLE +PREHOOK: Input: default@delim_table_micro +POSTHOOK: query: describe formatted delim_table_micro +POSTHOOK: type: DESCTABLE +POSTHOOK: Input: default@delim_table_micro +# col_name data_type comment +id int +name string +safety int + +# Detailed Table Information +Database: default + A masked pattern was here +Retention: 0 + A masked pattern
[jira] [Work logged] (HIVE-23324) Parallelise compaction directory cleaning process
[ https://issues.apache.org/jira/browse/HIVE-23324?focusedWorklogId=460388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460388 ] ASF GitHub Bot logged work on HIVE-23324: - Author: ASF GitHub Bot Created on: 17/Jul/20 17:09 Start Date: 17/Jul/20 17:09 Worklog Time Spent: 10m Work Description: adesh-rao opened a new pull request #1275: URL: https://github.com/apache/hive/pull/1275 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460388) Remaining Estimate: 0h Time Spent: 10m > Parallelise compaction directory cleaning process > - > > Key: HIVE-23324 > URL: https://issues.apache.org/jira/browse/HIVE-23324 > Project: Hive > Issue Type: Improvement >Reporter: Marton Bod >Assignee: Adesh Kumar Rao >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Initiator processes the various compaction candidates in parallel, so we > could follow a similar approach in Cleaner where we currently clean the > directories sequentially. -- This message was sent by Atlassian Jira (v8.3.4#803005)
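The proposal above (cleaning compaction candidates in parallel instead of sequentially) could be sketched with a standard thread pool. This is an illustrative sketch with invented names, not the actual Hive Cleaner code; a real implementation would also need per-candidate locking and failure handling:

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelCleanerSketch {
    // Sequential version: for (String dir : candidates) clean(dir);
    // Parallel variant: submit each cleanup to a pool and wait for all of
    // them, so one slow directory no longer blocks the rest of the queue.
    static List<String> cleanAll(List<String> candidates, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String dir : candidates) {
                // The lambda stands in for the actual directory removal.
                futures.add(pool.submit(() -> "cleaned:" + dir));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // propagate per-candidate failures
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(cleanAll(Arrays.asList("base_1", "delta_2_2"), 2));
    }
}
```

Collecting the futures in submission order keeps the result deterministic even though the cleanups themselves run concurrently.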
[jira] [Updated] (HIVE-23324) Parallelise compaction directory cleaning process
[ https://issues.apache.org/jira/browse/HIVE-23324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23324: -- Labels: pull-request-available (was: ) > Parallelise compaction directory cleaning process > - > > Key: HIVE-23324 > URL: https://issues.apache.org/jira/browse/HIVE-23324 > Project: Hive > Issue Type: Improvement >Reporter: Marton Bod >Assignee: Adesh Kumar Rao >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Initiator processes the various compaction candidates in parallel, so we > could follow a similar approach in Cleaner where we currently clean the > directories sequentially. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23873) Querying Hive JDBCStorageHandler table fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chiran Ravani updated HIVE-23873: - Description: The scenario: a Hive table has the same schema as a table in Oracle, but querying the table with data fails with an NPE; below is the trace. {code} Caused by: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:617) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] ... 
34 more Caused by: java.lang.NullPointerException at org.apache.hive.storage.jdbc.JdbcSerDe.deserialize(JdbcSerDe.java:164) ~[hive-jdbc-handler-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:598) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] ... 34 more {code} The problem appears when the column names in Oracle are in upper case: Hive forces table and column names to lower case during creation, so the user runs into an NPE while fetching data. 
While deserializing data, the input row's keys are the lower-case column names, so the lookup fails to get the value: https://github.com/apache/hive/blob/rel/release-3.1.2/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcSerDe.java#L136 {code} rowVal = ((ObjectWritable)value).get(); {code} Log snippet: = {code} 2020-07-17T16:49:09,598 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: dao.GenericJdbcDatabaseAccessor (:()) - Query to execute is [select * from TESTHIVEJDBCSTORAGE] 2020-07-17T16:49:10,642 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** ColumnKey = ID 2020-07-17T16:49:10,642 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** Blob value = {fname=OW[class=class java.lang.String,value=Name1], id=OW[class=class java.lang.Integer,value=1]} {code} A simple reproducer for this case: = 1. Create the table in Oracle: {code} create table TESTHIVEJDBCSTORAGE(ID INT, FNAME VARCHAR(20)); {code} 2. Insert dummy data: {code} Insert into TESTHIVEJDBCSTORAGE values (1, 'Name1'); {code} 3. Create the JDBCStorageHandler table in Hive: {code} CREATE EXTERNAL TABLE default.TESTHIVEJDBCSTORAGE_HIVE_TBL (ID INT, FNAME VARCHAR(20)) STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' TBLPROPERTIES ( "hive.sql.database.type" = "ORACLE", "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver", "hive.sql.jdbc.url" = "jdbc:oracle:thin:@orachehostname/XE", "hive.sql.dbcp.username" = "chiran", "hive.sql.dbcp.password" = "supersecurepassword", "hive.sql.table" = "TESTHIVEJDBCSTORAGE", "hive.sql.dbcp.maxActive" = "1" ); {code} 4. Query the Hive table; it fails with an NPE. 
{code} > select * from default.TESTHIVEJDBCSTORAGE_HIVE_TBL; INFO : Compiling command(queryId=hive_20200717164857_cd6f5020-4a69-4a2d-9e63-9db99d0121bc): select * from default.TESTHIVEJDBCSTORAGE_HIVE_TBL INFO : Semantic Analysis Completed (retrial = false) INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testhivejdbcstorage_hive_tbl.id, type:int, comment:null), FieldSchema(name:testhivejdbcstorage_hive_tbl.fname, type:varchar(20), comment:null)], properties:null) INFO : Completed compiling command(queryId=hive_20200717164857_cd6f5020-4a69-4a2d-9e63-9db99d0121bc); Time taken: 9.914 seconds INFO : Executing
[jira] [Updated] (HIVE-23873) Querying Hive JDBCStorageHandler table fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chiran Ravani updated HIVE-23873: - Attachment: HIVE-23873.01.patch Status: Patch Available (was: Open) > Querying Hive JDBCStorageHandler table fails with NPE > - > > Key: HIVE-23873 > URL: https://issues.apache.org/jira/browse/HIVE-23873 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 3.1.2, 3.1.1, 3.1.0 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Critical > Attachments: HIVE-23873.01.patch > > > The scenario is a Hive table with the same schema as a table in Oracle; however, querying the table once it holds data fails with an NPE. Below is the trace. > {code} > Caused by: java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:617) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) > ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > ... 
34 more > Caused by: java.lang.NullPointerException > at > org.apache.hive.storage.jdbc.JdbcSerDe.deserialize(JdbcSerDe.java:164) > ~[hive-jdbc-handler-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:598) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) > ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > at > org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) > ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152] > ... 34 more > {code} > The problem appears when column names in Oracle are in upper case: Hive forces table and column names to lower case during creation, so the user runs into an NPE while fetching data. 
> While deserializing data, the input consists of column names in lower case, so the lookup by the upper-case column key fails to get the value > https://github.com/apache/hive/blob/rel/release-3.1.2/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcSerDe.java#L136 > {code} > rowVal = ((ObjectWritable)value).get(); > {code} > Log snippet: > = > {code} > 2020-07-17T16:49:09,598 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 > HiveServer2-Handler-Pool: Thread-104]: dao.GenericJdbcDatabaseAccessor (:()) > - Query to execute is [select * from TESTHIVEJDBCSTORAGE] > 2020-07-17T16:49:10,642 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 > HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** ColumnKey = > ID > 2020-07-17T16:49:10,642 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 > HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** Blob value > = {fname=OW[class=class java.lang.String,value=Name1], id=OW[class=class > java.lang.Integer,value=1]} > {code} > Simple Reproducer for this case. > = > 1. Create table in Oracle > {code} > create table TESTHIVEJDBCSTORAGE(ID INT, FNAME VARCHAR(20)); > {code} > 2. Insert dummy data. > {code} > Insert into TESTHIVEJDBCSTORAGE values (1, 'Name1'); > {code} > 3. Create JDBCStorageHandler table in Hive. > {code} > CREATE EXTERNAL TABLE default.TESTHIVEJDBCSTORAGE_HIVE_TBL (ID INT, FNAME > VARCHAR(20)) > STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' > TBLPROPERTIES ( > "hive.sql.database.type" = "ORACLE", > "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver", > "hive.sql.jdbc.url" = "jdbc:oracle:thin:@10.96.95.99:49161/XE", > "hive.sql.dbcp.username" = "chiran", > "hive.sql.dbcp.password" = "hadoop", > "hive.sql.table" = "TESTHIVEJDBCSTORAGE", > "hive.sql.dbcp.maxActive" = "1" > ); > {code} > 4. Query Hive table, fails with NPE. > {code} > > select * from
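The logged blob above shows the mismatch directly: the deserializer looks up the upper-case column key `ID`, while the ObjectWritable map is keyed by the lower-cased Hive names (`id`, `fname`). The following is a hypothetical Java sketch of the failure mode and a case-insensitive fallback; it is illustrative only and is not the actual JdbcSerDe code or the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual JdbcSerDe code: the row map is keyed by
// Hive's lower-cased column names, while the lookup key comes from the database
// in upper case, so a plain get() returns null and the later dereference of the
// returned ObjectWritable throws a NullPointerException.
public class ColumnCaseDemo {

    // Case-insensitive fallback: try the exact key first, then scan ignoring case.
    static Object lookup(Map<String, Object> row, String columnKey) {
        Object value = row.get(columnKey);
        if (value == null) {
            for (Map.Entry<String, Object> e : row.entrySet()) {
                if (e.getKey().equalsIgnoreCase(columnKey)) {
                    return e.getValue();
                }
            }
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", 1);               // Hive lower-cases column names at creation
        row.put("fname", "Name1");
        System.out.println(row.get("ID"));       // null: the pre-patch failure mode
        System.out.println(lookup(row, "ID"));   // 1: case-insensitive fallback
    }
}
```

A HashMap lookup is strictly case-sensitive, which is why normalizing (or case-insensitively matching) the key on one side is enough to avoid the NPE.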
[jira] [Updated] (HIVE-23873) Querying Hive JDBCStorageHandler table fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chiran Ravani updated HIVE-23873: - Attachment: (was: HIVE-23873.01.patch) > Querying Hive JDBCStorageHandler table fails with NPE > -
[jira] [Updated] (HIVE-23873) Querying Hive JDBCStorageHandler table fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chiran Ravani updated HIVE-23873: - Attachment: HIVE-23873.01.patch > Querying Hive JDBCStorageHandler table fails with NPE > -
[jira] [Assigned] (HIVE-23873) Querying Hive JDBCStorageHandler table fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-23873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chiran Ravani reassigned HIVE-23873: Assignee: Chiran Ravani > Querying Hive JDBCStorageHandler table fails with NPE > -
[jira] [Commented] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159994#comment-17159994 ] Jean-Daniel Cryans commented on HIVE-23871: --- Thanks for taking care of this, [~pgaref], it's a pretty bad issue. > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: table1 > > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does that for all Transactional Tables – not only ACID – causing > MicroManaged Tables to behave abnormally. > MicroManaged (insert_only) tables may miss needed properties such as Storage > Desc Params – that may define how lines are delimited (like in the example > below): > To repro the issue: > {code:java} > CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; > LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans; > describe formatted delim_table_trans; > SELECT * FROM delim_table_trans; > {code} > Result: > {code:java} > Table Type: MANAGED_TABLE > Table Parameters: > bucketing_version 2 > numFiles 1 > numRows 0 > rawDataSize 0 > totalSize 72 > transactional true > transactional_properties insert_only > A masked pattern was here > > # Storage Information > SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > PREHOOK: 
query: SELECT * FROM delim_table_trans > PREHOOK: type: QUERY > PREHOOK: Input: default@delim_table_trans > A masked pattern was here > POSTHOOK: query: SELECT * FROM delim_table_trans > POSTHOOK: type: QUERY > POSTHOOK: Input: default@delim_table_trans > A masked pattern was here > NULL NULL NULL > NULL NULL NULL > NULL NULL NULL > NULL NULL NULL > NULL NULL NULL > NULL NULL NULL > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
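The distinction the fix needs to draw can be sketched as a predicate over the table parameters shown in the describe output above. This is a hypothetical illustration, not the ObjectStore's actual API or the patch itself: the HIVE-23281 conversion shortcut may only drop storage-descriptor details for full ACID tables, never for insert-only (MicroManaged) ones.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not ObjectStore code: decide from the table parameters
// whether the StorageDescriptor shortcut may skip serde/skew/bucket details.
// Insert-only (MicroManaged) tables must keep them, e.g. the field delimiter
// whose loss makes the rows above come back as NULLs.
public class SdConversionDemo {
    static boolean canSkipSdDetails(Map<String, String> tableParams) {
        boolean transactional = "true".equalsIgnoreCase(tableParams.get("transactional"));
        boolean insertOnly =
            "insert_only".equalsIgnoreCase(tableParams.get("transactional_properties"));
        return transactional && !insertOnly;   // shortcut for full ACID only
    }

    public static void main(String[] args) {
        Map<String, String> micro = new HashMap<>();
        micro.put("transactional", "true");
        micro.put("transactional_properties", "insert_only");
        System.out.println(canSkipSdDetails(micro));   // false: keep the SD params
    }
}
```

The buggy behavior corresponds to checking only the `transactional` flag; adding the `insert_only` test is the minimal correction the bug report implies.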
[jira] [Commented] (HIVE-23850) Allow PPD when subject is not a column with grouping sets present
[ https://issues.apache.org/jira/browse/HIVE-23850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159993#comment-17159993 ] Zhihua Deng commented on HIVE-23850: Thanks a lot for the help and review, [~jcamachorodriguez]! > Allow PPD when subject is not a column with grouping sets present > - > > Key: HIVE-23850 > URL: https://issues.apache.org/jira/browse/HIVE-23850 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > After [HIVE-19653|https://issues.apache.org/jira/browse/HIVE-19653], filters > with only columns and constants are pushed down, but in some cases this does > not work, for example: > SET hive.cbo.enable=false; > SELECT a, b, sum(s) > FROM T1 > GROUP BY a, b GROUPING SETS ((a), (a, b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > SELECT upper(a), b, sum(s) > FROM T1 > GROUP BY upper(a), b GROUPING SETS ((upper(a)), (upper(a), b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > The filters pushed down to the GBY can be f(gbyKey), i.e. a group-by key > wrapped in a UDF, not only plain group-by key columns. -- This message was sent by Atlassian Jira (v8.3.4#803005)
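The rule this issue relaxes can be sketched as a set check over whole group-by expressions rather than bare column names. The following Java sketch is hypothetical (the names and string representation of expressions are illustrative, not Hive's optimizer code):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Hive's optimizer code: a HAVING conjunct is pushable
// below the group-by when every aggregation-free expression it references matches
// one of the group-by expressions -- comparing whole expressions such as
// "upper(a)", not only bare columns.
public class PpdSketch {
    static boolean canPushDown(Set<String> referencedExprs, Set<String> groupByExprs) {
        return groupByExprs.containsAll(referencedExprs);
    }

    public static void main(String[] args) {
        Set<String> gby = new HashSet<>(Arrays.asList("upper(a)", "b"));
        // upper(a) = "AAA" references only a group-by expression: pushable.
        System.out.println(canPushDown(new HashSet<>(Arrays.asList("upper(a)")), gby));
        // sum(s) > 100 references an aggregate, not a group-by key: not pushable.
        System.out.println(canPushDown(new HashSet<>(Arrays.asList("sum(s)")), gby));
    }
}
```

In the second query above, `upper(a) = "AAA"` references only `upper(a)`, which is itself a group-by key, so the conjunct can be evaluated below the aggregation, while `sum(s) > 100` must stay above it.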
[jira] [Work logged] (HIVE-23869) Move alter statements in parser to new file
[ https://issues.apache.org/jira/browse/HIVE-23869?focusedWorklogId=460351=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460351 ] ASF GitHub Bot logged work on HIVE-23869: - Author: ASF GitHub Bot Created on: 17/Jul/20 15:16 Start Date: 17/Jul/20 15:16 Worklog Time Spent: 10m Work Description: jcamachor merged pull request #1270: URL: https://github.com/apache/hive/pull/1270 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460351) Time Spent: 20m (was: 10m) > Move alter statements in parser to new file > --- > > Key: HIVE-23869 > URL: https://issues.apache.org/jira/browse/HIVE-23869 > Project: Hive > Issue Type: Improvement > Components: Parser >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > We are hitting the HiveParser 'code too large' problem. HIVE-23857 introduced > an ad hoc script to solve it. Instead, we can split HiveParser.g into > smaller files. For instance, we can group all alter statements into their own > .g file. > This patch also fixes an ambiguity warning related to LIKE > ALL/ANY clauses. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-23869) Move alter statements in parser to new file
[ https://issues.apache.org/jira/browse/HIVE-23869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-23869 started by Jesus Camacho Rodriguez. -- > Move alter statements in parser to new file > --- -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23869) Move alter statements in parser to new file
[ https://issues.apache.org/jira/browse/HIVE-23869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-23869. Fix Version/s: 4.0.0 Resolution: Fixed Pushed to master, thanks for reviewing [~mgergely]. > Move alter statements in parser to new file > --- -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23852) Natively support Date type in ReduceSink operator
[ https://issues.apache.org/jira/browse/HIVE-23852?focusedWorklogId=460352=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460352 ] ASF GitHub Bot logged work on HIVE-23852: - Author: ASF GitHub Bot Created on: 17/Jul/20 15:16 Start Date: 17/Jul/20 15:16 Worklog Time Spent: 10m Work Description: pgaref opened a new pull request #1274: URL: https://github.com/apache/hive/pull/1274 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute Issue Time Tracking --- Worklog Id: (was: 460352) Time Spent: 50m (was: 40m) > Natively support Date type in ReduceSink operator > - > > Key: HIVE-23852 > URL: https://issues.apache.org/jira/browse/HIVE-23852 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > There is no native support currently, meaning that these types end up being > serialized as multi-key columns, which is much slower (iterating through batch > columns instead of writing a value directly). -- This message was sent by Atlassian Jira (v8.3.4#803005)
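For context, the native path the issue describes amounts to writing a DATE key as its single underlying integer instead of iterating over batch columns per row. The sketch below is hypothetical and only illustrates the representation, not Hive's actual serializer:

```java
import java.time.LocalDate;

// Hypothetical sketch, not Hive's serializer code: a DATE value reduces to one
// long (days since 1970-01-01) that sorts in date order, so it can be written
// directly as a key instead of taking the generic multi-key column path.
public class DateKeyDemo {
    static long toSortableKey(LocalDate d) {
        return d.toEpochDay();   // single primitive, preserves ordering
    }

    public static void main(String[] args) {
        System.out.println(toSortableKey(LocalDate.of(1970, 1, 2)));   // 1
        System.out.println(toSortableKey(LocalDate.of(1969, 12, 31))); // -1
    }
}
```

Because the epoch-day long preserves chronological ordering, a ReduceSink-style sort key needs no per-column iteration for DATE values.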
[jira] [Work logged] (HIVE-23852) Natively support Date type in ReduceSink operator
[ https://issues.apache.org/jira/browse/HIVE-23852?focusedWorklogId=460350=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460350 ] ASF GitHub Bot logged work on HIVE-23852: - Author: ASF GitHub Bot Created on: 17/Jul/20 15:14 Start Date: 17/Jul/20 15:14 Worklog Time Spent: 10m Work Description: pgaref closed pull request #1257: URL: https://github.com/apache/hive/pull/1257 Issue Time Tracking --- Worklog Id: (was: 460350) Time Spent: 40m (was: 0.5h) > Natively support Date type in ReduceSink operator > - -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23871: -- Labels: pull-request-available (was: ) > ObjectStore should properly handle MicroManaged Table properties > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?focusedWorklogId=460349=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460349 ] ASF GitHub Bot logged work on HIVE-23871: - Author: ASF GitHub Bot Created on: 17/Jul/20 15:13 Start Date: 17/Jul/20 15:13 Worklog Time Spent: 10m Work Description: pgaref opened a new pull request #1273: URL: https://github.com/apache/hive/pull/1273 ObjectStore should properly handle MicroManaged Table properties Change-Id: Ia5db047419a11504f3c6047a1eb63acd2a14bdc3 Issue Time Tracking --- Worklog Id: (was: 460349) Remaining Estimate: 0h Time Spent: 10m > ObjectStore should properly handle MicroManaged Table properties > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (HIVE-23850) Allow PPD when subject is not a column with grouping sets present
[ https://issues.apache.org/jira/browse/HIVE-23850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-23850: --- Comment: was deleted (was: Commit id is https://github.com/apache/hive/commit/44aa72f096639d7b1a52ef18887016af98bd6999 . I missed the JIRA number in the commit message.) > Allow PPD when subject is not a column with grouping sets present > - > > Key: HIVE-23850 > URL: https://issues.apache.org/jira/browse/HIVE-23850 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > After [HIVE-19653|https://issues.apache.org/jira/browse/HIVE-19653], filters > with only columns and constants are pushed down, but in some cases, this may > not work as well, for example: > SET hive.cbo.enable=false; > SELECT a, b, sum(s) > FROM T1 > GROUP BY a, b GROUPING SETS ((a), (a, b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > SELECT upper(a), b, sum(s) > FROM T1 > GROUP BY upper(a), b GROUPING SETS ((upper(a)), (upper(a), b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > The filters pushed down to GBY can be f(gbyKey) or gbyKey with udf , not > only the column groupby keys. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23850) Allow PPD when subject is not a column with grouping sets present
[ https://issues.apache.org/jira/browse/HIVE-23850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159985#comment-17159985 ] Jesus Camacho Rodriguez commented on HIVE-23850: Commit id is https://github.com/apache/hive/commit/44aa72f096639d7b1a52ef18887016af98bd6999 . I missed the JIRA number in the commit message. > Allow PPD when subject is not a column with grouping sets present > - > > Key: HIVE-23850 > URL: https://issues.apache.org/jira/browse/HIVE-23850 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > After [HIVE-19653|https://issues.apache.org/jira/browse/HIVE-19653], filters > with only columns and constants are pushed down, but in some cases, this may > not work as well, for example: > SET hive.cbo.enable=false; > SELECT a, b, sum(s) > FROM T1 > GROUP BY a, b GROUPING SETS ((a), (a, b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > SELECT upper(a), b, sum(s) > FROM T1 > GROUP BY upper(a), b GROUPING SETS ((upper(a)), (upper(a), b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > The filters pushed down to GBY can be f(gbyKey) or gbyKey with udf , not > only the column groupby keys. -- This message was sent by Atlassian Jira (v8.3.4#803005)
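The description above reduces to a membership test: a HAVING conjunct can be pushed below the group-by (grouping sets included) when every non-aggregate expression it references is itself one of the group-by key expressions, whether that key is a bare column (`a`) or a function of one (`upper(a)`). A toy Java sketch of that test over expressions modeled as plain strings; this is a hypothetical simplification, not Hive's actual operator-tree walk:

```java
import java.util.List;
import java.util.Set;

public class GbyPushdownCheck {

    // Toy model: an expression is just its text, e.g. "a" or "upper(a)".
    // A predicate is pushable below the group-by when every expression it
    // references appears among the group-by key expressions.
    static boolean isPushable(List<String> referencedExprs, Set<String> gbyKeyExprs) {
        return gbyKeyExprs.containsAll(referencedExprs);
    }

    public static void main(String[] args) {
        // GROUP BY upper(a), b GROUPING SETS ((upper(a)), (upper(a), b))
        Set<String> keys = Set.of("upper(a)", "b");

        // HAVING upper(a) = "AAA" references only upper(a): pushable.
        System.out.println(isPushable(List.of("upper(a)"), keys)); // true

        // HAVING sum(s) > 100 references an aggregate, not a key: stays above.
        System.out.println(isPushable(List.of("sum(s)"), keys));   // false
    }
}
```

Before HIVE-23850 the check effectively accepted only bare column keys; the fix widens it so `f(gbyKey)` predicates like `upper(a) = "AAA"` qualify too.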
[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=460347=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460347 ] ASF GitHub Bot logged work on HIVE-23851: - Author: ASF GitHub Bot Created on: 17/Jul/20 15:05 Start Date: 17/Jul/20 15:05 Worklog Time Spent: 10m Work Description: shameersss1 commented on a change in pull request #1271: URL: https://github.com/apache/hive/pull/1271#discussion_r456501029 ## File path: standalone-metastore/metastore-server/pom.xml ## @@ -204,6 +204,11 @@ hive-storage-api ${storage-api.version} + + org.apache.hive + hive-serde Review comment: @kgyrtkirk Is this change okay? I mean we could have used reflection again to call serde classes but it will make code more complex and non-readable. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460347) Time Spent: 0.5h (was: 20m) > MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions > > > Key: HIVE-23851 > URL: https://issues.apache.org/jira/browse/HIVE-23851 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > *Steps to reproduce:* > # Create external table > # Run msck command to sync all the partitions with metastore > # Remove one of the partition path > # Run msck repair with partition filtering > *Stack Trace:* > {code:java} > 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] > ppr.PartitionExpressionForMetastore: Failed to deserialize the expression > java.lang.IndexOutOfBoundsException: Index: 110, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) 
~[?:1.8.0_192] > at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192] > at > org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_192] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_192] > {code} > *Cause:* > For MSCK REPAIR with partition filtering we expect the expression proxy > class to be set to PartitionExpressionForMetastore ( > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 > ), while when dropping a partition we serialize the drop-partition filter > expression as ( >
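The cause described above is a configuration-driven class mismatch: the metastore instantiates whichever expression-proxy implementation it is configured with, so an expression serialized for one proxy (here, Kryo for PartitionExpressionForMetastore) fails to deserialize when a different proxy handles it. A simplified, self-contained Java sketch of that reflection-based lookup; the interface and class names are stand-ins for Hive's PartitionExpressionProxy machinery, not its real API:

```java
public class ProxyLookup {

    // Stand-in for Hive's expression-proxy abstraction: converts a
    // serialized partition-filter expression into a filter string.
    interface ExpressionProxy {
        String describe();
    }

    // A trivial implementation to instantiate by name.
    public static class NoopProxy implements ExpressionProxy {
        public String describe() {
            return "noop";
        }
    }

    // The metastore resolves the proxy class from configuration at runtime;
    // if client and server disagree on this class, deserialization of the
    // client's expression bytes can fail as in the stack trace above.
    static ExpressionProxy createProxy(String className) {
        try {
            return (ExpressionProxy) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Cannot instantiate expression proxy " + className, e);
        }
    }

    public static void main(String[] args) {
        ExpressionProxy p = createProxy("ProxyLookup$NoopProxy");
        System.out.println(p.describe());
    }
}
```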
[jira] [Work logged] (HIVE-23850) Allow PPD when subject is not a column with grouping sets present
[ https://issues.apache.org/jira/browse/HIVE-23850?focusedWorklogId=460345=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460345 ] ASF GitHub Bot logged work on HIVE-23850: - Author: ASF GitHub Bot Created on: 17/Jul/20 15:04 Start Date: 17/Jul/20 15:04 Worklog Time Spent: 10m Work Description: jcamachor merged pull request #1255: URL: https://github.com/apache/hive/pull/1255 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460345) Time Spent: 50m (was: 40m) > Allow PPD when subject is not a column with grouping sets present > - > > Key: HIVE-23850 > URL: https://issues.apache.org/jira/browse/HIVE-23850 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > After [HIVE-19653|https://issues.apache.org/jira/browse/HIVE-19653], filters > with only columns and constants are pushed down, but in some cases, this may > not work as well, for example: > SET hive.cbo.enable=false; > SELECT a, b, sum(s) > FROM T1 > GROUP BY a, b GROUPING SETS ((a), (a, b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > SELECT upper(a), b, sum(s) > FROM T1 > GROUP BY upper(a), b GROUPING SETS ((upper(a)), (upper(a), b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > The filters pushed down to GBY can be f(gbyKey) or gbyKey with udf , not > only the column groupby keys. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23850) Allow PPD when subject is not a column with grouping sets present
[ https://issues.apache.org/jira/browse/HIVE-23850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-23850. Fix Version/s: 4.0.0 Resolution: Fixed Pushed to master, thanks for your contribution [~dengzh]! > Allow PPD when subject is not a column with grouping sets present > - > > Key: HIVE-23850 > URL: https://issues.apache.org/jira/browse/HIVE-23850 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > After [HIVE-19653|https://issues.apache.org/jira/browse/HIVE-19653], filters > with only columns and constants are pushed down, but in some cases, this may > not work as well, for example: > SET hive.cbo.enable=false; > SELECT a, b, sum(s) > FROM T1 > GROUP BY a, b GROUPING SETS ((a), (a, b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > SELECT upper(a), b, sum(s) > FROM T1 > GROUP BY upper(a), b GROUPING SETS ((upper(a)), (upper(a), b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > The filters pushed down to GBY can be f(gbyKey) or gbyKey with udf , not > only the column groupby keys. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=460344=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460344 ] ASF GitHub Bot logged work on HIVE-23851: - Author: ASF GitHub Bot Created on: 17/Jul/20 15:04 Start Date: 17/Jul/20 15:04 Worklog Time Spent: 10m Work Description: shameersss1 commented on a change in pull request #1271: URL: https://github.com/apache/hive/pull/1271#discussion_r456500229 ## File path: ql/src/test/org/apache/hadoop/hive/ql/exec/TestPartitionManagement.java ## @@ -15,7 +15,7 @@ * See the License for the specific language governing permissions and * limitations under the License. */ -package org.apache.hadoop.hive.metastore; +package org.apache.hadoop.hive.ql.exec; Review comment: Moved TestPartitionManagement.java to ql module due to dependency on PartitionExpressionForMetastore and some other ql class for serializing partition expression. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460344) Time Spent: 20m (was: 10m) > MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions > > > Key: HIVE-23851 > URL: https://issues.apache.org/jira/browse/HIVE-23851 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > *Steps to reproduce:* > # Create external table > # Run msck command to sync all the partitions with metastore > # Remove one of the partition path > # Run msck repair with partition filtering > *Stack Trace:* > {code:java} > 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] > ppr.PartitionExpressionForMetastore: Failed to deserialize the expression > java.lang.IndexOutOfBoundsException: Index: 110, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192] > at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192] > at > org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > 
org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_192] > {code} > *Cause:* > In case of msck repair with partition filtering we expect expression proxy > class to be set as PartitionExpressionForMetastore ( >
[jira] [Assigned] (HIVE-23850) Allow PPD when subject is not a column with grouping sets present
[ https://issues.apache.org/jira/browse/HIVE-23850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-23850: -- Assignee: Zhihua Deng > Allow PPD when subject is not a column with grouping sets present > - > > Key: HIVE-23850 > URL: https://issues.apache.org/jira/browse/HIVE-23850 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > After [HIVE-19653|https://issues.apache.org/jira/browse/HIVE-19653], filters > with only columns and constants are pushed down, but in some cases, this may > not work as well, for example: > SET hive.cbo.enable=false; > SELECT a, b, sum(s) > FROM T1 > GROUP BY a, b GROUPING SETS ((a), (a, b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > SELECT upper(a), b, sum(s) > FROM T1 > GROUP BY upper(a), b GROUPING SETS ((upper(a)), (upper(a), b)) > HAVING upper(a) = "AAA" AND sum(s) > 100; > > The filters pushed down to GBY can be f(gbyKey) or gbyKey with udf , not > only the column groupby keys. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23868) Windowing function spec: support 0 preceeding/following
[ https://issues.apache.org/jira/browse/HIVE-23868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-23868: --- Fix Version/s: 4.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master, thanks [~jdere]! > Windowing function spec: support 0 preceeding/following > --- > > Key: HIVE-23868 > URL: https://issues.apache.org/jira/browse/HIVE-23868 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Jason Dere >Assignee: Jason Dere >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-23868.1.patch > > Time Spent: 20m > Remaining Estimate: 0h > > HIVE-12574 removed support for 0 PRECEDING/FOLLOWING in window function > specifications. We can restore support for this by converting 0 > PRECEDING/FOLLOWING to CURRENT ROW in the query plan, which should be the > same. -- This message was sent by Atlassian Jira (v8.3.4#803005)
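The fix described above is a plan-level rewrite: a frame boundary of 0 PRECEDING or 0 FOLLOWING denotes the same frame edge as CURRENT ROW, so it can be normalized away instead of rejected. A hedged Java sketch of that normalization, using hypothetical types rather than Hive's actual window-frame spec classes:

```java
public class WindowBoundary {

    enum Direction { PRECEDING, FOLLOWING, CURRENT }

    final Direction direction;
    final int amount; // row offset; ignored when direction is CURRENT

    WindowBoundary(Direction direction, int amount) {
        this.direction = direction;
        this.amount = amount;
    }

    // 0 PRECEDING and 0 FOLLOWING mean the same frame edge as CURRENT ROW,
    // so rewrite them during planning (the HIVE-23868 approach) rather
    // than failing validation as HIVE-12574 did.
    static WindowBoundary normalize(WindowBoundary b) {
        if (b.direction != Direction.CURRENT && b.amount == 0) {
            return new WindowBoundary(Direction.CURRENT, 0);
        }
        return b;
    }

    public static void main(String[] args) {
        WindowBoundary b = normalize(new WindowBoundary(Direction.PRECEDING, 0));
        System.out.println(b.direction); // CURRENT
    }
}
```

With this rewrite, `ROWS BETWEEN 0 PRECEDING AND CURRENT ROW` plans identically to `ROWS BETWEEN CURRENT ROW AND CURRENT ROW`.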
[jira] [Work logged] (HIVE-23868) Windowing function spec: support 0 preceeding/following
[ https://issues.apache.org/jira/browse/HIVE-23868?focusedWorklogId=460342=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460342 ] ASF GitHub Bot logged work on HIVE-23868: - Author: ASF GitHub Bot Created on: 17/Jul/20 14:55 Start Date: 17/Jul/20 14:55 Worklog Time Spent: 10m Work Description: jcamachor merged pull request #1269: URL: https://github.com/apache/hive/pull/1269 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460342) Time Spent: 20m (was: 10m) > Windowing function spec: support 0 preceeding/following > --- > > Key: HIVE-23868 > URL: https://issues.apache.org/jira/browse/HIVE-23868 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Jason Dere >Assignee: Jason Dere >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23868.1.patch > > Time Spent: 20m > Remaining Estimate: 0h > > HIVE-12574 removed support for 0 PRECEDING/FOLLOWING in window function > specifications. We can restore support for this by converting 0 > PRECEDING/FOLLOWING to CURRENT ROW in the query plan, which should be the > same. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-22869) Add locking benchmark to metastore-tools/metastore-benchmarks
[ https://issues.apache.org/jira/browse/HIVE-22869?focusedWorklogId=460300=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460300 ] ASF GitHub Bot logged work on HIVE-22869: - Author: ASF GitHub Bot Created on: 17/Jul/20 13:16 Start Date: 17/Jul/20 13:16 Worklog Time Spent: 10m Work Description: zchovan commented on a change in pull request #1073: URL: https://github.com/apache/hive/pull/1073#discussion_r456434704 ## File path: standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java ## @@ -345,21 +348,44 @@ boolean openTxn(int numTxns) throws TException { return openTxns; } + List getOpenTxnsInfo() throws TException { +return client.get_open_txns_info().getOpen_txns(); + } + boolean commitTxn(long txnId) throws TException { client.commit_txn(new CommitTxnRequest(txnId)); return true; } - boolean abortTxn(long txnId) throws TException { -client.abort_txn(new AbortTxnRequest(txnId)); + boolean abortTxns(List txnIds) throws TException { +client.abort_txns(new AbortTxnsRequest(txnIds)); return true; } - boolean abortTxns(List txnIds) throws TException { -client.abort_txns(new AbortTxnsRequest(txnIds)); + boolean allocateTableWriteIds(String dbName, String tableName, List openTxns) throws TException { +AllocateTableWriteIdsRequest awiRqst = new AllocateTableWriteIdsRequest(dbName, tableName); +openTxns.forEach(t -> { + awiRqst.addToTxnIds(t); +}); + +client.allocate_table_write_ids(awiRqst); return true; } + boolean getValidWriteIds(List fullTableNames) throws TException { Review comment: ah sorry, I was mistaken, the reason why it never returned the writeIds is because they are never used, the benchmark is just executing the api call. The return value from the hms is actually a GetValidWriteIdsResponse object, not a list. As it is never used I'm not sure if we need to change this. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460300) Time Spent: 2h (was: 1h 50m) > Add locking benchmark to metastore-tools/metastore-benchmarks > - > > Key: HIVE-22869 > URL: https://issues.apache.org/jira/browse/HIVE-22869 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Chovan >Assignee: Zoltan Chovan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22869.2.patch, HIVE-22869.3.patch, > HIVE-22869.4.patch, HIVE-22869.5.patch, HIVE-22869.6.patch, > HIVE-22869.7.patch, HIVE-22869.8.patch, HIVE-22869.9.patch, HIVE-22869.patch > > Time Spent: 2h > Remaining Estimate: 0h > > Add the possibility to run benchmarks on opening lock in the HMS -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-22869) Add locking benchmark to metastore-tools/metastore-benchmarks
[ https://issues.apache.org/jira/browse/HIVE-22869?focusedWorklogId=460294=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460294 ] ASF GitHub Bot logged work on HIVE-22869: - Author: ASF GitHub Bot Created on: 17/Jul/20 13:05 Start Date: 17/Jul/20 13:05 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1073: URL: https://github.com/apache/hive/pull/1073#discussion_r456421989 ## File path: standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java ## @@ -345,21 +348,44 @@ boolean openTxn(int numTxns) throws TException { return openTxns; } + List getOpenTxnsInfo() throws TException { +return client.get_open_txns_info().getOpen_txns(); + } + boolean commitTxn(long txnId) throws TException { client.commit_txn(new CommitTxnRequest(txnId)); return true; } - boolean abortTxn(long txnId) throws TException { -client.abort_txn(new AbortTxnRequest(txnId)); + boolean abortTxns(List txnIds) throws TException { +client.abort_txns(new AbortTxnsRequest(txnIds)); return true; } - boolean abortTxns(List txnIds) throws TException { -client.abort_txns(new AbortTxnsRequest(txnIds)); + boolean allocateTableWriteIds(String dbName, String tableName, List openTxns) throws TException { +AllocateTableWriteIdsRequest awiRqst = new AllocateTableWriteIdsRequest(dbName, tableName); +openTxns.forEach(t -> { + awiRqst.addToTxnIds(t); +}); + +client.allocate_table_write_ids(awiRqst); return true; } + boolean getValidWriteIds(List fullTableNames) throws TException { Review comment: I don't get what does it have to do with throwingSupplierWrapper. throwingSupplierWrapper just handles checked exceptions. Could you please elaborate here? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460294) Time Spent: 1h 50m (was: 1h 40m) > Add locking benchmark to metastore-tools/metastore-benchmarks > - > > Key: HIVE-22869 > URL: https://issues.apache.org/jira/browse/HIVE-22869 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Chovan >Assignee: Zoltan Chovan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22869.2.patch, HIVE-22869.3.patch, > HIVE-22869.4.patch, HIVE-22869.5.patch, HIVE-22869.6.patch, > HIVE-22869.7.patch, HIVE-22869.8.patch, HIVE-22869.9.patch, HIVE-22869.patch > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Add the possibility to run benchmarks on opening lock in the HMS -- This message was sent by Atlassian Jira (v8.3.4#803005)
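The review exchange above turns on `throwingSupplierWrapper`, which (per the comment) "just handles checked exceptions" so that thrift calls throwing `TException` can be used inside benchmark lambdas. A hedged, self-contained Java sketch of such a helper; Hive's actual `Util.throwingSupplierWrapper` may differ in signature and detail:

```java
public class ThrowingSupplierWrapper {

    // A Supplier-like interface whose get() may throw a checked exception,
    // e.g. the TException thrown by HMS thrift calls.
    @FunctionalInterface
    interface ThrowingSupplier<T> {
        T get() throws Exception;
    }

    // Invoke the supplier, rethrowing any checked exception as unchecked so
    // callers (benchmark bodies, lambdas) need no try/catch boilerplate.
    static <T> T throwingSupplierWrapper(ThrowingSupplier<T> supplier) {
        try {
            return supplier.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Checked-exception-throwing code becomes a plain expression.
        Integer v = throwingSupplierWrapper(() -> Integer.parseInt("42"));
        System.out.println(v); // 42
    }
}
```

The design choice is ergonomics over error typing: benchmark code stays terse, at the cost of wrapping every failure in `RuntimeException`.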
[jira] [Work logged] (HIVE-23818) Use String Switch-Case Statement in StatUtils
[ https://issues.apache.org/jira/browse/HIVE-23818?focusedWorklogId=460293=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460293 ] ASF GitHub Bot logged work on HIVE-23818: - Author: ASF GitHub Bot Created on: 17/Jul/20 13:04 Start Date: 17/Jul/20 13:04 Worklog Time Spent: 10m Work Description: belugabehr commented on pull request #1229: URL: https://github.com/apache/hive/pull/1229#issuecomment-660094981 @kgyrtkirk Thanks a million for pointing that out. I addressed the issue, tests pass, and I have merged. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460293) Time Spent: 1.5h (was: 1h 20m) > Use String Switch-Case Statement in StatUtils > - > > Key: HIVE-23818 > URL: https://issues.apache.org/jira/browse/HIVE-23818 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > switch-case statements with Java is now available. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23818) Use String Switch-Case Statement in StatUtils
[ https://issues.apache.org/jira/browse/HIVE-23818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor resolved HIVE-23818. --- Fix Version/s: 4.0.0 Resolution: Fixed Pushed to master! > Use String Switch-Case Statement in StatUtils > - > > Key: HIVE-23818 > URL: https://issues.apache.org/jira/browse/HIVE-23818 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > switch-case statements with Java is now available. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23818) Use String Switch-Case Statement in StatUtils
[ https://issues.apache.org/jira/browse/HIVE-23818?focusedWorklogId=460292=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460292 ] ASF GitHub Bot logged work on HIVE-23818: - Author: ASF GitHub Bot Created on: 17/Jul/20 13:03 Start Date: 17/Jul/20 13:03 Worklog Time Spent: 10m Work Description: belugabehr merged pull request #1229: URL: https://github.com/apache/hive/pull/1229 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460292) Time Spent: 1h 20m (was: 1h 10m) > Use String Switch-Case Statement in StatUtils > - > > Key: HIVE-23818 > URL: https://issues.apache.org/jira/browse/HIVE-23818 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > switch-case statements with Java is now available. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-22869) Add locking benchmark to metastore-tools/metastore-benchmarks
[ https://issues.apache.org/jira/browse/HIVE-22869?focusedWorklogId=460291=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460291 ] ASF GitHub Bot logged work on HIVE-22869: - Author: ASF GitHub Bot Created on: 17/Jul/20 13:02 Start Date: 17/Jul/20 13:02 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1073: URL: https://github.com/apache/hive/pull/1073#discussion_r456427550 ## File path: standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/ACIDBenchmarks.java ## @@ -0,0 +1,247 @@ +package org.apache.hadoop.hive.metastore.tools; + +import org.apache.hadoop.hive.metastore.api.DataOperationType; +import org.apache.hadoop.hive.metastore.api.LockComponent; +import org.apache.hadoop.hive.metastore.api.LockRequest; +import org.apache.logging.log4j.Level; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.core.LoggerContext; +import org.apache.logging.log4j.core.config.Configuration; +import org.apache.thrift.TException; +import org.openjdk.jmh.annotations.Benchmark; +import org.openjdk.jmh.annotations.Param; +import org.openjdk.jmh.annotations.Scope; +import org.openjdk.jmh.annotations.Setup; +import org.openjdk.jmh.annotations.State; +import org.openjdk.jmh.annotations.TearDown; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.hadoop.hive.metastore.tools.BenchmarkUtils.createManyTables; +import static org.apache.hadoop.hive.metastore.tools.BenchmarkUtils.dropManyTables; +import static org.apache.hadoop.hive.metastore.tools.Util.throwingSupplierWrapper; + +public class ACIDBenchmarks { + + private static final Logger LOG = LoggerFactory.getLogger(CoreContext.class); + + @State(Scope.Benchmark) + public static class CoreContext { +@Param("1") +protected int howMany; + +@State(Scope.Thread) +public static class ThreadState { + 
HMSClient client; + + @Setup + public void doSetup() throws Exception { +LOG.debug("Creating client"); +client = HMSConfig.getInstance().newClient(); + } + + @TearDown + public void doTearDown() throws Exception { +client.close(); +LOG.debug("Closed a connection to metastore."); + } +} + +@Setup +public void setup() { + LoggerContext ctx = (LoggerContext) LogManager.getContext(false); + Configuration ctxConfig = ctx.getConfiguration(); + ctxConfig.getLoggerConfig(CoreContext.class.getName()).setLevel(Level.INFO); + ctx.updateLoggers(ctxConfig); +} + } + + @State(Scope.Benchmark) + public static class TestOpenTxn extends CoreContext { + +@State(Scope.Thread) +public static class ThreadState extends CoreContext.ThreadState { + List openTxns = new ArrayList<>(); + + @TearDown + public void doTearDown() throws Exception { +client.abortTxns(openTxns); +LOG.debug("aborted all opened txns"); + } + + void addTxn(List openTxn) { +openTxns.addAll(openTxn); + } +} + +@Benchmark +public void openTxn(TestOpenTxn.ThreadState state) throws TException { + state.addTxn(state.client.openTxn(howMany)); + LOG.debug("opened txns, count=", howMany); +} + } + + @State(Scope.Benchmark) + public static class TestLocking extends CoreContext { +private int nTables; + +@Param("0") +private int nPartitions; + +private List lockComponents; + +@Setup +public void setup() { + this.nTables = (nPartitions != 0) ? 
howMany / nPartitions : howMany; + createLockComponents(); +} + +@State(Scope.Thread) +public static class ThreadState extends CoreContext.ThreadState { + List openTxns = new ArrayList<>(); + long txnId; + + @Setup(org.openjdk.jmh.annotations.Level.Invocation) + public void iterSetup() { +txnId = executeOpenTxnAndGetTxnId(client); +LOG.debug("opened txn, id={}", txnId); +openTxns.add(txnId); + } + + @TearDown + public void doTearDown() throws Exception { +client.abortTxns(openTxns); +if (BenchmarkUtils.checkTxnsCleaned(client, openTxns) == false) { + LOG.error("Something went wrong with the cleanup of txns"); +} +LOG.debug("aborted all opened txns"); + } +} + +@Benchmark +public void lock(TestLocking.ThreadState state) { + LOG.debug("sending lock request"); + executeLock(state.client, state.txnId, lockComponents); +} + +private void createLockComponents() { + lockComponents = new ArrayList<>(); + + for (int i = 0; i < nTables; i++) { +for (int j = 0; j < nPartitions - (nPartitions > 1 ? 1 : 0); j++) { + lockComponents.add( +
[jira] [Work logged] (HIVE-23862) Clean Up StatsUtils and BasicStats
[ https://issues.apache.org/jira/browse/HIVE-23862?focusedWorklogId=460290=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460290 ] ASF GitHub Bot logged work on HIVE-23862: - Author: ASF GitHub Bot Created on: 17/Jul/20 13:00 Start Date: 17/Jul/20 13:00 Worklog Time Spent: 10m Work Description: belugabehr opened a new pull request #1264: URL: https://github.com/apache/hive/pull/1264 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460290) Time Spent: 1h (was: 50m) > Clean Up StatsUtils and BasicStats > -- > > Key: HIVE-23862 > URL: https://issues.apache.org/jira/browse/HIVE-23862 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Miscellaneous improvements to readability and performance. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23862) Clean Up StatsUtils and BasicStats
[ https://issues.apache.org/jira/browse/HIVE-23862?focusedWorklogId=460289=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460289 ] ASF GitHub Bot logged work on HIVE-23862: - Author: ASF GitHub Bot Created on: 17/Jul/20 13:00 Start Date: 17/Jul/20 13:00 Worklog Time Spent: 10m Work Description: belugabehr closed pull request #1264: URL: https://github.com/apache/hive/pull/1264 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460289) Time Spent: 50m (was: 40m) > Clean Up StatsUtils and BasicStats > -- > > Key: HIVE-23862 > URL: https://issues.apache.org/jira/browse/HIVE-23862 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Miscellaneous improvements to readability and performance. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23835) Repl Dump should dump function binaries to staging directory
[ https://issues.apache.org/jira/browse/HIVE-23835?focusedWorklogId=460288=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460288 ] ASF GitHub Bot logged work on HIVE-23835: - Author: ASF GitHub Bot Created on: 17/Jul/20 12:59 Start Date: 17/Jul/20 12:59 Worklog Time Spent: 10m Work Description: pkumarsinha commented on a change in pull request #1249: URL: https://github.com/apache/hive/pull/1249#discussion_r456425658 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CreateFunctionHandler.java ## @@ -41,13 +53,36 @@ CreateFunctionMessage eventMessage(String stringRepresentation) { public void handle(Context withinContext) throws Exception { LOG.info("Processing#{} CREATE_FUNCTION message : {}", fromEventId(), eventMessageAsJSON); Path metadataPath = new Path(withinContext.eventRoot, EximUtil.METADATA_NAME); +Path dataPath = new Path(withinContext.eventRoot, EximUtil.DATA_PATH_NAME); FileSystem fileSystem = metadataPath.getFileSystem(withinContext.hiveConf); - +List functionBinaryCopyPaths = new ArrayList<>(); try (JsonWriter jsonWriter = new JsonWriter(fileSystem, metadataPath)) { - new FunctionSerializer(eventMessage.getFunctionObj(), withinContext.hiveConf) - .writeTo(jsonWriter, withinContext.replicationSpec); + FunctionSerializer serializer = new FunctionSerializer(eventMessage.getFunctionObj(), + dataPath, withinContext.hiveConf); + serializer.writeTo(jsonWriter, withinContext.replicationSpec); + functionBinaryCopyPaths.addAll(serializer.getFunctionBinaryCopyPaths()); } withinContext.createDmd(this).write(); +copyFunctionBinaries(functionBinaryCopyPaths, withinContext.hiveConf); + } + + private void copyFunctionBinaries(List functionBinaryCopyPaths, HiveConf hiveConf) Review comment: no, for function binary copy, we are not using the load flag. It is retained as it is currently. meaning: earlier during load it used to copy from src location. Now with this change, it will copy from staging location. 
That way, src cluster visibility is not required when loading the function. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460288) Time Spent: 50m (was: 40m) > Repl Dump should dump function binaries to staging directory > > > Key: HIVE-23835 > URL: https://issues.apache.org/jira/browse/HIVE-23835 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23835.01.patch, HIVE-23835.02.patch > > Time Spent: 50m > Remaining Estimate: 0h > > {color:#172b4d}When hive function's binaries are on source HDFS, repl dump > should dump them to the staging location in order to remove the cross-cluster > visibility requirement.{color} -- This message was sent by Atlassian Jira (v8.3.4#803005)
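The change under review copies function binaries into the dump's staging directory so that loading never needs to read the source cluster's filesystem. A minimal local-filesystem sketch of that staging copy (class and method names here are illustrative, not Hive's actual API, which works against HDFS paths):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;

public class FunctionBinaryCopySketch {

    // Copy every function binary into the dump's staging directory and return
    // the staged paths; the dumped metadata would then reference these instead
    // of the source-cluster locations.
    static List<Path> copyToStaging(List<Path> binaries, Path stagingDir) throws IOException {
        Files.createDirectories(stagingDir);
        return binaries.stream().map(src -> {
            try {
                Path dst = stagingDir.resolve(src.getFileName().toString());
                Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                return dst;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }).collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Simulate a function jar on the "source" side and stage it for the dump.
        Path srcJar = Files.createTempFile("my-udf", ".jar");
        Path staging = Files.createTempDirectory("repl-dump").resolve("_functions");
        List<Path> staged = copyToStaging(List.of(srcJar), staging);
        System.out.println("staged: " + staged.get(0).getFileName());
    }
}
```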
[jira] [Commented] (HIVE-23474) Deny Repl Dump if the database is a target of replication
[ https://issues.apache.org/jira/browse/HIVE-23474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159906#comment-17159906 ] Pravin Sinha commented on HIVE-23474: - +1 > Deny Repl Dump if the database is a target of replication > - > > Key: HIVE-23474 > URL: https://issues.apache.org/jira/browse/HIVE-23474 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23474.01.patch, HIVE-23474.02.patch, > HIVE-23474.03.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-22869) Add locking benchmark to metastore-tools/metastore-benchmarks
[ https://issues.apache.org/jira/browse/HIVE-22869?focusedWorklogId=460285=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460285 ] ASF GitHub Bot logged work on HIVE-22869: - Author: ASF GitHub Bot Created on: 17/Jul/20 12:55 Start Date: 17/Jul/20 12:55 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1073: URL: https://github.com/apache/hive/pull/1073#discussion_r456423568 ## File path: standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/BenchmarkUtils.java ## @@ -0,0 +1,72 @@
package org.apache.hadoop.hive.metastore.tools;

import org.apache.hadoop.hive.metastore.TableType;
import org.apache.hadoop.hive.metastore.api.FieldSchema;
import org.apache.hadoop.hive.metastore.api.TxnInfo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

import static org.apache.hadoop.hive.metastore.tools.Util.createSchema;
import static org.apache.hadoop.hive.metastore.tools.Util.throwingSupplierWrapper;

public class BenchmarkUtils {
  private static final Logger LOG = LoggerFactory.getLogger(BenchmarkUtils.class);

  static void createManyTables(HMSClient client, int howMany, String dbName, String format) {
    List<FieldSchema> columns = createSchema(new ArrayList<>(Arrays.asList("name", "string")));
    List<FieldSchema> partitions = createSchema(new ArrayList<>(Arrays.asList("date", "string")));
    IntStream.range(0, howMany)
        .forEach(i ->
            throwingSupplierWrapper(() -> client.createTable(
                new Util.TableBuilder(dbName, String.format(format, i))
                    .withType(TableType.MANAGED_TABLE)
                    .withColumns(columns)
                    .withPartitionKeys(partitions)
                    .build())));
  }

  static void dropManyTables(HMSClient client, int howMany, String dbName, String format) {
    IntStream.range(0, howMany)
        .forEach(i ->
            throwingSupplierWrapper(() -> client.dropTable(dbName, String.format(format, i))));
  }

  // Create a simple table with a single column and single partition
  static void createPartitionedTable(HMSClient client, String dbName, String tableName) {
    throwingSupplierWrapper(() -> client.createTable(
        new Util.TableBuilder(dbName, tableName)
            .withType(TableType.MANAGED_TABLE)
            .withColumns(createSchema(Collections.singletonList("name:string")))
            .withPartitionKeys(createSchema(Collections.singletonList("date")))
            .build()));
  }

  static boolean checkTxnsCleaned(HMSClient client, List<Long> txnsOpenedByBenchmark) throws InterruptedException {
    // let's wait the default cleaner run period
    Thread.sleep(10);
    List<Long> notCleanedTxns = new ArrayList<>();
    throwingSupplierWrapper(() -> {
      List<TxnInfo> txnInfos = client.getOpenTxnsInfo();
Review comment: can't see any change here This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460285) Time Spent: 1.5h (was: 1h 20m) > Add locking benchmark to metastore-tools/metastore-benchmarks > - > > Key: HIVE-22869 > URL: https://issues.apache.org/jira/browse/HIVE-22869 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Chovan >Assignee: Zoltan Chovan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22869.2.patch, HIVE-22869.3.patch, > HIVE-22869.4.patch, HIVE-22869.5.patch, HIVE-22869.6.patch, > HIVE-22869.7.patch, HIVE-22869.8.patch, HIVE-22869.9.patch, HIVE-22869.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > Add the possibility to run benchmarks on opening lock in the HMS -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-22869) Add locking benchmark to metastore-tools/metastore-benchmarks
[ https://issues.apache.org/jira/browse/HIVE-22869?focusedWorklogId=460284=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460284 ] ASF GitHub Bot logged work on HIVE-22869: - Author: ASF GitHub Bot Created on: 17/Jul/20 12:54 Start Date: 17/Jul/20 12:54 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1073: URL: https://github.com/apache/hive/pull/1073#discussion_r456423093 ## File path: standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/BenchmarkTool.java ##
@@ -141,12 +175,62 @@ private static void saveDataFile(String location, String name,
    }
  }

-  @Override public void run() {
-    LOG.info("Using warmup " + warmup +
-        " spin " + spinCount + " nparams " + nParameters + " threads " + nThreads);
+    LOG.info("Using warmup " + warmup + " spin " + spinCount + " nparams " + Arrays.toString(nParameters) + " threads "
+        + nThreads);
+    HMSConfig.getInstance().init(host, port, confDir);
+
+    if (runMode == RunModes.ALL) {
Review comment: can't see change here This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460284) Time Spent: 1h 20m (was: 1h 10m) > Add locking benchmark to metastore-tools/metastore-benchmarks > - > > Key: HIVE-22869 > URL: https://issues.apache.org/jira/browse/HIVE-22869 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Chovan >Assignee: Zoltan Chovan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22869.2.patch, HIVE-22869.3.patch, > HIVE-22869.4.patch, HIVE-22869.5.patch, HIVE-22869.6.patch, > HIVE-22869.7.patch, HIVE-22869.8.patch, HIVE-22869.9.patch, HIVE-22869.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Add the possibility to run benchmarks on opening lock in the HMS -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-22869) Add locking benchmark to metastore-tools/metastore-benchmarks
[ https://issues.apache.org/jira/browse/HIVE-22869?focusedWorklogId=460283=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460283 ] ASF GitHub Bot logged work on HIVE-22869: - Author: ASF GitHub Bot Created on: 17/Jul/20 12:52 Start Date: 17/Jul/20 12:52 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1073: URL: https://github.com/apache/hive/pull/1073#discussion_r456421989 ## File path: standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java ##
@@ -345,21 +348,44 @@ boolean openTxn(int numTxns) throws TException {
    return openTxns;
  }

+  List<TxnInfo> getOpenTxnsInfo() throws TException {
+    return client.get_open_txns_info().getOpen_txns();
+  }
+
  boolean commitTxn(long txnId) throws TException {
    client.commit_txn(new CommitTxnRequest(txnId));
    return true;
  }

-  boolean abortTxn(long txnId) throws TException {
-    client.abort_txn(new AbortTxnRequest(txnId));
+  boolean abortTxns(List<Long> txnIds) throws TException {
+    client.abort_txns(new AbortTxnsRequest(txnIds));
    return true;
  }

-  boolean abortTxns(List<Long> txnIds) throws TException {
-    client.abort_txns(new AbortTxnsRequest(txnIds));
+  boolean allocateTableWriteIds(String dbName, String tableName, List<Long> openTxns) throws TException {
+    AllocateTableWriteIdsRequest awiRqst = new AllocateTableWriteIdsRequest(dbName, tableName);
+    openTxns.forEach(t -> {
+      awiRqst.addToTxnIds(t);
+    });
+
+    client.allocate_table_write_ids(awiRqst);
    return true;
  }
+
+  boolean getValidWriteIds(List<String> fullTableNames) throws TException {
Review comment: I don't get what it has to do with throwingSupplierWrapper; throwingSupplierWrapper just handles checked exceptions. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460283) Time Spent: 1h 10m (was: 1h) > Add locking benchmark to metastore-tools/metastore-benchmarks > - > > Key: HIVE-22869 > URL: https://issues.apache.org/jira/browse/HIVE-22869 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Chovan >Assignee: Zoltan Chovan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22869.2.patch, HIVE-22869.3.patch, > HIVE-22869.4.patch, HIVE-22869.5.patch, HIVE-22869.6.patch, > HIVE-22869.7.patch, HIVE-22869.8.patch, HIVE-22869.9.patch, HIVE-22869.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Add the possibility to run benchmarks on opening lock in the HMS -- This message was sent by Atlassian Jira (v8.3.4#803005)
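The reviewer's point is that Util.throwingSupplierWrapper only converts checked exceptions into unchecked ones, so the HMSClient calls can sit inside lambdas. A sketch of such a wrapper (the real signature in org.apache.hadoop.hive.metastore.tools.Util may differ):

```java
public class ThrowingWrapperSketch {

    // A supplier that is allowed to declare checked exceptions.
    @FunctionalInterface
    interface ThrowingSupplier<T> {
        T get() throws Exception;
    }

    // Run the body and rethrow any checked exception as unchecked; this is
    // what lets benchmark lambdas over TException-throwing calls stay terse.
    static <T> T throwingSupplierWrapper(ThrowingSupplier<T> body) {
        try {
            return body.get();
        } catch (Exception e) {
            throw new RuntimeException(e); // checked -> unchecked
        }
    }

    public static void main(String[] args) {
        // A call that declares a checked exception can now be used in a lambda:
        String s = throwingSupplierWrapper(() -> {
            if (args.length > 100) throw new java.io.IOException("boom");
            return "ok";
        });
        System.out.println(s); // prints ok
    }
}
```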
[jira] [Work logged] (HIVE-23474) Deny Repl Dump if the database is a target of replication
[ https://issues.apache.org/jira/browse/HIVE-23474?focusedWorklogId=460274=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460274 ] ASF GitHub Bot logged work on HIVE-23474: - Author: ASF GitHub Bot Created on: 17/Jul/20 12:21 Start Date: 17/Jul/20 12:21 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #1247: URL: https://github.com/apache/hive/pull/1247#discussion_r456406599 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java ## @@ -944,96 +944,6 @@ public void testIncrementalDumpMultiIteration() throws Throwable { Assert.assertEquals(IncrementalLoadTasksBuilder.getNumIteration(), numEvents); } - @Test - public void testIfCkptAndSourceOfReplPropsIgnoredByReplDump() throws Throwable { Review comment: Agree, added This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460274) Time Spent: 0.5h (was: 20m) > Deny Repl Dump if the database is a target of replication > - > > Key: HIVE-23474 > URL: https://issues.apache.org/jira/browse/HIVE-23474 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23474.01.patch, HIVE-23474.02.patch, > HIVE-23474.03.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23474) Deny Repl Dump if the database is a target of replication
[ https://issues.apache.org/jira/browse/HIVE-23474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-23474: --- Attachment: HIVE-23474.03.patch Status: Patch Available (was: In Progress) > Deny Repl Dump if the database is a target of replication > - > > Key: HIVE-23474 > URL: https://issues.apache.org/jira/browse/HIVE-23474 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23474.01.patch, HIVE-23474.02.patch, > HIVE-23474.03.patch > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23474) Deny Repl Dump if the database is a target of replication
[ https://issues.apache.org/jira/browse/HIVE-23474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-23474: --- Status: In Progress (was: Patch Available) > Deny Repl Dump if the database is a target of replication > - > > Key: HIVE-23474 > URL: https://issues.apache.org/jira/browse/HIVE-23474 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23474.01.patch, HIVE-23474.02.patch > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23835) Repl Dump should dump function binaries to staging directory
[ https://issues.apache.org/jira/browse/HIVE-23835?focusedWorklogId=460270=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460270 ] ASF GitHub Bot logged work on HIVE-23835: - Author: ASF GitHub Bot Created on: 17/Jul/20 12:15 Start Date: 17/Jul/20 12:15 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #1249: URL: https://github.com/apache/hive/pull/1249#discussion_r456403763 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CreateFunctionHandler.java ## @@ -41,13 +53,36 @@ CreateFunctionMessage eventMessage(String stringRepresentation) { public void handle(Context withinContext) throws Exception { LOG.info("Processing#{} CREATE_FUNCTION message : {}", fromEventId(), eventMessageAsJSON); Path metadataPath = new Path(withinContext.eventRoot, EximUtil.METADATA_NAME); +Path dataPath = new Path(withinContext.eventRoot, EximUtil.DATA_PATH_NAME); FileSystem fileSystem = metadataPath.getFileSystem(withinContext.hiveConf); - +List functionBinaryCopyPaths = new ArrayList<>(); try (JsonWriter jsonWriter = new JsonWriter(fileSystem, metadataPath)) { - new FunctionSerializer(eventMessage.getFunctionObj(), withinContext.hiveConf) - .writeTo(jsonWriter, withinContext.replicationSpec); + FunctionSerializer serializer = new FunctionSerializer(eventMessage.getFunctionObj(), + dataPath, withinContext.hiveConf); + serializer.writeTo(jsonWriter, withinContext.replicationSpec); + functionBinaryCopyPaths.addAll(serializer.getFunctionBinaryCopyPaths()); } withinContext.createDmd(this).write(); +copyFunctionBinaries(functionBinaryCopyPaths, withinContext.hiveConf); + } + + private void copyFunctionBinaries(List functionBinaryCopyPaths, HiveConf hiveConf) Review comment: Does this depend on whether copy of load flag is true or false? Or always we will do it at the time of load? This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460270) Time Spent: 40m (was: 0.5h) > Repl Dump should dump function binaries to staging directory > > > Key: HIVE-23835 > URL: https://issues.apache.org/jira/browse/HIVE-23835 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23835.01.patch, HIVE-23835.02.patch > > Time Spent: 40m > Remaining Estimate: 0h > > {color:#172b4d}When hive function's binaries are on source HDFS, repl dump > should dump it to the staging location in order to break cross clusters > visibility requirement.{color} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-23871: -- Component/s: Metastore > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Attachments: table1 > > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does that for all Transactional Tables – not only ACID – causing > MicroManaged Tables to behave abnormally. > MicroManaged (insert_only) tables may miss needed properties such as Storage > Desc Params – that may define how lines are delimited (like in the example > below): > To repro the issue: > {code:java} > CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; > LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans; > describe formatted delim_table_trans; > SELECT * FROM delim_table_trans; > {code} > Result: > {code:java} > Table Type: MANAGED_TABLE > Table Parameters: > bucketing_version 2 > numFiles1 > numRows 0 > rawDataSize 0 > totalSize 72 > transactional true > transactional_propertiesinsert_only > A masked pattern was here > > # Storage Information > SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > PREHOOK: query: SELECT * FROM delim_table_trans > PREHOOK: type: QUERY > PREHOOK: Input: default@delim_table_trans > A masked pattern was here > POSTHOOK: query: 
SELECT * FROM delim_table_trans > POSTHOOK: type: QUERY > POSTHOOK: Input: default@delim_table_trans > A masked pattern was here > NULL NULLNULL > NULL NULLNULL > NULL NULLNULL > NULL NULLNULL > NULL NULLNULL > NULL NULLNULL > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
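The all-NULL rows in the repro above are what you would expect once the Storage Desc Params are dropped: the SerDe falls back to its default field delimiter (Ctrl-A, \u0001), the whole tab-delimited line lands in one field, and the typed columns fail to parse. A toy illustration with plain string splitting (not the actual LazySimpleSerDe):

```java
public class DelimiterSketch {
    // Split a row on the given field delimiter; a column whose field is
    // missing (or unparseable for its type) surfaces as NULL in query output.
    static String[] parse(String row, char delim) {
        return row.split(String.valueOf(delim), -1);
    }

    public static void main(String[] args) {
        String row = "1\tAlice\t3"; // tab-delimited, as the table's DDL wrote it

        // With the stored param ('\t') all three columns come back:
        System.out.println(parse(row, '\t').length);    // prints 3

        // If the param is lost, the default Ctrl-A delimiter matches nothing,
        // the whole line lands in the first column, and the typed columns
        // (INT id, INT safety) fail to parse -> NULL, NULL, NULL.
        System.out.println(parse(row, '\u0001').length); // prints 1
    }
}
```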
[jira] [Updated] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-23871: -- Description: HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore by skipping particular Table properties like SkewInfo, bucketCols, ordering etc. However, it does that for all Transactional Tables – not only ACID – causing MicroManaged Tables to behave abnormally. MicroManaged (insert_only) tables may miss needed properties such as Storage Desc Params – that may define how lines are delimited (like in the example below): To repro the issue: {code:java} CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans; describe formatted delim_table_trans; SELECT * FROM delim_table_trans; {code} Result: {code:java} Table Type: MANAGED_TABLE Table Parameters: bucketing_version 2 numFiles1 numRows 0 rawDataSize 0 totalSize 72 transactional true transactional_propertiesinsert_only A masked pattern was here # Storage Information SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat:org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets:-1 Bucket Columns: [] Sort Columns: [] PREHOOK: query: SELECT * FROM delim_table_trans PREHOOK: type: QUERY PREHOOK: Input: default@delim_table_trans A masked pattern was here POSTHOOK: query: SELECT * FROM delim_table_trans POSTHOOK: type: QUERY POSTHOOK: Input: default@delim_table_trans A masked pattern was here NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL {code} was: HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore by skipping particular Table properties like SkewInfo, bucketCols, ordering etc. 
However, it does that for all Transactional Tables – not only ACID – causing MicroManaged Tables to behave abnormally. MicroManaged (insert_only) tables may miss needed properties such as Storage Desc Params – that may define how lines are delimited (like in the example below): To repro the issue: {code:java} CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans; describe formatted delim_table_trans; SELECT * FROM delim_table_trans; {code} Result: {code:java} # Storage Information SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat:org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets:-1 Bucket Columns: [] Sort Columns: [] PREHOOK: query: SELECT * FROM delim_table_trans PREHOOK: type: QUERY PREHOOK: Input: default@delim_table_trans A masked pattern was here POSTHOOK: query: SELECT * FROM delim_table_trans POSTHOOK: type: QUERY POSTHOOK: Input: default@delim_table_trans A masked pattern was here NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL {code} > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Attachments: table1 > > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does that for all Transactional Tables – not only ACID – causing > MicroManaged Tables to behave abnormally. > MicroManaged (insert_only) tables may miss needed properties such as Storage > Desc Params –
[jira] [Updated] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-23871: -- Description: HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore by skipping particular Table properties like SkewInfo, bucketCols, ordering etc. However, it does that for all Transactional Tables – not only ACID – causing MicroManaged Tables to behave abnormally. MicroManaged (insert_only) tables may miss needed properties such as Storage Desc Params – that may define how lines are delimited (like in the example below): To repro the issue: {code:java} CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans; describe formatted delim_table_trans; SELECT * FROM delim_table_trans; {code} Result: {code:java} # Storage Information SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat:org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets:-1 Bucket Columns: [] Sort Columns: [] PREHOOK: query: SELECT * FROM delim_table_trans PREHOOK: type: QUERY PREHOOK: Input: default@delim_table_trans A masked pattern was here POSTHOOK: query: SELECT * FROM delim_table_trans POSTHOOK: type: QUERY POSTHOOK: Input: default@delim_table_trans A masked pattern was here NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL {code} was: HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore by skipping particular Table properties like SkewInfo, bucketCols, ordering etc. However, it does that for all Transactional Tables -- not only ACID. 
This causes MicroManaged (insert_only) table to skip needed properties such as Storage Desc Params -- that may define how lines are delimited (like in the example below): To repro the issue: {code:java} CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans; describe formatted delim_table_trans; SELECT * FROM delim_table_trans; {code} Result: {code:java} # Storage Information SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat:org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets:-1 Bucket Columns: [] Sort Columns: [] PREHOOK: query: SELECT * FROM delim_table_trans PREHOOK: type: QUERY PREHOOK: Input: default@delim_table_trans A masked pattern was here POSTHOOK: query: SELECT * FROM delim_table_trans POSTHOOK: type: QUERY POSTHOOK: Input: default@delim_table_trans A masked pattern was here NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL {code} > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Attachments: table1 > > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does that for all Transactional Tables – not only ACID – causing > MicroManaged Tables to behave abnormally. 
> MicroManaged (insert_only) tables may miss needed properties such as Storage > Desc Params – that may define how lines are delimited (like in the example > below): > To repro the issue: > {code:java} > CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; > LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans; > describe formatted delim_table_trans; > SELECT * FROM delim_table_trans; > {code} > Result: > {code:java} > # Storage Information > SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > >
[jira] [Updated] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-23871: -- Description: HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore by skipping particular Table properties like SkewInfo, bucketCols, ordering etc. However, it does that for all Transactional Tables -- not only ACID. This causes MicroManaged (insert_only) table to skip needed properties such as Storage Desc Params -- that may define how lines are delimited (like in the example below): To repro the issue: {code:java} CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans; describe formatted delim_table_trans; SELECT * FROM delim_table_trans; {code} Result: {code:java} # Storage Information SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat:org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets:-1 Bucket Columns: [] Sort Columns: [] PREHOOK: query: SELECT * FROM delim_table_trans PREHOOK: type: QUERY PREHOOK: Input: default@delim_table_trans A masked pattern was here POSTHOOK: query: SELECT * FROM delim_table_trans POSTHOOK: type: QUERY POSTHOOK: Input: default@delim_table_trans A masked pattern was here NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL NULLNULLNULL {code} was: HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore by skipping particular Table properties like SkewInfo, bucketCols, ordering etc. However, it does that for all Transactional Tables -- not only ACID. 
This causes MicroManaged (insert_only) table to skip needed properties such as Storage Desc Params -- that may define how lines are delimited (like in the example below): To repro the issue: > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Attachments: table1 > > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does that for all Transactional Tables -- not only ACID. > This causes MicroManaged (insert_only) table to skip needed properties such > as Storage Desc Params -- that may define how lines are delimited (like in > the example below): > To repro the issue: > {code:java} > CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; > LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans; > describe formatted delim_table_trans; > SELECT * FROM delim_table_trans; > {code} > Result: > {code:java} > # Storage Information > SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > PREHOOK: query: SELECT * FROM delim_table_trans > PREHOOK: type: QUERY > PREHOOK: Input: default@delim_table_trans > A masked pattern was here > POSTHOOK: query: SELECT * FROM delim_table_trans > POSTHOOK: type: QUERY > POSTHOOK: Input: default@delim_table_trans > A masked pattern was here > NULL NULLNULL > NULL NULLNULL > NULL NULLNULL > NULL NULLNULL > NULL NULLNULL > NULL NULLNULL > {code} -- This message was sent by Atlassian Jira 
(v8.3.4#803005)
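The fix HIVE-23871 describes can be pictured as a guard in the StorageDescriptor conversion path: skip the storage-descriptor params only for full-ACID tables (which always use ORC), never for insert-only (MicroManaged) tables, which keep their declared SerDe and need params such as `field.delim`. A minimal illustrative sketch follows; the class and method names are hypothetical, not Hive's actual ObjectStore API.

```java
import java.util.HashMap;
import java.util.Map;

public class SdConversionSketch {
    // Hypothetical stand-in for the HIVE-23281 optimization: only a
    // full-ACID table may safely drop its SerDe params, because its
    // storage format is fixed; an insert-only table must keep them.
    public static Map<String, String> convertSerdeParams(
            Map<String, String> params, boolean transactional, boolean insertOnly) {
        if (transactional && !insertOnly) {
            return new HashMap<>();          // full ACID: safe to skip
        }
        return new HashMap<>(params);        // MicroManaged/external: preserve
    }

    public static void main(String[] args) {
        Map<String, String> p = new HashMap<>();
        p.put("field.delim", "\t");
        // A MicroManaged (insert_only) table keeps its tab delimiter, so the
        // repro above would read the rows instead of returning NULLs.
        System.out.println(convertSerdeParams(p, true, true));
    }
}
```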
[jira] [Updated] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-23871: -- Attachment: table1 > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Attachments: table1 > > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does that for all Transactional Tables -- not only ACID. > This causes MicroManaged (insert_only) table to skip needed properties such > as Storage Desc Params -- that may define how lines are delimited (like in > the example below): > To repro the issue: -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-23871: -- Description: HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore by skipping particular Table properties like SkewInfo, bucketCols, ordering etc. However, it does that for all Transactional Tables -- not only ACID. This causes MicroManaged (insert_only) table to skip needed properties such as Storage Desc Params -- that may define how lines are delimited (like in the example below): To repro the issue: was: HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore by skipping particular Table properties like SkewInfo, bucketCols, ordering etc. However, it does not properly handle MicroManaged tables > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does that for all Transactional Tables -- not only ACID. > This causes MicroManaged (insert_only) table to skip needed properties such > as Storage Desc Params -- that may define how lines are delimited (like in > the example below): > To repro the issue: -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-23871: -- Issue Type: Bug (was: Improvement) > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does not properly handle MicroManaged tables -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-23871: -- Description: HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore by skipping particular Table properties like SkewInfo, bucketCols, ordering etc. However, it does not properly handle MicroManaged tables > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore > by skipping particular Table properties like SkewInfo, bucketCols, ordering > etc. > However, it does not properly handle MicroManaged tables -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-23871: - > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-23871) ObjectStore should properly handle MicroManaged Table properties
[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-23871 started by Panagiotis Garefalakis. - > ObjectStore should properly handle MicroManaged Table properties > > > Key: HIVE-23871 > URL: https://issues.apache.org/jira/browse/HIVE-23871 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23727) Improve SQLOperation log handling when canceling background
[ https://issues.apache.org/jira/browse/HIVE-23727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23727: --- Summary: Improve SQLOperation log handling when canceling background (was: Improve SQLOperation log handling when cancel background) > Improve SQLOperation log handling when canceling background > --- > > Key: HIVE-23727 > URL: https://issues.apache.org/jira/browse/HIVE-23727 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Time Spent: 4h > Remaining Estimate: 0h > > The SQLOperation checks _if (shouldRunAsync() && state != > OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the > background task. If true, the state should not be OperationState.CANCELED, so > logging under the state == OperationState.CANCELED should never happen. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
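The dead-code argument in HIVE-23727 can be checked mechanically: inside the true branch of the guard, the state can never be CANCELED or TIMEDOUT, so any log statement there conditioned on `state == OperationState.CANCELED` is unreachable. A small sketch (enum and method names mirror the description, not the actual SQLOperation source):

```java
public class CancelFlowSketch {
    public enum OperationState { RUNNING, CANCELED, TIMEDOUT, FINISHED }

    // Same shape as the guard quoted above:
    // if (shouldRunAsync() && state != CANCELED && state != TIMEDOUT)
    public static boolean shouldCancelBackground(boolean async, OperationState state) {
        return async
            && state != OperationState.CANCELED
            && state != OperationState.TIMEDOUT;
    }

    public static void main(String[] args) {
        for (OperationState s : OperationState.values()) {
            if (shouldCancelBackground(true, s)) {
                // In this branch s is provably neither CANCELED nor TIMEDOUT,
                // so CANCELED-specific logging here could never fire.
                System.out.println("cancel background while state = " + s);
            }
        }
    }
}
```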
[jira] [Work logged] (HIVE-23727) Improve SQLOperation log handling when cancel background
[ https://issues.apache.org/jira/browse/HIVE-23727?focusedWorklogId=460186=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460186 ] ASF GitHub Bot logged work on HIVE-23727: - Author: ASF GitHub Bot Created on: 17/Jul/20 09:05 Start Date: 17/Jul/20 09:05 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #1272: URL: https://github.com/apache/hive/pull/1272 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460186) Time Spent: 4h (was: 3h 50m) > Improve SQLOperation log handling when cancel background > > > Key: HIVE-23727 > URL: https://issues.apache.org/jira/browse/HIVE-23727 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Time Spent: 4h > Remaining Estimate: 0h > > The SQLOperation checks _if (shouldRunAsync() && state != > OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the > background task. If true, the state should not be OperationState.CANCELED, so > logging under the state == OperationState.CANCELED should never happen. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23339) SBA does not check permissions for DB location specified in Create or Alter database query
[ https://issues.apache.org/jira/browse/HIVE-23339?focusedWorklogId=460185=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460185 ] ASF GitHub Bot logged work on HIVE-23339: - Author: ASF GitHub Bot Created on: 17/Jul/20 09:04 Start Date: 17/Jul/20 09:04 Worklog Time Spent: 10m Work Description: miklosgergely merged pull request #1011: URL: https://github.com/apache/hive/pull/1011 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460185) Time Spent: 0.5h (was: 20m) > SBA does not check permissions for DB location specified in Create or Alter > database query > -- > > Key: HIVE-23339 > URL: https://issues.apache.org/jira/browse/HIVE-23339 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.1.0 >Reporter: Riju Trivedi >Assignee: Shubham Chaurasia >Priority: Critical > Labels: pull-request-available > Attachments: HIVE-23339.01.patch, HIVE-23339.02.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > With doAs=true and StorageBasedAuthorization provider, create database with > specific location succeeds even if user doesn't have access to that path. > > {code:java} > hadoop fs -ls -d /tmp/cannot_write > drwx-- - hive hadoop 0 2020-04-01 22:53 /tmp/cannot_write > create a database under /tmp/cannot_write. We would expect it to fail, but is > actually created successfully with "hive" as the owner: > rtrivedi@bdp01:~> beeline -e "create database rtrivedi_1 location > '/tmp/cannot_write/rtrivedi_1'" > INFO : OK > No rows affected (0.116 seconds) > hive@hpchdd2e:~> hadoop fs -ls /tmp/cannot_write > Found 1 items > drwx-- - hive hadoop 0 2020-04-01 23:05 /tmp/cannot_write/rtrivedi_1 > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
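The HIVE-23339 repro boils down to a missing POSIX-style write check: `/tmp/cannot_write` is `drwx------ hive:hadoop`, so a user who is neither the owner nor in the group must not be able to create a database under it. A minimal sketch of that check with octal mode bits (illustrative only, not Hive's SBA implementation, which delegates to HDFS permissions):

```java
public class SbaCheckSketch {
    // mode uses POSIX octal bits, e.g. 0700 for drwx------
    public static boolean canWrite(int mode, boolean isOwner, boolean inGroup) {
        if (isOwner) return (mode & 0200) != 0;   // owner write bit
        if (inGroup) return (mode & 0020) != 0;   // group write bit
        return (mode & 0002) != 0;                // other write bit
    }

    public static void main(String[] args) {
        // rtrivedi is neither owner (hive) nor in the hadoop group: the
        // CREATE DATABASE ... LOCATION should be rejected, which is exactly
        // the check SBA skips for database locations.
        System.out.println(canWrite(0700, false, false));
    }
}
```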
[jira] [Work logged] (HIVE-23727) Improve SQLOperation log handling when cancel background
[ https://issues.apache.org/jira/browse/HIVE-23727?focusedWorklogId=460182=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460182 ] ASF GitHub Bot logged work on HIVE-23727: - Author: ASF GitHub Bot Created on: 17/Jul/20 09:01 Start Date: 17/Jul/20 09:01 Worklog Time Spent: 10m Work Description: dengzhhu653 closed pull request #1149: URL: https://github.com/apache/hive/pull/1149 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 460182) Time Spent: 3h 50m (was: 3h 40m) > Improve SQLOperation log handling when cancel background > > > Key: HIVE-23727 > URL: https://issues.apache.org/jira/browse/HIVE-23727 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Time Spent: 3h 50m > Remaining Estimate: 0h > > The SQLOperation checks _if (shouldRunAsync() && state != > OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the > background task. If true, the state should not be OperationState.CANCELED, so > logging under the state == OperationState.CANCELED should never happen. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23867) Truncate table fail with AccessControlException if doAs enabled and tbl database has source of replication
[ https://issues.apache.org/jira/browse/HIVE-23867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159785#comment-17159785 ] Aasha Medhi commented on HIVE-23867: After this patch HIVE-22736, for external tables we skip CM. So you will not face this issue. > Truncate table fail with AccessControlException if doAs enabled and tbl > database has source of replication > -- > > Key: HIVE-23867 > URL: https://issues.apache.org/jira/browse/HIVE-23867 > Project: Hive > Issue Type: Bug > Components: Hive, repl >Affects Versions: 3.1.1 >Reporter: Rajkumar Singh >Priority: Major > > Steps to repro: > 1. enable doAs > 2. with some user (not a super user) create database > create database sampledb with dbproperties('repl.source.for'='1,2,3'); > 3. create table using create table sampledb.sampletble (id int); > 4. insert some data into it insert into sampledb.sampletble values (1), > (2),(3); > 5. Run truncate command on the table, which fails with the following error > {code:java} > org.apache.hadoop.ipc.RemoteException: User username is not a super user > (non-super user cannot change owner). 

> at > org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:85) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1907) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:866) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:531) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > > at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1498) > ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?] > at org.apache.hadoop.ipc.Client.call(Client.java:1444) > ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?] > at org.apache.hadoop.ipc.Client.call(Client.java:1354) > ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?] > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228) > ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?] > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) > ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?] > at com.sun.proxy.$Proxy31.setOwner(Unknown Source) ~[?:?] > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setOwner(ClientNamenodeProtocolTranslatorPB.java:470) > ~[hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?] 
> at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source) ~[?:?] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_232] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232] > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > [hadoop-common-3.1.1.3.1.5.0-152.jar:?] > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?] > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?] > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > [hadoop-common-3.1.1.3.1.5.0-152.jar:?] > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) > [hadoop-common-3.1.1.3.1.5.0-152.jar:?] > at com.sun.proxy.$Proxy32.setOwner(Unknown Source) [?:?] > at org.apache.hadoop.hdfs.DFSClient.setOwner(DFSClient.java:1914) > [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?] > at > org.apache.hadoop.hdfs.DistributedFileSystem$36.doCall(DistributedFileSystem.java:1764) > [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?] > at > org.apache.hadoop.hdfs.DistributedFileSystem$36.doCall(DistributedFileSystem.java:1761) > [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?] > at >
[jira] [Commented] (HIVE-23069) Memory efficient iterator should be used during replication.
[ https://issues.apache.org/jira/browse/HIVE-23069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159781#comment-17159781 ] Aasha Medhi commented on HIVE-23069: +1 > Memory efficient iterator should be used during replication. > > > Key: HIVE-23069 > URL: https://issues.apache.org/jira/browse/HIVE-23069 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23069.01.patch, HIVE-23069.02.patch, > HIVE-23069.03.patch > > Time Spent: 6h > Remaining Estimate: 0h > > Currently the iterator used while copying table data is memory based. In case > of a database with very large number of table/partitions, such iterator may > cause HS2 process to go OOM. > Also introduces a config option to run data copy tasks during repl load > operation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
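The idea behind HIVE-23069 is to replace a fully materialized list of table/partition entries with a lazy, batched iterator, so HS2 memory stays bounded regardless of how many objects a database has. A self-contained sketch of such an iterator (the batching function and batch size are assumptions for illustration, not the actual patch):

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Streams elements batch by batch: only one batch is in memory at a time.
public class BatchedIteratorSketch implements Iterator<String> {
    private final Function<Integer, List<String>> fetchBatch; // offset -> next batch
    private Iterator<String> current = Collections.emptyIterator();
    private int offset = 0;
    private boolean exhausted = false;

    public BatchedIteratorSketch(Function<Integer, List<String>> fetchBatch) {
        this.fetchBatch = fetchBatch;
    }

    @Override public boolean hasNext() {
        while (!current.hasNext() && !exhausted) {
            List<String> batch = fetchBatch.apply(offset);
            if (batch.isEmpty()) { exhausted = true; return false; }
            offset += batch.size();
            current = batch.iterator();
        }
        return current.hasNext();
    }

    @Override public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        return current.next();
    }
}
```

The copy task would consume this iterator directly instead of holding every partition name in a list, trading one metastore round-trip per batch for bounded memory.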
[jira] [Commented] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159745#comment-17159745 ] Syed Shameerur Rahman commented on HIVE-23851: -- [~kgyrtkirk] I have raised a PR following the approach 2, Could you please review? > MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions > > > Key: HIVE-23851 > URL: https://issues.apache.org/jira/browse/HIVE-23851 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > *Steps to reproduce:* > # Create external table > # Run msck command to sync all the partitions with metastore > # Remove one of the partition path > # Run msck repair with partition filtering > *Stack Trace:* > {code:java} > 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] > ppr.PartitionExpressionForMetastore: Failed to deserialize the expression > java.lang.IndexOutOfBoundsException: Index: 110, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192] > at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192] > at > org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) > 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_192] > {code} > *Cause:* > In case of msck repair with partition filtering we expect the expression proxy > class to be set as PartitionExpressionForMetastore ( > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 > ), while dropping partitions we serialize the drop partition filter > expression as ( > https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589 > ), which is incompatible with the deserialization happening in > PartitionExpressionForMetastore ( > 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpressionForMetastore.java#L52 > ), hence the query fails with "Failed to deserialize the expression". > *Solutions*: > I could think of two approaches to this problem: > # Since PartitionExpressionForMetastore is required only during the partition > pruning step, we can switch the expression proxy class back to > MsckPartitionExpressionProxy once the partition pruning step is done. > # The other solution is to make the serialization of the msck drop partition > filter expression compatible with the one in > PartitionExpressionForMetastore. We can do this via reflection, since the drop > partition serialization happens in the Msck class (standalone-metastore); by this > way we can
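Approach 1 above amounts to scoping a configuration override: set the expression-proxy class for the pruning step, then restore whatever was there before so the later drop-partition serialization stays compatible. A generic sketch of that pattern follows; the property key and proxy-class names are taken from the discussion but used here purely illustratively, not as Hive's actual configuration API.

```java
import java.util.Properties;
import java.util.function.Supplier;

public class ProxySwitchSketch {
    // Illustrative key; the real Hive config key differs.
    public static final String KEY = "metastore.expression.proxy";

    // Run one step with a temporary proxy class, always restoring the
    // previous value afterwards (even if the step throws).
    public static <T> T withProxy(Properties conf, String proxy, Supplier<T> step) {
        String old = conf.getProperty(KEY);
        conf.setProperty(KEY, proxy);
        try {
            return step.get();
        } finally {
            if (old == null) conf.remove(KEY);
            else conf.setProperty(KEY, old);
        }
    }
}
```

Usage would wrap only the pruning call, e.g. `withProxy(conf, "PartitionExpressionForMetastore", () -> prunePartitions(...))`, leaving `MsckPartitionExpressionProxy` in effect for everything else.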
[jira] [Updated] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23851: -- Labels: pull-request-available (was: ) > MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions > > > Key: HIVE-23851 > URL: https://issues.apache.org/jira/browse/HIVE-23851 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > *Steps to reproduce:* > # Create external table > # Run msck command to sync all the partitions with metastore > # Remove one of the partition path > # Run msck repair with partition filtering > *Stack Trace:* > {code:java} > 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] > ppr.PartitionExpressionForMetastore: Failed to deserialize the expression > java.lang.IndexOutOfBoundsException: Index: 110, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192] > at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192] > at > org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > 
org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_192] > {code} > *Cause:* > In case of msck repair with partition filtering we expect the expression proxy > class to be set as PartitionExpressionForMetastore ( > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 > ), while dropping partitions we serialize the drop partition filter > expression as ( > https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589 > ), which is incompatible with the deserialization happening in > PartitionExpressionForMetastore ( > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpressionForMetastore.java#L52 > 
), hence the query fails with "Failed to deserialize the expression". > *Solutions*: > I could think of two approaches to this problem: > # Since PartitionExpressionForMetastore is required only during the partition > pruning step, we can switch the expression proxy class back to > MsckPartitionExpressionProxy once the partition pruning step is done. > # The other solution is to make the serialization of the msck drop partition > filter expression compatible with the one in > PartitionExpressionForMetastore. We can do this via reflection, since the drop > partition serialization happens in the Msck class (standalone-metastore); by this > way we can completely remove the need for class MsckPartitionExpressionProxy > and this also helps to reduce the
[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=460123&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460123 ]

ASF GitHub Bot logged work on HIVE-23851:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 17/Jul/20 07:36
Start Date: 17/Jul/20 07:36
Worklog Time Spent: 10m
Work Description: shameersss1 opened a new pull request #1271:
URL: https://github.com/apache/hive/pull/1271

   Refer: https://issues.apache.org/jira/browse/HIVE-23851 for more information

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 460123)
Remaining Estimate: 0h
Time Spent: 10m

> MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-23851
>                 URL: https://issues.apache.org/jira/browse/HIVE-23851
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Syed Shameerur Rahman
>            Assignee: Syed Shameerur Rahman
>            Priority: Major
>             Fix For: 4.0.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Steps to reproduce:*
> # Create an external table
> # Run the msck command to sync all the partitions with the metastore
> # Remove one of the partition paths
> # Run msck repair with partition filtering
>
> *Stack Trace:*
> {code:java}
> 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] ppr.PartitionExpressionForMetastore: Failed to deserialize the expression
> java.lang.IndexOutOfBoundsException: Index: 110, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192]
> at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192]
> at org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192]
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_192]
> {code}
>
> *Cause:*
> In the case of msck repair with partition filtering, we expect the expression proxy class to be set to PartitionExpressionForMetastore ( https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 ), while when dropping partitions we serialize the drop-partition filter expression as ( https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589 ), which is incompatible with the deserialization happening in PartitionExpressionForMetastore (
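The root cause described above is a writer/reader mismatch: bytes produced by one serialization mechanism are handed to a deserializer that expects a different wire format, so the reader misinterprets the stream and fails with an exception like the IndexOutOfBoundsException in the trace. A minimal standalone Java sketch of why such a mismatch fails, using plain JDK streams instead of Hive's Kryo-based SerializationUtilities (all names here are illustrative, not Hive code):

```java
import java.io.*;

public class SerializationMismatch {
    // Mechanism A: Java object serialization (stands in for the writer
    // used on the drop-partition path).
    static byte[] writeWithA(String expr) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(expr);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Mechanism B: a reader expecting a different wire format (stands in
    // for the deserializer in PartitionExpressionForMetastore).
    static boolean readWithBFails(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            in.readUTF(); // misinterprets mechanism A's stream header as a string length
            return false;
        } catch (IOException e) {
            return true; // incompatible formats: the read blows up
        }
    }

    public static void main(String[] args) {
        byte[] bytes = writeWithA("p=1");
        System.out.println("incompatible read failed: " + readWithBFails(bytes));
    }
}
```

The fix direction discussed in the issue is to make both sides agree on one mechanism, rather than to make the reader tolerant of foreign bytes.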
[jira] [Work logged] (HIVE-23837) HbaseStorageHandler is not configured properly when the FileSinkOperator is the child of a MergeJoinWork
[ https://issues.apache.org/jira/browse/HIVE-23837?focusedWorklogId=460124&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460124 ]

ASF GitHub Bot logged work on HIVE-23837:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 17/Jul/20 07:36
Start Date: 17/Jul/20 07:36
Worklog Time Spent: 10m
Work Description: deniskuzZ merged pull request #1244:
URL: https://github.com/apache/hive/pull/1244

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 460124)
Time Spent: 1h 20m (was: 1h 10m)

> HbaseStorageHandler is not configured properly when the FileSinkOperator is
> the child of a MergeJoinWork
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-23837
>                 URL: https://issues.apache.org/jira/browse/HIVE-23837
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Peter Varga
>            Assignee: Peter Varga
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> If the FileSinkOperator's root operator is a MergeJoinWork, the
> HbaseStorageHandler.configureJobConf will never get called, and the execution
> will miss the HBASE_AUTH_TOKEN and the hbase jars.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
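The bug described above is a tree-walking gap: storage-handler configuration applied only to sinks found at the top level of a work never reaches a FileSinkOperator nested under a MergeJoinWork. A standalone sketch of the general fix pattern, walking the entire work tree so every sink gets configured (WorkNode and the method names are hypothetical stand-ins, not Hive's actual operator classes):

```java
import java.util.*;

// Hypothetical plan node; in Hive this would be the BaseWork/operator tree.
class WorkNode {
    final String name;
    final List<WorkNode> children = new ArrayList<>();
    WorkNode(String name) { this.name = name; }
    WorkNode add(WorkNode child) { children.add(child); return this; }
}

public class ConfigureAllSinks {
    // Depth-first walk that reaches every FileSink, even one nested under
    // a merge-join node; a shallow children-only scan would miss it.
    static List<String> configuredSinks(WorkNode root) {
        List<String> configured = new ArrayList<>();
        Deque<WorkNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            WorkNode n = stack.pop();
            if (n.name.startsWith("FileSink")) {
                // Here the real code would invoke the storage handler's
                // job configuration (credentials, extra jars).
                configured.add(n.name);
            }
            n.children.forEach(stack::push);
        }
        return configured;
    }

    public static void main(String[] args) {
        WorkNode plan = new WorkNode("MergeJoin").add(new WorkNode("FileSink-1"));
        System.out.println(configuredSinks(plan)); // finds the nested sink
    }
}
```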
[jira] [Work logged] (HIVE-23474) Deny Repl Dump if the database is a target of replication
[ https://issues.apache.org/jira/browse/HIVE-23474?focusedWorklogId=460100&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-460100 ]

ASF GitHub Bot logged work on HIVE-23474:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 17/Jul/20 06:23
Start Date: 17/Jul/20 06:23
Worklog Time Spent: 10m
Work Description: pkumarsinha commented on a change in pull request #1247:
URL: https://github.com/apache/hive/pull/1247#discussion_r456242103

##
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java
##
@@ -944,96 +944,6 @@ public void testIncrementalDumpMultiIteration() throws Throwable {
     Assert.assertEquals(IncrementalLoadTasksBuilder.getNumIteration(), numEvents);
   }
-  @Test
-  public void testIfCkptAndSourceOfReplPropsIgnoredByReplDump() throws Throwable {

Review comment:
   This was testing that source-of-replication properties are ignored by replication while the custom ones are not. Can we retain that part?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java
##
@@ -219,6 +219,11 @@ private void initReplDump(ASTNode ast) throws HiveException {
           " as it is not a source of replication (repl.source.for)");
       throw new SemanticException(ErrorMsg.REPL_DATABASE_IS_NOT_SOURCE_OF_REPLICATION.getMsg());
     }
+    if (ReplUtils.isTargetOfReplication(database)) {
+      LOG.error("Cannot dump database " + dbNameOrPattern +
+          " as it is a target of replication (repl.target.for)");

Review comment:
   nit: Can accommodate in one line

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 460100)
Time Spent: 20m (was: 10m)

> Deny Repl Dump if the database is a target of replication
> ---------------------------------------------------------
>
>                 Key: HIVE-23474
>                 URL: https://issues.apache.org/jira/browse/HIVE-23474
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23474.01.patch, HIVE-23474.02.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
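The change under review above adds a guard that rejects REPL DUMP on a database already marked as a replication target via the repl.target.for database property. A self-contained sketch of that kind of check (the class, method, and exception names here are illustrative stand-ins, not Hive's ReplUtils/SemanticException API):

```java
import java.util.Map;

public class ReplDumpGuard {
    // Database property that marks a database as a replication target
    // (the property name comes from the issue; everything else is hypothetical).
    static final String TARGET_OF_REPLICATION = "repl.target.for";

    // Reject a dump when the database parameters mark it as a replication target:
    // dumping a target and replaying that dump elsewhere would fork the replica.
    static void checkDumpAllowed(String dbName, Map<String, String> dbParams) {
        String target = dbParams.get(TARGET_OF_REPLICATION);
        if (target != null && !target.trim().isEmpty()) {
            throw new IllegalStateException("Cannot dump database " + dbName
                + " as it is a target of replication (repl.target.for)");
        }
    }

    public static void main(String[] args) {
        checkDumpAllowed("sales", Map.of()); // not a target: allowed
        try {
            checkDumpAllowed("sales_replica", Map.of(TARGET_OF_REPLICATION, "policy1"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // dump denied for the target database
        }
    }
}
```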