Review Request 71561: HIVE-22250

2019-09-30 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71561/
---

Review request for hive, Jesús Camacho Rodríguez, Zoltan Haindrich, and Vineet 
Garg.


Bugs: HIVE-22250
https://issues.apache.org/jira/browse/HIVE-22250


Repository: hive-git


Description
---

Describe function does not provide description for rank functions
=
The `DESCRIBE FUNCTION` command gets the description of a function from the 
`value` field of the `@Description` annotation. If a UDF is annotated with 
`@WindowFunctionDescription` instead, Hive prints 
```
There is no documentation for function 
```
even though the description is present in the `@WindowFunctionDescription` 
annotation.
This patch implements a fallback that reads the description text from 
`@WindowFunctionDescription` when the `@Description` annotation does not exist.
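The fallback can be sketched as a standalone Java snippet. The annotations below are toy stand-ins modeled on Hive's `@Description` and `@WindowFunctionDescription` (which nests a description), not the real Hive classes:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Standalone sketch of the fallback; the annotation types here are
// illustrative stand-ins, not Hive's actual annotations.
public class DescFunctionFallback {

    @Retention(RetentionPolicy.RUNTIME)
    @interface Description { String value(); }

    @Retention(RetentionPolicy.RUNTIME)
    @interface WindowFunctionDescription { Description description(); }

    // Prefer @Description; fall back to the @Description nested inside
    // @WindowFunctionDescription; otherwise return the stock message.
    static String describe(Class<?> udfClass) {
        Description d = udfClass.getAnnotation(Description.class);
        if (d != null) {
            return d.value();
        }
        WindowFunctionDescription w =
                udfClass.getAnnotation(WindowFunctionDescription.class);
        if (w != null) {
            return w.description().value();
        }
        return "There is no documentation for function ";
    }

    // A rank-style UDF annotated only with the window-function annotation.
    @WindowFunctionDescription(description = @Description("computes the rank of rows"))
    static class DenseRank { }

    public static void main(String[] args) {
        System.out.println(describe(DenseRank.class)); // falls back to the nested description
    }
}
```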


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/ddl/function/desc/DescFunctionOperation.java
 6a94a93ef9 
  ql/src/test/queries/clientpositive/desc_function.q d055d9ca03 
  ql/src/test/results/clientpositive/desc_function.q.out 1f804bba60 


Diff: https://reviews.apache.org/r/71561/diff/1/


Testing
---

Added test cases to `desc_function.q`:
```
DESCRIBE FUNCTION dense_rank;
DESCRIBE FUNCTION EXTENDED dense_rank;
```


Thanks,

Krisztian Kasa



[jira] [Created] (HIVE-22275) OperationManager.queryIdOperation does not properly clean up multiple queryIds

2019-09-30 Thread Jason Dere (Jira)
Jason Dere created HIVE-22275:
-

 Summary: OperationManager.queryIdOperation does not properly clean 
up multiple queryIds
 Key: HIVE-22275
 URL: https://issues.apache.org/jira/browse/HIVE-22275
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Jason Dere
Assignee: Jason Dere


In the case that multiple statements are run by a single Session before being 
cleaned up, it appears that OperationManager.queryIdOperation is not cleaned up 
properly.
See the log statements below: with the exception of the first "Removed 
queryId:" log line, the queryId listed during cleanup is the same, even though 
each of these handles should have its own queryId. It looks like only the last 
executed queryId is being cleaned up.

As a result, HS2 can run out of memory, as OperationManager.queryIdOperation 
grows and these queryIds/Operations are never cleaned up.
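The suspected leak and the obvious fix can be sketched with a standalone map. The class and method names below mirror OperationManager but are illustrative, not HiveServer2's actual code:

```java
import java.util.HashMap;
import java.util.Map;

// Standalone sketch of the suspected leak: if cleanup removes the session's
// *current* queryId instead of the queryId recorded for the handle being
// closed, earlier entries accumulate forever.
public class QueryIdCleanup {
    private final Map<String, String> queryIdOperation = new HashMap<>(); // queryId -> operation handle
    private final Map<String, String> queryIdByHandle = new HashMap<>();  // handle -> its own queryId

    void addOperation(String handle, String queryId) {
        queryIdOperation.put(queryId, handle);
        queryIdByHandle.put(handle, queryId);
    }

    // Buggy pattern: cleanup uses whatever queryId the session currently
    // reports, so closing an older handle removes the wrong (or no) entry.
    void removeOperationBuggy(String handle, String currentSessionQueryId) {
        queryIdOperation.remove(currentSessionQueryId);
        queryIdByHandle.remove(handle);
    }

    // Fixed pattern: remove by the queryId that was recorded for this handle.
    void removeOperationFixed(String handle) {
        String queryId = queryIdByHandle.remove(handle);
        if (queryId != null) {
            queryIdOperation.remove(queryId);
        }
    }

    int size() { return queryIdOperation.size(); }

    public static void main(String[] args) {
        QueryIdCleanup m = new QueryIdCleanup();
        m.addOperation("handle-1", "q1");
        m.addOperation("handle-2", "q2");
        // Session's "current" queryId is q2 while handle-1 is being closed:
        m.removeOperationBuggy("handle-1", "q2");
        System.out.println("entries left after buggy cleanup: " + m.size()); // q1 leaked
    }
}
```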

{noformat}
2019-09-13T08:37:36,785 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=dfed4c18-a284-4640-9f4a-1a20527105f9]
2019-09-13T08:37:38,432 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Removed queryId: hive_20190913083736_c49cf3cc-cfe8-48a1-bd22-8b924dfb0396 
corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=dfed4c18-a284-4640-9f4a-1a20527105f9] with tag: null
2019-09-13T08:37:38,469 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=24d0030c-0e49-45fb-a918-2276f0941cfb]
2019-09-13T08:37:52,662 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=b983802c-1dec-4fa0-8680-d05ab555321b]
2019-09-13T08:37:56,239 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=75dbc531-2964-47b2-84d7-85b59f88999c]
2019-09-13T08:38:02,551 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=72c79076-9d67-4894-a526-c233fa5450b2]
2019-09-13T08:38:10,558 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=17b30a62-612d-4b70-9ba7-4287d2d9229b]
2019-09-13T08:38:16,930 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=ea97e99d-cc77-470b-b49a-b869c73a4615]
2019-09-13T08:38:20,440 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=a277b789-ebb8-4925-878f-6728d3e8c5fb]
2019-09-13T08:38:26,303 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=9a023ab8-aa80-45db-af88-94790cc83033]
2019-09-13T08:38:30,791 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=b697c801-7da0-4544-bcfa-442eb1d3bd77]
2019-09-13T08:39:10,187 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=bda93c8f-0822-4592-a61c-4701720a1a5c]
2019-09-13T08:39:15,471 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Removed queryId: hive_20190913083910_c4809ca8-d8db-423c-8b6d-fbe3eee89971 
corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=24d0030c-0e49-45fb-a918-2276f0941cfb] with tag: null
2019-09-13T08:39:15,507 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a 
HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - 
Removed queryId: hive_20190913083910_c4809ca8-d8db-423c-8b6d-fbe3eee89971 
corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, 

[jira] [Created] (HIVE-22274) Upgrade Calcite version to 1.21.0

2019-09-30 Thread Steve Carlin (Jira)
Steve Carlin created HIVE-22274:
---

 Summary: Upgrade Calcite version to 1.21.0
 Key: HIVE-22274
 URL: https://issues.apache.org/jira/browse/HIVE-22274
 Project: Hive
  Issue Type: Task
Affects Versions: 3.1.2
Reporter: Steve Carlin
Assignee: Steve Carlin






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Review Request 71558: HIVE-21987: Hive is unable to read Parquet int32 annotated with decimal

2019-09-30 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71558/#review217991
---


Ship it!




Ship It!

- Peter Vary


On Sept. 30, 2019, 11:53 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71558/
> ---
> 
> (Updated Sept. 30, 2019, 11:53 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-21987
> https://issues.apache.org/jira/browse/HIVE-21987
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Added support to read INT32 Parquet decimals.
> 
> 
> Diffs
> -
> 
>   data/files/parquet_int_decimal_1.parquet PRE-CREATION 
>   data/files/parquet_int_decimal_2.parquet PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
> 350ae2d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java
>  320ce52 
>   ql/src/test/queries/clientpositive/parquet_int_decimal.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_int_decimal.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/type_change_test_fraction.q.out 07cf8fa 
> 
> 
> Diff: https://reviews.apache.org/r/71558/diff/1/
> 
> 
> Testing
> ---
> 
> Added new q tests for the use-case.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



[jira] [Created] (HIVE-22273) Access check is failed when a temporary directory is removed

2019-09-30 Thread Peter Vary (Jira)
Peter Vary created HIVE-22273:
-

 Summary: Access check is failed when a temporary directory is 
removed
 Key: HIVE-22273
 URL: https://issues.apache.org/jira/browse/HIVE-22273
 Project: Hive
  Issue Type: Bug
Reporter: Peter Vary
Assignee: Peter Vary


The following exception is thrown if a temporary file is deleted during the 
access checks:
{code}
2019-09-24T16:12:59,611 ERROR [7e491237-1505-4388-afb9-5ec2a688b0dc 
HiveServer2-HttpHandler-Pool: Thread-11941]: authorizer.RangerHiveAuthorizer 
(:()) - Error getting permissions for hdfs://HDFS_FOLDER/TABLE_NAME
java.io.FileNotFoundException: File 
hdfs://HDFS_FOLDER/TABLE_NAME/.hive-staging_hive_2019-09-24_16-12-48_899_7291847300113791212-245/_tmp.-ext-10001
 does not exist.
at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059)
 ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
at 
org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
 ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
at 
org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119)
 ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
at 
org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116)
 ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126)
 ~[hadoop-hdfs-client-3.1.1.3.1.0.0-78.jar:?]
at 
org.apache.hadoop.hive.common.FileUtils.checkIsOwnerOfFileHierarchy(FileUtils.java:561)
 ~[hive-common-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at 
org.apache.hadoop.hive.common.FileUtils.checkIsOwnerOfFileHierarchy(FileUtils.java:564)
 ~[hive-common-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at 
org.apache.hadoop.hive.common.FileUtils.checkIsOwnerOfFileHierarchy(FileUtils.java:564)
 ~[hive-common-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at org.apache.hadoop.hive.common.FileUtils$4.run(FileUtils.java:540) 
~[hive-common-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at org.apache.hadoop.hive.common.FileUtils$4.run(FileUtils.java:536) 
~[hive-common-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_171]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_171]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
at 
org.apache.hadoop.hive.common.FileUtils.isOwnerOfFileHierarchy(FileUtils.java:536)
 ~[hive-common-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at 
org.apache.hadoop.hive.common.FileUtils.isOwnerOfFileHierarchy(FileUtils.java:527)
 ~[hive-common-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at 
org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.isURIAccessAllowed(RangerHiveAuthorizer.java:1420)
 ~[?:?]
at 
org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.checkPrivileges(RangerHiveAuthorizer.java:287)
 ~[?:?]
at org.apache.hadoop.hive.ql.Driver.doAuthorizationV2(Driver.java:1336) 
~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:1100) 
~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:709) 
~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1869) 
~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1816) 
~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1811) 
~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
 ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:197)
 ~[hive-service-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262)
 ~[hive-service-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247) 
~[hive-service-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:575)
 ~[hive-service-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:561)
 ~[hive-service-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
at sun.reflect.GeneratedMethodAccessor148.invoke(Unknown Source) ~[?:?]

Review Request 71558: HIVE-21987: Hive is unable to read Parquet int32 annotated with decimal

2019-09-30 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71558/
---

Review request for hive and Peter Vary.


Bugs: HIVE-21987
https://issues.apache.org/jira/browse/HIVE-21987


Repository: hive-git


Description
---

Added support to read INT32 Parquet decimals.


Diffs
-

  data/files/parquet_int_decimal_1.parquet PRE-CREATION 
  data/files/parquet_int_decimal_2.parquet PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
350ae2d 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java
 320ce52 
  ql/src/test/queries/clientpositive/parquet_int_decimal.q PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_int_decimal.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/type_change_test_fraction.q.out 07cf8fa 


Diff: https://reviews.apache.org/r/71558/diff/1/


Testing
---

Added new q tests for the use-case.


Thanks,

Marta Kuczora



[jira] [Created] (HIVE-22272) Hive embedded HS2 throws metastore exceptions from MetastoreStatsConnector thread

2019-09-30 Thread mahesh kumar behera (Jira)
mahesh kumar behera created HIVE-22272:
--

 Summary: Hive embedded HS2 throws metastore exceptions from 
MetastoreStatsConnector thread
 Key: HIVE-22272
 URL: https://issues.apache.org/jira/browse/HIVE-22272
 Project: Hive
  Issue Type: Bug
Reporter: mahesh kumar behera
Assignee: mahesh kumar behera


The Hive config is not passed to MetastoreStatsConnector. This causes 
RuntimeStatsLoader to connect to the embedded HMS (even though HMS is 
configured to be remote) and leads to metastore exceptions, as the metastore 
db will not have been created.





[jira] [Created] (HIVE-22271) Create index on the TBL_COL_PRIVS table for the columns COLUMN_NAME, PRINCIPAL_NAME, PRINCIPAL_TYPE and TBL_ID

2019-09-30 Thread Marta Kuczora (Jira)
Marta Kuczora created HIVE-22271:


 Summary: Create index on the TBL_COL_PRIVS table for the columns 
COLUMN_NAME, PRINCIPAL_NAME, PRINCIPAL_TYPE and TBL_ID
 Key: HIVE-22271
 URL: https://issues.apache.org/jira/browse/HIVE-22271
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Marta Kuczora


In one of the escalations for HDP-3.1.0 we found that the table privilege 
checks could be very slow, and that these checks could be sped up by defining 
an index on the TBL_COL_PRIVS table for the following columns: 
COLUMN_NAME, PRINCIPAL_NAME, PRINCIPAL_TYPE, TBL_ID

In the MySQL slow query log, we found that the following query executes 
slowly:
{noformat}
SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTableColumnPrivilege' 
AS 
`NUCLEUS_TYPE`,`A0`.`AUTHORIZER`,`A0`.`COLUMN_NAME`,`A0`.`CREATE_TIME`,`A0`.`GRANT_OPTION`,`A0`.`GRANTOR`,`A0`.`GRANTOR_TYPE`,`A0`.`PRINCIPAL_NAME`,`A0`.`PRINCIPAL_TYPE`,`A0`.`TBL_COL_PRIV`,`A0`.`TBL_COLUMN_GRANT_ID`
 FROM `TBL_COL_PRIVS` `A0` LEFT OUTER JOIN `TBLS` `B0` ON `A0`.`TBL_ID` = 
`B0`.`TBL_ID` LEFT OUTER JOIN `DBS` `C0` ON `B0`.`DB_ID` = `C0`.`DB_ID` WHERE 
`A0`.`PRINCIPAL_NAME` = 'xxx' AND `A0`.`PRINCIPAL_TYPE` = 'GROUP' AND 
`B0`.`TBL_NAME` = '' AND `C0`.`NAME` = 'xxx' AND `C0`.`CTLG_NAME` = 'xxx' 
AND `A0`.`COLUMN_NAME` = 'xxx'
{noformat}
When we checked the explain plan of this query, we could see that the index 
defined on the TBL_COL_PRIVS table is not used. The slow query filters on the 
COLUMN_NAME, PRINCIPAL_NAME, PRINCIPAL_TYPE and TBL_ID columns, and after 
creating an index on exactly these columns, we saw a significant performance 
improvement.
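A composite index matching the query's equality predicates could be created like this (illustrative MySQL DDL; the index name is hypothetical, and the column order follows the filter columns listed above):
{noformat}
CREATE INDEX TBL_COL_PRIVS_COL_PRINC_IDX
  ON TBL_COL_PRIVS (COLUMN_NAME, PRINCIPAL_NAME, PRINCIPAL_TYPE, TBL_ID);
{noformat}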





[jira] [Created] (HIVE-22270) Upgrade commons-io to 2.6

2019-09-30 Thread David Lavati (Jira)
David Lavati created HIVE-22270:
---

 Summary: Upgrade commons-io to 2.6
 Key: HIVE-22270
 URL: https://issues.apache.org/jira/browse/HIVE-22270
 Project: Hive
  Issue Type: Improvement
Reporter: David Lavati
Assignee: David Lavati


Hive currently uses commons-io 2.4, and according to HIVE-21273 a number of 
issues are present in it which can be resolved by upgrading to 2.6:

??IOUtils copyLarge() and skip() methods are performance hogs??
?? affectsVersions:2.3;2.4??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-355?filter=allopenissues]??
?? CharSequenceInputStream#reset() behaves incorrectly in case when buffer size 
is not dividable by data size??
?? affectsVersions:2.4??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-356?filter=allopenissues]??
?? [Tailer] InterruptedException while the thread is sleeping is silently 
ignored??
?? affectsVersions:2.4??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-357?filter=allopenissues]??
?? IOUtils.contentEquals* methods returns false if input1 == input2; should 
return true??
?? affectsVersions:2.4??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-362?filter=allopenissues]??
?? Apache Commons - standard links for documents are failing??
?? affectsVersions:2.4??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-369?filter=allopenissues]??
?? FileUtils.sizeOfDirectoryAsBigInteger can overflow??
?? affectsVersions:2.4??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-390?filter=allopenissues]??
?? Regression in FileUtils.readFileToString from 2.0.1??
?? affectsVersions:2.1;2.2;2.3;2.4??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-453?filter=allopenissues]??
?? Correct exception message in FileUtils.getFile(File; String...)??
?? affectsVersions:2.4??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-479?filter=allopenissues]??
?? org.apache.commons.io.FileUtils#waitFor waits too long??
?? affectsVersions:2.4??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-481?filter=allopenissues]??
?? FilenameUtils should handle embedded null bytes??
?? affectsVersions:2.4??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-484?filter=allopenissues]??
?? Exceptions are suppressed incorrectly when copying files.??
?? affectsVersions:2.4;2.5??
?? 
[https://issues.apache.org/jira/projects/IO/issues/IO-502?filter=allopenissues]??





Review Request 71555: Incompatible java.util.ArrayList for java 11

2019-09-30 Thread Attila Magyar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71555/
---

Review request for hive, Laszlo Bodor, Ashutosh Chauhan, and Prasanth_J.


Bugs: HIVE-22097
https://issues.apache.org/jira/browse/HIVE-22097


Repository: hive-git


Description
---

The following exception occurs when running a query on Java 11:

java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.(SerializationUtilities.java:390)
at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
at 
org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
at 
org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
at 
org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
at 
org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
at 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.lang.NoSuchFieldException: parentOffset
at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.(SerializationUtilities.java:384)
... 29 more

The internal structure of ArrayList$SubList changed in newer JDKs, so our 
serializer fails. This serializer comes from the kryo-serializers package, 
where the code has already been updated. This patch does the same.
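The JDK-version-tolerant field lookup, the pattern the updated kryo-serializers code follows, can be sketched like this. The helper below is standalone and illustrative, not the actual kryo-serializers code:

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Standalone sketch: resolve whichever of several candidate field names
// exists, instead of hard-coding one and failing on newer JDKs.
public class SubListFieldLookup {

    static Field findField(Class<?> cls, String... candidates) {
        for (String name : candidates) {
            try {
                return cls.getDeclaredField(name);
            } catch (NoSuchFieldException ignored) {
                // Try the next candidate instead of failing outright,
                // which is what the old serializer effectively did.
            }
        }
        throw new RuntimeException(
                "none of the candidate fields exist: " + String.join(", ", candidates));
    }

    public static void main(String[] args) {
        List<Integer> sub = new ArrayList<>(List.of(1, 2, 3)).subList(0, 2);
        Class<?> subListClass = sub.getClass(); // java.util.ArrayList$SubList
        // Java 8 kept the offset in "parentOffset"; Java 9+ renamed the
        // bookkeeping, so fall back to "offset" (present on both layouts).
        Field offsetField = findField(subListClass, "parentOffset", "offset");
        System.out.println("resolved field: " + offsetField.getName());
    }
}
```

Note that only `getDeclaredField` is needed to probe the layout; no `setAccessible` call is required until the serializer actually reads the field.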


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/SerializationUtilities.java 
e4d33e82168 


Diff: https://reviews.apache.org/r/71555/diff/1/


Testing
---

Tested on a real cluster with Java 11.


Thanks,

Attila Magyar



[jira] [Created] (HIVE-22269) Missing stats in the operator with "hive.optimize.sort.dynamic.partition" (SortedDynPartitionOptimizer) misestimates reducer count

2019-09-30 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-22269:
---

 Summary: Missing stats in the operator with 
"hive.optimize.sort.dynamic.partition" (SortedDynPartitionOptimizer) 
misestimates reducer count
 Key: HIVE-22269
 URL: https://issues.apache.org/jira/browse/HIVE-22269
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Rajesh Balamohan


{{hive.optimize.sort.dynamic.partition=true}} introduces a new stage to reduce 
the number of writes in the dynamic partitioning use case. Earlier, 
{{SortedDynPartitionOptimizer}} added this new operator via {{Optimizer.java}}, 
and the stats for the newly added operator were populated via 
{{StatsRulesProcFactory$ReduceSinkStatsRule}}.

However, this changed with HIVE-20703: the logic moved to {{TezCompiler}} so 
that it can be a cost-based decision. Though the operator gets added correctly, 
its stats do not get added (as it now runs after runStatsAnnotation()).

This causes the reducer count to be mis-estimated in the query.
{noformat}
e.g For the following query, reducer_2 would be estimated as "2" instead of 
"1009". This causes huge delay in the runtime.

explain 
from tpcds_xtext_1000.store_sales ss
insert overwrite table store_sales partition (ss_sold_date_sk)
select
ss.ss_sold_time_sk,
ss.ss_item_sk,
ss.ss_customer_sk,
ss.ss_cdemo_sk,
ss.ss_hdemo_sk,
ss.ss_addr_sk,
ss.ss_store_sk,
ss.ss_promo_sk,
ss.ss_ticket_number,
ss.ss_quantity,
ss.ss_wholesale_cost,
ss.ss_list_price,
ss.ss_sales_price,
ss.ss_ext_discount_amt,
ss.ss_ext_sales_price,
ss.ss_ext_wholesale_cost,
ss.ss_ext_list_price,
ss.ss_ext_tax,
ss.ss_coupon_amt,
ss.ss_net_paid,
ss.ss_net_paid_inc_tax,
ss.ss_net_profit,
ss.ss_sold_date_sk
where ss.ss_sold_date_sk is not null
insert overwrite table store_sales partition (ss_sold_date_sk)
select
ss.ss_sold_time_sk,
ss.ss_item_sk,
ss.ss_customer_sk,
ss.ss_cdemo_sk,
ss.ss_hdemo_sk,
ss.ss_addr_sk,
ss.ss_store_sk,
ss.ss_promo_sk,
ss.ss_ticket_number,
ss.ss_quantity,
ss.ss_wholesale_cost,
ss.ss_list_price,
ss.ss_sales_price,
ss.ss_ext_discount_amt,
ss.ss_ext_sales_price,
ss.ss_ext_wholesale_cost,
ss.ss_ext_list_price,
ss.ss_ext_tax,
ss.ss_coupon_amt,
ss.ss_net_paid,
ss.ss_net_paid_inc_tax,
ss.ss_net_profit,
ss.ss_sold_date_sk
where ss.ss_sold_date_sk is null
distribute by ss.ss_item_sk
;
{noformat}


