Re: [VOTE] Release Apache Hive 4.0.0 (Release Candidate 0)
+1 (binding)

Thanks a lot Denys for driving the release!

* Verified the checksum and signature [OK]
* Built Hive 4.0.0 from source [OK]
* Initialized metastore with MySQL [OK]
* Built package and ran metastore and hiveserver [OK]
* Deployed and started the binary tar with Hadoop 3.3.6 and Tez 0.10.3 [OK]
* Ran some simple Hive queries with external/acid/iceberg tables [OK]

Regards,
Marta

On Tue, Mar 26, 2024 at 8:26 AM Denys Kuzmenko wrote:

> Hi Everyone,
>
> We would like to thank everyone who has contributed to the project and
> request the Hive PMC members to review and vote on this new release candidate.
>
> Apache Hive 4.0.0 RC-0 artifacts are available here:
> https://people.apache.org/~dkuzmenko/apache-hive-4.0.0-rc0/
>
> The checksums are as follows:
> - 83eb88549ae88d3df6a86bb3e2526c7f4a0f21acafe21452c18071cee058c666
>   apache-hive-4.0.0-bin.tar.gz
> - 4dbc9321d245e7fd26198e5d3dff95e5f7d0673d54d0727787d72956a1bca4f5
>   apache-hive-4.0.0-src.tar.gz
>
> You can find the KEYS file here:
> https://downloads.apache.org/hive/KEYS
>
> A staged Maven repository URL is:
> https://repository.apache.org/content/repositories/orgapachehive-1127/
>
> The git commit hash is:
> https://github.com/apache/hive/commit/183f8cb41d3dbed961ffd27999876468ff06690c
>
> This corresponds to the tag: release-4.0.0-rc0
> https://github.com/apache/hive/tree/release-4.0.0-rc0
>
> The vote is open for the next 72 hours and passes if a majority of at least
> three +1 PMC votes are cast.
>
> (Only PMC members have binding votes, however, other community members
> are encouraged to cast non-binding votes.)
>
> [ ] +1 Release this package as Apache Hive 4.0.0
> [ ] +0
> [ ] -1 Do not release this because...
>
> Please download, verify, and test.
>
> Regards,
> Denys
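The checksum step from the checklist in this vote mail can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names `sha256_of` and `matches_published` are made up for illustration, the artifact is assumed to be already downloaded to the local disk, and signature verification against the KEYS file is not covered here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large tarballs do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, published_hex: str) -> bool:
    """Compare a local artifact with a digest copied from the vote mail."""
    # Tolerate surrounding whitespace and upper-case hex in the pasted value.
    return sha256_of(path) == published_hex.strip().lower()
```

For the binary tarball, a call such as `matches_published(Path("apache-hive-4.0.0-bin.tar.gz"), "83eb8854...")` with the full digest from the mail would be expected to return `True` for an intact download.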
[jira] [Created] (HIVE-25457) Implement querying Iceberg table metadata
Marta Kuczora created HIVE-25457:
Summary: Implement querying Iceberg table metadata
Key: HIVE-25457
URL: https://issues.apache.org/jira/browse/HIVE-25457
Project: Hive
Issue Type: New Feature
Reporter: Marta Kuczora
Assignee: Marta Kuczora
Fix For: 4.0.0
--
This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [EXTERNAL] Re: Welcome Marta to Hive PMC
Thanks a lot, I am really honored.

On Tue, Aug 3, 2021 at 11:31 AM Sankar Hariappan wrote:

> Congrats Marta!
>
> Thanks,
> Sankar
>
> -----Original Message-----
> From: Peter Vary
> Sent: 03 August 2021 14:26
> To: dev@hive.apache.org
> Subject: [EXTERNAL] Re: Welcome Marta to Hive PMC
>
> Congratulations Marta!
>
> > On Aug 3, 2021, at 10:01, Karen Coppage wrote:
> >
> > Congratulations!!
> >
> > Karen
> >
> >> On 2021. Aug 3., at 6:50, Ashutosh Chauhan wrote:
> >>
> >> Hi all,
> >>
> >> It's an honor to announce that Apache Hive PMC has recently voted to
> >> invite Marta Kuczora as a new Hive PMC member. Marta is a long time
> >> Hive contributor and committer, and has made significant contributions in Hive.
> >> Please join me in congratulating her and looking forward to a bigger
> >> role that she will play in the Apache Hive project.
> >>
> >> Thanks,
> >> Ashutosh
[jira] [Created] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook which breaks the build
Marta Kuczora created HIVE-25357:
Summary: Fix the checkstyle issue in HiveIcebergMetaHook which breaks the build
Key: HIVE-25357
URL: https://issues.apache.org/jira/browse/HIVE-25357
Project: Hive
Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Marta Kuczora
Assignee: Marta Kuczora
Fix For: 4.0.0

[ERROR] /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3: Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]

This issue probably came in with [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9] commit.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25325) Add TRUNCATE TABLE support for Hive Iceberg tables
Marta Kuczora created HIVE-25325:
Summary: Add TRUNCATE TABLE support for Hive Iceberg tables
Key: HIVE-25325
URL: https://issues.apache.org/jira/browse/HIVE-25325
Project: Hive
Issue Type: Improvement
Reporter: Marta Kuczora
Assignee: Marta Kuczora
Fix For: 4.0.0
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25310) Fix local test run problems with Iceberg tests: Socket closed by peer
Marta Kuczora created HIVE-25310:
Summary: Fix local test run problems with Iceberg tests: Socket closed by peer
Key: HIVE-25310
URL: https://issues.apache.org/jira/browse/HIVE-25310
Project: Hive
Issue Type: Test
Reporter: Marta Kuczora
Assignee: Marta Kuczora
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25264) Add tests to verify Hive can read/write after schema change on Iceberg table
Marta Kuczora created HIVE-25264:
Summary: Add tests to verify Hive can read/write after schema change on Iceberg table
Key: HIVE-25264
URL: https://issues.apache.org/jira/browse/HIVE-25264
Project: Hive
Issue Type: Test
Reporter: Marta Kuczora
Assignee: Marta Kuczora
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25258) Incorrect row order after query-based MINOR compaction
Marta Kuczora created HIVE-25258:
Summary: Incorrect row order after query-based MINOR compaction
Key: HIVE-25258
URL: https://issues.apache.org/jira/browse/HIVE-25258
Project: Hive
Issue Type: Bug
Reporter: Marta Kuczora
Assignee: Marta Kuczora
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25257) Incorrect row order validation for query-based MAJOR compaction
Marta Kuczora created HIVE-25257:
Summary: Incorrect row order validation for query-based MAJOR compaction
Key: HIVE-25257
URL: https://issues.apache.org/jira/browse/HIVE-25257
Project: Hive
Issue Type: Bug
Reporter: Marta Kuczora
Assignee: Marta Kuczora
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24642) Multiple file listing calls are executed in the MoveTask in case of direct inserts
Marta Kuczora created HIVE-24642:
Summary: Multiple file listing calls are executed in the MoveTask in case of direct inserts
Key: HIVE-24642
URL: https://issues.apache.org/jira/browse/HIVE-24642
Project: Hive
Issue Type: Improvement
Reporter: Marta Kuczora
Assignee: Marta Kuczora

When inserting data into a table with dynamic partitioning with direct insert on, the MoveTask performs several file listings to look up the newly created partitions and files. Check whether all of these file listings are necessary or whether the MoveTask can be optimized to do fewer listings.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24530) Potential NPE in FileSinkOperator.closeRecordwriters method
Marta Kuczora created HIVE-24530:
Summary: Potential NPE in FileSinkOperator.closeRecordwriters method
Key: HIVE-24530
URL: https://issues.apache.org/jira/browse/HIVE-24530
Project: Hive
Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Marta Kuczora
Assignee: Marta Kuczora
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24506) Investigate the materialized_view_create_rewrite_4.q test with direct insert on
Marta Kuczora created HIVE-24506:
Summary: Investigate the materialized_view_create_rewrite_4.q test with direct insert on
Key: HIVE-24506
URL: https://issues.apache.org/jira/browse/HIVE-24506
Project: Hive
Issue Type: Task
Reporter: Marta Kuczora
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24505) Investigate if the arrays in the FileSinkOperator could be replaced by Lists
Marta Kuczora created HIVE-24505:
Summary: Investigate if the arrays in the FileSinkOperator could be replaced by Lists
Key: HIVE-24505
URL: https://issues.apache.org/jira/browse/HIVE-24505
Project: Hive
Issue Type: Task
Affects Versions: 4.0.0
Reporter: Marta Kuczora
Assignee: Marta Kuczora

The FileSinkOperator uses some array variables, like
Path[] outPaths;
Path[] outPathsCommitted;
Path[] finalPaths;
RecordWriter[] outWriters;
RecordUpdater[] updaters;
Working with these is not always convenient. One example is the createDynamicBucket method, where the arrays are extended with new elements. Another is an UPDATE operation with direct insert on, where the delete deltas have to be collected separately, because the outPaths array contains only the inserted deltas. These operations would be much easier with lists.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24336) Turn off the direct insert for EXPLAIN ANALYZE queries
Marta Kuczora created HIVE-24336:
Summary: Turn off the direct insert for EXPLAIN ANALYZE queries
Key: HIVE-24336
URL: https://issues.apache.org/jira/browse/HIVE-24336
Project: Hive
Issue Type: Bug
Reporter: Marta Kuczora
Assignee: Marta Kuczora
Fix For: 4.0.0
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24322) In case of direct insert, the attempt ID has to be checked when reading the manifest files
Marta Kuczora created HIVE-24322:
Summary: In case of direct insert, the attempt ID has to be checked when reading the manifest files
Key: HIVE-24322
URL: https://issues.apache.org/jira/browse/HIVE-24322
Project: Hive
Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Marta Kuczora
Assignee: Marta Kuczora
Fix For: 4.0.0

In [IMPALA-10247|https://issues.apache.org/jira/browse/IMPALA-10247] there was an exception from Hive when trying to load the data:
{noformat}
2020-10-13T16:50:53,424 ERROR [HiveServer2-Background-Pool: Thread-23832] exec.Task: Job Commit failed with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.io.EOFException)'
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
 at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1468)
 at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:798)
 at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
 at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
 at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:627)
 at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:342)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
 at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
 at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
 at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
 at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482)
 at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
 at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
 at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
 at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
 at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:392)
 at org.apache.hadoop.hive.ql.exec.Utilities.handleDirectInsertTableFinalPath(Utilities.java:4587)
 at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1462)
 ... 29 more
{noformat}
The reason for the exception was that Hive was trying to read an empty manifest file. Manifest files are used in case of direct insert to determine which files need to be kept and which ones need to be cleaned up. They are created by the tasks and they use the task attempt ID as postfix. In this particular test, one of the containers ran out of memory, so Tez decided to kill it right after the manifest file got created but before the paths got written into the manifest file. This was the manifest file for task attempt 0. Then Tez assigned a new container to the task, so a new attempt was made with attemptId=1. This one was successful and wrote the manifest file correctly.

But Hive didn't know about this, since the out-of-memory issue got handled by Tez under the hood, so there was no exception in Hive and therefore no clean-up in the manifest folder. When Hive reads the manifest files, it just reads every file from the defined folder, so it tried to read the manifest files for both attempt 0 and attempt 1. If there are multiple manifest files with the same name but different attempt IDs, Hive should only read the one with the highest attempt ID.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
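The selection rule proposed in this ticket (keep only the manifest with the highest attempt ID per task) can be illustrated with a short sketch. The file naming scheme `<taskId>_<attemptId>.manifest` and the function name `latest_attempt_manifests` are assumptions made for this example; the ticket only says that the attempt ID is used as a postfix, and the actual fix in Hive may look different.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Assumed (hypothetical) naming scheme: "<taskId>_<attemptId>.manifest".
MANIFEST_NAME = re.compile(r"^(?P<task>\d+)_(?P<attempt>\d+)\.manifest$")


def latest_attempt_manifests(names: Iterable[str]) -> List[str]:
    """Keep, for every task, only the manifest of the highest attempt ID."""
    best: Dict[str, Tuple[int, str]] = {}
    for name in names:
        match = MANIFEST_NAME.match(name)
        if match is None:
            # Not a manifest file under the assumed scheme, so skip it.
            continue
        task = match.group("task")
        attempt = int(match.group("attempt"))
        if task not in best or attempt > best[task][0]:
            best[task] = (attempt, name)
    return sorted(name for _, name in best.values())
```

With this filter, the empty manifest left behind by the killed attempt 0 would never be opened, because the manifest of attempt 1 for the same task supersedes it.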
[jira] [Created] (HIVE-23763) Query based minor compaction produces wrong files when rows with different buckets Ids are processed by the same FileSinkOperator
Marta Kuczora created HIVE-23763:
Summary: Query based minor compaction produces wrong files when rows with different buckets Ids are processed by the same FileSinkOperator
Key: HIVE-23763
URL: https://issues.apache.org/jira/browse/HIVE-23763
Project: Hive
Issue Type: Bug
Components: Transactions
Affects Versions: 4.0.0
Reporter: Marta Kuczora
Assignee: Marta Kuczora
Fix For: 4.0.0
--
This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Review Request 72532: HIVE-23495 AcidUtils.getAcidState cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72532/#review221005 --- Ship it! Ship It! - Marta Kuczora On June 8, 2020, 10:58 a.m., Peter Varga wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72532/ > --- > > (Updated June 8, 2020, 10:58 a.m.) > > > Review request for hive, Karen Coppage, Marta Kuczora, and Peter Vary. > > > Repository: hive-git > > > Description > --- > > since HIVE-21225 there are two redundant implementation of the > AcidUtils.getAcidState. > > The previous implementation (without the recursive listing) can be removed. > > Also the performance can be improved, by removing unnecessary fileStatus > calls. > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 635ed3149c > ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java ca234cfb37 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 1059cb227f > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java > 16c915959c > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java > 598220b0c4 > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 2a15913f9f > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java > 4e5d5b003b > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java > 7913295380 > > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java > d83a50f555 > > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java > 5e11d8d2d8 > > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java > 1bdec7df2d > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 75941b3f33 > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 337f469d1a > ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java f351f04b08 > ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java > 
e4440e9136 > ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java > f63c40a7b5 > streaming/src/test/org/apache/hive/streaming/TestStreaming.java 3a3b267927 > > > Diff: https://reviews.apache.org/r/72532/diff/3/ > > > Testing > --- > > > Thanks, > > Peter Varga > >
[jira] [Created] (HIVE-23444) Concurrent ACID direct inserts may fail with FileNotFoundException
Marta Kuczora created HIVE-23444: Summary: Concurrent ACID direct inserts may fail with FileNotFoundException Key: HIVE-23444 URL: https://issues.apache.org/jira/browse/HIVE-23444 Project: Hive Issue Type: Bug Reporter: Marta Kuczora Assignee: Marta Kuczora Fix For: 4.0.0 {noformat} 2020-04-30 15:56:54,706 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-675]: Error running hive query: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. java.io.IOException: java.io.FileNotFoundException: File hdfs://ns1/warehouse/tablespace/managed/hive/tpch_unbucketed.db/concurrent_insert_partitioned/l_tax=0.0/_tmp.delta_001_001_ does not exist. at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:362) ~[hive-service-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:241) ~[hive-service-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) ~[hive-service-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322) [hive-service-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at java.security.AccessController.doPrivileged(Native Method) [?:?] at javax.security.auth.Subject.doAs(Subject.java:423) [?:?] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) [hadoop-common-3.1.1.7.1.1.0-493.jar:?] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340) [hive-service-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?] at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?] at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.lang.Thread.run(Thread.java:834) [?:?] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.FileNotFoundException: File hdfs://ns1/warehouse/tablespace/managed/hive/tpch_unbucketed.db/concurrent_insert_partitioned/l_tax=0.0/_tmp.delta_001_001_ does not exist. at org.apache.hadoop.hive.ql.metadata.Hive.loadPartitionInternal(Hive.java:2465) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:2228) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.exec.MoveTask.handleStaticParts(MoveTask.java:522) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:442) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) ~[hive-exec-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225) ~[hive-service-3.1.3000.7.1.1.0-493.jar:3.1.3000.7.1.1.0-493
[jira] [Created] (HIVE-23442) ACID major compaction doesn't read base correct if it was written by insert overwrite by direct insert
Marta Kuczora created HIVE-23442:
Summary: ACID major compaction doesn't read base correct if it was written by insert overwrite by direct insert
Key: HIVE-23442
URL: https://issues.apache.org/jira/browse/HIVE-23442
Project: Hive
Issue Type: Bug
Reporter: Marta Kuczora
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23410) ACID: Improve the delete and update operations to avoid the move step
Marta Kuczora created HIVE-23410:
Summary: ACID: Improve the delete and update operations to avoid the move step
Key: HIVE-23410
URL: https://issues.apache.org/jira/browse/HIVE-23410
Project: Hive
Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Marta Kuczora
Assignee: Marta Kuczora

This is a follow-up task for [HIVE-21164|https://issues.apache.org/jira/browse/HIVE-21164], where the insert operation has been modified to write directly to the table locations instead of the staging directory. The same improvement should be done for the ACID update and delete operations as well.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23345) INT64 Parquet timestamps cannot be read into bigint Hive type
Marta Kuczora created HIVE-23345:
Summary: INT64 Parquet timestamps cannot be read into bigint Hive type
Key: HIVE-23345
URL: https://issues.apache.org/jira/browse/HIVE-23345
Project: Hive
Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Marta Kuczora
Assignee: Marta Kuczora
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23286) The clean-up in case of an aborted FileSinkOperator is not correct for ACID direct insert
Marta Kuczora created HIVE-23286:
Summary: The clean-up in case of an aborted FileSinkOperator is not correct for ACID direct insert
Key: HIVE-23286
URL: https://issues.apache.org/jira/browse/HIVE-23286
Project: Hive
Issue Type: Bug
Reporter: Marta Kuczora
Assignee: Marta Kuczora
Fix For: 4.0.0
--
This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Review Request 72336: HIVE-23114: Insert overwrite with dynamic partitioning is not working correctly with direct insert
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72336/
---

(Updated April 8, 2020, 1:47 p.m.)

Review request for hive and Peter Vary.

Changes
---

Fixing whitespaces.

Bugs: HIVE-23114
https://issues.apache.org/jira/browse/HIVE-23114

Repository: hive-git

Description
---

The idea behind the patch is the following:
When doing a multi-statement insert overwrite with dynamic partitioning, the partition information will be written to the manifest file. With this information, each FileSinkOperator can clean up only the partition directories it has written itself and does not clean up the partition directories written by the other FileSinkOperators. If a statement of the insert overwrite query doesn't produce any data, a manifest file will still be written; otherwise the missing manifest file would result in a clean-up on table level, which could delete the data written by the other FileSinkOperators.

Diffs (updated)
---

itests/src/test/resources/testconfiguration.properties e99ce7babb
ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java d68d8f9409
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 04166a23ee
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java e25dc54e7d
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 17e6cdf162
ql/src/test/queries/clientpositive/acid_direct_insert_insert_overwrite.q PRE-CREATION
ql/src/test/queries/clientpositive/acid_multiinsert_dyn_part.q PRE-CREATION
ql/src/test/results/clientpositive/acid_direct_insert_insert_overwrite.q.out PRE-CREATION
ql/src/test/results/clientpositive/acid_multiinsert_dyn_part.q.out PRE-CREATION
ql/src/test/results/clientpositive/llap/acid_direct_insert_insert_overwrite.q.out PRE-CREATION
ql/src/test/results/clientpositive/llap/acid_multiinsert_dyn_part.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/72336/diff/2/

Changes: https://reviews.apache.org/r/72336/diff/1-2/

Testing
---

Added specific q tests for different insert overwrite scenarios.

Thanks,

Marta Kuczora
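The clean-up rule from the description of this review request can be modelled in a few lines. This is a simplified sketch and not the code of the patch: the function name `partitions_to_clean`, the representation of a manifest as a collection of partition directory names, and the use of `None` for a missing manifest are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Set


def partitions_to_clean(
    manifest_partitions: Optional[Iterable[str]],
    table_partitions: Iterable[str],
) -> Set[str]:
    """Return the partition directories one FileSinkOperator may clean up.

    manifest_partitions is None when no manifest file exists at all, and an
    empty collection when the statement produced no data but still wrote a
    manifest (both encodings are assumptions of this sketch).
    """
    existing = set(table_partitions)
    if manifest_partitions is None:
        # Without a manifest there is no partition information, so the
        # clean-up falls back to table level and touches every partition.
        return existing
    # With a manifest, only the partitions written by this operator qualify.
    return set(manifest_partitions) & existing
```

The sketch shows why an empty manifest matters: an operator that wrote nothing gets an empty set back, whereas a missing manifest would hand it every partition of the table, including those written by the other FileSinkOperators.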
Review Request 72336: HIVE-23114: Insert overwrite with dynamic partitioning is not working correctly with direct insert
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72336/
---

Review request for hive and Peter Vary.

Bugs: HIVE-23114
https://issues.apache.org/jira/browse/HIVE-23114

Repository: hive-git

Description
---

The idea behind the patch is the following:
When doing a multi-statement insert overwrite with dynamic partitioning, the partition information will be written to the manifest file. With this information, each FileSinkOperator can clean up only the partition directories it has written itself and does not clean up the partition directories written by the other FileSinkOperators. If a statement of the insert overwrite query doesn't produce any data, a manifest file will still be written; otherwise the missing manifest file would result in a clean-up on table level, which could delete the data written by the other FileSinkOperators.

Diffs
---

itests/src/test/resources/testconfiguration.properties e99ce7babb
ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java d68d8f9409
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 04166a23ee
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java e25dc54e7d
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 17e6cdf162
ql/src/test/queries/clientpositive/acid_direct_insert_insert_overwrite.q PRE-CREATION
ql/src/test/queries/clientpositive/acid_multiinsert_dyn_part.q PRE-CREATION
ql/src/test/results/clientpositive/acid_direct_insert_insert_overwrite.q.out PRE-CREATION
ql/src/test/results/clientpositive/acid_multiinsert_dyn_part.q.out PRE-CREATION
ql/src/test/results/clientpositive/llap/acid_direct_insert_insert_overwrite.q.out PRE-CREATION
ql/src/test/results/clientpositive/llap/acid_multiinsert_dyn_part.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/72336/diff/1/

Testing
---

Added specific q tests for different insert overwrite scenarios.

Thanks,

Marta Kuczora
[jira] [Created] (HIVE-23114) Insert overwrite with dynamic partitioning is not working correctly with ACID tables with direct insert and with insert-only tables
Marta Kuczora created HIVE-23114:
Summary: Insert overwrite with dynamic partitioning is not working correctly with ACID tables with direct insert and with insert-only tables
Key: HIVE-23114
URL: https://issues.apache.org/jira/browse/HIVE-23114
Project: Hive
Issue Type: Bug
Reporter: Marta Kuczora
Assignee: Marta Kuczora
--
This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Review Request 72181: HIVE-22832: Parallelise direct insert directory cleaning process
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72181/#review219763 --- Ship it! Ship It! - Marta Kuczora On March 2, 2020, 9:22 a.m., Marton Bod wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72181/ > --- > > (Updated March 2, 2020, 9:22 a.m.) > > > Review request for hive, Marta Kuczora and Peter Vary. > > > Repository: hive-git > > > Description > --- > > HIVE-22832: Parallelise direct insert directory cleaning process > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java e9966e6364 > > > Diff: https://reviews.apache.org/r/72181/diff/1/ > > > Testing > --- > > pre-commit build success: > https://builds.apache.org/job/PreCommit-HIVE-Build/20874/ > > > Thanks, > > Marton Bod > >
[jira] [Created] (HIVE-22969) Union remove optimisation results incorrect data when inserting to ACID table
Marta Kuczora created HIVE-22969:
Summary: Union remove optimisation results incorrect data when inserting to ACID table
Key: HIVE-22969
URL: https://issues.apache.org/jira/browse/HIVE-22969
Project: Hive
Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Marta Kuczora
Assignee: Marta Kuczora

Steps to reproduce the issue:
{noformat}
create table input_text(key string, val string) stored as textfile location '/Users/martakuczora/work/hive/warehouse/external/input_text';
create table output_acid(key string, val string) stored as orc tblproperties('transactional'='true');
insert into input_text values ('1','1'), ('2','2'),('3','3');
{noformat}
{noformat}
set hive.mapred.mode=nonstrict;
set hive.stats.autogather=false;
set hive.optimize.union.remove=true;
set hive.auto.convert.join=true;
set hive.exec.submitviachild=false;
set hive.exec.submit.local.task.via.child=false;

SELECT * FROM (
select key, val from input_text
union all
select a.key as key, b.val as val FROM input_text a join input_text b on a.key=b.key) c;

The result of the select:
1 1
2 2
3 3
1 1
2 2
3 3
{noformat}
{noformat}
insert into table output_acid
SELECT * FROM (
select key, val from input_text
union all
select a.key as key, b.val as val FROM input_text a join input_text b on a.key=b.key) c;

select * from output_acid;

The result:
1 1
2 2
3 3
{noformat}
The folder of the output_acid table contained the following delta directories:
{noformat}
drwxr-xr-x 6 martakuczora staff 192 Mar 2 16:29 delta_000_000
drwxr-xr-x 6 martakuczora staff 192 Mar 2 16:29 delta_001_001_0001
{noformat}
It can be seen that the statement ID is missing from the name of the first directory, and when the select statement runs on the table, this directory is ignored. That's why only half of the data was returned when running the select on the output_acid table.

If either hive.stats.autogather is set to true or hive.optimize.union.remove is set to false, the result of the insert is correct. In this case there is only 1 delta directory in the table's folder.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
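The difference between the two delta directory names in this report can be made explicit with a small parser. This is only a sketch: it assumes the pattern `delta_<minWriteId>_<maxWriteId>[_<statementId>]`, generalized from the two names shown in the listing, and the names `DeltaDir` and `parse_delta_dir` are made up for the example.

```python
import re
from typing import NamedTuple, Optional


class DeltaDir(NamedTuple):
    min_write_id: int
    max_write_id: int
    statement_id: Optional[int]  # None when the name carries no statement ID


# Assumed pattern: delta_<minWriteId>_<maxWriteId>[_<statementId>]
DELTA_NAME = re.compile(r"^delta_(\d+)_(\d+)(?:_(\d+))?$")


def parse_delta_dir(name: str) -> Optional[DeltaDir]:
    """Split a delta directory name into write IDs and optional statement ID."""
    match = DELTA_NAME.match(name)
    if match is None:
        return None
    low, high, statement = match.groups()
    return DeltaDir(
        int(low),
        int(high),
        int(statement) if statement is not None else None,
    )
```

Applied to the listing, `parse_delta_dir("delta_000_000")` yields a `statement_id` of `None`, while `parse_delta_dir("delta_001_001_0001")` yields `1`, which is the asymmetry the report points at.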
[jira] [Created] (HIVE-22918) Investigate empty bucket file creation for ACID tables
Marta Kuczora created HIVE-22918:
Summary: Investigate empty bucket file creation for ACID tables
Key: HIVE-22918
URL: https://issues.apache.org/jira/browse/HIVE-22918
Project: Hive
Issue Type: Task
Affects Versions: 4.0.0
Reporter: Marta Kuczora
Assignee: Marton Bod

When creating an insert-only bucketed table with 5 buckets and inserting only one row into it, Hive creates empty files for the other 4 buckets. This logic is in the code for ACID tables as well, but when checking the table's final directory after the insert, I found that only 1 file got created. While debugging this issue, I found that the empty files are created in the staging directory outside the delta directory, therefore they won't get copied by the move task to the final directory. This behavior seems broken, but it is not clear whether we really need the empty files in this case. This Jira is about investigating whether or not we need these empty files for ACID tables and, if we do, fixing the code to have them for ACID tables as well.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22917) Configuration for Hive to recognise non-empty destination folders
Marta Kuczora created HIVE-22917:

Summary: Configuration for Hive to recognise non-empty destination folders
Key: HIVE-22917
URL: https://issues.apache.org/jira/browse/HIVE-22917
Project: Hive
Issue Type: Task
Reporter: Marta Kuczora
Assignee: Marta Kuczora

Currently Hive overwrites the LOCATION folder in case of INSERT or CTAS, even if the folder is non-empty. Investigate this behavior and whether we can introduce a switch: when the switch is ON and the LOCATION clause points at a non-empty folder, any ALTER/INSERT, CTAS, CREATE or DROP operation / transaction would be aborted.
{noformat}
>> create table test (json_data string) STORED AS TEXTFILE LOCATION 'hdfs://host-10-17-102-132.coe.>ra.com:8020/tmp/test' TBLPROPERTIES ('serialization.null.format' = '');
>> insert into test values('test0');
>> insert into test values('test1');
>> insert into test values('test2');
>> select * from test;
INFO : Compiling command(queryId=hive_20200207150101_601d6dbc-99cb-446d-86ac-6f8ce5304681): select * from test
INFO : Executing command(queryId=hive_20200207150101_601d6dbc-99cb-446d-86ac-6f8ce5304681): select * from test
INFO : Completed executing command(queryId=hive_20200207150101_601d6dbc-99cb-446d-86ac-6f8ce5304681); Time taken: 0.001 seconds
INFO : OK
+-----------------+
| test.json_data  |
+-----------------+
| test0           |
| test1           |
| test2           |
+-----------------+

>> select * from test_id2;
INFO : Compiling command(queryId=hive_20200207145656_e99d1a0d-ea4c-4636-ae3a-dd930df14644): select * from test_id2
INFO : Executing command(queryId=hive_20200207145656_e99d1a0d-ea4c-4636-ae3a-dd930df14644): select * from test_id2
INFO : Completed executing command(queryId=hive_20200207145656_e99d1a0d-ea4c-4636-ae3a-dd930df14644); Time taken: 0.001 seconds
INFO : OK
+--------------+
| test_id2.id  |
+--------------+
| 1            |
| 13           |
| 14           |
+--------------+

>> create table test2 (json_data int) STORED AS TEXTFILE LOCATION 'hdfs://host-10-17-102-132.coe.>ra.com:8020/tmp/test' as SELECT * from test_id;
INFO : Completed executing command(queryId=hive_20200207150303_cbb57a17-1242-46dc-a98e-addf50f01c5b); Time taken: 13.137 seconds
INFO : OK
No rows affected (13.226 seconds)

SELECT * from test;
INFO : Compiling command(queryId=hive_20200207150404_d0aabd08-a15f-4e6c-99a3-e607b8a6cfd3): SELECT * from test
INFO : Executing command(queryId=hive_20200207150404_d0aabd08-a15f-4e6c-99a3-e607b8a6cfd3): SELECT * from test
INFO : Completed executing command(queryId=hive_20200207150404_d0aabd08-a15f-4e6c-99a3-e607b8a6cfd3); Time taken: 0.001 seconds
INFO : OK
+-----------------+
| test.json_data  |
+-----------------+
| 1               |
| 13              |
| 14              |
+-----------------+
3 rows selected (0.081 seconds)
{noformat}
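The proposed switch could behave roughly like the guard below. This is a minimal sketch under assumptions: it uses java.nio.file on a local path rather than the Hadoop FileSystem API, and the names LocationGuard, checkDestination and failOnNonEmpty are hypothetical. The issue itself does not define a configuration key, so none is named here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative guard (hypothetical, not Hive code) that rejects non-empty destination folders. */
final class LocationGuard {
    /** True if the path is an existing directory with at least one entry. */
    static boolean isNonEmptyDir(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return entries.iterator().hasNext();
        }
    }

    /** Aborts the operation when the switch is on and the destination already holds files. */
    static void checkDestination(Path location, boolean failOnNonEmpty) throws IOException {
        if (failOnNonEmpty && isNonEmptyDir(location)) {
            throw new IllegalStateException("Destination is not empty: " + location);
        }
    }
}
```

With the switch on, the CTAS in the transcript above would be rejected because /tmp/test already holds the files of the test table; with the switch off, the current overwrite behavior is unchanged.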
Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction
> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java > > Lines 1732-1737 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210218#file2210218line1732> > > > > What about using a lambda here? > > Marta Kuczora wrote: > Fixed it. In the end, this part of the code was removed. - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71904/#review219487 ------- On Feb. 18, 2020, 12:21 p.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71904/ > --- > > (Updated Feb. 18, 2020, 12:21 p.m.) > > > Review request for hive, Gopal V and Peter Vary. > > > Bugs: HIVE-21164 > https://issues.apache.org/jira/browse/HIVE-21164 > > > Repository: hive-git > > > Description > --- > > Extended the original patch by saving the task attempt ids in the file > names, and also fixed some bugs in the original patch. > With this fix, inserting into an ACID table does not use a move task to place > the generated files into the final directory. It inserts every file into > the final directory and then cleans up the files which are not needed (like those > written by failed task attempts). > Also fixed the replication tests, which failed for the original patch as well. 
> > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d3cb60b790 > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > da677c7977 > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java > 056cd27496 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java > 31d15fdef9 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java > c2aa73b5f1 > itests/src/test/resources/testconfiguration.properties 1b1bf1147a > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java > 9a3258115b > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 > ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c > ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java > 8980a6292a > ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java > c4c56f8477 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java > b8a0f0465c > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java > 398698ec06 > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java > 2543dc6fc4 > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1eb9c12cc8 > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java > 73ca658d9c > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > 33d3beba46 > ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java > c102a69f8f > 
ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d > ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java > 739f2b654b > ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 > ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb6692df > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 842140815d > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java e56d83158f > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb43fc > ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0db11 > ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b2768b > ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java bb55d9fd79 > ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java ea6b1d9bec > ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java > af14
Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction
> On Feb. 4, 2020, 10:16 p.m., Rajesh Balamohan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java > > Line 4382 (original), 4397 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210218#file2210218line4397> > > > > Is this needed for direct insert? In object stores, we could have calls > > getting throttled. That's a really good question; I have thought about it a lot. I think it is not needed. This method does two things: it removes the temporary and duplicate files, and it returns the emptyBuckets list. This list contains elements if the number of buckets is larger than the number of files. In this case, for MM tables, empty files will be created. But this is not the case for ACID tables: no empty files are created for ACID tables. I want to revisit whether or not we need these empty files, but for now I would go with the same behaviour as for ACID tables. As for the temp file removal: when the direct insert is finished, all files which are not committed (meaning not listed in the manifest files) will be deleted prior to this call, so there shouldn't be any unnecessary files left at this point. I will remove this call and upload a patch to see the result of the pre-commit tests. If everything passes, I think it is safe to remove this call in case of direct insert. - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71904/#review219494 ------- On Feb. 18, 2020, 12:21 p.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71904/ > --- > > (Updated Feb. 18, 2020, 12:21 p.m.) > > > Review request for hive, Gopal V and Peter Vary. > > > Bugs: HIVE-21164 > https://issues.apache.org/jira/browse/HIVE-21164 > > > Repository: hive-git > > > Description > --- > > Extended the original patch by saving the task attempt ids in the file > names, and also fixed some bugs in the original patch. 
> With this fix, inserting into an ACID table does not use a move task to place > the generated files into the final directory. It inserts every file into > the final directory and then cleans up the files which are not needed (like those > written by failed task attempts). > Also fixed the replication tests, which failed for the original patch as well. > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d3cb60b790 > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > da677c7977 > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java > 056cd27496 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java > 31d15fdef9 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java > c2aa73b5f1 > itests/src/test/resources/testconfiguration.properties 1b1bf1147a > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java > 9a3258115b > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 > ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c > ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java > 8980a6292a > ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java > c4c56f8477 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java > b8a0f0465c > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java > 398698ec06 > > 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java > 2543dc6fc4 > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1eb9c12cc8 > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java > 73ca658d9c > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > 33d3beba46 > ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java > c102a69f8f > ql/src/java/org/apache/hadoop/hive/ql/pl
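The cleanup rule described in the reply to Rajesh (delete every file that is not listed in a manifest) amounts to a set difference. The sketch below is illustrative only: the class name is hypothetical, file names are plain strings, and the bucket_<id>_<attempt> names in the usage note are an assumed example of file names that carry a task attempt id, not a statement about the exact naming in the patch.

```java
import java.util.Set;
import java.util.TreeSet;

/** Illustrative cleanup helper (hypothetical, not Hive code) for direct insert. */
final class DirectInsertCleanup {
    /** Returns the files present in the directory that no manifest lists as committed. */
    static Set<String> uncommittedFiles(Set<String> filesInDir, Set<String> committedInManifests) {
        Set<String> toDelete = new TreeSet<>(filesInDir);
        toDelete.removeAll(committedInManifests);
        return toDelete;
    }
}
```

For example, if the directory holds bucket_00000_0, bucket_00000_1 and bucket_00001_0, and the manifests list only bucket_00000_1 and bucket_00001_0, the helper returns bucket_00000_0, the output of the failed first attempt.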
Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction
/mm_all.q.out 226f2a9374 ql/src/test/results/clientpositive/llap/tez_acid_union_dynamic_partition.q.out PRE-CREATION ql/src/test/results/clientpositive/llap/tez_acid_union_dynamic_partition_2.q.out PRE-CREATION ql/src/test/results/clientpositive/llap/tez_acid_union_multiinsert.q.out PRE-CREATION ql/src/test/results/clientpositive/mm_all.q.out 143ebd69f9 streaming/src/test/org/apache/hive/streaming/TestStreaming.java 35a220facd Diff: https://reviews.apache.org/r/71904/diff/5/ Changes: https://reviews.apache.org/r/71904/diff/4-5/ Testing --- Had to modify some tests because of the file name changes. Also added some specific tests. In the pre-commit run all tests passed successfully. Thanks, Marta Kuczora
Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction
/clientpositive/llap/insert_overwrite.q.out fbc3326b39 ql/src/test/results/clientpositive/llap/mm_all.q.out 226f2a9374 ql/src/test/results/clientpositive/llap/tez_acid_union_dynamic_partition.q.out PRE-CREATION ql/src/test/results/clientpositive/llap/tez_acid_union_dynamic_partition_2.q.out PRE-CREATION ql/src/test/results/clientpositive/llap/tez_acid_union_multiinsert.q.out PRE-CREATION ql/src/test/results/clientpositive/mm_all.q.out 143ebd69f9 streaming/src/test/org/apache/hive/streaming/TestStreaming.java 35a220facd Diff: https://reviews.apache.org/r/71904/diff/4/ Changes: https://reviews.apache.org/r/71904/diff/3-4/ Testing --- Had to modify some tests because of the file name changes. Also added some specific tests. In the pre-commit run all tests passed successfully. Thanks, Marta Kuczora
Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction
> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java > > Lines 1444 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210216#file2210216line1446> > > > > Why is this null? It is null because, if the union all optimization is on, the different union statements are translated into different FileSinkOperators and they write to their own separate directories. They normally write to the staging directory, under folders with the 'HIVE_UNION_SUBDIR_' prefix. The move tasks then move these files to the final table directory. For ACID tables these FileSinkOperators would write to different delta directories anyway, so the tasks can write directly to the final table location instead of the 'HIVE_UNION_SUBDIR_' folders. That's why the unionSuffix is null here. In other cases, it has the 'HIVE_UNION_SUBDIR_' value. Btw, I locally modified many union q tests to run with ACID tables and ran them with MR and Tez. I found one bug, which I fixed, and I also added some union q tests that run with an ACID table and direct insert. > On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java > > Lines 77 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210244#file2210244line77> > > > > We created this variable; shouldn't we use it? Maybe even make it a > > constant? You're right. I moved this to a constant and changed the tests. - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71904/#review219487 ----------- On Jan. 31, 2020, 4:12 p.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71904/ > --- > > (Updated Jan. 31, 2020, 4:12 p.m.) > > > Review request for hive, Gopal V and Peter Vary. 
> > > Bugs: HIVE-21164 > https://issues.apache.org/jira/browse/HIVE-21164 > > > Repository: hive-git > > > Description > --- > > Extended the original patch by saving the task attempt ids in the file > names, and also fixed some bugs in the original patch. > With this fix, inserting into an ACID table does not use a move task to place > the generated files into the final directory. It inserts every file into > the final directory and then cleans up the files which are not needed (like those > written by failed task attempts). > Also fixed the replication tests, which failed for the original patch as well. > > > Diffs > - > > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > da677c7977 > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java > 056cd27496 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java > 31d15fdef9 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java > c2aa73b5f1 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java > 4c0137 > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java > 9a3258115b > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 > ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c > ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java > 8980a6292a > ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java > c4c56f8477 > 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java > b8a0f0465c > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java > 398698ec06 > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java > 2543dc6fc4 > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 7f061d4a6b > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java > 73ca658d9c > ql/src/java/org/apache/hadoop/hive/q
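The answer to "Why is this null?" in this reply can be pictured with a small sketch. It is an illustration under assumptions, not the FileSinkOperator code: the class UnionTargetDir, both method names, and the numbering of the union branches are hypothetical. Only the 'HIVE_UNION_SUBDIR_' prefix and the null unionSuffix for direct insert come from the reply.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative resolution of the write target for one union branch (hypothetical, not Hive code). */
final class UnionTargetDir {
    static final String UNION_SUBDIR_PREFIX = "HIVE_UNION_SUBDIR_";

    /** With direct insert there is no union subdirectory, so the suffix is null. */
    static String unionSuffix(boolean directInsert, int branch) {
        return directInsert ? null : UNION_SUBDIR_PREFIX + branch;
    }

    /** A null suffix means the branch writes straight into the final table location. */
    static Path resolve(Path stagingDir, Path finalDir, String unionSuffix) {
        return unionSuffix == null ? finalDir : stagingDir.resolve(unionSuffix);
    }

    public static void main(String[] args) {
        Path staging = Paths.get("/staging");
        Path table = Paths.get("/warehouse/t");
        System.out.println(resolve(staging, table, unionSuffix(false, 1)));
        System.out.println(resolve(staging, table, unionSuffix(true, 1)));
    }
}
```

In the non-direct case each branch lands in its own staging subfolder and a move task is still needed; in the direct case every branch resolves to the table location, where the separate delta directories keep the outputs apart.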
Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction
> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java > > Lines 1732-1737 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210218#file2210218line1732> > > > > What about using a lambda here? Fixed it. > On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > > Lines 7442-7443 (original), 7456-7460 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210231#file2210231line7459> > > > > nit: Maybe if/else? Fixed it. > On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > > Lines 7562-7563 (original), 7600-7604 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210231#file2210231line7605> > > > > nit: Maybe if/else? Fixed it. - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71904/#review219487 --- On Jan. 31, 2020, 4:12 p.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71904/ > --- > > (Updated Jan. 31, 2020, 4:12 p.m.) > > > Review request for hive, Gopal V and Peter Vary. > > > Bugs: HIVE-21164 > https://issues.apache.org/jira/browse/HIVE-21164 > > > Repository: hive-git > > > Description > --- > > Extended the original patch by saving the task attempt ids in the file > names, and also fixed some bugs in the original patch. > With this fix, inserting into an ACID table does not use a move task to place > the generated files into the final directory. It inserts every file into > the final directory and then cleans up the files which are not needed (like those > written by failed task attempts). > Also fixed the replication tests, which failed for the original patch as well. 
> > > Diffs > - > > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > da677c7977 > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java > 056cd27496 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java > 31d15fdef9 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java > c2aa73b5f1 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java > 4c0137 > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java > 9a3258115b > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 > ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c > ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java > 8980a6292a > ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java > c4c56f8477 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java > b8a0f0465c > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java > 398698ec06 > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java > 2543dc6fc4 > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 7f061d4a6b > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java > 73ca658d9c > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > 5fcc367cc9 > ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java > c102a69f8f > 
ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d > ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java > bb70db4524 > ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 > ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb6692df > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCo
Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction
> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > > Lines 7526-7543 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210231#file2210231line7529> > > > > Is this duplicated code? Yes; however, I cannot move this whole part to a separate method, because both the acidOp and the isDirectInsert variables have to be set. I can create one method for getting the value of isDirectInsert and another method for getting the tmp dir. - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71904/#review219487 --- On Jan. 31, 2020, 4:12 p.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71904/ > --- > > (Updated Jan. 31, 2020, 4:12 p.m.) > > > Review request for hive, Gopal V and Peter Vary. > > > Bugs: HIVE-21164 > https://issues.apache.org/jira/browse/HIVE-21164 > > > Repository: hive-git > > > Description > --- > > Extended the original patch by saving the task attempt ids in the file > names, and also fixed some bugs in the original patch. > With this fix, inserting into an ACID table does not use a move task to place > the generated files into the final directory. It inserts every file into > the final directory and then cleans up the files which are not needed (like those > written by failed task attempts). > Also fixed the replication tests, which failed for the original patch as well. 
> > > Diffs > - > > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > da677c7977 > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java > 056cd27496 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java > 31d15fdef9 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java > c2aa73b5f1 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java > 4c0137 > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java > 9a3258115b > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 > ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c > ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java > 8980a6292a > ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java > c4c56f8477 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java > b8a0f0465c > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java > 398698ec06 > > ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java > 2543dc6fc4 > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 7f061d4a6b > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java > 73ca658d9c > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > 5fcc367cc9 > ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java > c102a69f8f > 
ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d > ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java > bb70db4524 > ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 > ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb6692df > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 842140815d > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 88ca683173 > ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb43fc > ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0db11 > ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b2768b > ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java bb55d9fd79 > ql/
Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction
> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > Thanks for the patch! This will be very, very useful. > > Some minor comments, questions... Thanks a lot for the review! > On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java > > Lines 55 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210213#file2210213line55> > > > > Is this import used? You're right, it is not used. Removed it. > On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java > > Lines 843 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210216#file2210216line845> > > > > Is inheritPerms still in use? I kind of remember that it was > > removed from Hive some time ago... No, I think this log message was just a copy-paste error. Fixed it. > On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java > > Lines 1799 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210218#file2210218line1799> > > > > Maybe use a slightly different log message, so we can easily distinguish > > between this and the line below. Fixed it. > On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > > Lines 7379 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210231#file2210231line7379> > > > > We might want to make this feature configurable, to turn it on/off in > > case we missed some edge cases. You are absolutely right. I introduced a config parameter so we can turn this feature on/off. > On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java > > Lines 493-494 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210235#file2210235line493> > > > > nit: Formatting? Really not important, just for completeness' sake > > :D Fixed it. 
> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java > > Lines 690-691 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210235#file2210235line690> > > > > nit: Formatting? Fixed it. > On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote: > > ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java > > Lines 1246 (patched) > > <https://reviews.apache.org/r/71904/diff/3/?file=2210248#file2210248line1246> > > > > Does this table always exist? Shall we use "drop table if exists" > > instead? Fixed it. - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71904/#review219487 --- On Jan. 31, 2020, 4:12 p.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71904/ > --- > > (Updated Jan. 31, 2020, 4:12 p.m.) > > > Review request for hive, Gopal V and Peter Vary. > > > Bugs: HIVE-21164 > https://issues.apache.org/jira/browse/HIVE-21164 > > > Repository: hive-git > > > Description > --- > > Extended the original patch by saving the task attempt ids in the file > names, and also fixed some bugs in the original patch. > With this fix, inserting into an ACID table does not use a move task to place > the generated files into the final directory. It inserts every file into > the final directory and then cleans up the files which are not needed (like those > written by failed task attempts). > Also fixed the replication tests, which failed for the original patch as well. 
> > > Diffs > - > > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > da677c7977 > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java > 056cd27496 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java > 31d15fdef9 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java > c2aa73b5f1 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java > 4c0137 > ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMer
Re: Review Request 72074: HIVE-21215: Read Parquet INT64 timestamp
> On Feb. 3, 2020, 9:12 a.m., Karen Coppage wrote: > > Thanks for the patch, looks good! Two ideas: > > 1. It would be nice to have a unit test that reads a date before October > > 1582, so it's clear that we're using the Proleptic Gregorian calendar. > > 2. ParquetTimestampUtils would be more readable if the big multipliers were > > declared as constants and/or in this format: e.g. 1_000_000. > > > > Thanks! Thanks a lot, Karen, for the review. These are really good points. I fixed the numbers in ParquetTimestampUtils and also added some test cases for dates before 1582. - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72074/#review219466 ------- On Feb. 3, 2020, 12:31 p.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/72074/ > --- > > (Updated Feb. 3, 2020, 12:31 p.m.) > > > Review request for hive, Karen Coppage and Peter Vary. > > > Bugs: HIVE-21215 > https://issues.apache.org/jira/browse/HIVE-21215 > > > Repository: hive-git > > > Description > --- > > Implemented the read path for Parquet INT64 timestamp. > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/common/type/Timestamp.java > f2c1493f56 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java > d67b030648 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/ParquetTimestampUtils.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java > 519bd813e9 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java > 2803baf90c > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java > f6ee57140c > > > Diff: https://reviews.apache.org/r/72074/diff/2/ > > > Testing > --- > > > Thanks, > > Marta Kuczora > >
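Both review suggestions can be illustrated together. The sketch below is not the ParquetTimestampUtils from the patch: the class name, the Unit enum and the method name are hypothetical, and it relies on java.time (which uses the ISO proleptic calendar) instead of Hive's own Timestamp type. It converts an INT64 count of milliseconds, microseconds or nanoseconds since the epoch into a UTC date-time, with the multipliers written as named constants in the 1_000_000 style.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

/** Illustrative INT64 timestamp conversion (hypothetical, not the class from the patch). */
final class Int64TimestampSketch {
    enum Unit { MILLIS, MICROS, NANOS }

    private static final long MILLIS_PER_SECOND = 1_000L;
    private static final long MICROS_PER_SECOND = 1_000_000L;
    private static final long NANOS_PER_SECOND = 1_000_000_000L;
    private static final long NANOS_PER_MILLI = 1_000_000L;
    private static final long NANOS_PER_MICRO = 1_000L;

    /** Converts an epoch-based INT64 value to a UTC date-time in the proleptic ISO calendar. */
    static LocalDateTime toUtcDateTime(long value, Unit unit) {
        long seconds;
        long nanos;
        switch (unit) {
            case MILLIS:
                // floorDiv/floorMod keep the nanosecond part non-negative for pre-1970 values.
                seconds = Math.floorDiv(value, MILLIS_PER_SECOND);
                nanos = Math.floorMod(value, MILLIS_PER_SECOND) * NANOS_PER_MILLI;
                break;
            case MICROS:
                seconds = Math.floorDiv(value, MICROS_PER_SECOND);
                nanos = Math.floorMod(value, MICROS_PER_SECOND) * NANOS_PER_MICRO;
                break;
            case NANOS:
                seconds = Math.floorDiv(value, NANOS_PER_SECOND);
                nanos = Math.floorMod(value, NANOS_PER_SECOND);
                break;
            default:
                throw new IllegalArgumentException("Unknown unit: " + unit);
        }
        return LocalDateTime.ofEpochSecond(seconds, (int) nanos, ZoneOffset.UTC);
    }
}
```

A test for the first suggestion would take an instant before October 1582, for example midnight on 1500-01-01 UTC expressed in microseconds, and check that the conversion returns exactly that date with no Julian/Gregorian shift.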
Re: Review Request 72074: HIVE-21215: Read Parquet INT64 timestamp
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72074/ --- (Updated Feb. 3, 2020, 12:31 p.m.) Review request for hive, Karen Coppage and Peter Vary. Bugs: HIVE-21215 https://issues.apache.org/jira/browse/HIVE-21215 Repository: hive-git Description --- Implemented the read path for Parquet INT64 timestamp. Diffs (updated) - common/src/java/org/apache/hadoop/hive/common/type/Timestamp.java f2c1493f56 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java d67b030648 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/ParquetTimestampUtils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java 519bd813e9 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java 2803baf90c ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java f6ee57140c Diff: https://reviews.apache.org/r/72074/diff/2/ Changes: https://reviews.apache.org/r/72074/diff/1-2/ Testing --- Thanks, Marta Kuczora
Review Request 72074: HIVE-21215: Read Parquet INT64 timestamp
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/72074/ --- Review request for hive, Karen Coppage and Peter Vary. Bugs: HIVE-21215 https://issues.apache.org/jira/browse/HIVE-21215 Repository: hive-git Description --- Implemented the read path for Parquet INT64 timestamp. Diffs - common/src/java/org/apache/hadoop/hive/common/type/Timestamp.java f2c1493f56 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java d67b030648 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/ParquetTimestampUtils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java 519bd813e9 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java 2803baf90c ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java f6ee57140c Diff: https://reviews.apache.org/r/72074/diff/1/ Testing --- Thanks, Marta Kuczora
Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction
--- Had to modify some tests because of the file name changes. Also added some specific tests. In the pre-commit run all tests passed successfully. Thanks, Marta Kuczora
[jira] [Created] (HIVE-22716) Reading to ByteBuffer is broken in ParquetFooterInputFromCache
Marta Kuczora created HIVE-22716: Summary: Reading to ByteBuffer is broken in ParquetFooterInputFromCache Key: HIVE-22716 URL: https://issues.apache.org/jira/browse/HIVE-22716 Project: Hive Issue Type: Bug Components: llap Reporter: Marta Kuczora Assignee: Marta Kuczora Fix For: 4.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22648) Upgrade Parquet to 1.11.0
Marta Kuczora created HIVE-22648: Summary: Upgrade Parquet to 1.11.0 Key: HIVE-22648 URL: https://issues.apache.org/jira/browse/HIVE-22648 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Marta Kuczora Assignee: Karen Coppage [WIP until Parquet community releases version 1.11.0] The new Parquet version (1.11.0) uses [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] instead of OriginalTypes. These are backwards-compatible with OriginalTypes. Thanks to [~kuczoram] for her work on this patch. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71904/ --- Review request for hive, Gopal V and Peter Vary. Bugs: HIVE-21164 https://issues.apache.org/jira/browse/HIVE-21164 Repository: hive-git Description --- Extended the original patch to save the task attempt IDs in the file names, and fixed some bugs in the original patch. With this fix, inserting into an ACID table no longer uses a move task to place the generated files into the final directory. Instead, every file is written directly to the final directory, and the files which are not needed (such as those written by failed task attempts) are cleaned up afterwards. Also fixed the replication tests, which failed with the original patch as well. Diffs - hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java da677c7 itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 2868427 itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 31d15fd itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java 445e39c itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java b7245e2 ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 9a32581 ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71 ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebe ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 3d30d09 ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960 ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb22 ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 3c508ec ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 8980a62 ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e677 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984ab ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 2ac6232 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 3fa61d3 ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java 2543dc6 ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java f4bd0f9 ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 73ca658 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 90549f9 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java c102a69 ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bde ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed0581 ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 2b2cc1a ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289 ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb669 ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 8421408 ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 88ca683 ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb4 ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0d ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b27 ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java bb55d9f ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java ea6b1d9 ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java af14e62 ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java dd70524 ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java 2c4b69b ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java c033a94 ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java cfd7290 ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestWorker.java 70ae85c ql/src/test/results/clientpositive/acid_subquery.q.out 1dc1775 ql/src/test/results/clientpositive/create_transactional_full_acid.q.out e324d5e 
ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_dynamic.q.out 61b0057 ql/src/test/results/clientpositive/llap/acid_no_buckets.q.out 5571c53 ql/src/test/results/clientpositive/llap/insert_overwrite.q.out fbc3326 ql/src/test/results/clientpositive/llap/mm_all.q.out 7542a6a ql/src/test/results/clientpositive/mm_all.q.out 1377856 streaming/src/test/org/apache/hive/streaming/TestStreaming.java 58b3ae2 Diff: https://reviews.apache.org/r/71904/diff/1/ Testing --- Had to modify some tests because of the file name changes. Also added some specific tests. In the pre-commit run all tests passed successfully. Thanks, Marta Kuczora
[jira] [Created] (HIVE-22375) ObjectStore.lockNotificationSequenceForUpdate is leaking query in case of error
Marta Kuczora created HIVE-22375: Summary: ObjectStore.lockNotificationSequenceForUpdate is leaking query in case of error Key: HIVE-22375 URL: https://issues.apache.org/jira/browse/HIVE-22375 Project: Hive Issue Type: Bug Components: Standalone Metastore Affects Versions: 4.0.0 Reporter: Marta Kuczora Assignee: Marta Kuczora
In the ObjectStore.lockNotificationSequenceForUpdate method, the query doesn't get closed if an error occurs:
{noformat}
private void lockNotificationSequenceForUpdate() throws MetaException {
  if (sqlGenerator.getDbProduct() == DatabaseProduct.DERBY && directSql != null) {
    new RetryingExecutor(conf, () -> {
      directSql.lockDbTable("NOTIFICATION_SEQUENCE");
    }).run();
  } else {
    String selectQuery = "select \"NEXT_EVENT_ID\" from \"NOTIFICATION_SEQUENCE\"";
    String lockingQuery = sqlGenerator.addForUpdateClause(selectQuery);
    new RetryingExecutor(conf, () -> {
      prepareQuotes();
      Query query = pm.newQuery("javax.jdo.query.SQL", lockingQuery);
      query.setUnique(true);
      // only need to execute it to get db Lock
      query.execute();
      query.closeAll();
    }).run();
  }
}
{noformat}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
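One common way to avoid this kind of leak is to close the query in a finally block, so that it is released whether or not execute() throws. The sketch below only illustrates the pattern: `FakeQuery` is a hypothetical stand-in, not the real javax.jdo.Query, and this is not necessarily how the issue was fixed in Hive.

```java
// Hypothetical sketch of the leak and of a try/finally remedy.
final class QueryCloseSketch {
    /** Stand-in for a JDO query; an assumption made for illustration only. */
    static final class FakeQuery {
        boolean closed = false;
        private final boolean failOnExecute;

        FakeQuery(boolean failOnExecute) {
            this.failOnExecute = failOnExecute;
        }

        void execute() {
            if (failOnExecute) {
                throw new IllegalStateException("simulated backend error");
            }
        }

        void closeAll() {
            closed = true;
        }
    }

    // Mirrors the reported code: closeAll() is skipped when execute() throws.
    static void executeLeaky(FakeQuery query) {
        query.execute();
        query.closeAll();
    }

    // closeAll() runs on both the normal and the exceptional path.
    static void executeSafely(FakeQuery query) {
        try {
            query.execute();
        } finally {
            query.closeAll();
        }
    }

    // Runs a failing query and reports whether it ended up closed.
    static boolean closedAfterFailure(boolean safe) {
        FakeQuery query = new FakeQuery(true);
        try {
            if (safe) {
                executeSafely(query);
            } else {
                executeLeaky(query);
            }
        } catch (IllegalStateException expected) {
            // The error still propagates to the caller; only the cleanup differs.
        }
        return query.closed;
    }

    public static void main(String[] args) {
        System.out.println(closedAfterFailure(false)); // prints false
        System.out.println(closedAfterFailure(true));  // prints true
    }
}
```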
[jira] [Created] (HIVE-22345) Accidentally committed HIVE-21327 with wrong commit message
Marta Kuczora created HIVE-22345: Summary: Accidentally committed HIVE-21327 with wrong commit message Key: HIVE-22345 URL: https://issues.apache.org/jira/browse/HIVE-22345 Project: Hive Issue Type: Bug Reporter: Marta Kuczora Assignee: Marta Kuczora Fix For: 4.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22336) The updates should be pushed to the Metastore backend DB before creating the notification event
Marta Kuczora created HIVE-22336: Summary: The updates should be pushed to the Metastore backend DB before creating the notification event Key: HIVE-22336 URL: https://issues.apache.org/jira/browse/HIVE-22336 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 4.0.0 Reporter: Marta Kuczora
There was an issue on HDP-3.1 where a table couldn't be deleted, because some related objects (like the storage descriptor) were missing from the metastore. There was a previous delete attempt on that table which went wrong, but no rollback happened; that's why the SD was missing. In that previous delete, the notification creation swallowed the error which came from the backend DB; that's why no rollback happened. Here are the steps which happened in the first delete attempt:
# Open a transaction (transaction_1) - this step was successful
# Delete all the objects which are related to the table - this step was successful too, so the SD and other objects were deleted
# Delete the table - this step failed in the backend DB, but according to the log the delete happens in a batch statement, so it isn't necessarily executed right at this moment, and no error is seen here
# Create a notification about the table delete:
## Open another transaction for the notification creation (transaction_2) - this calls the ObjectStore.openTransaction method, which increases a counter for open transactions and then checks whether there is already an active transaction. If there is, it just returns true and doesn't really create a new transaction.
## Lock the notification id in the metastore backend DB for update - this is where the exception from the backend DB (let's call it "MySQL Exception") manifests
## If an exception occurs while acquiring the lock, retry - The "MySQL Exception" was caught, and since there is no check on the exception, the retry mechanism assumed that it happened because the lock for the notification id couldn't be acquired, so it retried and "forgot" about the "MySQL Exception".
## If the lock was acquired successfully, create the notification - The second time, the lock was acquired successfully, so the notification creation was successful.
## Commit transaction_2 - This just decreases the transaction counter, but doesn't actually commit anything.
# Commit transaction_1 - This commits the transaction, but since the error had already manifested and been "handled", no error is seen here, only that the commit was successful, so no rollback happens and the table object is left in an invalid state.
# If the commit was not successful, then roll back
In the customer setup, this issue could be fixed by adding a flush call before creating the notification event, so all the updates would be pushed to the backend DB and the error would manifest at this point. With this, the error would go back to the HiveMetaStore class, which would do the rollback, and the delete table operation would fail as it should, since the table couldn't be deleted. The HiveMetaStore retry mechanism could then try the table deletion again.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
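The sequence described in this issue can be simulated with a small self-contained sketch. Everything in it is hypothetical: `FakeStore`, `lockWithRetry` and `dropReportsSuccess` are illustrative names, and the batching is reduced to a list of deferred operations; it does not reproduce the ObjectStore code. It only shows why an explicit flush before the notification step makes the deferred error surface where the caller can roll back.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of deferred (batched) writes and a retry that swallows errors.
final class FlushBeforeNotificationSketch {
    /** Stand-in for a persistence layer that batches statements until flushed. */
    static final class FakeStore {
        private final List<Runnable> pending = new ArrayList<>();

        void enqueue(Runnable statement) {
            pending.add(statement);
        }

        // Sends the batch; a failing statement throws here, not at enqueue time.
        void flush() {
            List<Runnable> batch = new ArrayList<>(pending);
            pending.clear();
            batch.forEach(Runnable::run);
        }
    }

    // Simulated lock acquisition: running the lock statement also pushes out the
    // pending batch, and any exception is assumed to be a lock failure and retried.
    static void lockWithRetry(FakeStore store, int attempts) {
        for (int i = 0; i < attempts; i++) {
            try {
                store.flush();
                return;
            } catch (RuntimeException swallowed) {
                // No check on the exception type: the real cause is lost here.
            }
        }
        throw new IllegalStateException("could not acquire the lock");
    }

    // Returns true if the simulated table drop is reported as successful.
    static boolean dropReportsSuccess(boolean flushBeforeNotification) {
        FakeStore store = new FakeStore();
        store.enqueue(() -> {
            throw new IllegalStateException("simulated backend failure on table delete");
        });
        try {
            if (flushBeforeNotification) {
                store.flush(); // the deferred error surfaces here and propagates
            }
            lockWithRetry(store, 2);
            return true; // commit reports success
        } catch (RuntimeException e) {
            return false; // the caller would roll back
        }
    }

    public static void main(String[] args) {
        System.out.println(dropReportsSuccess(false)); // prints true
        System.out.println(dropReportsSuccess(true));  // prints false
    }
}
```

Without the flush, the failed delete is reported as a success because the retry loop consumed the error; with the flush, the failure reaches the caller before the notification is created.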
Re: Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types
> On Oct. 10, 2019, 9 a.m., Peter Vary wrote:
> > Thanks for chasing this down!
> > Really appreciate it!
Thanks a lot for the review!
> On Oct. 10, 2019, 9 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
> > Lines 157 (patched)
> > <https://reviews.apache.org/r/71606/diff/1/?file=216#file216line158>
> >
> > This is the best way to check this?
> > Is this always starts with char? CHAR? or anything else is not possible?
It always starts with "char", but you are right that it is not the best way to check it. I changed it to at least use the name of the CHAR serde constant.
> On Oct. 10, 2019, 9 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
> > Lines 181 (patched)
> > <https://reviews.apache.org/r/71606/diff/1/?file=216#file216line182>
> >
> > I do not like this.
> > Either we only aim for space, or we aim for whitespace characters, but the check and the replace is different.
You are right, thanks for pointing this out. Since the regex always replaces the whitespace at the end of the string, checking whether the string ends with a space is not even necessary. If it doesn't end with whitespace, the regex replace does nothing.
- Marta
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71606/#review218175 --- On Oct. 10, 2019, 11:39 a.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71606/ > --- > > (Updated Oct. 10, 2019, 11:39 a.m.) > > > Review request for hive and Peter Vary. > > > Bugs: HIVE-21407 > https://issues.apache.org/jira/browse/HIVE-21407 > > > Repository: hive-git > > > Description > --- > > The previous approach didn't solve all use cases. In this new approach the > hive type is sent to the Parquet PPD part and trim the value which is pushed > to the predicate in case of CHAR hive type. 
> > > Diffs > - > > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java > 5b051dd > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java > fc9188f > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java > 033e26a > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java > ca5e085 > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java > 0210a0a > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java > 7c7c657 > > ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java > 4c40908 > ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 > ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION > ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION > > > Diff: https://reviews.apache.org/r/71606/diff/2/ > > > Testing > --- > > Added new q test for testing the PPD for char and varchar types. Also > extended the unit tests for the > ParquetFilterPredicateConverter.toFilterPredicate method. > > > The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are > both testing the same thing, the behavior of the > ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make > sense to have tests for the same use case in different test classes, so moved > the test cases from the TestParquetRecordReaderWrapper to > TestParquetFilterPredicate. > > > Thanks, > > Marta Kuczora > >
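The point made in the reply about the trailing-whitespace check can be shown with a minimal sketch. The regex and the helper name are assumptions for illustration; the exact expression used in LeafFilterFactory is not shown in this thread.

```java
// Hypothetical sketch: trimming the trailing padding of a CHAR value with a regex.
final class CharTrimSketch {
    // Removes any run of whitespace at the end of the value. No prior
    // endsWith(" ") check is needed: if nothing matches, the input is unchanged.
    static String trimTrailingWhitespace(String value) {
        return value.replaceAll("\\s+$", "");
    }

    public static void main(String[] args) {
        System.out.println("[" + trimTrailingWhitespace("apple   ") + "]"); // prints [apple]
        System.out.println("[" + trimTrailingWhitespace("apple") + "]");    // prints [apple]
        System.out.println("[" + trimTrailingWhitespace("a b ") + "]");     // prints [a b]
    }
}
```

Using the same whitespace definition for both the decision and the replacement avoids the mismatch raised in the review, where the check looked for a space but the replace targeted all whitespace.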
Re: Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71606/ --- (Updated Oct. 10, 2019, 11:39 a.m.) Review request for hive and Peter Vary. Changes --- Fix the issues from the review. Bugs: HIVE-21407 https://issues.apache.org/jira/browse/HIVE-21407 Repository: hive-git Description --- The previous approach didn't solve all use cases. In this new approach, the Hive type is passed to the Parquet PPD code, and the value pushed to the predicate is trimmed when the Hive type is CHAR. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java 5b051dd ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java fc9188f ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 033e26a ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java ca5e085 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java 0210a0a ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java 7c7c657 ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 4c40908 ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION Diff: https://reviews.apache.org/r/71606/diff/2/ Changes: https://reviews.apache.org/r/71606/diff/1-2/ Testing --- Added new q test for testing the PPD for char and varchar types. Also extended the unit tests for the ParquetFilterPredicateConverter.toFilterPredicate method. The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are both testing the same thing, the behavior of the ParquetFilterPredicateConverter.toFilterPredicate method. 
It doesn't make sense to have tests for the same use case in different test classes, so moved the test cases from the TestParquetRecordReaderWrapper to TestParquetFilterPredicate. Thanks, Marta Kuczora
Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71606/ --- Review request for hive and Peter Vary. Bugs: HIVE-21407 https://issues.apache.org/jira/browse/HIVE-21407 Repository: hive-git Description --- The previous approach didn't solve all use cases. In this new approach, the Hive type is passed to the Parquet PPD code, and the value pushed to the predicate is trimmed when the Hive type is CHAR. Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java 5b051dd ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java fc9188f ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 033e26a ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java ca5e085 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java 0210a0a ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java 7c7c657 ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 4c40908 ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION Diff: https://reviews.apache.org/r/71606/diff/1/ Testing --- Added new q test for testing the PPD for char and varchar types. Also extended the unit tests for the ParquetFilterPredicateConverter.toFilterPredicate method. The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are both testing the same thing, the behavior of the ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make sense to have tests for the same use case in different test classes, so moved the test cases from the TestParquetRecordReaderWrapper to TestParquetFilterPredicate. Thanks, Marta Kuczora
Review Request 71558: HIVE-21987: Hive is unable to read Parquet int32 annotated with decimal
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71558/ --- Review request for hive and Peter Vary. Bugs: HIVE-21987 https://issues.apache.org/jira/browse/HIVE-21987 Repository: hive-git Description --- Added support to read INT32 Parquet decimals. Diffs - data/files/parquet_int_decimal_1.parquet PRE-CREATION data/files/parquet_int_decimal_2.parquet PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 350ae2d ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java 320ce52 ql/src/test/queries/clientpositive/parquet_int_decimal.q PRE-CREATION ql/src/test/results/clientpositive/parquet_int_decimal.q.out PRE-CREATION ql/src/test/results/clientpositive/type_change_test_fraction.q.out 07cf8fa Diff: https://reviews.apache.org/r/71558/diff/1/ Testing --- Added new q tests for the use-case. Thanks, Marta Kuczora
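For context, a Parquet INT32 column annotated as DECIMAL stores the unscaled integer value, and the scale comes from the column's type annotation. The sketch below shows only that arithmetic; the class and method names are hypothetical, and it is not the converter code from the patch.

```java
import java.math.BigDecimal;

// Hypothetical sketch: interpreting an INT32 unscaled value as a decimal.
final class Int32DecimalSketch {
    // value = unscaled * 10^(-scale); the scale is taken from the column metadata.
    static BigDecimal toDecimal(int unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        System.out.println(toDecimal(12345, 2)); // prints 123.45
        System.out.println(toDecimal(-5, 3));    // prints -0.005
    }
}
```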
[jira] [Created] (HIVE-22271) Create index on the TBL_COL_PRIVS table for the columns COLUMN_NAME, PRINCIPAL_NAME, PRINCIPAL_TYPE and TBL_ID
Marta Kuczora created HIVE-22271: Summary: Create index on the TBL_COL_PRIVS table for the columns COLUMN_NAME, PRINCIPAL_NAME, PRINCIPAL_TYPE and TBL_ID Key: HIVE-22271 URL: https://issues.apache.org/jira/browse/HIVE-22271 Project: Hive Issue Type: Bug Components: Metastore Reporter: Marta Kuczora
In one of the escalations for HDP-3.1.0 we found that the table privilege checks could be very slow, and that these checks could be sped up by defining an INDEX on the TBL_COL_PRIVS table for the following columns: COLUMN_NAME, PRINCIPAL_NAME, PRINCIPAL_TYPE, TBL_ID
In the MySQL slow query log, we found that the following query is executed slowly:
{noformat}
SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTableColumnPrivilege' AS `NUCLEUS_TYPE`,
  `A0`.`AUTHORIZER`,`A0`.`COLUMN_NAME`,`A0`.`CREATE_TIME`,`A0`.`GRANT_OPTION`,`A0`.`GRANTOR`,
  `A0`.`GRANTOR_TYPE`,`A0`.`PRINCIPAL_NAME`,`A0`.`PRINCIPAL_TYPE`,`A0`.`TBL_COL_PRIV`,`A0`.`TBL_COLUMN_GRANT_ID`
FROM `TBL_COL_PRIVS` `A0`
LEFT OUTER JOIN `TBLS` `B0` ON `A0`.`TBL_ID` = `B0`.`TBL_ID`
LEFT OUTER JOIN `DBS` `C0` ON `B0`.`DB_ID` = `C0`.`DB_ID`
WHERE `A0`.`PRINCIPAL_NAME` = 'xxx'
  AND `A0`.`PRINCIPAL_TYPE` = 'GROUP'
  AND `B0`.`TBL_NAME` = ''
  AND `C0`.`NAME` = 'xxx'
  AND `C0`.`CTLG_NAME` = 'xxx'
  AND `A0`.`COLUMN_NAME` = 'xxx'
{noformat}
When we checked the explain plan of this query, we could see that the index defined on the TBL_COL_PRIVS table is not used. The slow query uses the COLUMN_NAME, PRINCIPAL_NAME, PRINCIPAL_TYPE and TBL_ID columns, and after creating an index on only these columns, we saw a significant performance improvement.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Review Request 71271: HIVE-21580: Introduce ISO 8601 week numbering SQL:2016 formats
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71271/#review217349 --- Fix it, then Ship it! Thanks a lot for the patch. I have only one comment, otherwise it looks good. common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java Lines 1025 (patched) <https://reviews.apache.org/r/71271/#comment304639> As we discussed, it would be enough to update the variable in an if statement when the temporalField is "IsoFields.WEEK_BASED_YEAR". In that case, there is no need for the updateVar method, which is a bit confusing. - Marta Kuczora On Aug. 12, 2019, 10:59 a.m., Karen Coppage wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71271/ > --- > > (Updated Aug. 12, 2019, 10:59 a.m.) > > > Review request for hive and Marta Kuczora. > > > Bugs: HIVE-21580 > https://issues.apache.org/jira/browse/HIVE-21580 > > > Repository: hive-git > > > Description > --- > > Enable Hive to parse the following datetime formats when any > combination/subset of these or previously implemented patterns is provided in > one string. Also catch combinations that conflict. > > IYYY > IYY > IY > I > IW > > > Diffs > - > > > common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java > 9443e8ec78 > > common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java > ff41534fce > > > Diff: https://reviews.apache.org/r/71271/diff/1/ > > > Testing > --- > > > Thanks, > > Karen Coppage > >
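As background for the IYYY and IW patterns discussed here, the ISO 8601 week-based year can differ from the calendar year around New Year. The sketch below uses the standard `java.time.temporal.IsoFields` fields to show this; the class and helper names are hypothetical and it is unrelated to the HiveSqlDateTimeFormatter implementation.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

// Hypothetical sketch: ISO 8601 week-based year and week number of a date.
final class IsoWeekSketch {
    static int isoYear(LocalDate date) {
        return date.get(IsoFields.WEEK_BASED_YEAR);
    }

    static int isoWeek(LocalDate date) {
        return date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }

    public static void main(String[] args) {
        // Monday 2019-12-30 already belongs to ISO week 1 of 2020.
        LocalDate lateDecember = LocalDate.of(2019, 12, 30);
        System.out.println(isoYear(lateDecember) + "-W" + isoWeek(lateDecember)); // prints 2020-W1
        // Friday 2021-01-01 still belongs to ISO week 53 of 2020.
        LocalDate earlyJanuary = LocalDate.of(2021, 1, 1);
        System.out.println(isoYear(earlyJanuary) + "-W" + isoWeek(earlyJanuary)); // prints 2020-W53
    }
}
```

This is why a week-based pattern cannot simply be combined with the calendar year: for dates like these the two years disagree.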
Re: Review Request 71016: HIVE-21578: Introduce SQL:2016 formats FM, FX, and nested strings
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71016/#review216889 --- Ship it! Ship It! - Marta Kuczora On July 26, 2019, 10:01 a.m., Karen Coppage wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71016/ > --- > > (Updated July 26, 2019, 10:01 a.m.) > > > Review request for hive and Marta Kuczora. > > > Bugs: HIVE-21578 > https://issues.apache.org/jira/browse/HIVE-21578 > > > Repository: hive-git > > > Description > --- > > Enable Hive to parse the following datetime formats when any combination or > subset of these or previously implemented formats is provided in one string. > > "text" (nested strings) > FM > FX > > > Diffs > - > > > common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java > 998e5a2f6a > > common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java > ac57842148 > ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q > 5a2a6d7894 > ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out > e1fd341050 > > > Diff: https://reviews.apache.org/r/71016/diff/3/ > > > Testing > --- > > > Thanks, > > Karen Coppage > >
Re: Review Request 71016: HIVE-21578: Introduce SQL:2016 formats FM, FX, and nested strings
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71016/#review216885 --- Thanks a lot for the patch! I had two comments, but otherwise it looks good. Nice testing btw!! :) - Marta Kuczora On July 5, 2019, 7:51 a.m., Karen Coppage wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71016/ > --- > > (Updated July 5, 2019, 7:51 a.m.) > > > Review request for hive and Marta Kuczora. > > > Bugs: HIVE-21578 > https://issues.apache.org/jira/browse/HIVE-21578 > > > Repository: hive-git > > > Description > --- > > Enable Hive to parse the following datetime formats when any combination or > subset of these or previously implemented formats is provided in one string. > > "text" (nested strings) > FM > FX > > > Diffs > - > > > common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java > 998e5a2f6a > > common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java > 4e822d53f9 > ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q > 5a2a6d7894 > ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out > e1fd341050 > > > Diff: https://reviews.apache.org/r/71016/diff/2/ > > > Testing > --- > > > Thanks, > > Karen Coppage > >
Re: Review Request 71016: HIVE-21578: Introduce SQL:2016 formats FM, FX, and nested strings
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71016/#review216884 --- common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java Lines 546 (patched) <https://reviews.apache.org/r/71016/#comment304127> I think it would be better to split this method into two: one for checking only fm and one for checking only fx. Returning a boolean and setting another one in the background can be a bit confusing for the caller. common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java Lines 252 (patched) <https://reviews.apache.org/r/71016/#comment304128> I think this test could be split up to have separate tests for the fm, fx and fm-fx cases. It is just a nit, but I think it is a good idea to focus on one use-case per test. - Marta Kuczora On July 5, 2019, 7:51 a.m., Karen Coppage wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71016/ > --- > > (Updated July 5, 2019, 7:51 a.m.) > > > Review request for hive and Marta Kuczora. > > > Bugs: HIVE-21578 > https://issues.apache.org/jira/browse/HIVE-21578 > > > Repository: hive-git > > > Description > --- > > Enable Hive to parse the following datetime formats when any combination or > subset of these or previously implemented formats is provided in one string. > > "text" (nested strings) > FM > FX > > > Diffs > - > > > common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java > 998e5a2f6a > > common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java > 4e822d53f9 > ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q > 5a2a6d7894 > ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out > e1fd341050 > > > Diff: https://reviews.apache.org/r/71016/diff/2/ > > > Testing > --- > > > Thanks, > > Karen Coppage > >
Re: Review Request 71011: HIVE:21957: Create temporary table like should omit transactional properties.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71011/#review216850 --- Ship it! Ship It! - Marta Kuczora On July 4, 2019, 1:52 p.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71011/ > --- > > (Updated July 4, 2019, 1:52 p.m.) > > > Review request for hive, Marta Kuczora and Thejas Nair. > > > Repository: hive-git > > > Description > --- > > HIVE:21957: Create temporary table like should omit transactional properties. > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > e09fc379f5e0127367e73ed4c4556522de9838a8 > > > Diff: https://reviews.apache.org/r/71011/diff/1/ > > > Testing > --- > > > Thanks, > > Laszlo Pinter > >
Re: Review Request 71011: HIVE:21957: Create temporary table like should omit transactional properties.
> On July 18, 2019, noon, Marta Kuczora wrote: > > Thanks a lot for the patch! > > Just one question: could you add a test about the fixed use-case? > > Laszlo Pinter wrote: > I will add test in a separate patch. Ok, thanks! - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71011/#review216720 --- On July 4, 2019, 1:52 p.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71011/ > --- > > (Updated July 4, 2019, 1:52 p.m.) > > > Review request for hive, Marta Kuczora and Thejas Nair. > > > Repository: hive-git > > > Description > --- > > HIVE:21957: Create temporary table like should omit transactional properties. > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > e09fc379f5e0127367e73ed4c4556522de9838a8 > > > Diff: https://reviews.apache.org/r/71011/diff/1/ > > > Testing > --- > > > Thanks, > > Laszlo Pinter > >
Re: Review Request 71011: HIVE:21957: Create temporary table like should omit transactional properties.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/71011/#review216720 --- Thanks a lot for the patch! Just one question: could you add a test about the fixed use-case? - Marta Kuczora On July 4, 2019, 1:52 p.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/71011/ > --- > > (Updated July 4, 2019, 1:52 p.m.) > > > Review request for hive, Marta Kuczora and Thejas Nair. > > > Repository: hive-git > > > Description > --- > > HIVE:21957: Create temporary table like should omit transactional properties. > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > e09fc379f5e0127367e73ed4c4556522de9838a8 > > > Diff: https://reviews.apache.org/r/71011/diff/1/ > > > Testing > --- > > > Thanks, > > Laszlo Pinter > >
Re: Review Request 70920: HIVE-21868: Vectorize CAST...FORMAT
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70920/#review216444 --- Ship it! Ship It! - Marta Kuczora On July 4, 2019, 3:04 p.m., Karen Coppage wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70920/ > --- > > (Updated July 4, 2019, 3:04 p.m.) > > > Review request for hive and Marta Kuczora. > > > Bugs: HIVE-21868 > https://issues.apache.org/jira/browse/HIVE-21868 > > > Repository: hive-git > > > Description > --- > > Vectorize UDFs for CAST ( AS STRING/CHAR/VARCHAR FORMAT > ) and CAST ( AS TIMESTAMP/DATE FORMAT ). > > > Diffs > - > > > common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java > 4e024a357b > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java > fa9d1e9783 > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToCharWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java > dfa9f8a00d > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToStringWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToVarCharWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDate.java > a6dff12e1a > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDateWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestamp.java > b48b0136eb > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestampWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToCharWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToString.java > adc3a9d7b9 > > 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToStringWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToVarCharWithFormat.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java > 16742eee9b > > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorMathFunctions.java > 663237739e > > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java > 58fd7b030e > > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCastsWithFormat.java > PRE-CREATION > ql/src/test/queries/clientnegative/udf_cast_format_bad_pattern.q > PRE-CREATION > ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q > 269edf6da6 > ql/src/test/results/clientnegative/udf_cast_format_bad_pattern.q.out > PRE-CREATION > ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out > 4a502b9700 > > > Diff: https://reviews.apache.org/r/70920/diff/5/ > > > Testing > --- > > > Thanks, > > Karen Coppage > >
Re: Review Request 70963: HIVE-21874: Implement add partitions related methods on temporary table
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70963/#review216337 --- Ship it! Ship It! - Marta Kuczora On July 1, 2019, 9:20 a.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70963/ > --- > > (Updated July 1, 2019, 9:20 a.m.) > > > Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita. > > > Repository: hive-git > > > Description > --- > > HIVE-21874: Implement add partitions related methods on temporary table > > > Diffs > - > > > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java > 957ebb12725e9deac7e7644709521a998df4dbb4 > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.java > PRE-CREATION > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsTempTable.java > PRE-CREATION > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java > a15f5ea0453c7459217d229fa373cc1fec2f4d7a > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java > 25643495b53e1ede473c48a90b208b43070ee6aa > > > Diff: https://reviews.apache.org/r/70963/diff/2/ > > > Testing > --- > > Unit testing is done via > TestSessionHiveMetastoreClientAddPartitionsTempTable, > TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable. > > > Thanks, > > Laszlo Pinter > >
Re: Review Request 70963: HIVE-21874: Implement add partitions related methods on temporary table
> On June 28, 2019, 2:42 p.m., Marta Kuczora wrote: > > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java > > Line 1046 (original), 1049-1050 (patched) > > <https://reviews.apache.org/r/70963/diff/1/?file=2152472#file2152472line1049> > > > > Why do you need to make the DB and Table name lower case? > > Laszlo Pinter wrote: > Partition properties like table and db name must be stored in lower case. > This is the same in HiveMetaStore as well. > Other properties are case sensitive. Ah, I see, thanks for the explanation. - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70963/#review216227 --- On July 1, 2019, 9:20 a.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70963/ > --- > > (Updated July 1, 2019, 9:20 a.m.) > > > Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita. > > > Repository: hive-git > > > Description > --- > > HIVE-21874: Implement add partitions related methods on temporary table > > > Diffs > - > > > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java > 957ebb12725e9deac7e7644709521a998df4dbb4 > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.java > PRE-CREATION > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsTempTable.java > PRE-CREATION > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java > a15f5ea0453c7459217d229fa373cc1fec2f4d7a > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java > 25643495b53e1ede473c48a90b208b43070ee6aa > > > Diff: https://reviews.apache.org/r/70963/diff/2/ > > > Testing > --- > > Unit testing is done via > TestSessionHiveMetastoreClientAddPartitionsTempTable, > 
TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable. > > > Thanks, > > Laszlo Pinter > >
Re: Review Request 70920: HIVE-21868: Vectorize CAST...FORMAT
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70920/#review216334 --- Thanks a lot Karen for the patch! I have some questions, but otherwise the change looks good to me. common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java Line 223 (original), 224 (patched) <https://reviews.apache.org/r/70920/#comment303500> Why did you change the type of this variable to ArrayList from List? ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java Lines 59 (patched) <https://reviews.apache.org/r/70920/#comment303501> Do the CastDateToString, CastDateToChar and CastDateToVarchar udfs use this method, or is this just a typo and the CastDateToStringWithFormat, ... udfs use this? ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java Line 200 (original), 202 (patched) <https://reviews.apache.org/r/70920/#comment303502> Is the formattedOutput variable never going to be null after this change? If there is a scenario where it can be null, it will cause problems when trying to cast it. ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java Line 217 (original), 220 (patched) <https://reviews.apache.org/r/70920/#comment303503> The same question about being null (previous comment) applies to the t and d variable as well. - Marta Kuczora On June 26, 2019, 8:44 a.m., Karen Coppage wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70920/ > --- > > (Updated June 26, 2019, 8:44 a.m.) > > > Review request for hive and Marta Kuczora. > > > Bugs: HIVE-21868 > https://issues.apache.org/jira/browse/HIVE-21868 > > > Repository: hive-git > > > Description > --- > > Vectorize UDFs for CAST ( AS STRING/CHAR/VARCHAR FORMAT > ) and CAST ( AS TIMESTAMP/DATE FORMAT ). 
> > > Diffs > - > > > common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java > 4e024a357b > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java > fa9d1e9783 > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToCharWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java > dfa9f8a00d > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToStringWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToVarCharWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDate.java > a6dff12e1a > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDateWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestamp.java > b48b0136eb > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestampWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToCharWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToString.java > adc3a9d7b9 > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToStringWithFormat.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToVarCharWithFormat.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java > 16742eee9b > > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorMathFunctions.java > 663237739e > > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java > 58fd7b030e > > ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCastsWithFormat.java > PRE-CREATION > 
ql/src/test/queries/clientnegative/udf_cast_format_bad_pattern.q > PRE-CREATION > ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q > 269edf6da6 > ql/src/test/results/clientnegative/udf_cast_format_bad_pattern.q.out > PRE-CREATION > ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out > 4a502b9700 > > > Diff: https://reviews.apache.org/r/70920/diff/3/ > > > Testing > --- > > > Thanks, > > Karen Coppage > >
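The null-related questions in the review above (whether `formattedOutput`, `t` and `d` can be null before they are converted) can be illustrated with a small guard. This is a sketch under stated assumptions, not the code of `GenericUDFCastFormat`: the class `NullSafeConvert`, the method `truncate` and the idea of truncating to a maximum length are all hypothetical stand-ins for "convert the formatted value to the target type".

```java
/** Hypothetical sketch; names do not come from GenericUDFCastFormat. */
final class NullSafeConvert {
  private NullSafeConvert() {}

  /**
   * Truncates a formatted value to maxLength characters (a stand-in for a
   * char/varchar-style conversion). Returns null for null input instead of
   * dereferencing it, so the caller can pass SQL NULL through unchanged.
   * Assumes maxLength is non-negative.
   */
  static String truncate(String formattedOutput, int maxLength) {
    if (formattedOutput == null) {
      return null;
    }
    return formattedOutput.length() <= maxLength
        ? formattedOutput
        : formattedOutput.substring(0, maxLength);
  }
}
```

If the surrounding code can guarantee a non-null value, a guard like this is unnecessary; the point of the review question is to confirm which of the two situations applies.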
Re: Review Request 70963: HIVE-21874: Implement add partitions related methods on temporary table
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70963/#review216227 --- ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java Line 1046 (original), 1049-1050 (patched) <https://reviews.apache.org/r/70963/#comment303353> Why do you need to make the DB and Table name lower case? ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java Lines 1100 (patched) <https://reviews.apache.org/r/70963/#comment303354> Why is it needed to get the newly added partition from the "parts" list as the addPartition method returns the newly added Partition? - Marta Kuczora On June 27, 2019, 9:07 a.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70963/ > --- > > (Updated June 27, 2019, 9:07 a.m.) > > > Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita. > > > Repository: hive-git > > > Description > --- > > HIVE-21874: Implement add partitions related methods on temporary table > > > Diffs > - > > > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java > 957ebb12725e9deac7e7644709521a998df4dbb4 > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.java > PRE-CREATION > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsTempTable.java > PRE-CREATION > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java > a15f5ea0453c7459217d229fa373cc1fec2f4d7a > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java > 25643495b53e1ede473c48a90b208b43070ee6aa > > > Diff: https://reviews.apache.org/r/70963/diff/1/ > > > Testing > --- > > Unit testing is done via > TestSessionHiveMetastoreClientAddPartitionsTempTable, > 
TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable. > > > Thanks, > > Laszlo Pinter > >
Re: Review Request 70963: HIVE-21874: Implement add partitions related methods on temporary table
> On June 28, 2019, 2:42 p.m., Marta Kuczora wrote: > > Thanks a lot for the patch! - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70963/#review216227 --- On June 27, 2019, 9:07 a.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70963/ > --- > > (Updated June 27, 2019, 9:07 a.m.) > > > Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita. > > > Repository: hive-git > > > Description > --- > > HIVE-21874: Implement add partitions related methods on temporary table > > > Diffs > - > > > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java > 957ebb12725e9deac7e7644709521a998df4dbb4 > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.java > PRE-CREATION > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsTempTable.java > PRE-CREATION > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java > a15f5ea0453c7459217d229fa373cc1fec2f4d7a > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java > 25643495b53e1ede473c48a90b208b43070ee6aa > > > Diff: https://reviews.apache.org/r/70963/diff/1/ > > > Testing > --- > > Unit testing is done via > TestSessionHiveMetastoreClientAddPartitionsTempTable, > TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable. > > > Thanks, > > Laszlo Pinter > >
Re: Review Request 70934: HIVE-18735: Create table like loses transactional attribute.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70934/#review216150 --- Ship it! Thanks a lot for the patch. It looks good to me. - Marta Kuczora On June 25, 2019, 12:47 p.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70934/ > --- > > (Updated June 25, 2019, 12:47 p.m.) > > > Review request for hive, Eugene Koifman, Marta Kuczora, Peter Vary, and Adam > Szita. > > > Repository: hive-git > > > Description > --- > > HIVE-18735: Create table like loses transactional attribute. > > > Diffs > - > > hbase-handler/src/test/results/positive/hbase_queries.q.out > 0c21d6d74882788d5748639ea2675579893791af > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > d395db1b59d021789b1bb47c7f09ff337cba2dd0 > ql/src/test/results/clientpositive/alter_rename_table.q.out > dd656954a1877f7f808de81f6952d7cf8ebfda2f > ql/src/test/results/clientpositive/alter_table_stats_status.q.out > efa2834e0d6dbd77181473c214b77d09fcc1fe69 > ql/src/test/results/clientpositive/autoColumnStats_1.q.out > 1f594ddb6816805d22a1152c261dda75490cd5d0 > ql/src/test/results/clientpositive/autoColumnStats_2.q.out > 121a10384bca03942c297dd0488aceaf0d3bed68 > ql/src/test/results/clientpositive/autoColumnStats_3.q.out > 777d165dc26fb11a6fd863fe1f375c6ae3d55b2a > ql/src/test/results/clientpositive/autoColumnStats_8.q.out > 0e1868bd52d717a6103f1456a1d4e525e85d8622 > ql/src/test/results/clientpositive/create_alter_list_bucketing_table1.q.out > 593ae8389971449ad0f8704d911f6f7c6bcc > ql/src/test/results/clientpositive/create_like.q.out > f4a5ed55a568b0160a6c87cb2fe8c7cd9b20c7c8 > ql/src/test/results/clientpositive/create_like2.q.out > 7152f52fcf82d5052a67be6e27bda532f2b521bd > ql/src/test/results/clientpositive/create_like_tbl_props.q.out > 4d11fc3c9e39c18dd18fdb585ad1831a0a068768 > ql/src/test/results/clientpositive/create_table_like_stats.q.out > 
4aa1b4f167a99ffc97d97bb62e0f5313fd83314e > ql/src/test/results/clientpositive/describe_table.q.out > 8c7a16c4b65d3f3951e6c230c42325056a7eab0b > ql/src/test/results/clientpositive/erasurecoding/erasure_simple.q.out > 3ceb3d03c2614f3256a822c9f105ed6e9f2bada8 > ql/src/test/results/clientpositive/explain_ddl.q.out > c53ffae8003bdcc320d4910f021c821c0777bdeb > ql/src/test/results/clientpositive/llap/autoColumnStats_1.q.out > 7272a9c925a4115ee3f1d3a4e6576057d75ac994 > ql/src/test/results/clientpositive/llap/autoColumnStats_2.q.out > 1a4b164b0925860543dd74215e0820fe84c5f3f1 > > ql/src/test/results/clientpositive/llap/insert_values_orig_table_use_metadata.q.out > 6c892cc5b87960b086d90c43516526056bdf221f > ql/src/test/results/clientpositive/llap/stats_noscan_1.q.out > af55d23484ddb74a2c5b7f06c4e91a6063ae11dc > ql/src/test/results/clientpositive/llap/whroot_external1.q.out > cac158c92669f1ad532ada3d6620adebeb909eae > ql/src/test/results/clientpositive/load_dyn_part8.q.out > 7b1b5c1f862a581af3b2c4cabe21b6d186601652 > ql/src/test/results/clientpositive/merge3.q.out > 4e670558808894b0dd5f7b8815987e03de1dc6d3 > ql/src/test/results/clientpositive/mm_default.q.out > 70519b7da8346ddc2de74e46010183d2c9ab11ee > ql/src/test/results/clientpositive/partition_discovery.q.out > cddb6e56ba8db9162c491125e3efd3acd2ed29b2 > ql/src/test/results/clientpositive/spark/load_dyn_part8.q.out > aebf4382cd78b02d9b7bab7285254431f04e29c0 > ql/src/test/results/clientpositive/spark/stats12.q.out > 9db43ef112d0898c08429c839e196b3e48067383 > ql/src/test/results/clientpositive/spark/stats13.q.out > 4922d717a0074146d6da91aae859f09aa5a2b623 > ql/src/test/results/clientpositive/spark/stats14.q.out > eb8a995e298d77098c5d7a01086943dc08307c19 > ql/src/test/results/clientpositive/spark/stats15.q.out > 3874e6de249428404946f461eec3575d6dcb50a5 > ql/src/test/results/clientpositive/spark/stats2.q.out > 30339caeb2cff5cc96101d8cbf5f3ed8b5b01667 > ql/src/test/results/clientpositive/spark/stats6.q.out > 
77be16cb13558e6b2af2e772ff0505ea4dba8125 > ql/src/test/results/clientpositive/spark/stats7.q.out > fe942ad94b35288b0cc74d1434429378835ce9c2 > ql/src/test/results/clientpositive/spark/stats8.q.out > edfbd57f72b55d040233328330562703286627d3 > ql/src/test/results/clientpositive/spark/stats9.q.out > ed226b68d21733c7d371472d99f714b759e380e2 > ql/src/test/results/cl
Re: Review Request 70867: HIVE-21814: Implement list partitions related methods on temporary tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70867/#review215995 --- Ship it! Ship It! - Marta Kuczora On June 20, 2019, 9:53 a.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70867/ > --- > > (Updated June 20, 2019, 9:53 a.m.) > > > Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita. > > > Repository: hive-git > > > Description > --- > > HIVE-21814: Implement list partitions related methods on temporary tables > > > Diffs > - > > > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java > b71ef5a725d610cda402717f501f6c6a0f653216 > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientListPartitionsTempTable.java > PRE-CREATION > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/ConditionalIgnoreOnSessionHiveMetastoreClient.java > 99039b08d014cddc9de12e70801267eba7331266 > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java > 34ceb34de646cc2e501564e9b3a0cb8cc8a034e1 > > > Diff: https://reviews.apache.org/r/70867/diff/2/ > > > Testing > --- > > Unit testing is done via > TestSessionHiveMetastoreClientListPartitionsTempTable. > > > Thanks, > > Laszlo Pinter > >
Re: Review Request 70841: HIVE-21576: Introduce CAST...FORMAT and limited list of SQL:2016 datetime formats
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70841/#review215966 --- Ship it! Ship It! - Marta Kuczora On June 14, 2019, 8:30 a.m., Karen Coppage wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70841/ > --- > > (Updated June 14, 2019, 8:30 a.m.) > > > Review request for hive, Gabor Kaszab and Marta Kuczora. > > > Bugs: HIVE-21576 > https://issues.apache.org/jira/browse/HIVE-21576 > > > Repository: hive-git > > > Description > --- > > Timestamp and date handling and formatting are currently implemented in Hive > using (sometimes very specific) Java SimpleDateFormat patterns with both > SimpleDateFormat and java.time.DateTimeFormatter, however, these patterns are > not what most standard SQL systems use. For example see Vertica, Netezza, > Oracle, and PostgreSQL. > > **Cast...Format** > > SQL:2016 introduced the FORMAT clause for CAST which is the standard way to > do string <-> datetime conversions > > For example: > > CAST( AS [FORMAT ]) > CAST( AS [FORMAT ]) > cast(dt as string format 'DD-MM-') > cast('01-05-2017' as date format 'DD-MM-') > Stuff like this wouldn't need to happen. > > **New SQL:2016 Patterns** > > Some conflicting examples: > > SimpleDateTime: 'MMM dd, HH:mm:ss' > SQL:2016: 'mon dd, hh24:mi:ss' > > SimpleDateTime: '-MM-dd HH:mm:ss' > SQL:2016: '-mm-dd hh24:mi:ss' > > For the full list of patterns, see subsection "Proposal for Impala’s datetime > patterns" in this doc: > https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit > > **Continued usage of SimpleDateFormat patterns** > > [Update] This feature will NOT be behind a flag in order to keep things > simple for users. Existing Hive functions that accept SimpleDateFormat > patterns as input will continue to do so. Please let me know if you disagree > with this decision. 
These are the functions (afaik) affected: > > from_unixtime(bigint unixtime[, string format]) > unix_timestamp(string date, string pattern) > to_unix_timestamp(date[, pattern]) > add_months(string start_date, int num_months, output_date_format) > date_format(date/timestamp/string ts, string fmt) > This description is a heavily edited description of IMPALA-4018. > > > Diffs > - > > > common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java > PRE-CREATION > > common/src/java/org/apache/hadoop/hive/common/format/datetime/package-info.java > PRE-CREATION > > common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java > PRE-CREATION > > common/src/test/org/apache/hadoop/hive/common/format/datetime/package-info.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d08b05fb68 > ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 58fe0cd32e > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java > PRE-CREATION > > ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFCastFormat.java > PRE-CREATION > ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q > PRE-CREATION > ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out > PRE-CREATION > ql/src/test/results/clientpositive/show_functions.q.out 374e9c4fce > > > Diff: https://reviews.apache.org/r/70841/diff/8/ > > > Testing > --- > > > Thanks, > > Karen Coppage > >
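The "conflicting examples" quoted in the description above can be made concrete with a toy translator that rewrites the SQL:2016-style tokens from those examples into their SimpleDateFormat-style counterparts. This is only an illustration of how the two notations differ: it is not how `HiveSqlDateTimeFormatter` works, the class `PatternSketch` and method `toJavaPattern` are hypothetical, and the token table contains only the tokens that appear in the two examples quoted in the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration; not the parsing logic of HiveSqlDateTimeFormatter. */
final class PatternSketch {
  private PatternSketch() {}

  // Insertion order matters: longer tokens ("hh24", "mon") are tried first,
  // and "mi" (minutes) is tried before "mm" (month).
  private static final Map<String, String> TOKENS = new LinkedHashMap<>();
  static {
    TOKENS.put("hh24", "HH");
    TOKENS.put("mon", "MMM");
    TOKENS.put("dd", "dd");
    TOKENS.put("mi", "mm");
    TOKENS.put("mm", "MM");
    TOKENS.put("ss", "ss");
  }

  /** Rewrites known tokens case-insensitively; every other character is copied as-is. */
  static String toJavaPattern(String sqlPattern) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    while (i < sqlPattern.length()) {
      boolean matched = false;
      for (Map.Entry<String, String> e : TOKENS.entrySet()) {
        String token = e.getKey();
        if (sqlPattern.regionMatches(true, i, token, 0, token.length())) {
          out.append(e.getValue());
          i += token.length();
          matched = true;
          break;
        }
      }
      if (!matched) {
        // Separators such as '-', ':' and ',' end up here. A real
        // implementation would have to reject or quote unknown letters.
        out.append(sqlPattern.charAt(i));
        i++;
      }
    }
    return out.toString();
  }
}
```

The sketch shows why the same letters cannot simply be reused: `mm` means month in the SQL:2016 style but minutes in SimpleDateFormat, so a pattern written for one notation silently changes meaning in the other.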
Re: Review Request 70841: HIVE-21576: Introduce CAST...FORMAT and limited list of SQL:2016 datetime formats
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70841/#review215964 --- Thanks a lot for the patch. I have some minor hints/questions, but otherwise the change looks good to me. common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java Lines 631-651 (patched) <https://reviews.apache.org/r/70841/#comment302889> Is this method still needed? common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java Lines 756 (patched) <https://reviews.apache.org/r/70841/#comment302888> Does this issue still exist? common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java Lines 235 (patched) <https://reviews.apache.org/r/70841/#comment302890> You could add a message to the assertEquals to make it easier to identify which test case is failing. - Marta Kuczora On June 14, 2019, 8:30 a.m., Karen Coppage wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70841/ > --- > > (Updated June 14, 2019, 8:30 a.m.) > > > Review request for hive, Gabor Kaszab and Marta Kuczora. > > > Bugs: HIVE-21576 > https://issues.apache.org/jira/browse/HIVE-21576 > > > Repository: hive-git > > > Description > --- > > Timestamp and date handling and formatting are currently implemented in Hive > using (sometimes very specific) Java SimpleDateFormat patterns with both > SimpleDateFormat and java.time.DateTimeFormatter, however, these patterns are > not what most standard SQL systems use. For example see Vertica, Netezza, > Oracle, and PostgreSQL. > > **Cast...Format** > > SQL:2016 introduced the FORMAT clause for CAST which is the standard way to > do string <-> datetime conversions > > For example: > > CAST( AS [FORMAT ]) > CAST( AS [FORMAT ]) > cast(dt as string format 'DD-MM-') > cast('01-05-2017' as date format 'DD-MM-') > Stuff like this wouldn't need to happen. 
> > **New SQL:2016 Patterns** > > Some conflicting examples: > > SimpleDateTime: 'MMM dd, HH:mm:ss' > SQL:2016: 'mon dd, hh24:mi:ss' > > SimpleDateTime: '-MM-dd HH:mm:ss' > SQL:2016: '-mm-dd hh24:mi:ss' > > For the full list of patterns, see subsection "Proposal for Impala’s datetime > patterns" in this doc: > https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit > > **Continued usage of SimpleDateFormat patterns** > > [Update] This feature will NOT be behind a flag in order to keep things > simple for users. Existing Hive functions that accept SimpleDateFormat > patterns as input will continue to do so. Please let me know if you disagree > with this decision. These are the functions (afaik) affected: > > from_unixtime(bigint unixtime[, string format]) > unix_timestamp(string date, string pattern) > to_unix_timestamp(date[, pattern]) > add_months(string start_date, int num_months, output_date_format) > date_format(date/timestamp/string ts, string fmt) > This description is a heavily edited description of IMPALA-4018. 
> > > Diffs > - > > > common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java > PRE-CREATION > > common/src/java/org/apache/hadoop/hive/common/format/datetime/package-info.java > PRE-CREATION > > common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java > PRE-CREATION > > common/src/test/org/apache/hadoop/hive/common/format/datetime/package-info.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d08b05fb68 > ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 58fe0cd32e > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java > PRE-CREATION > > ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFCastFormat.java > PRE-CREATION > ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q > PRE-CREATION > ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out > PRE-CREATION > ql/src/test/results/clientpositive/show_functions.q.out 374e9c4fce > > > Diff: https://reviews.apache.org/r/70841/diff/8/ > > > Testing > --- > > > Thanks, > > Karen Coppage > >
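The last review comment above (adding a message to `assertEquals`) is easiest to see when one test method loops over several cases. The sketch below is self-contained and does not use JUnit: `MessageSketch.assertEquals` is a local stand-in that follows the same argument order as JUnit 4's `assertEquals(String message, Object expected, Object actual)`, and `upper` is a hypothetical function under test.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch of why a per-case assertion message helps. */
final class MessageSketch {
  private MessageSketch() {}

  /** Local stand-in mirroring the (message, expected, actual) argument order of JUnit 4. */
  static void assertEquals(String message, Object expected, Object actual) {
    if (!Objects.equals(expected, actual)) {
      throw new AssertionError(
          message + " expected:<" + expected + "> but was:<" + actual + ">");
    }
  }

  /** Hypothetical function under test. */
  static String upper(String input) {
    return input.toUpperCase(Locale.ROOT);
  }

  /** Checks every (input, expected) pair; the message names the failing input. */
  static void checkAll(Map<String, String> cases) {
    for (Map.Entry<String, String> c : cases.entrySet()) {
      assertEquals("input=" + c.getKey(), c.getValue(), upper(c.getKey()));
    }
  }
}
```

Without the message, a failure in the loop only reports the expected and actual values; with it, the failing input is named directly, which is the benefit the reviewer is pointing at.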
Re: Review Request 70867: HIVE-21814: Implement list partitions related methods on temporary tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70867/#review215963 --- Ship it! Thanks for the patch. I have two minor hints, otherwise the change looks good to me. Just please consider them before the new batch of partitioned temp table changes. - Marta Kuczora On June 17, 2019, 3:36 p.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70867/ > --- > > (Updated June 17, 2019, 3:36 p.m.) > > > Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita. > > > Repository: hive-git > > > Description > --- > > HIVE-21814: Implement list partitions related methods on temporary tables > > This change is the next step to support partitions on temporary tables. > HIVE-18739 and HIVE-20661 added partial support for partition columns on > temporary tables, but it was not complete and it was available only for > internal usage. This change addresses the missing functionality related to > listing partitions from temporary tables, although it still remains unexposed > until all the partition related functionalities (get, list, add, alter etc.) > are implemented. > > > Diffs > - > > > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java > b71ef5a725d610cda402717f501f6c6a0f653216 > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientListPartitionsTempTable.java > PRE-CREATION > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/ConditionalIgnoreOnSessionHiveMetastoreClient.java > 99039b08d014cddc9de12e70801267eba7331266 > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java > 34ceb34de646cc2e501564e9b3a0cb8cc8a034e1 > > > Diff: https://reviews.apache.org/r/70867/diff/1/ > > > Testing > --- > > Unit testing is done via > TestSessionHiveMetastoreClientListPartitionsTempTable. 
> > > Thanks, > > Laszlo Pinter > >
Re: Review Request 70867: HIVE-21814: Implement list partitions related methods on temporary tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70867/#review215960 --- ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java Lines 1250-1255 (patched) <https://reviews.apache.org/r/70867/#comment302886> This code piece is used in multiple methods. Maybe it would make sense to extract it to a separate method. But since you have some more patches to go around the temp table partition handling, it is ok if you consider fixing this in a next patch. ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientListPartitionsTempTable.java Lines 133 (patched) <https://reviews.apache.org/r/70867/#comment302887> Would it make sense to add test with low max parts number to see if the method returns the correct number of partitions? - Marta Kuczora On June 17, 2019, 3:36 p.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70867/ > --- > > (Updated June 17, 2019, 3:36 p.m.) > > > Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita. > > > Repository: hive-git > > > Description > --- > > HIVE-21814: Implement list partitions related methods on temporary tables > > This change is the next step to support partitions on temporary tables. > HIVE-18739 and HIVE-20661 added partial support for partition columns on > temporary tables, but it was not complete and it was available only for > internal usage. This change addresses the missing functionality related to > listing partitions from temporary tables, although is still remains unexposed > until all the partition related functionalities (get, list, add, alter etc.) > are implemented. 
> > > Diffs > - > > > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java > b71ef5a725d610cda402717f501f6c6a0f653216 > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientListPartitionsTempTable.java > PRE-CREATION > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/ConditionalIgnoreOnSessionHiveMetastoreClient.java > 99039b08d014cddc9de12e70801267eba7331266 > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java > 34ceb34de646cc2e501564e9b3a0cb8cc8a034e1 > > > Diff: https://reviews.apache.org/r/70867/diff/1/ > > > Testing > --- > > Unit testing is done via > TestSessionHiveMetastoreClientListPartitionsTempTable. > > > Thanks, > > Laszlo Pinter > >
Re: Review Request 70850: HIVE-21812: Implement get partition related methods on temporary tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70850/#review215869 --- Ship it! Ship It! - Marta Kuczora On June 13, 2019, 8:01 a.m., Laszlo Pinter wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70850/ > --- > > (Updated June 13, 2019, 8:01 a.m.) > > > Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita. > > > Repository: hive-git > > > Description > --- > > HIVE-21812: Implement get partition related methods on temporary tables > > HIVE-18739 and HIVE-20661 added partial support for partition columns on > temporary tables, but it was not complete and it was available only for > internal usage. This change addresses the missing functionality related to > getting partitions from temporary tables, although is still remains unexposed > until all the partition related functionalities (get, list, add, alter etc.) > are implemented. > > > Diffs > - > > > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java > 410868cacfe53e8898d4e08572d7a01e05b7eb49 > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientGetPartitionsTempTable.java > PRE-CREATION > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/ConditionalIgnoreOnSessionHiveMetastoreClient.java > PRE-CREATION > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/CustomIgnoreRule.java > PRE-CREATION > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/MetaStoreClientTest.java > dc48fa8308a07f68c5e21a2d95f40127d3ff41df > > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java > 4d7f7c12203a9a90568f4aae644ff5cabaafa18c > > > Diff: https://reviews.apache.org/r/70850/diff/1/ > > > Testing > --- > > Unit testing is done via 
TestSessionHiveMetastoreClientGetPartitionsTempTable. > > > Thanks, > > Laszlo Pinter > >
Re: Review Request 70474: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70474/ --- (Updated May 9, 2019, 7:51 a.m.) Review request for hive and Peter Vary. Changes --- Fixed the whitespace issue. Bugs: HIVE-21407 https://issues.apache.org/jira/browse/HIVE-21407 Repository: hive-git Description --- The idea behind the patch is to extend the predicate pushed to Parquet for CHAR columns with an “or” clause, so that it contains the same expression with both a padded and a stripped value. Example: column c is of type CHAR(10) and the search expression is c='apple'. The predicate pushed to Parquet looked like c='apple ' (padded) before the patch, and it looks like (c='apple ' or c='apple') after the patch. Since the value 'apple' is stored in Parquet without padding, the predicate before the patch didn’t return any rows. With the patch it returns the correct row. Since there is no distinction between CHAR and VARCHAR at the predicate level, the predicates for VARCHAR columns change as well, so the result set returned from Parquet will be wider than before. Example: a table contains a column c VARCHAR(10), with one row where c='apple' and another row where c='apple '. If the search expression is c='apple ', both rows will be returned from Parquet after the patch. But since Hive does additional filtering on the rows returned from Parquet, this is not a problem: the result set returned by Hive will contain only the row with the value 'apple '.
Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java be4c0d5 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java 0210a0a ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java d464046 ql/src/test/queries/clientpositive/parquet_ppd_char.q 4230d8c ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION Diff: https://reviews.apache.org/r/70474/diff/2/ Changes: https://reviews.apache.org/r/70474/diff/1-2/ Testing --- Added new q test for testing the PPD for char and varchar types. Also extended the unit tests for the ParquetFilterPredicateConverter.toFilterPredicate method. The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are both testing the same thing, the behavior of the ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make sense to have tests for the same use case in different test classes, so moved the test cases from the TestParquetRecordReaderWrapper to TestParquetFilterPredicate. Thanks, Marta Kuczora
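A minimal sketch of the "padded or stripped" idea from the description above, in plain Java. This is not the code in LeafFilterFactory; the class and method names are hypothetical, and it assumes the literal reaching the predicate is already padded for CHAR columns. It only shows why the widened predicate lets extra rows through and why a final exact filter (standing in for the filtering Hive applies afterwards) restores the correct result.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; not Hive's LeafFilterFactory.
final class CharPredicateSketch {

  // Remove trailing spaces only (leading spaces are significant).
  static String stripTrailingSpaces(String value) {
    int end = value.length();
    while (end > 0 && value.charAt(end - 1) == ' ') {
      end--;
    }
    return value.substring(0, end);
  }

  // The values of the widened predicate: (c = literal) or (c = stripped literal).
  static List<String> widenedValues(String literal) {
    String stripped = stripTrailingSpaces(literal);
    if (stripped.equals(literal)) {
      return Collections.singletonList(literal);
    }
    return Arrays.asList(literal, stripped);
  }

  // Rows the widened predicate would let through at the storage level.
  static List<String> pushdownMatches(List<String> stored, String literal) {
    List<String> accepted = widenedValues(literal);
    return stored.stream().filter(accepted::contains).collect(Collectors.toList());
  }

  // Exact comparison, standing in for the additional filtering done after the read.
  static List<String> exactMatches(List<String> candidates, String literal) {
    return candidates.stream().filter(literal::equals).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> stored = Arrays.asList("apple", "apple ");
    List<String> fromStorage = pushdownMatches(stored, "apple ");
    System.out.println(fromStorage.size());                        // both rows pass the pushdown
    System.out.println(exactMatches(fromStorage, "apple ").size()); // one row survives the exact filter
  }
}
```

The widening is deliberately one-directional: the pushed-down filter may return too many rows, never too few, so correctness rests on the later exact filter.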
[jira] [Created] (HIVE-21632) Hive should not push partition columns to the Parquet predicate, even if the data file contains a column with the same name as the partition column
Marta Kuczora created HIVE-21632: Summary: Hive should not push partition columns to the Parquet predicate, even if the data file contains a column with the same name as the partition column Key: HIVE-21632 URL: https://issues.apache.org/jira/browse/HIVE-21632 Project: Hive Issue Type: Bug Affects Versions: 4.0.0 Reporter: Marta Kuczora -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 70474: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70474/ --- Review request for hive and Peter Vary. Bugs: HIVE-21407 https://issues.apache.org/jira/browse/HIVE-21407 Repository: hive-git Description --- The idea behind the patch is to extend the predicate pushed to Parquet for CHAR columns with an “or” clause, so that it contains the same expression with both a padded and a stripped value. Example: column c is of type CHAR(10) and the search expression is c='apple'. The predicate pushed to Parquet looked like c='apple ' (padded) before the patch, and it looks like (c='apple ' or c='apple') after the patch. Since the value 'apple' is stored in Parquet without padding, the predicate before the patch didn’t return any rows. With the patch it returns the correct row. Since there is no distinction between CHAR and VARCHAR at the predicate level, the predicates for VARCHAR columns change as well, so the result set returned from Parquet will be wider than before. Example: a table contains a column c VARCHAR(10), with one row where c='apple' and another row where c='apple '. If the search expression is c='apple ', both rows will be returned from Parquet after the patch. But since Hive does additional filtering on the rows returned from Parquet, this is not a problem: the result set returned by Hive will contain only the row with the value 'apple '.
Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java be4c0d5 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java 0210a0a ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java d464046 ql/src/test/queries/clientpositive/parquet_ppd_char.q 4230d8c ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION Diff: https://reviews.apache.org/r/70474/diff/1/ Testing --- Added new q test for testing the PPD for char and varchar types. Also extended the unit tests for the ParquetFilterPredicateConverter.toFilterPredicate method. The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are both testing the same thing, the behavior of the ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make sense to have tests for the same use case in different test classes, so moved the test cases from the TestParquetRecordReaderWrapper to TestParquetFilterPredicate. Thanks, Marta Kuczora
[jira] [Created] (HIVE-21407) Parquet predicate pushdown is not working correctly for char column types
Marta Kuczora created HIVE-21407: Summary: Parquet predicate pushdown is not working correctly for char column types Key: HIVE-21407 URL: https://issues.apache.org/jira/browse/HIVE-21407 Project: Hive Issue Type: Bug Affects Versions: 4.0.0 Reporter: Marta Kuczora -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21327) Predicate is not pushed to Parquet if hive.parquet.timestamp.skip.conversion=true
Marta Kuczora created HIVE-21327: Summary: Predicate is not pushed to Parquet if hive.parquet.timestamp.skip.conversion=true Key: HIVE-21327 URL: https://issues.apache.org/jira/browse/HIVE-21327 Project: Hive Issue Type: Bug Affects Versions: 4.0.0 Reporter: Marta Kuczora Assignee: Marta Kuczora -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[DISCUSS] Consistent Timestamps across Hadoop
Hi Hive Community, I would like to share the following document on our "Consistent Timestamp types in Hadoop" plans for review. https://docs.google.com/document/d/1gNRww9mZJcHvUDCXklzjFEQGpefsuR_akCDfWsdE35Q/edit With this plan we would like to reach agreement on consistent timestamp behavior across Hive, Spark and Impala; to achieve this, we are sharing the document with all three communities. Please review and comment, any feedback is much appreciated! Regards, Marta
Re: [ANNOUNCE] New committer: Bharathkrishna Guruvayoor Murali
Congratulations Bharath! On Mon, Dec 3, 2018 at 8:45 AM Peter Vary wrote: > Congratulations! > > > On Dec 3, 2018, at 05:32, Sankar Hariappan > wrote: > > > > Congrats Bharath! > > > > Best regards > > Sankar > > > > > > > > > > > > > > > > > > > > On 03/12/18, 7:38 AM, "Vihang Karajgaonkar" > wrote: > > > >> Congratulations Bharath! > >> > >> On Sun, Dec 2, 2018 at 9:33 AM Sahil Takiar > wrote: > >> > >>> Congrats Bharath! > >>> > >>> On Sun, Dec 2, 2018 at 11:14 AM Andrew Sherman > >>> wrote: > >>> > Congratulations Bharath! > > On Sat, Dec 1, 2018 at 10:26 AM Ashutosh Chauhan < > hashut...@apache.org> > wrote: > > > Apache Hive's Project Management Committee (PMC) has invited > > Bharathkrishna > > Guruvayoor Murali to become a committer, and we are pleased to > announce > > that > > he has accepted. > > > > Bharath, welcome, thank you for your contributions, and we look > forward > > your > > further interactions with the community! > > > > Ashutosh Chauhan (on behalf of the Apache Hive PMC) > > > > >>> > >>> > >>> -- > >>> Sahil Takiar > >>> Software Engineer > >>> takiar.sa...@gmail.com | (510) 673-0309 > >>> > >
Re: Review Request 69432: HIVE-20964 Create a test that checks the level of the parallel compilation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69432/#review210821 --- Ship it! Ship It! - Marta Kuczora On Nov. 22, 2018, 3:19 p.m., Peter Vary wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/69432/ > --- > > (Updated Nov. 22, 2018, 3:19 p.m.) > > > Review request for hive, Denys Kuzmenko, Marta Kuczora, and Adam Szita. > > > Bugs: HIVE-20964 > https://issues.apache.org/jira/browse/HIVE-20964 > > > Repository: hive-git > > > Description > --- > > * Created 2 query types in the TestCompileLock mock driver. The original > SHORT_QUERY is finishing in 0.5s as before, but the new LONG_QUERY will > finish only after 5s. > * With using the new 5s query I have created a new test where the compile > quota is 4 and the parallel request number is 10. So the test expects that 6 > query will fail with timeout. > * Added a new verifyThatTimedOutCompileOpsCount method to validate the number > of the timed out queries. > * The other changes are just pushing down the query string so the > compileAndRespond method can decide which query to run. > > > Diffs > - > > ql/src/test/org/apache/hadoop/hive/ql/TestCompileLock.java 8dc05ff480 > > > Diff: https://reviews.apache.org/r/69432/diff/1/ > > > Testing > --- > > Run the new test, and all the old tests in TestCompileLock > > > Thanks, > > Peter Vary > >
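The arithmetic behind the new test (quota 4, 10 parallel requests, 6 expected timeouts) can be sketched with a plain semaphore. This is not the TestCompileLock code: the class name, the method, and the use of a latch to stand in for the "long query" are assumptions made for illustration. Each request that gets a slot keeps it until every request has finished its acquire attempt, so the slot holders never free capacity for the waiting requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not Hive's compile lock implementation.
final class CompileQuotaSketch {

  // Returns how many of the parallel requests failed to get a slot in time.
  static int countTimedOut(int quota, int requests, long timeoutMillis) {
    Semaphore slots = new Semaphore(quota);
    CountDownLatch attempted = new CountDownLatch(requests);
    ExecutorService pool = Executors.newFixedThreadPool(requests);
    try {
      List<Future<Boolean>> results = new ArrayList<>();
      for (int i = 0; i < requests; i++) {
        Callable<Boolean> request = () -> {
          boolean acquired = slots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
          attempted.countDown();
          if (acquired) {
            // "Long query": hold the slot until every request has tried to acquire.
            attempted.await();
            slots.release();
          }
          return acquired;
        };
        results.add(pool.submit(request));
      }
      int timedOut = 0;
      for (Future<Boolean> result : results) {
        if (!result.get()) {
          timedOut++;
        }
      }
      return timedOut;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for requests", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("A request failed", e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println(countTimedOut(4, 10, 200));
  }
}
```

Using a latch rather than a fixed sleep keeps the count independent of scheduling: exactly `quota` requests can hold a slot, and the rest can only time out.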
Re: [ANNOUNCE] New PMC Member : Peter Vary
Congratulations Peter! On Mon, Jul 30, 2018 at 7:53 PM Andrew Sherman wrote: > Congratulations Peter! > > On Sun, Jul 29, 2018 at 1:32 PM Vineet Garg wrote: > > > Congratulations Peter! > > > > > On Jul 26, 2018, at 11:25 AM, Ashutosh Chauhan > > wrote: > > > > > > On behalf of the Hive PMC I am delighted to announce Peter Vary is > > joining > > > Hive PMC. > > > Thanks Peter for all your contributions till now. Looking forward to > many > > > more. > > > > > > Welcome, Peter! > > > > > > Thanks, > > > Ashutosh > > > > >
Re: [ANNOUNCE] New PMC Member : Sahil Takiar
Congratulations Sahil! On Mon, Jul 30, 2018 at 9:44 AM Peter Vary wrote: > Congratulations Sahil! > > > On Jul 29, 2018, at 22:32, Vineet Garg wrote: > > > > Congratulations Sahil! > > > >> On Jul 26, 2018, at 11:28 AM, Ashutosh Chauhan > wrote: > >> > >> On behalf of the Hive PMC I am delighted to announce Sahil Takiar is > >> joining Hive PMC. > >> Thanks Sahil for all your contributions till now. Looking forward to > many > >> more. > >> > >> Welcome, Sahil! > >> > >> Thanks, > >> Ashutosh > > > >
Re: [ANNOUNCE] New PMC Member : Vihang Karajgaonkar
Congratulations Vihang! On Mon, Jul 30, 2018 at 9:44 AM Peter Vary wrote: > Congratulations Vihang! > > > On Jul 29, 2018, at 22:32, Vineet Garg wrote: > > > > Congratulations Vihang! > > > >> On Jul 26, 2018, at 11:27 AM, Ashutosh Chauhan > wrote: > >> > >> On behalf of the Hive PMC I am delighted to announce Vihang > Karajgaonkar > >> is joining Hive PMC. > >> Thanks Vihang for all your contributions till now. Looking forward to > many > >> more. > >> > >> Welcome, Vihang! > >> > >> Thanks, > >> Ashutosh > > > >
Re: [ANNOUNCE] New PMC Member : Vineet Garg
Congratulations Vineet! On Mon, Jul 30, 2018 at 9:45 AM Peter Vary wrote: > Congratulations Vineet! > > > On Jul 30, 2018, at 01:59, Ashutosh Chauhan > wrote: > > > > On behalf of the Hive PMC I am delighted to announce Vineet Garg is > joining > > Hive PMC. > > Thanks Vineet for all your contributions till now. Looking forward to > many > > more. > > > > Welcome, Vineet! > > > > Thanks, > > Ashutosh > >
Re: [ANNOUNCE] New committer: Slim Bouguerra
Congratulations Slim! On Mon, Jul 30, 2018 at 2:01 AM Ashutosh Chauhan wrote: > Apache Hive's Project Management Committee (PMC) has invited Slim Bouguerra > to become a committer, and we are pleased to announce that he has accepted. > > Slim, welcome, thank you for your contributions, and we look forward your > further interactions with the community! > > Ashutosh Chauhan (on behalf of the Apache Hive PMC) >
Re: New committer announcement : Marta Kuczora
Thank you all!! On Thu, Jun 21, 2018 at 9:03 AM Lefty Leverenz wrote: > Congratulations Marta! > > -- Lefty > > > On Thu, Jun 21, 2018 at 1:46 AM Prasanth Jayachandran < > pjayachand...@hortonworks.com> wrote: > > > Congratulations! > > > > Thanks > > Prasanth > > > > > > > > On Wed, Jun 20, 2018 at 10:44 PM -0700, "Vihang Karajgaonkar" > > mailto:vih...@cloudera.com.INVALID>> wrote: > > > > > > Congrats Marta! > > > > On Wed, Jun 20, 2018 at 8:46 PM, Zoltan Haindrich wrote: > > > > > Congratulations Márta! > > > > > > On 20 June 2018 22:20:30 CEST, Deepak Jaiswal > > > wrote: > > > >Congratulations Marta. > > > > > > > >On 6/20/18, 12:06 PM, "Ashutosh Chauhan" wrote: > > > > > > > >Apache Hive's Project Management Committee (PMC) has invited Marta > > > >Kuczora > > > >to become a committer, and we are pleased to announce that he has > > > >accepted. > > > > > > > >Marta, welcome, thank you for your contributions, and we look forward > > > >your > > > >further interactions with the community! > > > > > > > >Ashutosh Chauhan (on behalf of the Apache Hive PMC) > > > > > > > > > > > >
Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7/ --- (Updated June 25, 2018, 12:01 p.m.) Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita. Changes --- Rebased the patch Bugs: HIVE-19046 https://issues.apache.org/jira/browse/HIVE-19046 Repository: hive-git Description --- The biggest part of these methods use the same code. Refactored these code parts to common methods. Diffs (updated) - standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java e9d7e7c standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java bf559b4 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java 4f11a55 Diff: https://reviews.apache.org/r/7/diff/7/ Changes: https://reviews.apache.org/r/7/diff/6-7/ Testing --- Thanks, Marta Kuczora
Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods
> On June 6, 2018, 10:34 p.m., Alexander Kolbasov wrote: > > Looks good, a few nits below. Thanks for looking into this review. I fixed/answered the issues. Please let me know if the patch looks ok, then I will upload it to the Jira to run the pre-commit tests. > On June 6, 2018, 10:34 p.m., Alexander Kolbasov wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > > Line 3227 (original), 3322 (patched) > > <https://reviews.apache.org/r/7/diff/2/?file=2012314#file2012314line3325> > > > > Is it possible to do it once in constructor instead? I suspect that > > this is a non-trivial operation. To be honest, I don't see clearly whether it would be worth moving this part to the constructor. I am not sure what side effects it would have. In HIVE-15137, where this part was added to the code, the problem was that if two HiveCli instances were started with different users and both users added a partition, the owner of the partition directories was always the first user. Would moving this code to the constructor not affect this use case? Would it work correctly? I think this should be investigated. I am just not sure of the benefit of moving this code. The current user is fetched only once when creating a batch of partitions, and I don't see this as a very expensive call. If we want to move this, I would suggest investigating and doing it in a separate Jira. What do you think? > On June 6, 2018, 10:34 p.m., Alexander Kolbasov wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > > Lines 3253 (patched) > > <https://reviews.apache.org/r/7/diff/5/?file=2034474#file2034474line3253> > > > > Can you clarify that "clean up" means removing associated directory. I fixed it accordingly.
> On June 6, 2018, 10:34 p.m., Alexander Kolbasov wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > > Lines 3268 (patched) > > <https://reviews.apache.org/r/7/diff/5/?file=2034474#file2034474line3268> > > > > Please add a Javadoc here explaining what is checked by validation. > > Also it isn't obvious that validation has side effects (updating partsToAdd) Added Javadoc. > On June 6, 2018, 10:34 p.m., Alexander Kolbasov wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > > Line 3247 (original), 3343 (patched) > > <https://reviews.apache.org/r/7/diff/5/?file=2034474#file2034474line3346> > > > > addedPartitions is not defined here so it isn't obvious that it should > > be thread-safe. Is it possible to allocate and return addedPartitions here > > so that you guarantee using of thread-safe map? > > > > Another way you can do it is to collect added partitions in thread-safe > > local map and then copy it to the resulting map once you are done with > > concurrent part. The createPartitionFolders method is called with a ConcurrentHashMap, so I thought that would do the trick. Returning the addedPartitions map would be complicated, as we have to return the newParts list as well. So I fixed this issue by introducing a local map and then copying the result to the addedPartitions map. - Marta ------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7/#review203099 --- On June 11, 2018, 11:27 a.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7/ > --- > > (Updated June 11, 2018, 11:27 a.m.) > > > Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita. > > > Bugs: HIVE-19046 > https://issues.apache.org/jira/browse/HIVE-19046 > > > Repository: hive-git > > > Description > --- > > The biggest part of these methods use the same code.
Refactored these code > parts to common methods. > > > Diffs > - > > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > b9f5fb8 > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java > bf559b4 > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java > 4f11a55 > > > Diff: https://reviews.apache.org/r/7/diff/6/ > > > Testing > --- > > > Thanks, > > Marta Kuczora > >
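The "thread-safe local map, then copy" approach discussed in this reply can be sketched as follows. This is only an illustration under stated assumptions, not the HiveMetaStore code: the class, the method signature, and the use of partition names as plain strings are all hypothetical. The point is that the method no longer depends on the caller passing in a thread-safe map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the HiveMetaStore implementation.
final class LocalMapSketch {

  // Records each "created" partition in a local concurrent map, then copies
  // the result into the caller's map after the concurrent part has finished.
  static void createFolders(List<String> partitions, Map<String, Boolean> addedPartitions) {
    Map<String, Boolean> local = new ConcurrentHashMap<>();
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (String partition : partitions) {
        // Block lambda, so this is submitted as a Runnable.
        futures.add(pool.submit(() -> {
          local.put(partition, Boolean.TRUE);
        }));
      }
      for (Future<?> future : futures) {
        future.get();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while creating folders", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Folder creation failed", e.getCause());
    } finally {
      pool.shutdown();
    }
    // Single-threaded from here on, so the caller's map may be a plain HashMap.
    addedPartitions.putAll(local);
  }

  public static void main(String[] args) {
    Map<String, Boolean> added = new HashMap<>();
    List<String> partitions = new ArrayList<>();
    partitions.add("p=1");
    partitions.add("p=2");
    createFolders(partitions, added);
    System.out.println(added.size());
  }
}
```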
Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7/ --- (Updated June 11, 2018, 11:27 a.m.) Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita. Changes --- Address review findings. Bugs: HIVE-19046 https://issues.apache.org/jira/browse/HIVE-19046 Repository: hive-git Description --- The biggest part of these methods use the same code. Refactored these code parts to common methods. Diffs (updated) - standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java b9f5fb8 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java bf559b4 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java 4f11a55 Diff: https://reviews.apache.org/r/7/diff/6/ Changes: https://reviews.apache.org/r/7/diff/5-6/ Testing --- Thanks, Marta Kuczora
Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7/ --- (Updated June 1, 2018, 12:31 p.m.) Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita. Bugs: HIVE-19046 https://issues.apache.org/jira/browse/HIVE-19046 Repository: hive-git Description --- The biggest part of these methods use the same code. Refactored these code parts to common methods. Diffs (updated) - standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java d8b8414 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java 88064d9 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java debcd0e Diff: https://reviews.apache.org/r/7/diff/5/ Changes: https://reviews.apache.org/r/7/diff/4-5/ Testing --- Thanks, Marta Kuczora
Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7/ --- (Updated May 29, 2018, 4:24 p.m.) Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita. Changes --- Rebased the patch. Bugs: HIVE-19046 https://issues.apache.org/jira/browse/HIVE-19046 Repository: hive-git Description --- The biggest part of these methods use the same code. Refactored these code parts to common methods. Diffs (updated) - standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java c1d25db standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java 88064d9 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java debcd0e Diff: https://reviews.apache.org/r/7/diff/4/ Changes: https://reviews.apache.org/r/7/diff/3-4/ Testing --- Thanks, Marta Kuczora
Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods
> On April 18, 2018, 9:52 p.m., Alexander Kolbasov wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > > Line 3323 (original), 3396 (patched) > > <https://reviews.apache.org/r/7/diff/1/?file=2004741#file2004741line3399> > > > > Should we set interrupted flag on the thread if we get > > InterruptedException? > > Marta Kuczora wrote: > Could you please give me some details about why you think it is needed? I > don't know actually if it is needed or not. My idea here was to go through on > all FutureTasks and if one of them didn't finish successfully (there was > either an error or the task was interrupted), throw an exception, cause it > would mean that not all partition folders were created successfully. For this > I don't think that I should set anything on the thread, but I might miss > something. So could you please explain me your thoughts on this? I just uploaded a new patch with this change. - Marta --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7/#review201465 ------- On May 23, 2018, 4:24 p.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7/ > --- > > (Updated May 23, 2018, 4:24 p.m.) > > > Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita. > > > Bugs: HIVE-19046 > https://issues.apache.org/jira/browse/HIVE-19046 > > > Repository: hive-git > > > Description > --- > > The biggest part of these methods use the same code. Refactored these code > parts to common methods. 
> > > Diffs > - > > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > 92d2e3f > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java > 88064d9 > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java > debcd0e > > > Diff: https://reviews.apache.org/r/7/diff/3/ > > > Testing > --- > > > Thanks, > > Marta Kuczora > >
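For readers following the interrupted-flag question in this thread, here is a minimal sketch of the general Java idiom, assuming a list of futures that each create one partition folder. The class and method names are hypothetical and this is not the patched HiveMetaStore code. Catching InterruptedException clears the thread's interrupted status, so the handler sets it again before failing; code further up the call stack can then still observe the interruption.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the HiveMetaStore implementation.
final class InterruptSketch {

  // Waits for every task and fails if any task failed or the wait was interrupted.
  static void waitForAll(List<Future<?>> tasks) {
    for (Future<?> task : tasks) {
      try {
        task.get();
      } catch (InterruptedException e) {
        // Restore the flag that was cleared when the exception was thrown.
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while creating partition folders", e);
      } catch (ExecutionException e) {
        throw new IllegalStateException("Failed to create a partition folder", e.getCause());
      }
    }
  }
}
```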
Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7/ --- (Updated May 23, 2018, 4:24 p.m.) Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita. Changes --- Address review finding. Bugs: HIVE-19046 https://issues.apache.org/jira/browse/HIVE-19046 Repository: hive-git Description --- The biggest part of these methods use the same code. Refactored these code parts to common methods. Diffs (updated) - standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 92d2e3f standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java 88064d9 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java debcd0e Diff: https://reviews.apache.org/r/7/diff/3/ Changes: https://reviews.apache.org/r/7/diff/2-3/ Testing --- Thanks, Marta Kuczora
[jira] [Created] (HIVE-19656) Upgrade Hive to PARQUET 1.10.0
Marta Kuczora created HIVE-19656: Summary: Upgrade Hive to PARQUET 1.10.0 Key: HIVE-19656 URL: https://issues.apache.org/jira/browse/HIVE-19656 Project: Hive Issue Type: Improvement Affects Versions: 3.1.0 Reporter: Marta Kuczora Assignee: Marta Kuczora In the future, the new Parquet logical types for the timestamp type should be introduced to Hive. The implementation of these logical types is planned to be released in the next Parquet version. Before that, we should upgrade to Parquet 1.10.0, which is already released. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 66935: HIVE-18977: Listing partitions returns different results with JDO and direct SQL
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66935/ --- Review request for hive, Alan Gates and Peter Vary. Bugs: HIVE-18977 https://issues.apache.org/jira/browse/HIVE-18977 Repository: hive-git Description --- Some of the test cases in TestListPartitions fail when directSQL is disabled. Diffs - standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 4601e09 standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 6645e55 standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java d608e50 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java a8b6e31 Diff: https://reviews.apache.org/r/66935/diff/1/ Testing --- Thanks, Marta Kuczora
Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7/ --- (Updated April 26, 2018, 11:18 a.m.) Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita. Changes --- Fixed review findings. Bugs: HIVE-19046 https://issues.apache.org/jira/browse/HIVE-19046 Repository: hive-git Description --- Most of the code in these two methods is identical. The shared parts were refactored into common methods. Diffs (updated) - standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 397a081 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java 88064d9 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java debcd0e Diff: https://reviews.apache.org/r/7/diff/2/ Changes: https://reviews.apache.org/r/7/diff/1-2/ Testing --- Thanks, Marta Kuczora
Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods
rrupted flag on the thread if we get > > InterruptedException? Could you please give me some details about why you think it is needed? I don't actually know whether it is needed or not. My idea here was to go through all FutureTasks and, if any of them didn't finish successfully (there was either an error or the task was interrupted), throw an exception, because that would mean that not all partition folders were created successfully. For this I don't think that I should set anything on the thread, but I might be missing something. So could you please explain your thoughts on this to me? > On April 18, 2018, 9:52 p.m., Alexander Kolbasov wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > > Line 3500 (original), 3525 (patched) > > <https://reviews.apache.org/r/7/diff/1/?file=2004741#file2004741line3576> > > > > Style nit: validPartition doesn't add any value here, why not just > > > > if (validatePartition(part, catName, tblName, dbName, > > partsToAdd, ms, ifNotExists)) {... } Fixed it. > On April 18, 2018, 9:52 p.m., Alexander Kolbasov wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > > Line 3595 (original), 3548 (patched) > > <https://reviews.apache.org/r/7/diff/1/?file=2004741#file2004741line3671> > > > > Note that cleanupPartitionFolders() here may throw an exception, thus > > preventing other cleanup. I guess you mean the same issue here as in your previous comment: "Here we are trying to nuke a bunch of values. If a single one fails, we do not attempt to delete others. Since you are just doing refactoring it is out of scope but I think the proper behavior is to continue nuking for others as well." I would close this issue and continue the discussion under the other comment, just to have it in one place. If you meant something else, please feel free to reopen this issue. - Marta --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/7/#review201465 --- On April 17, 2018, 1:37 p.m., Marta Kuczora wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7/ > --- > > (Updated April 17, 2018, 1:37 p.m.) > > > Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita. > > > Bugs: HIVE-19046 > https://issues.apache.org/jira/browse/HIVE-19046 > > > Repository: hive-git > > > Description > --- > > The biggest part of these methods use the same code. Refactored these code > parts to common methods. > > > Diffs > - > > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > ae9ec5c > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java > f8497c7 > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java > fc0c60f > > > Diff: https://reviews.apache.org/r/7/diff/1/ > > > Testing > --- > > > Thanks, > > Marta Kuczora > >
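For illustration only, the approach discussed in this thread (wait on every task, count an error or an interruption as a failure, and throw if anything failed) could be sketched in plain Java roughly as below. This is not the HiveMetaStore code: the class PartitionFolderTasks and the methods countFailures and awaitAll are hypothetical names, and the tasks only simulate folder creation. The sketch also restores the thread's interrupted status after the loop, which is one common way to handle InterruptedException; whether the real code needs that is exactly the open question in the review.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; names are illustrative and not taken from Hive.
public class PartitionFolderTasks {

  // Waits for every task and returns how many did not finish successfully.
  static int countFailures(List<Future<Boolean>> futures) {
    int failures = 0;
    boolean interrupted = false;
    for (Future<Boolean> future : futures) {
      try {
        if (!Boolean.TRUE.equals(future.get())) {
          failures++; // the task ran but reported failure
        }
      } catch (ExecutionException e) {
        failures++; // the task threw an exception
      } catch (InterruptedException e) {
        // The outcome of this task is unknown: cancel it, count it as failed,
        // and remember the interrupt so it can be restored after the loop.
        interrupted = true;
        future.cancel(true);
        failures++;
      }
    }
    if (interrupted) {
      // Catching InterruptedException clears the flag; set it again so
      // callers further up the stack can still observe the interrupt.
      Thread.currentThread().interrupt();
    }
    return failures;
  }

  // Throws if any task failed, i.e. not all folders were created.
  static void awaitAll(List<Future<Boolean>> futures) {
    int failures = countFailures(futures);
    if (failures > 0) {
      throw new IllegalStateException(
          failures + " of " + futures.size() + " tasks did not finish successfully");
    }
  }

  public static void main(String[] args) {
    Callable<Boolean> ok = () -> true;
    Callable<Boolean> failing = () -> {
      throw new IllegalStateException("simulated folder creation failure");
    };
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      List<Future<Boolean>> futures = new ArrayList<>();
      futures.add(pool.submit(ok));
      futures.add(pool.submit(failing));
      futures.add(pool.submit(ok));
      awaitAll(futures);
    } catch (IllegalStateException e) {
      System.out.println("Caught: " + e.getMessage());
    } finally {
      pool.shutdown();
    }
  }
}
```

Because the loop keeps going after a failure, every task is inspected before the exception is thrown; the same "keep going, report at the end" pattern would also apply to the cleanup concern raised in the review.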
Re: Review Request 66774: HIVE-19285: Add logs to the subclasses of MetaDataOperation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66774/ --- (Updated April 26, 2018, 8:16 a.m.) Review request for hive and Peter Vary. Changes --- Fixed stylecheck issues. Bugs: HIVE-19285 https://issues.apache.org/jira/browse/HIVE-19285 Repository: hive-git Description --- Subclasses of MetaDataOperation are not writing anything to the logs. It would be useful to have some INFO and DEBUG level logging in these classes. Diffs (updated) - service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java 7944467 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java d67ea90 service/src/java/org/apache/hive/service/cli/operation/GetCrossReferenceOperation.java 99ccd4e service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java 091bf50 service/src/java/org/apache/hive/service/cli/operation/GetPrimaryKeysOperation.java e603fdd service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java de09ec9 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 59cfbb2 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java c9233d0 service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java ac078b4 service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java bf7c021 Diff: https://reviews.apache.org/r/66774/diff/3/ Changes: https://reviews.apache.org/r/66774/diff/2-3/ Testing --- Just adding some additional log messages. Tested locally by checking the log messages for different use cases Thanks, Marta Kuczora
Re: Review Request 66774: HIVE-19285: Add logs to the subclasses of MetaDataOperation
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66774/ --- (Updated April 24, 2018, 6:35 p.m.) Review request for hive and Peter Vary. Changes --- Fixed review findings. Bugs: HIVE-19285 https://issues.apache.org/jira/browse/HIVE-19285 Repository: hive-git Description --- Subclasses of MetaDataOperation are not writing anything to the logs. It would be useful to have some INFO and DEBUG level logging in these classes. Diffs (updated) - service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java 7944467 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java d67ea90 service/src/java/org/apache/hive/service/cli/operation/GetCrossReferenceOperation.java 99ccd4e service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java 091bf50 service/src/java/org/apache/hive/service/cli/operation/GetPrimaryKeysOperation.java e603fdd service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java de09ec9 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 59cfbb2 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java c9233d0 service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java ac078b4 service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java bf7c021 Diff: https://reviews.apache.org/r/66774/diff/2/ Changes: https://reviews.apache.org/r/66774/diff/1-2/ Testing --- Just adding some additional log messages. Tested locally by checking the log messages for different use cases Thanks, Marta Kuczora