Hive CI / github user change
Hey,

For a long time hive-ci was adding comments etc. under my username on every PR; I've worked with infra (INFRA-24854) to get a PAT for one of their own users.

I don't see any issues arising from this change, but wanted to let you know :)

cheers,
Zoltan
Re: How to use `engine` introduced by HIVE-22046
Hi, Okumin

I have encountered this issue before, and 'validWriteIdList' is also an incompatible parameter. I have submitted a PR in the trino-hive-apache repo; you can refer to https://github.com/trinodb/trino-hive-apache/pull/43 .

IIUC, the 'engine' parameter is used to differentiate between stats produced by different engines (Hive), but it seems that the downstream engines do not want to adopt the new 'engine' parameter. At present, if an engine (e.g. Trino) uses a customized Thrift API to interact with HMS, it must change its Thrift file to match the Thrift definition of HMS.

BTW, maybe we can change the HMS Thrift file to make the 'engine' parameter optional, so that other customized Thrift clients will not have compatibility issues.

Thanks,
Butao Zhang
[RESULT][VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)
Thanks to everyone who has tested the release candidate and given their comments and votes.

The tally is as follows.

3 binding +1s:
Stamatis Zampetakis
Denys Kuzmenko
Ayush Saxena

3 non-binding +1s:
Zhihua Deng
Sourabh Badhya
Simhadri Govindappa

No 0s or -1s.

Therefore, I am delighted to announce that the proposal to release Apache Hive 4.0.0-beta-1 has passed.

I will proceed with the next steps of the release and I will send an announcement once the release becomes publicly available (sometime next week).

Best,
Stamatis
How to use `engine` introduced by HIVE-22046
Hi Hive developers,

I noticed that HIVE-22046 introduced an incompatibility in the Metastore APIs while I was testing integration between Hive 4 and other software. If I understand correctly, clients are currently required to additionally specify the engine name when they get or update column statistics.

- https://issues.apache.org/jira/browse/HIVE-22046
- https://github.com/apache/hive/pull/741

For example, Trino has a feature to use column stats and it fails. Note that I am not 100% sure about Trino's implementation or behavior.

```
trino> create table hive.default.test_trino (id int);
Query 20230810_152236_4_t9n6h failed: Required field 'engine' is unset! Struct:TableStatsRequest(dbName:default, tblName:test_trino, colNames:[id], engine:null)
```

I have two questions about this feature.

(1) Should every engine use a unique engine name?

I guess some software can store or use stats compatible with Hive. I wonder if it can reuse engine=hive in that case, or should use a different name like engine=trino. I see Impala gives a unique engine name to metastore. Taking a glance, Spark is unlikely to be using col stats of Hive directly.

- https://issues.apache.org/jira/browse/IMPALA-8842

(2) Should Hive Metastore use engine=hive as a default value?

If other compatible software can reuse engine=hive, it could be an option to accept requests with the old format, assuming their engine is "hive" for compatibility. Or should they explicitly specify engine=hive when using Hive 4?

Regards,
Okumin
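One way to picture option (2) is a compatibility shim that fills in engine=hive when a request arrives without an engine. The sketch below is only an illustration in Python: `TableStatsRequest` is a hypothetical stand-in that mirrors the field names shown in the error message above, and `DEFAULT_ENGINE` and `with_default_engine` are made-up names. None of this is the actual Hive Metastore code or API.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

# Assumption for illustration: old-format requests are treated as coming from Hive.
DEFAULT_ENGINE = "hive"


@dataclass(frozen=True)
class TableStatsRequest:
    """Hypothetical stand-in for the Thrift struct named in the error message."""
    db_name: str
    tbl_name: str
    col_names: Tuple[str, ...]
    engine: Optional[str] = None  # unset for clients built against the old API


def with_default_engine(req: TableStatsRequest,
                        default: str = DEFAULT_ENGINE) -> TableStatsRequest:
    """Return the request unchanged if it names an engine, else a copy using the default."""
    if req.engine:
        return req
    return replace(req, engine=default)
```

With such a shim, a request like `TableStatsRequest("default", "test_trino", ("id",))` would be handled as if it had asked for engine "hive", while a client that sets its own engine name would keep it. Whether reusing Hive's stats is actually safe for a given engine is exactly question (1).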
Re: [VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)
+1 (binding)

* Built from source
* Verified checksums
* Verified signatures
* Verified no code diff between the git tag & source tar
* Checked the NOTICE & LICENSE files.
* Skimmed over HS2 UI
* Deployed with Derby and tried some operations on ACID & Iceberg tables.

Thanks Stamatis for driving the release, Good Luck!!!

-Ayush
Re: [VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)
+1

* Verified signatures and checksum;
* Checked binary content and successfully built from the source;
* Skimmed through the release notes;
* Initialized backend DB schema and launched HMS & HS2 locally;
* Conducted basic checks via beeline:
  - Created a few ACID & Iceberg tables and loaded data into them;
  - Executed Select/Insert/Update/Delete/Merge/IOW queries.

Thanks, Stamatis, for driving the release.

Regards,
Denys
Re: [VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)
Hi Everyone,

Thanks, Stamatis, for driving the release.

+1 (non-binding)

Verified the following:
* Download the source tarball, signature (.asc), and checksum (.sha512): OK
* Import gpg keys: download KEYS and run gpg --import /path/to/downloaded/KEYS.txt -> Verify the signature by running: gpg --verify ./apache-hive-4.0.0-beta-1-bin.tar.gz.asc ./apache-hive-4.0.0-beta-1-bin.tar.gz : OK
* Validated checksum and signature for the artifacts : OK
* Build from source successfully : OK
* Init meta scripts against MySQL : OK
* Successful standalone metastore setup with MySQL : OK
* Bring up HiveServer2 and Metastore, run some simple Hive queries and Iceberg queries using Tez : OK

Thanks!
Simhadri G

On Thu, Aug 10, 2023 at 12:30 PM Sourabh Badhya wrote:

> I was able to do the following with RC0 artifacts -
> * Verified checksum of sources and binaries.
> * Successfully built from source.
> * Successful metastore DB setup with Postgres.
> * Brought up HS2 and HMS successfully and ran CREATE, INSERT, SELECT, DROP
>   queries for external tables and CREATE, INSERT, SELECT, DELETE, UPDATE,
>   DROP queries for transactional and Iceberg tables using Tez.
>
> +1 (non-binding)
>
> Thanks Stamatis for driving the release.
>
> Regards,
> Sourabh Badhya
>
> On Wed, Aug 9, 2023 at 10:53 AM dengzhhu653 wrote:
>
> > +1 (non-binding), and thanks for driving the release.
> > * Verified signature/checksum of sources and binaries;
> > * Good rat check and source files;
> > * Build from source successfully;
> > * Inited meta scripts against Postgres;
> > * Bring up HiveServer2 and Metastore, run some simple queries using Tez: ok
> > Thanks,
> > Zhihua
> >
> > At 2023-08-07 21:55:30, "Stamatis Zampetakis" wrote:
> > >Hi all,
> > >
> > >I have created a build for Apache Hive 4.0.0-beta-1 Release Candidate 0.
> > >
> > >Thanks to everyone who has contributed to this release.
> > >
> > >You can read the release notes here:
> > >https://github.com/apache/hive/blob/branch-4.0.0-beta-1/RELEASE_NOTES.txt
> > >
> > >The commit to be voted upon:
> > >https://github.com/apache/hive/commit/d2310944e412b577a39687c7968b2e93eede8433
> > >
> > >Its hash is
> > >d2310944e412b577a39687c7968b2e93eede8433
> > >
> > >Tag:
> > >https://github.com/apache/hive/tree/release-4.0.0-beta-1-rc0
> > >
> > >The artifacts to be voted on are located here:
> > >https://people.apache.org/~zabetak/apache-hive-4.0.0-beta-1-rc0/
> > >
> > >The hashes of the artifacts are as follows:
> > >- 4114d8e9a523562c77237a8751dec9ed1bcbf6ccbe2e178d72f356ca4e65d466
> > >apache-hive-4.0.0-beta-1-bin.tar.gz
> > >- 8d157f4dcb9af5e48e51206a4046d1c11414fbc39583c84be31d609606136209
> > >apache-hive-4.0.0-beta-1-src.tar.gz
> > >
> > >A staged Maven repository is available for review at:
> > >https://repository.apache.org/content/repositories/orgapachehive-1119
> > >
> > >Release artifacts are signed with the following key:
> > >https://people.apache.org/keys/committer/zabetak.asc
> > >https://downloads.apache.org/hive/KEYS
> > >
> > >Please vote on releasing this package as Apache Hive 4.0.0-beta-1.
> > >
> > >The vote is open for the next 72 hours and passes if a majority of at
> > >least three +1 PMC votes are cast.
> > >
> > >[ ] +1 Release this package as Apache Hive 4.0.0-beta-1
> > >[ ] 0 I don't feel strongly about it, but I'm okay with the release
> > >[ ] -1 Do not release this package because...
> > >
> > >Here is my vote:
> > >+1 (binding)
> > >
> > >Best,
> > >Stamatis
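The checksum step that several voters mention can be sketched as follows. The artifact hashes listed in the vote email are 64 hex characters long, which matches SHA-256, so this sketch assumes SHA-256. The function names are hypothetical, and this is only one possible way to do the check, not the procedure any voter actually used.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare a file's digest with a published hex hash, ignoring case and whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

For example, after downloading the source tarball into the current directory, `matches_published_hash("apache-hive-4.0.0-beta-1-src.tar.gz", "8d157f4dcb9af5e48e51206a4046d1c11414fbc39583c84be31d609606136209")` should return True if the file is intact. A matching hash only shows the download is not corrupted; the signature check with gpg is still needed to confirm who produced the artifact.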
Re: [VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)
I was able to do the following with RC0 artifacts -
* Verified checksum of sources and binaries.
* Successfully built from source.
* Successful metastore DB setup with Postgres.
* Brought up HS2 and HMS successfully and ran CREATE, INSERT, SELECT, DROP queries for external tables and CREATE, INSERT, SELECT, DELETE, UPDATE, DROP queries for transactional and Iceberg tables using Tez.

+1 (non-binding)

Thanks Stamatis for driving the release.

Regards,
Sourabh Badhya