Hive CI / github user change

2023-08-10 Thread Zoltan Haindrich

Hey,

For a long time hive-ci has been adding comments etc. under my username on every PR;
I've worked with infra (INFRA-24854) to get a PAT for one of their own users.
I don't see any issues arising from this change, but wanted to let you know :)

cheers,
Zoltan




Re: How to use `engine` introduced by HIVE-22046

2023-08-10 Thread Butao Zhang
Hi, Okumin


I have encountered this issue before, and 'validWriteIdList' is also an
incompatible parameter. I have submitted a PR to the trino-hive-apache repo;
you can refer to https://github.com/trinodb/trino-hive-apache/pull/43 .
IIUC, the 'engine' parameter is used to differentiate between stats produced by
different engines (e.g. Hive), but it seems that the downstream engines do not
want to adopt the new 'engine' parameter.
At present, if an engine (e.g. Trino) uses a customized thrift API to interact
with HMS, it must change its thrift file to match the thrift definition of HMS.
BTW, maybe we can change the HMS thrift file to make the 'engine' parameter
optional, so that other customized thrift clients will not have compatibility
issues.
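
A minimal sketch of the "optional engine" idea, written in plain Python rather
than Thrift. The class below is only a stand-in for the TableStatsRequest struct
seen in the error message, and falling back to "hive" is an assumption taken
from question (2) in the quoted mail, not something HMS is confirmed to do.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed fallback value; the thread only raises it as a question.
DEFAULT_ENGINE = "hive"


@dataclass
class TableStatsRequest:
    """Illustrative stand-in for the Thrift struct, not the generated class."""
    db_name: str
    tbl_name: str
    col_names: List[str]
    engine: Optional[str] = None  # optional: older clients never set it


def resolve_engine(request: TableStatsRequest) -> str:
    """Return the engine to use, falling back to the assumed default."""
    return request.engine if request.engine else DEFAULT_ENGINE


# A client built against the old definition sends no engine at all.
legacy = TableStatsRequest("default", "test_trino", ["id"])
# A client that knows about the field sets it explicitly.
modern = TableStatsRequest("default", "test_trino", ["id"], engine="impala")

print(resolve_engine(legacy))
print(resolve_engine(modern))
```

With a fallback like this, a request from an unmodified client would be served
instead of being rejected because a required field is unset.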

Thanks,

Butao Zhang

 Replied Message 
| From | Okumin |
| Date | 8/10/2023 23:41 |
| To |  |
| Subject | How to use `engine` introduced by HIVE-22046 |
Hi Hive developers,

I noticed that HIVE-22046 introduced an incompatibility in the Metastore APIs
while I was testing integration between Hive 4 and other software. If I understand
correctly, clients are currently required to additionally specify the
engine name when they get or update column statistics.

- https://issues.apache.org/jira/browse/HIVE-22046
- https://github.com/apache/hive/pull/741

For example, Trino has a feature to use column stats and it fails. Note
that I am not 100% sure about Trino's implementation or behavior.

```
trino> create table hive.default.test_trino (id int);
Query 20230810_152236_4_t9n6h failed: Required field 'engine' is unset!
Struct:TableStatsRequest(dbName:default, tblName:test_trino, colNames:[id],
engine:null)
```

I have two questions about this feature.

(1) Should any engine use a unique engine name?

I guess some software can store or use stats compatible with Hive. I wonder
if it can reuse engine=hive in that case, or should use a different name
like engine=trino.

I see Impala gives a unique engine name to metastore. Taking a glance,
Spark is unlikely to be using col stats of Hive directly.

- https://issues.apache.org/jira/browse/IMPALA-8842

(2) Should Hive Metastore use engine=hive as a default value?

If other compatible software can reuse engine=hive, it could be an option
to accept requests with the old format assuming its engine is "hive" for
compatibility. Or should they explicitly specify engine=hive when using
Hive 4?

Regards,
Okumin


[RESULT][VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)

2023-08-10 Thread Stamatis Zampetakis
Thanks to everyone who has tested the release candidate and given
their comments and votes.

The tally is as follows.

3 binding +1s:
Stamatis Zampetakis
Denys Kuzmenko
Ayush Saxena

3 non-binding +1s:
Zhihua Deng
Sourabh Badhya
Simhadri Govindappa

No 0s or -1s.

Therefore, I am delighted to announce that the proposal to release
Apache Hive 4.0.0-beta-1 has passed.
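
The tally can be checked against the rule stated in the vote mail ("passes if a
majority of at least three +1 PMC votes are cast"). The sketch below is one
reading of that rule: only binding votes count, at least three binding +1s are
needed, and binding +1s must outnumber binding -1s. The function name and the
tuple layout are illustrative assumptions.

```python
from typing import Iterable, Tuple

# (voter, value in {+1, 0, -1}, is the vote binding?)
Vote = Tuple[str, int, bool]


def release_vote_passes(votes: Iterable[Vote], min_binding_plus_ones: int = 3) -> bool:
    """Apply the assumed rule: enough binding +1s, and more of them than binding -1s."""
    binding = [value for _, value, is_binding in votes if is_binding]
    plus = sum(1 for value in binding if value == 1)
    minus = sum(1 for value in binding if value == -1)
    return plus >= min_binding_plus_ones and plus > minus


votes = [
    ("Stamatis Zampetakis", 1, True),
    ("Denys Kuzmenko", 1, True),
    ("Ayush Saxena", 1, True),
    ("Zhihua Deng", 1, False),
    ("Sourabh Badhya", 1, False),
    ("Simhadri Govindappa", 1, False),
]

print(release_vote_passes(votes))
```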

I will proceed with the next steps of the release and I will send an
announcement once the release becomes publicly available (sometime
next week).

Best,
Stamatis


How to use `engine` introduced by HIVE-22046

2023-08-10 Thread Okumin
Hi Hive developers,

I noticed that HIVE-22046 introduced an incompatibility in the Metastore APIs
while I was testing integration between Hive 4 and other software. If I understand
correctly, clients are currently required to additionally specify the
engine name when they get or update column statistics.

- https://issues.apache.org/jira/browse/HIVE-22046
- https://github.com/apache/hive/pull/741

For example, Trino has a feature to use column stats and it fails. Note
that I am not 100% sure about Trino's implementation or behavior.

```
trino> create table hive.default.test_trino (id int);
Query 20230810_152236_4_t9n6h failed: Required field 'engine' is unset!
Struct:TableStatsRequest(dbName:default, tblName:test_trino, colNames:[id],
engine:null)
```

I have two questions about this feature.

(1) Should any engine use a unique engine name?

I guess some software can store or use stats compatible with Hive. I wonder
if it can reuse engine=hive in that case, or should use a different name
like engine=trino.

I see Impala gives a unique engine name to metastore. Taking a glance,
Spark is unlikely to be using col stats of Hive directly.

- https://issues.apache.org/jira/browse/IMPALA-8842

(2) Should Hive Metastore use engine=hive as a default value?

If other compatible software can reuse engine=hive, it could be an option
to accept requests with the old format assuming its engine is "hive" for
compatibility. Or should they explicitly specify engine=hive when using
Hive 4?

Regards,
Okumin


Re: [VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)

2023-08-10 Thread Ayush Saxena
+1(Binding)

* Built from source
* Verified checksums
* Verified signatures
* Verified no code diff between the git tag & source tar
* Checked the NOTICE & LICENSE files.
* Skimmed over HS2 UI
* Deployed with Derby and tried some operations on ACID & Iceberg tables.

Thanks, Stamatis, for driving the release. Good luck!!!

-Ayush

On Thu, 10 Aug 2023 at 18:27, Denys Kuzmenko wrote:
>
> +1
>
> * Verified signatures and checksum;
> * Checked binary content and successfully built from the source;
> * Skimmed through the release notes;
> * Initialized backend DB schema and launched HMS & HS2 locally;
> * Conducted basic checks via beeline:
> - Created a few ACID & Iceberg tables and loaded data into them;
> - Executed Select/Insert/Update/Delete/Merge/IOW queries.
>
> Thanks, Stamatis for driving the release.
>
> Regards,
> Denys


Re: [VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)

2023-08-10 Thread Denys Kuzmenko
+1

* Verified signatures and checksum;
* Checked binary content and successfully built from the source;
* Skimmed through the release notes;
* Initialized backend DB schema and launched HMS & HS2 locally;
* Conducted basic checks via beeline:
- Created a few ACID & Iceberg tables and loaded data into them;
- Executed Select/Insert/Update/Delete/Merge/IOW queries.

Thanks, Stamatis for driving the release.

Regards,
Denys


Re: [VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)

2023-08-10 Thread Simhadri G
Hi Everyone,

Thanks, Stamatis for driving the release.

+1 (non-binding)

Verified the following:

* Download the source tarball, signature (.asc), and checksum (.sha512): OK
* Import gpg keys: download KEYS and run gpg --import /path/to/downloaded/KEYS.txt
  -> Verify the signature by running: gpg --verify
  ./apache-hive-4.0.0-beta-1-bin.tar.gz.asc /apache-hive-4.0.0-beta-1-bin.tar.gz : OK
* Validated checksum and signature for the artifacts: OK
* Build from source successfully: OK
* Init meta scripts against MySQL: OK
* Successful standalone metastore setup with MySQL: OK
* Bring up HiveServer2 and Metastore, run some simple Hive queries and
  Iceberg queries using Tez: OK
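
For the checksum step, a small sketch of how a downloaded artifact could be
compared against a published digest. The helper names are illustrative. The
hashes listed in the vote mail are 64 hex characters long, which is the length
of a SHA-256 digest, while the checksum file mentioned above is .sha512, so the
algorithm is left as a parameter.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs are not loaded into memory at once."""
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path: Path, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the computed digest with the published one, ignoring case and whitespace."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

Usage would look like matches(Path("apache-hive-4.0.0-beta-1-src.tar.gz"), published_hash),
where published_hash is the value copied from the vote mail. This only covers
the checksum; the signature still has to be verified with gpg.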

Thanks!
Simhadri G



On Thu, Aug 10, 2023 at 12:30 PM Sourabh Badhya wrote:

> I was able to do the following with RC0 artifacts -
> * Verified checksum of sources and binaries.
> * Successfully built from source.
> * Successful metastore DB setup with Postgres.
> * Brought up HS2 and HMS successfully and ran CREATE, INSERT, SELECT, DROP
> queries for external tables and CREATE, INSERT, SELECT, DELETE, UPDATE,
> DROP queries for transactional and Iceberg tables using Tez.
>
> +1 (non-binding)
>
> Thanks Stamatis for driving the release.
>
> Regards,
> Sourabh Badhya
>
> On Wed, Aug 9, 2023 at 10:53 AM dengzhhu653 wrote:
>
> > +1 (non-binding), and thanks for driving the release.
> > * Verified signature/Checksum of sources and binaries;
> > * Good rat check and source files;
> > * Build from source successfully;
> > * Inited meta scripts against Postgres;
> > * Bring up HiveServer2 and Metastore, run some simple queries using Tez: ok
> > Thanks,
> > Zhihua
> > At 2023-08-07 21:55:30, "Stamatis Zampetakis" wrote:
> > >Hi all,
> > >
> > >I have created a build for Apache Hive 4.0.0-beta-1 Release Candidate 0.
> > >
> > >Thanks to everyone who has contributed to this release.
> > >
> > >You can read the release notes here:
> > >https://github.com/apache/hive/blob/branch-4.0.0-beta-1/RELEASE_NOTES.txt
> > >
> > >The commit to be voted upon:
> > >https://github.com/apache/hive/commit/d2310944e412b577a39687c7968b2e93eede8433
> > >
> > >Its hash is
> > >d2310944e412b577a39687c7968b2e93eede8433
> > >
> > >Tag:
> > >https://github.com/apache/hive/tree/release-4.0.0-beta-1-rc0
> > >
> > >The artifacts to be voted on are located here:
> > >https://people.apache.org/~zabetak/apache-hive-4.0.0-beta-1-rc0/
> > >
> > >The hashes of the artifacts are as follows:
> > >- 4114d8e9a523562c77237a8751dec9ed1bcbf6ccbe2e178d72f356ca4e65d466
> > >apache-hive-4.0.0-beta-1-bin.tar.gz
> > >- 8d157f4dcb9af5e48e51206a4046d1c11414fbc39583c84be31d609606136209
> > >apache-hive-4.0.0-beta-1-src.tar.gz
> > >
> > >A staged Maven repository is available for review at:
> > >https://repository.apache.org/content/repositories/orgapachehive-1119
> > >
> > >Release artifacts are signed with the following key:
> > >https://people.apache.org/keys/committer/zabetak.asc
> > >https://downloads.apache.org/hive/KEYS
> > >
> > >Please vote on releasing this package as Apache Hive 4.0.0-beta-1.
> > >
> > >The vote is open for the next 72 hours and passes if a majority of at
> > >least three +1 PMC votes are cast.
> > >
> > >[ ] +1 Release this package as Apache Hive 4.0.0-beta-1
> > >[ ]  0 I don't feel strongly about it, but I'm okay with the release
> > >[ ] -1 Do not release this package because...
> > >
> > >Here is my vote:
> > >+1 (binding)
> > >
> > >Best,
> > >Stamatis
> >
>


Re: [VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)

2023-08-10 Thread Sourabh Badhya
I was able to do the following with RC0 artifacts -
* Verified checksum of sources and binaries.
* Successfully built from source.
* Successful metastore DB setup with Postgres.
* Brought up HS2 and HMS successfully and ran CREATE, INSERT, SELECT, DROP
queries for external tables and CREATE, INSERT, SELECT, DELETE, UPDATE,
DROP queries for transactional and Iceberg tables using Tez.

+1 (non-binding)

Thanks Stamatis for driving the release.

Regards,
Sourabh Badhya

On Wed, Aug 9, 2023 at 10:53 AM dengzhhu653 wrote:

> +1 (non-binding), and thanks for driving the release.
> * Verified signature/Checksum of sources and binaries;
> * Good rat check and source files;
> * Build from source successfully;
> * Inited meta scripts against Postgres;
> * Bring up HiveServer2 and Metastore, run some simple queries using Tez: ok
> Thanks,
> Zhihua
> At 2023-08-07 21:55:30, "Stamatis Zampetakis" wrote:
> >Hi all,
> >
> >I have created a build for Apache Hive 4.0.0-beta-1 Release Candidate 0.
> >
> >Thanks to everyone who has contributed to this release.
> >
> >You can read the release notes here:
> >https://github.com/apache/hive/blob/branch-4.0.0-beta-1/RELEASE_NOTES.txt
> >
> >The commit to be voted upon:
> >https://github.com/apache/hive/commit/d2310944e412b577a39687c7968b2e93eede8433
> >
> >Its hash is
> >d2310944e412b577a39687c7968b2e93eede8433
> >
> >Tag:
> >https://github.com/apache/hive/tree/release-4.0.0-beta-1-rc0
> >
> >The artifacts to be voted on are located here:
> >https://people.apache.org/~zabetak/apache-hive-4.0.0-beta-1-rc0/
> >
> >The hashes of the artifacts are as follows:
> >- 4114d8e9a523562c77237a8751dec9ed1bcbf6ccbe2e178d72f356ca4e65d466
> >apache-hive-4.0.0-beta-1-bin.tar.gz
> >- 8d157f4dcb9af5e48e51206a4046d1c11414fbc39583c84be31d609606136209
> >apache-hive-4.0.0-beta-1-src.tar.gz
> >
> >A staged Maven repository is available for review at:
> >https://repository.apache.org/content/repositories/orgapachehive-1119
> >
> >Release artifacts are signed with the following key:
> >https://people.apache.org/keys/committer/zabetak.asc
> >https://downloads.apache.org/hive/KEYS
> >
> >Please vote on releasing this package as Apache Hive 4.0.0-beta-1.
> >
> >The vote is open for the next 72 hours and passes if a majority of at
> >least three +1 PMC votes are cast.
> >
> >[ ] +1 Release this package as Apache Hive 4.0.0-beta-1
> >[ ]  0 I don't feel strongly about it, but I'm okay with the release
> >[ ] -1 Do not release this package because...
> >
> >Here is my vote:
> >+1 (binding)
> >
> >Best,
> >Stamatis
>