[jira] [Created] (HIVE-24819) CombineHiveInputFormat format seems to be returning row count in the multiple of Maps
Jitender Kumar created HIVE-24819:
-------------------------------------

Summary: CombineHiveInputFormat format seems to be returning row count in the multiple of Maps
Key: HIVE-24819
URL: https://issues.apache.org/jira/browse/HIVE-24819
Project: Hive
Issue Type: Bug
Environment: Apache Hive (version 3.1.0.3.1.0.0-78)
Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
Reporter: Jitender Kumar

Hi Team,

This is the first time I am filing a bug in Apache Jira, so pardon me if I am unintentionally breaking any protocols.

I am facing the following issue (on a multi-node cluster) when I set hive.tez.input.format to org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.

Just for demonstration purposes, I will be executing the following query for multiple cases.

_select count(1) from dbname.personal_data_rc tablesample(1000 rows);_

*Case 1*
mapred.map.tasks=2
hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat
*Output* 1000

*Case 2*
mapred.map.tasks=2
hive.tez.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
*Output* 2000

*Case 3*
mapred.map.tasks=3
hive.tez.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
*Output* 3000

After 3 maps are set as the default, the output remains the same, i.e. a multiple of 3.

Can you help me understand why, with TABLESAMPLE set to 1000 rows, the query returns more rows than that? Is there any other property that must be used with CombineHiveInputFormat, or is it an issue with CombineHiveInputFormat only?

I have tried to look for a solution, but in the end I had to come here. Please share your inputs ASAP, as one of our clients is looking for a solution or an explanation regarding this.

For now, as a workaround, we have changed the setting to the following.

*hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat*

--
This message was sent by Atlassian Jira (v8.3.4#803005)
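One possible reading of the three outputs above, offered as an assumption and not as a confirmed diagnosis of this ticket: the Sampling page of the Hive Language Manual describes the row count in TABLESAMPLE(n ROWS) as being applied to each input split, so the total can vary with the number of splits. The Python sketch below only models that arithmetic. The function name `sample_rows_per_split` and the split sizes are hypothetical; none of this is Hive code.

```python
from typing import List


def sample_rows_per_split(split_sizes: List[int], n_rows: int) -> int:
    """Model TABLESAMPLE(n ROWS) as a limit applied independently to each split.

    Assumption: every split contributes at most n_rows rows, and count(1)
    sums the contributions of all splits.
    """
    return sum(min(size, n_rows) for size in split_sizes)


# Hypothetical split sizes; real values depend on the table and the cluster.
one_split = [50_000]
two_splits = [25_000, 25_000]
three_splits = [20_000, 15_000, 15_000]

print(sample_rows_per_split(one_split, 1000))     # 1000
print(sample_rows_per_split(two_splits, 1000))    # 2000
print(sample_rows_per_split(three_splits, 1000))  # 3000
```

Under this model the count scales with the number of splits that the input format produces, which would match the 1000 / 2000 / 3000 pattern reported in the three cases. Whether that is what actually happens here is for the ticket to establish.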
Re: Need help to create 2.3.9 release in Hive JIRA
Bump this again. Can someone create the 2.3.9 release in JIRA, please?

On Thu, Jan 28, 2021 at 10:00 AM Chao Sun wrote:

> Bump this, also cc Owen who helped me last time (sorry for directly
> emailing you).
>
> On Tue, Jan 19, 2021 at 4:07 PM Chao Sun wrote:
>
>> Hi,
>>
>> Can someone help me to create 2.3.9 release in Hive JIRA so that we can
>> use that as fixed or targeted version? Thanks.
>>
>> Best,
>> Chao
[jira] [Created] (HIVE-24818) REPL LOAD (Bootstrap ) of views with partitions fails
Anurag Shekhar created HIVE-24818:
-------------------------------------

Summary: REPL LOAD (Bootstrap ) of views with partitions fails
Key: HIVE-24818
URL: https://issues.apache.org/jira/browse/HIVE-24818
Project: Hive
Issue Type: Bug
Components: repl
Reporter: Anurag Shekhar
Assignee: Anurag Shekhar
[jira] [Created] (HIVE-24817) "not in" clause returns incorrect data when there is coercion
Steve Carlin created HIVE-24817:
-----------------------------------

Summary: "not in" clause returns incorrect data when there is coercion
Key: HIVE-24817
URL: https://issues.apache.org/jira/browse/HIVE-24817
Project: Hive
Issue Type: Bug
Components: CBO
Reporter: Steve Carlin

When the query has a where clause that has an integer column checking against being "not in" a decimal column, the decimal column is being changed to null, causing incorrect results.

This is a sample query of a failure:

select count(*) from my_tbl where int_col not in (355.8);

Since the int_col can never be 355.8, one would expect all the rows to be returned, but it is changing the 355.8 into a null value causing no rows to be returned.
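To see why a literal that is coerced to null makes the query return no rows, it may help to recall standard SQL three-valued logic: `x NOT IN (NULL)` evaluates to UNKNOWN, and a WHERE clause keeps only rows whose predicate is TRUE. The Python sketch below models that rule. The helper names `sql_not_in` and `count_matching` are hypothetical, `None` stands in for SQL NULL, and the code illustrates the semantics only, not Hive's CBO.

```python
from typing import Optional, Sequence


def sql_not_in(value: Optional[float],
               candidates: Sequence[Optional[float]]) -> Optional[bool]:
    """Evaluate `value NOT IN (candidates)` with SQL three-valued logic.

    Returns True, False, or None, where None stands for SQL UNKNOWN.
    """
    if value is None:
        return None
    saw_null = False
    for candidate in candidates:
        if candidate is None:
            saw_null = True          # comparison with NULL is UNKNOWN
        elif candidate == value:
            return False             # a definite match makes NOT IN false
    return None if saw_null else True


def count_matching(int_col: Sequence[Optional[int]],
                   candidates: Sequence[Optional[float]]) -> int:
    # WHERE keeps a row only when the predicate is exactly TRUE.
    return sum(1 for v in int_col if sql_not_in(v, candidates) is True)


rows = [1, 2, 355]                     # hypothetical int_col values
print(count_matching(rows, [355.8]))   # literal kept as a decimal
print(count_matching(rows, [None]))    # literal coerced to null
```

With the decimal literal intact every row passes, since no integer equals 355.8. Once the literal becomes null the predicate is UNKNOWN for every row and the count drops to zero, which is the behavior the ticket reports.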
[jira] [Created] (HIVE-24816) Upgrade jackson to 2.10.5.1 or 2.11.0+ due to CVE-2020-25649
Sai Hemanth Gantasala created HIVE-24816:
--------------------------------------------

Summary: Upgrade jackson to 2.10.5.1 or 2.11.0+ due to CVE-2020-25649
Key: HIVE-24816
URL: https://issues.apache.org/jira/browse/HIVE-24816
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Sai Hemanth Gantasala
Assignee: Sai Hemanth Gantasala

Currently, Hive is pulling the Jackson 2.10.5 jar. Please upgrade to 2.10.5.1 or 2.11.0+ due to CVE-2020-25649.
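The version rule named in the ticket (2.10.5.1 within the 2.10 line, or 2.11.0 and later) can be checked mechanically. The following Python sketch is one way to express it. The function names are hypothetical, only purely numeric dotted versions are handled, and the thresholds are taken from the ticket text, not from an independent reading of the CVE advisory.

```python
from typing import Tuple


def parse_version(version: str, width: int = 4) -> Tuple[int, ...]:
    """Parse a numeric dotted version and pad it with zeros to `width` parts."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_ticket_requirement(version: str) -> bool:
    """Return True if `version` is 2.10.5.1+ on the 2.10 line, or 2.11.0+."""
    v = parse_version(version)
    if v[:2] == (2, 10):
        return v >= (2, 10, 5, 1)
    return v >= (2, 11, 0, 0)


for candidate in ["2.10.5", "2.10.5.1", "2.11.0", "2.9.10"]:
    print(candidate, meets_ticket_requirement(candidate))
```

Padding to four components keeps the micro-patch release 2.10.5.1 ordered after 2.10.5, which a three-component comparison would miss.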
[jira] [Created] (HIVE-24815) Remove "IDXS" Table from Metastore Schema
Hunter Logan created HIVE-24815:
-----------------------------------

Summary: Remove "IDXS" Table from Metastore Schema
Key: HIVE-24815
URL: https://issues.apache.org/jira/browse/HIVE-24815
Project: Hive
Issue Type: Improvement
Components: Metastore, Standalone Metastore
Affects Versions: 3.1.2, 3.1.1, 3.0.0, 3.1.0, 3.2.0, 4.0.0
Reporter: Hunter Logan

In Hive 3 the rarely used "INDEXES" feature was removed from the DDL:
https://issues.apache.org/jira/browse/HIVE-18448

There are a few issues here:
# The Standalone-Metastore schemas for Hive 3+ all include the "IDXS" table, which has no function.
** [https://github.com/apache/hive/tree/master/standalone-metastore/metastore-server/src/main/sql/mysql]
# The upgrade schemas from 2.x -> 3.x do not do any cleanup of the IDXS table.
** If a user used the "INDEXES" feature in 2.x and then upgrades their metastore to 3.x+, they cannot drop any table that has an index on it because of the "IDXS_FK1" constraint, since the TBLS entry is referenced in the IDXS table.
** Since INDEX is no longer in the DDL, they cannot run any command from Hive to drop the index.
** Users can manually connect to the metastore and drop either the IDXS table or the foreign key constraint.

Since indexes provide no benefits in Hive 3+, it should be fine to drop them completely in the schema upgrade scripts. At the very least, the 2.x -> 3.x+ scripts should drop the FK constraint.
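The blocking behavior described in the list above can be reproduced in miniature with an in-memory SQLite database. This is a sketch under stated assumptions: the two-table schema is heavily simplified, the column names (`TBL_ID`, `ORIG_TBL_ID`, `INDEX_ID`) are illustrative, and SQLite stands in for the real metastore database. Only the table names and the constraint name `IDXS_FK1` come from the ticket.

```python
import sqlite3


def demo_idxs_constraint() -> dict:
    """Show a leftover IDXS row blocking deletion of its TBLS entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute("CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, TBL_NAME TEXT)")
    conn.execute(
        "CREATE TABLE IDXS ("
        " INDEX_ID INTEGER PRIMARY KEY,"
        " ORIG_TBL_ID INTEGER,"
        " CONSTRAINT IDXS_FK1 FOREIGN KEY (ORIG_TBL_ID) REFERENCES TBLS (TBL_ID))"
    )
    conn.execute("INSERT INTO TBLS VALUES (1, 'indexed_table')")
    conn.execute("INSERT INTO IDXS VALUES (10, 1)")  # index left over from 2.x
    conn.commit()

    result = {}
    try:
        conn.execute("DELETE FROM TBLS WHERE TBL_ID = 1")
        result["blocked_before_cleanup"] = False
    except sqlite3.IntegrityError:
        result["blocked_before_cleanup"] = True

    # Manual cleanup in the spirit of the ticket: remove the unused IDXS table.
    conn.execute("DROP TABLE IDXS")
    conn.execute("DELETE FROM TBLS WHERE TBL_ID = 1")
    result["tables_left"] = conn.execute("SELECT COUNT(*) FROM TBLS").fetchone()[0]
    conn.close()
    return result


print(demo_idxs_constraint())
```

The first delete fails with an integrity error while the IDXS row still references the table; after the IDXS table is dropped, the same delete succeeds. A real cleanup would have to be written for the actual metastore database and schema version.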
[jira] [Created] (HIVE-24814) Harmonize Hive Date-Time Formats
David Mollitor created HIVE-24814:
-------------------------------------

Summary: Harmonize Hive Date-Time Formats
Key: HIVE-24814
URL: https://issues.apache.org/jira/browse/HIVE-24814
Project: Hive
Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor

Harmonize Hive on JDK date-time formats.
[jira] [Created] (HIVE-24813) thrift regeneration is failing with cannot find symbol TABLE_IS_CTAS
Attila Magyar created HIVE-24813:
------------------------------------

Summary: thrift regeneration is failing with cannot find symbol TABLE_IS_CTAS
Key: HIVE-24813
URL: https://issues.apache.org/jira/browse/HIVE-24813
Project: Hive
Issue Type: Bug
Components: Standalone Metastore
Reporter: Attila Magyar
Assignee: Attila Magyar
Fix For: 4.0.0

{code:java}
[ERROR] /Users/amagyar/development/hive/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java:[2145,34] cannot find symbol
[ERROR] symbol: variable TABLE_IS_CTAS
[ERROR] location: class org.apache.hadoop.hive.metastore.HMSHandler
[ERROR] /Users/amagyar/development/hive/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDefaultTransformer.java:[591,58] cannot find symbol
[ERROR] symbol: variable TABLE_IS_CTAS
[ERROR] location: class org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer
[ERROR] -> [Help 1]
{code}
Re: [EXTERNAL] Hive meetup
I'm interested. I'd like to propose talking about future releases and making these more regular, as well as the absolute pain that the Hive build is with all its flaky unit tests. I know some work has been done on this in the past, but I think it's a huge barrier to new developers, especially casual ones who want to fix a small bug but can never get all the tests to pass.

Hive-Iceberg is another good topic.

On Tue, 23 Feb 2021 at 11:20, Peter Vary wrote:

> +1 for the meetup
>
> If the team is interested, we can talk about Hive-Iceberg integration
>
> Thanks,
> Peter
>
> > On Feb 23, 2021, at 04:34, Aasha wrote:
> >
> > +1
> >
> >> On 22-Feb-2021, at 11:54 PM, Matt McCline wrote:
> >>
> >> Definitely interested.
> >>
> >> -----Original Message-----
> >> From: Zoltan Haindrich
> >> Sent: Monday, February 22, 2021 10:17 AM
> >> To: dev@hive.apache.org
> >> Subject: [EXTERNAL] Hive meetup
> >>
> >> Hey All!
> >>
> >> It was quite some time ago when we had a meetup - and in these covid
> >> times it would be online-only anyway :) We were mentioning this lately
> >> here and there at Cloudera.
> >> I think we could have a few talks spanning 2-3 hours or so.
> >>
> >> Are there any interest in it?
> >>
> >> I would be happy to talk about how hive-test-kube works and how
> >> hive-dev-box is employed during testing.
> >>
> >> cheers,
> >> Zoltan
Re: Any plan for new hive 3 or 4 release?
I would love to see a Hive 3.1 release which is capable of being used on Java 11 like Hive 2 is.

What is the main difference going to be between Hive 3 and 4? The removal of MR?

On Mon, 22 Feb 2021 at 16:46, Zoltan Haindrich wrote:

> Hey Michel!
>
> Yes it was a long time ago we had a release; we have quite a few new
> features in master.
> I think we are scaring people for some time now that we will be dropping
> MR support...I think we should do that.
>
> I would really like to see a new Hive release in the near future as well -
> there is no way for users to even try out new features.
> I was planning to add nightly builds to package the latest master's state
> into a deployable artifact - I think a service like may help pretest our
> next release; I think it won't take much to do it so I'll probably throw
> it together in the next couple days!
>
> cheers,
> Zoltan
>
> On 2/21/21 2:27 PM, Michel Sumbul wrote:
> > Hi Guys,
> >
> > If I'm not wrong, the last release of Hive 3.x is 18 months old.
> > I wanted to ask if you had any roadmap / plan to release a new version of
> > Hive 3.x or Hive 4?
> >
> > Thanks,
> > Michel
[jira] [Created] (HIVE-24812) Disable sharedworkoptimizer remove semijoin by default
Zoltan Haindrich created HIVE-24812:
---------------------------------------

Summary: Disable sharedworkoptimizer remove semijoin by default
Key: HIVE-24812
URL: https://issues.apache.org/jira/browse/HIVE-24812
Project: Hive
Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich

SJ removal backfired a bit when I was testing stuff, because of the additional opportunities parallel edges may enable: it will increase the shuffled memory amount and/or even make MJ broadcast inputs larger.

Set hive.optimize.shared.work.semijoin=false by default for now.

Right now it's better to leave dppunion to pick up these cases instead of removing the SJ fully - after HIVE-24376 we might enable it back.
Re: [EXTERNAL] Hive meetup
+1 for the meetup

If the team is interested, we can talk about Hive-Iceberg integration

Thanks,
Peter

> On Feb 23, 2021, at 04:34, Aasha wrote:
>
> +1
>
>> On 22-Feb-2021, at 11:54 PM, Matt McCline wrote:
>>
>> Definitely interested.
>>
>> -----Original Message-----
>> From: Zoltan Haindrich
>> Sent: Monday, February 22, 2021 10:17 AM
>> To: dev@hive.apache.org
>> Subject: [EXTERNAL] Hive meetup
>>
>> Hey All!
>>
>> It was quite some time ago when we had a meetup - and in these covid times
>> it would be online-only anyway :) We were mentioning this lately here and
>> there at Cloudera.
>> I think we could have a few talks spanning 2-3 hours or so.
>>
>> Are there any interest in it?
>>
>> I would be happy to talk about how hive-test-kube works and how hive-dev-box
>> is employed during testing.
>>
>> cheers,
>> Zoltan