from:"Zoltan Haindrich"

Zoltan Haindrich created HIVE-25720:
---

 Summary: Fix flaky test TestScheduledReplicationScenarios
 Key: HIVE-25720
 URL: https://issues.apache.org/jira/browse/HIVE-25720
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


failed at the first attempt; the issue happened during
{code}
drop scheduled query repl_load_p2
{code}
which is in a finally block ; so this exception may be shadowing another 
exception

http://ci.hive.apache.org/job/hive-flaky-check/463/





--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (HIVE-25719) Fix flaky test TestMiniLlapLocalCliDriver#testCliDriver[replication_metrics_ingest]

Zoltan Haindrich created HIVE-25719:
---

 Summary: Fix flaky test 
TestMiniLlapLocalCliDriver#testCliDriver[replication_metrics_ingest]
 Key: HIVE-25719
 URL: https://issues.apache.org/jira/browse/HIVE-25719
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


flaky checker failed after 3 attempts with a q.out difference

there seems to be some ID difference - maybe 2 events happened in a different 
order?

http://ci.hive.apache.org/job/hive-flaky-check/465/testReport/junit/org.apache.hadoop.hive.cli/TestMiniLlapLocalCliDriver/testCliDriver_replication_metrics_ingest_/



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Re: hive-exec vs. hive-exec:core

2021-11-17 Thread Zoltan Haindrich


Hey all,

I wanted to get back to this - but had other things going on.

Chao> it is still being used today by some other popular projects
the idea is to fix the issues they bump into - because people who load the jdbc 
driver may also see those issues.

Edward> [...] You all must like enjoy shading jars.
I totally agree that they may use a shell action as well.
I wonder how do you propose to solve issues related to clients using a 
different version of the guava library?

The changes which will remove the core artifact stuff is ready: 
https://github.com/apache/hive/pull/2648

cheers,
Zoltan

On 9/21/21 8:23 PM, Edward Capriolo wrote:

recommendation from the Hive team is to use the hive-exec.jar artifact.

You know about 10 years ago. I mentioned that oozie should just use
hive-service or hive jdbc. After a big fight where folks kept bringing up
concurrency bugs in hive-server-1 my prs were rejected (even though hive
server2 would not have these bugs). I still cannot fathom why someone using
oozie would want a fat jar of hive (as opposed to hive server or hivejdbc)
. If I had to do that, i would just use shell action. You all must like
enjoy shading jars.

Edward

On Thu, Sep 16, 2021 at 2:30 PM Chao Sun  wrote:


I'm not sure whether it is a good idea to remove `hive-exec-core`
completely - it is still being used today by some other popular projects
including Spark and Trino/Presto. By sticking to `hive-exec-core` it gives
more flexibility to the other projects to shade & relocate those classes
according to their need, without waiting for new Hive releases. Hive also
needs to make sure it relocate everything properly. Otherwise, if some
classes are shaded & included in `hive-exec` but not relocated, there is no
way for the other projects to exclude them and avoid potential conflicts.

Chao

On Thu, Sep 16, 2021 at 8:03 AM Zoltan Haindrich  wrote:


Hey

On 9/6/21 12:48 PM, Stamatis Zampetakis wrote:

Indeed this may lead to binary incompatibility problems as the one you
mentioned. If I understood correctly the problem you cite comes up if
library B in this case is not relocated. If Hive systematically

relocates

shaded deps do you think there will still be binary incompatibility

issues?


If the relocating solution works, I would personally prefer going down

this

path instead of introducing an entirely new module just for the sake of
dependency management. Most of the time when there are problems with
shading the answer comes from relocating the problematic dependencies

and

people are more or less accustomed with this route.


I totally agree with you Stamatis - with the addition that we should work
together with the owners of other projects to help them use the correct
artifact to gain access to
Hive's internal parts.
I've opened HIVE-25531 to remove the core classified artifact - and

ensure

that we will be uncovering and fixing future issues with the hive-exec
artifact.

cheers,
Zoltan




Best,
Stamatis

On Mon, Aug 30, 2021 at 9:49 PM Daniel Fritsi



wrote:


Dear Hive developers,

I am Dan from the Oozie team and I would like to bring up the
hive-exec.jar vs. hive-exec-core.jar topic.
The reason for that is because as far as we understand the official
recommendation from the Hive team is to use the hive-exec.jar

artifact.


However in Oozie that can end-up in a binary incompatibility.

The reason for that is:

* Let's say library A is included in the fat Jar.

* And library B which is using library A is also included in the

fat

Jar.


* Let's also say that library A's com.library.alib package is
  relocated to org.apache.hive.com.library.alib,
  meaning the com.library.alib.SomeClass becomes
  org.apache.hive.com.library.alib.SomeClass

* So if B has a method like public void
  someMethod(com.library.alib.SomeClass) then the signature of this
  method will be changed to:
  public void

someMethod(org.apache.hive.com.library.alib.SomeClass)


* If Oozie is also using B directly meaning we'll have b.jar on our
  classpath, but with the unchanged signature,
  so when hive-exec tries to invoke someMethod then depending on
  whether b.jar coming from us will be loaded first or hive-exec

will,

  we can end-up with a NoSuchMethodError is hive-exec tries to pass

an

  org.apache.hive.com.library.alib.SomeClass instance to the
  someMethod which was loaded from the original b.jar.

Hence in Oozie a long time ago (OOZIE-2621
<https://issues.apache.org/jira/browse/OOZIE-2621>) the decision was
made to use the hive-exec-core Jar.

Now since the shading process actually removes those dependencies from
the hive-exec pom which are included in the fat Jar, we manually had

to

add some dependencies to Oozie to compensate this.
However these dependencies are not used by Oozie directly and with the
growing features of hive-exec we had to repeat the same process
over-and-ov

[jira] [Created] (HIVE-25715) Provide nightly builds

Zoltan Haindrich created HIVE-25715:
---

 Summary: Provide nightly builds
 Key: HIVE-25715
 URL: https://issues.apache.org/jira/browse/HIVE-25715
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


provide nightly builds for the master branch



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (HIVE-25714) Some tests are flaky because docker is not able to start in 5 seconds

Zoltan Haindrich created HIVE-25714:
---

 Summary: Some tests are flaky because docker is not able to start 
in 5 seconds
 Key: HIVE-25714
 URL: https://issues.apache.org/jira/browse/HIVE-25714
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


there are some testruns failing with; and on the test site multiple pods are 
running in parallel - its not an ideal environment for tight deadlines
{code}
Unexpected exception java.lang.RuntimeException: Process docker failed to run 
in 5 seconds
 at 
org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB.runCmd(AbstractExternalDB.java:92)
 at 
org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB.launchDockerContainer(AbstractExternalDB.java:123)
 at 
org.apache.hadoop.hive.ql.qoption.QTestDatabaseHandler.beforeTest(QTestDatabaseHandler.java:111)
 at 
org.apache.hadoop.hive.ql.qoption.QTestOptionDispatcher.beforeTest(QTestOptionDispatcher.java:79)
{code}

http://ci.hive.apache.org/job/hive-precommit/job/PR-1674/4/testReport/junit/org.apache.hadoop.hive.cli.split19/TestMiniLlapLocalCliDriver/Testing___split_14___PostProcess___testCliDriver_qt_database_all_/



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (HIVE-25713) Fix test TestLlapTaskSchedulerService#testPreemption

Zoltan Haindrich created HIVE-25713:
---

 Summary: Fix test TestLlapTaskSchedulerService#testPreemption
 Key: HIVE-25713
 URL: https://issues.apache.org/jira/browse/HIVE-25713
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


when this test passes it passes under 100ms - but when it fails it keeps 
waiting or more than 10 seconds - the test seem to be using singal/await 

http://ci.hive.apache.org/job/hive-flaky-check/462/



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (HIVE-25712) Fix test TestContribCliDriver#testCliDriver[url_hook]

Zoltan Haindrich created HIVE-25712:
---

 Summary: Fix test TestContribCliDriver#testCliDriver[url_hook]
 Key: HIVE-25712
 URL: https://issues.apache.org/jira/browse/HIVE-25712
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


The test makes use of SampleURLHook - which could change the JDO url
http://ci.hive.apache.org/job/hive-flaky-check/460/



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (HIVE-25711) Make Table#isEmpty more efficient

Zoltan Haindrich created HIVE-25711:
---

 Summary: Make Table#isEmpty more efficient
 Key: HIVE-25711
 URL: https://issues.apache.org/jira/browse/HIVE-25711
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


[~stevel] suggested in another ticket that we could make our isEmpty method 
faster:

https://issues.apache.org/jira/browse/HIVE-24849?focusedCommentId=17372145&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17372145




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (HIVE-25707) SchemaTool may leave the metastore in-between upgrade steps

Zoltan Haindrich created HIVE-25707:
---

 Summary: SchemaTool may leave the metastore in-between upgrade 
steps
 Key: HIVE-25707
 URL: https://issues.apache.org/jira/browse/HIVE-25707
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


it seems like:
* schematool runs the sql files via beeline
* autocommit is turned on
* pressing ctrl+c or killing the process will result in an invalid schema

https://github.com/apache/hive/blob/6e02f6164385a370ee8014c795bee1fa423d7937/beeline/src/java/org/apache/hive/beeline/schematool/HiveSchemaTool.java#L79



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (HIVE-25703) Postgres metastore test failures

Zoltan Haindrich created HIVE-25703:
---

 Summary: Postgres metastore test failures
 Key: HIVE-25703
 URL: https://issues.apache.org/jira/browse/HIVE-25703
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


all recent builds are failing because postgres metastore don't start

underlying issue is that the docker container can't start because of:
```
ls: cannot access '/docker-entrypoint-initdb.d/': Operation not permitted
```



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Created] (HIVE-25692) ExceptionHandler may mask checked exceptions

2021-11-12 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25692:
---

 Summary: ExceptionHandler may mask checked exceptions
 Key: HIVE-25692
 URL: https://issues.apache.org/jira/browse/HIVE-25692
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


HIVE-25055 have changed the way exceptions as rethrowed - but one of the 
methods may let checked exception out without them being declared on the method 
(and avoid the compile time error for it)

testcase for:
org.apache.hadoop.hive.metastore.TestExceptionHandler

{code}
  @Test
  public void testInvalid() throws MetaException {
try {
  throw new IOException("IOException test");
} catch (Exception e) {
  throw handleException(e).throwIfInstance(AccessControlException.class, 
IOException.class).defaultMetaException();
}
  }
{code}

this testcase should not compile - as it may throw IOException or 
AccessControlException as well



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Re: Category-X JDBC drivers in Hive modules

2021-11-12 Thread Zoltan Haindrich

Hey Stamatis!

Makes sense to me; I think we already have all of the jdbc drivers in the test
scope - but adding runtime is a great idea!

I had some memories about some letter that we are using Cat-X stuff in Hive and
we should remove it - I think HIVE-23284 was opened in response to that.
However...if that comes back after these changes we may ask to update the
scanner because we only use it in test runtime.

cheers,
Zoltan

On 11/10/21 11:59 AM, Stamatis Zampetakis wrote:

Hi all,

Currently, we have some (MariadDB, MySQL, Oracle) Category-X [1] JDBC
drivers in some parts of the project. Sometimes they are included using the
dependency section with test and some others by relying on
download-maven-plugin [2].

Using test scope is kind of OK but it comes with the risk that we may write
code which needs JDBC driver classes in order to compile and this could be
seen as a violation of the AL2 when the Hive source code is released. From
my understanding, the use of download-maven-plugin, first introduced in
HIVE-23284 [3], was an attempt to remedy this problem. Now it comes back
since we started using the test scope again.

We have a few other drivers, namely Postgres, MSSQL, in test scope but are
less important since they have BSD-2 and MIT licenses which are not
problematic.

I would expect that in the context of Hive *all* the JDBC drivers should be
declared using the runtime. This would remove the need to
use the download-maven-plugin and would simplify the inclusion of drivers
in the build. We are not risking to create derivatives of GPL work since
the dependency is not present at compilation so we cannot really use the
respective classes in our code.

Moreover, driver dependencies could be marked optional, which is actually
true, and that would solve any potential licensing issues [4].

I would like to propose to use the following declaration for all JDBC
drivers no matter the license.

org.mariadb.jdbc
mariadb-java-client
${mariadb.version}
runtime
true

This will make things more uniform, solve any potential licensing issues,
and when in the future someone copy-pastes dependencies to include new
drivers there will be no violation of AL2.

What do you think?

Best,
Stamatis

[1] https://www.apache.org/legal/resolved.html#category-x
[2]
https://search.maven.org/artifact/com.googlecode.maven-download-plugin/download-maven-plugin/1.6.1/jar
[3] https://issues.apache.org/jira/browse/HIVE-23284
[4] https://www.apache.org/legal/resolved.html#optional

[jira] [Created] (HIVE-25634) Eclipse compiler bumps into AIOBE during ObjectStore compilation

2021-10-21 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25634:
---

 Summary: Eclipse compiler bumps into AIOBE during ObjectStore 
compilation
 Key: HIVE-25634
 URL: https://issues.apache.org/jira/browse/HIVE-25634
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


this issue seem to have started appearing after HIVE-23633



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25633) Prevent shutdown of MetaStore scheduled worker ThreadPool

2021-10-21 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25633:
---

 Summary: Prevent shutdown of MetaStore scheduled worker ThreadPool
 Key: HIVE-25633
 URL: https://issues.apache.org/jira/browse/HIVE-25633
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


[~lpinter] have noticed that this patch has some sideffect:

in HIVE-23164 the patch have added a {{ThreadPool#shutdown}} to 
{{HMSHandler#shutdown}} - which could cause trouble in case a {{HMSHandler}} is 
shutdown and a new one is created

I was looking for cases in which a HMSHandler is created inside the metastore 
(beyond the one HiveMetaStore is using) - and I think tasks like Msck use it to 
access the metastore - and they close the client - which closes the hmshandler 
client ; which will shut down the threadpool




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25630) Translator fixes

2021-10-21 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25630:
---

 Summary: Translator fixes
 Key: HIVE-25630
 URL: https://issues.apache.org/jira/browse/HIVE-25630
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


there are some issues:
* AlreadyExistsException might be suppressed by the translator
* uppercase letter usage may cause problems for some clients
* add a way to suppress location checks for legacy clients




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25569) Enable table definition over a single file

2021-09-28 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25569:
---

 Summary: Enable table definition over a single file
 Key: HIVE-25569
 URL: https://issues.apache.org/jira/browse/HIVE-25569
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


Suppose there is a directory where multiple files are present - and by a 3rd 
party database system this is perfectly normal - because its treating a single 
file as the contents of the table.

Tables defined in the metastore follow a different principle - tables are 
considered to be under a directory - and all files under that directory are the 
contents of that directory.

To enable seamless migration/evaluation of Hive and other databases using HMS 
as a metadatabackend the ability to define a table over a single file would be 
usefull.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Delays in precommit runs

2021-09-21 Thread Zoltan Haindrich

Hey All,

I've merged a change to enable branch indexing on the ci job - this will enable 
it to autocleanup old builds and it will also make sure it will start runs even 
in case the github event is lost.
As a side effect of this it rediscovered almost all PRs - i've aborted all of 
the runs which already had a green run...but left the others running.
So, right now it is busy running those tests...if I count it correctly; it 
still has around 25 to go...
it would have been better to wait until the weekend with this...sorry for the 
holdup; I think it will get better by tomorrow.

cheers,
Zoltan

Re: hive-exec vs. hive-exec:core

2021-09-16 Thread Zoltan Haindrich


Hey

On 9/6/21 12:48 PM, Stamatis Zampetakis wrote:

Indeed this may lead to binary incompatibility problems as the one you
mentioned. If I understood correctly the problem you cite comes up if
library B in this case is not relocated. If Hive systematically relocates
shaded deps do you think there will still be binary incompatibility issues?

If the relocating solution works, I would personally prefer going down this
path instead of introducing an entirely new module just for the sake of
dependency management. Most of the time when there are problems with
shading the answer comes from relocating the problematic dependencies and
people are more or less accustomed with this route.


I totally agree with you Stamatis - with the addition that we should work together with the owners of other projects to help them use the correct artifact to gain access to 
Hive's internal parts.

I've opened HIVE-25531 to remove the core classified artifact - and ensure that 
we will be uncovering and fixing future issues with the hive-exec artifact.

cheers,
Zoltan




Best,
Stamatis

On Mon, Aug 30, 2021 at 9:49 PM Daniel Fritsi 
wrote:


Dear Hive developers,

I am Dan from the Oozie team and I would like to bring up the
hive-exec.jar vs. hive-exec-core.jar topic.
The reason for that is because as far as we understand the official
recommendation from the Hive team is to use the hive-exec.jar artifact.

However in Oozie that can end-up in a binary incompatibility.

The reason for that is:

   * Let's say library A is included in the fat Jar.

   * And library B which is using library A is also included in the fat Jar.

   * Let's also say that library A's com.library.alib package is
 relocated to org.apache.hive.com.library.alib,
 meaning the com.library.alib.SomeClass becomes
 org.apache.hive.com.library.alib.SomeClass

   * So if B has a method like public void
 someMethod(com.library.alib.SomeClass) then the signature of this
 method will be changed to:
 public void someMethod(org.apache.hive.com.library.alib.SomeClass)

   * If Oozie is also using B directly meaning we'll have b.jar on our
 classpath, but with the unchanged signature,
 so when hive-exec tries to invoke someMethod then depending on
 whether b.jar coming from us will be loaded first or hive-exec will,
 we can end-up with a NoSuchMethodError is hive-exec tries to pass an
 org.apache.hive.com.library.alib.SomeClass instance to the
 someMethod which was loaded from the original b.jar.

Hence in Oozie a long time ago (OOZIE-2621
) the decision was
made to use the hive-exec-core Jar.

Now since the shading process actually removes those dependencies from
the hive-exec pom which are included in the fat Jar, we manually had to
add some dependencies to Oozie to compensate this.
However these dependencies are not used by Oozie directly and with the
growing features of hive-exec we had to repeat the same process
over-and-over which is a bit unmaintainable.

Today I'm writing to you to propose a long-term solution where basically
nothing would change in the generated hive artifacts, poms and the same
time we wouldn't have to manually declare dependencies in Oozie which
are not explicitly used by us.

The solution:

  1. We would create a new module named hive-exec-dependencies which
 would be a pom-packaging module without any Java source files.
  2. All the dependencies declared in hive-exec would be moved to
 hive-exec-dependencies.
  3. We would make the hive-exec-dependencies module the parent of
 hive-exec and with this hive-exec would still have access to the
 same dependencies as before.
  4. The maven shade plugin would still strip the dependencies from the
 generated hive-exec pom which are included in the fat Jar.
  5. And with a small maven plugin we'd change hive-exec's parent back
 from hive-exec-dependencies to the root hive project in the
 generated hive-exec pom file.

I have a change ready locally and it works as described above.

With this on the Oozie side we could add a dependency on
hive-exec-dependencies and hence all the required libraries which are
included in the fat Jar would be pulled into Oozie.
The next time a new dependency would be added to hive-exec-dependencies,
the Oozie build would pull it in automatically without us having to
explicitly declare it.

Please let me know what you think.

Best,
Dan

[jira] [Created] (HIVE-25531) Remove the core classified hive-exec artifact

2021-09-16 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25531:
---

 Summary: Remove the core classified hive-exec artifact
 Key: HIVE-25531
 URL: https://issues.apache.org/jira/browse/HIVE-25531
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


* this artifact was introduced in HIVE-7423 
* loading this artifact and the shaded hive-exec (along with the jdbc driver) 
could create interesting classpath problems
* if other projects have issues with the shaded hive-exec artifact we must 
start fix those problems



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25508) Partitioned tables created with CTAS queries doesnt have lineage informations

2021-09-09 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25508:
---

 Summary: Partitioned tables created with CTAS queries doesnt have 
lineage informations
 Key: HIVE-25508
 URL: https://issues.apache.org/jira/browse/HIVE-25508
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25485) Transform selects of literals under a UNION ALL to inline table scan

2021-08-26 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25485:
---

 Summary: Transform selects of literals under a UNION ALL to inline 
table scan
 Key: HIVE-25485
 URL: https://issues.apache.org/jira/browse/HIVE-25485
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich



{code}
select 1
union all
select 1
union all
[...]
union all
select 1
{code}

results in a very big plan; which will have vertexes proportional to the number 
of union all branch - hence it could be slow to execute it



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25404) Inserts inside merge statements are rewritten incorrectly for partitioned tables

2021-07-29 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25404:
---

 Summary: Inserts inside merge statements are rewritten incorrectly 
for partitioned tables
 Key: HIVE-25404
 URL: https://issues.apache.org/jira/browse/HIVE-25404
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


{code}
drop table u;drop table t;

create table t(value string default 'def') partitioned by (id integer);
create table u(id integer);
{code}

#1 id&value specified
rewritten
{code}
FROM
  `default`.`t`
  RIGHT OUTER JOIN
  `default`.`u`
  ON `t`.`id`=`u`.`id`
INSERT INTO `default`.`t` (`id`,`value`) partition (`id`)-- insert clause
  SELECT `u`.`id`,'x'
   WHERE `t`.`id` IS NULL
{code}
it should be
{code}
[...]
INSERT INTO `default`.`t` partition (`id`) (`value`)-- insert clause
[...]
{code}

#2 when values is not specified

{code}
merge into t using u on t.id=u.id when not matched then insert (id) values 
(u.id);
{code}

rewritten query:
{code}
FROM
  `default`.`t`
  RIGHT OUTER JOIN
  `default`.`u`
  ON `t`.`id`=`u`.`id`
INSERT INTO `default`.`t` (`id`) partition (`id`)-- insert clause
  SELECT `u`.`id`
   WHERE `t`.`id` IS NULL
{code}

it should be
{code}
[...]
INSERT INTO `default`.`t` partition (`id`) ()-- insert clause
[...]
{code}

however we don't accept empty column lists



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25395) Update hadoop to a more recent version

2021-07-27 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25395:
---

 Summary: Update hadoop to a more recent version
 Key: HIVE-25395
 URL: https://issues.apache.org/jira/browse/HIVE-25395
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


we are still depending on hadoop 3.1.0

which doesn't have source attachments - and makes development harder



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25378) Enable removal of old builds on hive ci

2021-07-23 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25378:
---

 Summary: Enable removal of old builds on hive ci
 Key: HIVE-25378
 URL: https://issues.apache.org/jira/browse/HIVE-25378
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


We are using the github plugin to run builds on PRs

However to remove old builds that plugin needs to have periodic branch scanning 
enabled - however since we also use the plugins merge mechanism; this will 
cause to rediscover all open PRs after there is a new commit on the target 
branch. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25370) Improve SharedWorkOptimizer performance

2021-07-22 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25370:
---

 Summary: Improve SharedWorkOptimizer performance
 Key: HIVE-25370
 URL: https://issues.apache.org/jira/browse/HIVE-25370
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


for queries which are unioning ~800 constant rows the SWO is doing around n*n/2 
operations trying to find 2 TS-es which could be merged

{code}
select constants
UNION ALL
...
UNION ALL
select constants
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25313) Upgrade commons-codec to 1.15

2021-07-07 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25313:
---

 Summary: Upgrade commons-codec to 1.15
 Key: HIVE-25313
 URL: https://issues.apache.org/jira/browse/HIVE-25313
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25312) Upgrade netty to 4.1.65.Final

2021-07-07 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25312:
---

 Summary: Upgrade netty to 4.1.65.Final
 Key: HIVE-25312
 URL: https://issues.apache.org/jira/browse/HIVE-25312
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25311) Slow compilation of union operators with >100 branches

2021-07-07 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25311:
---

 Summary: Slow compilation of union operators with >100 branches
 Key: HIVE-25311
 URL: https://issues.apache.org/jira/browse/HIVE-25311
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


during the processing of an N way union operator the full plan is cloned N 
times; which might hurt compilation time performance



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25290) Stabilize TestTxnHandler

2021-06-25 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25290:
---

 Summary: Stabilize TestTxnHandler
 Key: HIVE-25290
 URL: https://issues.apache.org/jira/browse/HIVE-25290
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


http://ci.hive.apache.org/job/hive-flaky-check/271/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25289) Fix external_jdbc_table3 and external_jdbc_table4

2021-06-25 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25289:
---

 Summary: Fix external_jdbc_table3 and external_jdbc_table4
 Key: HIVE-25289
 URL: https://issues.apache.org/jira/browse/HIVE-25289
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


http://ci.hive.apache.org/job/hive-flaky-check/265/
http://ci.hive.apache.org/job/hive-flaky-check/266/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25288) Fix TestMmCompactorOnTez

2021-06-25 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25288:
---

 Summary: Fix TestMmCompactorOnTez
 Key: HIVE-25288
 URL: https://issues.apache.org/jira/browse/HIVE-25288
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


http://ci.hive.apache.org/job/hive-flaky-check/240/





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25285) Retire HiveProjectJoinTransposeRule

2021-06-24 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25285:
---

 Summary: Retire HiveProjectJoinTransposeRule
 Key: HIVE-25285
 URL: https://issues.apache.org/jira/browse/HIVE-25285
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


we don't neccessary need our own rule anymore - a plain 
ProjectJoinTransposeRule  could probably work





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25278) HiveProjectJoinTransposeRule may do invalid transformations with windowing expressions

2021-06-23 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25278:
---

 Summary: HiveProjectJoinTransposeRule may do invalid 
transformations with windowing expressions 
 Key: HIVE-25278
 URL: https://issues.apache.org/jira/browse/HIVE-25278
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


running
{code}
create table table1 (acct_num string, interest_rate decimal(10,7)) stored as 
orc;
create table table2 (act_id string) stored as orc;
CREATE TABLE temp_output AS
SELECT act_nbr, row_num
FROM (SELECT t2.act_id as act_nbr,
row_number() over (PARTITION BY trim(acct_num) ORDER BY interest_rate DESC) AS 
row_num
FROM table1 t1
INNER JOIN table2 t2
ON trim(acct_num) = t2.act_id) t
WHERE t.row_num = 1;
{code}

may result in error like:

{code}
Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 
Invalid column reference 'interest_rate': (possible column names are: 
interest_rate, trim) (state=42000,code=4)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25267) Fix TestReplicationScenariosAcidTables

2021-06-18 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25267:
---

 Summary: Fix TestReplicationScenariosAcidTables
 Key: HIVE-25267
 URL: https://issues.apache.org/jira/browse/HIVE-25267
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


test is unstable
http://ci.hive.apache.org/job/hive-flaky-check/242/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25266) Fix TestWarehouseExternalDir

2021-06-18 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25266:
---

 Summary: Fix TestWarehouseExternalDir
 Key: HIVE-25266
 URL: https://issues.apache.org/jira/browse/HIVE-25266
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


test is unstable 
http://ci.hive.apache.org/job/hive-flaky-check/244/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25265) Fix TestHiveIcebergStorageHandlerWithEngine

2021-06-18 Thread Zoltan Haindrich (Jira)

Zoltan Haindrich created HIVE-25265:
---

 Summary: Fix TestHiveIcebergStorageHandlerWithEngine
 Key: HIVE-25265
 URL: https://issues.apache.org/jira/browse/HIVE-25265
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


test is unstable:
http://ci.hive.apache.org/job/hive-flaky-check/251/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25250) Fix TestHS2ImpersonationWithRemoteMS.testImpersonation

Zoltan Haindrich created HIVE-25250:
---

 Summary: Fix TestHS2ImpersonationWithRemoteMS.testImpersonation
 Key: HIVE-25250
 URL: https://issues.apache.org/jira/browse/HIVE-25250
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


http://ci.hive.apache.org/job/hive-flaky-check/235/testReport/org.apache.hive.service/TestHS2ImpersonationWithRemoteMS/testImpersonation/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25249) Fix TestWorker

Zoltan Haindrich created HIVE-25249:
---

 Summary: Fix TestWorker
 Key: HIVE-25249
 URL: https://issues.apache.org/jira/browse/HIVE-25249
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich



http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/1/

http://ci.hive.apache.org/job/hive-flaky-check/236/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25248) Fix .TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1

Zoltan Haindrich created HIVE-25248:
---

 Summary: Fix 
.TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
 Key: HIVE-25248
 URL: https://issues.apache.org/jira/browse/HIVE-25248
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


This test is failing randomly recently

http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (HIVE-25247) Fix TestWMMetricsWithTrigger