Re: Review Request 65756: HIVE-18060 UpdateInputAccessTimeHook fails for non-current database

2018-02-22 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65756/#review198127
---



Could you please add a test for this use case?
The one in the jira description is great

- Zoltan Haindrich


On Feb. 22, 2018, 3:06 p.m., Oleksiy Sayankin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65756/
> ---
> 
> (Updated Feb. 22, 2018, 3:06 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Zoltan Haindrich.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> initial commit
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/UpdateInputAccessTimeHook.java 
> c4856b1 
> 
> 
> Diff: https://reviews.apache.org/r/65756/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Oleksiy Sayankin
> 
>



Re: Review Request 65755: HIVE-18279 Incorrect condition in StatsOptimizer

2018-02-22 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65755/#review198126
---




ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java
Line 824 (original), 824 (patched)
<https://reviews.apache.org/r/65755/#comment278235>

As I've noted earlier in the Jira: a statistic with 0 rows is fine, but 
only if it's up to date (areBasicStatsUpToDate).

I don't think we should be excluding 0 in any way; that would just 
"cover up" some simple cases and make it harder to track down when the basic 
state is not maintained correctly.
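
A minimal sketch of the condition argued for above, assuming a boolean flag that 
says whether the basic stats are up to date. The class and method names are 
hypothetical and are not taken from StatsOptimizer; only areBasicStatsUpToDate 
is named in the review itself.

```java
import java.util.OptionalLong;

// Illustrative only: hypothetical names, not the actual StatsOptimizer code.
final class RowCountFromStats {
    /**
     * Returns the row count that may be answered from metadata, or empty
     * when the basic stats cannot be trusted and the data has to be scanned.
     */
    static OptionalLong usableRowCount(boolean basicStatsUpToDate, long numRows) {
        if (!basicStatsUpToDate) {
            return OptionalLong.empty();
        }
        // No special case for 0: an up-to-date count of 0 is a valid answer.
        return OptionalLong.of(numRows);
    }
}
```

With this shape, a wrong 0 can only come from stats that were marked up to date 
incorrectly, which is the bug one would want to see rather than hide.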


- Zoltan Haindrich


On Feb. 22, 2018, 2:50 p.m., Oleksiy Sayankin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65755/
> ---
> 
> (Updated Feb. 22, 2018, 2:50 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan, Zoltan Haindrich, and Zoltan 
> Haindrich.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Initial commit
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java 85f198b 
> 
> 
> Diff: https://reviews.apache.org/r/65755/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Oleksiy Sayankin
> 
>



reverting test-breaking changes

2018-02-22 Thread Zoltan Haindrich

Hello,

In the last couple of weeks the number of broken tests has started to go 
up...and even though I run bisect etc. from time to time, sometimes people 
don’t react to my comments/tickets/etc.


Because keeping this many failing tests around makes it easier for a new one 
to slip in...I think reverting the patch that introduced the test failures 
would also help in some cases.


I think it would help a lot in preventing further test breaks to revert the 
patch if any of the following conditions is met:


C1) if the notification/comment pointing out that the patch has indeed 
broken a test has been unanswered for at least 24 hours.


C2) if the patch has been in for 7 days, but the test failure is still not 
addressed (note that in this case there might be a conversation about 
fixing it...but enabling other people to work in a cleaner environment 
is more important than a single patch - and if it can't be fixed in 
7 days...well, it might not get fixed in a month).


I would like to also note that I've seen a few tickets which have been 
picked up by people who were not involved in creating the original 
change - and although the intention was good, they might miss the 
context of the original patch and may "fix" the tests in the wrong way: 
accept a q.out which is inappropriate or ignore the test...


Would it be OK to implement this from now on? It makes my 
efforts practically useless if people are not reacting…


Note: just to be on the same page - this is only about a single test 
which fails when run on its own - I feel that flaky tests are an entirely 
different topic.


cheers,

Zoltan



[jira] [Created] (HIVE-18759) Remove unconnected q.out-s

2018-02-20 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18759:
---

 Summary: Remove unconnected q.out-s
 Key: HIVE-18759
 URL: https://issues.apache.org/jira/browse/HIVE-18759
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Zoltan Haindrich


org.apache.hadoop.hive.cli.control.TestDanglingQOuts



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 65422: HIVE-17626

2018-02-20 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65422/
---

(Updated Feb. 20, 2018, 6:41 p.m.)


Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

preview


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3d777f992b 
  itests/src/test/resources/testconfiguration.properties d4f2e539fb 
  ql/src/java/org/apache/hadoop/hive/ql/AbstractReExecDriver.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/Context.java dba2dbb15b 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java d00e639643 
  ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java 60e8de8fd4 
  ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java 2a32a51588 
  ql/src/java/org/apache/hadoop/hive/ql/IDriver.java 9f13fa8e88 
  ql/src/java/org/apache/hadoop/hive/ql/ReExecOverlayDriver.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/ReOptimizeDriver.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 32fc257b03 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 199b181290 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 395a5f450f 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java
 8dd7cfe58c 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkEmptyKeyOperator.java
 134fc0ff0b 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java
 1eb72ce4d9 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkUniformHashOperator.java
 384bd74686 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/PrivateHookContext.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java 
b0cf3bd94e 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
27b53b8b33 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 9a3f81c98f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 78cbf25c43 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g e431271d3a 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java f9a6386ecf 
  ql/src/java/org/apache/hadoop/hive/ql/plan/Statistics.java 0057f0c2c6 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/EmptyStatsSource.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/GroupTransformer.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/PlanMapper.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/PlanMapperProcess.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/RuntimeStatsSource.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/SimpleRuntimeStatsSource.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/StatsSource.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/refs/OperatorRef.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/stats/OperatorStats.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/stats/OperatorStatsReaderHook.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAssertTrueOOM.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestOperatorCmp.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestReOptimization.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/retry_failure.q PRE-CREATION 
  ql/src/test/queries/clientpositive/retry_failure_oom.q PRE-CREATION 
  ql/src/test/queries/clientpositive/retry_failure_stat_changes.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/retry_failure.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/retry_failure_oom.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/retry_failure_stat_changes.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/65422/diff/2/

Changes: https://reviews.apache.org/r/65422/diff/1-2/


Testing
---


Thanks,

Zoltan Haindrich



Re: Review Request 65422: HIVE-17626

2018-02-19 Thread Zoltan Haindrich


> On Feb. 7, 2018, 1:58 a.m., Ashutosh Chauhan wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 3691 (patched)
> > <https://reviews.apache.org/r/65422/diff/1/?file=1950474#file1950474line3696>
> >
> > Instead of a config, this should be an explain modifier. We already have 
> > explain rewrite select ... Similarly, we can add explain reoptimize select ...

Yes...I agree; it turned out that it's very inconvenient to use it this way...

I've employed a semanticAnalyzer hook to handle the reoptimize keyword


> On Feb. 7, 2018, 1:58 a.m., Ashutosh Chauhan wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 5066 (patched)
> > <https://reviews.apache.org/r/65422/diff/1/?file=1950474#file1950474line5071>
> >
> > Instead of iterating over _this_ which can be very large, more 
> > efficient is to iterate on other list.

I wasn't aware that the iterator() creates a new map on the fly...I'm now 
using getProps() to get access to the actual values


> On Feb. 7, 2018, 1:58 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/AbstractReExecDriver.java
> > Lines 127 (patched)
> > <https://reviews.apache.org/r/65422/diff/1/?file=1950478#file1950478line127>
> >
> > Currently it's only re-executed once. Alternatively, we can keep 
> > re-running it if it fails again. E.g. in case of OOM, it's possible that 
> > there are many joins which are mis-planned, but we get stats only for the 
> > first join.
> > To avoid a very large number of retries we can limit it to some max number 
> > of attempts.

I agree
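
A rough sketch of the bounded re-execution idea discussed above, assuming the 
query attempt can be modelled as a plain Supplier. All names here are 
hypothetical; this is not the AbstractReExecDriver code, and a real driver would 
re-plan with the collected runtime stats between attempts.

```java
import java.util.function.Supplier;

// Illustrative only: hypothetical helper, not part of the patch under review.
final class BoundedReExecution {
    /**
     * Runs the attempt until it succeeds or maxAttempts is reached.
     * The last failure is rethrown when every attempt has failed.
     */
    static <T> T runWithRetries(Supplier<T> attempt, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.get();
            } catch (RuntimeException e) {
                // Assumption: a real driver would feed runtime stats into re-planning here.
                lastFailure = e;
            }
        }
        throw lastFailure;
    }
}
```

The cap keeps a query with several mis-planned joins from looping forever while 
still allowing more than one correction.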


> On Feb. 7, 2018, 1:58 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java
> > Lines 21 (patched)
> > <https://reviews.apache.org/r/65422/diff/1/?file=1950481#file1950481line21>
> >
> > Incorrect import ?

I've just taken a look at null analysis; but it detects too many issues to just 
turn on...so I'll remove it for now :)


> On Feb. 7, 2018, 1:58 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/refs/OperatorRef.java
> > Lines 50 (patched)
> > <https://reviews.apache.org/r/65422/diff/1/?file=1950503#file1950503line50>
> >
> > Instead of relying on ids, better is to use (and extend) logic in 
> > SharedWorkOptimizer::compareOperator() ?

t


- Zoltan


-------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65422/#review196950
---


On Jan. 30, 2018, 6:13 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65422/
> ---
> 
> (Updated Jan. 30, 2018, 6:13 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> preview
> 
> 
> Diffs
> -
> 
>   cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java a78e0c63d7 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b7d3e99e1a 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatCli.java 
> ad31287879 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatDriver.java 
> 533f0bcd6f 
>   itests/src/test/resources/testconfiguration.properties d86ff58840 
>   ql/src/java/org/apache/hadoop/hive/ql/AbstractReExecDriver.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/Context.java 820fbf0f58 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 74595b00f9 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java 49d2bf5f33 
>   ql/src/java/org/apache/hadoop/hive/ql/IDriver.java 6280be0b08 
>   ql/src/java/org/apache/hadoop/hive/ql/ReExecOverlayDriver.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/ReOptimizeDriver.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 76e85636d1 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 199b181290 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 
> 395a5f450f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java
>  8dd7cfe58c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkEmptyKeyOperator.java
>  134fc0ff0b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java
>  1eb72ce4d9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/ex

Re: Review Request 65422: HIVE-17626

2018-02-19 Thread Zoltan Haindrich


> On Feb. 16, 2018, 4:50 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/AbstractReExecDriver.java
> > Lines 131 (patched)
> > <https://reviews.apache.org/r/65422/diff/1/?file=1950478#file1950478line131>
> >
> > This is hackish.. as pointed above it needs to happen via explain 
> > modifier.

I agree


> On Feb. 16, 2018, 4:50 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java
> > Lines 21 (patched)
> > <https://reviews.apache.org/r/65422/diff/1/?file=1950481#file1950481line21>
> >
> > Use java's nonnull annotation.

I've not found any "standard" annotation...I may just as well remove these 
markers...
https://stackoverflow.com/questions/4963300/which-notnull-java-annotation-should-i-use/42695253#42695253


> On Feb. 16, 2018, 4:50 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ReOptimizeDriver.java
> > Lines 54 (patched)
> > <https://reviews.apache.org/r/65422/diff/1/?file=1950484#file1950484line54>
> >
> > Why is this needed?

this is not needed...but enables the user to set a different set of 
configuration during re-executions


> On Feb. 16, 2018, 4:50 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/PlanMapper.java
> > Lines 36 (patched)
> > <https://reviews.apache.org/r/65422/diff/1/?file=1950498#file1950498line36>
> >
> > A flat map of operators loses the hierarchical info in which operators 
> > are organized, which is a tree. So this match needs to happen via a 
> > sub-graph matching pattern. See SharedWorkOptimizer::areMergeable().

I will try to retain this concept for now at least. The idea: imagine 
that we have N operator stats gathered and the current plan consists of M 
operators; if we only have a cmp(A,B) oracle, we will have to do N*M 
comparisons, which could become really bad if N starts to become large...

I'm thinking of serving the existing operator infos in a map-like fashion - at 
least it should be visible as one from the outside.

If an operator could self-describe its whole context, then it could be 
matched...for example, the textual representation of a RelNode contains all the 
upstream operations as well, and so enables matching.

It looked promising to do it; I wanted to do it with HIVE-18703 - but 
unfortunately there were some complications...
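
To illustrate the map-like lookup described above, here is a small sketch that 
assumes each operator can produce a textual signature covering its upstream 
operators. PlanNode and RuntimeStatsIndex are made-up names for this example and 
do not correspond to classes in the patch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

// Illustrative only: a hypothetical operator node with its upstream operators.
final class PlanNode {
    final String description;
    final List<PlanNode> parents;

    PlanNode(String description, List<PlanNode> parents) {
        this.description = description;
        this.parents = parents;
    }

    /** Textual signature that includes every upstream operator. */
    String signature() {
        StringBuilder sb = new StringBuilder(description).append('(');
        for (int i = 0; i < parents.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(parents.get(i).signature());
        }
        return sb.append(')').toString();
    }
}

// Illustrative only: stats are keyed by signature, so matching is a hash lookup.
final class RuntimeStatsIndex {
    private final Map<String, Long> rowsBySignature = new HashMap<>();

    void record(PlanNode op, long observedRows) {
        rowsBySignature.put(op.signature(), observedRows);
    }

    OptionalLong lookup(PlanNode op) {
        Long rows = rowsBySignature.get(op.signature());
        return rows == null ? OptionalLong.empty() : OptionalLong.of(rows);
    }
}
```

With N recorded stats and M operators in the new plan this needs N inserts and 
M lookups instead of N*M pairwise cmp(A,B) calls, at the cost of building a 
signature per operator.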


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65422/#review197649
---


On Jan. 30, 2018, 6:13 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65422/
> ---
> 
> (Updated Jan. 30, 2018, 6:13 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> preview
> 
> 
> Diffs
> -
> 
>   cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java a78e0c63d7 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b7d3e99e1a 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatCli.java 
> ad31287879 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatDriver.java 
> 533f0bcd6f 
>   itests/src/test/resources/testconfiguration.properties d86ff58840 
>   ql/src/java/org/apache/hadoop/hive/ql/AbstractReExecDriver.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/Context.java 820fbf0f58 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 74595b00f9 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java 49d2bf5f33 
>   ql/src/java/org/apache/hadoop/hive/ql/IDriver.java 6280be0b08 
>   ql/src/java/org/apache/hadoop/hive/ql/ReExecOverlayDriver.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/ReOptimizeDriver.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 76e85636d1 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 199b181290 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 
> 395a5f450f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java
>  8dd7cfe58c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkEmptyKeyOperator.java
>  134fc0ff0b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java
>  1eb72ce4d9 
>   
> ql/src/java/org/apache/

[jira] [Created] (HIVE-18715) Remove index support from metastore

2018-02-14 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18715:
---

 Summary: Remove index support from metastore
 Key: HIVE-18715
 URL: https://issues.apache.org/jira/browse/HIVE-18715
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore, Standalone Metastore
Reporter: Zoltan Haindrich


Hive will not use this feature anymore; so if there are no other uses of it we 
might remove it from the metastore as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18714) Investigate usage of IndexPredicateAnalyzer in StorageHandlers

2018-02-14 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18714:
---

 Summary: Investigate usage of IndexPredicateAnalyzer in 
StorageHandlers
 Key: HIVE-18714
 URL: https://issues.apache.org/jira/browse/HIVE-18714
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] deprecation policy

2018-02-14 Thread Zoltan Haindrich
Hello,

Regarding dropping index support: I've thought about it a few times earlier 
because it complicates things; on the other hand I think there is no real user 
base behind it because parquet/orc do a similar (better) optimization - 
without the extra need for SQL DDLs...the ticket has reinforced my point of 
view.
Regarding locks and HPL/SQL I'm not sure...but I think people are starting to 
watch tickets, and by doing so keep an eye on these things...

cheers,
Zoltan


On 13 Feb 2018 3:45 p.m., Peter Vary  wrote:
Hi Team,

I am seeing several jiras proposing removing functions from Hive:
HIVE-18691 Drop Support for Explicit Table Lock From Apache Hive
HIVE-18692 Remove HPL/SQL From Apache Hive
HIVE-18448 Drop Support For Indexes From Apache Hive
I generally agree with removing unused functions. Especially ones which add 
extra effort to upkeep without real user base.
My question is whether Hive has a policy for deprecating functions, 
like Hadoop:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#Deprecation
 


Thanks,
Peter



[jira] [Created] (HIVE-18703) Make Operator comparision to be based on some primitive

2018-02-13 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18703:
---

 Summary: Make Operator comparision to be based on some primitive
 Key: HIVE-18703
 URL: https://issues.apache.org/jira/browse/HIVE-18703
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


Currently we have {{Operator.isSame(op)}}, which can answer whether 2 
operators are equal; it would be great to introduce a simple object on which 
the comparison happens, and that could also enable looking up operators in 
a set.
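
One possible shape for such a "simple object", sketched under the assumption 
that an operator can be reduced to its type plus a list of setting strings. 
OpSignature is a hypothetical name and this is not an existing Hive class.

```java
import java.util.Arrays;

// Illustrative only: a value object that carries what the comparison needs.
final class OpSignature {
    private final String operatorType;
    private final String[] settings; // e.g. predicate text or key expressions

    OpSignature(String operatorType, String... settings) {
        this.operatorType = operatorType;
        this.settings = settings.clone();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof OpSignature)) {
            return false;
        }
        OpSignature that = (OpSignature) other;
        return operatorType.equals(that.operatorType)
            && Arrays.equals(settings, that.settings);
    }

    @Override
    public int hashCode() {
        // Consistent with equals, so the object works as a HashSet/HashMap key.
        return 31 * operatorType.hashCode() + Arrays.hashCode(settings);
    }
}
```

Because equals and hashCode agree, two operators with the same signature land 
in the same bucket of a HashSet, which is what makes the set lookup possible.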



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18701) Fix TestMiniSparkOnYarnCliDriver#testCliDriver[spark_opt_shuffle_serde]

2018-02-13 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18701:
---

 Summary: Fix 
TestMiniSparkOnYarnCliDriver#testCliDriver[spark_opt_shuffle_serde]
 Key: HIVE-18701
 URL: https://issues.apache.org/jira/browse/HIVE-18701
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


seems to be broken by HIVE-18389



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18700) qtests: authorization_update_own_table.q breaks row__id.q

2018-02-13 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18700:
---

 Summary: qtests: authorization_update_own_table.q breaks row__id.q
 Key: HIVE-18700
 URL: https://issues.apache.org/jira/browse/HIVE-18700
 Project: Hive
  Issue Type: Bug
  Components: Test
Reporter: Zoltan Haindrich


{code}
mvn install -Pitests -pl itests/qtest -am -Dtest=TestCliDriver \
  -Dqfile=authorization_update_own_table.q,row__id.q
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18698) Fix TestMiniLlapLocalCliDriver#testCliDriver[bucket_map_join_tez1]

2018-02-13 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18698:
---

 Summary: Fix 
TestMiniLlapLocalCliDriver#testCliDriver[bucket_map_join_tez1]
 Key: HIVE-18698
 URL: https://issues.apache.org/jira/browse/HIVE-18698
 Project: Hive
  Issue Type: Bug
  Components: Test
Reporter: Zoltan Haindrich


HIVE-18416 has made some extra, unrelated stat updates to the q.out



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 65631: HIVE-18635 hooks

2018-02-13 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65631/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-18635
https://issues.apache.org/jira/browse/HIVE-18635


Repository: hive-git


Description
---

Make it possible to add hooks using the Java API; it was possible earlier as 
well, but it was unnecessarily overcomplicated...


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
2d7e4597de623d892702cae6e732ec5eb09d87da 
  ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/QueryLifeTimeHookRunner.java 
53d716bceb98c2ced3a3ba3f0cd607766447dfd9 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/HookUtils.java 
dbd258a2cc0f7736077d8aab85c937e6a4c604bd 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/HooksLoader.java 
5a370e89a9e93da2191431acc1a66c69c94d1372 
  ql/src/test/org/apache/hadoop/hive/ql/hooks/TestQueryHooks.java 
06a96d5a39edbf442a8ed462d6eeb91522a8338d 
  ql/src/test/results/clientnegative/bad_exec_hooks.q.out 
6f2ee5c41042eef9347471711215fb753887137d 
  service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
f3e08a9e64bd759d0e89dd19653905ff1b53c0a7 


Diff: https://reviews.apache.org/r/65631/diff/1/


Testing
---


Thanks,

Zoltan Haindrich



[jira] [Created] (HIVE-18695) fix TestAccumuloCliDriver.testCliDriver[accumulo_queries]

2018-02-13 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18695:
---

 Summary: fix TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 Key: HIVE-18695
 URL: https://issues.apache.org/jira/browse/HIVE-18695
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


seems to be broken by HIVE-15680



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18694) Fix TestHiveCli test

2018-02-13 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18694:
---

 Summary: Fix TestHiveCli test
 Key: HIVE-18694
 URL: https://issues.apache.org/jira/browse/HIVE-18694
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


seems to be broken by HIVE-18493



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 65543: HIVE-18238

2018-02-12 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65543/
---

(Updated Feb. 12, 2018, 9:20 p.m.)


Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-18238
https://issues.apache.org/jira/browse/HIVE-18238


Repository: hive-git


Description
---

hs2 already uses a separate driver instance to execute commands; these changes 
make the cli do the same as well - since we are using the cli to run tests...


Diffs (updated)
-

  cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java e57412aca0 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatCli.java 
a36b0db3a5 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatDriver.java 
e112412f75 
  hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestPermsGrp.java 
4dbf7acb9d 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoaderEncryption.java
 1560571931 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerShowFilters.java
 89812232d4 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java c6f7d6459e 
  ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java 49d2bf5f33 
  ql/src/java/org/apache/hadoop/hive/ql/IDriver.java d4494cc72e 
  ql/src/java/org/apache/hadoop/hive/ql/QueryState.java d8d19e8606 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/HooksLoader.java 5a370e89a9 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java cf8bc7f256 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManagerImpl.java 
d750e77215 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7ed9fe4fbb 
  ql/src/java/org/apache/hadoop/hive/ql/processors/AddResourceProcessor.java 
5fcbd69644 
  ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessor.java 
c7532648f4 
  ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessorFactory.java 
dcf8d31eaf 
  ql/src/java/org/apache/hadoop/hive/ql/processors/CompileProcessor.java 
fad4f52c03 
  ql/src/java/org/apache/hadoop/hive/ql/processors/CryptoProcessor.java 
d1202f92bb 
  ql/src/java/org/apache/hadoop/hive/ql/processors/DeleteResourceProcessor.java 
54a7d4b0de 
  ql/src/java/org/apache/hadoop/hive/ql/processors/DfsProcessor.java 62a1725114 
  ql/src/java/org/apache/hadoop/hive/ql/processors/ListResourceProcessor.java 
91a6aba62f 
  ql/src/java/org/apache/hadoop/hive/ql/processors/ReloadProcessor.java 
4caab916ff 
  ql/src/java/org/apache/hadoop/hive/ql/processors/ResetProcessor.java 
ca39ff98ce 
  ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java 1ff4b3c947 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 26c6700d99 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
1e83799e03 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 048215aa37 
  ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java 93074e998d 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java 635a357181 
  ql/src/test/org/apache/hadoop/hive/ql/hooks/TestQueryHooks.java 06a96d5a39 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
71d960f4c9 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDummyTxnManager.java 
2f5fc2fff1 
  ql/src/test/queries/clientpositive/driver_conf_isolation.q PRE-CREATION 
  ql/src/test/queries/clientpositive/special_character_in_tabnames_1.q 
c017172fd1 
  ql/src/test/results/clientpositive/driver_conf_isolation.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/input39.q.out 3000404448 
  service/src/java/org/apache/hive/service/cli/operation/Operation.java 
2ef1479540 


Diff: https://reviews.apache.org/r/65543/diff/2/

Changes: https://reviews.apache.org/r/65543/diff/1-2/


Testing
---


Thanks,

Zoltan Haindrich



Re: Review Request 65543: HIVE-18238

2018-02-12 Thread Zoltan Haindrich


> On Feb. 12, 2018, 7:38 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/IDriver.java
> > Line 66 (original), 68 (patched)
> > <https://reviews.apache.org/r/65543/diff/1/?file=1953888#file1953888line68>
> >
> > HCat cli will be removed soon, so we may ignore that.

I've removed the fixme...I've been using this conf getter elsewhere also


> On Feb. 12, 2018, 7:38 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/QueryState.java
> > Lines 181 (patched)
> > <https://reviews.apache.org/r/65543/diff/1/?file=1953889#file1953889line192>
> >
> > I don't see this being used anywhere. What's the purpose of this?

I was sure I've removed this...this was only used during narrowing down the 
problems this patch caused at first.


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65543/#review197292
-------


On Feb. 7, 2018, 9:36 a.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65543/
> ---
> 
> (Updated Feb. 7, 2018, 9:36 a.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-18238
> https://issues.apache.org/jira/browse/HIVE-18238
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> hs2 already uses separate driver instance to execute commands ; these changes 
> make the cli also do the same - since we are using cli to run tests...
> 
> 
> Diffs
> -
> 
>   cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 
> a78e0c63d792230c19493514be1ed7bd992eeebe 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatCli.java 
> ad31287879930386838ca19533ce67df08349dc1 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatDriver.java 
> 533f0bcd6f5a12606de46a1b986270789dd52233 
>   hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestPermsGrp.java 
> 4dbf7acb9d569dd608b83ca7db0f6e18d0c3d02b 
>   
> hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoaderEncryption.java
>  b70a9529d895fddc3cb7debc37dba36a1afb4c7e 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerShowFilters.java
>  2be86f8b4259de3e5c1d78d9bdd2c3a53505e3c4 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
> 74595b00f9d35e3a850b2ef3550d2909a83880cf 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java 
> 49d2bf5f335c6806460fb6b83ee4da8bf842bd5a 
>   ql/src/java/org/apache/hadoop/hive/ql/IDriver.java 
> 6280be0b08a657a452cc39dacb42c9cf396bd880 
>   ql/src/java/org/apache/hadoop/hive/ql/QueryState.java 
> d8d19e86061846c446fbb60e7ed827c6ba6fb7fc 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/HooksLoader.java 
> 5a370e89a9e93da2191431acc1a66c69c94d1372 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> cf8bc7f256db4a4f559aa1c79c76201de620f141 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManagerImpl.java 
> d750e772150088295643b99d4496b3ded2e104c0 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> b67a03f2138cc5f47135f4a6ecf55dd3bd1c20fc 
>   ql/src/java/org/apache/hadoop/hive/ql/processors/AddResourceProcessor.java 
> 5fcbd6964429dd6aa7c70ded47fa862b356cfd88 
>   ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessor.java 
> 3624d08ee82ecb4822767312e9cfdaf3fea5c98b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessorFactory.java 
> dcf8d31eaf769ca198e9877c73bf4bce81d3f3c4 
>   ql/src/java/org/apache/hadoop/hive/ql/processors/CompileProcessor.java 
> 07d70ab5634044f48dc3401185ab7c132dbb7476 
>   ql/src/java/org/apache/hadoop/hive/ql/processors/CryptoProcessor.java 
> 6825dd83a4953eb97279f9e39deca072d057dfe7 
>   
> ql/src/java/org/apache/hadoop/hive/ql/processors/DeleteResourceProcessor.java 
> 54a7d4b0de6fd8d79517b616ed2fff24e1e36370 
>   ql/src/java/org/apache/hadoop/hive/ql/processors/DfsProcessor.java 
> 2f288ce8b8234b2f3da58729f998124f6858608b 
>   ql/src/java/org/apache/hadoop/hive/ql/processors/ListResourceProcessor.java 
> 7ec36be61db680e2ce2471d13c954ac87935f473 
>   ql/src/java/org/apache/hadoop/hive/ql/processors/ReloadProcessor.java 
> b82bd5ce99b47e81b88984f47cce625c97170c34 
>   ql/src/java/org/apache/hadoop/hive/ql/processors/ResetProcessor.java 
> 144f5223d3e8664cc424c40abe046b23c97382c8 
>   ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcess

[jira] [Created] (HIVE-18681) use the same jackson library consistently

2018-02-12 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18681:
---

 Summary: use the same jackson library consistently
 Key: HIVE-18681
 URL: https://issues.apache.org/jira/browse/HIVE-18681
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


Currently Hive uses both org.codehaus.jackson and 
com.fasterxml.jackson.core; it would be great to migrate everything to the 
latter.

more info:
https://stackoverflow.com/questions/30782706/org-codehaus-jackson-versus-com-fasterxml-jackson-core
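
A first step for such a migration could be to find out how many imports of each family exist. The following is only a rough sketch of that idea: `JacksonImportAudit` and `countImports` are hypothetical names made up for illustration, it is not part of Hive, and it only looks at import lines handed to it as strings.

```java
import java.util.Arrays;
import java.util.List;

final class JacksonImportAudit {
    // Package prefixes of the two Jackson families mentioned above.
    static final String LEGACY_PREFIX = "org.codehaus.jackson";
    static final String CURRENT_PREFIX = "com.fasterxml.jackson";

    /** Counts the import statements whose target starts with the given package prefix. */
    static int countImports(List<String> sourceLines, String packagePrefix) {
        int count = 0;
        for (String line : sourceLines) {
            String trimmed = line.trim();
            if (!trimmed.startsWith("import ")) {
                continue; // not an import statement (this also skips commented-out imports)
            }
            String target = trimmed.substring("import ".length()).trim();
            if (target.startsWith("static ")) {
                target = target.substring("static ".length()).trim();
            }
            if (target.startsWith(packagePrefix + ".")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
            "import org.codehaus.jackson.map.ObjectMapper;",
            "import com.fasterxml.jackson.databind.ObjectMapper;",
            "import java.util.List;");
        System.out.println("legacy imports:  " + countImports(source, LEGACY_PREFIX));
        System.out.println("current imports: " + countImports(source, CURRENT_PREFIX));
    }
}
```

Note that the class names do not map one-to-one between the two families (for example `org.codehaus.jackson.map.ObjectMapper` versus `com.fasterxml.jackson.databind.ObjectMapper`), so a count like this only helps to size the work.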



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18668) Really shade guava in ql

2018-02-09 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18668:
---

 Summary: Really shade guava in ql
 Key: HIVE-18668
 URL: https://issues.apache.org/jira/browse/HIVE-18668
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


After HIVE-15393 a test started to fail in druid; after some investigation it 
turned out that ql doesn't shade its guava artifact at all, because it shades 
'com.google.guava' instead of 'com.google.common'.
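
To illustrate why such a rule has no effect: relocation rules match on the package prefix of the classes, and Guava's classes live under `com.google.common`, while `com.google.guava` is only the Maven groupId. Below is a minimal sketch of prefix-based relocation; `ShadeRelocationSketch`, `relocate` and the `org.apache.hive.` target prefix are hypothetical names for illustration only, not the maven-shade-plugin API and not Hive's actual configuration.

```java
final class ShadeRelocationSketch {
    /**
     * Rewrites className if it lives under the package given by pattern,
     * mimicking a prefix-based relocation rule. Otherwise the name is returned unchanged.
     */
    static String relocate(String className, String pattern, String shadedPattern) {
        if (className.equals(pattern) || className.startsWith(pattern + ".")) {
            return shadedPattern + className.substring(pattern.length());
        }
        return className; // no match: the class stays where it was
    }

    public static void main(String[] args) {
        String guavaClass = "com.google.common.collect.ImmutableList";
        // The groupId used as pattern: no Guava class matches, nothing is relocated.
        System.out.println(relocate(guavaClass, "com.google.guava",
            "org.apache.hive.com.google.guava"));
        // The real package prefix used as pattern: the class is relocated.
        System.out.println(relocate(guavaClass, "com.google.common",
            "org.apache.hive.com.google.common"));
    }
}
```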








Review Request 65543: HIVE-18238

2018-02-07 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65543/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-18238
https://issues.apache.org/jira/browse/HIVE-18238


Repository: hive-git


Description
---

HS2 already uses a separate driver instance to execute commands; these changes 
make the CLI do the same, since we use the CLI to run tests.


Diffs
-

  cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 
a78e0c63d792230c19493514be1ed7bd992eeebe 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatCli.java 
ad31287879930386838ca19533ce67df08349dc1 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatDriver.java 
533f0bcd6f5a12606de46a1b986270789dd52233 
  hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestPermsGrp.java 
4dbf7acb9d569dd608b83ca7db0f6e18d0c3d02b 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoaderEncryption.java
 b70a9529d895fddc3cb7debc37dba36a1afb4c7e 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerShowFilters.java
 2be86f8b4259de3e5c1d78d9bdd2c3a53505e3c4 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
74595b00f9d35e3a850b2ef3550d2909a83880cf 
  ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java 
49d2bf5f335c6806460fb6b83ee4da8bf842bd5a 
  ql/src/java/org/apache/hadoop/hive/ql/IDriver.java 
6280be0b08a657a452cc39dacb42c9cf396bd880 
  ql/src/java/org/apache/hadoop/hive/ql/QueryState.java 
d8d19e86061846c446fbb60e7ed827c6ba6fb7fc 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/HooksLoader.java 
5a370e89a9e93da2191431acc1a66c69c94d1372 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
cf8bc7f256db4a4f559aa1c79c76201de620f141 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManagerImpl.java 
d750e772150088295643b99d4496b3ded2e104c0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
b67a03f2138cc5f47135f4a6ecf55dd3bd1c20fc 
  ql/src/java/org/apache/hadoop/hive/ql/processors/AddResourceProcessor.java 
5fcbd6964429dd6aa7c70ded47fa862b356cfd88 
  ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessor.java 
3624d08ee82ecb4822767312e9cfdaf3fea5c98b 
  ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessorFactory.java 
dcf8d31eaf769ca198e9877c73bf4bce81d3f3c4 
  ql/src/java/org/apache/hadoop/hive/ql/processors/CompileProcessor.java 
07d70ab5634044f48dc3401185ab7c132dbb7476 
  ql/src/java/org/apache/hadoop/hive/ql/processors/CryptoProcessor.java 
6825dd83a4953eb97279f9e39deca072d057dfe7 
  ql/src/java/org/apache/hadoop/hive/ql/processors/DeleteResourceProcessor.java 
54a7d4b0de6fd8d79517b616ed2fff24e1e36370 
  ql/src/java/org/apache/hadoop/hive/ql/processors/DfsProcessor.java 
2f288ce8b8234b2f3da58729f998124f6858608b 
  ql/src/java/org/apache/hadoop/hive/ql/processors/ListResourceProcessor.java 
7ec36be61db680e2ce2471d13c954ac87935f473 
  ql/src/java/org/apache/hadoop/hive/ql/processors/ReloadProcessor.java 
b82bd5ce99b47e81b88984f47cce625c97170c34 
  ql/src/java/org/apache/hadoop/hive/ql/processors/ResetProcessor.java 
144f5223d3e8664cc424c40abe046b23c97382c8 
  ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java 
1ff4b3c9479fccc0bc6c137eb0d1be7953594c4f 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 
4508e59a8bdde571740a1e5c08652c564958d350 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
d56002d192c9faf5a656386fc933f7d84e1b1204 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 
048215aa376e4b561cf5fe07bec88397441dc7fb 
  ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java 
93074e998d42147eae54f0993d9ee53072e2e36a 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java 
df19d72411cb901a996f2d3ff0ee6d0b6ae200c7 
  ql/src/test/org/apache/hadoop/hive/ql/hooks/TestQueryHooks.java 
06628750e250c449d5695640cffda4f8ac924a17 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
71d960f4c99b6874d35e7547c73d3a787c04b864 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDummyTxnManager.java 
2f5fc2fff1931a7c57f6f4805d6096aa1a09ae59 
  ql/src/test/queries/clientpositive/driver_conf_isolation.q PRE-CREATION 
  ql/src/test/queries/clientpositive/special_character_in_tabnames_1.q 
c017172fd18b94fd870a2dcd79e0d205fc4076cc 
  ql/src/test/results/clientpositive/driver_conf_isolation.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/input39.q.out 
3000404448c05b57c4bced986e8e2d3b305848b2 
  service/src/java/org/apache/hive/service/cli/operation/Operation.java 
2ef14795401cd9be786494f4652b844e3fb4b283 


Diff: https://reviews.apache.org/r/65543/diff/1/


Testing
---


Thanks,

Zoltan Haindrich



Re: Review Request 65479: HIVE-18523 Fix summary row in case there are no inputs

2018-02-06 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65479/
---

(Updated Feb. 6, 2018, 5:15 p.m.)


Review request for hive, Ashutosh Chauhan and Prasanth_J.


Changes
---

update


Bugs: HIVE-18523
https://issues.apache.org/jira/browse/HIVE-18523


Repository: hive-git


Description
---

* ensure that mapper operators are started up - but only if empty grouping is 
present


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 6a0f0de2a5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/IConfigureJobConf.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 941dd58f27 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 976b537033 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapRunner.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 150382a8d5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
98f4bc01c8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 
45d809a182 
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
e4dfc009d9 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java c3b846c4d2 
  ql/src/java/org/apache/hadoop/hive/ql/io/NullRowsInputFormat.java 6a372a3f47 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java f2b2fc57a0 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java ecfb118b41 
  ql/src/test/queries/clientpositive/groupby_rollup_empty.q 432d8c448a 
  ql/src/test/results/clientpositive/groupby_rollup_empty.q.out 7359140e29 
  ql/src/test/results/clientpositive/llap/groupby_rollup_empty.q.out d2b57455a3 


Diff: https://reviews.apache.org/r/65479/diff/3/

Changes: https://reviews.apache.org/r/65479/diff/2-3/


Testing
---

added new testcase for union


Thanks,

Zoltan Haindrich



Re: Review Request 65479: HIVE-18523 Fix summary row in case there are no inputs

2018-02-06 Thread Zoltan Haindrich


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java
> > Lines 213 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951925#file1951925line219>
> >
> > Do we need this? Reducers are always launched when there is no 
> > mapper. So, this seems unnecessary.
> 
> Zoltan Haindrich wrote:
> I can remove it...but I think IConfigureJobConf can be used as a general 
> way to make modifications like this - in the current case this setting has 
> effectively no effect on the reducers...they are being run anyway
> 
> I think that after this patch the code from PlanUtils.configureJobConf 
> could be moved to FileSinkOperator by using the new interface
> I feel that core constructs like ReduceWork should depend less on 
> explicit operator implementations and more on interfaces if possible.
> 
> Ashutosh Chauhan wrote:
> I agree that a general mechanism has some value. However, my concern is 
> extra CPU we will be wasting doing nothing. In a large plan with 100s of 
> vertices going through all RSs will be non-zero cost we will be paying. So, 
> unless we need it we should avoid it.

I see your point, I've removed it.


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65479/#review196840
---


On Feb. 6, 2018, 10:24 a.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65479/
> ---
> 
> (Updated Feb. 6, 2018, 10:24 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Prasanth_J.
> 
> 
> Bugs: HIVE-18523
> https://issues.apache.org/jira/browse/HIVE-18523
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * ensure that mapper operators are started up - but only if empty grouping is 
> present
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 6a0f0de2a5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/IConfigureJobConf.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 941dd58f27 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 976b537033 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapRunner.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 150382a8d5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
> 98f4bc01c8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 
> 45d809a182 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
> e4dfc009d9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java c3b846c4d2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/NullRowsInputFormat.java 
> 6a372a3f47 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java f2b2fc57a0 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java ecfb118b41 
>   ql/src/test/queries/clientpositive/groupby_rollup_empty.q 432d8c448a 
>   ql/src/test/results/clientpositive/groupby_rollup_empty.q.out 7359140e29 
>   ql/src/test/results/clientpositive/llap/groupby_rollup_empty.q.out 
> d2b57455a3 
> 
> 
> Diff: https://reviews.apache.org/r/65479/diff/2/
> 
> 
> Testing
> ---
> 
> added new testcase for union
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>



[jira] [Created] (HIVE-18635) Generalize hook dispatch logics in Driver

2018-02-06 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18635:
---

 Summary: Generalize hook dispatch logics in Driver
 Key: HIVE-18635
 URL: https://issues.apache.org/jira/browse/HIVE-18635
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


Currently it is only possible to "add" new hooks by either hard-coding them or 
"pasting" the class name into the hiveconf value. It would be good to make it 
possible to add hooks through some API as well.
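
A rough sketch of what the two registration styles could look like side by side. All names here (`QueryHook`, `HookRegistry`, `addByClassNames`, `addHook`) are hypothetical and only illustrate the idea; they are not the existing Hive hook interfaces or the Driver code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a hook interface. */
interface QueryHook {
    String run(String query);
}

/** A hook with a no-arg constructor, so it can be created from its class name. */
final class UpperCaseHook implements QueryHook {
    @Override
    public String run(String query) {
        return query.toUpperCase();
    }
}

final class HookRegistry {
    private final List<QueryHook> hooks = new ArrayList<>();

    /** Config-style registration: instantiate hooks from a comma-separated class name list. */
    void addByClassNames(String csv) {
        for (String name : csv.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                Class<?> cls = Class.forName(trimmed);
                hooks.add((QueryHook) cls.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("cannot load hook: " + trimmed, e);
            }
        }
    }

    /** API-style registration: accept an already constructed hook instance. */
    void addHook(QueryHook hook) {
        hooks.add(hook);
    }

    /** Runs every registered hook in registration order and collects the results. */
    List<String> dispatch(String query) {
        List<String> results = new ArrayList<>();
        for (QueryHook hook : hooks) {
            results.add(hook.run(query));
        }
        return results;
    }

    public static void main(String[] args) {
        HookRegistry registry = new HookRegistry();
        registry.addByClassNames(UpperCaseHook.class.getName()); // what a conf value would carry
        registry.addHook(q -> "len=" + q.length());              // added programmatically
        System.out.println(registry.dispatch("select 1"));
    }
}
```

With an API like `addHook`, a test could register a lambda or a preconfigured object directly, which is not possible when only class names can be passed around.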





[jira] [Created] (HIVE-18634) Spark artifact alters classpath for tests

2018-02-06 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18634:
---

 Summary: Spark artifact alters classpath for tests
 Key: HIVE-18634
 URL: https://issues.apache.org/jira/browse/HIVE-18634
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


TestTriggersMoveWorkloadManager seems to be sensitive to something. While 
looking into it, I compared the logs of a passing and a failing run and 
noticed that some log messages are similar, but different.

It seems to me that with {{-DskipSparkTests}} the test is run with a 
different version of jetty:

{code}
kirk@savara:~/projects/toolbox$ grep request.log /tmp/hl.bad 
2018-02-06T02:04:38,421  WARN [main] http.HttpRequestLog: Jetty request log can 
only be enabled using Log4j
2018-02-06T02:04:39,012  WARN [main] http.HttpRequestLog: Jetty request log can 
only be enabled using Log4j
2018-02-06T02:04:39,329  WARN [main] http.HttpRequestLog: Jetty request log can 
only be enabled using Log4j
2018-02-06T02:04:39,497  WARN [main] http.HttpRequestLog: Jetty request log can 
only be enabled using Log4j
2018-02-06T02:04:39,744  WARN [main] http.HttpRequestLog: Jetty request log can 
only be enabled using Log4j
2018-02-06T02:04:41,991  WARN [main] http.HttpRequestLog: Jetty request log can 
only be enabled using Log4j
2018-02-06T02:04:43,032  WARN [main] http.HttpRequestLog: Jetty request log can 
only be enabled using Log4j
2018-02-06T02:04:43,404  WARN [main] http.HttpRequestLog: Jetty request log can 
only be enabled using Log4j
kirk@savara:~/projects/toolbox$ grep request.log /tmp/hl.good 
2018-02-06T01:34:39,990  INFO [main] http.HttpRequestLog: Http request log for 
http.requests.namenode is not defined
2018-02-06T01:34:40,513  INFO [main] http.HttpRequestLog: Http request log for 
http.requests.datanode is not defined
2018-02-06T01:34:40,736  INFO [main] http.HttpRequestLog: Http request log for 
http.requests.datanode is not defined
2018-02-06T01:34:40,898  INFO [main] http.HttpRequestLog: Http request log for 
http.requests.datanode is not defined
2018-02-06T01:34:41,033  INFO [main] http.HttpRequestLog: Http request log for 
http.requests.datanode is not defined
2018-02-06T01:34:44,070  INFO [main] http.HttpRequestLog: Http request log for 
http.requests.resourcemanager is not defined
2018-02-06T01:34:45,099  INFO [main] http.HttpRequestLog: Http request log for 
http.requests.nodemanager is not defined
2018-02-06T01:34:45,423  INFO [main] http.HttpRequestLog: Http request log for 
http.requests.nodemanager is not defined
{code}






Re: Review Request 65479: HIVE-18523 Fix summary row in case there are no inputs

2018-02-06 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65479/
---

(Updated Feb. 6, 2018, 10:24 a.m.)


Review request for hive, Ashutosh Chauhan and Prasanth_J.


Changes
---

address review comments


Bugs: HIVE-18523
https://issues.apache.org/jira/browse/HIVE-18523


Repository: hive-git


Description
---

* ensure that mapper operators are started up - but only if empty grouping is 
present


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 6a0f0de2a5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/IConfigureJobConf.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 941dd58f27 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 976b537033 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapRunner.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 150382a8d5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
98f4bc01c8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 
45d809a182 
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
e4dfc009d9 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java c3b846c4d2 
  ql/src/java/org/apache/hadoop/hive/ql/io/NullRowsInputFormat.java 6a372a3f47 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java f2b2fc57a0 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java ecfb118b41 
  ql/src/test/queries/clientpositive/groupby_rollup_empty.q 432d8c448a 
  ql/src/test/results/clientpositive/groupby_rollup_empty.q.out 7359140e29 
  ql/src/test/results/clientpositive/llap/groupby_rollup_empty.q.out d2b57455a3 


Diff: https://reviews.apache.org/r/65479/diff/2/

Changes: https://reviews.apache.org/r/65479/diff/1-2/


Testing
---

added new testcase for union


Thanks,

Zoltan Haindrich



Re: Review Request 65479: HIVE-18523 Fix summary row in case there are no inputs

2018-02-06 Thread Zoltan Haindrich


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/IConfigureJobConf.java
> > Lines 24 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951914#file1951914line24>
> >
> > Add: Intended only for compilation phase.

added


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java
> > Lines 259 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951916#file1951916line259>
> >
> > Is this needed?

unfortunately yes; the following makes it needed:

* we are currently using the mapreduce "old" api
* [MapRunner#map](https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapRunner.java#L54) only passes the OutputCollector if there is *at least* one input
* [ExecMapper](https://github.com/apache/hive/blob/f33db1f68c68b552b9888988f818c03879749461/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java#L144) sets the OC correctly at the moment the first record arrives
* note: it's interesting that the ReduceSink needs the OutputCollector to pass 
the output...but it can "silently" ignore records if the OC is unset 
[here](https://github.com/apache/hive/blob/f33db1f68c68b552b9888988f818c03879749461/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java#L500); not sure whether this has already hidden bugs or not...
* ExecMapRunner only adds the ability to set the OC in case there are 0 rows, 
and that enables the closeOp() methods to emit records if they have to
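
The problem described in the list above can be sketched without any Hadoop dependency. This is only a simplified model: `Collector`, `SummaryMapper` and `MapRunnerSketch` are hypothetical stand-ins, not the real OutputCollector, ExecMapper or ExecMapRunner classes, and the "rows=N" record just plays the role of a summary row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for an output collector. */
interface Collector {
    void collect(String record);
}

/** Stand-in mapper: sees the collector only through map(), emits a summary in close(). */
final class SummaryMapper {
    private Collector out; // stays null until someone provides it
    private long rows;

    void setCollector(Collector collector) {
        out = collector;
    }

    void map(String row, Collector collector) {
        if (out == null) {
            out = collector; // captured when the first record arrives
        }
        rows++;
    }

    void close() {
        if (out != null) {
            out.collect("rows=" + rows);
        }
        // with no collector the summary is silently dropped
    }
}

final class MapRunnerSketch {
    /** Models a runner that hands the collector to the mapper only via map(). */
    static List<String> runPlain(List<String> input) {
        List<String> sink = new ArrayList<>();
        SummaryMapper mapper = new SummaryMapper();
        for (String row : input) {
            mapper.map(row, sink::add);
        }
        mapper.close();
        return sink;
    }

    /** Models the fix: when there was no input, set the collector before close(). */
    static List<String> runWithEmptyInputFix(List<String> input) {
        List<String> sink = new ArrayList<>();
        SummaryMapper mapper = new SummaryMapper();
        for (String row : input) {
            mapper.map(row, sink::add);
        }
        if (input.isEmpty()) {
            mapper.setCollector(sink::add);
        }
        mapper.close();
        return sink;
    }

    public static void main(String[] args) {
        List<String> empty = Collections.emptyList();
        System.out.println("plain, 2 rows: " + runPlain(Arrays.asList("a", "b")));
        System.out.println("plain, 0 rows: " + runPlain(empty));
        System.out.println("fixed, 0 rows: " + runWithEmptyInputFix(empty));
    }
}
```

In this model the plain runner loses the summary record for an empty input, while the variant that sets the collector explicitly still emits it.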


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapRunner.java
> > Lines 29 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951917#file1951917line29>
> >
> > Why do we need this class?

see previous comment


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
> > Line 247 (original), 246-248 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951919#file1951919line247>
> >
> > This may result in extra memory allocation. If this change is not 
> > necessary, can we leave it as is?

this was some leftover from an earlier version of the patch...removed


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java
> > Lines 453 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951920#file1951920line453>
> >
> > Please add comment.

added:

   in case the empty grouping set is present, but no output has been produced,
   the "summary row" still needs to be emitted


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
> > Lines 580 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951921#file1951921line581>
> >
> > Add comment on need for this.

I've added:

If there are no inputs; the Execution engine skips the operator tree.
To prevent it from happening; an opaque  ZeroRows input is added here - when 
needed.


> On Feb. 5, 2018, 11:05 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java
> > Lines 213 (patched)
> > <https://reviews.apache.org/r/65479/diff/1/?file=1951925#file1951925line219>
> >
> > Do we need this? Reducers are always launched when there is no 
> > mapper. So, this seems unnecessary.

I can remove it...but I think IConfigureJobConf can be used as a general way to 
make modifications like this - in the current case this setting has effectively 
no effect on the reducers...they are being run anyway

I think that after this patch the code from PlanUtils.configureJobConf could be 
moved to FileSinkOperator by using the new interface
I feel that core constructs like ReduceWork should depend less on explicit 
operator implementations and more on interfaces if possible.
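
A small sketch of what I mean by depending on an interface instead of explicit operator classes. The names (`Operator`, `ConfiguresJob`, `WorkSketch`, the `sketch.run.on.empty.input` key) are hypothetical and a plain `Map` stands in for the job configuration; this is not the actual IConfigureJobConf or ReduceWork code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical minimal operator abstraction. */
interface Operator {
    String name();
}

/** Hypothetical optional capability: operators that want to adjust the job configuration. */
interface ConfiguresJob {
    void configureJob(Map<String, String> conf);
}

final class PlainOperator implements Operator {
    @Override
    public String name() {
        return "SEL";
    }
}

final class GroupByLikeOperator implements Operator, ConfiguresJob {
    @Override
    public String name() {
        return "GBY";
    }

    @Override
    public void configureJob(Map<String, String> conf) {
        conf.put("sketch.run.on.empty.input", "true"); // made-up key for illustration
    }
}

final class WorkSketch {
    /** The work object never names a concrete operator class; it only checks the interface. */
    static Map<String, String> configure(List<Operator> operators) {
        Map<String, String> conf = new HashMap<>();
        for (Operator op : operators) {
            if (op instanceof ConfiguresJob) {
                ((ConfiguresJob) op).configureJob(conf);
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        List<Operator> ops = Arrays.asList(new PlainOperator(), new GroupByLikeOperator());
        System.out.println(configure(ops));
    }
}
```

The loop over all operators is the extra cost of the general mechanism; it is linear in the number of operators in the plan.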


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65479/#review196840
---


On Feb. 2, 2018, 12:23 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65479/
> ---
> 
> (Updated Feb. 2, 2018, 12:23 p.m.)
> 
> 
> Revi

Review Request 65480: HIVE-18545 Add UDF to parse complex types from json

2018-02-02 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65480/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-18545
https://issues.apache.org/jira/browse/HIVE-18545


Repository: hive-git


Description
---

add json_read udf


Diffs
-

  
itests/hive-jmh/src/main/java/org/apache/hive/benchmark/udf/json_read/JsonReadBench.java
 PRE-CREATION 
  
itests/hive-jmh/src/main/resources/org/apache/hive/benchmark/udf/json_read/val1.json
 PRE-CREATION 
  
itests/hive-jmh/src/main/resources/org/apache/hive/benchmark/udf/json_read/val1.type
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
76e85636d1ecddc720d6b6e3680194354a6e157c 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFJsonRead.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFJsonRead.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/udf_json_read.q PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out 
43e4a5de393d4b23c4c0257f08c32dd650eaaadc 
  ql/src/test/results/clientpositive/udf_json_read.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/65480/diff/1/


Testing
---


Thanks,

Zoltan Haindrich



Review Request 65479: HIVE-18523 Fix summary row in case there are no inputs

2018-02-02 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65479/
---

Review request for hive, Ashutosh Chauhan and Prasanth_J.


Bugs: HIVE-18523
https://issues.apache.org/jira/browse/HIVE-18523


Repository: hive-git


Description
---

* ensure that mapper operators are started up - but only if empty grouping is 
present


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 
6a0f0de2a5e84770c6446af41710d972d813c7bc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/IConfigureJobConf.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
d7b3e4b2fd3ee1a8e2795095a6c55442de2b38e0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 
976b537033abda5d5ab8b77a7e7d6fb9c84e5a19 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapRunner.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 
150382a8d58fd4ba44e4d9b78a80173ab984e776 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
98f4bc01c8526422348a38f8d8632e0899d695ee 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 
45d809a1820fcb6ea5e1e5c15aee7de91a4c36c8 
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
e4dfc009d95f4302bd1fcdff2276e11bed68d2e0 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
c3b846c4d2fee8691b4952b9f6cf4dd1d8bd632f 
  ql/src/java/org/apache/hadoop/hive/ql/io/NullRowsInputFormat.java 
6a372a3f47e3ac2ae2b2e583541b3a19e5d525f3 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 
f2b2fc57a03b368707968eb503139e51218008ca 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceWork.java 
ecfb118b41bfa5b7d593b7e801a37f0a7b5b0b5e 
  ql/src/test/queries/clientpositive/groupby_rollup_empty.q 
432d8c448a05f51db9ecf9940bce599dfd598a70 
  ql/src/test/results/clientpositive/groupby_rollup_empty.q.out 
7359140e29fc63eebbab42ab385187be6bfc66e1 
  ql/src/test/results/clientpositive/llap/groupby_rollup_empty.q.out 
d2b57455a3640387d8bc5f2d415a7af25eb55341 


Diff: https://reviews.apache.org/r/65479/diff/1/


Testing
---

added new testcase for union


Thanks,

Zoltan Haindrich



Re: Review Request 65415: HIVE-18571 stats issues for MM tables

2018-02-02 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65415/#review196696
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 1947 (patched)
<https://reviews.apache.org/r/65415/#comment276493>

I think this should be somewhere in the BasicStat related class; or can't 
this be moved there?



ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
Line 127 (original), 127 (patched)
<https://reviews.apache.org/r/65415/#comment276489>

It seems to me that the old conditionals did almost the same. After 
changing p.isAcid to p.isTransactional I don't see any difference, since if 
it's being rewritten the flag will be turned on.



ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java
Line 157 (original), 162 (patched)
<https://reviews.apache.org/r/65415/#comment276488>

I feel that currently the stats system is half-blind when it comes to acid 
tables, because the autogather operations are somewhat useless on them.
I was thinking about the following: removing this condition, so that stats 
are collected even when basic stats are off, would enable the stats to gather 
a total "rowtraffic". That might be good enough for an estimation, and it may 
give the join order optimization a chance to do its job better for 
acid/insert_only tables which have not been explicitly updated for a long 
time.

This could probably be done as a separate change (because it will probably 
rewrite every second q.out) - what do you think about it?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 650 (patched)
<https://reviews.apache.org/r/65415/#comment276490>

I might be missing something, but I don't see why quickstats should be 
calculated differently for transactional tables. Quickstats are num_files and 
total bytes on disk, and these apply to acid tables as well.



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 674 (patched)
<https://reviews.apache.org/r/65415/#comment276491>

I don't understand why



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
Lines 676 (patched)
<https://reviews.apache.org/r/65415/#comment276492>

I totally agree...it's very inconvenient to have this in the metastore


- Zoltan Haindrich


On Jan. 31, 2018, 2:15 a.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65415/
> ---
> 
> (Updated Jan. 31, 2018, 2:15 a.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> f.,v fbghdscd
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 114d455ff8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java
>  bad7962373 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 63bcedc000 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
> 6c73dc54a7 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
> 5868d4dd56 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> dbf9363d11 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java a4e770ce95 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsNoJobTask.java 
> 946c300750 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java b48379013d 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/Partish.java 78f48b169a 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java 
> d84cf136d5 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  89354a2d34 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  ecc464418d 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
>  50f873a013 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
>  2599ab103e 
> 
> 
> Diff: https://reviews.apache.org/r/65415/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[jira] [Created] (HIVE-18592) DP insert on insert only table causes StatTask to fail

2018-01-31 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18592:
---

 Summary: DP insert on insert only table causes StatTask to fail
 Key: HIVE-18592
 URL: https://issues.apache.org/jira/browse/HIVE-18592
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


can be reproduced with:

{code}
set hive.mapred.mode=nonstrict;
set 
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;


set hive.create.as.insert.only=true;
set metastore.create.as.acid=true;

drop table if exists student;

create table student(
name string,
age int,
gpa double);

insert into student values
('asd',1,2),
('asdx',2,3),
('asdx',2,3),
('asdx',3,3),
('asdx',3,3),
('asdx',3,3);

create table p1 (name STRING, GPA DOUBLE) PARTITIONED BY (age INT);

SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE p1 PARTITION (age) SELECT name, gpa, age FROM student;
{code}

causes exception

{code}
2018-01-31T02:16:24,135 ERROR [22bd4065-6e2f-4f4c-8f29-8d6aad8edda8 main] 
exec.StatsTask: Failed to run stats task
org.apache.hadoop.hive.ql.metadata.HiveException: 
NoSuchObjectException(message:Partition for which stats is gathered doesn't 
exist.)
at 
org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:4295)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.stats.ColStatsProcessor.persistColumnStats(ColStatsProcessor.java:180)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.stats.ColStatsProcessor.process(ColStatsProcessor.java:84)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:108) 
[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) 
[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
...
Caused by: org.apache.hadoop.hive.metastore.api.NoSuchObjectException: 
Partition for which stats is gathered doesn't exist.
at 
org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:7757)
 ~[hive-standalone-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_151]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_151]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_151]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
~[hive-standalone-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at com.sun.proxy.$Proxy38.updatePartitionColumnStatistics(Unknown 
Source) ~[?:?]
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartitonColStats(HiveMetaStore.java:5394)
 ~[hive-standalone-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:6907)
 ~[hive-standalone-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_151]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_151]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_151]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 ~[hive-standalone-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 ~[hive-standalone-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at com.sun.proxy.$Proxy40.set_aggr_stats_for(Unknown Source) ~[?:?]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.setPartitionColumnStatistics(HiveMetaStoreClient.java:1736)
 ~[hive-standalone-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.setPartitionColumnStatistics(SessionHiveMetaStoreClient.java:375)
 ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_151]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_151

Review Request 65422: HIVE-17626

2018-01-30 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65422/
---

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

preview


Diffs
-

  cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java a78e0c63d7 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b7d3e99e1a 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatCli.java 
ad31287879 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatDriver.java 
533f0bcd6f 
  itests/src/test/resources/testconfiguration.properties d86ff58840 
  ql/src/java/org/apache/hadoop/hive/ql/AbstractReExecDriver.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/Context.java 820fbf0f58 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 74595b00f9 
  ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java 49d2bf5f33 
  ql/src/java/org/apache/hadoop/hive/ql/IDriver.java 6280be0b08 
  ql/src/java/org/apache/hadoop/hive/ql/ReExecOverlayDriver.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/ReOptimizeDriver.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 76e85636d1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 199b181290 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 395a5f450f 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java
 8dd7cfe58c 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkEmptyKeyOperator.java
 134fc0ff0b 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java
 1eb72ce4d9 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkUniformHashOperator.java
 384bd74686 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/PrivateHookContext.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
190771ea6b 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 cbadfa4f07 
  ql/src/java/org/apache/hadoop/hive/ql/plan/Statistics.java 0057f0c2c6 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/EmptyStatsSource.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/GroupTransformer.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/PlanMapper.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/PlanMapperProcess.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/RuntimeStatsSource.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/SimpleRuntimeStatsSource.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/StatsSource.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/refs/OperatorRef.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessorFactory.java 
dcf8d31eaf 
  ql/src/java/org/apache/hadoop/hive/ql/stats/OperatorStats.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/stats/OperatorStatsReaderHook.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAssertTrueOOM.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
d56002d192 
  ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java 
PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestReOptimization.java 
PRE-CREATION 
  ql/src/test/queries/clientpositive/retry_failure.q PRE-CREATION 
  ql/src/test/queries/clientpositive/retry_failure_oom.q PRE-CREATION 
  ql/src/test/queries/clientpositive/retry_failure_stat_changes.q PRE-CREATION 
  ql/src/test/results/clientpositive/llap/retry_failure.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/retry_failure_oom.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/llap/retry_failure_stat_changes.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/65422/diff/1/


Testing
---


Thanks,

Zoltan Haindrich



Re: Review Request 64900: HIVE-18359: Extend grouping set limits from int to long

2018-01-30 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64900/#review196518
---




ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
Lines 672 (patched)
<https://reviews.apache.org/r/64900/#comment276191>

the NoMapper flag file is shared between mappers; so the following query 
results in a single row:

```
create table tx3 (a integer,b integer,c integer);

select '2 rows expected',sum(c) from tx3 group by rollup (a)
union all
select '2 rows expected',sum(c) from tx3 group by rollup (a);
```

could you please add this to groupby_rollup_empty.q?


- Zoltan Haindrich


On Jan. 30, 2018, 8:31 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64900/
> ---
> 
> (Updated Jan. 30, 2018, 8:31 a.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-18359
> https://issues.apache.org/jira/browse/HIVE-18359
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-18359: Extend grouping set limits from int to long
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 6a0f0de 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7b3e4b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
> 98f4bc0 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 
> 45d809a 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/VirtualColumn.java 2411d3a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveGroupingID.java
>  4ba27a2 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveExpandDistinctAggregatesRule.java
>  864efa4 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveGBOpConvUtil.java
>  f22cd94 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 85a1f34 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83dfb47 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java e90a398 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFGrouping.java 
> c0c3015 
>   ql/src/test/queries/clientpositive/cte_1.q 15d3f06 
>   ql/src/test/queries/clientpositive/groupingset_high_columns.q PRE-CREATION 
>   ql/src/test/results/clientpositive/annotate_stats_groupby.q.out ed3d594 
>   ql/src/test/results/clientpositive/annotate_stats_groupby2.q.out ffcb20f 
>   ql/src/test/results/clientpositive/cbo_rp_annotate_stats_groupby.q.out 
> 3d92a0d 
>   ql/src/test/results/clientpositive/groupby_cube1.q.out e5ece81 
>   ql/src/test/results/clientpositive/groupby_cube_multi_gby.q.out 9a6457c 
>   ql/src/test/results/clientpositive/groupby_grouping_id3.q.out f13b6e5 
>   ql/src/test/results/clientpositive/groupby_grouping_sets1.q.out d70f065 
>   ql/src/test/results/clientpositive/groupby_grouping_sets2.q.out 453b9f7 
>   ql/src/test/results/clientpositive/groupby_grouping_sets3.q.out be8d20e 
>   ql/src/test/results/clientpositive/groupby_grouping_sets4.q.out 0c6ead9 
>   ql/src/test/results/clientpositive/groupby_grouping_sets5.q.out 0bb12e1 
>   ql/src/test/results/clientpositive/groupby_grouping_sets6.q.out 5b990a1 
>   ql/src/test/results/clientpositive/groupby_grouping_sets_grouping.q.out 
> 1f2f86b 
>   ql/src/test/results/clientpositive/groupby_grouping_sets_limit.q.out 
> b25b0e5 
>   ql/src/test/results/clientpositive/groupby_grouping_window.q.out 32135e4 
>   ql/src/test/results/clientpositive/groupby_rollup1.q.out bc1d8a9 
>   ql/src/test/results/clientpositive/groupby_rollup_empty.q.out 7359140 
>   ql/src/test/results/clientpositive/groupingset_high_columns.q.out 
> PRE-CREATION 
>   
> ql/src/test/results/clientpositive/infer_bucket_sort_grouping_operators.q.out 
> 5f1d264 
>   ql/src/test/results/clientpositive/limit_pushdown2.q.out cdd221b 
>   ql/src/test/results/clientpositive/llap/cte_1.q.out ddef9db 
>   ql/src/test/results/clientpositive/llap/groupby_rollup_empty.q.out d2b5745 
>   ql/src/test/results/clientpositive/llap/llap_acid.q.out 38889b9 
>   ql/src/test/results/clientpositive/llap/llap_acid_fast.q.out 4a7297d 
>   ql/src/test/results/clientpositive/llap/multi_count_distinct_null.q.out 
> 1f17f96 
>   ql/src/test/results/clientpositive/llap/sysdb.q.out 5ed427f 
>   ql/src/test/results/clientpositive/llap/vector_groupby_cube1.q.out 39de888 
>   ql/src/test/results/clientpositive/llap/ve

[jira] [Created] (HIVE-18565) Set some timeout to all CliDriver tests

2018-01-29 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18565:
---

 Summary: Set some timeout to all CliDriver tests
 Key: HIVE-18565
 URL: https://issues.apache.org/jira/browse/HIVE-18565
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Zoltan Haindrich


If a test case runs into an infinite loop or something similar, it should still 
fail on its own instead of causing a timeout at the ptest executor.
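
A minimal sketch of the idea, using only java.util.concurrent: the test body runs on a worker thread and is failed explicitly once a time limit passes. The class and method names are hypothetical and not part of Hive; the actual change may well rely on a JUnit-level timeout instead.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not Hive code): fails a test body that runs too long. */
final class TestTimeoutSketch {

  /** Runs the body on a worker thread; throws AssertionError if the limit is exceeded. */
  static <T> T runWithTimeout(Callable<T> body, long timeout, TimeUnit unit) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "test-body");
      t.setDaemon(true); // a stuck body must not keep the JVM alive
      return t;
    });
    Future<T> future = executor.submit(body);
    try {
      return future.get(timeout, unit);
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the worker thread
      throw new AssertionError("test timed out after " + timeout + " " + unit);
    } catch (ExecutionException e) {
      // Surface the original failure of the body instead of the wrapper
      Throwable cause = e.getCause();
      if (cause instanceof Exception) {
        throw (Exception) cause;
      }
      if (cause instanceof Error) {
        throw (Error) cause;
      }
      throw e;
    } finally {
      executor.shutdownNow();
    }
  }
}
```

With something like this, a looping test case shows up as a normal test failure in the report rather than as a killed batch.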



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18564) Add a mapper to make plan transformations more easily understandable

2018-01-29 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18564:
---

 Summary: Add a mapper to make plan transformations more easily 
understandable
 Key: HIVE-18564
 URL: https://issues.apache.org/jira/browse/HIVE-18564
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


This started as a small helper class to enable plan-independent mapping 
of runtime operator information. But in reality it's a bit different, and might 
have other kinds of uses of its own.

Goals were:
 * connect plan pieces which are responsible for the same part together; 
currently I'm using it to connect RelNode, AST, Operator, RuntimeStats
 * make it easy to attach new data
 * make it easy to lookup some related information

This concept also seems to be useful when writing tests, because it enables 
the lookup of specific pieces like HiveFilter.





[jira] [Created] (HIVE-18545) Add UDF to parse complex types from json

2018-01-25 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18545:
---

 Summary: Add UDF to parse complex types from json
 Key: HIVE-18545
 URL: https://issues.apache.org/jira/browse/HIVE-18545
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich
 Attachments: HIVE-18545.01.patch







[jira] [Created] (HIVE-18523) Fix summary row in case there are no inputs

2018-01-24 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18523:
---

 Summary: Fix summary row in case there are no inputs
 Key: HIVE-18523
 URL: https://issues.apache.org/jira/browse/HIVE-18523
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


HIVE-17617 added a fix to handle empty groups correctly; however, it might be 
possible that the summary row is returned multiple times, in case the mappers 
"seem" to be optimized away...





[jira] [Created] (HIVE-18522) Running all negative qtests seems to leave some threads behind

2018-01-24 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18522:
---

 Summary: Running all negative qtests seems to leave some threads 
behind
 Key: HIVE-18522
 URL: https://issues.apache.org/jira/browse/HIVE-18522
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Zoltan Haindrich



Running all the negative queries seemingly produces some extra threads... the 
count has gone from 40 to 80. The suspicious live thread names: "HMSHandler #x", 
"Hearbeater-#x", "communicationthread" ...
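
A small sketch of how such a count can be taken from inside the JVM, assuming nothing beyond the standard library. The class name is hypothetical and this is not how the numbers above were necessarily obtained; it only groups live threads by name after stripping a trailing counter.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical diagnostic helper (not Hive code): counts live threads per name. */
final class LiveThreadReport {

  /** Groups live threads by name, ignoring a trailing counter such as " #12" or "-3". */
  static Map<String, Integer> countByName() {
    Map<String, Integer> counts = new TreeMap<>();
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      // "HMSHandler #12" and "HMSHandler #13" end up in the same bucket
      String key = t.getName().replaceAll("[\\s#-]*\\d+$", "");
      counts.merge(key, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    countByName().forEach((name, n) -> System.out.println(n + "\t" + name));
  }
}
```

Printing this report before and after the negative qtests would show which name groups keep growing.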





[jira] [Created] (HIVE-18495) JUnit rule to enable Driver level testing

2018-01-19 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18495:
---

 Summary: JUnit rule to enable Driver level testing
 Key: HIVE-18495
 URL: https://issues.apache.org/jira/browse/HIVE-18495
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


I've tried to write a case for a sophisticated check...it worked so well that 
I've started using it and eventually created a junit rule to make it easier to 
reuse

Currently it takes ~15-25sec to run a test case with this framework (most of 
that time is the launch time of all the things which are needed to 
run a driver command).

* enable writing JUnit tests which have access to the {{IDriver}} level
* leave out the cli-driver; it sometimes causes problems
* write tests at the {{ql}} module
* it should also work from the IDE without changing anything

Note: JUnit 5 would be great for this task; but unfortunately junit5 needs 
maven-surefire 2.19.1 ; which causes all kinds of problems for hive devs using 
idea...so that's not an option.





[jira] [Created] (HIVE-18469) HS2UI: Introduce separate option to show query on web ui

2018-01-17 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18469:
---

 Summary: HS2UI: Introduce separate option to show query on web ui
 Key: HIVE-18469
 URL: https://issues.apache.org/jira/browse/HIVE-18469
 Project: Hive
  Issue Type: Bug
 Environment: currently {{ConfVars.HIVE_LOG_EXPLAIN_OUTPUT}} enables 2 
features:

* logs the query to the console (even through beeline)
* shows the query on the web ui

I've enabled it...and ever since then my beeline is always flooded with an 
{{explain extended}} output...which is very verbose; even for simple queries.
Reporter: Zoltan Haindrich








Re: checkstyle changes

2018-01-16 Thread Zoltan Haindrich
Well okay... but I still feel that checking for 100 kinda takes the value of 
these checks away, because people will just ignore the checks when they are 
red all the time... so I still think it should be raised to some higher value.
Would 120 be ok?
Choosing a higher line length does not mean that everyone should use it; I 
think everyone is free to use smaller settings.

cheers,
Zoltan


On 8 Dec 2017 4:16 a.m., Rui Li <lirui.fu...@gmail.com> wrote:
I also believe 140 is a little too long.

BTW do we use 2 or 4 chars for continuation indent? I personally prefer 4,
but I do find both cases in our code.

On Fri, Dec 8, 2017 at 6:20 AM, Alexander Kolbasov <ak...@cloudera.com>
wrote:

> Problem with 140-wide code isn't just laptops - in many cases we need to do
> side-by-side diffs (e.g. for code reviews) and this doubles the required
> size.
>
> - Alex.
>
> On Thu, Dec 7, 2017 at 1:38 PM, Sergey Shelukhin <ser...@hortonworks.com>
> wrote:
>
> > I think the 140-character change will make the code hard to use on a
> > laptop without a monitor.
> >
> >
> > On 17/12/7, 02:43, "Peter Vary" <pv...@cloudera.com> wrote:
> >
> > >Disclaimer: I did not have time to test it out, but according to
> > >http://checkstyle.sourceforge.net/config_misc.html#Indentation
> > ><http://checkstyle.sourceforge.net/config_misc.html#Indentation>
> > >Maybe the indentation could be solved by:
> > >lineWrappingIndentation=2 (default 4)
> > >forceStrictCondition=false (default false)
> > >
> > >http://checkstyle.sourceforge.net/config_misc.html#TrailingComment
> > ><http://checkstyle.sourceforge.net/config_misc.html#TrailingComment>
> > >might help with the comments
> > >
> > >Sorry for not being more helpful. Maybe sometime later I will have time
> > >to check these out.
> > >
> > >Thanks,
> > >Peter
> > >
> > >> On Dec 7, 2017, at 10:26 AM, Zoltan Haindrich <zhaindr...@hortonworks.com> wrote:
> > >>
> > >> Hello Eugene!
> > >>
> > >> I've looked into doing something with these; but I was not able to
> > >>relieve the warnings you've mentioned:
> > >>
> > >> * the ;// seems not to be configurable
> > >>   It seems like it's handled by the whitespaceafter module; I'm not sure
> > >>how to allow / after ;
> > >> * I think that indentation of 4 for many method arguments makes it more
> > >>readable; so I think it would be the best to just drop this check...but
> > >>I've not seen any way to do this (w/o disabling the whole indentation
> > >>module...)
> > >>
> > >> maybe someone else should take a look at it... I find it pretty hard to
> > >>get docs about specific checkstyle configurations; since the search
> > >>keywords mostly contain words like: semicolon, whitespace,
> > >>comment...which tends to pull in all kind of garbage results :)
> > >>
> > >> cheers,
> > >> Zoltan
> > >>
> > >> On 6 Dec 2017 8:53 p.m., Eugene Koifman <ekoif...@hortonworks.com>
> > >>wrote:
> > >> It currently complains about no space between ; and // as in “…);//foo”
> > >>
> > >> And also about indentation when a single method call is split into
> > >>multiple lines.
> > >> It insists on 4 chars in this case, though we use 2 in (all?) other
> > >>cases.
> > >>
> > >> Could this be dialed down as well?
> > >>
> > >>
> > >> On 12/5/17, 7:26 AM, "Peter Vary" <pv...@cloudera.com> wrote:
> > >>
> > >>+1 for the changes
> > >>
> > >>> On Dec 5, 2017, at 1:02 PM, Zoltan Haindrich <k...@rxd.hu> wrote:
> > >>>
> > >>> Hello,
> > >>>
> > >>> I've filed a ticket to make the checkstyle warnings less noisy
> > >>>(https://issues.apache.org/jira/browse/HIVE-18222)
> > >>>
> > >>> * set maxlinelength to 140
> > >>>   I think everyone is working with big-enough displays to handle this
> > >>>:)
> > >>>   There are many methods which have complicated names / arguments /
> > >>>etc ; breaking the lines more frequently hurts readability...
> > >>> * disabled some restrictions like: declaration via get/set
> > >>>methods for protected/package fields are not mandatory
> > >>>
> > >>> If you don't feel comfortable with these changes, please share your
> > >>>point of view.
> > >>>
> > >>> cheers,
> > >>> Zoltan
> > >>>
> > >>>
> > >>
> > >>
> > >>
> > >>
> > >
> >
> >
>



--
Best regards!
Rui Li



Re: ptest broken?

2018-01-16 Thread Zoltan Haindrich
Hello,

I think the triggering mechanism is broken; queuing manually works

cheers,
Zoltan

On 16 Jan 2018 7:38 p.m., Deepak Jaiswal  wrote:
Hi,

I attached a patch once last night and then again in the morning but no run 
triggered after that.
There is no queue of pending runs as well.

Regards,
Deepak



[jira] [Created] (HIVE-18454) Incorrect rownum estimation in joins

2018-01-16 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18454:
---

 Summary: Incorrect rownum estimation in joins
 Key: HIVE-18454
 URL: https://issues.apache.org/jira/browse/HIVE-18454
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Zoltan Haindrich


I've probably seen this earlier... row counts seem to be off the 
charts: 12 rows estimated when the table has only 10 rows

{code}
create table s (x int);

insert into s values
(1),(2),(3),(4),(5),
(6),(7),(8),(9),(10);

create table tu(id_uv int,id_uw int,u int);
create table tv(id_uv int,v int);
create table tw(id_uw int,w int);

from s
insert overwrite table tu
select x,x,x 
where x<=6 or x=10
insert overwrite table tv
select x,x  
where x<=3 or x=10
insert overwrite table tw
select x,x  
;

set hive.explain.user=true;

explain analyze
select sum(u*v*w) from tu
join tv on (tu.id_uv=tv.id_uv)
join tw on (tu.id_uw=tw.id_uw)
where w>9 and u>1 and v>3;

desc formatted tv;
{code}







Re: [Announce] New committer: Deepak Jaiswal

2018-01-09 Thread Zoltan Haindrich
Congratulations Deepak!

On 9 Jan 2018 10:32 p.m., Matthew McCline  wrote:

Congratulations!


From: Lefty Leverenz 
Sent: Tuesday, January 9, 2018 3:08 AM
To: dev@hive.apache.org
Subject: Re: [Announce] New committer: Deepak Jaiswal

Congratulations!

-- Lefty


On Mon, Jan 8, 2018 at 10:41 AM, Vihang Karajgaonkar 
wrote:

> Congrats Deepak!
>
> On Mon, Jan 8, 2018 at 10:25 AM, Mithun RK  wrote:
>
> > Congratulations, Deepak!
> >
> > On Mon, Jan 8, 2018 at 10:14 AM D K  wrote:
> >
> > > Congratulations Deepak!
> > >
> > > On Fri, Jan 5, 2018 at 2:18 PM, Ashutosh Chauhan  >
> > > wrote:
> > >
> > > > The Project Management Committee (PMC) for Apache Hive has invited
> > Deepak
> > > > Jaiswal to become a committer and we are pleased to announce that he
> > has
> > > >  accepted.
> > > >
> > > > Welcome, Deepak!
> > > >
> > > > Thanks,
> > > >  Ashutosh
> > > >
> > >
> >
>





[jira] [Created] (HIVE-18414) upgrade to tez-0.9.1

2018-01-09 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18414:
---

 Summary: upgrade to tez-0.9.1
 Key: HIVE-18414
 URL: https://issues.apache.org/jira/browse/HIVE-18414
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-18413) Grouping of an empty results set must contain null values

2018-01-09 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18413:
---

 Summary: Grouping of an empty results set must contain null values
 Key: HIVE-18413
 URL: https://issues.apache.org/jira/browse/HIVE-18413
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


exposed by: HIVE-18359

in case of vectorization the summary row object was left as is; which may cause 
it to be inconsistent with the column isNull conditions





Re: Review Request 64900: HIVE-18359: Extend grouping set limits from int to long

2018-01-09 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64900/#review195020
---




ql/src/test/results/clientpositive/llap/groupby_rollup_empty.q.out
Lines 94 (patched)
<https://reviews.apache.org/r/64900/#comment274121>

there should be no result set changes in this q.out
I'm not sure if this difference is caused by the rollup empty problem your 
patch exposed or not



ql/src/test/results/clientpositive/llap/groupby_rollup_empty.q.out
Lines 111 (patched)
<https://reviews.apache.org/r/64900/#comment274122>

there should be no result set changes in this q.out
NULL,1 may not appear in this case


- Zoltan Haindrich


On Jan. 5, 2018, 6:55 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64900/
> ---
> 
> (Updated Jan. 5, 2018, 6:55 a.m.)
> 
> 
> Review request for hive and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-18359
> https://issues.apache.org/jira/browse/HIVE-18359
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-18359: Extend grouping set limits from int to long
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 8b94d1d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 
> 90145e5 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/VirtualColumn.java 0032305 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveGroupingID.java
>  adcda26 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveExpandDistinctAggregatesRule.java
>  89c5c23 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveGBOpConvUtil.java
>  6f4188c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 28b4cfe 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 5a88a96 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java 9d4ad22 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFGrouping.java 
> cee0e14 
>   ql/src/test/queries/clientpositive/cte_1.q 15d3f06 
>   ql/src/test/queries/clientpositive/groupingset_high_columns.q PRE-CREATION 
>   ql/src/test/results/clientpositive/annotate_stats_groupby.q.out ed3d594 
>   ql/src/test/results/clientpositive/annotate_stats_groupby2.q.out ffcb20f 
>   ql/src/test/results/clientpositive/auto_join25.q.out 063d3ca 
>   ql/src/test/results/clientpositive/cte_1.q.out 9374a32 
>   ql/src/test/results/clientpositive/groupby_cube1.q.out e5ece81 
>   ql/src/test/results/clientpositive/groupby_cube_multi_gby.q.out 9a6457c 
>   ql/src/test/results/clientpositive/groupby_grouping_id3.q.out f13b6e5 
>   ql/src/test/results/clientpositive/groupby_grouping_sets1.q.out d70f065 
>   ql/src/test/results/clientpositive/groupby_grouping_sets2.q.out 453b9f7 
>   ql/src/test/results/clientpositive/groupby_grouping_sets3.q.out be8d20e 
>   ql/src/test/results/clientpositive/groupby_grouping_sets4.q.out 0c6ead9 
>   ql/src/test/results/clientpositive/groupby_grouping_sets5.q.out 0bb12e1 
>   ql/src/test/results/clientpositive/groupby_grouping_sets6.q.out 5b990a1 
>   ql/src/test/results/clientpositive/groupby_grouping_sets_grouping.q.out 
> 1f2f86b 
>   ql/src/test/results/clientpositive/groupby_grouping_sets_limit.q.out 
> b25b0e5 
>   ql/src/test/results/clientpositive/groupby_grouping_window.q.out 32135e4 
>   ql/src/test/results/clientpositive/groupby_rollup1.q.out bc1d8a9 
>   ql/src/test/results/clientpositive/groupby_rollup_empty.q.out 5db3184 
>   ql/src/test/results/clientpositive/groupingset_high_columns.q.out 
> PRE-CREATION 
>   
> ql/src/test/results/clientpositive/infer_bucket_sort_grouping_operators.q.out 
> 5f1d264 
>   ql/src/test/results/clientpositive/limit_pushdown2.q.out cdd221b 
>   ql/src/test/results/clientpositive/llap/cte_1.q.out ddef9db 
>   ql/src/test/results/clientpositive/llap/groupby_rollup_empty.q.out 061b0d7 
>   
> ql/src/test/results/clientpositive/llap/insert_values_orig_table_use_metadata.q.out
>  d135f08 
>   ql/src/test/results/clientpositive/llap/llap_acid.q.out 38889b9 
>   ql/src/test/results/clientpositive/llap/llap_acid_fast.q.out 4a7297d 
>   ql/src/test/results/clientpositive/llap/multi_count_distinct_null.q.out 
> 39feaec 
>   ql/src/test/results/clientpositive/llap/sysdb.q.out 3bd407b 
>   ql/src/test/results/clientpositive/llap/vector_groupby_cube1.q.out 39de888 
>   ql/src/test/results/clientpositive/l

[jira] [Created] (HIVE-18383) Qtests: running all cases from TestNegativeCliDriver results in OOMs

2018-01-05 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18383:
---

 Summary: Qtests: running all cases from TestNegativeCliDriver 
results in OOMs
 Key: HIVE-18383
 URL: https://issues.apache.org/jira/browse/HIVE-18383
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


I think that it is caused by unclosed SessionState objects which are piling up 
and cause the OOM.

Special care has been taken to start a new SessionState for every qtest, 
but the old one is not closed up to this 
[point|https://github.com/apache/hive/blob/20c9a3905f4b1b627c935ad54a53a7a59015587c/itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java#L1202]
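
The suspected leak pattern, and the obvious remedy, can be sketched with a stand-in class. FakeSession below is NOT Hive's SessionState and the harness is hypothetical; the sketch only illustrates closing the previous session before a new one replaces it.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a heavyweight per-test session; NOT Hive's SessionState. */
final class FakeSession implements AutoCloseable {
  /** Tracks sessions that were opened and not yet closed. */
  static final List<FakeSession> OPEN = new ArrayList<>();

  FakeSession() {
    OPEN.add(this);
  }

  @Override
  public void close() {
    OPEN.remove(this);
  }
}

/** Hypothetical harness: starts a fresh session per test and closes the old one first. */
final class SessionPerTestSketch {
  private FakeSession current;

  FakeSession startNewSession() {
    if (current != null) {
      current.close(); // without this, every earlier session stays reachable
    }
    current = new FakeSession();
    return current;
  }

  void shutdown() {
    if (current != null) {
      current.close();
      current = null;
    }
  }
}
```

In this sketch the number of open sessions stays at one no matter how many tests run; dropping the close() call makes it grow by one per test.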

This prevents running all {{TestNegativeCliDriver}} tests in one maven 
call... I keep getting OOMs.

This issue sometimes appears on the ptest executor as well, and it's reported as 
a failed batch.


I've gone back in time a bit... seems like at 
c925cf8d2bdf646f5c3c57ed7252c01b2ab33eec it was ok to execute the whole batch; 
but at 1b4baf474c15377cc9f0bacdda317feabeefacaf and probably also at 
a42314deb07a1c8d9d4daeaa799ad1c1ebb0c6c9 it's not possible anymore. I suspect 
that there is possibly another issue... or these are just the consequences of 
the sessionstate getting heavier by a few hundred bytes, which made it easier to fill 
up the heap.





Review Request 64947: HIVE-18108 in case basic stats are missing; rowcount estimation depends on the selected columns size

2018-01-04 Thread Zoltan Haindrich
 
  ql/src/test/results/clientpositive/union_remove_8.q.out 
a387f7e874f36b8f6f289006ababf18b91a8e7be 
  ql/src/test/results/clientpositive/union_remove_9.q.out 
78ad309669eb423582e97d575c183a6fb32044d9 
  ql/src/test/results/clientpositive/union_view.q.out 
16d6e8aad69286a59cdc42fc5caa9bb5686c6591 
  ql/src/test/results/clientpositive/vector_gather_stats.q.out 
9675a8b12228dbe22e35f3e9b9da65f33ff4a6e6 
  ql/src/test/results/clientpositive/vectorization_parquet_projection.q.out 
8ed69a45b78be466f61ebb6d32fdb25091ef758c 


Diff: https://reviews.apache.org/r/64947/diff/1/


Testing
---


Thanks,

Zoltan Haindrich



[jira] [Created] (HIVE-18354) Fix test TestAcidOnTez

2018-01-02 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18354:
---

 Summary: Fix test TestAcidOnTez 
 Key: HIVE-18354
 URL: https://issues.apache.org/jira/browse/HIVE-18354
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


It seems like this test has been broken by HIVE-18149.





[jira] [Created] (HIVE-18330) Fix TestMsgBusConnection - doesn't test tests the original intention

2017-12-22 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18330:
---

 Summary: Fix TestMsgBusConnection - doesn't test tests the 
original intention
 Key: HIVE-18330
 URL: https://issues.apache.org/jira/browse/HIVE-18330
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


This test should never have passed... there is a point where it drops a 
database, and that command returns with an error. There are other interesting 
things too... for example, create database on an existing db is reported as a 
success somewhere, so it gets posted to the msgbus.







[jira] [Created] (HIVE-18329) There are 2 configs to detect/warn for cross products

2017-12-22 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18329:
---

 Summary: There are 2 configs to detect/warn for cross products
 Key: HIVE-18329
 URL: https://issues.apache.org/jira/browse/HIVE-18329
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


The following 2 options seem to be after a very similar thing:
{code}
hive.exec.check.crossproducts
hive.strict.checks.cartesian.product
{code}
not sure about the differences...





[jira] [Created] (HIVE-18314) qtests: semijoin_hint.q breaks hybridgrace_hashjoin_2.q

2017-12-19 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18314:
---

 Summary: qtests: semijoin_hint.q breaks hybridgrace_hashjoin_2.q   
 Key: HIVE-18314
 URL: https://issues.apache.org/jira/browse/HIVE-18314
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Zoltan Haindrich


{code}
mvn install -q -am -pl itests/qtest -DskipSparkTests \
  -Dtest=TestMiniLlapLocalCliDriver \
  -Dqfile=semijoin_hint.q,hybridgrace_hashjoin_2.q
{code}





[jira] [Created] (HIVE-18309) qtests: smb_mapjoin_19.q breaks bucketsortoptimize_insert_2.q

2017-12-19 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18309:
---

 Summary: qtests: smb_mapjoin_19.q breaks 
bucketsortoptimize_insert_2.q
 Key: HIVE-18309
 URL: https://issues.apache.org/jira/browse/HIVE-18309
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Zoltan Haindrich



{code}
mvn install -q -am -pl itests/qtest -DskipSparkTests \
  -Dtest=TestMiniLlapLocalCliDriver \
  -Dqfile=smb_mapjoin_19.q,bucketsortoptimize_insert_2.q
{code}






Review Request 64712: HIVE-18224 Introduce interface above driver

2017-12-19 Thread Zoltan Haindrich
rsion.java
 0a034d3593468b1704769fa2b0fde2330d61b546 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
1fd84ac438dd36d1d867afc00a037b8749779667 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestCreateUdfEntities.java
 34f4ed0490f23378a6dc832cf0cbca2f81c94a81 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestDDLWithRemoteMetastoreSecondNamenode.java
 179eed95d0be28056d0bb7ed6a8b63910a7594ac 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
 d73cd6426cefceac35922314b208e92d23e52b1c 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/metadata/TestSemanticAnalyzerHookLoading.java
 2170ca3706d0c21df3a630b8b0112eb188a96b16 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java
 55acd1df3697f1742c826f1cd9648634811b915f 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/WarehouseInstance.java
 cde7a3e33cdadafe58e441fe65126636ef3c7bae 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/StorageBasedMetastoreTestBase.java
 dc3af3c18696bf7ce14b6139e4686c340b3fff32 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/TestAuthorizationPreEventListener.java
 6a668aa40c856f90d09da4eb8941a0e7e0a70f19 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/TestClientSideAuthorizationProvider.java
 57ff8c9ae7edc71b12a46ce63387e0ebdef58b22 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/TestMetastoreAuthorizationProvider.java
 edb46fd97903e409644c97b1aebefdb75a99de01 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/TestMultiAuthorizationPreEventListener.java
 2059370fd40c49ca0016ee8179512d325768331b 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java
 19694b093e5aa3af419409f725c228af912cf435 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerShowFilters.java
 5922a8c603e1597d6091b6fb64527d2c75f15a9c 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
 75eeaf61d6519987f6ea6e3d0641f326b99dfb34 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 
88034d764358ff91c8a6fdf824f6cda513cace3f 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 
b168906b440d14dfebab381d2109d11467437e2b 
  ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/IDriver.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/QueryLifeTimeHookRunner.java 
6bdf7ebf47001fde89b5fdd20e63e4e69a55d6d5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DagUtils.java 
aed1b2cf53d97577d058550f51ec3adfef9cb0a6 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/HookUtils.java 
4380fe33b6ca35a826052c7820a188a3c60c2ec2 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/HooksLoader.java 
00087267c735bb9a915e67ee041fa3bb2f4b5a8c 
  ql/src/java/org/apache/hadoop/hive/ql/processors/CommandProcessorFactory.java 
8d0690d33ebf96e714a0d9cd8239aba0273733a5 
  ql/src/test/queries/clientpositive/retry_tez_failure.q PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
773dd51c9a6c5d4982db7ecbed8ea26ecdc8c919 
  service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
cc4dd52fcf7987e468e477d81a89b03b80158419 


Diff: https://reviews.apache.org/r/64712/diff/1/


Testing
---

doesn't break any tests


Thanks,

Zoltan Haindrich



[jira] [Created] (HIVE-18306) Fix spark smb tests

2017-12-19 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18306:
---

 Summary: Fix spark smb tests
 Key: HIVE-18306
 URL: https://issues.apache.org/jira/browse/HIVE-18306
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


It seems to me that {{TestSparkCliDriver#testCliDriver\[auto_sortmerge_join_10\]}} 
and {{TestSparkCliDriver#testCliDriver\[bucketsortoptimize_insert_7\]}} have 
been failing since HIVE-18208 went in.







[jira] [Created] (HIVE-18305) travis-ci builds are timing out

2017-12-19 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18305:
---

 Summary: travis-ci builds are timing out
 Key: HIVE-18305
 URL: https://issues.apache.org/jira/browse/HIVE-18305
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


{code}

No output has been received in the last 10m0s, this potentially indicates a 
stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: 
https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
{code}






[jira] [Created] (HIVE-18302) Make StatsTask use less metastore calls

2017-12-19 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18302:
---

 Summary: Make StatsTask use less metastore calls
 Key: HIVE-18302
 URL: https://issues.apache.org/jira/browse/HIVE-18302
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


Ideally it should be 1.

StatsTask only "writes" some specific values... for HIVE-18285 it has 
probably undone some changes made by some other part.






[jira] [Created] (HIVE-18292) correct typo of vector_reduce_groupby_duplicate_cols in testconfiguration.properties

2017-12-18 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18292:
---

 Summary: correct typo of vector_reduce_groupby_duplicate_cols in 
testconfiguration.properties
 Key: HIVE-18292
 URL: https://issues.apache.org/jira/browse/HIVE-18292
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


HIVE-18258 added a test, but the matching properties 
[entry|https://github.com/apache/hive/blob/82590226a89eeac7aa0ace8c311a8d4f4794c5bc/itests/src/test/resources/testconfiguration.properties#L384]
 for it has a typo...

I never thought TestDanglingQOuts#checkDanglingQOut would catch useful things 
as well... :)





[jira] [Created] (HIVE-18243) Cartesian error for joins defined in where clause

2017-12-07 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18243:
---

 Summary: Cartesian error for joins defined in where clause
 Key: HIVE-18243
 URL: https://issues.apache.org/jira/browse/HIVE-18243
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


This issue was hidden because of HIVE-18238.

{code}
create table agg_01 (amount int, dim_shops_id int);
create table dim_shops (id int);

EXPLAIN SELECT agg.amount
FROM agg_01 agg,
dim_shops d1
WHERE agg.dim_shops_id = d1.id
and agg.dim_shops_id = 1;
{code}

emits a cartesian product error
{code}
2017-12-07T04:48:20,612 ERROR [c7a4797b-2635-4e28-9e0b-af2e4d26f2bc main] 
ql.Driver: FAILED: SemanticException Cartesian products are disabled for safety 
reasons. If you know what you are doing, please 
sethive.strict.checks.cartesian.product to false and that hive.mapred.mode is 
not set to 'strict' to proceed. Note that if you may get errors or incorrect 
results if you make a mistake while using some of the unsafe features.
{code}


From the plan at 
https://github.com/apache/hive/blob/7ddd915bf82a68c8ab73b0c4ca409f1a6d43d227/ql/src/test/results/clientpositive/llap/dynamic_partition_pruning_2.q.out#L591
it doesn't seem that a cartesian join is happening... possibly the 
check is overreacting...





Re: checkstyle changes

2017-12-07 Thread Zoltan Haindrich
Hello Eugene!

I've looked into doing something with these, but I was not able to get rid of 
the warnings you've mentioned:

* the ;// check doesn't seem to be configurable
   It seems like it's handled by the WhitespaceAfter module; I'm not sure how 
   to allow / after ;
* I think that an indentation of 4 for many method arguments makes it more 
   readable, so I think it would be best to just drop this check... but I've 
   not seen any way to do this (w/o disabling the whole indentation module...)

Maybe someone else should take a look at it... I find it pretty hard to get 
docs about specific checkstyle configurations, since the search keywords 
mostly contain words like: semicolon, whitespace, comment... which tend to 
pull in all kinds of garbage results :)

cheers,
Zoltan

On 6 Dec 2017 8:53 p.m., Eugene Koifman <ekoif...@hortonworks.com> wrote:
It currently complains about no space between ; and // as in “…);//foo”

And also about indentation when a single method call is split into multiple 
lines.
It insists on 4 chars in this case, though we use 2 in (all?) other cases.

Could this be dialed down as well?


On 12/5/17, 7:26 AM, "Peter Vary" <pv...@cloudera.com> wrote:

+1 for the changes

> On Dec 5, 2017, at 1:02 PM, Zoltan Haindrich <k...@rxd.hu> wrote:
>
> Hello,
>
> I've filed a ticket to make the checkstyle warnings less noisy
> (https://issues.apache.org/jira/browse/HIVE-18222)
>
> * set maxlinelength to 140
>   I think everyone is working with big-enough displays to handle this :)
>   There are many methods which have complicated names / arguments / etc;
>   breaking the lines more frequently hurts readability...
> * disabled some restrictions like: declaration via get/set methods
>   for protected/package fields are not mandatory
>
> If you don't feel comfortable with these changes, please share your point
> of view.
>
> cheers,
> Zoltan
>
>






ptests are stuck

2017-12-07 Thread Zoltan Haindrich

Hello,

It's stuck again; but the current state is very interesting:

* current build is testing HIVE-18237
  https://issues.apache.org/jira/browse/HIVE-18237
* it has been running for >5 hours
* hiveqa posted ptest results for HIVE-18237 around 4 hours ago!
* from the jenkins build console it's currently executing batch #134 
(134-TestSparkCliDriver)

https://builds.apache.org/job/PreCommit-HIVE-Build/8139/consoleFull
* but at the corresponding ptest site:
http://104.198.109.242/logs/PreCommit-HIVE-Build-8139/succeeded/
  not only batch 134 has finished, but all of them! (299?)
* I've searched for batch "177-TestMiniSparkOnYarnCliD" in the console 
output

https://builds.apache.org/job/PreCommit-HIVE-Build/8139/consoleFull
  and it only appears in the "generation" phase.
* taking a closer look at:
http://104.198.109.242/logs/PreCommit-HIVE-Build-8139/
  * according to the directory creation dates:
    all these directories seem to be newer than the creation date 
of http://104.198.109.242/logs/PreCommit-HIVE-Build-8139/ + 2 hours;
    the dir http://104.198.109.242/logs/PreCommit-HIVE-Build-8139/ was 
created less than 2 hours after 
http://104.198.109.242/logs/PreCommit-HIVE-Build-8138/

  * the patch matches with the one submitted in HIVE-18237
* closer look
  * 122-TestSpark
    the "completed executing" appears in 
http://104.198.109.242/logs/PreCommit-HIVE-Build-8139/execution.txt; but 
not in the actual build output console

  * 134-TestSpark
    the completed executing appears in both logs; with exactly the same 
timestamp


From the above, I suspect that somehow there are 2 "builds" executing 
the same set of tests.

but anyway...could someone please take a closer look?

cheers,
Zoltan



[jira] [Created] (HIVE-18238) Driver execution should not have configuration altering sideeffects

2017-12-06 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18238:
---

 Summary: Driver execution should not have configuration altering 
sideeffects 
 Key: HIVE-18238
 URL: https://issues.apache.org/jira/browse/HIVE-18238
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


{{Driver}} executes sql statements which use "hiveconf" settings;
but the {{Driver}} itself may *not* change the configuration...
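
A minimal sketch of the idea, not Hive's actual code: the executor works on a private copy of the settings, so whatever it changes while running a statement cannot leak back to the caller. {{ConfSnapshotDemo}}, {{runStatement}} and the plain {{Map}} standing in for the hiveconf are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfSnapshotDemo {

    /**
     * Hypothetical stand-in for a driver run: it may tweak settings while
     * executing, but only on its own private copy of the caller's conf.
     */
    static String runStatement(Map<String, String> callerConf, String sql) {
        // Defensive copy: every change below stays local to this execution.
        Map<String, String> localConf = new HashMap<>(callerConf);
        localConf.put("hive.stats.column.autogather", "false");
        return "executed [" + sql + "] with " + localConf.size() + " settings";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hive.stats.column.autogather", "true");
        Map<String, String> before = new HashMap<>(conf);

        runStatement(conf, "select 1");

        // Prints whether the caller's configuration survived the run unchanged.
        System.out.println(conf.equals(before));
    }
}
```

Whether a full copy per statement is affordable in the real {{Driver}} is an open question; the sketch only shows the contract.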





[jira] [Created] (HIVE-18237) missing results for insert_only table after DP insert

2017-12-06 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18237:
---

 Summary: missing results for insert_only table after DP insert
 Key: HIVE-18237
 URL: https://issues.apache.org/jira/browse/HIVE-18237
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


{code}
set hive.stats.column.autogather=false;

set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=200;
set hive.exec.max.dynamic.partitions=200;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

create table i0 (p int,v int);
insert into i0 values
(0,0),
(2,2),
(3,3);

create table p0 (v int) partitioned by (p int) stored as orc 
  tblproperties ("transactional"="true", 
"transactional_properties"="insert_only");

explain insert overwrite table p0 partition (p) select * from i0 where v < 3;
insert overwrite table p0 partition (p) select * from i0 where v < 3;
select count(*) from p0 where v!=1;
{code}

The table p0 should contain {{2}} rows at this point; but the result is {{0}}.

* seems to be specific to insert_only tables
* the existing data appears if an {{insert into}} is executed.






[jira] [Created] (HIVE-18235) Columnstats gather fails for insert_only table

2017-12-06 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18235:
---

 Summary: Columnstats gather fails for insert_only table
 Key: HIVE-18235
 URL: https://issues.apache.org/jira/browse/HIVE-18235
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Zoltan Haindrich



test: dp_counter_mm.q

at:
{code}
insert overwrite table src2 partition (value) select * from src where key < 100;
{code}

produces:
{code}
2017-12-06T02:39:54,447 DEBUG [d709e6e0-7573-4c79-bb38-b043a88a8dde main] metrics.PerfLogger: 
2017-12-06T02:39:54,447 DEBUG [d709e6e0-7573-4c79-bb38-b043a88a8dde main] metadata.Hive: NoSuchObjectException(message:Partition for which stats is gathered doesn't exist.)
    at org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:7644)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
    at com.sun.proxy.$Proxy52.updatePartitionColumnStatistics(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartitonColStats(HiveMetaStore.java:5340)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:6853)
    at sun.reflect.GeneratedMethodAccessor81.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
    at com.sun.proxy.$Proxy54.set_aggr_stats_for(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.setPartitionColumnStatistics(HiveMetaStoreClient.java:1748)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.setPartitionColumnStatistics(SessionHiveMetaStoreClient.java:374)
    at sun.reflect.GeneratedMethodAccessor80.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:211)
    at com.sun.proxy.$Proxy55.setPartitionColumnStatistics(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:4215)
    at org.apache.hadoop.hive.ql.stats.ColStatsProcessor.persistColumnStats(ColStatsProcessor.java:180)
    at org.apache.hadoop.hive.ql.stats.ColStatsProcessor.process(ColStatsProcessor.java:84)
    at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:108)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2230)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1882)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1613)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1358)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1346)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
{code}





[jira] [Created] (HIVE-18225) disaggreement in results; vector_decimal_mapjoin.q

2017-12-05 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18225:
---

 Summary: disaggreement in results; vector_decimal_mapjoin.q
 Key: HIVE-18225
 URL: https://issues.apache.org/jira/browse/HIVE-18225
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


One of these out files has changed for me... and it turned out that it's not 
clear which output should be the expected one, since we have different versions 
recorded :D

For the last query in {{vector_decimal_mapjoin.q}}, spark and clidriver give 
1 result, while llap returns a whole bunch of results.

{code}
$ tail `find ql -name vector_decimal_mapjoin.q.out`
==> ql/src/test/results/clientpositive/vector_decimal_mapjoin.q.out <==
PREHOOK: type: QUERY
PREHOOK: Input: default@t1_small
PREHOOK: Input: default@t2_small
#### A masked pattern was here ####
POSTHOOK: query: select t1_small.`dec`, t1_small.value_dec, t2_small.`dec`, 
t2_small.value_dec from t1_small join t2_small on 
(t1_small.`dec`=t2_small.`dec`)
POSTHOOK: type: QUERY
POSTHOOK: Input: default@t1_small
POSTHOOK: Input: default@t2_small
#### A masked pattern was here ####
89.00   15.09   89  15

==> ql/src/test/results/clientpositive/llap/vector_decimal_mapjoin.q.out <==
9.00    48.96   9   34
9.00    48.96   9   38
9.00    48.96   9   41
9.00    48.96   9   42
9.00    48.96   9   45
9.00    48.96   9   48
9.00    48.96   9   49
9.00    48.96   9   5
9.00    48.96   9   7
9.00    48.96   9   7

==> ql/src/test/results/clientpositive/spark/vector_decimal_mapjoin.q.out <==
PREHOOK: type: QUERY
PREHOOK: Input: default@t1_small
PREHOOK: Input: default@t2_small
#### A masked pattern was here ####
POSTHOOK: query: select t1_small.`dec`, t1_small.value_dec, t2_small.`dec`, 
t2_small.value_dec from t1_small join t2_small on 
(t1_small.`dec`=t2_small.`dec`)
POSTHOOK: type: QUERY
POSTHOOK: Input: default@t1_small
POSTHOOK: Input: default@t2_small
#### A masked pattern was here ####
89.00   15.09   89  15
{code}

I think the TestCliDriver based one can be "changed" to the other result by 
setting {{hive.stats.column.autogather=true}}.






[jira] [Created] (HIVE-18224) Introduce interface above driver

2017-12-05 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18224:
---

 Summary: Introduce interface above driver
 Key: HIVE-18224
 URL: https://issues.apache.org/jira/browse/HIVE-18224
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


Add an interface above driver; and use it outside of ql.
The goal is to enable the overlaying of the Driver with some strategy.
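
A rough sketch of what "overlaying the Driver with some strategy" could look like, under stated assumptions: {{QueryRunner}}, {{RetryOnceRunner}} and {{newRunner}} are hypothetical names, and the retry-once behaviour is just one example of a strategy. This is not the API introduced by the patch.

```java
public class DriverOverlayDemo {

    /** Hypothetical minimal interface; code outside ql would depend only on this. */
    interface QueryRunner {
        /** Runs a command and returns 0 on success, non-zero on failure. */
        int run(String command);
    }

    /** One possible overlay strategy: retry a failed command exactly once. */
    static class RetryOnceRunner implements QueryRunner {
        private final QueryRunner delegate;

        RetryOnceRunner(QueryRunner delegate) {
            this.delegate = delegate;
        }

        @Override
        public int run(String command) {
            int rc = delegate.run(command);
            return rc == 0 ? rc : delegate.run(command);
        }
    }

    /** Hypothetical factory: the only place that decides which implementation is used. */
    static QueryRunner newRunner(QueryRunner base, boolean retryOnFailure) {
        return retryOnFailure ? new RetryOnceRunner(base) : base;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for the concrete driver: fails on the first call, succeeds afterwards.
        QueryRunner flaky = cmd -> calls[0]++ == 0 ? 1 : 0;

        QueryRunner runner = newRunner(flaky, true);
        System.out.println("rc=" + runner.run("select 1") + " after " + calls[0] + " calls");
    }
}
```

Because callers only see the interface, the strategy can be swapped in the factory without touching code outside ql.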






checkstyle changes

2017-12-05 Thread Zoltan Haindrich

Hello,

I've filed a ticket to make the checkstyle warnings less noisy 
(https://issues.apache.org/jira/browse/HIVE-18222)


* set maxlinelength to 140
   I think everyone is working with big-enough displays to handle this :)
   There are many methods which have complicated names / arguments / 
   etc; breaking the lines more frequently hurts readability...
* disabled some restrictions, e.g. get/set methods are no longer 
   mandatory for protected/package fields


If you don't feel comfortable with these changes, please share your 
point of view.


cheers,
Zoltan




test executions

2017-12-05 Thread Zoltan Haindrich

Hello,

There seem to be some problems with the test executions... there was a 
job with a 7 hour runtime, and since that one finished, jobs take >2.5 hours 
to finish.


Some exceptions started appearing just before the 7 hour job was 
executed:

org.apache.hive.ptest.execution.AbortDroneException: Drone Drone [user=hiveptest, host=104.198.217.87, instance=1] exited with 255: SSHResult [command=bash /home/hiveptest/104.198.217.87-hiveptest-1/scratch/hiveptest-2-TestCliDriver-ppd_constant_where.q-drop_index_removes_partition_dirs.q-cbo_input26.q-and-27-more.sh, getExitCode()=255, getException()=null, getUser()=hiveptest, getHost()=104.198.217.87, getInstance()=1]
    at org.apache.hive.ptest.execution.HostExecutor.executeTestBatch(HostExecutor.java:266)
    at org.apache.hive.ptest.execution.HostExecutor.access$700(HostExecutor.java:57)
    at org.apache.hive.ptest.execution.HostExecutor$2.call(HostExecutor.java:175)
    at org.apache.hive.ptest.execution.HostExecutor$2.call(HostExecutor.java:159)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
    at java.lang.Thread.run(Thread.java:748)


Could someone take a look?

* the current job has been executing for 3 hours now:
  https://builds.apache.org/job/PreCommit-HIVE-Build/8108/
* the previous job which took 2.5 hours:
  https://builds.apache.org/job/PreCommit-HIVE-Build/8107/
* the 7hour job:
  https://builds.apache.org/job/PreCommit-HIVE-Build/8106/
* the first build which had this exception:
https://builds.apache.org/job/PreCommit-HIVE-Build/8105/consoleFull
* earlier builds seem to be normal..

regards,
Zoltan


Review Request 64334: HIVE-15939 boolean cast expressions

2017-12-05 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64334/
---

Review request for hive, Ashutosh Chauhan and Teddy Choi.


Bugs: HIVE-15939
https://issues.apache.org/jira/browse/HIVE-15939


Repository: hive-git


Description
---

* changes to make udf/vectorized cast expressions align with the serde 
counterparts


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
5c7d7eec8a54bff0193d769a559d2f3cbf49eef1 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToBoolean.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncStringToLong.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpression.java
 b5399d6ccf65ab11180e17320758f9a02085eded 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java 
d291e3659887f045b8274acb9502cd7ca44f6a75 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorMathFunctions.java
 fb791160c6b1c376335124b824941423322e1b0c 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java
 fb8035b6875cd39c6dc011a70b15d53c246d91f7 
  ql/src/test/queries/clientpositive/udf_to_boolean.q 
8bea7abcbc4d3b9c30d7deb6245cc9f55072875b 
  ql/src/test/queries/clientpositive/vector_udf_string_to_boolean.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/udf_to_boolean.q.out 
ebce364bf7c4f1277cb7843221502d024a79f0b0 
  ql/src/test/results/clientpositive/vector_empty_where.q.out 
609a36cb1a88ff9ff4ef9e1cfbee01c817fc12fb 
  ql/src/test/results/clientpositive/vector_udf_string_to_boolean.q.out 
PRE-CREATION 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java
 6a4733f989f0033c9f84248ff385698ca6d0eec8 
  
serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/primitive/TestPrimitiveObjectInspectorUtils.java
 9d86a5494a366568f151493794ab7404e3b205f6 


Diff: https://reviews.apache.org/r/64334/diff/1/


Testing
---


Thanks,

Zoltan Haindrich



[jira] [Created] (HIVE-18222) Update checkstyle rules to be less peeky

2017-12-05 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18222:
---

 Summary: Update checkstyle rules to be less peeky
 Key: HIVE-18222
 URL: https://issues.apache.org/jira/browse/HIVE-18222
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


There are a few issues with the current checkstyle.xml.

As long as the new checks are coming back red all the time, people will start 
to ignore these checks... so I think it would be better to make the checks less 
strict...

* set max linelength to 140; it looks like a more natural limit, because there 
are classnames which eat up line space pretty quickly... like: 
{{PrimitiveObjectInspector}} :) 
* make checkstyle.xml easily importable into an IDE (use {{config_loc}} instead 
of {{basedir}})
* suppress generated vectorized class errors






[jira] [Created] (HIVE-18185) update insert_values_orig_table_use_metadata.q.out

2017-11-30 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18185:
---

 Summary: update insert_values_orig_table_use_metadata.q.out
 Key: HIVE-18185
 URL: https://issues.apache.org/jira/browse/HIVE-18185
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


Some time back this test was skipped, and then put back in: HIVE-17076

However... possibly some other changes were also made around that time, which 
caused a 1 byte difference in totalSize; and after that some stats changes 
came by (HIVE-16827 / others)?






[jira] [Created] (HIVE-18184) qtests should set some hadoop_home if possible

2017-11-30 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18184:
---

 Summary: qtests should set some hadoop_home if possible
 Key: HIVE-18184
 URL: https://issues.apache.org/jira/browse/HIVE-18184
 Project: Hive
  Issue Type: Improvement
  Components: Tests
Reporter: Zoltan Haindrich


This exception is always present in the hive.log files which are produced 
during qtest executions...

{code}
java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
    at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:454) ~[hadoop-common-3.0.0-beta1.jar:?]
    at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:425) ~[hadoop-common-3.0.0-beta1.jar:?]
    at org.apache.hadoop.util.Shell.(Shell.java:502) [hadoop-common-3.0.0-beta1.jar:?]
    at org.apache.hadoop.util.StringUtils.(StringUtils.java:78) [hadoop-common-3.0.0-beta1.jar:?]
    at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1555) [hadoop-common-3.0.0-beta1.jar:?]
    at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:4150) [hive-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:4177) [hive-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:4384) [hive-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.conf.HiveConf.(HiveConf.java:4295) [hive-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.QTestUtil.(QTestUtil.java:541) [hive-it-util-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.cli.control.CoreCliDriver$1.invokeInternal(CoreCliDriver.java:65) [hive-it-util-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.cli.control.CoreCliDriver$1.invokeInternal(CoreCliDriver.java:61) [hive-it-util-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.util.ElapsedTimeLoggingWrapper.invoke(ElapsedTimeLoggingWrapper.java:33) [hive-it-util-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.cli.control.CoreCliDriver.beforeClass(CoreCliDriver.java:67) [hive-it-util-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:71) [hive-it-util-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
    at org.junit.rules.RunRules.evaluate(RunRules.java:20) [junit-4.11.jar:?]
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309) [junit-4.11.jar:?]
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369) [surefire-junit4-2.20.1.jar:2.20.1]
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275) [surefire-junit4-2.20.1.jar:2.20.1]
{code}
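
One way the test setup could avoid this, sketched under assumptions: set the {{hadoop.home.dir}} system property to some fallback directory when neither it nor {{HADOOP_HOME}} is set. The class {{HadoopHomeFallback}}, the method {{ensureHadoopHome}} and the choice of fallback directory are hypothetical; only the property and variable names come from the exception message above. Hadoop may still validate the directory it is pointed at, so this is a sketch, not a verified fix.

```java
import java.io.File;

public class HadoopHomeFallback {

    /**
     * Sets the hadoop.home.dir system property to the fallback directory, but
     * only when the property is unset and no HADOOP_HOME value is supplied.
     * The HADOOP_HOME value is passed in as a parameter to keep this testable.
     * Returns the value that is in effect afterwards.
     */
    static String ensureHadoopHome(String hadoopHomeEnv, File fallbackDir) {
        String current = System.getProperty("hadoop.home.dir");
        if (current != null) {
            return current; // already configured, leave it alone
        }
        if (hadoopHomeEnv != null && !hadoopHomeEnv.isEmpty()) {
            return hadoopHomeEnv; // the environment variable will be picked up
        }
        String path = fallbackDir.getAbsolutePath();
        System.setProperty("hadoop.home.dir", path);
        return path;
    }

    public static void main(String[] args) {
        // Hypothetical fallback location, chosen only for the demo.
        File fallback = new File(System.getProperty("java.io.tmpdir"), "qtest-hadoop-home");
        System.out.println(ensureHadoopHome(System.getenv("HADOOP_HOME"), fallback));
    }
}
```

This would have to run before the first Hadoop class is loaded (the trace shows the check is triggered while {{HiveConf}} is being initialized).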






[jira] [Created] (HIVE-18183) fix: beanutils.FluentPropertyBeanIntrospector

2017-11-30 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18183:
---

 Summary: fix: beanutils.FluentPropertyBeanIntrospector
 Key: HIVE-18183
 URL: https://issues.apache.org/jira/browse/HIVE-18183
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


I see this exception in the hive.log lately... I don't think it's severe, but 
it can be misleading...

{code}
2017-11-30T04:53:17,808  INFO [0a6aca14-c404-494e-b30b-a13f7ad19eb2 main] 
beanutils.FluentPropertyBeanIntrospector: Error when creating 
PropertyDescriptor for public final void org.apache.commons.configuration2.
AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)! Ignoring 
this property.
2017-11-30T04:53:17,808 DEBUG [0a6aca14-c404-494e-b30b-a13f7ad19eb2 main] 
beanutils.FluentPropertyBeanIntrospector: Exception is:
java.beans.IntrospectionException: bad write method arg count: public final 
void 
org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)
at 
java.beans.PropertyDescriptor.findPropertyType(PropertyDescriptor.java:657) 
~[?:1.8.0_151]
at 
java.beans.PropertyDescriptor.setWriteMethod(PropertyDescriptor.java:327) 
~[?:1.8.0_151]
at java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:139) 
~[?:1.8.0_151]
at 
org.apache.commons.beanutils.FluentPropertyBeanIntrospector.createFluentPropertyDescritor(FluentPropertyBeanIntrospector.java:178)
 ~[commons-beanutils-1.9.3.jar:1.9.3]
at 
org.apache.commons.beanutils.FluentPropertyBeanIntrospector.introspect(FluentPropertyBeanIntrospector.java:141)
 [commons-beanutils-1.9.3.jar:1.9.3]
at 
org.apache.commons.beanutils.PropertyUtilsBean.fetchIntrospectionData(PropertyUtilsBean.java:2245)
 [commons-beanutils-1.9.3.jar:1.9.3]
at 
org.apache.commons.beanutils.PropertyUtilsBean.getIntrospectionData(PropertyUtilsBean.java:2226)
 [commons-beanutils-1.9.3.jar:1.9.3]
at 
org.apache.commons.beanutils.PropertyUtilsBean.getPropertyDescriptor(PropertyUtilsBean.java:954)
 [commons-beanutils-1.9.3.jar:1.9.3]
at 
org.apache.commons.beanutils.PropertyUtilsBean.isWriteable(PropertyUtilsBean.java:1478)
 [commons-beanutils-1.9.3.jar:1.9.3]
at 
org.apache.commons.configuration2.beanutils.BeanHelper.isPropertyWriteable(BeanHelper.java:521)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.beanutils.BeanHelper.initProperty(BeanHelper.java:357)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.beanutils.BeanHelper.initBeanProperties(BeanHelper.java:273)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.beanutils.BeanHelper.initBean(BeanHelper.java:192)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.beanutils.BeanHelper$BeanCreationContextImpl.initBean(BeanHelper.java:669)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.beanutils.DefaultBeanFactory.initBeanInstance(DefaultBeanFactory.java:162)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.beanutils.DefaultBeanFactory.createBean(DefaultBeanFactory.java:116)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.beanutils.BeanHelper.createBean(BeanHelper.java:459)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.beanutils.BeanHelper.createBean(BeanHelper.java:479)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.beanutils.BeanHelper.createBean(BeanHelper.java:492)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.builder.BasicConfigurationBuilder.createResultInstance(BasicConfigurationBuilder.java:447)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.builder.BasicConfigurationBuilder.createResult(BasicConfigurationBuilder.java:417)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.commons.configuration2.builder.BasicConfigurationBuilder.getConfiguration(BasicConfigurationBuilder.java:285)
 [commons-configuration2-2.1.1.jar:2.1.1]
at 
org.apache.hadoop.metrics2.impl.MetricsConfig.loadFirst(MetricsConfig.java:119) 
[hadoop-common-3.0.0-beta1.jar:?]
at 
org.apache.hadoop.metrics2.impl.MetricsConfig.create(MetricsConfig.java:98) 
[hadoop-common-3.0.0-beta1.jar:?]
at 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configure(MetricsSystemImpl.java:478)
 [hadoop-common-3.0.0-beta1.jar:?]
at 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:188)
 [hadoop-common-3.0.0-beta1.jar:?]
at 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:163)
 [hadoop-common-3.0.0-beta1.jar:?]
at 
org.apache.hadoop.metrics2

[jira] [Created] (HIVE-18182) execution of semijoin_hint.q breaks hybridgrace_hashjoin_2.q

2017-11-30 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18182:
---

 Summary: execution of semijoin_hint.q breaks 
hybridgrace_hashjoin_2.q
 Key: HIVE-18182
 URL: https://issues.apache.org/jira/browse/HIVE-18182
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich



{code}

M_OPTS+=" -q "
M_OPTS+=" -Pitests -DskipSparkTests"
M_OPTS+=" -pl itests/qtest"
M_OPTS+=" -Dtest=TestMiniLlapLocalCliDriver"
M_OPTS+=" -Dqfile=semijoin_hint.q,hybridgrace_hashjoin_2.q"
M_OPTS+=" install"

mvn $M_OPTS -DfailIfNoTests 
{code}





[jira] [Created] (HIVE-18178) Column stats are not autogathered for materialized views

2017-11-29 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18178:
---

 Summary: Column stats are not autogathered for materialized views
 Key: HIVE-18178
 URL: https://issues.apache.org/jira/browse/HIVE-18178
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Zoltan Haindrich


HIVE-18163 will fix basic stats collection; but even with that fix, the column 
stats are not collected





Re: Review Request 64122: HIVE-18163 Stats: create materialized view should also collect stats

2017-11-29 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64122/
---

(Updated Nov. 29, 2017, 1:13 p.m.)


Review request for hive, Ashutosh Chauhan and Jesús Camacho Rodríguez.


Changes
---

update to patch#2


Bugs: HIVE-18163
https://issues.apache.org/jira/browse/HIVE-18163


Repository: hive-git


Description
---

* collect stats for {{create materialized view}} as well; AFAIK it's not 
possible to do an update on a materialized view; so every materialized view 
operation can be considered as a rewrite w.r.t. stats
* added a small collection to delay the construction of the view objects in 
`MaterializedViewRegistry`; the reason this was needed is:
 * `StatsTask` runs after `DDLTask` :
  * `DDLTask` invoked  `MaterializedViewRegistry` to put the view in 
cache
   * `StatsTask` filled out the basicStats info in the metastore...
  * next query used the *out-dated* cached table object (which was 
available at the time {{MaterializedViewRegistry}} built the scanner).
* I've rerun all the "materialized_view" tests, and the results look good to me
* in materialized_view_create_rewrite_2.q.out the usage of {{cmv_mat_view_5}} 
appeared; which looks good to me


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/QueryLifeTimeHookRunner.java 
85e038ce36420859c5e42e081b88065cdad811f5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
17640f3396678cec2732bb29033eecd6e8d8db71 
  
ql/src/java/org/apache/hadoop/hive/ql/hooks/MaterializedViewRegistryUpdateHook.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
 51b6ef58fc196be716c4b07287fbc83503d1df50 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java 
26bb3e17074bf03cb0fc67c3983252998ab23d4d 
  ql/src/test/queries/clientpositive/materialized_view_create.q 
bb50dbb6a1816005873ea51ae193a45d418c99e4 
  
ql/src/test/results/clientpositive/beeline/materialized_view_create_rewrite.q.out
 aa3240cad4da90b2146e330884e7223708ed20a3 
  ql/src/test/results/clientpositive/llap/materialized_view_create.q.out 
928618390d2ea0254839013f46b8be28a2ff5a54 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite.q.out 
8bebab4ef036b4750da9e00cccf0d0f4ed9c53e8 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_2.q.out
 83ab7429e40ff9a3313430917e64a1991c47c4df 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_multi_db.q.out
 e1357853631ab369bb79a17911c2e5df0b1e9ac7 
  ql/src/test/results/clientpositive/llap/materialized_view_describe.q.out 
2be1536453843d10644948f6ac7100088ee52e5c 
  ql/src/test/results/clientpositive/materialized_view_create.q.out 
928618390d2ea0254839013f46b8be28a2ff5a54 
  ql/src/test/results/clientpositive/materialized_view_create_rewrite.q.out 
aa3240cad4da90b2146e330884e7223708ed20a3 
  ql/src/test/results/clientpositive/materialized_view_create_rewrite_2.q.out 
c4bee9c63d4e15181a00af5c7c9e6dbbec443fd9 
  ql/src/test/results/clientpositive/materialized_view_create_rewrite_3.q.out 
9fd70b69371158003284bc55d5965c129bf412a5 
  
ql/src/test/results/clientpositive/materialized_view_create_rewrite_multi_db.q.out
 a6d00db0f76e0b61ba03b7534cbd8d51dedf5381 
  ql/src/test/results/clientpositive/materialized_view_describe.q.out 
2be1536453843d10644948f6ac7100088ee52e5c 


Diff: https://reviews.apache.org/r/64122/diff/2/

Changes: https://reviews.apache.org/r/64122/diff/1-2/


Testing
---


Thanks,

Zoltan Haindrich



[jira] [Created] (HIVE-18165) Differentiate table level stat / operator level stats

2017-11-28 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18165:
---

 Summary: Differentiate table level stat / operator level stats
 Key: HIVE-18165
 URL: https://issues.apache.org/jira/browse/HIVE-18165
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


there is some confusion here and there...most probably because the 
following cases are both valid:

* the full row size of a table is x bytes
* the actually used field sum from a table is x bytes

there is another axis:

* online: including object headers/etc (string value "asd": >100 bytes)
* offline: just raw data (4 bytes)







Review Request 64122: HIVE-18163 Stats: create materialized view should also collect stats

2017-11-28 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64122/
---

Review request for hive, Ashutosh Chauhan and Jesús Camacho Rodríguez.


Bugs: HIVE-18163
https://issues.apache.org/jira/browse/HIVE-18163


Repository: hive-git


Description
---

* collect stats for {{create materialized view}} as well; AFAIK it's not 
possible to do an update on a materialized view; so every materialized view 
operation can be considered as a rewrite w.r.t. stats
* added a small collection to delay the construction of the view objects in 
`MaterializedViewRegistry`; the reason this was needed is:
 * `StatsTask` runs after `DDLTask` :
  * `DDLTask` invoked  `MaterializedViewRegistry` to put the view in 
cache
   * `StatsTask` filled out the basicStats info in the metastore...
  * next query used the *out-dated* cached table object (which was 
available at the time {{MaterializedViewRegistry}} built the scanner).
* I've rerun all the "materialized_view" tests, and the results look good to me
* in materialized_view_create_rewrite_2.q.out the usage of {{cmv_mat_view_5}} 
appeared; which looks good to me


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
 51b6ef58fc196be716c4b07287fbc83503d1df50 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BasicStatsWork.java 
26bb3e17074bf03cb0fc67c3983252998ab23d4d 
  ql/src/test/queries/clientpositive/materialized_view_create.q 
bb50dbb6a1816005873ea51ae193a45d418c99e4 
  ql/src/test/results/clientpositive/llap/materialized_view_create.q.out 
928618390d2ea0254839013f46b8be28a2ff5a54 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite.q.out 
8bebab4ef036b4750da9e00cccf0d0f4ed9c53e8 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_2.q.out
 83ab7429e40ff9a3313430917e64a1991c47c4df 
  
ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_multi_db.q.out
 e1357853631ab369bb79a17911c2e5df0b1e9ac7 
  ql/src/test/results/clientpositive/llap/materialized_view_describe.q.out 
2be1536453843d10644948f6ac7100088ee52e5c 
  ql/src/test/results/clientpositive/materialized_view_create.q.out 
928618390d2ea0254839013f46b8be28a2ff5a54 
  ql/src/test/results/clientpositive/materialized_view_create_rewrite.q.out 
aa3240cad4da90b2146e330884e7223708ed20a3 
  ql/src/test/results/clientpositive/materialized_view_create_rewrite_2.q.out 
c4bee9c63d4e15181a00af5c7c9e6dbbec443fd9 
  ql/src/test/results/clientpositive/materialized_view_create_rewrite_3.q.out 
9fd70b69371158003284bc55d5965c129bf412a5 
  
ql/src/test/results/clientpositive/materialized_view_create_rewrite_multi_db.q.out
 a6d00db0f76e0b61ba03b7534cbd8d51dedf5381 
  ql/src/test/results/clientpositive/materialized_view_describe.q.out 
2be1536453843d10644948f6ac7100088ee52e5c 


Diff: https://reviews.apache.org/r/64122/diff/1/


Testing
---


Thanks,

Zoltan Haindrich



[jira] [Created] (HIVE-18163) stats: create materialized view should also collect stats

2017-11-28 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18163:
---

 Summary: stats: create materialized view should also collect stats
 Key: HIVE-18163
 URL: https://issues.apache.org/jira/browse/HIVE-18163
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Zoltan Haindrich


not having basic stats on the materialized view may cause it to be "ruled out" 
as a viable alternative by the cbo.

repro: add {{set hive.stats.deserialization.factor=10.0}} to 
{{ql/src/test/queries/clientpositive/materialized_view_create_rewrite.q}}

blocks: HIVE-18149





[jira] [Created] (HIVE-18162) Investigate bucketed table stats

2017-11-28 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18162:
---

 Summary: Investigate bucketed table stats
 Key: HIVE-18162
 URL: https://issues.apache.org/jira/browse/HIVE-18162
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Zoltan Haindrich


something is very off in: 
ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out

* 170 bytes of data; 1 row
* tablescan(d) reacts to deser factor changes
* tablescan(b) seems to ignore deser factor changes; however, they scan the 
same table..
* these might all be a consequence of the fact that the ON condition of the 
table with alias {{d}} is probably mistyped, which causes: 

{code}
 Warning: Map Join MAPJOIN[31][bigTable=?] in task 'Stage-3:MAPRED' is a cross 
product
{code}







[jira] [Created] (HIVE-18161) Remove hive.stats.atomic

2017-11-28 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18161:
---

 Summary: Remove hive.stats.atomic
 Key: HIVE-18161
 URL: https://issues.apache.org/jira/browse/HIVE-18161
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Zoltan Haindrich


It might have had its purpose back when it was introduced; but currently 
enabling this property may at best prevent the stats collector from working 
properly :)

And moreover: {{hive.stats.reliable}} is a very similar prop; and it would be 
better to use only one property which enables correctness checks.







ptests are broken

2017-11-27 Thread Zoltan Haindrich

Hello,

There are some issues with the test execution at 
https://builds.apache.org/job/PreCommit-HIVE-Build/
Last Friday I talked with Peter Vary about it; and he was able to 
restart it ... it looked better for a few hours :)


I also see some rsync errors currently; so the issue might be 
caused by some network glitch.


rsync: connection unexpectedly closed (3845539868 bytes received so far) 
[receiver]


rsync error: error in rsync protocol data stream (code 12) at io.c(226) 
[receiver=3.1.1]
rsync: connection unexpectedly closed (894 bytes received so far) [generator]

If someone could look into it...I would really appreciate it - I'm starting to 
lose the thread of why I was pushing those tickets :)

cheers,
Zoltan



[jira] [Created] (HIVE-18149) Stats: rownum estimation from datasize underestimates in most cases

2017-11-27 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18149:
---

 Summary: Stats: rownum estimation from datasize underestimates in 
most cases
 Key: HIVE-18149
 URL: https://issues.apache.org/jira/browse/HIVE-18149
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


rownum estimation is currently based on the following:

* datasize is taken from the following sources:
** basicstats aggregates the loaded "on-heap" row sizes; other readers are 
able to give a "raw size" estimation - I've checked orc; but I'm sure others will 
do the same...the api docs are a bit vague about the method's purpose...
** if the basicstats level info is not available; the filesystem level 
"file-size-sums" are used as the "raw data size"; which is multiplied by the 
[deserialization 
ratio|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L261]
 ; which is currently 1.

the problem with all of this is that the deser factor is 1; and that the rowsize 
includes the online object headers..

example; 20 rows are loaded into a partition 
[columnstats_partlvl_dp.q|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L7]

after HIVE-18108 [this 
explain|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L25]
 will estimate the rowsize of the table to be 404 bytes; however the 20 rows of 
text is only 169 bytes...so it ends up with 0 rows...
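
To make the arithmetic above concrete, here is a minimal Java sketch of a datasize-based row count estimate. It is not the code in StatsUtils: the class and method names are hypothetical, and it only assumes that the estimate is the scaled data size divided by the estimated row size, using integer division.

```java
// Hypothetical sketch, not Hive's StatsUtils: shows how integer division
// turns a small data size into a row count of 0.
final class RowCountEstimateSketch {

    // rawDataSize: bytes of raw data; deserFactor: assumed multiplier applied
    // to the raw size; avgRowSize: estimated bytes per row (must be > 0).
    static long estimateRows(long rawDataSize, float deserFactor, long avgRowSize) {
        long dataSize = (long) (rawDataSize * deserFactor);
        return dataSize / avgRowSize; // integer division truncates toward zero
    }

    public static void main(String[] args) {
        // Numbers from this ticket: 169 bytes of text, 404 bytes estimated row size.
        System.out.println(estimateRows(169L, 1.0f, 404L));
        // Same input with an assumed deser factor of 10.
        System.out.println(estimateRows(169L, 10.0f, 404L));
    }
}
```

With the numbers from this ticket, 169 / 404 truncates to 0. Under the same assumptions a factor of 10 gives 1690 / 404 = 4, which is still well below the 20 rows actually loaded.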






[jira] [Created] (HIVE-18141) Fix StatsUtils.combineRange to combine intervals

2017-11-23 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18141:
---

 Summary: Fix StatsUtils.combineRange to combine intervals
 Key: HIVE-18141
 URL: https://issues.apache.org/jira/browse/HIVE-18141
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


the current [combinedRange 
implementation|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1984]
 "combines" only ranges which contain each other

but the comments suggest that the intention was to capture the case when the 2 
intervals overlap; this can be checked with the following testcase:

{code}
  @Test
  public void test11() {
    Range r1 = new Range(0, 1);
    Range r2 = new Range(1, 11);
    Range r3 = StatsUtils.combineRange(r1, r2);
    assertNotNull(r3);
  }
{code}
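
For illustration only, a minimal sketch of what combining overlapping intervals could look like. The {{Interval}} class and {{combine}} method below are hypothetical stand-ins, not Hive's {{Range}} or {{StatsUtils.combineRange}}; closed intervals with integer bounds are assumed.

```java
// Hypothetical sketch, not Hive code: combine two closed intervals when they
// overlap (containment is just a special case of overlap).
final class IntervalCombineSketch {

    static final class Interval {
        final long min;
        final long max;

        Interval(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    // Returns the smallest interval covering both inputs if they share at
    // least one point; returns null if they are disjoint.
    static Interval combine(Interval a, Interval b) {
        boolean overlap = a.min <= b.max && b.min <= a.max;
        if (!overlap) {
            return null;
        }
        return new Interval(Math.min(a.min, b.min), Math.max(a.max, b.max));
    }

    public static void main(String[] args) {
        Interval r = combine(new Interval(0, 1), new Interval(1, 11));
        System.out.println(r.min + ".." + r.max);
    }
}
```

Under these assumptions the two ranges from the testcase, which touch at 1, combine into 0..11 instead of yielding null.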






[jira] [Created] (HIVE-18140) Partitioned tables statistics can go wrong in basic stats mixed case

2017-11-23 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18140:
---

 Summary: Partitioned tables statistics can go wrong in basic stats 
mixed case
 Key: HIVE-18140
 URL: https://issues.apache.org/jira/browse/HIVE-18140
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


suppose the following scenario:

* part1 has basic stats {{RC=10,DS=1K}}
* all other partitions have no basic stats (and a bunch of rows)

then 
[this|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L378]
 condition would be false; which in turn produces the following estimate for 
the whole partitioned table: {{RC=10,DS=1K}}
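
A small sketch of the scenario, assuming per-partition row counts where a marker value means "no basic stats". The names and the marker are hypothetical and this is not the StatsUtils logic; it only contrasts an aggregation that silently skips partitions without stats with one that reports the total as unknown.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical sketch, not Hive code: aggregating per-partition row counts
// when some partitions have no basic stats.
final class PartitionRowCountSketch {

    // Assumed marker meaning "no basic stats recorded for this partition".
    static final long MISSING = -1L;

    // Naive aggregation: silently skips partitions without stats.
    static long naiveSum(List<Long> rowCounts) {
        long sum = 0;
        for (long rc : rowCounts) {
            if (rc != MISSING) {
                sum += rc;
            }
        }
        return sum;
    }

    // Stricter aggregation: reports "unknown" if any partition lacks stats,
    // so the caller can fall back to a size-based estimate instead.
    static OptionalLong strictSum(List<Long> rowCounts) {
        long sum = 0;
        for (long rc : rowCounts) {
            if (rc == MISSING) {
                return OptionalLong.empty();
            }
            sum += rc;
        }
        return OptionalLong.of(sum);
    }

    public static void main(String[] args) {
        List<Long> parts = Arrays.asList(10L, MISSING, MISSING);
        System.out.println(naiveSum(parts));
        System.out.println(strictSum(parts).isPresent());
    }
}
```

With one partition at RC=10 and the rest missing, the naive sum reports 10 rows for the whole table, which mirrors the problem described above.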






[jira] [Created] (HIVE-18139) spark may miss results in case column stats are gathered

2017-11-23 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18139:
---

 Summary: spark may miss results in case column stats are gathered
 Key: HIVE-18139
 URL: https://issues.apache.org/jira/browse/HIVE-18139
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


add {{set hive.stats.column.autogather=true;}} at the beginning of 
{{ql/src/test/queries/clientpositive/auto_sortmerge_join_13.q}}  to repro.





[jira] [Created] (HIVE-18138) Fix columnstats problem in case schema evolution

2017-11-23 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18138:
---

 Summary: Fix columnstats problem in case schema evolution
 Key: HIVE-18138
 URL: https://issues.apache.org/jira/browse/HIVE-18138
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


column stats are kept when the main table schema is altered; and this causes 
all kinds of problems.





[jira] [Created] (HIVE-18137) Schema evolution: newly inserted column value in pre-existing partition is masked to null

2017-11-23 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18137:
---

 Summary: Schema evolution: newly inserted column value in 
pre-existing partition is masked to null
 Key: HIVE-18137
 URL: https://issues.apache.org/jira/browse/HIVE-18137
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich



{code}
set hive.explain.user=false;
set hive.fetch.task.conversion=none;
set hive.mapred.mode=nonstrict;
set hive.cli.print.header=true;
SET hive.exec.schema.evolution=true;
SET hive.vectorized.use.vectorized.input.format=true;
SET hive.vectorized.use.vector.serde.deserialize=false;
SET hive.vectorized.use.row.serde.deserialize=false;
SET hive.vectorized.execution.enabled=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.metastore.disallow.incompatible.col.type.changes=true;
set hive.default.fileformat=textfile;
set hive.llap.io.enabled=false;

CREATE TABLE part_add_int_permute_select(insert_num int, a INT, b STRING) 
PARTITIONED BY(part INT);

insert into table part_add_int_permute_select partition(part=1) VALUES (1, 
, 'new');

alter table part_add_int_permute_select add columns(c int);

insert into table part_add_int_permute_select partition(part=1) VALUES (2, 
, 'new', );

select insert_num,part,a,b,c from part_add_int_permute_select;
{code}

results for the last select:
{code}
1  1   new NULL
2  1   new NULL
{code}

I think the following result should be expected:
{code}
1  1   new NULL
2  1   new 
{code}






[jira] [Created] (HIVE-18114) Fix LazySimpleDeserializeRead failed precondition

2017-11-21 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18114:
---

 Summary: Fix LazySimpleDeserializeRead failed precondition
 Key: HIVE-18114
 URL: https://issues.apache.org/jira/browse/HIVE-18114
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


Exception:

{code}
[...]
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row 
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:919)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
DeserializeRead detail: Reading byte[] of length 142 at start offset 0 for 
length 42 to read 3 fields with types [int, string, 
struct<c1:boolean,c2:tinyint,c3:smallint,c4:int,c5:bigint,c6:float,c7:double,c8:decimal(38,18),c9:char(25),c10:varchar(25),c11:timestamp,c12:date,c13:binary>].  
Read field #2 at field start position 11 for field length 31
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:881)
... 19 more
Caused by: java.lang.IllegalStateException
at 
com.google.common.base.Preconditions.checkState(Preconditions.java:133)
at 
org.apache.hadoop.hive.serde2.lazy.fast.LazySimpleDeserializeRead.readComplexField(LazySimpleDeserializeRead.java:1060)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeComplexFieldRowColumn(VectorDeserializeRow.java:748)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeStructRowColumn(VectorDeserializeRow.java:856)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeRowColumn(VectorDeserializeRow.java:919)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(VectorDeserializeRow.java:1332)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:876)
... 19 more
{code}

reproduce: 

{code}
# apply attached patch (changes qtest)
mvn install  -Pitests -pl itests/qtest 
-Dtest=Test*CliDriver#*[schema_evol_text_vec_part_all_complex]   
-DskipSparkTests -q -DinitScript=asd.sql -Dtest.output.overwrite  -am
{code}







[jira] [Created] (HIVE-18113) Remove mixed partitions/table schema support

2017-11-21 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18113:
---

 Summary: Remove mixed partitions/table schema support
 Key: HIVE-18113
 URL: https://issues.apache.org/jira/browse/HIVE-18113
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


Currently it is possible to have misaligned table/partition schemas;

[see here for example| 
https://github.com/apache/hive/blob/a5c2e15c7cc125d8cda2ee3a8ed64c116ff6b755/ql/src/test/queries/clientpositive/schema_evol_text_vec_part.q#L156]

the result of [this insert 
statement|https://github.com/apache/hive/blob/a5c2e15c7cc125d8cda2ee3a8ed64c116ff6b755/ql/src/test/queries/clientpositive/schema_evol_text_vec_part.q#L162]
 is these [null 
values|https://github.com/apache/hive/blob/a5c2e15c7cc125d8cda2ee3a8ed64c116ff6b755/ql/src/test/results/clientpositive/llap/schema_evol_text_vec_part.q.out#L660]

This mixed partition setup can cause the stats aggregation to become quite 
confusing...I think it would be better to remove this thing...there is a 
{{CASCADE}} flag already; which changes the schema all over the 
table/partitions/etc.






[jira] [Created] (HIVE-18108) in case basic stats are missing; rowcount estimation depends on the select columns size

2017-11-20 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18108:
---

 Summary: in case basic stats are missing; rowcount estimation 
depends on the select columns size
 Key: HIVE-18108
 URL: https://issues.apache.org/jira/browse/HIVE-18108
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


in case basicstats are not available (especially rowcount):

{code}
set hive.stats.autogather=false;
create table t (a integer, b string);

insert into t values (1,'asd1');
insert into t values (2,'asd2');
insert into t values (3,'asd3');
insert into t values (4,'asd4');
insert into t values (5,'asd5');

explain select a,count(1) from t group by a;
-- estimated to read 8 rows from table t
explain select b,count(1) from t group by b;
-- estimated: 1 rows
explain select a,b,count(1) from t group by a,b;
-- estimated: 1 rows
{code}

the estimate should not depend on the set of columns actually selected.





[jira] [Created] (HIVE-18105) Aggregation of an empty set doesn't pass constants to the UDAF

2017-11-20 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18105:
---

 Summary: Aggregation of an empty set doesn't pass constants to the 
UDAF
 Key: HIVE-18105
 URL: https://issues.apache.org/jira/browse/HIVE-18105
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


the groupbyoperator's logic for the first row passes {{null}} for all parameters; 
see 
[here|https://github.com/apache/hive/blob/39d46e8af5a3794f7395060b890f94ddc84516e7/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java#L1116]

this could obstruct {{compute_stats}} operations because it has a constant 
argument.





[jira] [Created] (HIVE-18092) ColumnStats autogather may only run if BasicStats are being collected

2017-11-17 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18092:
---

 Summary: ColumnStats autogather may only run if BasicStats are 
being collected
 Key: HIVE-18092
 URL: https://issues.apache.org/jira/browse/HIVE-18092
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


currently; if column.stats.autogather is enabled; the hbase tests run into 
an exception which arises from the fact that basic stats are not collected on 
that table...





[jira] [Created] (HIVE-18063) Make CommandProcessorResponse an exception instead of a return class

2017-11-14 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18063:
---

 Summary: Make CommandProcessorResponse an exception instead of a 
return class
 Key: HIVE-18063
 URL: https://issues.apache.org/jira/browse/HIVE-18063
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


the usage pattern of the {{CommandProcessorResponse}} class suggests that its 
current role is closer to Exceptions than to return values.





Re: Review Request 63442: HIVE-17934 Merging Statistics are promoted to COMPLETE (most of the time)

2017-11-14 Thread Zoltan Haindrich


> On Nov. 9, 2017, 7:51 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/llap/auto_sortmerge_join_12.q.out
> > Line 160 (original), 160 (patched)
> > <https://reviews.apache.org/r/63442/diff/2/?file=1886244#file1886244line160>
> >
> > bucket_small has no stats gathered. This should be NONE.
> 
> Zoltan Haindrich wrote:
> `hive.stats.autogather` is enabled by default from `HiveConf`
> 
> Ashutosh Chauhan wrote:
> Those are load statements, not inserts. We don't gather stats with load 
> statements, only with inserts.
> 
> Zoltan Haindrich wrote:
> sorry, you are right: basic stats are not gathered in this case in any 
> way.
> 
> But the stat state is complete; because: there is logic which scans the 
> file sizes - to calculate the datasizes; and from there HIVE-16811 can guess 
> some row counts
> 
> 
> https://github.com/kgyrtkirk/hive/blob/9f67a878512117eb5c251794adc1a91bae62fea7/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L386-L393
> 
> First I would like to make the standalone table/partitioned table 
> calculations a bit more similar to each other
> 
> I've tried to come up with some definitions for NONE/PARTIAL/COMPLETE; 
> currently I would say the following:
> 
> * NONE: not known
>   * on table: no information (afaik currently this can't happen)
>   * estimation tree: all nodes in the estimation tree were NONE
> * PARTIAL:
>   * on table: the current information is estimated from data size
>   * estimation tree: contains at least one NONE/PARTIAL
> * COMPLETE:
>   * current information is correct (calculated by statstask-s)
>   * estimation tree: the whole subtree has COMPLETE status
> 
> If I use these definitions; then I would say that the filesystem size 
> based estimation should be considered PARTIAL.
> 
> Ashutosh Chauhan wrote:
> Definitions sound good. Let's use them to make sure our state calculation 
> logic is built on them.
> Can you also add this in code comments?

I've opened HIVE-18062 to address these problems
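
As a rough illustration of the estimation-tree part of the definitions quoted above, here is a minimal Java sketch. The enum and the {{merge}} method are hypothetical and standalone, not Hive's Statistics class. Where the quoted definitions overlap (a tree of only NONE nodes also "contains at least one NONE"), the sketch assumes the all-NONE rule is checked first.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the NONE/PARTIAL/COMPLETE rules discussed in this
// thread; it is not Hive's implementation.
final class StatStateMergeSketch {

    enum State { NONE, PARTIAL, COMPLETE }

    // Derives the state of a parent node from the states of its children:
    // all NONE -> NONE, all COMPLETE -> COMPLETE, anything else -> PARTIAL.
    // An empty child list is treated as NONE (an assumption of this sketch).
    static State merge(List<State> children) {
        boolean allNone = true;
        boolean allComplete = true;
        for (State s : children) {
            if (s != State.NONE) {
                allNone = false;
            }
            if (s != State.COMPLETE) {
                allComplete = false;
            }
        }
        if (allNone) {
            return State.NONE;
        }
        if (allComplete) {
            return State.COMPLETE;
        }
        return State.PARTIAL;
    }

    public static void main(String[] args) {
        System.out.println(merge(Arrays.asList(State.COMPLETE, State.NONE)));
    }
}
```

Under these rules a single child without known stats is enough to demote an otherwise COMPLETE subtree to PARTIAL.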


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63442/#review190633
---


On Nov. 9, 2017, 5:39 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63442/
> ---
> 
> (Updated Nov. 9, 2017, 5:39 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-17934
> https://issues.apache.org/jira/browse/HIVE-17934
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * remove the reactive stat state guessing method
> * make the guessing only work when a new object is created
> * change the way stat objects are merged
> 
> this patch will most probably break almost all qtest outputs
> 
> 
> Diffs
> -
> 
>   accumulo-handler/src/test/results/positive/accumulo_queries.q.out 
> b3adf4e504 
>   hbase-handler/src/test/results/positive/hbase_queries.q.out b2eda12e95 
>   hbase-handler/src/test/results/positive/hbasestats.q.out 29eefd43a9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
>  7a3fae65e8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
>  a4f60accce 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/Statistics.java 8ffb4ce44b 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ce7c96c639 
>   ql/src/test/queries/clientpositive/lateral_view_onview2.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/stats_empty_partition2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/acid_table_stats.q.out 351ff0da0a 
>   ql/src/test/results/clientpositive/alterColumnStatsPart.q.out 858e16fe22 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out 3a94a6a4e3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 7875e9693a 
>   ql/src/test/results/clientpositive/cbo_const.q.out e9f885b363 
>   ql/src/test/results/clientpositive/cbo_input26.q.out 77fc194829 
>   ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out 414b715b7a 
>   ql/src/test/results/clientpositive/columnstats_quoting.q.out 683c1e274f 
>   ql/src/test/results/clientpositive/columnstats_tbllvl.q.out a2c6ead293 
>   ql/src/test/results/clientpositive/constGby.q.out c633624935 
>   ql/src/test/results/cl

[jira] [Created] (HIVE-18062) Revise basic stat states for estimations

2017-11-14 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18062:
---

 Summary: Revise basic stat states for estimations
 Key: HIVE-18062
 URL: https://issues.apache.org/jira/browse/HIVE-18062
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Zoltan Haindrich


basic stat states might be misleading, because currently estimations also get 
the *COMPLETE* qualifier in most cases.

proposed definitions for the states:

* {{NONE}}
   ** on table: no information (afaik currently this can't happen)
   ** estimation tree: all nodes in the estimation tree were NONE
* {{PARTIAL}}:
   ** on table: the current information is estimated from data size
   ** estimation tree: contains at least one NONE/PARTIAL; and at least 1 
PARTIAL
* {{COMPLETE}}:
   ** current information is correct (calculated by statstask-s)
   ** estimation tree: the whole subtree has COMPLETE status

document/change states to comply with the above definitions.

followup of HIVE-17934
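
A minimal sketch of how the proposed definitions could be applied when combining the states of an estimation tree. The class name, the enum and the merge method are hypothetical and are not taken from the Hive code base. Two points are assumptions beyond the text above: a mix of NONE and COMPLETE children is treated as PARTIAL, and an empty child list is treated as NONE.

```java
import java.util.Arrays;
import java.util.List;

public class StatStateSketch {

    // Hypothetical enum mirroring the three states named in the proposal.
    public enum State { NONE, PARTIAL, COMPLETE }

    // Combines the states of the child nodes of an estimation tree.
    public static State merge(List<State> children) {
        if (children.isEmpty()) {
            return State.NONE; // assumption: nothing known without inputs
        }
        boolean allNone = true;
        boolean allComplete = true;
        for (State s : children) {
            if (s != State.NONE) {
                allNone = false;
            }
            if (s != State.COMPLETE) {
                allComplete = false;
            }
        }
        if (allNone) {
            return State.NONE;     // all nodes in the tree were NONE
        }
        if (allComplete) {
            return State.COMPLETE; // the whole subtree has COMPLETE status
        }
        // Any other mix is PARTIAL (assumption for NONE + COMPLETE).
        return State.PARTIAL;
    }

    public static void main(String[] args) {
        System.out.println(merge(Arrays.asList(State.COMPLETE, State.PARTIAL)));
        System.out.println(merge(Arrays.asList(State.COMPLETE, State.COMPLETE)));
    }
}
```

Under these definitions a row count guessed from the file system data size would enter the tree as PARTIAL, so any plan built on top of it could not end up COMPLETE.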




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-18061) q.outs: be more selective with masking hdfs paths

2017-11-14 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18061:
---

 Summary: q.outs: be more selective with masking hdfs paths
 Key: HIVE-18061
 URL: https://issues.apache.org/jira/browse/HIVE-18061
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich



currently any line which contains a path which looks like an hdfs location is 
replaced with a "masked pattern was here"...

it might be relevant to record these messages; since even an exception message 
might contain an hdfs location

noticed in
HIVE-18012
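
To illustrate the difference, here is a rough sketch that contrasts masking the whole line with masking only the path. Everything in it is hypothetical: the class, the regular expression and the placeholder strings are made up for the example and do not reflect how the q.out masking is actually implemented in the test framework.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PathMaskSketch {

    // Hypothetical pattern: an hdfs:// URI up to the next whitespace or quote.
    private static final Pattern HDFS_PATH =
        Pattern.compile("hdfs://[^\\s'\"]+");

    // Placeholder strings chosen for this example only.
    public static final String LINE_MASK = "#### A masked pattern was here ####";
    public static final String PATH_MASK = "### HDFS PATH ###";

    // Behaviour described in the issue: the whole line is replaced.
    public static String maskWholeLine(String line) {
        return HDFS_PATH.matcher(line).find() ? LINE_MASK : line;
    }

    // More selective alternative: only the path itself is replaced,
    // so the rest of the message stays in the recorded output.
    public static String maskPathOnly(String line) {
        return HDFS_PATH.matcher(line)
            .replaceAll(Matcher.quoteReplacement(PATH_MASK));
    }

    public static void main(String[] args) {
        String line = "FAILED: cannot read hdfs://nn:8020/tmp/x now";
        System.out.println(maskWholeLine(line));
        System.out.println(maskPathOnly(line));
    }
}
```

With the selective variant the exception text around the location would still be recorded, and only the environment-specific part would be hidden.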






[jira] [Created] (HIVE-18059) remove unused hiveconf variables

2017-11-14 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-18059:
---

 Summary: remove unused hiveconf variables
 Key: HIVE-18059
 URL: https://issues.apache.org/jira/browse/HIVE-18059
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


* for example hive.debug.localtask is there...but it seems like it's not used 
anywhere
* there might be more conf variables which are just hanging there without 
purpose




