[jira] [Created] (HIVE-18746) add_months should validate the date first

2018-02-19 Thread Subhasis Gorai (JIRA)
Subhasis Gorai created HIVE-18746:
-

 Summary: add_months should validate the date first
 Key: HIVE-18746
 URL: https://issues.apache.org/jira/browse/HIVE-18746
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Subhasis Gorai


hive (sbg_hvc_ods)> select add_months('2017-02-28', 1);
OK
_c0
2017-03-31
Time taken: 0.107 seconds, Fetched: 1 row(s)
hive (sbg_hvc_ods)> select add_months('2017-02-29', 1);
OK
_c0
2017-04-01
Time taken: 0.084 seconds, Fetched: 1 row(s)
hive (sbg_hvc_ods)>

 

'2017-02-29' is an invalid date.
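
One way to reject such input before any month arithmetic happens is to parse it strictly. The following is a minimal sketch, not Hive's actual implementation: the class and method names (StrictDateCheck, isValidDate) are hypothetical, and it assumes the input uses the yyyy-MM-dd form shown in the session above. It relies only on java.time, where a formatter set to ResolverStyle.STRICT refuses a day-of-month that does not exist in the given month. The second result above would be consistent with a lenient parser rolling '2017-02-29' forward to 2017-03-01 before the month is added, but that is an inference from the output, not something the report states.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

// Hypothetical helper, for illustration only; not part of Hive.
class StrictDateCheck {
    // "uuuu" (proleptic year) is used on purpose: under STRICT resolution,
    // "yyyy" (year-of-era) cannot be resolved to a date without an era field.
    private static final DateTimeFormatter STRICT_DATE =
        DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    // Returns true only if the text names a real calendar date.
    static boolean isValidDate(String text) {
        try {
            LocalDate.parse(text, STRICT_DATE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidDate("2017-02-28")); // a real date
        System.out.println(isValidDate("2017-02-29")); // 2017 is not a leap year
        System.out.println(isValidDate("2016-02-29")); // 2016 is a leap year
    }
}
```

A caller could run such a check first and return NULL or raise an error for invalid input; which of the two add_months should do is a design decision this sketch does not settle.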

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18745) Fix MetaStore creation in tests, so multiple MetaStores can be started on the same machine

2018-02-19 Thread Peter Vary (JIRA)
Peter Vary created HIVE-18745:
-

 Summary: Fix MetaStore creation in tests, so multiple MetaStores 
can be started on the same machine
 Key: HIVE-18745
 URL: https://issues.apache.org/jira/browse/HIVE-18745
 Project: Hive
  Issue Type: Sub-task
Reporter: Peter Vary
Assignee: Peter Vary


[~janulatha] fixed the problem where multiple MetaStore tests started on the 
same machine tried to reserve the same port. This caused flakiness in the 
MetaStore tests run with the ptest framework. See: HIVE-18147

I reviewed HIVE-17980 and tried to make sure that the fix remains in every 
codepath, but I was unsuccessful. :(

This Jira aims to go through the MetaStore tests and make sure all of them use 
the {{startMetaStoreWithRetry}} method, so that the different tests do not 
cause each other to fail. The clashes were not only in port numbers but in 
warehouse directories as well, so this Jira should fix that too.
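
The general idea of avoiding both kinds of clash can be sketched as follows. This is an illustration under stated assumptions, not the code behind {{startMetaStoreWithRetry}}: IsolatedServiceStarter, Starter, findFreePort and startWithRetry are hypothetical names, and the Starter callback stands in for whatever actually launches the service. Each attempt asks the OS for a free port and creates a fresh temporary directory, then retries if the start fails, since another process can take the port between the probe and the real bind.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; these names are illustrative and are not Hive test utilities.
class IsolatedServiceStarter {
    // Stand-in for whatever starts the service on a given port and directory.
    interface Starter {
        void start(int port, Path warehouseDir) throws IOException;
    }

    // Asks the OS for a currently free ephemeral port (the probe socket is closed again).
    static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    // Tries up to maxAttempts times, with a new port and a new directory per attempt.
    // Returns the port on which the start succeeded.
    static int startWithRetry(Starter starter, int maxAttempts) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int port = findFreePort();
            // A unique directory per attempt keeps concurrent tests from sharing files.
            Path warehouseDir = Files.createTempDirectory("warehouse-" + port + "-");
            try {
                starter.start(port, warehouseDir);
                return port;
            } catch (IOException e) {
                last = e; // remember the failure and try again with a new port
            }
        }
        throw new IOException("could not start after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        int port = startWithRetry(
            (p, dir) -> System.out.println("starting on port " + p + " in " + dir), 3);
        System.out.println("started on port " + port);
    }
}
```

The sketch leaves the temporary directories behind; a real test helper would also need to clean them up.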





[jira] [Created] (HIVE-18744) Vectorization: VectorHashKeyWrapperBatch doesn't check repeated NULLs correctly

2018-02-19 Thread Matt McCline (JIRA)
Matt McCline created HIVE-18744:
---

 Summary: Vectorization: VectorHashKeyWrapperBatch doesn't check 
repeated NULLs correctly
 Key: HIVE-18744
 URL: https://issues.apache.org/jira/browse/HIVE-18744
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Matt McCline
Assignee: Matt McCline


Logic for checking selectedInUse isRepeating case for NULL is broken.
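
The report does not say what exactly is wrong, so the following only illustrates the kind of check involved. It is a sketch under an assumption: that a repeating column keeps its single value and NULL flag at index 0, so the NULL test must read entry 0 even when a selection array is in use. The Column class and isNullAt method are hypothetical simplifications, not Hive's vectorization classes.

```java
// Hypothetical, simplified stand-in for a column vector; not Hive's actual classes.
class RepeatingNullCheck {
    static final class Column {
        final boolean noNulls;     // true when no entry is NULL
        final boolean isRepeating; // true when entry 0 stands for every row
        final boolean[] isNull;    // per-row NULL flags; only index 0 is meaningful when repeating

        Column(boolean noNulls, boolean isRepeating, boolean[] isNull) {
            this.noNulls = noNulls;
            this.isRepeating = isRepeating;
            this.isNull = isNull;
        }
    }

    // Is the logical row at batch position i NULL?
    static boolean isNullAt(Column col, boolean selectedInUse, int[] selected, int i) {
        if (col.noNulls) {
            return false;
        }
        if (col.isRepeating) {
            return col.isNull[0]; // ignore `selected`: only entry 0 is valid
        }
        int row = selectedInUse ? selected[i] : i;
        return col.isNull[row];
    }

    public static void main(String[] args) {
        // A repeated NULL: only isNull[0] is set, the other flags are stale.
        Column repeatedNull = new Column(false, true, new boolean[] {true, false, false, false});
        int[] selected = {2, 3};
        System.out.println(repeatedNull.isNull[selected[0]]);          // naive lookup through `selected`
        System.out.println(isNullAt(repeatedNull, true, selected, 0)); // lookup that honours isRepeating
    }
}
```

In this example the naive lookup reads a stale flag at index 2 and misses the NULL, while the isRepeating-aware lookup reports it.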





Re: Review Request 65422: HIVE-17626

2018-02-19 Thread Zoltan Haindrich


> On Feb. 7, 2018, 1:58 a.m., Ashutosh Chauhan wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 3691 (patched)
> > 
> >
> > Instead of a config, this should be an explain modifier. We already have 
> > explain rewrite select ... We can similarly add explain reoptimize select ...

yes...I agree; it turned out that it's very inconvenient to use it this way...

I've employed a semanticAnalyzer hook to handle the reoptimize keyword


> On Feb. 7, 2018, 1:58 a.m., Ashutosh Chauhan wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 5066 (patched)
> > 
> >
> > Instead of iterating over _this_, which can be very large, it is more 
> > efficient to iterate over the other list.

I wasn't aware that the iterator() creates a new map on the fly... I'm now 
using getProps() to get access to the actual values


> On Feb. 7, 2018, 1:58 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/AbstractReExecDriver.java
> > Lines 127 (patched)
> > 
> >
> > Currently it's only re-executed once. Alternatively, we can keep 
> > re-running it if it fails again, e.g. in case of OOM it's possible that 
> > there are many joins which are mis-planned, but we get stats only for the 
> > first join.
> > To avoid a very large number of retries, we can limit it to some max 
> > attempts.

I agree
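
The bounded re-execution suggested in the comment could look roughly like this. It is a sketch only and does not reflect the code in AbstractReExecDriver: BoundedReExecution and runWithRetries are hypothetical names, and the IntPredicate stands in for "run the query once and report whether it succeeded".

```java
import java.util.function.IntPredicate;

// Hypothetical sketch of bounded re-execution; not the code from the patch.
class BoundedReExecution {
    // Calls `attempt` with the 1-based attempt number until it reports success
    // or maxAttempts is reached. Returns the successful attempt number, or -1.
    static int runWithRetries(IntPredicate attempt, int maxAttempts) {
        for (int n = 1; n <= maxAttempts; n++) {
            if (attempt.test(n)) {
                return n;
            }
            // In the scenario from the review, each failed run would contribute
            // runtime stats that the next planning round can use.
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulated query that only succeeds once two earlier runs have failed.
        int succeededOn = runWithRetries(n -> n >= 3, 5);
        System.out.println("succeeded on attempt " + succeededOn);

        // The cap keeps a query that never succeeds from looping forever.
        int gaveUp = runWithRetries(n -> false, 2);
        System.out.println("result when giving up: " + gaveUp);
    }
}
```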


> On Feb. 7, 2018, 1:58 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java
> > Lines 21 (patched)
> > 
> >
> > Incorrect import ?

I've just taken a look at null analysis; but it detects too many issues to just 
turn on...so I'll remove it for now :)


> On Feb. 7, 2018, 1:58 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/refs/OperatorRef.java
> > Lines 50 (patched)
> > 
> >
> > Instead of relying on ids, would it be better to use (and extend) the 
> > logic in SharedWorkOptimizer::compareOperator()?

t


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65422/#review196950
---


On Jan. 30, 2018, 6:13 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65422/
> ---
> 
> (Updated Jan. 30, 2018, 6:13 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> preview
> 
> 
> Diffs
> -
> 
>   cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java a78e0c63d7 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b7d3e99e1a 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatCli.java ad31287879 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/HCatDriver.java 533f0bcd6f 
>   itests/src/test/resources/testconfiguration.properties d86ff58840 
>   ql/src/java/org/apache/hadoop/hive/ql/AbstractReExecDriver.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/Context.java 820fbf0f58 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 74595b00f9 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java 49d2bf5f33 
>   ql/src/java/org/apache/hadoop/hive/ql/IDriver.java 6280be0b08 
>   ql/src/java/org/apache/hadoop/hive/ql/ReExecOverlayDriver.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/ReOptimizeDriver.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 76e85636d1 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 199b181290 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 395a5f450f 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java 8dd7cfe58c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkEmptyKeyOperator.java 134fc0ff0b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java 1eb72ce4d9 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkUniformHashOperator.java 384bd74686 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/PrivateHookContext.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 190771ea6b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java cbadfa4f07 
>   

Re: Review Request 65422: HIVE-17626

2018-02-19 Thread Zoltan Haindrich


> On Feb. 16, 2018, 4:50 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/AbstractReExecDriver.java
> > Lines 131 (patched)
> > 
> >
> > This is hackish; as pointed out above, it needs to happen via an explain 
> > modifier.

I agree


> On Feb. 16, 2018, 4:50 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/DriverFactory.java
> > Lines 21 (patched)
> > 
> >
> > Use java's nonnull annotation.

I've not found any "standard" annotation...I may just as well remove these 
markers...
https://stackoverflow.com/questions/4963300/which-notnull-java-annotation-should-i-use/42695253#42695253


> On Feb. 16, 2018, 4:50 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ReOptimizeDriver.java
> > Lines 54 (patched)
> > 
> >
> > Why is this needed?

this is not strictly needed...but it enables the user to apply a different set 
of configuration values during re-executions


> On Feb. 16, 2018, 4:50 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/PlanMapper.java
> > Lines 36 (patched)
> > 
> >
> > A flat map of operators loses the hierarchical info in which operators 
> > are organized, which is a tree. So this match needs to happen via a 
> > sub-graph matching pattern. See SharedWorkOptimizer::areMergeable().

I will try to retain this concept, for now at least. The idea: imagine that we 
have N operator stats gathered and the current plan consists of M operators. If 
we only have a cmp(A,B) oracle, we will have to do N*M comparisons, which 
could become really bad if N becomes large...

I'm thinking of serving the existing operator infos in a map-like fashion; at 
least it should be visible as one from the outside.

If an operator could self-describe its whole context, then it could be 
matched... for example, the textual representation of a RelNode contains all 
the upstream operations as well, and so enables matching.

It looked promising to do it; I wanted to do it with HIVE-18703, but 
unfortunately there were some complications...
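
The map-like lookup described above can be sketched as follows. It is an illustration only, not the PlanMapper from the patch: OperatorSignatureIndex, Op, signature and lookup are hypothetical names. Each operator produces a textual signature that includes its upstream operators, the gathered stats are stored under that signature, and each operator of the new plan is matched with one hash lookup instead of being compared against every recorded operator.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of signature-based matching; not the PlanMapper from the patch.
class OperatorSignatureIndex {
    // Minimal operator node: a name plus its upstream (input) operators.
    static final class Op {
        final String name;
        final List<Op> inputs;

        Op(String name, Op... inputs) {
            this.name = name;
            this.inputs = Arrays.asList(inputs);
        }

        // Self-describing signature covering the whole upstream tree.
        // Computing it walks the subtree, so a real version would likely cache it.
        String signature() {
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < inputs.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(inputs.get(i).signature());
            }
            return sb.append(')').toString();
        }
    }

    // One hash lookup per operator of the new plan; returns null when nothing matches.
    static Long lookup(Map<String, Long> statsBySignature, Op op) {
        return statsBySignature.get(op.signature());
    }

    public static void main(String[] args) {
        // Stats gathered during a previous run: a row count for a filter over a scan of t1.
        Map<String, Long> statsBySignature = new HashMap<>();
        Op oldFilter = new Op("FIL[x>1]", new Op("TS[t1]"));
        statsBySignature.put(oldFilter.signature(), 42L);

        // The re-planned query builds new operator objects with the same shape.
        Op sameShape = new Op("FIL[x>1]", new Op("TS[t1]"));
        Op otherTable = new Op("FIL[x>1]", new Op("TS[t2]"));

        System.out.println(lookup(statsBySignature, sameShape));  // matched via its signature
        System.out.println(lookup(statsBySignature, otherTable)); // different upstream, no match
    }
}
```

Because the signature includes the inputs, two filters with the same predicate over different tables do not collide, which addresses part of the concern about a flat map losing the tree structure. Matching is exact, though, so any plan change upstream of an operator prevents a match.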


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65422/#review197649
---

