Re: Review Request 69017: HIVE-20718

2018-10-21 Thread Ashutosh Chauhan


> On Oct. 19, 2018, 6:34 a.m., Ashutosh Chauhan wrote:
> > data/scripts/q_perf_test_init_constraints.sql
> > Lines 5-8 (patched)
> > 
> >
> > Shall we use varchar() to match original tpcds spec?
> 
> Jesús Camacho Rodríguez wrote:
> This is apparently not changed in TPCDS benchmark we run on Hive either:
> 
> https://github.com/hortonworks/hive-testbench/blob/hdp3/ddl-tpcds/text/alltables.sql
> 
> I believe we should change it indeed, but in both places. Gopal has just 
> told me he plans to change it shortly, together with decimal and date types. 
> Should we tackle in follow-up? (We would change it for perf driver without 
> constraints too).

Yeah.. its better to test whats in standard since we aspire to test standard as 
it is in future. TODO is fine for now.


> On Oct. 19, 2018, 6:34 a.m., Ashutosh Chauhan wrote:
> > data/scripts/q_perf_test_init_constraints.sql
> > Lines 664 (patched)
> > 
> >
> > Any reason for not adding this constraint?
> 
> Jesús Camacho Rodríguez wrote:
> Yes, I had seen that. This is commented out in 
> https://github.com/hortonworks/hive-testbench/blob/hdp3/ddl-tpcds/bin_partitioned/add_constraints.sql
>  . I checked it and cr_ship_date_sk is not present in catalog_returns table. 
> I checked the tpcds spec and it is not there either. However, it is mentioned 
> somewhere in the spec in the context of materializations. I just left it 
> there commented out for reference, but I can remove it too...

commented out is fine.


> On Oct. 19, 2018, 6:34 a.m., Ashutosh Chauhan wrote:
> > itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliConfigs.java
> > Lines 294-297 (patched)
> > 
> >
> > Whats the reason for this?
> 
> Jesús Camacho Rodríguez wrote:
> These are flaky, i.e., plan string changes slightly among different 
> executions. For those that contain window functions, the problem was 
> https://issues.apache.org/jira/browse/CALCITE-2622 , which I already fixed. 
> Then there are a couple of queries for which Calcite generates a synthetic 
> field with an identifier during optimization, and the identifier is changing 
> among executions. I have not explored that one yet, but I plan to do it. In 
> any case, the plan generation does not fail, it is just flakiness.

sounds good.


> On Oct. 19, 2018, 6:34 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java
> > Lines 539-550 (patched)
> > 
> >
> > Is this pure refactoring? Or is there a logic change too?
> 
> Jesús Camacho Rodríguez wrote:
> This is almost pure refactoring. The only change is in new L654 : the 
> _if_ clause was outside the previous _if_ clause, which was wrong and was 
> causing hitting the assertion in L658 for a couple of tpcds queries. The 
> reason is that we should only do the inner _if_ verification if we have 
> detected that the FK-PK is present in the join, not in all cases.

ok


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69017/#review209776
---


On Oct. 20, 2018, 1:10 a.m., Jesús Camacho Rodríguez wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69017/
> ---
> 
> (Updated Oct. 20, 2018, 1:10 a.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-20718
> https://issues.apache.org/jira/browse/HIVE-20718
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20718
> 
> 
> Diffs
> -
> 
>   data/conf/perf-reg/tez/hive-site.xml 
> 78a5481e0333a3ce9bc516e03273abe6a51c9a49 
>   data/scripts/q_perf_test_init.sql d27215b4cb570c0680212157ebb348e819ad802f 
>   data/scripts/q_perf_test_init_constraints.sql PRE-CREATION 
>   
> itests/qtest/src/test/java/org/apache/hadoop/hive/cli/TestTezPerfCliDriver.java
>  98ceb214047ba56fc2e1ebabc7b0860f22524203 
>   
> itests/qtest/src/test/java/org/apache/hadoop/hive/cli/TestTezPerfConstraintsCliDriver.java
>  PRE-CREATION 
>   itests/src/test/resources/testconfiguration.properties 
> 8349e3d84eeae4695cf91cfd5cbf7f854de8d507 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliConfigs.java 
> 5e1e88e89d29f94006764e09cf1c60e58cffdc54 
>   ql/src/java/org/apache/hadoop/hive/ql/Context.java 
> b4d5806d4ed7f23f2dcf5299fe3c0b2fbe22ff80 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 
> 46bf088f2c5ca97e11cc7ab939ef8ddaefd453c6 
>   

Re: Review Request 69019: HIVE-20617 Fix type of constants in IN expressions to have correct type

2018-10-21 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69019/#review209831
---




ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java
Line 1171 (original), 1176 (patched)


Can remove this TODO now.



ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java
Lines 1386-1397 (patched)


Why we need to add this logic now? Prior to this patch, we would have 
coerced RHS constant to decimal if column is of type decimal.

AFAICS, this if() can only be executed if we had in query either float_col 
= 4.2BD or the expression is dobule_col = 4.2BD but that will be handled by  if 
(PrimitiveObjectInspectorUtils.floatTypeEntry.equals(primitiveTypeEntry)) {
else if 
(PrimitiveObjectInspectorUtils.doubleTypeEntry.equals(primitiveTypeEntry))



ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java
Line 1298 (original), 1404 (patched)


My suggestion is to do final HiveChar newValue = new HiveChar(constValue, 
constValue.getlength());


- Ashutosh Chauhan


On Oct. 20, 2018, 5:51 a.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69019/
> ---
> 
> (Updated Oct. 20, 2018, 5:51 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Vineet Garg.
> 
> 
> Bugs: HIVE-20617
> https://issues.apache.org/jira/browse/HIVE-20617
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> For IN expressions the types were never corrected; and pointlookupoptimizer 
> was probably leaving behind fields already which were uncomparable; 
> HIVE-20296 exposed it further by changing the minimal number from  32 to 2.
> 
> This change generalizes the retyping of constants to also run it for the IN 
> operator ; and also for struct-s.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/type/HiveChar.java 29dc06dca1 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java e7d71595c7 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 
> 4968d16876 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java 
> c274fd7cc9 
>   ql/src/test/queries/clientpositive/in_typecheck_char.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/in_typecheck_pointlook.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/in_typecheck_varchar.q PRE-CREATION 
>   ql/src/test/results/clientpositive/alter_partition_coltype.q.out f6c3c5642e 
>   ql/src/test/results/clientpositive/in_typecheck2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck_char.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck_pointlook.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck_varchar.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/infer_const_type.q.out e1d7de5422 
>   ql/src/test/results/clientpositive/join45.q.out 47aaf7d0ab 
>   ql/src/test/results/clientpositive/join47.q.out 4d9e937815 
>   ql/src/test/results/clientpositive/llap/dec_str.q.out 554031e952 
>   ql/src/test/results/clientpositive/llap/explainuser_1.q.out f240468558 
>   ql/src/test/results/clientpositive/llap/lineage3.q.out cf38816127 
>   ql/src/test/results/clientpositive/llap/vectorization_13.q.out 4ce654f960 
>   ql/src/test/results/clientpositive/llap/vectorization_6.q.out a2f730beca 
>   ql/src/test/results/clientpositive/llap/vectorization_8.q.out 21ce7b8ebd 
>   ql/src/test/results/clientpositive/llap/vectorization_short_regress.q.out 
> 7f1c6a295e 
>   ql/src/test/results/clientpositive/mapjoin47.q.out 294dd69de5 
>   ql/src/test/results/clientpositive/parquet_vectorization_13.q.out 
> 0efce98b55 
>   ql/src/test/results/clientpositive/parquet_vectorization_6.q.out 0bb6888364 
>   ql/src/test/results/clientpositive/parquet_vectorization_8.q.out 957bd7b264 
>   ql/src/test/results/clientpositive/ppd_udf_col.q.out 814fb5afcf 
>   ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out acf91bf178 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out 
> 3812239343 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out 
> 6108457aad 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out 
> 3352dedc58 
>   ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 
> f5a4c9ad86 
>   ql/src/test/results/clientpositive/spark/subquery_scalar.q.out b3252f5415 
>   ql/src/test/results/clientpositive/spark/vectorization_13.q.out 34ec9c42dd 
>   ql/src/test/results/clientpositive/spark/vectorization_6.q.out 5679bb8cfa 
>