> On Oct. 20, 2018, 6:57 p.m., Ashutosh Chauhan wrote:
> >

Overall latest version of the patch LGTM. It is a bit messy that all this logic 
lives here and probably we could rely on inspectors and existing logic to make 
everything cleaner. Let's get this one in, and we can tackle in follow-up.


> On Oct. 20, 2018, 6:57 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/alter_partition_coltype.q.out
> > Line 163 (original), 163 (patched)
> > <https://reviews.apache.org/r/69019/diff/2/?file=2101643#file2101643line163>
> >
> >     String and int comparison happens in double. So, should this be 3.0D ?

This has to do with the way that Calcite generates the SQL. I will create a 
follow-up.


> On Oct. 20, 2018, 6:57 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/in_typecheck_pointlook.q.out
> > Lines 56 (patched)
> > <https://reviews.apache.org/r/69019/diff/2/?file=2101646#file2101646line56>
> >
> >     I expected 'Unknown' should have been char of length 6. Is there a 
> > reason to expand the length to 10?
> >     As I mentioned previously if constant is of smaller length, then it 
> > doesn't make a difference, but is unnecessary, but if constant is of bigger 
> > length then LHS, then char::compare() actually truncates constant, so it 
> > better to create char with original length of constant.
> 
> Zoltan Haindrich wrote:
>     It worked before the other way around; constants are expanded to the 
> target type - the addition I've made is that if the constant is longer; then 
> its made invalid
> 
> Ashutosh Chauhan wrote:
>     I think its better to let runtime dictacte the semantics in such cases. 
> So, we just create constant char of its original length and then whatever 
> runtime does with it, we will get that, instead of us "pre-processing" value 
> at compile time. Other way to think about this is if there are 2 cols of 
> char(5) and char(10) what will runtime do? Runtime already has logic to 
> handle such cases, we let it handle that.

bq. I've made is that if the constant is longer; then its made invalid
I have moved this logic away in latest patch, we can add it in follow-up, as it 
seems to me that should not be enforced in creation of HiveChar object.


> On Oct. 20, 2018, 6:57 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/join45.q.out
> > Line 717 (original), 717 (patched)
> > <https://reviews.apache.org/r/69019/diff/2/?file=2101649#file2101649line717>
> >
> >     As discussed this should have been
> >     (struct(cast (_col0 as double), cast(_col2 as double))) IN (const 
> > struct(100.0D,100.0D), const struct(101.0D,101.0D), const 
> > struct(102.0D,102.0D))
> 
> Zoltan Haindrich wrote:
>     * column has type string.
>     * the IN statement may have the same type for all element in this case 
> it's int...
>     * in this case to change the left hand side to double: it may work
>     
>     
>     but what should happen in case the IN operands contain 1 int, 1 decimal 
> and 1 string ?
>     
>     note: during UDF evaluations there is a Set with all the values; and 
> "contains" is used - so if the inferred type is not a numeric: 
>     
>     I've just checked that even the standard IN doesn't work like this:
>     
>     ```
>     create table t (a string);
>     
>     insert into t values ('1'),('x'),('2.0');
>     
>     select * from t where a in (1.0,'x',2);
>     1
>     2.0
>     -- it doesn't return 'x' because it's casted to double
>     
>     ```
>     
>     I'm starting to think that it would be better to do this with the IN 
> unwinded into ORs: so that we could do the one-on-one constant checks and 
> then pointlookupoptimizer might collapse them if the types are the same - in 
> this case I think we would not loose 'x' in the above case - and it would 
> also make this whole recursive typecheck unneccessary.
> 
> Ashutosh Chauhan wrote:
>     We shall strive to match semantics which is already there. In Hive, for 
> expr str_col = 12, we get cast of double on both sides. So, by extensions IN 
> clause should do the same IFF all types are same. If types aren't same I 
> agree what you are proposing ie. keep it as unwinded ORs.
> 
> Zoltan Haindrich wrote:
>     Added logic to unwind ors earlier; it fixes the double casts by relys on 
> the existing logic for equals; and makes the wierd type `(1.0,'x',2)` issue 
> go away as well as `struct(null,1)` cases

Logic to unwind ORs LGTM


> On Oct. 20, 2018, 6:57 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/parquet_vectorization_13.q.out
> > Line 86 (original), 86 (patched)
> > <https://reviews.apache.org/r/69019/diff/2/?file=2101659#file2101659line86>
> >
> >     Dont we print f for float constant suffix? ie 3569.0f ?

Same as above, this has to do with the way that Calcite generates the SQL. I 
will create a follow-up.


- Jesús


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69019/#review209827
-----------------------------------------------------------


On Oct. 23, 2018, 9:49 a.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69019/
> -----------------------------------------------------------
> 
> (Updated Oct. 23, 2018, 9:49 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Vineet Garg.
> 
> 
> Bugs: HIVE-20617
>     https://issues.apache.org/jira/browse/HIVE-20617
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> For IN expressions the types were never corrected; and pointlookupoptimizer 
> was probably leaving behind fields already which were uncomparable; 
> HIVE-20296 exposed it further by changing the minimal number from  32 to 2.
> 
> This change generalizes the retyping of constants to also run it for the IN 
> operator ; and also for struct-s.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/type/HiveChar.java 29dc06dca1 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java e7d71595c7 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HivePointLookupOptimizerRule.java
>  04800cca91 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckCtx.java 1b56ecd044 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 
> 4968d16876 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java 
> c274fd7cc9 
>   ql/src/test/queries/clientpositive/in_typecheck_char.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/in_typecheck_pointlook.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/in_typecheck_varchar.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/in_typecheck_x1.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/join45X.q PRE-CREATION 
>   ql/src/test/results/clientpositive/alter_partition_coltype.q.out f6c3c5642e 
>   ql/src/test/results/clientpositive/in_typecheck2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck_char.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck_pointlook.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck_varchar.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/in_typecheck_x1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/infer_const_type.q.out e1d7de5422 
>   ql/src/test/results/clientpositive/join45.q.out 47aaf7d0ab 
>   ql/src/test/results/clientpositive/join45X.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/join47.q.out 4d9e937815 
>   ql/src/test/results/clientpositive/llap/dec_str.q.out 554031e952 
>   ql/src/test/results/clientpositive/llap/explainuser_1.q.out f240468558 
>   ql/src/test/results/clientpositive/llap/lineage3.q.out 9bec309c9c 
>   ql/src/test/results/clientpositive/llap/vectorization_13.q.out 4ce654f960 
>   ql/src/test/results/clientpositive/llap/vectorization_6.q.out a2f730beca 
>   ql/src/test/results/clientpositive/llap/vectorization_8.q.out 21ce7b8ebd 
>   ql/src/test/results/clientpositive/llap/vectorization_short_regress.q.out 
> 7f1c6a295e 
>   ql/src/test/results/clientpositive/mapjoin47.q.out 294dd69de5 
>   ql/src/test/results/clientpositive/parquet_vectorization_13.q.out 
> 0efce98b55 
>   ql/src/test/results/clientpositive/parquet_vectorization_6.q.out 0bb6888364 
>   ql/src/test/results/clientpositive/parquet_vectorization_8.q.out 957bd7b264 
>   ql/src/test/results/clientpositive/ppd_udf_col.q.out 814fb5afcf 
>   ql/src/test/results/clientpositive/spark/cbo_simple_select.q.out acf91bf178 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out 
> 3812239343 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out 
> 6108457aad 
>   ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out 
> 3352dedc58 
>   ql/src/test/results/clientpositive/spark/spark_explainuser_1.q.out 
> f5a4c9ad86 
>   ql/src/test/results/clientpositive/spark/subquery_scalar.q.out b3252f5415 
>   ql/src/test/results/clientpositive/spark/vectorization_13.q.out 34ec9c42dd 
>   ql/src/test/results/clientpositive/spark/vectorization_6.q.out 5679bb8cfa 
>   ql/src/test/results/clientpositive/spark/vectorization_short_regress.q.out 
> 231dea6de3 
>   ql/src/test/results/clientpositive/vectorization_13.q.out 8897f8427f 
>   ql/src/test/results/clientpositive/vectorization_6.q.out 8dedb63e7d 
>   ql/src/test/results/clientpositive/vectorization_8.q.out d81df76a2f 
> 
> 
> Diff: https://reviews.apache.org/r/69019/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>

Reply via email to