> On April 20, 2015, 1:01 a.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/ChainedHashTable.java,
> >  line 189
> > <https://reviews.apache.org/r/33343/diff/1/?file=935063#file935063line189>
> >
> >     If we have  int a = bigint b , both sides will be hash distributed as 
> > doubles, then when we are doing the join (either hash or merge join), we 
> > will ignore that they were distributed as double and just implict cast the 
> > int to bigint ...is that correct ? 
> >     
> >     Any thoughts on the performance impact ? Previously the comparison 
> > function would have taken care of comparing different types, now it must go 
> > through implicit casting.

we will ignore that they were distributed as double and just implict cast the 
int to bigint ...is that correct ? 
Yes that is the behavior currently. 


Performance impact should be minimal or none, I would think since even with the 
explicit function implementations we were doing the casting. However we 
probably might be creating an extra object (for the holder) during implicit 
casting but I think that should be ok.


> On April 20, 2015, 1:01 a.m., Aman Sinha wrote:
> > exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestJoinAdvanced.java,
> >  line 80
> > <https://reviews.apache.org/r/33343/diff/1/?file=935067#file935067line80>
> >
> >     It would be good to have a test with join conditions on 2 or more 
> > columns that require implicit casting - 
> >     WHERE t1.a1 = t2.a2 
> >           AND t1.b1 = t2.b2 
> >           AND t1.c1 = t2.c2 
> >      where a1:int, b1:bigint, b1:float, b2:double, c1:double, c2: varchar 
> > with numeric double values.  
> >     This does not necesarily have to be unit test but see if you can get it 
> > included either as unit or functional test.

I have added a unit test with miniscule data as part of the unit test. I will 
add a test with more substantial data to the functional suite as well. 

I have added the case with multiple join conditions with the following data 
types on the left and right sides
1. bigint and int
2. double and float 
3. bigint and double

The case that you mention about varchar and double isn't supported, since our 
distribution only takes care of distributing the numeric values as double. This 
behavior is consistent with Postgres where in such cases users are required to 
put an explicit cast. To support such a scenario we would have to hash 
everything as varchar which will not be very efficient.


- Mehant


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33343/#review80643
-----------------------------------------------------------


On April 20, 2015, 4:55 a.m., Mehant Baid wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33343/
> -----------------------------------------------------------
> 
> (Updated April 20, 2015, 4:55 a.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> DRILL-2753 aims to remove the comparision function implementations that have 
> different data type inputs (Look at the JIRA for more details). As a result 
> we need to modify hash join and merge join so that when we have different 
> types in the join condition they can apply implicit casts to do the 
> comparison. We have resolved the issue of distribution (as part of 
> DRILL-2244) and so different data types with the same numeric value will be 
> correctly distributed to the same node. 
> 
> As part of this change we materialize the expression in the join condition, 
> check if the types are different and apply casts if necessary.
> 
> 
> Diffs
> -----
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/ChainedHashTable.java
>  9df67d8 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/JoinUtils.java
>  7fa79a1 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/MergeJoinBatch.java
>  8fce52e 
>   
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestHashJoinAdvanced.java
>  796f6fe 
>   
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/join/TestJoinAdvanced.java
>  PRE-CREATION 
>   exec/java-exec/src/test/resources/jsoninput/implicit_cast_join_1.json 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33343/diff/
> 
> 
> Testing
> -------
> 
> Added unit tests plus ran existing tests with different data types in join 
> conditions.
> 
> 
> Thanks,
> 
> Mehant Baid
> 
>

Reply via email to