Hi camelia,

Yes, your understanding is correct. Tajo uses such an approach for
building hash tables.
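
As a rough illustration, here is a minimal sketch of that build phase. It is not the actual HashJoinExec code: the class name BuildPhaseSketch, the helpers keyOf and build, and the use of plain Java lists in place of Tajo's Tuple class are all assumptions made for the example. Only the name tupleSlots comes from your message. The sketch also assumes the build table is an ordinary java.util map, in which case no separate hash function is applied beforehand; the map itself places keys using the key's hashCode() and equals().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a hash-join build phase; not Tajo's implementation.
public class BuildPhaseSketch {

    // Hypothetical helper: project the join-key columns out of a row.
    static List<Object> keyOf(List<Object> row, int[] keyColumns) {
        List<Object> key = new ArrayList<>();
        for (int c : keyColumns) {
            key.add(row.get(c));
        }
        return key;
    }

    // Group rows by key tuple. The key tuple itself is the map key;
    // the map calls hashCode()/equals() on it internally.
    static Map<List<Object>, List<List<Object>>> build(
            List<List<Object>> rows, int[] keyColumns) {
        // LinkedHashMap is used only to keep insertion order for printing.
        Map<List<Object>, List<List<Object>>> tupleSlots = new LinkedHashMap<>();
        for (List<Object> row : rows) {
            tupleSlots
                .computeIfAbsent(keyOf(row, keyColumns), k -> new ArrayList<>())
                .add(row);
        }
        return tupleSlots;
    }

    public static void main(String[] args) {
        // Input from Example 1 in the quoted message; join key is column 0.
        List<List<Object>> rows = Arrays.asList(
            Arrays.<Object>asList(1, "aaa"),
            Arrays.<Object>asList(1, "bbb"),
            Arrays.<Object>asList(1, "ccc"),
            Arrays.<Object>asList(2, "ddd"),
            Arrays.<Object>asList(5, "eee"));

        Map<List<Object>, List<List<Object>>> slots = build(rows, new int[] {0});
        slots.forEach((key, bucket) -> System.out.println(key + " | " + bucket));
    }
}
```

Passing new int[] {0, 1} as the key columns with three-column rows gives the grouping from Example 2 in the same way.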

Best regards,
Hyunsik

On Fri, Sep 6, 2013 at 7:31 PM, camelia c <[email protected]> wrote:
> Hello,
>
> I have a question regarding the hash function used in HashJoinExec, please.
> From the source code, I reached the conclusion that in TAJO the hash function 
> used in the build phase of the algorithm is the identity function:
>      h(x) = x
>
> Am I correct?
>
> I shall give some examples and please correct me if I misunderstood something 
> about TAJO's approach.
> I shall use the notation ( , , ... , ) for a tuple and [ , , ... , ] for a
> list of elements.
>
>
>
> For example
>
> Example 1)
>
> Given input set of tuples
> {(1,aaa), (1,bbb), (1,ccc), (2,ddd), (5,eee)}
>
>
> and if the join key consists of the first numeric column, then we have in the 
> build table (tupleSlots):
>
> keyTuple | Value (an ArrayList of Tuples)
> ---------+--------------------------------
> (1)      | [ (1,aaa), (1,bbb), (1,ccc) ]
> (2)      | [ (2,ddd) ]
> (5)      | [ (5,eee) ]
>
> Example 2)
>
> Given input set of tuples
> {(10,A,aaa), (10,A,bbb), (10,A,ccc), (20,B,ddd), (50,C,eee)}
>
> and if the join key consists of the first two columns (a numeric and a
> string), then we have in the build table (tupleSlots):
>
> keyTuple | Value (an ArrayList of Tuples)
> ---------+-----------------------------------------------
> (10, A)  | [ (10, A, aaa), (10, A, bbb), (10, A, ccc) ]
> (20, B)  | [ (20, B, ddd) ]
> (50, C)  | [ (50, C, eee) ]
>
>
>
> Thank you all in advance.
>
> Yours sincerely,
> Camelia
