Hello,

I have a question about the hash function used in HashJoinExec.
From reading the source code, I concluded that in TAJO the hash function 
used in the build phase of the hash join is the identity function:
     h(x) = x

Am I correct?

I give two examples below; please correct me if I have misunderstood 
TAJO's approach.
I use the notation ( , , ... , ) for a Tuple and [ , , ... , ] for a list of 
elements.




Example 1) 

Given the input set of tuples
{(1,aaa), (1,bbb), (1,ccc), (2,ddd), (5,eee)}

if the join key is the first (numeric) column, then the build table 
(tupleSlots) contains:

keyTuple   |  Value (an ArrayList of Tuples)
--------------------------------------------------------
(1)        |  [ (1,aaa), (1,bbb), (1,ccc) ]
(2)        |  [ (2,ddd) ]
(5)        |  [ (5,eee) ]

Example 2)

Given the input set of tuples
{(10,A,aaa), (10,A,bbb), (10,A,ccc), (20,B,ddd), (50,C,eee)}

if the join key consists of the first two columns (a numeric and a 
string), then the build table (tupleSlots) contains:

keyTuple   |  Value (an ArrayList of Tuples)
--------------------------------------------------------
(10, A)    |  [ (10, A, aaa), (10, A, bbb), (10, A, ccc) ]
(20, B)    |  [ (20, B, ddd) ]
(50, C)    |  [ (50, C, eee) ]
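
To make the structure I am describing concrete, here is a minimal sketch in 
plain Java. It is not TAJO's code: the names BuildTableSketch, build and tuple 
are made up for illustration, a Tuple is modelled as a List<Object>, and I am 
assuming the build table behaves like a java.util.HashMap from key tuple to a 
list of tuples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the build phase; not taken from TAJO's source.
public class BuildTableSketch {

    // Helper (assumption): a tuple is modelled as a List<Object>.
    public static List<Object> tuple(Object... values) {
        return Arrays.asList(values);
    }

    // Groups the input tuples by the values found in the join key columns.
    public static Map<List<Object>, List<List<Object>>> build(
            List<List<Object>> tuples, int... keyColumns) {
        Map<List<Object>, List<List<Object>>> table = new HashMap<>();
        for (List<Object> t : tuples) {
            // Project the tuple onto the join key columns to get the key tuple.
            List<Object> key = new ArrayList<>();
            for (int c : keyColumns) {
                key.add(t.get(c));
            }
            // All tuples with an equal key tuple end up in the same list.
            table.computeIfAbsent(key, k -> new ArrayList<>()).add(t);
        }
        return table;
    }

    public static void main(String[] args) {
        // Example 1: join key is column 0.
        List<List<Object>> ex1 = Arrays.asList(
                tuple(1, "aaa"), tuple(1, "bbb"), tuple(1, "ccc"),
                tuple(2, "ddd"), tuple(5, "eee"));
        build(ex1, 0).forEach((k, v) -> System.out.println(k + " | " + v));

        // Example 2: join key is columns 0 and 1.
        List<List<Object>> ex2 = Arrays.asList(
                tuple(10, "A", "aaa"), tuple(10, "A", "bbb"), tuple(10, "A", "ccc"),
                tuple(20, "B", "ddd"), tuple(50, "C", "eee"));
        build(ex2, 0, 1).forEach((k, v) -> System.out.println(k + " | " + v));
    }
}
```

In such a map the key tuple itself is stored as the key, while a 
java.util.HashMap picks the bucket from key.hashCode(). Whether tupleSlots 
works the same way is part of what I would like to confirm.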



Thank you all in advance.

Yours sincerely,
Camelia
