Re: Hashjoin implementation

2018-09-11 Thread vino yang
Hi Benjamin, do you mean that you want to see HashPartition.java being used when you write the program? I think there may be some confusion here. The only thing you use to write a program is the Flink DataSet API, which is just a way to describe the job logic. The class you are looking for is in the flin

Re: Hashjoin implementation

2018-09-11 Thread Benjamin Burkhardt
Hi vino, thanks. I was running a join operation on two DataSets and writing the result to disk, and the results were correct. I was just not able to identify the moment when the hash table is built. (Is HashPartition.java not used in this case?) Do you have an idea why I cannot find it? Here

Re: Hashjoin implementation

2018-09-10 Thread vino yang
Hi Benjamin, the approximate location is this package; the more accurate location is here [1]. Specifically, a hash join is divided into two steps: 1) build side, 2) probe side. Thanks, vino. [1]: https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/opera
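
To illustrate the two steps mentioned above, here is a minimal in-memory sketch of a build/probe hash join in plain Java. It is not Flink's implementation: it does no partitioning and no spilling to disk, so it is not a hybrid hash join, and the class and method names (SimpleHashJoin, build, probe) are hypothetical, chosen only for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SimpleHashJoin {

    // Build phase: index every record of the build side (ideally the smaller input) by its join key.
    static Map<Integer, List<String>> build(List<Map.Entry<Integer, String>> buildSide) {
        Map<Integer, List<String>> table = new HashMap<>();
        for (Map.Entry<Integer, String> rec : buildSide) {
            table.computeIfAbsent(rec.getKey(), k -> new ArrayList<>()).add(rec.getValue());
        }
        return table;
    }

    // Probe phase: stream the other input and look up each record's key in the hash table.
    static List<String> probe(Map<Integer, List<String>> table,
                              List<Map.Entry<Integer, String>> probeSide) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, String> rec : probeSide) {
            List<String> matches = table.get(rec.getKey());
            if (matches == null) {
                continue; // inner join: probe records without a match are dropped
            }
            for (String buildValue : matches) {
                out.add(rec.getKey() + ":" + buildValue + "," + rec.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: (key, value) pairs for each input.
        List<Map.Entry<Integer, String>> users =
                List.of(Map.entry(1, "alice"), Map.entry(2, "bob"));
        List<Map.Entry<Integer, String>> orders =
                List.of(Map.entry(2, "book"), Map.entry(3, "pen"), Map.entry(2, "lamp"));

        // Build on users, probe with orders, print each joined record.
        probe(build(users), orders).forEach(System.out::println);
    }
}
```

The hash table exists only between the two calls, which is why the build step is easy to miss when reading a job written against a higher-level API: the user program only declares the join, and the runtime decides when and on which side the table is built.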

Hashjoin implementation

2018-09-10 Thread Benjamin Burkhardt
Hi, can anyone tell me where the default hybrid hash join function for partitioning (shuffle phase) is implemented? Even after deeper digging I was not able to figure out where it is located. Might it be somewhere here? https://github.com/apache/flink/tree/master/flink-runtime/src/main/java/or