Re: questions about Flink's HashJoin performance

2017-05-18 Thread weijie tong
e performance. Also, the build measurements include the data generation, which influences the results.
>
> If you want to purely benchmark the HashTable performance, try using something like "Tuple2<Long, Long>" or so (or write your own custom TypeSerializer

Re: questions about Flink's HashJoin performance

2017-05-16 Thread weijie tong
ews/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html
>
> 2017-05-15 16:26 GMT+02:00 weijie tong <tongweijie...@gmail.com>:
>
>> The Flink version is 1.2.0
>>
>> On Mon, May 15, 2017 at 10:24 PM, weijie tong <tongweijie...@gmail.com>

Re: questions about Flink's HashJoin performance

2017-05-15 Thread weijie tong
The Flink version is 1.2.0

On Mon, May 15, 2017 at 10:24 PM, weijie tong <tongweijie...@gmail.com> wrote:
> @Till thanks for your reply.
>
> My code is similar to HashTableITCase.testInMemoryMutableHashTable(). It just uses the MutableHashTable class, there's

Re: questions about Flink's HashJoin performance

2017-05-15 Thread weijie tong
hash table elapsed:1885ms

On Mon, May 15, 2017 at 6:20 PM, Till Rohrmann <trohrm...@apache.org> wrote:
> Hi Weijie,
>
> it might be the case that batching the processing of multiple rows can give you an improved performance compared to single row processing.
>
> Maybe you cou

questions about Flink's HashJoin performance

2017-05-13 Thread weijie tong
I have a test case that uses Flink's MutableHashTable class to do a hash join on a local machine with 64 GB of memory and 64 cores. The test case has one build table with 140,000 rows and one probe table with 3.2 million rows; the join produces 120,000 matched rows. It takes 2.2 seconds to complete the join. The performance seems