GitHub user ChengXiangLi opened a pull request:

    https://github.com/apache/flink/pull/1469

    [FLINK-2971] support outer join for hash join on build side.

    1. The bucket header of `MutableHashTable` has 4 reserved bytes left, and 
each bucket holds only 9 elements. This PR uses 2 of those bytes to build a 
BitSet that marks which elements in the bucket have been probed during the 
probe phase. After the probe phase, the elements that were never probed are 
returned at the end.
    2. With build side outer join supported, more flexible strategies become 
available for left outer join, right outer join and full outer join. The newly 
supported combinations are:
      * left outer join with `REPARTITION_HASH_FIRST`.
      * right outer join with `REPARTITION_HASH_SECOND`.
      * full outer join with `REPARTITION_HASH_FIRST` or 
`REPARTITION_HASH_SECOND`.
    3. Broadcast hash join still has some limitations. The following join 
types remain unsupported, for obvious reasons:
      * left outer join with `BROADCAST_HASH_FIRST`.
      * right outer join with `BROADCAST_HASH_SECOND`.
      * full outer join with `BROADCAST_HASH_FIRST` and `BROADCAST_HASH_SECOND`.
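
The probed-flag idea from point 1 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `ProbedFlagsSketch`, its methods, and the use of a `List` for bucket contents are all hypothetical, and the real `MutableHashTable` keeps its bucket header in memory segments rather than in Java fields. The sketch only assumes what the description states: at most 9 elements per bucket and 2 bytes of flags.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and layout are assumptions, not Flink's MutableHashTable.
final class ProbedFlagsSketch {
    static final int MAX_ELEMENTS = 9; // elements per bucket, as stated in the description

    private final List<String> elements = new ArrayList<>();
    private short probedFlags = 0; // 2 bytes; bit i set means element i matched a probe record

    /** Adds a build side element and returns its slot index within the bucket. */
    int insert(String element) {
        if (elements.size() >= MAX_ELEMENTS) {
            throw new IllegalStateException("bucket full");
        }
        elements.add(element);
        return elements.size() - 1;
    }

    /** Called during the probe phase when the element in the given slot matches. */
    void markProbed(int slot) {
        if (slot < 0 || slot >= elements.size()) {
            throw new IndexOutOfBoundsException("slot " + slot);
        }
        probedFlags |= (1 << slot);
    }

    boolean isProbed(int slot) {
        return (probedFlags & (1 << slot)) != 0;
    }

    /** After the probe phase: build side elements that never matched a probe record. */
    List<String> unprobedElements() {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < elements.size(); i++) {
            if (!isProbed(i)) {
                result.add(elements.get(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ProbedFlagsSketch bucket = new ProbedFlagsSketch();
        int a = bucket.insert("a");
        bucket.insert("b");
        int c = bucket.insert("c");
        bucket.markProbed(a);
        bucket.markProbed(c);
        System.out.println(bucket.unprobedElements()); // prints [b]
    }
}
```

Nine flag bits fit into 2 bytes with room to spare, which is why only half of the 4 reserved bytes are needed.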

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ChengXiangLi/flink hashFullOuterJoin

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1469.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1469
    
----
commit a3445a0666dd9c349bc11fca2e2554d500175280
Author: chengxiang li <[email protected]>
Date:   2015-12-17T03:34:41Z

    [FLINK-2871] support outer join for hash on build side.

commit 92961bcd26e2dafb70006ea673abb07a67b77c9b
Author: chengxiang li <[email protected]>
Date:   2015-12-18T04:52:55Z

    fix format

----


