Hi Flavio,

Here's a simple example of a Left Outer Join:
https://gist.github.com/mxm/c2e9c459a9d82c18d789

As Stephan pointed out, this can easily be modified to construct a
Right Outer Join: just exchange the roles of leftElements and
rightElements.

Here's an excerpt with the most important part, the coGroup function:

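import java.util.ArrayList;
import java.util.List;

import org.apache.flink.api.common.functions.CoGroupFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public static class LeftOuterJoin implements
      CoGroupFunction<Tuple2<Integer, String>, Tuple2<Integer, String>, Tuple2<Integer, Integer>> {

   // Placeholder emitted when a left element has no matching right element.
   private static final int NULL_ELEMENT = -1;

   @Override
   public void coGroup(Iterable<Tuple2<Integer, String>> leftElements,
                       Iterable<Tuple2<Integer, String>> rightElements,
                       Collector<Tuple2<Integer, Integer>> out) throws Exception {

      // The input Iterables can only be traversed once, so buffer the keys
      // of the right side before looping over the left side.
      List<Integer> rightKeys = new ArrayList<Integer>();
      for (Tuple2<Integer, String> rightElem : rightElements) {
         rightKeys.add(rightElem.f0);
      }

      for (Tuple2<Integer, String> leftElem : leftElements) {
         if (rightKeys.isEmpty()) {
            // No match on the right: keep the left element anyway.
            out.collect(new Tuple2<Integer, Integer>(leftElem.f0, NULL_ELEMENT));
         }
         for (Integer rightKey : rightKeys) {
            out.collect(new Tuple2<Integer, Integer>(leftElem.f0, rightKey));
         }
      }
   }
}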
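
To make the left/right swap concrete, here is a minimal plain-Java sketch of the per-key logic, without any Flink dependency. It is only an illustration: the class OuterJoinSketch and the methods leftOuterGroup and rightOuterGroup are hypothetical names that are not part of Flink or of the gist, and the two lists stand in for the elements that a coGroup call would receive for a single key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, Flink-free illustration of the outer join logic for ONE key group.
class OuterJoinSketch {

    // Placeholder for the missing side, mirroring NULL_ELEMENT in the excerpt.
    static final int NULL_ELEMENT = -1;

    // Left outer join: every left value is kept, padded if the right side is empty.
    static List<List<Integer>> leftOuterGroup(List<Integer> left, List<Integer> right) {
        List<List<Integer>> out = new ArrayList<>();
        for (Integer l : left) {
            if (right.isEmpty()) {
                out.add(Arrays.asList(l, NULL_ELEMENT));
            }
            for (Integer r : right) {
                out.add(Arrays.asList(l, r));
            }
        }
        return out;
    }

    // Right outer join: same structure with the roles of the two sides exchanged.
    // The output order stays (left, right), so the padding goes in the first field.
    static List<List<Integer>> rightOuterGroup(List<Integer> left, List<Integer> right) {
        List<List<Integer>> out = new ArrayList<>();
        for (Integer r : right) {
            if (left.isEmpty()) {
                out.add(Arrays.asList(NULL_ELEMENT, r));
            }
            for (Integer l : left) {
                out.add(Arrays.asList(l, r));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> none = Collections.emptyList();
        // Left side has data, right side is empty: left values survive, padded.
        System.out.println(leftOuterGroup(Arrays.asList(1, 2), none));
        // Right side has data, left side is empty: right values survive, padded.
        System.out.println(rightOuterGroup(none, Arrays.asList(7)));
        // Both sides have data: both variants behave like a regular join.
        System.out.println(leftOuterGroup(Arrays.asList(1), Arrays.asList(10, 11)));
    }
}
```

Both methods take in-memory lists, which can be read as often as needed. Inside a real coGroup function the non-outer side would have to be buffered first, since the input Iterables are not meant to be traversed repeatedly.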



On Wed, Apr 15, 2015 at 11:01 AM, Stephan Ewen <se...@apache.org> wrote:

> I think this may be a great example to add as a utility function.
>
> Or actually add it as a function on the DataSet, internally realized as a
> special case of coGroup.
>
> We do not have a ready example of that, but it should be straightforward
> to realize. As with the join, coGroup on the join keys. Inside the
> coGroup function, emit the combination of all values from the two
> iterators. If one of them is empty (the one that is not outer), then emit
> all values from the outer side.
>
> Greetings,
> Stephan
>
>
> On Wed, Apr 15, 2015 at 10:36 AM, Flavio Pompermaier <pomperma...@okkam.it> wrote:
>
>> Do you have an already working example of it? :)
>>
>>
>> On Wed, Apr 15, 2015 at 10:32 AM, Ufuk Celebi <u...@apache.org> wrote:
>>
>>>
>>> On 15 Apr 2015, at 10:30, Flavio Pompermaier <pomperma...@okkam.it>
>>> wrote:
>>>
>>> >
>>> > Hi to all,
>>> > I have to join two datasets, but I'd like to keep all data from the left
>>> > one even if there's no matching data in the right dataset.
>>> > How can you achieve that in Flink? Maybe I should use coGroup?
>>>
>>> Yes, currently you have to implement this manually with a coGroup
>>
>>
>>
>
