Hi, I have a dataset of string IDs that I want to map to longs using a hash function. I will use the long IDs in a series of iterative computations and then map them back to string IDs. Currently I have one map operation that creates tuples of the string and the long, and another mapper that strips out the strings.
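Roughly, the setup looks like the sketch below. This is plain Python rather than the actual job, and the hash choice (first 8 bytes of SHA-256) and all function names are assumptions made only for illustration.

```python
import hashlib


def string_to_long(string_id: str) -> int:
    """Hash a string ID to a signed 64-bit integer.

    Assumption: the first 8 bytes of SHA-256 stand in for whatever
    hash function the real job uses.
    """
    digest = hashlib.sha256(string_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


def to_pairs(string_ids):
    # First map: emit (string, long) tuples.
    return [(s, string_to_long(s)) for s in string_ids]


def strip_strings(pairs):
    # Second map: drop the strings and keep only the long IDs.
    return [long_id for _, long_id in pairs]


string_ids = ["alice", "bob", "carol"]
pairs = to_pairs(string_ids)      # kept around for mapping back later
long_ids = strip_strings(pairs)   # fed into the iterative computations

# After the iterations: map the long IDs back to the original strings.
lookup = {long_id: s for s, long_id in pairs}
restored = [lookup[long_id] for long_id in long_ids]
```

Both `pairs` and `long_ids` come from the same input, which is why a single operation with two outputs would be simpler than two chained maps.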
Is there an operation that allows more than one output set (basically, splitting a set into two sets)? This would reduce the complexity of the code a lot. Also, how does the optimizer deal with this case? Does it fuse the two map operations and actually run them as if they were a single split?

Cheers,
Martin
