Oops, my bad. Here's a Gist: https://gist.github.com/DavW/e2588e42c45ad8c06038
On 11 December 2015 at 18:43, Josh Wills <josh.wi...@gmail.com> wrote:

> I think it's kind of awesome, but the attachment didn't go through. PR or
> gist?
>
> On Fri, Dec 11, 2015 at 7:42 AM David Whiting <d...@apache.org> wrote:
>
> > While fixing the bug where the IFn version of mapValues on PGroupedTable
> > was missing, I got thinking that this is quite an inefficient way of
> > including support for lambdas and method references, and it still doesn't
> > actually support quite a few of the features that would make it easy to
> > code against.
> >
> > Negative parts of the existing lambda implementation:
> > 1) Explosion of the already-crowded PCollection, PTable and PGroupedTable
> > interfaces, and having to implement those methods in all implementations.
> > 2) Not supporting flatMap to Optional or Stream types.
> > 3) Not exposing convenient types for reduce-type operations (Stream
> > instead of Iterable, for example).
> >
> > Something that would solve all three of these is to build lambda support
> > as a separate artifact (so we can use all java8 types), and instead of the
> > API being directly on the PSomething interfaces, we just have convenient
> > ways to wrap up lambdas into DoFns or MapFns via statically-imported
> > methods.
> >
> > The usage then becomes
> > import static org.apache.crunch.Lambda.*;
> > ...
> > someCollection.parallelDo(flatMap(d -> someFnOf(d)), pt)
> > ...
> > otherGroupedTable.mapValues(reduce(seq -> seq.mapToInt(i -> i).sum()),
> > ints())
> >
> > Where flatMap and reduce are static methods on Lambda, and Lambda goes in
> > its own artifact (to preserve compatibility with 6 and 7 for the rest of
> > Crunch).
> >
> > I've attached a basic proof-of-concept implementation which I've tested a
> > few things with, and I'm very happy to sketch out a more substantial
> > implementation if people here think it's a good idea in general.
> >
> > Thoughts? Ideas? Suggestions? Please tell me if this is crazy.
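
For anyone skimming the thread without opening the Gist, here is a rough, self-contained sketch of what the statically-imported wrappers described above could look like. It is not the code from the Gist. The Emitter, DoFn and MapFn types below are simplified stand-ins declared as interfaces so the example compiles without Crunch on the classpath (in Crunch itself DoFn and MapFn are abstract classes, and real wrappers would also need to be serializable). The class name LambdaSketch and the exact signatures of flatMap and reduce are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class LambdaSketch {

    /** Simplified stand-in for Crunch's Emitter (hypothetical). */
    interface Emitter<T> {
        void emit(T value);
    }

    /** Simplified stand-in for Crunch's DoFn (hypothetical). */
    interface DoFn<S, T> {
        void process(S input, Emitter<T> emitter);
    }

    /** Simplified stand-in for Crunch's MapFn (hypothetical). */
    interface MapFn<S, T> {
        T map(S input);
    }

    /** Wraps a lambda returning a Stream into a DoFn that emits each element. */
    static <S, T> DoFn<S, T> flatMap(Function<S, Stream<T>> fn) {
        return (input, emitter) -> fn.apply(input).forEach(emitter::emit);
    }

    /** Wraps a lambda over a Stream of grouped values into a MapFn over an Iterable. */
    static <V, R> MapFn<Iterable<V>, R> reduce(Function<Stream<V>, R> fn) {
        return values -> fn.apply(StreamSupport.stream(values.spliterator(), false));
    }

    public static void main(String[] args) {
        // flatMap: one input line becomes zero or more emitted words.
        List<String> words = new ArrayList<>();
        DoFn<String, String> splitter = flatMap(line -> Arrays.stream(line.split(" ")));
        splitter.process("a b c", words::add);
        System.out.println(words); // prints [a, b, c]

        // reduce: the grouped values are exposed as a Stream instead of an Iterable.
        MapFn<Iterable<Integer>, Integer> summer = reduce(seq -> seq.mapToInt(i -> i).sum());
        System.out.println(summer.map(Arrays.asList(1, 2, 3))); // prints 6
    }
}
```

The point of this shape is that the PCollection, PTable and PGroupedTable interfaces stay untouched: all of the Java 8 types live in the static helpers, which is what would let them sit in a separate artifact.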