[ https://issues.apache.org/jira/browse/CRUNCH-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625554#comment-14625554 ]
Josh Wills commented on CRUNCH-538:
-----------------------------------

[~gabriel.reid] thanks for the comments here. Replying in an inline-y way.

First, for the XYZWithContext functions, I find that most developers who are good at writing data pipelines make extensive use of counters to track progress and errors in their code. I think that lambdas that don't allow the developer to have access to the counter/etc. info are much less useful in practice.

The other approach I could get on board with would be something that looked like what Cloud Dataflow did, where we have two types of lambdas:

1) Lambdas that simply operate on the value directly and return an Iterable, single value, boolean filter, etc. and
2) A lambda that takes a single FnContext object (or similar naming) that wraps up the current value to be processed, the counters, the configuration, and the output emitter into a single interface, which is modeled directly after Cloud Dataflow DoFns.

The advantage of this would be a) less tight coupling w/the MR stuff directly (although that will always be unavoidable for legacy reasons) and b) we could collapse filterWithContext, mapWithContext, flatMapWithContext into a single parallelDo-style implementation that could still be a lambda. Honestly, the more I think about that, the more I like it.

I hear you on the name parameter making debugging easier, I'm amenable to adding it back in as part of the FnContext change. We should try to encourage our best practices in these API extensions, and named stages are definitely a best practice.

I hear you on the IFilterFn -> FilterFn stuff, but I'm trying to avoid recompilation for folks who are happy rolling along on Java 7 etc., so I'd prefer not to do it for this rev.
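
To make the two lambda styles discussed in this comment concrete, here is a rough, self-contained sketch in plain Java 8. It does not use the Crunch API, and it is not the attached patch. `FnContext` is the name floated in the comment, but its methods (`element`, `emit`, `increment`, `getConfig`), the `map` and `parallelDo` helpers, and the `parseInts` example are all hypothetical names chosen for illustration. Plain `List` and `Map` objects stand in for PCollections, counters, and the job configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

class FnContextSketch {

    /** Hypothetical context: wraps the current value, counters, configuration, and emitter. */
    interface FnContext<S, T> {
        S element();
        void emit(T output);
        void increment(String counter);
        String getConfig(String key, String defaultValue);
    }

    /** Style 1: the lambda only sees the value and returns a single result. */
    static <S, T> List<T> map(List<S> input, Function<S, T> fn) {
        List<T> out = new ArrayList<>();
        for (S value : input) {
            out.add(fn.apply(value));
        }
        return out;
    }

    /**
     * Style 2: one named, parallelDo-style entry point. The lambda may emit zero,
     * one, or many outputs per input, so it covers filter, map, and flatMap.
     */
    static <S, T> List<T> parallelDo(String name, List<S> input,
                                     Map<String, String> conf,
                                     Map<String, Long> counters,
                                     Consumer<FnContext<S, T>> fn) {
        List<T> out = new ArrayList<>();
        for (S value : input) {
            fn.accept(new FnContext<S, T>() {
                public S element() { return value; }
                public void emit(T output) { out.add(output); }
                public void increment(String counter) {
                    // Counters are keyed by stage name, which is why a name parameter helps debugging.
                    counters.merge(name + "." + counter, 1L, Long::sum);
                }
                public String getConfig(String key, String defaultValue) {
                    return conf.getOrDefault(key, defaultValue);
                }
            });
        }
        return out;
    }

    /** Example stage: parse integers, counting (rather than failing on) bad records. */
    static List<Integer> parseInts(List<String> input, Map<String, Long> counters) {
        Map<String, String> conf = new HashMap<>();
        return parallelDo("parse", input, conf, counters,
            (FnContext<String, Integer> ctx) -> {
                try {
                    ctx.emit(Integer.parseInt(ctx.element().trim()));
                } catch (NumberFormatException e) {
                    ctx.increment("errors");
                }
            });
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new HashMap<>();
        List<Integer> lengths = map(Arrays.asList("a", "bb"), String::length);
        List<Integer> parsed = parseInts(Arrays.asList("1", "x", "3"), counters);
        System.out.println(lengths);
        System.out.println(parsed);
        System.out.println(counters);
    }
}
```

In this sketch the context lambda acts as a filter and a map at once: bad records are dropped and counted, good ones are emitted. That is the kind of collapse of filterWithContext, mapWithContext, and flatMapWithContext the comment describes. How a real FnContext would expose Hadoop counters and configuration is left open here.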
> Add support for Java lambdas to PCollection/PTable methods
> ----------------------------------------------------------
>
>                 Key: CRUNCH-538
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-538
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.12.0
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>             Fix For: 0.13.0
>
>         Attachments: CRUNCH-538.patch
>
>
> Java 8 is more-or-less mainstream at this point, and lambdas are one of its
> best new features. Let's add lambda-friendly interfaces and methods to the
> PCollection/PTable classes modeled after the methods defined for Scrunch.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)