David Whiting created CRUNCH-585:
------------------------------------

             Summary: Move Java 8 lambda support into separate module 
                 Key: CRUNCH-585
                 URL: https://issues.apache.org/jira/browse/CRUNCH-585
             Project: Crunch
          Issue Type: Improvement
            Reporter: David Whiting
             Fix For: 0.14.0
         Attachments: 0001-Java-8-lambda-support-for-Apache-Crunch.patch

As discussed on a previous dev list thread, this patch implements a set of 
operations that make it convenient to construct Crunch pipelines with Java 8 
lambda expressions and method references. It does so by wrapping PCollection 
instances in analogous "LCollection" instances which delegate the necessary 
operations, in much the same way that Scrunch wraps the Crunch Core API.
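
For illustration only, here is a minimal sketch of the wrap-and-delegate idea 
described above. It does not use the Crunch API or the code in the attached 
patch: FakePCollection, FakeLCollection, transform(), wrap() and underlying() 
are hypothetical stand-ins backed by an in-memory list, chosen just to show 
the shape of the approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a core collection type; NOT the Crunch API.
final class FakePCollection<T> {
    // Stand-in for the function object a core API would expect.
    interface Fn<A, B> {
        B apply(A input);
    }

    private final List<T> items;

    FakePCollection(List<T> items) {
        this.items = new ArrayList<>(items);
    }

    // The single "core" operation that the wrapper delegates to.
    <R> FakePCollection<R> transform(Fn<T, R> fn) {
        List<R> out = new ArrayList<>();
        for (T item : items) {
            out.add(fn.apply(item));
        }
        return new FakePCollection<>(out);
    }

    List<T> materialize() {
        return new ArrayList<>(items);
    }
}

// Hypothetical lambda-friendly wrapper; the names are assumptions.
final class FakeLCollection<T> {
    private final FakePCollection<T> delegate;

    private FakeLCollection(FakePCollection<T> delegate) {
        this.delegate = delegate;
    }

    // Wrap a core collection before using the lambda-style operations.
    static <T> FakeLCollection<T> wrap(FakePCollection<T> collection) {
        return new FakeLCollection<>(collection);
    }

    // Unwrap again to reach operations the wrapper does not expose.
    FakePCollection<T> underlying() {
        return delegate;
    }

    // Accepts a java.util.function.Function and adapts it to the core Fn type.
    <R> FakeLCollection<R> map(Function<T, R> fn) {
        FakePCollection<R> mapped = delegate.transform(fn::apply);
        return wrap(mapped);
    }
}

class WrapperSketchDemo {
    public static void main(String[] args) {
        FakePCollection<String> words =
            new FakePCollection<>(Arrays.asList("a", "bb", "ccc"));
        List<Integer> lengths =
            FakeLCollection.wrap(words).map(String::length).underlying().materialize();
        System.out.println(lengths); // prints [1, 2, 3]
    }
}
```

The wrap() and underlying() calls in the demo are the wrapping and unwrapping 
steps that this style of API imposes on the caller.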

I'm still not 100% convinced that this is better for the user than the existing 
lambda support via IMapFn and IDoFn PCollection operations, so I'm still 
interested in people's views on this.

Advantages:
- Concise self-contained implementation
- Methods implemented in terms of a very basic subset of PCollection operations 
(useful if we want to scale down the PCollection API at some point)
- API can be written in terms of the Java 8 library, operating on streams and 
functional interfaces, making it more familiar to a new developer.
- Retain "type '.' and see what I can do" experience.
- Really easy to add new operations (just a default method on the interface)
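
As a rough sketch of the "basic subset" and "default method" points above 
(again not the patch itself): SketchCollection and ListBackedCollection are 
hypothetical names, and the choice of flatMap as the single primitive is an 
assumption made for the example. Every other operation is a default method 
written in terms of that primitive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical interface; NOT the API from the attached patch.
interface SketchCollection<T> {
    // The one primitive an implementation has to supply.
    <R> SketchCollection<R> flatMap(Function<T, Stream<R>> fn);

    List<T> toList();

    // Adding an operation is just a default method built on the primitive.
    default <R> SketchCollection<R> map(Function<T, R> fn) {
        return flatMap(x -> Stream.of(fn.apply(x)));
    }

    default SketchCollection<T> filter(Predicate<T> keep) {
        return flatMap(x -> keep.test(x) ? Stream.of(x) : Stream.<T>empty());
    }
}

// In-memory implementation used only to make the sketch runnable.
final class ListBackedCollection<T> implements SketchCollection<T> {
    private final List<T> items;

    ListBackedCollection(List<T> items) {
        this.items = new ArrayList<>(items);
    }

    @Override
    public <R> SketchCollection<R> flatMap(Function<T, Stream<R>> fn) {
        List<R> out = items.stream().flatMap(fn).collect(Collectors.toList());
        return new ListBackedCollection<>(out);
    }

    @Override
    public List<T> toList() {
        return new ArrayList<>(items);
    }
}

class DefaultMethodSketchDemo {
    public static void main(String[] args) {
        SketchCollection<Integer> numbers =
            new ListBackedCollection<>(Arrays.asList(1, 2, 3, 4));
        List<Integer> result =
            numbers.filter(n -> n % 2 == 0).map(n -> n * 10).toList();
        System.out.println(result); // prints [20, 40]
    }
}
```

Because map() and filter() live on the interface, they still show up when 
typing '.' on any implementation, and an implementation only has to provide 
the primitive.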

Disadvantages:
- PCollections must be wrapped into LCollections before use.
- LCollections must be unwrapped into PCollections to access some existing 
operations.
- Using counters and other contextual data is far more complex.

Some limitations of this particular patch:
- Some omissions in API (not sure how much to implement)
- No Javadocs yet.
- Very poor tests.
- Naming is a bit off (e.g. reduce() or reduceValues(), get() or underlying())

I can fix all that, but I wanted to bring the community in at this point to get 
some feedback on both the idea and the implementation as it's quite a big patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
