[ https://issues.apache.org/jira/browse/PIG-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272765#comment-13272765 ]
Jonathan Coveney commented on PIG-2651:
---------------------------------------

Alan, can you take a look at PIG-2066? That has the fundamental TerminatingAccumulator work, and I'd like to keep the testing/code for that there, and have this patch focus on the IteratingAccumulatorEvalFunc interface once that is finished.

I have javadocs, but am not sure what stability and audience annotations to add. For TerminatingAccumulator, I think it could be considered public and stable...for IteratingAccumulatorEvalFunc, Public and Evolving?

> Provide a much easier to use accumulator interface
> --------------------------------------------------
>
>                 Key: PIG-2651
>                 URL: https://issues.apache.org/jira/browse/PIG-2651
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jonathan Coveney
>            Assignee: Jonathan Coveney
>             Fix For: 0.11, 0.10.1
>
>         Attachments: PIG-2651-0.patch
>
>
> This introduces a new interface, IteratingAccumulatorEvalFunc (that name is
> NOT final...). The cool thing about this patch is that it is built purely on
> top of the existing Accumulator code (well, it uses PIG-2066, but it could
> easily work without it). That is to say, it's an easier way to write
> accumulators without having to fork the Pig codebase.
> The downside is that the only way I am able to provide such a clean interface
> is by using a second thread. I need to explore any potential performance
> implications, but given that most of the easy to use Pig stuff has
> performance implications, I think as long as we measure and document
> them, it's worth the much more usable interface. Plus I don't think it will
> be too bad as one thread does the heavy lifting, while another just ferries
> values in between.
> SUM could now be written as:
> {code}
> public class SUM extends IteratingAccumulatorEvalFunc<Long> {
>     public Long exec(Iterator<Tuple> it) throws IOException {
>         long sum = 0;
>         while (it.hasNext()) {
>             sum += (Long)it.next().get(0);
>         }
>         return sum;
>     }
> }
> {code}
> Besides performance tests, I need to figure out how to properly test this
> sort of thing. I particularly welcome advice on that front.
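
The description says the iterator-style interface sits on top of the existing accumulate-style calls by using a second thread, with one thread ferrying values to the other. The following is a minimal sketch of that idea in plain Java, not the code from the attached patch. The class name IteratingBridge, the queue capacity, the end marker, and the 10 ms polling interval are all assumptions made for illustration, and the sketch assumes values are never null. It also stops ferrying once the worker has returned, which is a loose stand-in for the early-termination concern that PIG-2066 addresses.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical stand-in, NOT the class from the PIG-2651 patch.
// Assumes values are non-null (ArrayBlockingQueue rejects null).
class IteratingBridge<T, R> {
    private static final Object END = new Object(); // end-of-input marker
    private final BlockingQueue<Object> queue = new ArrayBlockingQueue<>(16);
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Future<R> result;

    IteratingBridge(Function<Iterator<T>, R> exec) {
        // Second thread: runs the user's iterator-style code (the heavy lifting).
        Callable<R> task = () -> exec.apply(new QueueIterator());
        result = worker.submit(task);
    }

    // Called by the framework thread once per batch (accumulate-style).
    void accumulate(Iterable<T> batch) {
        for (T value : batch) {
            hand(value);
        }
    }

    // Called once after the last batch: signal end of input, wait for the result.
    R getValue() {
        hand(END);
        try {
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            worker.shutdown();
        }
    }

    // Ferry one item to the worker; give up if the worker already finished,
    // so a worker that stops reading early cannot block the caller forever.
    private void hand(Object item) {
        try {
            while (!result.isDone()) {
                if (queue.offer(item, 10, TimeUnit.MILLISECONDS)) {
                    return;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Iterator seen by the user's code; blocks until the next value arrives.
    private final class QueueIterator implements Iterator<T> {
        private Object next;   // null means "nothing fetched yet"
        private boolean done;

        @Override
        public boolean hasNext() {
            if (done) {
                return false;
            }
            if (next == null) {
                try {
                    next = queue.take();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
                if (next == END) {
                    done = true;
                    next = null;
                    return false;
                }
            }
            return true;
        }

        @Override
        @SuppressWarnings("unchecked")
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            T value = (T) next;
            next = null;
            return value;
        }
    }

    public static void main(String[] args) {
        // Same shape as the SUM example above, fed in two batches.
        IteratingBridge<Long, Long> sum = new IteratingBridge<>(it -> {
            long total = 0;
            while (it.hasNext()) {
                total += it.next();
            }
            return total;
        });
        sum.accumulate(Arrays.asList(1L, 2L));
        sum.accumulate(Arrays.asList(3L));
        System.out.println(sum.getValue()); // prints 6
    }
}
```

On the testing question, a sketch like this can be driven directly from a unit test by calling accumulate with several batches and comparing getValue against a single-threaded computation over the same values. A second case where the function returns before consuming all input checks that the feeding thread does not hang.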