[ https://issues.apache.org/jira/browse/PIG-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679791#comment-14679791 ]
kexianda commented on PIG-4645:
-------------------------------

[~mohitsabharwal], thanks for your comments.

The built-in LongAccumulatorParam is defined as an implicit singleton object.
{code:title=Accumulators.scala|borderStyle=solid}
object AccumulatorParam {
  implicit object LongAccumulatorParam extends AccumulatorParam[Long] {
    def addInPlace(t1: Long, t2: Long) = t1 + t2
    def zero(initialValue: Long) = 0L
  }
  //...
}

//user can write code like this in scala
val accLong = sc.accumulator(0L)(LongAccumulatorParam)
{code}

But Java has no exact equivalent of a Scala singleton object.
{code}
//sparkContext.sc().accumulable(0L, "long", AccumulatorParam.LongAccumulatorParam$); //oops!
{code}

JavaSparkContext.scala provides the helper functions intAccumulator() and doubleAccumulator() for int and double, but there is no such helper function for Long.
{code}
Accumulator<Integer> intAcc = sparkContext.intAccumulator(0, "integer");
Accumulator<Double> doubleAcc = sparkContext.doubleAccumulator(0.0, "double");
//Accumulator<Long> longAcc = sparkContext.longAccumulator(0L, "long"); //oops! no such helper
{code}

That's why we have to implement AccumulatorParam<Long> ourselves.

> Support hadoop-like Counter using spark accumulator
> ---------------------------------------------------
>
>                 Key: PIG-4645
>                 URL: https://issues.apache.org/jira/browse/PIG-4645
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: kexianda
>            Assignee: kexianda
>             Fix For: spark-branch
>
>         Attachments: PIG-4645.patch
>
>
> Pig collects input/output statistics via Counter in MR/Tez mode; we need
> to support this using Spark accumulators.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
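
For reference, a minimal sketch of the hand-written AccumulatorParam<Long> that the comment above argues for. It is not the code from PIG-4645.patch. So that it compiles without Spark on the classpath, the interface below is a local stand-in for org.apache.spark.AccumulatorParam, on the assumption that a Java implementer has to supply addAccumulator, addInPlace, and zero. The class names LongAccumulatorParam (Java version) and LongAccumulatorDemo are hypothetical.

```java
// Local stand-in for org.apache.spark.AccumulatorParam<T> (assumption: a Java
// implementer of the Spark 1.x interface provides these three methods).
interface AccumulatorParam<T> {
    T addAccumulator(T t1, T t2);
    T addInPlace(T t1, T t2);
    T zero(T initialValue);
}

// Hypothetical Java counterpart of Scala's implicit LongAccumulatorParam object.
class LongAccumulatorParam implements AccumulatorParam<Long> {
    @Override
    public Long addAccumulator(Long t1, Long t2) {
        return t1 + t2; // add one new value to the running total
    }

    @Override
    public Long addInPlace(Long t1, Long t2) {
        return t1 + t2; // merge two partial totals
    }

    @Override
    public Long zero(Long initialValue) {
        return 0L; // identity element for addition
    }
}

class LongAccumulatorDemo {
    public static void main(String[] args) {
        AccumulatorParam<Long> param = new LongAccumulatorParam();
        // Simulate merging per-task partial counts into one total.
        Long total = param.zero(0L);
        for (long partial : new long[] {3L, 4L, 5L}) {
            total = param.addInPlace(total, partial);
        }
        System.out.println(total); // prints 12
    }
}
```

With the real Spark interface in place of the stand-in, an instance of such a class would presumably be passed to the JavaSparkContext accumulator overload that takes an AccumulatorParam, for example sparkContext.accumulator(0L, "long", new LongAccumulatorParam()).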