[
https://issues.apache.org/jira/browse/SPARK-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anton Anastasov updated SPARK-13453:
------------------------------------
Description:
The signature of the aggregateByKey method on PairRDDs does not give its
aggregation functions (seqOp and combOp) access to the actual key. My proposal
is to add a new overloaded method whose signature includes the key. This should
not be hard.
A workaround is currently possible -- we can map the PairRDD[K, V] to
PairRDD[K, (K, V)] so that each value carries its own key, but this seems
convoluted.
Let me know what you think, and whether I should go ahead with a pull request.
was:
The signature of the aggregateByKey method over PairRDDs does not provide
access to the actual key. My proposal is to create a new overloaded method that
includes the key. This should not be hard.
There is a workaround possible currently -- we can just modify the PairRDD[K,
v] to PairRDD[K, (K, V)], but this seems convoluted.
Let me know what you think, and if I should go ahead with a pull request.
> Including groupKey in the signature of aggregateByKey
> -----------------------------------------------------
>
> Key: SPARK-13453
> URL: https://issues.apache.org/jira/browse/SPARK-13453
> Project: Spark
> Issue Type: Improvement
> Reporter: Anton Anastasov
> Priority: Minor
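
The workaround and the proposed overload described in this issue can be
sketched without a Spark cluster. The block below is only an illustration under
stated assumptions: aggregate_by_key is a hypothetical local stand-in that
mimics the two-phase shape of aggregateByKey (fold within a partition, then
merge across partitions), and aggregate_by_key_with_key is a hypothetical name
for a key-aware variant, not an existing Spark method. The example counts, per
key, how many values are at least as large as the key, which is an aggregation
that needs the key.

```python
import copy


def aggregate_by_key(pairs, zero, seq_op, comb_op, num_partitions=2):
    """Hypothetical local stand-in for aggregateByKey (not Spark itself).

    seq_op(acc, value) folds values inside one partition;
    comb_op(acc1, acc2) merges accumulators from different partitions.
    Neither function receives the key, as in the issue description.
    """
    pairs = list(pairs)
    # Simulate partitioning by dealing the pairs out round-robin.
    partitions = [pairs[i::num_partitions] for i in range(num_partitions)]
    merged = {}
    for part in partitions:
        local = {}
        for key, value in part:
            acc = local[key] if key in local else copy.deepcopy(zero)
            local[key] = seq_op(acc, value)
        for key, acc in local.items():
            merged[key] = comb_op(merged[key], acc) if key in merged else acc
    return merged


def aggregate_by_key_with_key(pairs, zero, seq_op, comb_op, num_partitions=2):
    """Hypothetical key-aware variant: seq_op(key, acc, value).

    Implemented here on top of the workaround, i.e. (K, V) -> (K, (K, V)).
    """
    keyed = [(k, (k, v)) for k, v in pairs]
    return aggregate_by_key(
        keyed,
        zero,
        lambda acc, kv: seq_op(kv[0], acc, kv[1]),
        comb_op,
        num_partitions,
    )


pairs = [(2, 1), (2, 3), (5, 7), (2, 2), (5, 4)]

# Workaround: embed the key in the value so seq_op can read it.
keyed_pairs = [(k, (k, v)) for k, v in pairs]
workaround_result = aggregate_by_key(
    keyed_pairs,
    0,
    lambda acc, kv: acc + (1 if kv[1] >= kv[0] else 0),
    lambda a, b: a + b,
)

# Proposed shape: the key is passed to seq_op directly.
overload_result = aggregate_by_key_with_key(
    pairs,
    0,
    lambda key, acc, value: acc + (1 if value >= key else 0),
    lambda a, b: a + b,
)
```

Both calls produce the same per-key counts. The difference is only where the
key comes from: in the workaround every record carries a second copy of its
key, while in the key-aware variant the caller's function receives it as an
argument.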