akpatnam25 commented on a change in pull request #33644:
URL: https://github.com/apache/spark/pull/33644#discussion_r690532188
##########
File path: core/src/main/scala/org/apache/spark/rdd/RDD.scala
##########
@@ -1233,6 +1233,22 @@ abstract class RDD[T: ClassTag](
(i, iter) => iter.map((i % curNumPartitions, _))
Review comment:
i don't think we can do that as we did not want to cause binary
incompabilities. that was the whole reason i had to revert back to the config
approach. refer to this thread:
https://github.com/apache/spark/pull/33644#discussion_r687122613
##########
File path: core/src/main/scala/org/apache/spark/rdd/RDD.scala
##########
@@ -1233,6 +1233,22 @@ abstract class RDD[T: ClassTag](
(i, iter) => iter.map((i % curNumPartitions, _))
Review comment:
Overloading also does not work due to the reasons given in the above
thread
##########
File path: core/src/main/scala/org/apache/spark/rdd/RDD.scala
##########
@@ -1233,6 +1233,22 @@ abstract class RDD[T: ClassTag](
(i, iter) => iter.map((i % curNumPartitions, _))
Review comment:
yes, we can add a new method with signature that looks like this:
```
def treeAggregate[U](
zeroValue: U,
seqOp: JFunction2[U, T, U],
combOp: JFunction2[U, U, U],
depth: Int,
finalAggregateOnExecutor: Boolean): U = {
...
```
you're right, we dont need to worry about Mima excludes then.
but that is not the problem. the problem is that this method will need to be
overloaded by the other treeAggregate methods. for example, the other
treeAggregate methods will have to call this version of treeAggregate. this is
an issue because of zeroValue:U is only inspected by the compiler to resolve
the overloaded methods. Due to that the compiler throws ambiguous reference to
overloaded definition, and this cannot be resolved..
there are only 2 workarounds:
1. create a new method with a different name but same signature with the new
parameter. something like `treeAggregateExecutor` and have somewhat duplicate
code in that method
2. use config approach
##########
File path: core/src/main/scala/org/apache/spark/rdd/RDD.scala
##########
@@ -1233,6 +1233,22 @@ abstract class RDD[T: ClassTag](
(i, iter) => iter.map((i % curNumPartitions, _))
Review comment:
yes, we can add a new method with signature that looks like this:
```
def treeAggregate[U](
zeroValue: U,
seqOp: JFunction2[U, T, U],
combOp: JFunction2[U, U, U],
depth: Int,
finalAggregateOnExecutor: Boolean): U = {
...
```
you're right, we dont need to worry about Mima excludes then.
but that is not the problem. the problem is that this method will need to be
overloaded by the other treeAggregate methods. for example, the other
treeAggregate methods will have to call this version of treeAggregate. this is
an issue because of zeroValue:U is only inspected by the compiler to resolve
the overloaded methods. Due to that the compiler throws ambiguous reference to
overloaded definition, and this cannot be resolved..
there are only 2 workarounds:
1. create a new method with a different name but same signature with the new
parameter. something like `treeAggregateExecutor` and have somewhat duplicate
code in that method. me and @venkata91 thought this approach was kind of ugly
2. use config approach
##########
File path: core/src/main/scala/org/apache/spark/rdd/RDD.scala
##########
@@ -1233,6 +1233,22 @@ abstract class RDD[T: ClassTag](
(i, iter) => iter.map((i % curNumPartitions, _))
Review comment:
yes, we can add a new method with signature that looks like this:
```
def treeAggregate[U](
zeroValue: U,
seqOp: JFunction2[U, T, U],
combOp: JFunction2[U, U, U],
depth: Int,
finalAggregateOnExecutor: Boolean): U = {
...
```
you're right, we dont need to worry about Mima excludes then.
but that is not the problem. the problem is that this method will need to be
overloaded by the other treeAggregate methods. for example, the other
treeAggregate methods will have to call this version of treeAggregate. this is
an issue because of zeroValue:U is only inspected by the compiler to resolve
the overloaded methods. Due to that the compiler throws ambiguous reference to
overloaded definition, and this cannot be resolved..
there are only 2 workarounds:
1. create a new method with a different name but same signature with the new
parameter. something like `treeAggregateExecutor` and have somewhat duplicate
code in that method. me and @venkata91 thought this approach was kind of ugly
2. use config approach which is the current implementation
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]