[ 
https://issues.apache.org/jira/browse/DATAFU-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eyal Allweil updated DATAFU-168:
--------------------------------
    Description: 
Once DATAFU-167 is merged, datafu-spark will support Spark versions up to 
2.4.5. However, our implementation of _collectLimitedList_ extends Spark's 
_Collect_ class, whose interface changed in 2.4.6, so compilation fails 
against that version.

 

Here is the relevant line from collectLimitedList: 
[https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala#L104]

Here is the compilation error:
{code:java}
/Users/eyal/git/datafu/datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala:104: class CollectLimitedList needs to be abstract, since:
it has 3 unimplemented members.
/** As seen from class CollectLimitedList, the missing signatures are as follows.
 *  For convenience, these are usable as stub implementations.
 */
  // Members declared in org.apache.spark.sql.catalyst.expressions.aggregate.Collect
  protected val bufferElementType: org.apache.spark.sql.types.DataType = ???
  protected def convertToBufferElement(value: Any): Any = ???
  // Members declared in org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate
  def eval(buffer: scala.collection.mutable.ArrayBuffer[Any]): Any = ???
case class CollectLimitedList(child: Expression,
           ^
one error found
FAILURE: Build failed with an exception.
{code}
 

 

We need to either *1)* update our implementation and drop support for older 
Spark versions (releasing the change in our version 1.8.0), or *2)* copy the 
code in a backwards-compatible way.
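
To make the compiler output concrete, here is a minimal, Spark-free sketch of what option 1 would involve. _CollectLike_ and _LimitedListSketch_ are hypothetical stand-ins, not Spark or DataFu classes, and the element type is simplified to a String label where Spark uses a DataType. The real change would have to supply the same three members against Spark's actual _Collect_.
{code:scala}
import scala.collection.mutable.ArrayBuffer

// Stand-in for the newer shape of the base class: the three members named
// in the compiler output are abstract here. Assumption: String replaces
// Spark's DataType so the sketch runs without Spark on the classpath.
abstract class CollectLike {
  protected val bufferElementType: String
  protected def convertToBufferElement(value: Any): Any
  def eval(buffer: ArrayBuffer[Any]): Any

  // Shared update logic: skip nulls, convert the value, append it.
  def update(buffer: ArrayBuffer[Any], value: Any): ArrayBuffer[Any] = {
    if (value != null) buffer += convertToBufferElement(value)
    buffer
  }
}

// Hypothetical limited collector. Removing any of the three overrides
// below produces a "needs to be abstract" compile error.
class LimitedListSketch(limit: Int) extends CollectLike {
  protected val bufferElementType: String = "any"
  protected def convertToBufferElement(value: Any): Any = value
  def eval(buffer: ArrayBuffer[Any]): Any = buffer.toList

  // Stop accepting values once the buffer holds `limit` elements.
  override def update(buffer: ArrayBuffer[Any], value: Any): ArrayBuffer[Any] =
    if (buffer.size < limit) super.update(buffer, value) else buffer
}

object LimitedListDemo {
  def main(args: Array[String]): Unit = {
    val agg = new LimitedListSketch(limit = 2)
    val buffer = ArrayBuffer.empty[Any]
    Seq("a", null, "b", "c").foreach(v => agg.update(buffer, v))
    println(agg.eval(buffer))
  }
}
{code}
Code written this way compiles only against a base class that declares these members, which is why option 1 implies dropping the older Spark versions.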

Please note that you can replicate this compilation error on the master branch 
even without merging DATAFU-167 by running:
{code:java}
./gradlew :datafu-spark:test -PscalaVersion=2.11 -PsparkVersion=2.4.6 --tests "DataFrame*"
{code}

  was:
Once DATAFU-167 is merged, datafu-spark will support Spark versions up to 
2.4.5. However, because our implementation of _collectLimitedList_ extends 
Spark's {_}collect{_}, and because its interface was changed in 2.4.6, 
compilation is broken for us.

 

(here is the relevant line from collectLimitedList: 
[https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala#L104)]

 

We need to either *1)* update our implementation, and drop support for older 
versions (and then release this in our version 1.8.0) or *2)* copy the code in 
a backwards compatible way.

 

Please note that testing for Spark versions 2.4.6 and up may also require 
updating our version of spark-testing-base.


> Add support for Spark 2.4.6 and up
> ----------------------------------
>
>                 Key: DATAFU-168
>                 URL: https://issues.apache.org/jira/browse/DATAFU-168
>             Project: DataFu
>          Issue Type: Improvement
>    Affects Versions: 1.6.1
>            Reporter: Eyal Allweil
>            Priority: Major
>             Fix For: 1.8.0
>
>
> Once DATAFU-167 is merged, datafu-spark will support Spark versions up to 
> 2.4.5. However, our implementation of _collectLimitedList_ extends Spark's 
> _Collect_ class, whose interface changed in 2.4.6, so compilation fails 
> against that version.
>  
> Here is the relevant line from collectLimitedList: 
> [https://github.com/apache/datafu/blob/master/datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala#L104]
> Here is the compilation error:
> {code:java}
> /Users/eyal/git/datafu/datafu-spark/src/main/scala/spark/utils/overwrites/SparkOverwriteUDAFs.scala:104: class CollectLimitedList needs to be abstract, since:
> it has 3 unimplemented members.
> /** As seen from class CollectLimitedList, the missing signatures are as follows.
>  *  For convenience, these are usable as stub implementations.
>  */
>   // Members declared in org.apache.spark.sql.catalyst.expressions.aggregate.Collect
>   protected val bufferElementType: org.apache.spark.sql.types.DataType = ???
>   protected def convertToBufferElement(value: Any): Any = ???
>   // Members declared in org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate
>   def eval(buffer: scala.collection.mutable.ArrayBuffer[Any]): Any = ???
> case class CollectLimitedList(child: Expression,
>            ^
> one error found
> FAILURE: Build failed with an exception.
> {code}
>  
>  
> We need to either *1)* update our implementation and drop support for older 
> Spark versions (releasing the change in our version 1.8.0), or *2)* copy the 
> code in a backwards-compatible way.
> Please note that you can replicate this compilation error on the master 
> branch even without merging DATAFU-167 by running:
> {code:java}
> ./gradlew :datafu-spark:test -PscalaVersion=2.11 -PsparkVersion=2.4.6 --tests "DataFrame*"
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
