[jira] [Commented] (PHOENIX-2287) Spark Plugin Exception - java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to org.apache.spark.sql.Row

Josh Mahonin (JIRA) Wed, 23 Sep 2015 10:58:17 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904916#comment-14904916
 ]


Josh Mahonin commented on PHOENIX-2287:
---------------------------------------

The updated patch adds support for Spark 1.5.0 and remains backwards compatible 
down to 1.3.0. This was tested manually; Spark version profiles may be worth 
looking at in the future.

In 1.5.0, Spark explicitly hides the GenericMutableRow data structure. 
Fortunately, we are able to use the external-facing 'Row' data type, which is 
backwards compatible and should remain compatible in future releases as well.
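
To illustrate the idea (this is not the code from the patch), here is a minimal 
sketch of building rows through the public 'Row' API instead of the internal 
GenericMutableRow. It assumes spark-sql is on the classpath; the object and 
helper names are hypothetical.

```scala
import org.apache.spark.sql.Row

object RowSketch {
  // Hypothetical helper: wrap already-converted column values in the
  // public Row type rather than the internal GenericMutableRow.
  def toRow(columnValues: Seq[Any]): Row = Row.fromSeq(columnValues)
}
```

For example, RowSketch.toRow(Seq(1L, "test_row_1")) yields a two-column Row 
whose values can be read back with getLong(0) and getString(1).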

As part of the update, Spark SQL deprecated a constructor on its 
'DecimalType'. In updating this, I exposed a new issue: we don't carry forward 
the precision and scale of the underlying PDecimal type through to Spark. For 
now I've set it to use the Spark defaults, but I'll create another issue for 
that specifically. I've included an ignored integration test in this patch as 
well.
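
As a rough, dependency-free sketch of what carrying the precision and scale 
forward could look like (not the patch itself): DecimalSpec is a hypothetical 
stand-in for Spark's DecimalType(precision, scale), and the fallback values are 
placeholders chosen for illustration, not necessarily the defaults Spark uses.

```scala
object DecimalMapping {
  // Hypothetical stand-in for a Spark decimal type descriptor.
  final case class DecimalSpec(precision: Int, scale: Int)

  // Placeholder fallback, assumed for illustration only.
  val Fallback: DecimalSpec = DecimalSpec(38, 18)

  // Use the column's declared precision and scale when both are known;
  // otherwise fall back to the default.
  def decimalSpec(precision: Option[Int], scale: Option[Int]): DecimalSpec =
    (precision, scale) match {
      case (Some(p), Some(s)) => DecimalSpec(p, s)
      case _                  => Fallback
    }
}
```

With this, a column declared as DECIMAL(10, 2) would map to DecimalSpec(10, 2), 
while a column with no declared precision or scale gets the fallback.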

[~maghamravikiran] Could you take a look?

> Spark Plugin Exception - java.lang.ClassCastException: 
> org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to 
> org.apache.spark.sql.Row
> -------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2287
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2287
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.5.2
>         Environment: - HBase 1.1.1 running in standalone mode on OS X
> - Spark 1.5.0
> - Phoenix 4.5.2
>            Reporter: Babar Tareen
>         Attachments: PHOENIX-2287.patch
>
>
> Running the DataFrame example on the Spark Plugin page 
> (https://phoenix.apache.org/phoenix_spark.html) results in the following 
> exception. The same code works as expected with Spark 1.4.1.
> {code:java}
>     import org.apache.spark.SparkContext
>     import org.apache.spark.sql.SQLContext
>     import org.apache.phoenix.spark._
>     val sc = new SparkContext("local", "phoenix-test")
>     val sqlContext = new SQLContext(sc)
>     val df = sqlContext.load(
>       "org.apache.phoenix.spark",
>       Map("table" -> "TABLE1", "zkUrl" -> "127.0.0.1:2181")
>     )
>     df
>       .filter(df("COL1") === "test_row_1" && df("ID") === 1L)
>       .select(df("ID"))
>       .show
> {code}
> Exception
> {quote}
> java.lang.ClassCastException: 
> org.apache.spark.sql.catalyst.expressions.GenericMutableRow cannot be cast to 
> org.apache.spark.sql.Row
>     at org.apache.spark.sql.SQLContext$$anonfun$7.apply(SQLContext.scala:439) 
> ~[spark-sql_2.11-1.5.0.jar:1.5.0]
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:363) 
> ~[scala-library-2.11.4.jar:na]
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:363) 
> ~[scala-library-2.11.4.jar:na]
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:363) 
> ~[scala-library-2.11.4.jar:na]
>     at 
> org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:366)
>  ~[spark-sql_2.11-1.5.0.jar:1.5.0]
>     at 
> org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.start(TungstenAggregationIterator.scala:622)
>  ~[spark-sql_2.11-1.5.0.jar:1.5.0]
>     at 
> org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.org$apache$spark$sql$execution$aggregate$TungstenAggregate$$anonfun$$executePartition$1(TungstenAggregate.scala:110)
>  ~[spark-sql_2.11-1.5.0.jar:1.5.0]
>     at 
> org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119)
>  ~[spark-sql_2.11-1.5.0.jar:1.5.0]
>     at 
> org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119)
>  ~[spark-sql_2.11-1.5.0.jar:1.5.0]
>     at 
> org.apache.spark.rdd.MapPartitionsWithPreparationRDD.compute(MapPartitionsWithPreparationRDD.scala:64)
>  ~[spark-core_2.11-1.5.0.jar:1.5.0]
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297) 
> ~[spark-core_2.11-1.5.0.jar:1.5.0]
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:264) 
> ~[spark-core_2.11-1.5.0.jar:1.5.0]
>     at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) 
> ~[spark-core_2.11-1.5.0.jar:1.5.0]
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297) 
> ~[spark-core_2.11-1.5.0.jar:1.5.0]
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:264) 
> ~[spark-core_2.11-1.5.0.jar:1.5.0]
>     at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) 
> ~[spark-core_2.11-1.5.0.jar:1.5.0]
>     at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) 
> ~[spark-core_2.11-1.5.0.jar:1.5.0]
>     at org.apache.spark.scheduler.Task.run(Task.scala:88) 
> ~[spark-core_2.11-1.5.0.jar:1.5.0]
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) 
> ~[spark-core_2.11-1.5.0.jar:1.5.0]
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_45]
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_45]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
