[ 
https://issues.apache.org/jira/browse/SPARK-25047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572410#comment-16572410
 ] 

Sean Owen commented on SPARK-25047:
-----------------------------------

More notes. These two SO answers shed a little light:

[https://stackoverflow.com/a/28367602/64174]

[https://stackoverflow.com/questions/28079307/unable-to-deserialize-lambda/28084460#28084460]

They suggest the problem is that the deserialized SerializedLambda instance 
should provide a readResolve() method to, I assume, resolve it back into a 
scala.Function1, and that this should actually be implemented by a 
{{$deserializeLambda$(SerializedLambda)}} function in the capturing class. It 
seems like something isn't turning it back from a SerializedLambda into 
something else.
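
For reference, the normal round trip can be sketched in plain Java (not Scala, and not Spark code). The class and method names below are hypothetical. The sketch relies on the synthetic writeReplace() that serializable lambdas get, which returns a SerializedLambda, and on SerializedLambda.readResolve() calling the capturing class's $deserializeLambda$ on the way back in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Function;

class LambdaRoundTrip {
    // Serialize an object to bytes and read it straight back.
    static Object roundTrip(Object o) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    // Invoke the synthetic writeReplace() of a serializable lambda to see
    // the SerializedLambda that actually goes onto the wire.
    static SerializedLambda serializedForm(Object lambda) throws Exception {
        Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
        writeReplace.setAccessible(true);
        return (SerializedLambda) writeReplace.invoke(lambda);
    }

    // A serializable lambda with no captured arguments.
    static Function<Integer, Integer> addOne() {
        return (Function<Integer, Integer> & Serializable) x -> x + 1;
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        Function<Integer, Integer> f = addOne();
        // The wire form names the class whose $deserializeLambda$ will be called.
        System.out.println(serializedForm(f).getCapturingClass());
        // readResolve() turns the SerializedLambda back into a Function.
        Function<Integer, Integer> g = (Function<Integer, Integer>) roundTrip(f);
        System.out.println(g.apply(41)); // prints 42
    }
}
```

When this path works, the caller never sees a SerializedLambda; it only ever exists inside the stream.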

The method is in the byte code of BucketedRandomProjectionLSH and decompiles as
{code:java}
private static /* synthetic */ Object $deserializeLambda$(SerializedLambda serializedLambda) {
    return LambdaDeserialize.bootstrap(new MethodHandle[]{
        $anonfun$hashDistance$1$adapted(scala.Tuple2 ),
        $anonfun$hashFunction$2$adapted(org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel org.apache.spark.ml.linalg.Vector org.apache.spark.ml.linalg.Vector ),
        $anonfun$hashFunction$3$adapted(java.lang.Object ),
        $anonfun$hashFunction$1(org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel org.apache.spark.ml.linalg.Vector )
    }, serializedLambda);
}{code}
While I traced through this for a while, I couldn't make sense of it. However, 
nothing actually failed around here. The ultimate error came a bit later, and 
matches the one in the StackOverflow post above.
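
One known way to get this kind of ClassCastException in plain Java is a reference cycle: a lambda captures an object, that object holds the same lambda in a field, and serialization starts from the lambda. The back-reference inside the stream then points at the not-yet-resolved SerializedLambda. The sketch below shows that mechanism only. Whether it is what happens in BucketedRandomProjectionLSHModel is an assumption, and Model, offset and hashFunction here are hypothetical stand-ins.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

class CyclicLambdaRepro {
    // Hypothetical stand-in for a model with a function-typed field.
    static class Model implements Serializable {
        private final int offset;
        final Function<Integer, Integer> hashFunction;

        Model(int offset) {
            this.offset = offset;
            // Reads this.offset, so the lambda captures `this`:
            // model -> lambda -> model is a cycle.
            this.hashFunction =
                (Function<Integer, Integer> & Serializable) x -> x + this.offset;
        }
    }

    static byte[] serialize(Object o) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws Exception {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Start from the lambda: the model's field is read as a back-reference
    // to the raw SerializedLambda before readResolve() has run.
    // Returns the exception message, or null if nothing failed.
    static String tryLambdaFirst() throws Exception {
        byte[] data = serialize(new Model(1).hashFunction);
        try {
            deserialize(data);
            return null;
        } catch (ClassCastException e) {
            return e.getMessage();
        }
    }

    // Start from the model: the lambda is resolved before it is assigned
    // to the field, so the round trip succeeds.
    static int modelFirst() throws Exception {
        Model copy = (Model) deserialize(serialize(new Model(1)));
        return copy.hashFunction.apply(41);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(tryLambdaFirst());
        System.out.println(modelFirst()); // prints 42
    }
}
```

Serializing the model itself round-trips fine, which would be consistent with a simple repro working while a code path that ships the function on its own fails.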

There are, of course, plenty of fields of type scala.Function1 in Spark, and 
this is the only problematic one; I can't see why. Is it because it involves 
an array type? Grepping suggests that could be unique. However, when I tried 
to create a repro in a simple class file, everything worked as expected.

Something is odd about this case, and I don't know if it is in fact triggering 
some odd corner case issue in Scala or Java 8, or whether the Spark code could 
be tweaked to dodge it.
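
One possible shape for such a tweak, sketched in plain Java rather than Scala: expose the function through a method instead of storing it in a field, so the owning object has no function-typed field for the deserializer to fill. This is roughly the Java analogue of turning a "val" into a "def". It is an illustration, not a confirmed fix, and Model, offset and hashFunction() are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

class MethodInsteadOfField {
    // Hypothetical model: only plain data is stored as state.
    static class Model implements Serializable {
        private final int offset;

        Model(int offset) {
            this.offset = offset;
        }

        // Built on demand. The lambda still captures `this`, but Model no
        // longer holds a reference back to the lambda.
        Function<Integer, Integer> hashFunction() {
            return (Function<Integer, Integer> & Serializable) x -> x + this.offset;
        }
    }

    // Serialize the lambda on its own, read it back, and apply it.
    @SuppressWarnings("unchecked")
    static int roundTripAndApply(int input) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Model(1).hashFunction());
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Function<Integer, Integer> f = (Function<Integer, Integer>) in.readObject();
            return f.apply(input);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTripAndApply(41)); // prints 42
    }
}
```

The trade-off is that a new function object is created on every call, which only matters if callers rely on identity or the construction is expensive.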

 

> Can't assign SerializedLambda to scala.Function1 in deserialization of 
> BucketedRandomProjectionLSHModel
> -------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-25047
>                 URL: https://issues.apache.org/jira/browse/SPARK-25047
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>    Affects Versions: 2.4.0
>            Reporter: Sean Owen
>            Priority: Major
>
> Another distinct test failure:
> {code:java}
> - BucketedRandomProjectionLSH: streaming transform *** FAILED ***
>   org.apache.spark.sql.streaming.StreamingQueryException: Query [id = 
> 7f34fb07-a718-4488-b644-d27cfd29ff6c, runId = 
> 0bbc0ba2-2952-4504-85d6-8aba877ba01b] terminated with exception: Job aborted 
> due to stage failure: Task 0 in stage 16.0 failed 1 times, most recent 
> failure: Lost task 0.0 in stage 16.0 (TID 16, localhost, executor driver): 
> java.lang.ClassCastException: cannot assign instance of 
> java.lang.invoke.SerializedLambda to field 
> org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel.hashFunction of 
> type scala.Function1 in instance of 
> org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
> ...
>   Cause: java.lang.ClassCastException: cannot assign instance of 
> java.lang.invoke.SerializedLambda to field 
> org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel.hashFunction of 
> type scala.Function1 in instance of 
> org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
>   at 
> java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2233)
>   at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1405)
>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2284)
> ...{code}
> Here the different nature of a Java 8 LMF closure trips up Java 
> serialization/deserialization. I think this can be patched by manually 
> implementing the Java serialization here, and I don't see other instances (yet).
> Also wondering if this "val" can be a "def".


