LuciferYang commented on PR #50665: URL: https://github.com/apache/spark/pull/50665#issuecomment-2838503692
It appears that merging this PR caused test failures in `org.apache.spark.sql.connect.ml.MLSuite` in the `connect` module:

- https://github.com/apache/spark/actions/runs/14728199705/job/41335911530

Here is how I reproduced it locally:

```
git reset --hard 6f9bf73c345d70c3d27ea2e1ebadaa03a275fb3c   // this one
build/sbt clean "connect/testOnly org.apache.spark.sql.connect.ml.MLSuite"
```

```
[info] - LogisticRegression works *** FAILED *** (8 seconds, 2 milliseconds)
[info] org.apache.spark.SparkRuntimeException: [EXPRESSION_DECODING_FAILED] Failed to decode a row to a value of the expressions: newInstance(class org.apache.spark.ml.classification.LogisticRegressionModel$Data). SQLSTATE: 42846
[info] at org.apache.spark.sql.errors.QueryExecutionErrors$.expressionDecodingError(QueryExecutionErrors.scala:1364)
[info] at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Deserializer.apply(ExpressionEncoder.scala:95)
[info] at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Deserializer.apply(ExpressionEncoder.scala:80)
[info] at scala.collection.ArrayOps$.map$extension(ArrayOps.scala:936)
[info] at org.apache.spark.sql.classic.Dataset.collectFromPlan(Dataset.scala:2244)
[info] at org.apache.spark.sql.classic.Dataset.$anonfun$head$1(Dataset.scala:1381)
[info] at org.apache.spark.sql.classic.Dataset.$anonfun$withAction$2(Dataset.scala:2234)
[info] at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:642)
[info] at org.apache.spark.sql.classic.Dataset.$anonfun$withAction$1(Dataset.scala:2232)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$8(SQLExecution.scala:162)
[info] at org.apache.spark.sql.execution.SQLExecution$.withSessionTagsApplied(SQLExecution.scala:268)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$7(SQLExecution.scala:124)
[info] at org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:94)
[info] at org.apache.spark.sql.artifact.ArtifactManager.$anonfun$withResources$1(ArtifactManager.scala:112)
[info] at org.apache.spark.sql.artifact.ArtifactManager.withClassLoaderIfNeeded(ArtifactManager.scala:106)
[info] at org.apache.spark.sql.artifact.ArtifactManager.withResources(ArtifactManager.scala:111)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$6(SQLExecution.scala:124)
[info] at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:291)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$1(SQLExecution.scala:123)
[info] at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:804)
[info] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId0(SQLExecution.scala:77)
[info] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:233)
[info] at org.apache.spark.sql.classic.Dataset.withAction(Dataset.scala:2232)
[info] at org.apache.spark.sql.classic.Dataset.head(Dataset.scala:1381)
[info] at org.apache.spark.sql.Dataset.head(Dataset.scala:2683)
[info] at org.apache.spark.ml.util.ReadWriteUtils$.loadObject(ReadWrite.scala:881)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel$LogisticRegressionModelReader.load(LogisticRegression.scala:1375)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel$LogisticRegressionModelReader.load(LogisticRegression.scala:1350)
[info] at org.apache.spark.ml.util.MLReadable.load(ReadWrite.scala:385)
[info] at org.apache.spark.ml.util.MLReadable.load$(ReadWrite.scala:385)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel$.load(LogisticRegression.scala:1332)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel.load(LogisticRegression.scala)
...
[info] Cause: java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 35, Column 8: Failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 35, Column 8: Private member cannot be accessed from type "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection".
[info] at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:604)
[info] at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:559)
[info] at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:114)
[info] at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:247)
[info] at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2349)
[info] at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2317)
[info] at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2190)
[info] at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2080)
[info] at com.google.common.cache.LocalCache.get(LocalCache.java:4017)
[info] at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4040)
[info] at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4989)
[info] at org.apache.spark.util.NonFateSharingLoadingCache.$anonfun$get$2(NonFateSharingCache.scala:108)
[info] at org.apache.spark.util.KeyLock.withLock(KeyLock.scala:64)
[info] at org.apache.spark.util.NonFateSharingLoadingCache.get(NonFateSharingCache.scala:108)
[info] at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:1490)
[info] at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:205)
[info] at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:39)
[info] at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:1415)
[info] at org.apache.spark.sql.catalyst.expressions.SafeProjection$.createCodeGeneratedObject(Projection.scala:172)
[info] at org.apache.spark.sql.catalyst.expressions.SafeProjection$.createCodeGeneratedObject(Projection.scala:169)
[info] at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:45)
[info] at org.apache.spark.sql.catalyst.expressions.SafeProjection$.create(Projection.scala:195)
[info] at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Deserializer.apply(ExpressionEncoder.scala:87)
[info] at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Deserializer.apply(ExpressionEncoder.scala:80)
[info] at scala.collection.ArrayOps$.map$extension(ArrayOps.scala:936)
[info] at org.apache.spark.sql.classic.Dataset.collectFromPlan(Dataset.scala:2244)
[info] at org.apache.spark.sql.classic.Dataset.$anonfun$head$1(Dataset.scala:1381)
[info] at org.apache.spark.sql.classic.Dataset.$anonfun$withAction$2(Dataset.scala:2234)
[info] at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:642)
[info] at org.apache.spark.sql.classic.Dataset.$anonfun$withAction$1(Dataset.scala:2232)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$8(SQLExecution.scala:162)
[info] at org.apache.spark.sql.execution.SQLExecution$.withSessionTagsApplied(SQLExecution.scala:268)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$7(SQLExecution.scala:124)
[info] at org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:94)
[info] at org.apache.spark.sql.artifact.ArtifactManager.$anonfun$withResources$1(ArtifactManager.scala:112)
[info] at org.apache.spark.sql.artifact.ArtifactManager.withClassLoaderIfNeeded(ArtifactManager.scala:106)
[info] at org.apache.spark.sql.artifact.ArtifactManager.withResources(ArtifactManager.scala:111)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$6(SQLExecution.scala:124)
[info] at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:291)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$1(SQLExecution.scala:123)
[info] at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:804)
[info] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId0(SQLExecution.scala:77)
[info] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:233)
[info] at org.apache.spark.sql.classic.Dataset.withAction(Dataset.scala:2232)
[info] at org.apache.spark.sql.classic.Dataset.head(Dataset.scala:1381)
[info] at org.apache.spark.sql.Dataset.head(Dataset.scala:2683)
[info] at org.apache.spark.ml.util.ReadWriteUtils$.loadObject(ReadWrite.scala:881)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel$LogisticRegressionModelReader.load(LogisticRegression.scala:1375)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel$LogisticRegressionModelReader.load(LogisticRegression.scala:1350)
[info] at org.apache.spark.ml.util.MLReadable.load(ReadWrite.scala:385)
[info] at org.apache.spark.ml.util.MLReadable.load$(ReadWrite.scala:385)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel$.load(LogisticRegression.scala:1332)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel.load(LogisticRegression.scala)
[info] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[info] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[info] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] at java.base/java.lang.reflect.Method.invoke(Method.java:569)
[info] at org.apache.spark.sql.connect.ml.MLUtils$.loadOperator(MLUtils.scala:422)
[info] at org.apache.spark.sql.connect.ml.MLUtils$.loadTransformer(MLUtils.scala:447)
[info] at org.apache.spark.sql.connect.ml.MLHandler$.handleMlCommand(MLHandler.scala:262)
[info] at org.apache.spark.sql.connect.ml.MLHelper.readWrite(MLHelper.scala:227)
[info] at org.apache.spark.sql.connect.ml.MLHelper.readWrite$(MLHelper.scala:196)
[info] at org.apache.spark.sql.connect.ml.MLSuite.readWrite(MLSuite.scala:69)
[info] at org.apache.spark.sql.connect.ml.MLSuite.$anonfun$new$2(MLSuite.scala:236)
...
[info] Cause: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 35, Column 8: Failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 35, Column 8: Private member cannot be accessed from type "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection".
[info] at org.apache.spark.sql.errors.QueryExecutionErrors$.compilerError(QueryExecutionErrors.scala:688)
[info] at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.doCompile(CodeGenerator.scala:1557)
[info] at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.$anonfun$cache$1(CodeGenerator.scala:1636)
[info] at org.apache.spark.util.NonFateSharingCache$$anon$1.load(NonFateSharingCache.scala:68)
[info] at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3574)
[info] at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2316)
[info] at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2190)
[info] at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2080)
[info] at com.google.common.cache.LocalCache.get(LocalCache.java:4017)
[info] at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4040)
[info] at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4989)
[info] at org.apache.spark.util.NonFateSharingLoadingCache.$anonfun$get$2(NonFateSharingCache.scala:108)
[info] at org.apache.spark.util.KeyLock.withLock(KeyLock.scala:64)
[info] at org.apache.spark.util.NonFateSharingLoadingCache.get(NonFateSharingCache.scala:108)
[info] at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:1490)
[info] at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:205)
[info] at org.apache.spark.sql.catalyst.expressions.codegen.GenerateSafeProjection$.create(GenerateSafeProjection.scala:39)
[info] at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:1415)
[info] at org.apache.spark.sql.catalyst.expressions.SafeProjection$.createCodeGeneratedObject(Projection.scala:172)
[info] at org.apache.spark.sql.catalyst.expressions.SafeProjection$.createCodeGeneratedObject(Projection.scala:169)
[info] at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:45)
[info] at org.apache.spark.sql.catalyst.expressions.SafeProjection$.create(Projection.scala:195)
[info] at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Deserializer.apply(ExpressionEncoder.scala:87)
[info] at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Deserializer.apply(ExpressionEncoder.scala:80)
[info] at scala.collection.ArrayOps$.map$extension(ArrayOps.scala:936)
[info] at org.apache.spark.sql.classic.Dataset.collectFromPlan(Dataset.scala:2244)
[info] at org.apache.spark.sql.classic.Dataset.$anonfun$head$1(Dataset.scala:1381)
[info] at org.apache.spark.sql.classic.Dataset.$anonfun$withAction$2(Dataset.scala:2234)
[info] at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:642)
[info] at org.apache.spark.sql.classic.Dataset.$anonfun$withAction$1(Dataset.scala:2232)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$8(SQLExecution.scala:162)
[info] at org.apache.spark.sql.execution.SQLExecution$.withSessionTagsApplied(SQLExecution.scala:268)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$7(SQLExecution.scala:124)
[info] at org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:94)
[info] at org.apache.spark.sql.artifact.ArtifactManager.$anonfun$withResources$1(ArtifactManager.scala:112)
[info] at org.apache.spark.sql.artifact.ArtifactManager.withClassLoaderIfNeeded(ArtifactManager.scala:106)
[info] at org.apache.spark.sql.artifact.ArtifactManager.withResources(ArtifactManager.scala:111)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$6(SQLExecution.scala:124)
[info] at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:291)
[info] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$1(SQLExecution.scala:123)
[info] at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:804)
[info] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId0(SQLExecution.scala:77)
[info] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:233)
[info] at org.apache.spark.sql.classic.Dataset.withAction(Dataset.scala:2232)
[info] at org.apache.spark.sql.classic.Dataset.head(Dataset.scala:1381)
[info] at org.apache.spark.sql.Dataset.head(Dataset.scala:2683)
[info] at org.apache.spark.ml.util.ReadWriteUtils$.loadObject(ReadWrite.scala:881)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel$LogisticRegressionModelReader.load(LogisticRegression.scala:1375)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel$LogisticRegressionModelReader.load(LogisticRegression.scala:1350)
[info] at org.apache.spark.ml.util.MLReadable.load(ReadWrite.scala:385)
[info] at org.apache.spark.ml.util.MLReadable.load$(ReadWrite.scala:385)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel$.load(LogisticRegression.scala:1332)
[info] at org.apache.spark.ml.classification.LogisticRegressionModel.load(LogisticRegression.scala)
[info] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[info] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
[info] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] at java.base/java.lang.reflect.Method.invoke(Method.java:569)
[info] at org.apache.spark.sql.connect.ml.MLUtils$.loadOperator(MLUtils.scala:422)
[info] at org.apache.spark.sql.connect.ml.MLUtils$.loadTransformer(MLUtils.scala:447)
[info] at org.apache.spark.sql.connect.ml.MLHandler$.handleMlCommand(MLHandler.scala:262)
[info] at org.apache.spark.sql.connect.ml.MLHelper.readWrite(MLHelper.scala:227)
[info] at org.apache.spark.sql.connect.ml.MLHelper.readWrite$(MLHelper.scala:196)
[info] at org.apache.spark.sql.connect.ml.MLSuite.readWrite(MLSuite.scala:69)
[info] at org.apache.spark.sql.connect.ml.MLSuite.$anonfun$new$2(MLSuite.scala:236)
...
```

```
git reset --hard 86bf4c84805e89354d139ab72b298d3d4155fd0d   // before this one
build/sbt clean "connect/testOnly org.apache.spark.sql.connect.ml.MLSuite"
```

```
[info] MLSuite:
[info] - reconcileParam (141 milliseconds)
[info] - LogisticRegression works (5 seconds, 808 milliseconds)
[info] - Exception: Unsupported ML operator (15 milliseconds)
[info] - Exception: cannot retrieve object (246 milliseconds)
[info] - access the attribute which is not in allowed list (205 milliseconds)
[info] - Model must be registered into ServiceLoader when loading (1 millisecond)
[info] - RegressionEvaluator works (164 milliseconds)
[info] - VectorAssembler works (178 milliseconds)
[info] - Memory limitation of MLCache works (951 milliseconds)
[info] Run completed in 10 seconds, 668 milliseconds.
[info] Total number of tests run: 9
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 9, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
```

The reason this issue was not detected by this PR's GitHub Actions run is that changes in the `mllib` module no longer trigger the tests for the `connect` module.
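One way to close this kind of gap is to select test modules from changed paths via a transitive dependency closure, so that downstream modules run whenever an upstream module changes. A minimal, hypothetical sketch of that idea (the dependency map and function names here are illustrative, not Spark's actual `dev/sparktestsupport` logic):

```python
# Hypothetical downstream map: each module points to the modules that
# depend on it. In this illustration, `connect` depends on `mllib`.
DOWNSTREAM = {
    "mllib": {"connect"},
    "connect": set(),
}

def modules_to_test(changed_modules):
    """Return the changed modules plus all transitive downstream modules."""
    selected = set()
    stack = list(changed_modules)
    while stack:
        module = stack.pop()
        if module not in selected:
            selected.add(module)
            # Also schedule every module that depends on this one.
            stack.extend(DOWNSTREAM.get(module, ()))
    return selected

print(sorted(modules_to_test({"mllib"})))  # ['connect', 'mllib']
```

With a map like this, a change touching only `mllib` would still schedule the `connect` test suites, and the `MLSuite` regression above would have surfaced in the PR's own CI run.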