[
https://issues.apache.org/jira/browse/SPARK-12350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059788#comment-15059788
]
Yanbo Liang edited comment on SPARK-12350 at 12/16/15 10:15 AM:
----------------------------------------------------------------
I can reproduce this issue, but it is not caused by ML: the transformed
DataFrame is still printed at the end of the error log. If the program is
run outside spark-shell, it works without the error.
was (Author: yanboliang):
I can reproduce this issue, but it is not caused by ML because it can output the
transformed dataframe at the end of the error log. And if we did not use
spark-shell to run this program, it works well.
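The observation that the program works outside spark-shell could be checked with a standalone application along the following lines. This is only a minimal sketch, assuming the Spark 1.x API (SparkConf, SparkContext, SQLContext); the object name, the application name and the local[2] master are illustrative assumptions and are not taken from the report.
{code}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SQLContext

// Hypothetical object name; not part of the original report.
object VectorAssemblerRepro {
  def main(args: Array[String]): Unit = {
    // local[2] is an assumption so the sketch can run without a cluster.
    val conf = new SparkConf()
      .setAppName("SPARK-12350-repro")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // Provides toDF() on RDDs of tuples; spark-shell imports this automatically.
    import sqlContext.implicits._

    try {
      // Same dummy dataframe and assembler as in the issue description.
      val df = sc.parallelize(List((1, 2), (3, 4))).toDF()
      val assembler = new VectorAssembler()
        .setInputCols(Array("_1", "_2"))
        .setOutputCol("features")
      assembler.transform(df).show()
    } finally {
      sc.stop()
    }
  }
}
{code}
Because no REPL-generated classes are involved here, a run of this kind may help separate a spark-shell class-loading problem from a problem in VectorAssembler itself.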
> VectorAssembler#transform() initially throws an exception
> ---------------------------------------------------------
>
> Key: SPARK-12350
> URL: https://issues.apache.org/jira/browse/SPARK-12350
> Project: Spark
> Issue Type: Bug
> Components: ML
> Environment: sparkShell command from sbt
> Reporter: Jakob Odersky
>
> Calling VectorAssembler.transform() throws an exception on the first call;
> subsequent calls work.
> h3. Steps to reproduce
> In spark-shell,
> 1. Create a dummy dataframe and define an assembler
> {code}
> import org.apache.spark.ml.feature.VectorAssembler
> val df = sc.parallelize(List((1,2), (3,4))).toDF
> val assembler = new VectorAssembler().setInputCols(Array("_1", "_2")).setOutputCol("features")
> {code}
> 2. Run
> {code}
> assembler.transform(df).show
> {code}
> Initially the following exception is thrown:
> {code}
> 15/12/15 16:20:19 ERROR TransportRequestHandler: Error opening stream /classes/org/apache/spark/sql/catalyst/expressions/Object.class for request from /9.72.139.102:60610
> java.lang.IllegalArgumentException: requirement failed: File not found: /classes/org/apache/spark/sql/catalyst/expressions/Object.class
>     at scala.Predef$.require(Predef.scala:233)
>     at org.apache.spark.rpc.netty.NettyStreamManager.openStream(NettyStreamManager.scala:60)
>     at org.apache.spark.network.server.TransportRequestHandler.processStreamRequest(TransportRequestHandler.java:136)
>     at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:106)
>     at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
>     at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>     at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>     at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>     at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>     at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> Subsequent calls work:
> {code}
> +---+---+---------+
> | _1| _2| features|
> +---+---+---------+
> |  1|  2|[1.0,2.0]|
> |  3|  4|[3.0,4.0]|
> +---+---+---------+
> {code}
> It seems as though there is some internal state that is not initialized.
> [~iyounus] originally found this issue.