[ https://issues.apache.org/jira/browse/SPARK-12350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060650#comment-15060650 ]

Jakob Odersky commented on SPARK-12350:
---------------------------------------

A git-bisect showed that the issue was introduced in 
4a46b8859d3314b5b45a67cdc5c81fecb6e9e78c, a commit that fixes SPARK-11563.
[~vanzin], any idea what could have gone wrong?

> VectorAssembler#transform() initially throws an exception
> ---------------------------------------------------------
>
>                 Key: SPARK-12350
>                 URL: https://issues.apache.org/jira/browse/SPARK-12350
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>         Environment: sparkShell command from sbt
>            Reporter: Jakob Odersky
>
> Calling VectorAssembler.transform() initially throws an exception; subsequent 
> calls work.
> h3. Steps to reproduce
> In spark-shell,
> 1. Create a dummy DataFrame and define an assembler
> {code}
> import org.apache.spark.ml.feature.VectorAssembler
> val df = sc.parallelize(List((1,2), (3,4))).toDF
> val assembler = new VectorAssembler().setInputCols(Array("_1", "_2")).setOutputCol("features")
> {code}
> 2. Run
> {code}
> assembler.transform(df).show
> {code}
> Initially the following exception is thrown:
> {code}
> 15/12/15 16:20:19 ERROR TransportRequestHandler: Error opening stream /classes/org/apache/spark/sql/catalyst/expressions/Object.class for request from /9.72.139.102:60610
> java.lang.IllegalArgumentException: requirement failed: File not found: /classes/org/apache/spark/sql/catalyst/expressions/Object.class
>       at scala.Predef$.require(Predef.scala:233)
>       at org.apache.spark.rpc.netty.NettyStreamManager.openStream(NettyStreamManager.scala:60)
>       at org.apache.spark.network.server.TransportRequestHandler.processStreamRequest(TransportRequestHandler.java:136)
>       at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:106)
>       at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
>       at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
>       at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>       at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>       at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>       at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
>       at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>       at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>       at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>       at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>       at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>       at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
>       at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>       at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>       at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>       at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
> Subsequent calls work:
> {code}
> +---+---+---------+
> | _1| _2| features|
> +---+---+---------+
> |  1|  2|[1.0,2.0]|
> |  3|  4|[3.0,4.0]|
> +---+---+---------+
> {code}
> It seems as though there is some internal state that is not initialized.
> [~iyounus] originally found this issue.
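> A possible workaround sketch (hypothetical, untested): since only the first call appears to fail, force one throwaway call up front, ignore its error, and then use the assembler normally.
> {code}
> import scala.util.Try
> // df and assembler are the objects defined in the reproduction steps above.
> // The first call may fail with the "Error opening stream ... Object.class" error;
> // swallow it and call transform() again, assuming the problem only affects the initial call.
> Try(assembler.transform(df).count())
> assembler.transform(df).show()
> {code}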


