[ https://issues.apache.org/jira/browse/SYSTEMML-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15538879#comment-15538879 ]
Mike Dusenberry commented on SYSTEMML-995:
------------------------------------------

Also, just to follow up on that null pointer exception, here's the stack trace:

{code}
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 146 and 156 -- Error evaluating instruction: SPARK°mapmm°_mVar1312·MATRIX·DOUBLE°W·MATRIX·DOUBLE°_mVar1313·MATRIX·DOUBLE°RIGHT°false°MULTI_BLOCK
    at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:335)
    at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:224)
    at org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
    at org.apache.sysml.runtime.controlprogram.ForProgramBlock.execute(ForProgramBlock.java:150)
    ... 29 more
Caused by: java.lang.NullPointerException
    at org.apache.sysml.runtime.controlprogram.caching.CacheBlockFactory.getCode(CacheBlockFactory.java:59)
    at org.apache.sysml.runtime.instructions.spark.data.PartitionedBlock.writeHeaderAndPayload(PartitionedBlock.java:355)
    at org.apache.sysml.runtime.instructions.spark.data.PartitionedBlock.writeExternal(PartitionedBlock.java:332)
    at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1459)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1430)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:44)
    at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:203)
    at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102)
    at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85)
    at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
    at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
    at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326)
    at org.apache.spark.api.java.JavaSparkContext.broadcast(JavaSparkContext.scala:639)
    at org.apache.sysml.runtime.controlprogram.context.SparkExecutionContext.getBroadcastForVariable(SparkExecutionContext.java:536)
    at org.apache.sysml.runtime.instructions.spark.MapmmSPInstruction.processInstruction(MapmmSPInstruction.java:122)
    at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:305)
    ... 32 more
{code}
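The NPE surfaces when the mapmm instruction tries to broadcast a partitioned block. The kind of MLContext call that exercises the dataframe-to-frame conversion discussed in this issue looks roughly like the sketch below. This is not the script that produced the trace: the "__INDEX" column name, the toy data, the one-line DML, the FrameMetadata argument, and the Spark 1.x mllib vector type are all my assumptions for illustration against the SystemML 0.11 MLContext API.

{code}
// Rough illustration only: a DataFrame carrying a row-id column plus a vector
// column is bound to a DML script as a frame input. Column names, data, and the
// DML line are placeholders; "__INDEX" is assumed to be the id-column convention.
import org.apache.spark.mllib.linalg.Vectors
import org.apache.sysml.api.mlcontext.{FrameMetadata, MLContext}
import org.apache.sysml.api.mlcontext.ScriptFactory.dml

val df = sqlContext.createDataFrame(Seq(
  (1.0, Vectors.dense(1.0, 2.0, 3.0)),
  (2.0, Vectors.dense(4.0, 5.0, 6.0))
)).toDF("__INDEX", "features")            // id column + vector column

val ml = new MLContext(sc)
// FrameMetadata is passed (as I understand the 0.11 API) to request the
// dataframe-to-frame conversion path rather than the matrix path.
val script = dml("print(ncol(F));").in("F", df, new FrameMetadata())
ml.execute(script)
{code}

If the conversion treats the id column as data, or mis-counts the vector column, the resulting frame shape no longer matches what the script expects, which is consistent with downstream runtime failures like the one above.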
> MLContext dataframe-frame conversion with index column & vector column
> ----------------------------------------------------------------------
>
>                 Key: SYSTEMML-995
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-995
>             Project: SystemML
>          Issue Type: Bug
>          Components: APIs
>    Affects Versions: SystemML 0.11
>            Reporter: Matthias Boehm
>            Priority: Blocker
>
> MLContext currently always assumes that data frame to frame conversion happens without an existing index column. Since the user cannot communicate the existence of this column, the conversion produces incorrect results because an additional column (the index) is included in the output frame. We need to make the MLContext handling of frames consistent with the handling of matrices.
> Additionally, the conversion code in {{MLContextConversionUtil.dataFrameToFrameObject()}} does not yet take frames with vector columns into account, even though a recent addition introduced this support in the underlying {{FrameRDDConverterUtils.java}} class. As a result, the number of columns set when {{mc == null}} is incorrect.
> Thanks [~mwdus...@us.ibm.com] for catching this issue. cc [~acs_s] [~deron]
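As I read the description, the second point boils down to a counting rule: when no characteristics are supplied ({{mc == null}}), the frame's column count has to be derived from the DataFrame itself, skipping the id column and expanding each vector column to its length. The sketch below only illustrates that rule and is not SystemML's implementation; the "__INDEX" name, the mllib Vector type, and the helper name are assumptions.

{code}
// Illustration only (not the actual SystemML code): derive the frame's column
// count from the DataFrame when no characteristics (mc == null) are available.
// Assumptions: "__INDEX" is the id-column name, vector columns use the Spark 1.x
// mllib Vector type, and the helper name is hypothetical.
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

def inferFrameNumColumns(df: DataFrame, hasIndexColumn: Boolean): Int = {
  // drop the id column (if present) so it is not counted as data
  val dataCols =
    if (hasIndexColumn) df.columns.filterNot(_ == "__INDEX").toSeq else df.columns.toSeq
  // inspect one row to see how wide each column really is
  val firstRow = df.select(dataCols.map(col): _*).first()
  dataCols.indices.map { i =>
    firstRow.get(i) match {
      case v: Vector => v.size  // a vector column expands to one column per entry
      case _         => 1       // scalar columns map 1:1
    }
  }.sum
}
{code}

Applied to the two-column DataFrame in the earlier sketch, this yields 3 data columns, whereas counting the raw schema width would give 2 (vector treated as one column) and including the id column while expanding the vector would give 4.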