[ https://issues.apache.org/jira/browse/SYSTEMML-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15539422#comment-15539422 ]
Matthias Boehm commented on SYSTEMML-995:
-----------------------------------------

Just to demystify this behavior: (1) the reblock was injected because the script was compiled without input characteristics (and hence with the default textcell input format), (2) with the (wrong) default block size of 1, the reblock was not removed (it is now always removed for frames), and (3) it did not fail locally because of our "in-memory reblock", which simply reads the input into memory (and of course supports all formats).

> MLContext dataframe-frame conversion with index column & vector column
> ----------------------------------------------------------------------
>
>                 Key: SYSTEMML-995
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-995
>             Project: SystemML
>          Issue Type: Bug
>          Components: APIs
>    Affects Versions: SystemML 0.11
>            Reporter: Matthias Boehm
>            Priority: Blocker
>
> MLContext currently always assumes data frame to frame conversion without
> an existing index column. Since the user cannot communicate the existence of
> this column, the data conversion leads to incorrect results as an additional
> column is included in the output frame. We need to make the MLContext handling
> of frames consistent with the handling of matrices.
> Additionally, the conversion code in
> {{MLContextConversionUtil.dataFrameToFrameObject()}} does not yet take into
> account frames with vectors, although the recent addition adds this support
> in the underlying {{FrameRDDConverterUtils.java}} class. Therefore, the
> number of columns set when {{mc == null}} is incorrect.
> Thanks [~mwdus...@us.ibm.com] for catching this issue. cc [~acs_s] [~deron]
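
To illustrate the column-count problem described in the issue, here is a minimal sketch in plain Python. It is not the SystemML code: the `Column` type, the `frame_column_count` helper, and the index column name `__INDEX` are all hypothetical names chosen for this example. The sketch assumes that an index column contributes no column to the output frame and that a vector column expands to one frame column per vector entry.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: the name of the row-index column. Hypothetical for this sketch.
INDEX_COLUMN = "__INDEX"


@dataclass
class Column:
    """Hypothetical stand-in for one data frame schema field."""
    name: str
    vector_length: Optional[int] = None  # None means a scalar column


def frame_column_count(schema: List[Column], has_index: bool) -> int:
    """Count the columns of the resulting frame.

    The index column is skipped only if the caller declares that it exists,
    and each vector column counts once per vector entry.
    """
    count = 0
    for col in schema:
        if has_index and col.name == INDEX_COLUMN:
            continue  # the index column is metadata, not frame data
        count += 1 if col.vector_length is None else col.vector_length
    return count


# Example schema: an index column, one scalar column, one vector of length 3.
example_schema = [
    Column(INDEX_COLUMN),
    Column("name"),
    Column("features", vector_length=3),
]

naive_count = len(example_schema)                           # ignores both effects
with_index = frame_column_count(example_schema, True)       # index declared
without_index = frame_column_count(example_schema, False)   # index not declared
```

In this example the naive count of schema fields is 3, while the frame has 4 data columns once the index column is excluded and the vector is expanded. If the caller cannot declare the index column, it is counted as data and the result becomes 5, which corresponds to the extra output column the issue reports.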