[
https://issues.apache.org/jira/browse/SYSTEMML-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15539221#comment-15539221
]
Mike Dusenberry edited comment on SYSTEMML-995 at 10/1/16 10:08 PM:
--------------------------------------------------------------------
Okay, I tried out the latest update, and there is still one more issue to
address. If {{MLContextConversionUtil.dataFrameToFrameObject}} receives a
DataFrame *without* any frame metadata, a new {{FrameMetadata}} will be created
with an empty {{FrameFormat}}, and so the subsequent
{{isDataFrameWithIDColumn}} function will always return false ([line 360 |
https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/api/mlcontext/MLContextConversionUtil.java#L360]).
We should just create a new function similar to
{{determineMatrixFormatIfNeeded}} for frames, and call it before the
{{isDataFrameWithIDColumn}} function, as is done for DataFrame-matrix
conversions ([line 412 |
https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/api/mlcontext/MLContextConversionUtil.java#L412]).
I can add this method.
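
A rough sketch of what I have in mind, for discussion only. The types below are simplified stand-ins rather than the real SystemML classes: the {{FrameFormat}} values, the {{determineFrameFormatIfNeeded}} name, the use of a plain list of column names in place of the DataFrame schema, and the {{__INDEX}} ID column name are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, self-contained stand-ins; NOT the actual SystemML classes.
class FrameFormatSketch {

    // Assumed name of the ID column (illustrative; check the real constant).
    static final String ID_COLUMN = "__INDEX";

    // Simplified stand-in for the frame format enum.
    enum FrameFormat { DF_WITH_INDEX, DF_WITHOUT_INDEX }

    // Simplified stand-in for the frame metadata; a null format means
    // "the user did not specify one".
    static class FrameMetadata {
        private FrameFormat format;
        FrameFormat getFormat() { return format; }
        void setFormat(FrameFormat format) { this.format = format; }
    }

    // Fills in the format only when the user has not supplied one,
    // by checking whether the first column is the ID column.
    static void determineFrameFormatIfNeeded(List<String> columnNames,
                                             FrameMetadata metadata) {
        if (metadata == null || metadata.getFormat() != null) {
            return; // nothing to do: no metadata object, or format already set
        }
        boolean hasIdColumn = !columnNames.isEmpty()
                && ID_COLUMN.equals(columnNames.get(0));
        metadata.setFormat(hasIdColumn
                ? FrameFormat.DF_WITH_INDEX
                : FrameFormat.DF_WITHOUT_INDEX);
    }

    // Mirrors the later check: true only if the format says an ID column exists.
    static boolean isDataFrameWithIDColumn(FrameMetadata metadata) {
        return metadata != null
                && metadata.getFormat() == FrameFormat.DF_WITH_INDEX;
    }

    public static void main(String[] args) {
        FrameMetadata metadata = new FrameMetadata(); // empty format, as in the bug
        List<String> columns = Arrays.asList(ID_COLUMN, "C1", "C2");

        // Without the new step, the empty format makes the check fail.
        System.out.println("before: " + isDataFrameWithIDColumn(metadata));

        // With the new step called first, the ID column is detected.
        determineFrameFormatIfNeeded(columns, metadata);
        System.out.println("after:  " + isDataFrameWithIDColumn(metadata));
    }
}
```

The important part is the ordering: the format is inferred before {{isDataFrameWithIDColumn}} is consulted, and a format that the user supplied explicitly is never overwritten.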
> MLContext dataframe-frame conversion with index column & vector column
> ----------------------------------------------------------------------
>
> Key: SYSTEMML-995
> URL: https://issues.apache.org/jira/browse/SYSTEMML-995
> Project: SystemML
> Issue Type: Bug
> Components: APIs
> Affects Versions: SystemML 0.11
> Reporter: Matthias Boehm
> Priority: Blocker
>
> MLContext currently always assumes that a DataFrame being converted to a
> frame has no existing index column. Since the user cannot communicate the
> existence of this column, the data conversion leads to incorrect results
> because an additional column is included in the output frame. We need to
> make the MLContext handling of frames consistent with the handling of
> matrices.
> Additionally, the conversion code in
> {{MLContextConversionUtil.dataFrameToFrameObject()}} does not yet take into
> account frames with vectors, although the recent addition adds this support
> in the underlying {{FrameRDDConverterUtils.java}} class. Therefore, the
> number of columns set when {{mc == null}} is incorrect.
> Thanks [[email protected]] for catching this issue. cc [~acs_s] [~deron]