Mike Dusenberry created SYSTEMML-952:
----------------------------------------
             Summary: Efficient Counts During Conversions
                 Key: SYSTEMML-952
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-952
             Project: SystemML
          Issue Type: Improvement
            Reporter: Mike Dusenberry

Currently, we spend a lot of time on {{count}} during the conversions from wide DataFrames. When calling {{count}} in Spark on these DataFrames directly, it is much quicker to select just one of the simple double columns (say, the id column) and then call {{count}}, because Spark then does not read in the heavy vector column as well. Therefore, we should perform the row count on the index column only, and compute the column count from the first row.

cc [~mboehm7]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
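
The proposal above could be sketched roughly as follows. This is only an illustration, not the SystemML implementation: the helper names {{count_rows}} and {{count_cols}} are hypothetical, {{df}} is assumed to be a Spark DataFrame (PySpark API) with a lightweight index column and a single vector column, and the vector type is assumed to support {{len()}}.

```python
def count_rows(df, index_col):
    """Count rows after projecting down to the lightweight index column.

    Assumption: `df` is a Spark DataFrame and `index_col` names a simple
    column (for example the id column), so the heavy vector column is
    not part of the projection that gets counted.
    """
    return df.select(index_col).count()


def count_cols(df, vector_col):
    """Infer the column count from the vector stored in the first row.

    Assumption: every row holds a vector of the same length, so looking
    at the first row is enough. Returns 0 for an empty DataFrame.
    """
    first_row = df.select(vector_col).first()
    if first_row is None:
        return 0
    # first_row[0] is the vector object; len() gives its dimension.
    return len(first_row[0])
```

For a DataFrame {{df}} with columns named, say, {{id}} and {{features}} (placeholder names), the calls would be {{count_rows(df, "id")}} and {{count_cols(df, "features")}}. Whether the index column itself should be included in the column count depends on the conversion and is left out of this sketch.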