[SYSTEMML-947] Remove binary block classes from MLContext

Remove BinaryBlockMatrix and BinaryBlockFrame classes from
MLContext API and incorporate similar functionality into
Matrix and Frame classes.

Closes #531.


Project: http://git-wip-us.apache.org/repos/asf/incubator-systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-systemml/commit/c44f6c02
Tree: http://git-wip-us.apache.org/repos/asf/incubator-systemml/tree/c44f6c02
Diff: http://git-wip-us.apache.org/repos/asf/incubator-systemml/diff/c44f6c02

Branch: refs/heads/gh-pages
Commit: c44f6c0224fbbabb3804f91b00c039546f1dabaf
Parents: 49bd822
Author: Deron Eriksson <de...@us.ibm.com>
Authored: Wed Jun 7 10:28:44 2017 -0700
Committer: Deron Eriksson <de...@us.ibm.com>
Committed: Wed Jun 7 10:28:44 2017 -0700

----------------------------------------------------------------------
 spark-mlcontext-programming-guide.md | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/c44f6c02/spark-mlcontext-programming-guide.md
----------------------------------------------------------------------
diff --git a/spark-mlcontext-programming-guide.md b/spark-mlcontext-programming-guide.md
index c424c70..ddccde1 100644
--- a/spark-mlcontext-programming-guide.md
+++ b/spark-mlcontext-programming-guide.md
@@ -243,7 +243,7 @@ mean: Double = 0.49996223966662934
 
 Many different types of input and output variables are automatically allowed. These types include
 `Boolean`, `Long`, `Double`, `String`, `Array[Array[Double]]`, `RDD<String>` and `JavaRDD<String>`
-in `CSV` (dense) and `IJV` (sparse) formats, `DataFrame`, `BinaryBlockMatrix`, `Matrix`, and
+in `CSV` (dense) and `IJV` (sparse) formats, `DataFrame`, `Matrix`, and
 `Frame`. RDDs and JavaRDDs are assumed to be CSV format unless MatrixMetadata is supplied indicating
 IJV format.
 
@@ -1606,11 +1606,7 @@ Therefore, if you use a set of data multiple times, one way to potentially impro
 to convert it to a SystemML matrix representation and then use this representation rather than performing
 the data conversion each time.
 
-There are currently two mechanisms for this in SystemML: **(1) BinaryBlockMatrix** and **(2) Matrix**.
-
-**BinaryBlockMatrix:**
-
-If you have an input DataFrame, it can be converted to a BinaryBlockMatrix, and this BinaryBlockMatrix
+If you have an input DataFrame, it can be converted to a Matrix, and this Matrix
 can be passed as an input rather than passing in the DataFrame as an input.
 
 For example, suppose we had a 10000x100 matrix represented as a DataFrame, as we saw in an earlier example.
@@ -1633,10 +1629,10 @@ val minMaxMeanScript = dml(minMaxMean).in("Xin", df, mm).out("minOut", "maxOut",
 {% endhighlight %}
 
 Rather than passing in a DataFrame each time to the Script object creation, let's instead create a
-BinaryBlockMatrix object based on the DataFrame and pass this BinaryBlockMatrix to the Script object
+Matrix object based on the DataFrame and pass this Matrix to the Script object
 creation. If we run the code below in the Spark Shell, we see that the data conversion step occurs
-when the BinaryBlockMatrix object is created. However, when we create a Script object twice, we see
-that no conversion penalty occurs, since this conversion occurred when the BinaryBlockMatrix was
+when the Matrix object is created. However, when we create a Script object twice, we see
+that no conversion penalty occurs, since this conversion occurred when the Matrix was
 created.
 
 {% highlight scala %}
@@ -1649,14 +1645,11 @@ val data = sc.parallelize(0 to numRows-1).map { _ => Row.fromSeq(Seq.fill(numCol
 val schema = StructType((0 to numCols-1).map { i => StructField("C" + i, DoubleType, true) } )
 val df = spark.createDataFrame(data, schema)
 val mm = new MatrixMetadata(numRows, numCols)
-val bbm = new BinaryBlockMatrix(df, mm)
-val minMaxMeanScript = dml(minMaxMean).in("Xin", bbm).out("minOut", "maxOut", "meanOut")
-val minMaxMeanScript = dml(minMaxMean).in("Xin", bbm).out("minOut", "maxOut", "meanOut")
+val matrix = new Matrix(df, mm)
+val minMaxMeanScript = dml(minMaxMean).in("Xin", matrix).out("minOut", "maxOut", "meanOut")
+val minMaxMeanScript = dml(minMaxMean).in("Xin", matrix).out("minOut", "maxOut", "meanOut")
 {% endhighlight %}
 
-
-**Matrix:**
-
 When a matrix is returned as an output, it is returned as a Matrix object, which is a wrapper around
 a SystemML MatrixObject. As a result, an output Matrix is already in a SystemML representation,
 meaning that it can be passed as an input with no data conversion penalty.
