Repository: incubator-systemml
Updated Branches:
  refs/heads/gh-pages 0b46ddb21 -> 15f95e2cf


[SYSTEMML-474] MLContext matrix from URL

Initial support for reading IJV and CSV matrices into MLContext via URLs.

Closes #210.


Project: http://git-wip-us.apache.org/repos/asf/incubator-systemml/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-systemml/commit/15f95e2c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-systemml/tree/15f95e2c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-systemml/diff/15f95e2c

Branch: refs/heads/gh-pages
Commit: 15f95e2cfbbd6ccaebc6f70a537f57b32906f311
Parents: 0b46ddb
Author: Deron Eriksson <[email protected]>
Authored: Tue Aug 16 21:38:27 2016 -0700
Committer: Deron Eriksson <[email protected]>
Committed: Tue Aug 16 21:38:27 2016 -0700

----------------------------------------------------------------------
 _config.yml                          |   2 +-
 spark-mlcontext-programming-guide.md | 101 ++++++++++++++++++++++++++++++
 2 files changed, 102 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/15f95e2c/_config.yml
----------------------------------------------------------------------
diff --git a/_config.yml b/_config.yml
index 1a09658..2ef66a0 100644
--- a/_config.yml
+++ b/_config.yml
@@ -11,7 +11,7 @@ include:
   - _modules
 
 # These allow the documentation to be updated with newer releases
-SYSTEMML_VERSION: 0.11.0
+SYSTEMML_VERSION: 0.10.x
 
 # if 'analytics_on' is true, analytics section will be rendered on the HTML pages
 analytics_on: true

http://git-wip-us.apache.org/repos/asf/incubator-systemml/blob/15f95e2c/spark-mlcontext-programming-guide.md
----------------------------------------------------------------------
diff --git a/spark-mlcontext-programming-guide.md b/spark-mlcontext-programming-guide.md
index 2f77347..71db1a4 100644
--- a/spark-mlcontext-programming-guide.md
+++ b/spark-mlcontext-programming-guide.md
@@ -662,6 +662,107 @@ None
 </div>
 
 
+Alternatively, we could supply a `java.net.URL` to the Script `in` method. Note that if the URL matrix data is in IJV
+format, metadata needs to be supplied for the matrix.
+
+<div class="codetabs">
+
+<div data-lang="Scala" markdown="1">
+{% highlight scala %}
+val habermanUrl = "http://archive.ics.uci.edu/ml/machine-learning-databases/haberman/haberman.data";
+val typesRDD = sc.parallelize(Array("1.0,1.0,1.0,2.0"))
+val scriptUrl = "https://raw.githubusercontent.com/apache/incubator-systemml/master/scripts/algorithms/Univar-Stats.dml";
+val uni = dmlFromUrl(scriptUrl).in("A", new java.net.URL(habermanUrl)).in("K", typesRDD).in("$CONSOLE_OUTPUT", true)
+ml.execute(uni)
+{% endhighlight %}
+</div>
+
+<div data-lang="Spark Shell" markdown="1">
+{% highlight scala %}
+scala> val habermanUrl = "http://archive.ics.uci.edu/ml/machine-learning-databases/haberman/haberman.data";
+habermanUrl: String = http://archive.ics.uci.edu/ml/machine-learning-databases/haberman/haberman.data
+
+scala> val typesRDD = sc.parallelize(Array("1.0,1.0,1.0,2.0"))
+typesRDD: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[50] at parallelize at <console>:33
+
+scala> val scriptUrl = "https://raw.githubusercontent.com/apache/incubator-systemml/master/scripts/algorithms/Univar-Stats.dml";
+scriptUrl: String = https://raw.githubusercontent.com/apache/incubator-systemml/master/scripts/algorithms/Univar-Stats.dml
+
+scala> val uni = dmlFromUrl(scriptUrl).in("A", new java.net.URL(habermanUrl)).in("K", typesRDD).in("$CONSOLE_OUTPUT", true)
+uni: org.apache.sysml.api.mlcontext.Script =
+Inputs:
+  [1] (URL) A: http://archive.ics.uci.edu/ml/machine-learning-databases/haberman/haberman.data
+  [2] (RDD) K: ParallelCollectionRDD[50] at parallelize at <console>:33
+  [3] (Boolean) $CONSOLE_OUTPUT: true
+
+Outputs:
+None
+
+
+scala> ml.execute(uni)
+...
+-------------------------------------------------
+ (01) Minimum             | 30.0
+ (02) Maximum             | 83.0
+ (03) Range               | 53.0
+ (04) Mean                | 52.45751633986928
+ (05) Variance            | 116.71458266366658
+ (06) Std deviation       | 10.803452349303281
+ (07) Std err of mean     | 0.6175922641866753
+ (08) Coeff of variation  | 0.20594669940735139
+ (09) Skewness            | 0.1450718616532357
+ (10) Kurtosis            | -0.6150152487211726
+ (11) Std err of skewness | 0.13934809593495995
+ (12) Std err of kurtosis | 0.277810485320835
+ (13) Median              | 52.0
+ (14) Interquartile mean  | 52.16013071895425
+Feature [1]: Scale
+-------------------------------------------------
+ (01) Minimum             | 58.0
+ (02) Maximum             | 69.0
+ (03) Range               | 11.0
+ (04) Mean                | 62.85294117647059
+ (05) Variance            | 10.558630665380907
+ (06) Std deviation       | 3.2494046632238507
+ (07) Std err of mean     | 0.18575610076612029
+ (08) Coeff of variation  | 0.051698529971741194
+ (09) Skewness            | 0.07798443581479181
+ (10) Kurtosis            | -1.1324380182967442
+ (11) Std err of skewness | 0.13934809593495995
+ (12) Std err of kurtosis | 0.277810485320835
+ (13) Median              | 63.0
+ (14) Interquartile mean  | 62.80392156862745
+Feature [2]: Scale
+-------------------------------------------------
+ (01) Minimum             | 0.0
+ (02) Maximum             | 52.0
+ (03) Range               | 52.0
+ (04) Mean                | 4.026143790849673
+ (05) Variance            | 51.691117539912135
+ (06) Std deviation       | 7.189653506248555
+ (07) Std err of mean     | 0.41100513466216837
+ (08) Coeff of variation  | 1.7857418611299172
+ (09) Skewness            | 2.954633471088322
+ (10) Kurtosis            | 11.425776549251449
+ (11) Std err of skewness | 0.13934809593495995
+ (12) Std err of kurtosis | 0.277810485320835
+ (13) Median              | 1.0
+ (14) Interquartile mean  | 1.2483660130718954
+Feature [3]: Scale
+-------------------------------------------------
+Feature [4]: Categorical (Nominal)
+ (15) Num of categories   | 2
+ (16) Mode                | 1
+ (17) Num of modes        | 1
+res5: org.apache.sysml.api.mlcontext.MLResults =
+None
+
+{% endhighlight %}
+</div>
+
+</div>
+
+
 ### Input Variables vs Input Parameters
 
 If we examine the
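
Note on the IJV remark in the guide text added by this commit: the sketch below is not part of the commit and makes no SystemML or Spark calls. It is a plain-Scala illustration of one likely reason IJV data needs metadata, assuming IJV means whitespace-separated "row column value" triples with 1-based indices. Only non-zero cells are listed, so the matrix dimensions cannot be recovered from the text alone and have to be supplied separately. The names `IjvSketch` and `toDense` are hypothetical.

```scala
// Sketch only: plain Scala, no SystemML or Spark API involved.
// IjvSketch and toDense are hypothetical names used for illustration.
object IjvSketch {

  /** Builds a dense rows x cols matrix from IJV text.
    * Assumes one "row col value" triple per line with 1-based indices.
    */
  def toDense(ijv: String, rows: Int, cols: Int): Array[Array[Double]] = {
    val dense = Array.fill(rows, cols)(0.0)
    for (line <- ijv.split("\n").map(_.trim) if line.nonEmpty) {
      line.split("\\s+") match {
        case Array(i, j, v) =>
          // Convert the 1-based IJV indices to 0-based array indices.
          dense(i.toInt - 1)(j.toInt - 1) = v.toDouble
        case _ =>
          throw new IllegalArgumentException(s"Not an IJV triple: $line")
      }
    }
    dense
  }

  def main(args: Array[String]): Unit = {
    // Two non-zero cells; nothing in the text says the matrix is 3 x 3.
    val ijv = "1 1 5.0\n2 3 7.5"
    val m = toDense(ijv, rows = 3, cols = 3)
    m.foreach(row => println(row.mkString(" ")))
  }
}
```

The same two triples would describe a 2 x 3 matrix equally well, which is why the row and column counts are passed in explicitly here instead of being inferred. A CSV source, by contrast, carries its shape in its line and field counts.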
