spark git commit: [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

2015-02-15 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master acf2558dc -> c78a12c4c


[Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

If it's the last estimator in the Pipeline, there's no need to transform the data, since
there is no next stage to consume it.

Author: Peter Rudenko 

Closes #4590 from petro-rudenko/patch-1 and squashes the following commits:

d13ec33 [Peter Rudenko] [Ml] SPARK-5796 Don't transform data on a last 
estimator in Pipeline


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c78a12c4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c78a12c4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c78a12c4

Branch: refs/heads/master
Commit: c78a12c4cc4d4312c4ee1069d3b218882d32d678
Parents: acf2558
Author: Peter Rudenko 
Authored: Sun Feb 15 20:51:32 2015 -0800
Committer: Xiangrui Meng 
Committed: Sun Feb 15 20:51:32 2015 -0800

--
 mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/c78a12c4/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala 
b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
index bb291e6..5607ed2 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
@@ -114,7 +114,9 @@ class Pipeline extends Estimator[PipelineModel] {
             throw new IllegalArgumentException(
               s"Do not support stage $stage of type ${stage.getClass}")
         }
-        curDataset = transformer.transform(curDataset, paramMap)
+        if (index < indexOfLastEstimator) {
+          curDataset = transformer.transform(curDataset, paramMap)
+        }
         transformers += transformer
       } else {
         transformers += stage.asInstanceOf[Transformer]
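To see what the guard buys, here is a minimal, self-contained Scala sketch of the fit loop
(plain Scala, no Spark dependency; Stage, Estimator, Transformer, and fit are simplified
stand-ins for the spark.ml types, not the real API):

```scala
object LastEstimatorSketch {
  sealed trait Stage
  case class Estimator(name: String) extends Stage
  case class Transformer(name: String) extends Stage

  // Mirrors Pipeline.fit: fit stages in order, threading the dataset through,
  // but skip the transform whose output no later stage would ever consume.
  def fit(stages: Seq[Stage], dataset: String): String = {
    val indexOfLastEstimator = stages.lastIndexWhere(_.isInstanceOf[Estimator])
    var curDataset = dataset
    stages.zipWithIndex.foreach { case (stage, index) =>
      val transformerName = stage match {
        case Estimator(n)   => s"model($n)" // stand-in for estimator.fit(curDataset)
        case Transformer(n) => n
      }
      if (index < indexOfLastEstimator) {
        // Only materialize the transformed dataset if a later stage reads it.
        curDataset = s"$transformerName($curDataset)"
      }
    }
    curDataset
  }

  def main(args: Array[String]): Unit = {
    // The final estimator "lr" is fitted, but its model never transforms the data:
    println(fit(Seq(Transformer("tok"), Transformer("tf"), Estimator("lr")), "raw"))
    // prints: tf(tok(raw))
  }
}
```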


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

2015-02-15 Thread meng
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 db3c539f2 -> 9cf7d7088


[Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

If it's the last estimator in the Pipeline, there's no need to transform the data, since
there is no next stage to consume it.

Author: Peter Rudenko 

Closes #4590 from petro-rudenko/patch-1 and squashes the following commits:

d13ec33 [Peter Rudenko] [Ml] SPARK-5796 Don't transform data on a last 
estimator in Pipeline

(cherry picked from commit c78a12c4cc4d4312c4ee1069d3b218882d32d678)
Signed-off-by: Xiangrui Meng 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9cf7d708
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9cf7d708
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9cf7d708

Branch: refs/heads/branch-1.3
Commit: 9cf7d7088d245b9b41ec78295cd2d6e3e395793d
Parents: db3c539
Author: Peter Rudenko 
Authored: Sun Feb 15 20:51:32 2015 -0800
Committer: Xiangrui Meng 
Committed: Sun Feb 15 20:51:38 2015 -0800

--
 mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/9cf7d708/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala 
b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
index bb291e6..5607ed2 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
@@ -114,7 +114,9 @@ class Pipeline extends Estimator[PipelineModel] {
             throw new IllegalArgumentException(
               s"Do not support stage $stage of type ${stage.getClass}")
         }
-        curDataset = transformer.transform(curDataset, paramMap)
+        if (index < indexOfLastEstimator) {
+          curDataset = transformer.transform(curDataset, paramMap)
+        }
         transformers += transformer
       } else {
         transformers += stage.asInstanceOf[Transformer]


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from JBLAS

2015-02-15 Thread meng
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 d71099133 -> db3c539f2


SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from 
JBLAS

Deprecate SVDPlusPlus.run and introduce SVDPlusPlus.runSVDPlusPlus with a return
type that doesn't include DoubleMatrix

CC mengxr

Author: Sean Owen 

Closes #4614 from srowen/SPARK-5815 and squashes the following commits:

288cb05 [Sean Owen] Clarify deprecation plans in scaladoc
497458e [Sean Owen] Deprecate SVDPlusPlus.run and introduce 
SVDPlusPlus.runSVDPlusPlus with return type that doesn't include DoubleMatrix

(cherry picked from commit acf2558dc92901c342262c35eebb95f2a9b7a9ae)
Signed-off-by: Xiangrui Meng 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/db3c539f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/db3c539f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/db3c539f

Branch: refs/heads/branch-1.3
Commit: db3c539f20e17e327b2f284bf6fbb3f1abd7fe64
Parents: d710991
Author: Sean Owen 
Authored: Sun Feb 15 20:41:27 2015 -0800
Committer: Xiangrui Meng 
Committed: Sun Feb 15 20:41:33 2015 -0800

--
 .../apache/spark/graphx/lib/SVDPlusPlus.scala   | 25 
 .../spark/graphx/lib/SVDPlusPlusSuite.scala |  2 +-
 2 files changed, 26 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/db3c539f/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
--
diff --git 
a/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala 
b/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
index 112ed09..fc84cfb 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
@@ -17,6 +17,8 @@
 
 package org.apache.spark.graphx.lib
 
+import org.apache.spark.annotation.Experimental
+
 import scala.util.Random
 import org.jblas.DoubleMatrix
 import org.apache.spark.rdd._
@@ -38,6 +40,8 @@ object SVDPlusPlus {
     extends Serializable
 
   /**
+   * :: Experimental ::
+   *
    * Implement SVD++ based on "Factorization Meets the Neighborhood:
    * a Multifaceted Collaborative Filtering Model",
    * available at [[http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf]].
@@ -45,12 +49,33 @@ object SVDPlusPlus {
    * The prediction rule is rui = u + bu + bi + qi*(pu + |N(u)|^(-0.5)*sum(y)),
    * see the details on page 6.
    *
+   * This method temporarily replaces `run()`, and replaces `DoubleMatrix` in `run()`'s return
+   * value with `Array[Double]`. In 1.4.0, this method will be deprecated, but will be copied
+   * to replace `run()`, which will then be undeprecated.
+   *
    * @param edges edges for constructing the graph
    *
    * @param conf SVDPlusPlus parameters
    *
    * @return a graph with vertex attributes containing the trained model
    */
+  @Experimental
+  def runSVDPlusPlus(edges: RDD[Edge[Double]], conf: Conf)
+    : (Graph[(Array[Double], Array[Double], Double, Double), Double], Double) =
+  {
+    val (graph, u) = run(edges, conf)
+    // Convert DoubleMatrix to Array[Double]:
+    val newVertices = graph.vertices.mapValues(v => (v._1.toArray, v._2.toArray, v._3, v._4))
+    (Graph(newVertices, graph.edges), u)
+  }
+
+  /**
+   * This method is deprecated in favor of `runSVDPlusPlus()`, which replaces `DoubleMatrix`
+   * with `Array[Double]` in its return value. This method is deprecated. It will effectively
+   * be removed in 1.4.0 when `runSVDPlusPlus()` is copied to replace `run()`, and hence the
+   * return type of this method changes.
+   */
+  @deprecated("Call runSVDPlusPlus", "1.3.0")
   def run(edges: RDD[Edge[Double]], conf: Conf)
     : (Graph[(DoubleMatrix, DoubleMatrix, Double, Double), Double], Double) =
   {
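For readability, the ASCII prediction rule quoted in the scaladoc above,
"rui = u + bu + bi + qi*(pu + |N(u)|^(-0.5)*sum(y))", is the standard SVD++ rule,
where mu is the global rating mean and N(u) is the set of items rated by user u:

$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^{\top}\left(p_u + |N(u)|^{-1/2}\sum_{j \in N(u)} y_j\right)$$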

http://git-wip-us.apache.org/repos/asf/spark/blob/db3c539f/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
--
diff --git 
a/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala 
b/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
index e01df56..9987a4b 100644
--- a/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
+++ b/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
@@ -32,7 +32,7 @@ class SVDPlusPlusSuite extends FunSuite with LocalSparkContext {
         Edge(fields(0).toLong * 2, fields(1).toLong * 2 + 1, fields(2).toDouble)
       }
       val conf = new SVDPlusPlus.Conf(10, 2, 0.0, 5.0, 0.007, 0.007, 0.005, 0.015) // 2 iterations
-      var (graph, u) = SVDPlusPlus.run(edges, conf)
+      var (graph, u) = SVDPlusPlus.runSVDPlusPlus(edges, conf)

spark git commit: SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from JBLAS

2015-02-15 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master cd4a15366 -> acf2558dc


SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from 
JBLAS

Deprecate SVDPlusPlus.run and introduce SVDPlusPlus.runSVDPlusPlus with a return
type that doesn't include DoubleMatrix

CC mengxr

Author: Sean Owen 

Closes #4614 from srowen/SPARK-5815 and squashes the following commits:

288cb05 [Sean Owen] Clarify deprecation plans in scaladoc
497458e [Sean Owen] Deprecate SVDPlusPlus.run and introduce 
SVDPlusPlus.runSVDPlusPlus with return type that doesn't include DoubleMatrix


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/acf2558d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/acf2558d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/acf2558d

Branch: refs/heads/master
Commit: acf2558dc92901c342262c35eebb95f2a9b7a9ae
Parents: cd4a153
Author: Sean Owen 
Authored: Sun Feb 15 20:41:27 2015 -0800
Committer: Xiangrui Meng 
Committed: Sun Feb 15 20:41:27 2015 -0800

--
 .../apache/spark/graphx/lib/SVDPlusPlus.scala   | 25 
 .../spark/graphx/lib/SVDPlusPlusSuite.scala |  2 +-
 2 files changed, 26 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/acf2558d/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
--
diff --git 
a/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala 
b/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
index 112ed09..fc84cfb 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
@@ -17,6 +17,8 @@
 
 package org.apache.spark.graphx.lib
 
+import org.apache.spark.annotation.Experimental
+
 import scala.util.Random
 import org.jblas.DoubleMatrix
 import org.apache.spark.rdd._
@@ -38,6 +40,8 @@ object SVDPlusPlus {
     extends Serializable
 
   /**
+   * :: Experimental ::
+   *
    * Implement SVD++ based on "Factorization Meets the Neighborhood:
    * a Multifaceted Collaborative Filtering Model",
    * available at [[http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf]].
@@ -45,12 +49,33 @@ object SVDPlusPlus {
    * The prediction rule is rui = u + bu + bi + qi*(pu + |N(u)|^(-0.5)*sum(y)),
    * see the details on page 6.
    *
+   * This method temporarily replaces `run()`, and replaces `DoubleMatrix` in `run()`'s return
+   * value with `Array[Double]`. In 1.4.0, this method will be deprecated, but will be copied
+   * to replace `run()`, which will then be undeprecated.
+   *
    * @param edges edges for constructing the graph
    *
    * @param conf SVDPlusPlus parameters
    *
    * @return a graph with vertex attributes containing the trained model
    */
+  @Experimental
+  def runSVDPlusPlus(edges: RDD[Edge[Double]], conf: Conf)
+    : (Graph[(Array[Double], Array[Double], Double, Double), Double], Double) =
+  {
+    val (graph, u) = run(edges, conf)
+    // Convert DoubleMatrix to Array[Double]:
+    val newVertices = graph.vertices.mapValues(v => (v._1.toArray, v._2.toArray, v._3, v._4))
+    (Graph(newVertices, graph.edges), u)
+  }
+
+  /**
+   * This method is deprecated in favor of `runSVDPlusPlus()`, which replaces `DoubleMatrix`
+   * with `Array[Double]` in its return value. This method is deprecated. It will effectively
+   * be removed in 1.4.0 when `runSVDPlusPlus()` is copied to replace `run()`, and hence the
+   * return type of this method changes.
+   */
+  @deprecated("Call runSVDPlusPlus", "1.3.0")
   def run(edges: RDD[Edge[Double]], conf: Conf)
     : (Graph[(DoubleMatrix, DoubleMatrix, Double, Double), Double], Double) =
   {

http://git-wip-us.apache.org/repos/asf/spark/blob/acf2558d/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
--
diff --git 
a/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala 
b/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
index e01df56..9987a4b 100644
--- a/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
+++ b/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
@@ -32,7 +32,7 @@ class SVDPlusPlusSuite extends FunSuite with LocalSparkContext {
         Edge(fields(0).toLong * 2, fields(1).toLong * 2 + 1, fields(2).toDouble)
       }
       val conf = new SVDPlusPlus.Conf(10, 2, 0.0, 5.0, 0.007, 0.007, 0.005, 0.015) // 2 iterations
-      var (graph, u) = SVDPlusPlus.run(edges, conf)
+      var (graph, u) = SVDPlusPlus.runSVDPlusPlus(edges, conf)
       graph.cache()
       val err = graph.vertices.collect().map{ case (vid, vd) =>
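As a migration aid, a hedged usage sketch of the new API (the toy edge data and
local-mode setup are illustrative assumptions; the signatures follow the diffs above):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.Edge
import org.apache.spark.graphx.lib.SVDPlusPlus

object SVDPlusPlusMigration {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("svdpp-migration").setMaster("local[2]"))
    // Toy ratings: even vertex ids are users, odd ids are items,
    // mirroring the convention used in SVDPlusPlusSuite.
    val edges = sc.parallelize(Seq(
      Edge(0L, 1L, 5.0),
      Edge(2L, 1L, 3.0),
      Edge(2L, 3L, 4.0)))
    val conf = new SVDPlusPlus.Conf(10, 2, 0.0, 5.0, 0.007, 0.007, 0.005, 0.015)
    // Before (deprecated in 1.3.0): vertex attributes carry jblas DoubleMatrix.
    //   val (graph, u) = SVDPlusPlus.run(edges, conf)
    // After: the same model, but vertex attributes carry plain Array[Double].
    val (graph, u) = SVDPlusPlus.runSVDPlusPlus(edges, conf)
    val (p, q, bias, norm) = graph.vertices.first()._2
    println(s"global mean = $u, factor length = ${p.length}, bias = $bias")
    sc.stop()
  }
}
```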
 

spark git commit: [SPARK-5769] Set params in constructors and in setParams in Python ML pipelines

2015-02-15 Thread meng
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 4e099d757 -> d71099133


[SPARK-5769] Set params in constructors and in setParams in Python ML pipelines

This PR allows Python users to set params in constructors and in setParams, using the
`keyword_only` decorator to force keyword arguments. The trade-off is discussed in the
design doc of SPARK-4586.

Generated doc:
![screen shot 2015-02-12 at 3 06 58 
am](https://cloud.githubusercontent.com/assets/829644/6166491/9cfcd06a-b265-11e4-99ea-473d866634fc.png)

CC: davies rxin

Author: Xiangrui Meng 

Closes #4564 from mengxr/py-pipeline-kw and squashes the following commits:

fedf720 [Xiangrui Meng] use toDF
d565f2c [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into 
py-pipeline-kw
cbc15d3 [Xiangrui Meng] fix style
5032097 [Xiangrui Meng] update pipeline signature
950774e [Xiangrui Meng] simplify keyword_only and update constructor/setParams 
signatures
fdde5fc [Xiangrui Meng] fix style
c9384b8 [Xiangrui Meng] fix sphinx doc
8e59180 [Xiangrui Meng] add setParams and make constructors take params, where 
we force keyword args

(cherry picked from commit cd4a15366244657c4b7936abe5054754534366f2)
Signed-off-by: Xiangrui Meng 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d7109913
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d7109913
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d7109913

Branch: refs/heads/branch-1.3
Commit: d71099133b64a4b9e9ab430cf1b314ee7deaf08d
Parents: 4e099d7
Author: Xiangrui Meng 
Authored: Sun Feb 15 20:29:26 2015 -0800
Committer: Xiangrui Meng 
Committed: Sun Feb 15 20:29:36 2015 -0800

--
 .../ml/simple_text_classification_pipeline.py   | 44 +---
 python/docs/conf.py |  4 ++
 python/pyspark/ml/classification.py | 44 +---
 python/pyspark/ml/feature.py| 72 
 python/pyspark/ml/param/__init__.py |  8 +++
 python/pyspark/ml/pipeline.py   | 19 +-
 python/pyspark/ml/util.py   | 15 
 7 files changed, 153 insertions(+), 53 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d7109913/examples/src/main/python/ml/simple_text_classification_pipeline.py
--
diff --git a/examples/src/main/python/ml/simple_text_classification_pipeline.py 
b/examples/src/main/python/ml/simple_text_classification_pipeline.py
index c7df3d7..b4d9355 100644
--- a/examples/src/main/python/ml/simple_text_classification_pipeline.py
+++ b/examples/src/main/python/ml/simple_text_classification_pipeline.py
@@ -36,43 +36,33 @@ if __name__ == "__main__":
     sqlCtx = SQLContext(sc)
 
     # Prepare training documents, which are labeled.
-    LabeledDocument = Row('id', 'text', 'label')
-    training = sqlCtx.inferSchema(
-        sc.parallelize([(0L, "a b c d e spark", 1.0),
-                        (1L, "b d", 0.0),
-                        (2L, "spark f g h", 1.0),
-                        (3L, "hadoop mapreduce", 0.0)])
-          .map(lambda x: LabeledDocument(*x)))
+    LabeledDocument = Row("id", "text", "label")
+    training = sc.parallelize([(0L, "a b c d e spark", 1.0),
+                               (1L, "b d", 0.0),
+                               (2L, "spark f g h", 1.0),
+                               (3L, "hadoop mapreduce", 0.0)]) \
+        .map(lambda x: LabeledDocument(*x)).toDF()
 
     # Configure an ML pipeline, which consists of three stages: tokenizer, hashingTF, and lr.
-    tokenizer = Tokenizer() \
-        .setInputCol("text") \
-        .setOutputCol("words")
-    hashingTF = HashingTF() \
-        .setInputCol(tokenizer.getOutputCol()) \
-        .setOutputCol("features")
-    lr = LogisticRegression() \
-        .setMaxIter(10) \
-        .setRegParam(0.01)
-    pipeline = Pipeline() \
-        .setStages([tokenizer, hashingTF, lr])
+    tokenizer = Tokenizer(inputCol="text", outputCol="words")
+    hashingTF = HashingTF(inputCol=tokenizer.getOutputCol(), outputCol="features")
+    lr = LogisticRegression(maxIter=10, regParam=0.01)
+    pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])
 
     # Fit the pipeline to training documents.
     model = pipeline.fit(training)
 
     # Prepare test documents, which are unlabeled.
-    Document = Row('id', 'text')
-    test = sqlCtx.inferSchema(
-        sc.parallelize([(4L, "spark i j k"),
-                        (5L, "l m n"),
-                        (6L, "mapreduce spark"),
-                        (7L, "apache hadoop")])
-          .map(lambda x: Document(*x)))
+    Document = Row("id", "text")
+    test = sc.parallelize([(4L, "spark i j k"),
+                           (5L, "l m n"),
+                           (6L, "mapreduce spark"),
+                           (7L, "apache hadoop")]) \
+        .map(lambda x: Document(*x)).toDF()

spark git commit: [SPARK-5769] Set params in constructors and in setParams in Python ML pipelines

2015-02-15 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master 836577b38 -> cd4a15366


[SPARK-5769] Set params in constructors and in setParams in Python ML pipelines

This PR allows Python users to set params in constructors and in setParams, using the
`keyword_only` decorator to force keyword arguments. The trade-off is discussed in the
design doc of SPARK-4586.

Generated doc:
![screen shot 2015-02-12 at 3 06 58 
am](https://cloud.githubusercontent.com/assets/829644/6166491/9cfcd06a-b265-11e4-99ea-473d866634fc.png)

CC: davies rxin

Author: Xiangrui Meng 

Closes #4564 from mengxr/py-pipeline-kw and squashes the following commits:

fedf720 [Xiangrui Meng] use toDF
d565f2c [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into 
py-pipeline-kw
cbc15d3 [Xiangrui Meng] fix style
5032097 [Xiangrui Meng] update pipeline signature
950774e [Xiangrui Meng] simplify keyword_only and update constructor/setParams 
signatures
fdde5fc [Xiangrui Meng] fix style
c9384b8 [Xiangrui Meng] fix sphinx doc
8e59180 [Xiangrui Meng] add setParams and make constructors take params, where 
we force keyword args


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cd4a1536
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cd4a1536
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cd4a1536

Branch: refs/heads/master
Commit: cd4a15366244657c4b7936abe5054754534366f2
Parents: 836577b
Author: Xiangrui Meng 
Authored: Sun Feb 15 20:29:26 2015 -0800
Committer: Xiangrui Meng 
Committed: Sun Feb 15 20:29:26 2015 -0800

--
 .../ml/simple_text_classification_pipeline.py   | 44 +---
 python/docs/conf.py |  4 ++
 python/pyspark/ml/classification.py | 44 +---
 python/pyspark/ml/feature.py| 72 
 python/pyspark/ml/param/__init__.py |  8 +++
 python/pyspark/ml/pipeline.py   | 19 +-
 python/pyspark/ml/util.py   | 15 
 7 files changed, 153 insertions(+), 53 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/cd4a1536/examples/src/main/python/ml/simple_text_classification_pipeline.py
--
diff --git a/examples/src/main/python/ml/simple_text_classification_pipeline.py 
b/examples/src/main/python/ml/simple_text_classification_pipeline.py
index c7df3d7..b4d9355 100644
--- a/examples/src/main/python/ml/simple_text_classification_pipeline.py
+++ b/examples/src/main/python/ml/simple_text_classification_pipeline.py
@@ -36,43 +36,33 @@ if __name__ == "__main__":
     sqlCtx = SQLContext(sc)
 
     # Prepare training documents, which are labeled.
-    LabeledDocument = Row('id', 'text', 'label')
-    training = sqlCtx.inferSchema(
-        sc.parallelize([(0L, "a b c d e spark", 1.0),
-                        (1L, "b d", 0.0),
-                        (2L, "spark f g h", 1.0),
-                        (3L, "hadoop mapreduce", 0.0)])
-          .map(lambda x: LabeledDocument(*x)))
+    LabeledDocument = Row("id", "text", "label")
+    training = sc.parallelize([(0L, "a b c d e spark", 1.0),
+                               (1L, "b d", 0.0),
+                               (2L, "spark f g h", 1.0),
+                               (3L, "hadoop mapreduce", 0.0)]) \
+        .map(lambda x: LabeledDocument(*x)).toDF()
 
     # Configure an ML pipeline, which consists of three stages: tokenizer, hashingTF, and lr.
-    tokenizer = Tokenizer() \
-        .setInputCol("text") \
-        .setOutputCol("words")
-    hashingTF = HashingTF() \
-        .setInputCol(tokenizer.getOutputCol()) \
-        .setOutputCol("features")
-    lr = LogisticRegression() \
-        .setMaxIter(10) \
-        .setRegParam(0.01)
-    pipeline = Pipeline() \
-        .setStages([tokenizer, hashingTF, lr])
+    tokenizer = Tokenizer(inputCol="text", outputCol="words")
+    hashingTF = HashingTF(inputCol=tokenizer.getOutputCol(), outputCol="features")
+    lr = LogisticRegression(maxIter=10, regParam=0.01)
+    pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])
 
     # Fit the pipeline to training documents.
     model = pipeline.fit(training)
 
     # Prepare test documents, which are unlabeled.
-    Document = Row('id', 'text')
-    test = sqlCtx.inferSchema(
-        sc.parallelize([(4L, "spark i j k"),
-                        (5L, "l m n"),
-                        (6L, "mapreduce spark"),
-                        (7L, "apache hadoop")])
-          .map(lambda x: Document(*x)))
+    Document = Row("id", "text")
+    test = sc.parallelize([(4L, "spark i j k"),
+                           (5L, "l m n"),
+                           (6L, "mapreduce spark"),
+                           (7L, "apache hadoop")]) \
+        .map(lambda x: Document(*x)).toDF()

spark git commit: SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libgfortran, libgcc code via JBLAS

2015-02-15 Thread meng
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 d96e188c7 -> 4e099d757


SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libgfortran, 
libgcc code via JBLAS

Exclude libgfortran, libgcc bundled by JBLAS for Windows. This much is simple, 
and solves the essential license issue. But the more important question is 
whether MLlib works on Windows then.

Author: Sean Owen 

Closes #4453 from srowen/SPARK-5669 and squashes the following commits:

734dd86 [Sean Owen] Exclude libgfortran, libgcc bundled by JBLAS, affecting 
Windows / OS X / Linux 32-bit (not Linux 64-bit)

(cherry picked from commit 836577b382695558f5c97d94ee725d0156ebfad2)
Signed-off-by: Xiangrui Meng 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4e099d75
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4e099d75
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4e099d75

Branch: refs/heads/branch-1.3
Commit: 4e099d757fc1bc4266f7849db6da0e996bf917be
Parents: d96e188
Author: Sean Owen 
Authored: Sun Feb 15 09:15:48 2015 -0800
Committer: Xiangrui Meng 
Committed: Sun Feb 15 09:16:03 2015 -0800

--
 assembly/pom.xml | 10 ++
 1 file changed, 10 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/4e099d75/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 87b3e6f..7752b41 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -118,6 +118,16 @@
               <exclude>META-INF/*.RSA</exclude>
             </excludes>
           </filter>
+          <filter>
+            <!-- Exclude libgfortran, libgcc bundled by JBLAS for license reasons -->
+            <artifact>org.jblas:jblas</artifact>
+            <excludes>
+              <!-- Linux 64-bit is unaffected -->
+              <exclude>lib/Linux/i386/**</exclude>
+              <exclude>lib/Mac OS X/**</exclude>
+              <exclude>lib/Windows/**</exclude>
+            </excludes>
+          </filter>
         </filters>


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libgfortran, libgcc code via JBLAS

2015-02-15 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master 61eb12674 -> 836577b38


SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libgfortran, 
libgcc code via JBLAS

Exclude libgfortran, libgcc bundled by JBLAS for Windows. This much is simple, 
and solves the essential license issue. But the more important question is 
whether MLlib works on Windows then.

Author: Sean Owen 

Closes #4453 from srowen/SPARK-5669 and squashes the following commits:

734dd86 [Sean Owen] Exclude libgfortran, libgcc bundled by JBLAS, affecting 
Windows / OS X / Linux 32-bit (not Linux 64-bit)


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/836577b3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/836577b3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/836577b3

Branch: refs/heads/master
Commit: 836577b382695558f5c97d94ee725d0156ebfad2
Parents: 61eb126
Author: Sean Owen 
Authored: Sun Feb 15 09:15:48 2015 -0800
Committer: Xiangrui Meng 
Committed: Sun Feb 15 09:15:48 2015 -0800

--
 assembly/pom.xml | 10 ++
 1 file changed, 10 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/836577b3/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index fa9f56e..fbb6e94 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -114,6 +114,16 @@
               <exclude>META-INF/*.RSA</exclude>
             </excludes>
           </filter>
+          <filter>
+            <!-- Exclude libgfortran, libgcc bundled by JBLAS for license reasons -->
+            <artifact>org.jblas:jblas</artifact>
+            <excludes>
+              <!-- Linux 64-bit is unaffected -->
+              <exclude>lib/Linux/i386/**</exclude>
+              <exclude>lib/Mac OS X/**</exclude>
+              <exclude>lib/Windows/**</exclude>
+            </excludes>
+          </filter>
         </filters>


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [MLLIB][SPARK-5502] User guide for isotonic regression

2015-02-15 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master c771e475c -> 61eb12674


[MLLIB][SPARK-5502] User guide for isotonic regression

User guide for isotonic regression added to docs/mllib-isotonic-regression.md, including
code examples for Scala and Java.

Author: martinzapletal 

Closes #4536 from zapletal-martin/SPARK-5502 and squashes the following commits:

67fe773 [martinzapletal] SPARK-5502 reworded model prediction rules to use more 
general language rather than the code/implementation specific terms
80bd4c3 [martinzapletal] SPARK-5502 created docs page for isotonic regression, 
added links to the page, updated data and examples
7d8136e [martinzapletal] SPARK-5502 Added documentation for Isotonic regression 
including examples for Scala and Java
504b5c3 [martinzapletal] SPARK-5502 Added documentation for Isotonic regression 
including examples for Scala and Java


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/61eb1267
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/61eb1267
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/61eb1267

Branch: refs/heads/master
Commit: 61eb12674b90143388a01c22bf51cb7d02ab0447
Parents: c771e47
Author: martinzapletal 
Authored: Sun Feb 15 09:10:03 2015 -0800
Committer: Xiangrui Meng 
Committed: Sun Feb 15 09:10:03 2015 -0800

--
 data/mllib/sample_isotonic_regression_data.txt | 100 +
 docs/mllib-classification-regression.md|   3 +-
 docs/mllib-guide.md|   1 +
 docs/mllib-isotonic-regression.md  | 155 
 4 files changed, 258 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/61eb1267/data/mllib/sample_isotonic_regression_data.txt
--
diff --git a/data/mllib/sample_isotonic_regression_data.txt 
b/data/mllib/sample_isotonic_regression_data.txt
new file mode 100644
index 000..d257b50
--- /dev/null
+++ b/data/mllib/sample_isotonic_regression_data.txt
@@ -0,0 +1,100 @@
+0.24579296,0.01
+0.28505864,0.02
+0.31208567,0.03
+0.35900051,0.04
+0.35747068,0.05
+0.16675166,0.06
+0.17491076,0.07
+0.04181540,0.08
+0.04793473,0.09
+0.03926568,0.10
+0.12952575,0.11
+0.00000000,0.12
+0.01376849,0.13
+0.13105558,0.14
+0.08873024,0.15
+0.12595614,0.16
+0.15247323,0.17
+0.25956145,0.18
+0.20040796,0.19
+0.19581846,0.20
+0.15757267,0.21
+0.13717491,0.22
+0.19020908,0.23
+0.19581846,0.24
+0.20091790,0.25
+0.16879143,0.26
+0.18510964,0.27
+0.20040796,0.28
+0.29576747,0.29
+0.43396226,0.30
+0.53391127,0.31
+0.52116267,0.32
+0.48546660,0.33
+0.49209587,0.34
+0.54156043,0.35
+0.59765426,0.36
+0.56144824,0.37
+0.58592555,0.38
+0.52983172,0.39
+0.50178480,0.40
+0.52626211,0.41
+0.58286588,0.42
+0.64660887,0.43
+0.68077511,0.44
+0.74298827,0.45
+0.64864865,0.46
+0.67261601,0.47
+0.65782764,0.48
+0.69811321,0.49
+0.63029067,0.50
+0.61601224,0.51
+0.63233044,0.52
+0.65323814,0.53
+0.65323814,0.54
+0.67363590,0.55
+0.67006629,0.56
+0.51555329,0.57
+0.50892402,0.58
+0.33299337,0.59
+0.36206017,0.60
+0.43090260,0.61
+0.45996940,0.62
+0.56348802,0.63
+0.54920959,0.64
+0.48393677,0.65
+0.48495665,0.66
+0.46965834,0.67
+0.45181030,0.68
+0.45843957,0.69
+0.47118817,0.70
+0.51555329,0.71
+0.58031617,0.72
+0.55481897,0.73
+0.56297807,0.74
+0.56603774,0.75
+0.57929628,0.76
+0.64762876,0.77
+0.66241713,0.78
+0.69301377,0.79
+0.65119837,0.80
+0.68332483,0.81
+0.66598674,0.82
+0.73890872,0.83
+0.73992861,0.84
+0.84242733,0.85
+0.91330954,0.86
+0.88016318,0.87
+0.90719021,0.88
+0.93115757,0.89
+0.93115757,0.90
+0.91942886,0.91
+0.92911780,0.92
+0.95665477,0.93
+0.95002550,0.94
+0.96940337,0.95
+1.00000000,0.96
+0.89801122,0.97
+0.90311066,0.98
+0.90362060,0.99
+0.83477817,1.0
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/spark/blob/61eb1267/docs/mllib-classification-regression.md
--
diff --git a/docs/mllib-classification-regression.md 
b/docs/mllib-classification-regression.md
index 719cc95..5b9b4dd 100644
--- a/docs/mllib-classification-regression.md
+++ b/docs/mllib-classification-regression.md
@@ -23,7 +23,7 @@ the supported algorithms for each type of problem.
       <td>Multiclass Classification</td><td>decision trees, naive Bayes</td>
     </tr>
     <tr>
-      <td>Regression</td><td>linear least squares, Lasso, ridge regression, decision trees</td>
+      <td>Regression</td><td>linear least squares, Lasso, ridge regression, decision trees, isotonic regression</td>
     </tr>
   </tbody>
 
@@ -35,3 +35,4 @@ More details for these methods can be found here:
 * [linear regression (least squares, Lasso, ridge)](mllib-linear-methods.html#linear-least-squares-lasso-and-ridge-regression)
 * [Decision trees](mllib-decision-tree.html)
 * [Naive Bayes](mllib-naive-bayes.html)
+* [Isotonic regression](mllib-isotonic-regression.html)
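The Scala walkthrough in the new guide follows this shape (a condensed, hedged sketch
against the 1.3 spark.mllib API; the local-mode setup and the 0.5 query point are
illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.regression.IsotonicRegression

object IsotonicRegressionExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("isotonic").setMaster("local[2]"))
    // Each line of the sample file is "label,feature"; weight defaults to 1.0.
    val data = sc.textFile("data/mllib/sample_isotonic_regression_data.txt")
    val parsedData = data.map { line =>
      val parts = line.split(',').map(_.toDouble)
      (parts(0), parts(1), 1.0)
    }
    // isotonic = true fits a monotonically non-decreasing sequence.
    val model = new IsotonicRegression().setIsotonic(true).run(parsedData)
    // Predict a label for a single feature value.
    println(s"prediction at 0.5: ${model.predict(0.5)}")
    sc.stop()
  }
}
```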

spark git commit: [MLLIB][SPARK-5502] User guide for isotonic regression

2015-02-15 Thread meng
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 70ebad4d9 -> d96e188c7


[MLLIB][SPARK-5502] User guide for isotonic regression

User guide for isotonic regression added to docs/mllib-isotonic-regression.md, including
code examples for Scala and Java.

Author: martinzapletal 

Closes #4536 from zapletal-martin/SPARK-5502 and squashes the following commits:

67fe773 [martinzapletal] SPARK-5502 reworded model prediction rules to use more 
general language rather than the code/implementation specific terms
80bd4c3 [martinzapletal] SPARK-5502 created docs page for isotonic regression, 
added links to the page, updated data and examples
7d8136e [martinzapletal] SPARK-5502 Added documentation for Isotonic regression 
including examples for Scala and Java
504b5c3 [martinzapletal] SPARK-5502 Added documentation for Isotonic regression 
including examples for Scala and Java

(cherry picked from commit 61eb12674b90143388a01c22bf51cb7d02ab0447)
Signed-off-by: Xiangrui Meng 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d96e188c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d96e188c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d96e188c

Branch: refs/heads/branch-1.3
Commit: d96e188c7a2b52cff32814f8e0596f030c14ad21
Parents: 70ebad4
Author: martinzapletal 
Authored: Sun Feb 15 09:10:03 2015 -0800
Committer: Xiangrui Meng 
Committed: Sun Feb 15 09:10:12 2015 -0800

--
 data/mllib/sample_isotonic_regression_data.txt | 100 +
 docs/mllib-classification-regression.md|   3 +-
 docs/mllib-guide.md|   1 +
 docs/mllib-isotonic-regression.md  | 155 
 4 files changed, 258 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d96e188c/data/mllib/sample_isotonic_regression_data.txt
--
diff --git a/data/mllib/sample_isotonic_regression_data.txt 
b/data/mllib/sample_isotonic_regression_data.txt
new file mode 100644
index 000..d257b50
--- /dev/null
+++ b/data/mllib/sample_isotonic_regression_data.txt
@@ -0,0 +1,100 @@
+0.24579296,0.01
+0.28505864,0.02
+0.31208567,0.03
+0.35900051,0.04
+0.35747068,0.05
+0.16675166,0.06
+0.17491076,0.07
+0.04181540,0.08
+0.04793473,0.09
+0.03926568,0.10
+0.12952575,0.11
+0.00000000,0.12
+0.01376849,0.13
+0.13105558,0.14
+0.08873024,0.15
+0.12595614,0.16
+0.15247323,0.17
+0.25956145,0.18
+0.20040796,0.19
+0.19581846,0.20
+0.15757267,0.21
+0.13717491,0.22
+0.19020908,0.23
+0.19581846,0.24
+0.20091790,0.25
+0.16879143,0.26
+0.18510964,0.27
+0.20040796,0.28
+0.29576747,0.29
+0.43396226,0.30
+0.53391127,0.31
+0.52116267,0.32
+0.48546660,0.33
+0.49209587,0.34
+0.54156043,0.35
+0.59765426,0.36
+0.56144824,0.37
+0.58592555,0.38
+0.52983172,0.39
+0.50178480,0.40
+0.52626211,0.41
+0.58286588,0.42
+0.64660887,0.43
+0.68077511,0.44
+0.74298827,0.45
+0.64864865,0.46
+0.67261601,0.47
+0.65782764,0.48
+0.69811321,0.49
+0.63029067,0.50
+0.61601224,0.51
+0.63233044,0.52
+0.65323814,0.53
+0.65323814,0.54
+0.67363590,0.55
+0.67006629,0.56
+0.51555329,0.57
+0.50892402,0.58
+0.33299337,0.59
+0.36206017,0.60
+0.43090260,0.61
+0.45996940,0.62
+0.56348802,0.63
+0.54920959,0.64
+0.48393677,0.65
+0.48495665,0.66
+0.46965834,0.67
+0.45181030,0.68
+0.45843957,0.69
+0.47118817,0.70
+0.51555329,0.71
+0.58031617,0.72
+0.55481897,0.73
+0.56297807,0.74
+0.56603774,0.75
+0.57929628,0.76
+0.64762876,0.77
+0.66241713,0.78
+0.69301377,0.79
+0.65119837,0.80
+0.68332483,0.81
+0.66598674,0.82
+0.73890872,0.83
+0.73992861,0.84
+0.84242733,0.85
+0.91330954,0.86
+0.88016318,0.87
+0.90719021,0.88
+0.93115757,0.89
+0.93115757,0.90
+0.91942886,0.91
+0.92911780,0.92
+0.95665477,0.93
+0.95002550,0.94
+0.96940337,0.95
+1.00000000,0.96
+0.89801122,0.97
+0.90311066,0.98
+0.90362060,0.99
+0.83477817,1.0
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/spark/blob/d96e188c/docs/mllib-classification-regression.md
--
diff --git a/docs/mllib-classification-regression.md 
b/docs/mllib-classification-regression.md
index 719cc95..5b9b4dd 100644
--- a/docs/mllib-classification-regression.md
+++ b/docs/mllib-classification-regression.md
@@ -23,7 +23,7 @@ the supported algorithms for each type of problem.
       <td>Multiclass Classification</td><td>decision trees, naive Bayes</td>
     </tr>
     <tr>
-      <td>Regression</td><td>linear least squares, Lasso, ridge regression, decision trees</td>
+      <td>Regression</td><td>linear least squares, Lasso, ridge regression, decision trees, isotonic regression</td>
     </tr>
   </tbody>
 
@@ -35,3 +35,4 @@ More details for these methods can be found here:
 * [linear regression (least squares, Lasso, ridge)](mllib-linear-methods.html#linear-least-squares-lasso-and-ridge-regression)
 * [Decision trees](mllib-decision-tree.html)
 * [Naive Bayes](mllib-naive-bayes.html)
+* [Isotonic regression](mllib-isotonic-regression.html)

spark git commit: [HOTFIX] Ignore DirectKafkaStreamSuite.

2015-02-15 Thread pwendell
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 9c1c70d8c -> 70ebad4d9


[HOTFIX] Ignore DirectKafkaStreamSuite.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/70ebad4d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/70ebad4d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/70ebad4d

Branch: refs/heads/branch-1.3
Commit: 70ebad4d972101dc2f920ac014cd2359b99a50f9
Parents: 9c1c70d
Author: Reynold Xin 
Authored: Fri Feb 13 12:43:53 2015 -0800
Committer: Patrick Wendell 
Committed: Sun Feb 15 09:01:25 2015 -0800

--
 .../spark/streaming/kafka/DirectKafkaStreamSuite.scala   | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/70ebad4d/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
--
diff --git 
a/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
 
b/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
index b25c212..9260944 100644
--- 
a/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
+++ 
b/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
@@ -67,7 +67,7 @@ class DirectKafkaStreamSuite extends KafkaStreamSuiteBase
   }
 
-  test("basic stream receiving with multiple topics and smallest starting offset") {
+  ignore("basic stream receiving with multiple topics and smallest starting offset") {
     val topics = Set("basic1", "basic2", "basic3")
     val data = Map("a" -> 7, "b" -> 9)
     topics.foreach { t =>
@@ -113,7 +113,7 @@ class DirectKafkaStreamSuite extends KafkaStreamSuiteBase
     ssc.stop()
   }
 
-  test("receiving from largest starting offset") {
+  ignore("receiving from largest starting offset") {
     val topic = "largest"
     val topicPartition = TopicAndPartition(topic, 0)
     val data = Map("a" -> 10)
@@ -158,7 +158,7 @@ class DirectKafkaStreamSuite extends KafkaStreamSuiteBase
   }
 
-  test("creating stream by offset") {
+  ignore("creating stream by offset") {
     val topic = "offset"
     val topicPartition = TopicAndPartition(topic, 0)
     val data = Map("a" -> 10)
@@ -204,7 +204,7 @@ class DirectKafkaStreamSuite extends KafkaStreamSuiteBase
   }
 
   // Test to verify the offset ranges can be recovered from the checkpoints
-  test("offset recovery") {
+  ignore("offset recovery") {
     val topic = "recovery"
     createTopic(topic)
     testDir = Utils.createTempDir()


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-5827][SQL] Add missing import in the example of SqlContext

2015-02-15 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 f87f3b755 -> 9c1c70d8c


[SPARK-5827][SQL] Add missing import in the example of SqlContext

If one tries the example via copy & paste, an exception is thrown.

Author: Takeshi Yamamuro 

Closes #4615 from maropu/AddMissingImportInSqlContext and squashes the 
following commits:

ab21b66 [Takeshi Yamamuro] Add missing import in the example of SqlContext

(cherry picked from commit c771e475c449fe07cf45f37bdca2ba6ce9600bfc)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9c1c70d8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9c1c70d8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9c1c70d8

Branch: refs/heads/branch-1.3
Commit: 9c1c70d8cc8cf3afedecbc8868b3765c15bd493e
Parents: f87f3b7
Author: Takeshi Yamamuro 
Authored: Sun Feb 15 14:42:20 2015 +
Committer: Sean Owen 
Committed: Sun Feb 15 14:42:28 2015 +

--
 sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala | 2 ++
 1 file changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/9c1c70d8/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
index a1736d0..6d19148 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
@@ -286,6 +286,7 @@ class SQLContext(@transient val sparkContext: SparkContext)
    * Example:
    * {{{
    *  import org.apache.spark.sql._
+   *  import org.apache.spark.sql.types._
    *  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    *
    *  val schema =
@@ -377,6 +378,7 @@ class SQLContext(@transient val sparkContext: SparkContext)
    * Example:
    * {{{
    *  import org.apache.spark.sql._
+   *  import org.apache.spark.sql.types._
    *  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    *
    *  val schema =

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-5827][SQL] Add missing import in the example of SqlContext

2015-02-15 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master ed5f4bb7c -> c771e475c


[SPARK-5827][SQL] Add missing import in the example of SqlContext

If one tries the example via copy & paste, an exception is thrown.

Author: Takeshi Yamamuro 

Closes #4615 from maropu/AddMissingImportInSqlContext and squashes the 
following commits:

ab21b66 [Takeshi Yamamuro] Add missing import in the example of SqlContext


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c771e475
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c771e475
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c771e475

Branch: refs/heads/master
Commit: c771e475c449fe07cf45f37bdca2ba6ce9600bfc
Parents: ed5f4bb
Author: Takeshi Yamamuro 
Authored: Sun Feb 15 14:42:20 2015 +
Committer: Sean Owen 
Committed: Sun Feb 15 14:42:20 2015 +

--
 sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala | 2 ++
 1 file changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/c771e475/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
index a1736d0..6d19148 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
@@ -286,6 +286,7 @@ class SQLContext(@transient val sparkContext: SparkContext)
    * Example:
    * {{{
    *  import org.apache.spark.sql._
+   *  import org.apache.spark.sql.types._
    *  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    *
    *  val schema =
@@ -377,6 +378,7 @@ class SQLContext(@transient val sparkContext: SparkContext)
    * Example:
    * {{{
    *  import org.apache.spark.sql._
+   *  import org.apache.spark.sql.types._
    *  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    *
    *  val schema =


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org