spark git commit: [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline
Repository: spark
Updated Branches:
  refs/heads/master acf2558dc -> c78a12c4c

[Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

If it's a last estimator in Pipeline there's no need to transform data, since there's no next stage that would consume this data.

Author: Peter Rudenko

Closes #4590 from petro-rudenko/patch-1 and squashes the following commits:

d13ec33 [Peter Rudenko] [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c78a12c4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c78a12c4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c78a12c4

Branch: refs/heads/master
Commit: c78a12c4cc4d4312c4ee1069d3b218882d32d678
Parents: acf2558
Author: Peter Rudenko
Authored: Sun Feb 15 20:51:32 2015 -0800
Committer: Xiangrui Meng
Committed: Sun Feb 15 20:51:32 2015 -0800

----------------------------------------------------------------------
 mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/c78a12c4/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
index bb291e6..5607ed2 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
@@ -114,7 +114,9 @@ class Pipeline extends Estimator[PipelineModel] {
             throw new IllegalArgumentException(
               s"Do not support stage $stage of type ${stage.getClass}")
         }
-        curDataset = transformer.transform(curDataset, paramMap)
+        if (index < indexOfLastEstimator) {
+          curDataset = transformer.transform(curDataset, paramMap)
+        }
         transformers += transformer
       } else {
         transformers += stage.asInstanceOf[Transformer]

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
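The effect of SPARK-5796 can be sketched outside Spark. The snippet below is a minimal illustration in plain Python, not the Spark implementation: `Estimator`, `Transformer`, `AddOne`, `MeanCenterer`, `MeanModel` and `fit_pipeline` are hypothetical toy names, and datasets are plain lists. It mirrors the guarded `transform` call from the diff: a fitted stage only transforms the data when a later estimator still needs it.

```python
class Transformer:
    """Toy base class: maps one dataset to another."""

    def transform(self, dataset):
        raise NotImplementedError


class Estimator:
    """Toy base class: fits a dataset and returns a Transformer."""

    def fit(self, dataset):
        raise NotImplementedError


class AddOne(Transformer):
    def __init__(self):
        self.calls = 0  # counts how often transform() runs

    def transform(self, dataset):
        self.calls += 1
        return [x + 1 for x in dataset]


class MeanModel(Transformer):
    def __init__(self, mean):
        self.mean = mean
        self.calls = 0

    def transform(self, dataset):
        self.calls += 1
        return [x - self.mean for x in dataset]


class MeanCenterer(Estimator):
    def fit(self, dataset):
        return MeanModel(sum(dataset) / len(dataset))


def fit_pipeline(stages, dataset):
    """Fit stages in order; skip transform() on the last estimator's model."""
    index_of_last_estimator = -1
    for index, stage in enumerate(stages):
        if isinstance(stage, Estimator):
            index_of_last_estimator = index

    cur_dataset = dataset
    transformers = []
    for index, stage in enumerate(stages):
        if index <= index_of_last_estimator:
            if isinstance(stage, Estimator):
                transformer = stage.fit(cur_dataset)
            elif isinstance(stage, Transformer):
                transformer = stage
            else:
                raise TypeError("Do not support stage %r" % (stage,))
            # Only transform when a later estimator will consume the output.
            if index < index_of_last_estimator:
                cur_dataset = transformer.transform(cur_dataset)
            transformers.append(transformer)
        else:
            transformers.append(stage)
    return transformers


stages = [AddOne(), MeanCenterer()]
fitted = fit_pipeline(stages, [1, 2, 3])
```

Here `AddOne` runs once because `MeanCenterer` needs its output, while the fitted `MeanModel` is never applied during fitting.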
spark git commit: [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 db3c539f2 -> 9cf7d7088

[Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

If it's a last estimator in Pipeline there's no need to transform data, since there's no next stage that would consume this data.

Author: Peter Rudenko

Closes #4590 from petro-rudenko/patch-1 and squashes the following commits:

d13ec33 [Peter Rudenko] [Ml] SPARK-5796 Don't transform data on a last estimator in Pipeline

(cherry picked from commit c78a12c4cc4d4312c4ee1069d3b218882d32d678)
Signed-off-by: Xiangrui Meng

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9cf7d708
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9cf7d708
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9cf7d708

Branch: refs/heads/branch-1.3
Commit: 9cf7d7088d245b9b41ec78295cd2d6e3e395793d
Parents: db3c539
Author: Peter Rudenko
Authored: Sun Feb 15 20:51:32 2015 -0800
Committer: Xiangrui Meng
Committed: Sun Feb 15 20:51:38 2015 -0800

----------------------------------------------------------------------
 mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/9cf7d708/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
index bb291e6..5607ed2 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
@@ -114,7 +114,9 @@ class Pipeline extends Estimator[PipelineModel] {
             throw new IllegalArgumentException(
               s"Do not support stage $stage of type ${stage.getClass}")
         }
-        curDataset = transformer.transform(curDataset, paramMap)
+        if (index < indexOfLastEstimator) {
+          curDataset = transformer.transform(curDataset, paramMap)
+        }
         transformers += transformer
       } else {
         transformers += stage.asInstanceOf[Transformer]
spark git commit: SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from JBLAS
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 d71099133 -> db3c539f2

SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from JBLAS

Deprecate SVDPlusPlus.run and introduce SVDPlusPlus.runSVDPlusPlus with return type that doesn't include DoubleMatrix

CC mengxr

Author: Sean Owen

Closes #4614 from srowen/SPARK-5815 and squashes the following commits:

288cb05 [Sean Owen] Clarify deprecation plans in scaladoc
497458e [Sean Owen] Deprecate SVDPlusPlus.run and introduce SVDPlusPlus.runSVDPlusPlus with return type that doesn't include DoubleMatrix

(cherry picked from commit acf2558dc92901c342262c35eebb95f2a9b7a9ae)
Signed-off-by: Xiangrui Meng

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/db3c539f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/db3c539f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/db3c539f

Branch: refs/heads/branch-1.3
Commit: db3c539f20e17e327b2f284bf6fbb3f1abd7fe64
Parents: d710991
Author: Sean Owen
Authored: Sun Feb 15 20:41:27 2015 -0800
Committer: Xiangrui Meng
Committed: Sun Feb 15 20:41:33 2015 -0800

----------------------------------------------------------------------
 .../apache/spark/graphx/lib/SVDPlusPlus.scala | 25
 .../spark/graphx/lib/SVDPlusPlusSuite.scala   |  2 +-
 2 files changed, 26 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/db3c539f/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
----------------------------------------------------------------------
diff --git a/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala b/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
index 112ed09..fc84cfb 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
@@ -17,6 +17,8 @@

 package org.apache.spark.graphx.lib

+import org.apache.spark.annotation.Experimental
+
 import scala.util.Random
 import org.jblas.DoubleMatrix
 import org.apache.spark.rdd._
@@ -38,6 +40,8 @@ object SVDPlusPlus {
     extends Serializable

   /**
+   * :: Experimental ::
+   *
    * Implement SVD++ based on "Factorization Meets the Neighborhood:
    * a Multifaceted Collaborative Filtering Model",
    * available at [[http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf]].
@@ -45,12 +49,33 @@ object SVDPlusPlus {
    * The prediction rule is rui = u + bu + bi + qi*(pu + |N(u)|^(-0.5)*sum(y)),
    * see the details on page 6.
    *
+   * This method temporarily replaces `run()`, and replaces `DoubleMatrix` in `run()`'s return
+   * value with `Array[Double]`. In 1.4.0, this method will be deprecated, but will be copied
+   * to replace `run()`, which will then be undeprecated.
+   *
    * @param edges edges for constructing the graph
    *
    * @param conf SVDPlusPlus parameters
    *
    * @return a graph with vertex attributes containing the trained model
    */
+  @Experimental
+  def runSVDPlusPlus(edges: RDD[Edge[Double]], conf: Conf)
+    : (Graph[(Array[Double], Array[Double], Double, Double), Double], Double) =
+  {
+    val (graph, u) = run(edges, conf)
+    // Convert DoubleMatrix to Array[Double]:
+    val newVertices = graph.vertices.mapValues(v => (v._1.toArray, v._2.toArray, v._3, v._4))
+    (Graph(newVertices, graph.edges), u)
+  }
+
+  /**
+   * This method is deprecated in favor of `runSVDPlusPlus()`, which replaces `DoubleMatrix`
+   * with `Array[Double]` in its return value. This method is deprecated. It will effectively
+   * be removed in 1.4.0 when `runSVDPlusPlus()` is copied to replace `run()`, and hence the
+   * return type of this method changes.
+   */
+  @deprecated("Call runSVDPlusPlus", "1.3.0")
   def run(edges: RDD[Edge[Double]], conf: Conf)
     : (Graph[(DoubleMatrix, DoubleMatrix, Double, Double), Double], Double) =
   {

http://git-wip-us.apache.org/repos/asf/spark/blob/db3c539f/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
----------------------------------------------------------------------
diff --git a/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala b/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
index e01df56..9987a4b 100644
--- a/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
+++ b/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
@@ -32,7 +32,7 @@ class SVDPlusPlusSuite extends FunSuite with LocalSparkContext {
         Edge(fields(0).toLong * 2, fields(1).toLong * 2 + 1, fields(2).toDouble)
       }
       val conf = new SVDPlusPlus.Conf(10, 2, 0.0, 5.0, 0.007, 0.007, 0.005, 0.015) // 2 iterations
-      var (graph, u) = SVDPlusPlus.run(edges, conf)
+      var (graph, u) = SVDPlusPlus.runSVDPlusP
spark git commit: SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from JBLAS
Repository: spark
Updated Branches:
  refs/heads/master cd4a15366 -> acf2558dc

SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from JBLAS

Deprecate SVDPlusPlus.run and introduce SVDPlusPlus.runSVDPlusPlus with return type that doesn't include DoubleMatrix

CC mengxr

Author: Sean Owen

Closes #4614 from srowen/SPARK-5815 and squashes the following commits:

288cb05 [Sean Owen] Clarify deprecation plans in scaladoc
497458e [Sean Owen] Deprecate SVDPlusPlus.run and introduce SVDPlusPlus.runSVDPlusPlus with return type that doesn't include DoubleMatrix

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/acf2558d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/acf2558d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/acf2558d

Branch: refs/heads/master
Commit: acf2558dc92901c342262c35eebb95f2a9b7a9ae
Parents: cd4a153
Author: Sean Owen
Authored: Sun Feb 15 20:41:27 2015 -0800
Committer: Xiangrui Meng
Committed: Sun Feb 15 20:41:27 2015 -0800

----------------------------------------------------------------------
 .../apache/spark/graphx/lib/SVDPlusPlus.scala | 25
 .../spark/graphx/lib/SVDPlusPlusSuite.scala   |  2 +-
 2 files changed, 26 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/acf2558d/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
----------------------------------------------------------------------
diff --git a/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala b/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
index 112ed09..fc84cfb 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala
@@ -17,6 +17,8 @@

 package org.apache.spark.graphx.lib

+import org.apache.spark.annotation.Experimental
+
 import scala.util.Random
 import org.jblas.DoubleMatrix
 import org.apache.spark.rdd._
@@ -38,6 +40,8 @@ object SVDPlusPlus {
     extends Serializable

   /**
+   * :: Experimental ::
+   *
    * Implement SVD++ based on "Factorization Meets the Neighborhood:
    * a Multifaceted Collaborative Filtering Model",
    * available at [[http://public.research.att.com/~volinsky/netflix/kdd08koren.pdf]].
@@ -45,12 +49,33 @@ object SVDPlusPlus {
    * The prediction rule is rui = u + bu + bi + qi*(pu + |N(u)|^(-0.5)*sum(y)),
    * see the details on page 6.
    *
+   * This method temporarily replaces `run()`, and replaces `DoubleMatrix` in `run()`'s return
+   * value with `Array[Double]`. In 1.4.0, this method will be deprecated, but will be copied
+   * to replace `run()`, which will then be undeprecated.
+   *
    * @param edges edges for constructing the graph
    *
    * @param conf SVDPlusPlus parameters
    *
    * @return a graph with vertex attributes containing the trained model
    */
+  @Experimental
+  def runSVDPlusPlus(edges: RDD[Edge[Double]], conf: Conf)
+    : (Graph[(Array[Double], Array[Double], Double, Double), Double], Double) =
+  {
+    val (graph, u) = run(edges, conf)
+    // Convert DoubleMatrix to Array[Double]:
+    val newVertices = graph.vertices.mapValues(v => (v._1.toArray, v._2.toArray, v._3, v._4))
+    (Graph(newVertices, graph.edges), u)
+  }
+
+  /**
+   * This method is deprecated in favor of `runSVDPlusPlus()`, which replaces `DoubleMatrix`
+   * with `Array[Double]` in its return value. This method is deprecated. It will effectively
+   * be removed in 1.4.0 when `runSVDPlusPlus()` is copied to replace `run()`, and hence the
+   * return type of this method changes.
+   */
+  @deprecated("Call runSVDPlusPlus", "1.3.0")
   def run(edges: RDD[Edge[Double]], conf: Conf)
     : (Graph[(DoubleMatrix, DoubleMatrix, Double, Double), Double], Double) =
   {

http://git-wip-us.apache.org/repos/asf/spark/blob/acf2558d/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
----------------------------------------------------------------------
diff --git a/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala b/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
index e01df56..9987a4b 100644
--- a/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
+++ b/graphx/src/test/scala/org/apache/spark/graphx/lib/SVDPlusPlusSuite.scala
@@ -32,7 +32,7 @@ class SVDPlusPlusSuite extends FunSuite with LocalSparkContext {
         Edge(fields(0).toLong * 2, fields(1).toLong * 2 + 1, fields(2).toDouble)
       }
       val conf = new SVDPlusPlus.Conf(10, 2, 0.0, 5.0, 0.007, 0.007, 0.005, 0.015) // 2 iterations
-      var (graph, u) = SVDPlusPlus.run(edges, conf)
+      var (graph, u) = SVDPlusPlus.runSVDPlusPlus(edges, conf)
       graph.cache()
       val err = graph.vertices.collect().map{ case (vid, vd) =>
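The deprecation pattern in SPARK-5815 (keep the old entry point, add a new one that returns only plain arrays) can be illustrated with a small Python sketch. Everything here is hypothetical: `MatrixStandIn` stands in for a third-party matrix type such as `DoubleMatrix`, `_train` is a toy replacement for the real SVD++ training, and the function names are invented for the example. Unlike the Scala change, where `runSVDPlusPlus` calls `run` directly, the sketch routes both through a private helper so the new function does not trigger the runtime warning.

```python
import warnings


class MatrixStandIn:
    """Hypothetical stand-in for a third-party matrix type."""

    def __init__(self, values):
        self._values = list(values)

    def to_array(self):
        return list(self._values)


def _train(edges):
    """Toy training: returns ({vertex_id: (p, q, bias, norm)}, global_mean)."""
    mean = sum(weight for _, _, weight in edges) / len(edges)
    vertices = {}
    for src, dst, weight in edges:
        for vid in (src, dst):
            vertices[vid] = (MatrixStandIn([weight]), MatrixStandIn([weight]), 0.0, 1.0)
    return vertices, mean


def run(edges):
    """Deprecated entry point: leaks MatrixStandIn in its return value."""
    warnings.warn("Call run_svd_plus_plus", DeprecationWarning, stacklevel=2)
    return _train(edges)


def run_svd_plus_plus(edges):
    """Replacement entry point: same model, only plain lists are returned."""
    vertices, mean = _train(edges)
    converted = {
        vid: (p.to_array(), q.to_array(), bias, norm)
        for vid, (p, q, bias, norm) in vertices.items()
    }
    return converted, mean
```

Callers that migrate to the replacement no longer depend on the third-party type, which is the point of the API change described in the commit message.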
spark git commit: [SPARK-5769] Set params in constructors and in setParams in Python ML pipelines
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 4e099d757 -> d71099133

[SPARK-5769] Set params in constructors and in setParams in Python ML pipelines

This PR allow Python users to set params in constructors and in setParams, where we use decorator `keyword_only` to force keyword arguments. The trade-off is discussed in the design doc of SPARK-4586.

Generated doc:
![screen shot 2015-02-12 at 3 06 58 am](https://cloud.githubusercontent.com/assets/829644/6166491/9cfcd06a-b265-11e4-99ea-473d866634fc.png)

CC: davies rxin

Author: Xiangrui Meng

Closes #4564 from mengxr/py-pipeline-kw and squashes the following commits:

fedf720 [Xiangrui Meng] use toDF
d565f2c [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into py-pipeline-kw
cbc15d3 [Xiangrui Meng] fix style
5032097 [Xiangrui Meng] update pipeline signature
950774e [Xiangrui Meng] simplify keyword_only and update constructor/setParams signatures
fdde5fc [Xiangrui Meng] fix style
c9384b8 [Xiangrui Meng] fix sphinx doc
8e59180 [Xiangrui Meng] add setParams and make constructors take params, where we force keyword args

(cherry picked from commit cd4a15366244657c4b7936abe5054754534366f2)
Signed-off-by: Xiangrui Meng

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d7109913
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d7109913
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d7109913

Branch: refs/heads/branch-1.3
Commit: d71099133b64a4b9e9ab430cf1b314ee7deaf08d
Parents: 4e099d7
Author: Xiangrui Meng
Authored: Sun Feb 15 20:29:26 2015 -0800
Committer: Xiangrui Meng
Committed: Sun Feb 15 20:29:36 2015 -0800

----------------------------------------------------------------------
 .../ml/simple_text_classification_pipeline.py | 44 +---
 python/docs/conf.py                           |  4 ++
 python/pyspark/ml/classification.py           | 44 +---
 python/pyspark/ml/feature.py                  | 72
 python/pyspark/ml/param/__init__.py           |  8 +++
 python/pyspark/ml/pipeline.py                 | 19 +-
 python/pyspark/ml/util.py                     | 15
 7 files changed, 153 insertions(+), 53 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/d7109913/examples/src/main/python/ml/simple_text_classification_pipeline.py
----------------------------------------------------------------------
diff --git a/examples/src/main/python/ml/simple_text_classification_pipeline.py b/examples/src/main/python/ml/simple_text_classification_pipeline.py
index c7df3d7..b4d9355 100644
--- a/examples/src/main/python/ml/simple_text_classification_pipeline.py
+++ b/examples/src/main/python/ml/simple_text_classification_pipeline.py
@@ -36,43 +36,33 @@ if __name__ == "__main__":
     sqlCtx = SQLContext(sc)

     # Prepare training documents, which are labeled.
-    LabeledDocument = Row('id', 'text', 'label')
-    training = sqlCtx.inferSchema(
-        sc.parallelize([(0L, "a b c d e spark", 1.0),
-                        (1L, "b d", 0.0),
-                        (2L, "spark f g h", 1.0),
-                        (3L, "hadoop mapreduce", 0.0)])
-          .map(lambda x: LabeledDocument(*x)))
+    LabeledDocument = Row("id", "text", "label")
+    training = sc.parallelize([(0L, "a b c d e spark", 1.0),
+                               (1L, "b d", 0.0),
+                               (2L, "spark f g h", 1.0),
+                               (3L, "hadoop mapreduce", 0.0)]) \
+        .map(lambda x: LabeledDocument(*x)).toDF()

     # Configure an ML pipeline, which consists of tree stages: tokenizer, hashingTF, and lr.
-    tokenizer = Tokenizer() \
-        .setInputCol("text") \
-        .setOutputCol("words")
-    hashingTF = HashingTF() \
-        .setInputCol(tokenizer.getOutputCol()) \
-        .setOutputCol("features")
-    lr = LogisticRegression() \
-        .setMaxIter(10) \
-        .setRegParam(0.01)
-    pipeline = Pipeline() \
-        .setStages([tokenizer, hashingTF, lr])
+    tokenizer = Tokenizer(inputCol="text", outputCol="words")
+    hashingTF = HashingTF(inputCol=tokenizer.getOutputCol(), outputCol="features")
+    lr = LogisticRegression(maxIter=10, regParam=0.01)
+    pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])

     # Fit the pipeline to training documents.
     model = pipeline.fit(training)

     # Prepare test documents, which are unlabeled.
-    Document = Row('id', 'text')
-    test = sqlCtx.inferSchema(
-        sc.parallelize([(4L, "spark i j k"),
-                        (5L, "l m n"),
-                        (6L, "mapreduce spark"),
-                        (7L, "apache hadoop")])
-          .map(lambda x: Document(*x)))
+    Document = Row("id", "text")
+    test = sc.parallelize([(4L, "spark i j k"),
+                           (5L, "l m n"),
+
spark git commit: [SPARK-5769] Set params in constructors and in setParams in Python ML pipelines
Repository: spark
Updated Branches:
  refs/heads/master 836577b38 -> cd4a15366

[SPARK-5769] Set params in constructors and in setParams in Python ML pipelines

This PR allow Python users to set params in constructors and in setParams, where we use decorator `keyword_only` to force keyword arguments. The trade-off is discussed in the design doc of SPARK-4586.

Generated doc:
![screen shot 2015-02-12 at 3 06 58 am](https://cloud.githubusercontent.com/assets/829644/6166491/9cfcd06a-b265-11e4-99ea-473d866634fc.png)

CC: davies rxin

Author: Xiangrui Meng

Closes #4564 from mengxr/py-pipeline-kw and squashes the following commits:

fedf720 [Xiangrui Meng] use toDF
d565f2c [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into py-pipeline-kw
cbc15d3 [Xiangrui Meng] fix style
5032097 [Xiangrui Meng] update pipeline signature
950774e [Xiangrui Meng] simplify keyword_only and update constructor/setParams signatures
fdde5fc [Xiangrui Meng] fix style
c9384b8 [Xiangrui Meng] fix sphinx doc
8e59180 [Xiangrui Meng] add setParams and make constructors take params, where we force keyword args

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cd4a1536
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cd4a1536
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cd4a1536

Branch: refs/heads/master
Commit: cd4a15366244657c4b7936abe5054754534366f2
Parents: 836577b
Author: Xiangrui Meng
Authored: Sun Feb 15 20:29:26 2015 -0800
Committer: Xiangrui Meng
Committed: Sun Feb 15 20:29:26 2015 -0800

----------------------------------------------------------------------
 .../ml/simple_text_classification_pipeline.py | 44 +---
 python/docs/conf.py                           |  4 ++
 python/pyspark/ml/classification.py           | 44 +---
 python/pyspark/ml/feature.py                  | 72
 python/pyspark/ml/param/__init__.py           |  8 +++
 python/pyspark/ml/pipeline.py                 | 19 +-
 python/pyspark/ml/util.py                     | 15
 7 files changed, 153 insertions(+), 53 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/cd4a1536/examples/src/main/python/ml/simple_text_classification_pipeline.py
----------------------------------------------------------------------
diff --git a/examples/src/main/python/ml/simple_text_classification_pipeline.py b/examples/src/main/python/ml/simple_text_classification_pipeline.py
index c7df3d7..b4d9355 100644
--- a/examples/src/main/python/ml/simple_text_classification_pipeline.py
+++ b/examples/src/main/python/ml/simple_text_classification_pipeline.py
@@ -36,43 +36,33 @@ if __name__ == "__main__":
     sqlCtx = SQLContext(sc)

     # Prepare training documents, which are labeled.
-    LabeledDocument = Row('id', 'text', 'label')
-    training = sqlCtx.inferSchema(
-        sc.parallelize([(0L, "a b c d e spark", 1.0),
-                        (1L, "b d", 0.0),
-                        (2L, "spark f g h", 1.0),
-                        (3L, "hadoop mapreduce", 0.0)])
-          .map(lambda x: LabeledDocument(*x)))
+    LabeledDocument = Row("id", "text", "label")
+    training = sc.parallelize([(0L, "a b c d e spark", 1.0),
+                               (1L, "b d", 0.0),
+                               (2L, "spark f g h", 1.0),
+                               (3L, "hadoop mapreduce", 0.0)]) \
+        .map(lambda x: LabeledDocument(*x)).toDF()

     # Configure an ML pipeline, which consists of tree stages: tokenizer, hashingTF, and lr.
-    tokenizer = Tokenizer() \
-        .setInputCol("text") \
-        .setOutputCol("words")
-    hashingTF = HashingTF() \
-        .setInputCol(tokenizer.getOutputCol()) \
-        .setOutputCol("features")
-    lr = LogisticRegression() \
-        .setMaxIter(10) \
-        .setRegParam(0.01)
-    pipeline = Pipeline() \
-        .setStages([tokenizer, hashingTF, lr])
+    tokenizer = Tokenizer(inputCol="text", outputCol="words")
+    hashingTF = HashingTF(inputCol=tokenizer.getOutputCol(), outputCol="features")
+    lr = LogisticRegression(maxIter=10, regParam=0.01)
+    pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])

     # Fit the pipeline to training documents.
     model = pipeline.fit(training)

     # Prepare test documents, which are unlabeled.
-    Document = Row('id', 'text')
-    test = sqlCtx.inferSchema(
-        sc.parallelize([(4L, "spark i j k"),
-                        (5L, "l m n"),
-                        (6L, "mapreduce spark"),
-                        (7L, "apache hadoop")])
-          .map(lambda x: Document(*x)))
+    Document = Row("id", "text")
+    test = sc.parallelize([(4L, "spark i j k"),
+                           (5L, "l m n"),
+                           (6L, "mapreduce spark"),
+                           (7L, "apache hadoop")]) \
+        .map(lambda x:
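The commit message for SPARK-5769 mentions a `keyword_only` decorator that forces keyword arguments, but its body is not part of the diff shown here. The following is a minimal sketch of how such a decorator could work, written for Python 3; it is an assumption for illustration, not the code in `python/pyspark/ml/util.py`. `ToyTokenizer`, its `_params` dict and the `_input_kwargs` attribute are hypothetical names.

```python
import functools


def keyword_only(func):
    """Reject positional arguments and remember the keyword arguments passed."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        # Record only what the caller set explicitly (an assumption of this sketch).
        self._input_kwargs = kwargs
        return func(self, **kwargs)

    return wrapper


class ToyTokenizer:
    """Hypothetical stage whose params can be set in the constructor or setParams."""

    @keyword_only
    def __init__(self, inputCol="input", outputCol="output"):
        # Start from defaults, then apply whatever the caller passed by keyword.
        self._params = {"inputCol": "input", "outputCol": "output"}
        kwargs = self._input_kwargs
        self.setParams(**kwargs)

    @keyword_only
    def setParams(self, inputCol="input", outputCol="output"):
        self._params.update(self._input_kwargs)
        return self


tokenizer = ToyTokenizer(inputCol="text", outputCol="words")
```

Recording only the explicitly passed keywords lets `setParams` change one param without resetting the others to their defaults, at the cost of a slightly unusual calling convention.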
spark git commit: SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libgfortran, libgcc code via JBLAS
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 d96e188c7 -> 4e099d757

SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libgfortran, libgcc code via JBLAS

Exclude libgfortran, libgcc bundled by JBLAS for Windows. This much is simple, and solves the essential license issue. But the more important question is whether MLlib works on Windows then.

Author: Sean Owen

Closes #4453 from srowen/SPARK-5669 and squashes the following commits:

734dd86 [Sean Owen] Exclude libgfortran, libgcc bundled by JBLAS, affecting Windows / OS X / Linux 32-bit (not Linux 64-bit)

(cherry picked from commit 836577b382695558f5c97d94ee725d0156ebfad2)
Signed-off-by: Xiangrui Meng

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4e099d75
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4e099d75
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4e099d75

Branch: refs/heads/branch-1.3
Commit: 4e099d757fc1bc4266f7849db6da0e996bf917be
Parents: d96e188
Author: Sean Owen
Authored: Sun Feb 15 09:15:48 2015 -0800
Committer: Xiangrui Meng
Committed: Sun Feb 15 09:16:03 2015 -0800

----------------------------------------------------------------------
 assembly/pom.xml | 10 ++
 1 file changed, 10 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/4e099d75/assembly/pom.xml
----------------------------------------------------------------------
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 87b3e6f..7752b41 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -118,6 +118,16 @@
 META-INF/*.RSA
+
+
+ org.jblas:jblas
+
+
+lib/Linux/i386/**
+lib/Mac OS X/**
+lib/Windows/**
+
+
spark git commit: SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libgfortran, libgcc code via JBLAS
Repository: spark
Updated Branches:
  refs/heads/master 61eb12674 -> 836577b38

SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libgfortran, libgcc code via JBLAS

Exclude libgfortran, libgcc bundled by JBLAS for Windows. This much is simple, and solves the essential license issue. But the more important question is whether MLlib works on Windows then.

Author: Sean Owen

Closes #4453 from srowen/SPARK-5669 and squashes the following commits:

734dd86 [Sean Owen] Exclude libgfortran, libgcc bundled by JBLAS, affecting Windows / OS X / Linux 32-bit (not Linux 64-bit)

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/836577b3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/836577b3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/836577b3

Branch: refs/heads/master
Commit: 836577b382695558f5c97d94ee725d0156ebfad2
Parents: 61eb126
Author: Sean Owen
Authored: Sun Feb 15 09:15:48 2015 -0800
Committer: Xiangrui Meng
Committed: Sun Feb 15 09:15:48 2015 -0800

----------------------------------------------------------------------
 assembly/pom.xml | 10 ++
 1 file changed, 10 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/836577b3/assembly/pom.xml
----------------------------------------------------------------------
diff --git a/assembly/pom.xml b/assembly/pom.xml
index fa9f56e..fbb6e94 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -114,6 +114,16 @@
 META-INF/*.RSA
+
+
+ org.jblas:jblas
+
+
+lib/Linux/i386/**
+lib/Mac OS X/**
+lib/Windows/**
+
+
spark git commit: [MLLIB][SPARK-5502] User guide for isotonic regression
Repository: spark
Updated Branches:
  refs/heads/master c771e475c -> 61eb12674

[MLLIB][SPARK-5502] User guide for isotonic regression

User guide for isotonic regression added to docs/mllib-regression.md including code examples for Scala and Java.

Author: martinzapletal

Closes #4536 from zapletal-martin/SPARK-5502 and squashes the following commits:

67fe773 [martinzapletal] SPARK-5502 reworded model prediction rules to use more general language rather than the code/implementation specific terms
80bd4c3 [martinzapletal] SPARK-5502 created docs page for isotonic regression, added links to the page, updated data and examples
7d8136e [martinzapletal] SPARK-5502 Added documentation for Isotonic regression including examples for Scala and Java
504b5c3 [martinzapletal] SPARK-5502 Added documentation for Isotonic regression including examples for Scala and Java

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/61eb1267
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/61eb1267
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/61eb1267

Branch: refs/heads/master
Commit: 61eb12674b90143388a01c22bf51cb7d02ab0447
Parents: c771e47
Author: martinzapletal
Authored: Sun Feb 15 09:10:03 2015 -0800
Committer: Xiangrui Meng
Committed: Sun Feb 15 09:10:03 2015 -0800

----------------------------------------------------------------------
 data/mllib/sample_isotonic_regression_data.txt | 100 +
 docs/mllib-classification-regression.md        |   3 +-
 docs/mllib-guide.md                            |   1 +
 docs/mllib-isotonic-regression.md              | 155
 4 files changed, 258 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/61eb1267/data/mllib/sample_isotonic_regression_data.txt
----------------------------------------------------------------------
diff --git a/data/mllib/sample_isotonic_regression_data.txt b/data/mllib/sample_isotonic_regression_data.txt
new file mode 100644
index 000..d257b50
--- /dev/null
+++ b/data/mllib/sample_isotonic_regression_data.txt
@@ -0,0 +1,100 @@
+0.24579296,0.01
+0.28505864,0.02
+0.31208567,0.03
+0.35900051,0.04
+0.35747068,0.05
+0.16675166,0.06
+0.17491076,0.07
+0.04181540,0.08
+0.04793473,0.09
+0.03926568,0.10
+0.12952575,0.11
+0.,0.12
+0.01376849,0.13
+0.13105558,0.14
+0.08873024,0.15
+0.12595614,0.16
+0.15247323,0.17
+0.25956145,0.18
+0.20040796,0.19
+0.19581846,0.20
+0.15757267,0.21
+0.13717491,0.22
+0.19020908,0.23
+0.19581846,0.24
+0.20091790,0.25
+0.16879143,0.26
+0.18510964,0.27
+0.20040796,0.28
+0.29576747,0.29
+0.43396226,0.30
+0.53391127,0.31
+0.52116267,0.32
+0.48546660,0.33
+0.49209587,0.34
+0.54156043,0.35
+0.59765426,0.36
+0.56144824,0.37
+0.58592555,0.38
+0.52983172,0.39
+0.50178480,0.40
+0.52626211,0.41
+0.58286588,0.42
+0.64660887,0.43
+0.68077511,0.44
+0.74298827,0.45
+0.64864865,0.46
+0.67261601,0.47
+0.65782764,0.48
+0.69811321,0.49
+0.63029067,0.50
+0.61601224,0.51
+0.63233044,0.52
+0.65323814,0.53
+0.65323814,0.54
+0.67363590,0.55
+0.67006629,0.56
+0.51555329,0.57
+0.50892402,0.58
+0.33299337,0.59
+0.36206017,0.60
+0.43090260,0.61
+0.45996940,0.62
+0.56348802,0.63
+0.54920959,0.64
+0.48393677,0.65
+0.48495665,0.66
+0.46965834,0.67
+0.45181030,0.68
+0.45843957,0.69
+0.47118817,0.70
+0.51555329,0.71
+0.58031617,0.72
+0.55481897,0.73
+0.56297807,0.74
+0.56603774,0.75
+0.57929628,0.76
+0.64762876,0.77
+0.66241713,0.78
+0.69301377,0.79
+0.65119837,0.80
+0.68332483,0.81
+0.66598674,0.82
+0.73890872,0.83
+0.73992861,0.84
+0.84242733,0.85
+0.91330954,0.86
+0.88016318,0.87
+0.90719021,0.88
+0.93115757,0.89
+0.93115757,0.90
+0.91942886,0.91
+0.92911780,0.92
+0.95665477,0.93
+0.95002550,0.94
+0.96940337,0.95
+1.,0.96
+0.89801122,0.97
+0.90311066,0.98
+0.90362060,0.99
+0.83477817,1.0
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/spark/blob/61eb1267/docs/mllib-classification-regression.md
----------------------------------------------------------------------
diff --git a/docs/mllib-classification-regression.md b/docs/mllib-classification-regression.md
index 719cc95..5b9b4dd 100644
--- a/docs/mllib-classification-regression.md
+++ b/docs/mllib-classification-regression.md
@@ -23,7 +23,7 @@ the supported algorithms for each type of problem.
       <td>Multiclass Classification</td><td>decision trees, naive Bayes</td>
-      <td>Regression</td><td>linear least squares, Lasso, ridge regression, decision trees</td>
+      <td>Regression</td><td>linear least squares, Lasso, ridge regression, decision trees, isotonic regression</td>
@@ -35,3 +35,4 @@ More details for these methods can be found here:
 * [linear regression (least squares, Lasso, ridge)](mllib-linear-methods.html#linear-least-squares-lasso-and-ridge-regression)
 * [Decision trees](mllib-decision-tree.html)
 * [Naive Bayes](mllib-naive-bayes.html)
+* [Isotonic regression](mllib-i
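For readers unfamiliar with the method this guide documents, the sketch below shows one common way to fit an isotonic (non-decreasing) regression, the pool adjacent violators algorithm, in plain Python. It is an illustration under stated assumptions, not MLlib's implementation: it assumes each line of the sample file is `label,feature`, it ignores weights, and `parse_line` and `isotonic_fit` are hypothetical helper names.

```python
def parse_line(line):
    """Parse one 'label,feature' line (assumed format of the sample data)."""
    label, feature = line.strip().split(",")
    return float(label), float(feature)


def isotonic_fit(points):
    """Pool adjacent violators on (label, feature) pairs.

    Returns (feature, fitted_label) pairs sorted by feature, with fitted
    labels that never decrease as the feature grows.
    """
    ordered = sorted(points, key=lambda p: p[1])
    blocks = []  # each block is [sum_of_labels, count]
    for label, _ in ordered:
        blocks.append([label, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [(feature, value) for (_, feature), value in zip(ordered, fitted)]


# The first rows of the sample data from the diff above.
sample = ["0.24579296,0.01", "0.28505864,0.02", "0.31208567,0.03",
          "0.35900051,0.04", "0.35747068,0.05", "0.16675166,0.06"]
model = isotonic_fit([parse_line(line) for line in sample])
```

Each merged block is replaced by its mean, which is why runs of decreasing labels in the input become flat segments in the fitted output.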
spark git commit: [MLLIB][SPARK-5502] User guide for isotonic regression
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 70ebad4d9 -> d96e188c7

[MLLIB][SPARK-5502] User guide for isotonic regression

User guide for isotonic regression added to docs/mllib-regression.md including code examples for Scala and Java.

Author: martinzapletal

Closes #4536 from zapletal-martin/SPARK-5502 and squashes the following commits:

67fe773 [martinzapletal] SPARK-5502 reworded model prediction rules to use more general language rather than the code/implementation specific terms
80bd4c3 [martinzapletal] SPARK-5502 created docs page for isotonic regression, added links to the page, updated data and examples
7d8136e [martinzapletal] SPARK-5502 Added documentation for Isotonic regression including examples for Scala and Java
504b5c3 [martinzapletal] SPARK-5502 Added documentation for Isotonic regression including examples for Scala and Java

(cherry picked from commit 61eb12674b90143388a01c22bf51cb7d02ab0447)
Signed-off-by: Xiangrui Meng

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d96e188c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d96e188c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d96e188c

Branch: refs/heads/branch-1.3
Commit: d96e188c7a2b52cff32814f8e0596f030c14ad21
Parents: 70ebad4
Author: martinzapletal
Authored: Sun Feb 15 09:10:03 2015 -0800
Committer: Xiangrui Meng
Committed: Sun Feb 15 09:10:12 2015 -0800

----------------------------------------------------------------------
 data/mllib/sample_isotonic_regression_data.txt | 100 +
 docs/mllib-classification-regression.md        |   3 +-
 docs/mllib-guide.md                            |   1 +
 docs/mllib-isotonic-regression.md              | 155
 4 files changed, 258 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/d96e188c/data/mllib/sample_isotonic_regression_data.txt
----------------------------------------------------------------------
diff --git a/data/mllib/sample_isotonic_regression_data.txt b/data/mllib/sample_isotonic_regression_data.txt
new file mode 100644
index 000..d257b50
--- /dev/null
+++ b/data/mllib/sample_isotonic_regression_data.txt
@@ -0,0 +1,100 @@
+0.24579296,0.01
+0.28505864,0.02
+0.31208567,0.03
+0.35900051,0.04
+0.35747068,0.05
+0.16675166,0.06
+0.17491076,0.07
+0.04181540,0.08
+0.04793473,0.09
+0.03926568,0.10
+0.12952575,0.11
+0.,0.12
+0.01376849,0.13
+0.13105558,0.14
+0.08873024,0.15
+0.12595614,0.16
+0.15247323,0.17
+0.25956145,0.18
+0.20040796,0.19
+0.19581846,0.20
+0.15757267,0.21
+0.13717491,0.22
+0.19020908,0.23
+0.19581846,0.24
+0.20091790,0.25
+0.16879143,0.26
+0.18510964,0.27
+0.20040796,0.28
+0.29576747,0.29
+0.43396226,0.30
+0.53391127,0.31
+0.52116267,0.32
+0.48546660,0.33
+0.49209587,0.34
+0.54156043,0.35
+0.59765426,0.36
+0.56144824,0.37
+0.58592555,0.38
+0.52983172,0.39
+0.50178480,0.40
+0.52626211,0.41
+0.58286588,0.42
+0.64660887,0.43
+0.68077511,0.44
+0.74298827,0.45
+0.64864865,0.46
+0.67261601,0.47
+0.65782764,0.48
+0.69811321,0.49
+0.63029067,0.50
+0.61601224,0.51
+0.63233044,0.52
+0.65323814,0.53
+0.65323814,0.54
+0.67363590,0.55
+0.67006629,0.56
+0.51555329,0.57
+0.50892402,0.58
+0.33299337,0.59
+0.36206017,0.60
+0.43090260,0.61
+0.45996940,0.62
+0.56348802,0.63
+0.54920959,0.64
+0.48393677,0.65
+0.48495665,0.66
+0.46965834,0.67
+0.45181030,0.68
+0.45843957,0.69
+0.47118817,0.70
+0.51555329,0.71
+0.58031617,0.72
+0.55481897,0.73
+0.56297807,0.74
+0.56603774,0.75
+0.57929628,0.76
+0.64762876,0.77
+0.66241713,0.78
+0.69301377,0.79
+0.65119837,0.80
+0.68332483,0.81
+0.66598674,0.82
+0.73890872,0.83
+0.73992861,0.84
+0.84242733,0.85
+0.91330954,0.86
+0.88016318,0.87
+0.90719021,0.88
+0.93115757,0.89
+0.93115757,0.90
+0.91942886,0.91
+0.92911780,0.92
+0.95665477,0.93
+0.95002550,0.94
+0.96940337,0.95
+1.,0.96
+0.89801122,0.97
+0.90311066,0.98
+0.90362060,0.99
+0.83477817,1.0
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/spark/blob/d96e188c/docs/mllib-classification-regression.md
----------------------------------------------------------------------
diff --git a/docs/mllib-classification-regression.md b/docs/mllib-classification-regression.md
index 719cc95..5b9b4dd 100644
--- a/docs/mllib-classification-regression.md
+++ b/docs/mllib-classification-regression.md @@ -23,7 +23,7 @@ the supported algorithms for each type of problem. Multiclass Classificationdecision trees, naive Bayes - Regressionlinear least squares, Lasso, ridge regression, decision trees + Regressionlinear least squares, Lasso, ridge regression, decision trees, isotonic regression @@ -35,3 +35,4 @@ More details for these methods can be found here: * [linear regression (least squares, Lasso, ridge)](mllib-linear-methods.html#linear-least-squares-lasso-and-ridge-regression) * [Decisio
spark git commit: [HOTFIX] Ignore DirectKafkaStreamSuite.
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 9c1c70d8c -> 70ebad4d9


[HOTFIX] Ignore DirectKafkaStreamSuite.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/70ebad4d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/70ebad4d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/70ebad4d

Branch: refs/heads/branch-1.3
Commit: 70ebad4d972101dc2f920ac014cd2359b99a50f9
Parents: 9c1c70d
Author: Reynold Xin
Authored: Fri Feb 13 12:43:53 2015 -0800
Committer: Patrick Wendell
Committed: Sun Feb 15 09:01:25 2015 -0800

--
 .../spark/streaming/kafka/DirectKafkaStreamSuite.scala | 8
 1 file changed, 4 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/70ebad4d/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
--
diff --git a/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala b/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
index b25c212..9260944 100644
--- a/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
+++ b/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala
@@ -67,7 +67,7 @@ class DirectKafkaStreamSuite extends KafkaStreamSuiteBase
   }

-  test("basic stream receiving with multiple topics and smallest starting offset") {
+  ignore("basic stream receiving with multiple topics and smallest starting offset") {
     val topics = Set("basic1", "basic2", "basic3")
     val data = Map("a" -> 7, "b" -> 9)
     topics.foreach { t =>
@@ -113,7 +113,7 @@ class DirectKafkaStreamSuite extends KafkaStreamSuiteBase
     ssc.stop()
   }

-  test("receiving from largest starting offset") {
+  ignore("receiving from largest starting offset") {
     val topic = "largest"
     val topicPartition = TopicAndPartition(topic, 0)
     val data = Map("a" -> 10)
@@ -158,7 +158,7 @@ class DirectKafkaStreamSuite extends KafkaStreamSuiteBase
   }

-  test("creating stream by offset") {
+  ignore("creating stream by offset") {
     val topic = "offset"
     val topicPartition = TopicAndPartition(topic, 0)
     val data = Map("a" -> 10)
@@ -204,7 +204,7 @@ class DirectKafkaStreamSuite extends KafkaStreamSuiteBase
   }

   // Test to verify the offset ranges can be recovered from the checkpoints
-  test("offset recovery") {
+  ignore("offset recovery") {
     val topic = "recovery"
     createTopic(topic)
     testDir = Utils.createTempDir()

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
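The hotfix above swaps `test(...)` for `ignore(...)` in a ScalaTest suite. As a minimal sketch of what that swap does, assuming ScalaTest 2.x on the classpath (`IgnoreSketchSuite` and both test names are hypothetical and not part of Spark): an ignored test keeps its name and body in the source, stays registered with the suite, and is reported as ignored instead of being run.

```scala
import org.scalatest.FunSuite

// Hypothetical suite for illustration only; it is not part of the Spark code base.
class IgnoreSketchSuite extends FunSuite {

  // Registered and executed as usual.
  test("runs normally") {
    assert(Set("basic1", "basic2", "basic3").size === 3)
  }

  // Same shape as test(...), but the body is skipped at run time,
  // so the failure below is never triggered while the test is ignored.
  ignore("temporarily disabled") {
    fail("not reached while the test is ignored")
  }
}
```

Because only the registration method changes, re-enabling a test later is a one-word edit back to `test`.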
spark git commit: [SPARK-5827][SQL] Add missing import in the example of SqlContext
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 f87f3b755 -> 9c1c70d8c


[SPARK-5827][SQL] Add missing import in the example of SqlContext

If one tries the example by copy and paste, it throws an exception.

Author: Takeshi Yamamuro

Closes #4615 from maropu/AddMissingImportInSqlContext and squashes the following commits:

ab21b66 [Takeshi Yamamuro] Add missing import in the example of SqlContext

(cherry picked from commit c771e475c449fe07cf45f37bdca2ba6ce9600bfc)
Signed-off-by: Sean Owen

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9c1c70d8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9c1c70d8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9c1c70d8

Branch: refs/heads/branch-1.3
Commit: 9c1c70d8cc8cf3afedecbc8868b3765c15bd493e
Parents: f87f3b7
Author: Takeshi Yamamuro
Authored: Sun Feb 15 14:42:20 2015 +
Committer: Sean Owen
Committed: Sun Feb 15 14:42:28 2015 +

--
 sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala | 2 ++
 1 file changed, 2 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/9c1c70d8/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
index a1736d0..6d19148 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
@@ -286,6 +286,7 @@ class SQLContext(@transient val sparkContext: SparkContext)
    * Example:
    * {{{
    *  import org.apache.spark.sql._
+   *  import org.apache.spark.sql.types._
    *  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    *
    *  val schema =
@@ -377,6 +378,7 @@ class SQLContext(@transient val sparkContext: SparkContext)
    * Example:
    * {{{
    *  import org.apache.spark.sql._
+   *  import org.apache.spark.sql.types._
    *  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    *
    *  val schema =

-
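The added import matters because the Scaladoc example goes on to build a schema right after `val schema =`. A minimal sketch of that step, assuming Spark 1.3 with spark-sql on the classpath, where `StructType`, `StructField` and the data type objects are defined in `org.apache.spark.sql.types` (the object name `SchemaSketch` and the two field names are illustrative assumptions, not taken from the diff):

```scala
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical holder object for illustration; not part of Spark.
object SchemaSketch {

  // Without the types import above, StructType, StructField, StringType and
  // IntegerType are not in scope, so a pasted example fails at this point.
  val schema: StructType = StructType(
    StructField("name", StringType, nullable = false) ::
    StructField("age", IntegerType, nullable = true) :: Nil)

  def main(args: Array[String]): Unit = {
    // Print the column names of the schema.
    println(schema.fieldNames.mkString(", "))
  }
}
```

The sketch imports the four names explicitly instead of using the wildcard from the commit, which makes it visible exactly which identifiers the example depends on.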
spark git commit: [SPARK-5827][SQL] Add missing import in the example of SqlContext
Repository: spark
Updated Branches:
  refs/heads/master ed5f4bb7c -> c771e475c


[SPARK-5827][SQL] Add missing import in the example of SqlContext

If one tries the example by copy and paste, it throws an exception.

Author: Takeshi Yamamuro

Closes #4615 from maropu/AddMissingImportInSqlContext and squashes the following commits:

ab21b66 [Takeshi Yamamuro] Add missing import in the example of SqlContext

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c771e475
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c771e475
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c771e475

Branch: refs/heads/master
Commit: c771e475c449fe07cf45f37bdca2ba6ce9600bfc
Parents: ed5f4bb
Author: Takeshi Yamamuro
Authored: Sun Feb 15 14:42:20 2015 +
Committer: Sean Owen
Committed: Sun Feb 15 14:42:20 2015 +

--
 sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala | 2 ++
 1 file changed, 2 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/c771e475/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
index a1736d0..6d19148 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
@@ -286,6 +286,7 @@ class SQLContext(@transient val sparkContext: SparkContext)
    * Example:
    * {{{
    *  import org.apache.spark.sql._
+   *  import org.apache.spark.sql.types._
    *  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    *
    *  val schema =
@@ -377,6 +378,7 @@ class SQLContext(@transient val sparkContext: SparkContext)
    * Example:
    * {{{
    *  import org.apache.spark.sql._
+   *  import org.apache.spark.sql.types._
    *  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    *
    *  val schema =