[2/2] spark git commit: [SPARK-4486][MLLIB] Improve GradientBoosting APIs and doc

2014-11-20 Thread meng
[SPARK-4486][MLLIB] Improve GradientBoosting APIs and doc. There are some inconsistencies in the gradient boosting APIs. The target is a general boosting meta-algorithm, but the implementation is attached to trees. This was partially due to the delay of SPARK-1856. But for the 1.2 release, we
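The "general boosting meta-algorithm" the commit refers to can be sketched independently of trees: a boosted model is an additive combination of weak learners. A minimal, hypothetical Python sketch (function names are illustrative, not Spark's API):

```python
def boosted_predict(learners, weights, x):
    """Additive-model prediction: weighted sum of weak-learner outputs.

    Any callable can serve as a weak learner, which is why the
    meta-algorithm need not be attached to trees.
    """
    return sum(w * f(x) for f, w in zip(learners, weights))
```

For regression the raw sum is the prediction; for binary classification its sign is typically taken.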

[1/2] spark git commit: [SPARK-4486][MLLIB] Improve GradientBoosting APIs and doc

2014-11-20 Thread meng
Repository: spark Updated Branches: refs/heads/master e216ffaea -> 15cacc812 http://git-wip-us.apache.org/repos/asf/spark/blob/15cacc81/mllib/src/test/scala/org/apache/spark/mllib/tree/EnsembleTestHelper.scala -- diff --git

[1/2] spark git commit: [SPARK-4486][MLLIB] Improve GradientBoosting APIs and doc

2014-11-20 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.2 83d24efb0 -> e958132a8 http://git-wip-us.apache.org/repos/asf/spark/blob/e958132a/mllib/src/test/scala/org/apache/spark/mllib/tree/EnsembleTestHelper.scala -- diff --git

spark git commit: [SPARK-4481][Streaming][Doc] Fix the wrong description of updateFunc (backport for branch-1.2)

2014-11-20 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.2 e958132a8 -> b676d9ad3 [SPARK-4481][Streaming][Doc] Fix the wrong description of updateFunc (backport for branch-1.2) backport for branch-1.2 as per #3356 Author: zsxwing zsxw...@gmail.com Closes #3376 from
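The `updateFunc` being documented here is the callback of Streaming's `updateStateByKey`: it receives the batch's new values for a key together with that key's previous state (absent the first time the key appears). A hypothetical pure-Python sketch of a running-count update function mirroring that contract, with `None` playing the role of Scala's `Option`:

```python
def update_func(new_values, state):
    """Fold this batch's values for one key into its running state.

    new_values: values that arrived for the key in the current batch
                (may be empty when only old state exists).
    state: previous state for the key, or None if the key is new.
    """
    return (state or 0) + sum(new_values)
```

In real Spark Streaming, returning the equivalent of `None` from the update function drops the key from the state.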

Git Push Summary

2014-11-20 Thread tdas
Repository: spark Updated Branches: refs/heads/filestream-fix1 [deleted] 6b8d85b2b

spark git commit: [SPARK-4228][SQL] SchemaRDD to JSON

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 2fb683c58 -> 21f582f12 [SPARK-4228][SQL] SchemaRDD to JSON Here's a simple fix for SchemaRDD to JSON. Author: Dan McClary dan.mccl...@gmail.com Closes #3213 from dwmclary/SPARK-4228 and squashes the following commits: d714e1d [Dan

spark git commit: [SPARK-4228][SQL] SchemaRDD to JSON

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master abf29187f -> b8e6886fb [SPARK-4228][SQL] SchemaRDD to JSON Here's a simple fix for SchemaRDD to JSON. Author: Dan McClary dan.mccl...@gmail.com Closes #3213 from dwmclary/SPARK-4228 and squashes the following commits: d714e1d [Dan
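Conceptually, a SchemaRDD-to-JSON conversion pairs each row's values with the schema's field names and serializes the resulting record. A hypothetical sketch with the standard `json` module (illustrative only, not the actual SchemaRDD implementation):

```python
import json

def rows_to_json(field_names, rows):
    """Serialize each row as a JSON object keyed by the schema's field names."""
    return [json.dumps(dict(zip(field_names, row)), sort_keys=True)
            for row in rows]
```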

spark git commit: [SPARK-4439] [MLlib] add python api for random forest

2014-11-20 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.2 21f582f12 -> 72f5ba1fc [SPARK-4439] [MLlib] add python api for random forest ``` class RandomForestModel | A model trained by RandomForest | | numTrees(self) | Get number of trees in forest. | |

spark git commit: [SPARK-4439] [MLlib] add python api for random forest

2014-11-20 Thread meng
Repository: spark Updated Branches: refs/heads/master b8e6886fb -> 1c53a5db9 [SPARK-4439] [MLlib] add python api for random forest ``` class RandomForestModel | A model trained by RandomForest | | numTrees(self) | Get number of trees in forest. | |
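The snippet above is the pydoc of the new `RandomForestModel` wrapper (`numTrees` returns the forest size). For classification, a random forest predicts by majority vote over its trees; a hypothetical stand-alone sketch of that voting step, with each tree represented as a callable:

```python
from collections import Counter

def forest_predict(trees, x):
    """Majority vote over the per-tree predictions (classification case)."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]
```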

spark git commit: [SPARK-4513][SQL] Support relational operator '<=>' in Spark SQL

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 72f5ba1fc -> 8608ff598 [SPARK-4513][SQL] Support relational operator '<=>' in Spark SQL The relational operator '<=>' is not working in Spark SQL, while the same works in Spark HiveQL. Author: ravipesala ravindra.pes...@huawei.com Closes #3387 from

spark git commit: [SPARK-4513][SQL] Support relational operator '<=>' in Spark SQL

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 1c53a5db9 -> 98e941978 [SPARK-4513][SQL] Support relational operator '<=>' in Spark SQL The relational operator '<=>' is not working in Spark SQL, while the same works in Spark HiveQL. Author: ravipesala ravindra.pes...@huawei.com Closes #3387 from
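The operator in question is Hive's null-safe equality `<=>`: unlike `=`, it returns true when both sides are NULL and false (rather than NULL) when only one side is. A hypothetical sketch of those semantics, with `None` standing in for SQL NULL:

```python
def null_safe_eq(a, b):
    """Hive-style '<=>': never yields NULL/None as a result."""
    if a is None or b is None:
        return a is b  # True only when BOTH sides are None
    return a == b
```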

spark git commit: [SPARK-4318][SQL] Fix empty sum distinct.

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 98e941978 -> 2c2e7a44d [SPARK-4318][SQL] Fix empty sum distinct. Executing sum distinct on an empty table throws `java.lang.UnsupportedOperationException: empty.reduceLeft`. Author: Takuya UESHIN ues...@happy-camper.st Closes #3184 from
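The failure comes from reducing an empty sequence, whereas SQL's `SUM` over zero rows should yield NULL instead of throwing. A hypothetical sketch of the intended semantics, with `None` for NULL:

```python
def sum_distinct(values):
    """SUM(DISTINCT ...) semantics: None (NULL) over zero usable rows,
    otherwise the sum of the distinct non-null values."""
    distinct = {v for v in values if v is not None}
    if not distinct:
        return None  # reduceLeft on an empty sequence throws; SQL returns NULL
    return sum(distinct)
```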

spark git commit: [SPARK-4318][SQL] Fix empty sum distinct.

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 8608ff598 -> 1d7ee2b79 [SPARK-4318][SQL] Fix empty sum distinct. Executing sum distinct on an empty table throws `java.lang.UnsupportedOperationException: empty.reduceLeft`. Author: Takuya UESHIN ues...@happy-camper.st Closes #3184

spark git commit: [SPARK-2918] [SQL] Support the CTAS in EXPLAIN command

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 2c2e7a44d -> 6aa0fc9f4 [SPARK-2918] [SQL] Support the CTAS in EXPLAIN command Hive supports `EXPLAIN` on CTAS, which Spark SQL supported previously; however, it seems this was reverted after the code refactoring in HiveQL. Author:

spark git commit: [SPARK-2918] [SQL] Support the CTAS in EXPLAIN command

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 1d7ee2b79 -> 29e8d5077 [SPARK-2918] [SQL] Support the CTAS in EXPLAIN command Hive supports `EXPLAIN` on CTAS, which Spark SQL supported previously; however, it seems this was reverted after the code refactoring in HiveQL.

spark git commit: [SPARK-4477] [PySpark] remove numpy from RDDSampler

2014-11-20 Thread meng
Repository: spark Updated Branches: refs/heads/master ad5f1f3ca -> d39f2e9c6 [SPARK-4477] [PySpark] remove numpy from RDDSampler In RDDSampler, it tries to use numpy to gain better performance for poisson(), but the number of calls to random() is only (1+fraction) * N in the pure python
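Dropping numpy means the pure-Python path needs a Poisson sampler built on `random.random()`. A hypothetical sketch using Knuth's algorithm, which is fine for the small means typical of sampling fractions (its cost grows with the mean):

```python
import math
import random

def poisson(mean, rng=random):
    """Draw one Poisson-distributed integer via Knuth's algorithm.

    Multiplies uniform draws until the running product falls below
    exp(-mean); the number of draws minus one is the sample.
    """
    threshold = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1
```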

spark git commit: [SPARK-4477] [PySpark] remove numpy from RDDSampler

2014-11-20 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.2 69e28046b -> 5153aa041 [SPARK-4477] [PySpark] remove numpy from RDDSampler In RDDSampler, it tries to use numpy to gain better performance for poisson(), but the number of calls to random() is only (1+fraction) * N in the pure python

spark git commit: [SPARK-4244] [SQL] Support Hive Generic UDFs with constant object inspector parameters

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 5153aa041 -> 0f6a2eeaf [SPARK-4244] [SQL] Support Hive Generic UDFs with constant object inspector parameters The query `SELECT named_struct(lower(AA), 12, lower(Bb), 13) FROM src LIMIT 1` will throw an exception; some of the Hive Generic
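The failing query builds struct field names from foldable expressions like `lower(...)`, and some Hive generic UDFs require such arguments to arrive with constant object inspectors. The name/value-pair contract of `named_struct` itself can be sketched in plain Python (illustrative only, not Hive's implementation):

```python
def named_struct(*args):
    """Hive named_struct contract: alternating field-name, value arguments."""
    if len(args) % 2 != 0:
        raise ValueError("named_struct expects name/value pairs")
    names, values = args[::2], args[1::2]
    return dict(zip(names, values))
```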

spark git commit: [SPARK-4244] [SQL] Support Hive Generic UDFs with constant object inspector parameters

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master d39f2e9c6 -> 84d79ee9e [SPARK-4244] [SQL] Support Hive Generic UDFs with constant object inspector parameters The query `SELECT named_struct(lower(AA), 12, lower(Bb), 13) FROM src LIMIT 1` will throw an exception; some of the Hive Generic

spark git commit: [SPARK-4413][SQL] Parquet support through datasource API

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 84d79ee9e -> 02ec058ef [SPARK-4413][SQL] Parquet support through datasource API Goals: - Support for accessing parquet using SQL but not requiring Hive (thus allowing support of parquet tables with decimal columns) - Support for folder

spark git commit: [SPARK-4413][SQL] Parquet support through datasource API

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 0f6a2eeaf -> 64b30be7e [SPARK-4413][SQL] Parquet support through datasource API Goals: - Support for accessing parquet using SQL but not requiring Hive (thus allowing support of parquet tables with decimal columns) - Support for

spark git commit: add Sphinx as a dependency of building docs

2014-11-20 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.2 64b30be7e -> e445d3ce4 add Sphinx as a dependency of building docs Author: Davies Liu dav...@databricks.com Closes #3388 from davies/doc_readme and squashes the following commits: daa1482 [Davies Liu] add Sphinx dependency (cherry

spark git commit: add Sphinx as a dependency of building docs

2014-11-20 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 02ec058ef -> 8cd6eea62 add Sphinx as a dependency of building docs Author: Davies Liu dav...@databricks.com Closes #3388 from davies/doc_readme and squashes the following commits: daa1482 [Davies Liu] add Sphinx dependency Project:

spark git commit: [SPARK-4522][SQL] Parse schema with missing metadata.

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 8cd6eea62 -> 90a6a46bd [SPARK-4522][SQL] Parse schema with missing metadata. This is just a quick fix for 1.2. SPARK-4523 describes a more complete solution. Author: Michael Armbrust mich...@databricks.com Closes #3392 from

spark git commit: [SPARK-4522][SQL] Parse schema with missing metadata.

2014-11-20 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 e445d3ce4 -> 668643b8d [SPARK-4522][SQL] Parse schema with missing metadata. This is just a quick fix for 1.2. SPARK-4523 describes a more complete solution. Author: Michael Armbrust mich...@databricks.com Closes #3392 from