spark git commit: [SPARK-8711] [ML] Add additional methods to PySpark ML tree models

2015-07-07 Thread meng
Repository: spark Updated Branches: refs/heads/master 0a63d7ab8 -> 1dbc4a155 [SPARK-8711] [ML] Add additional methods to PySpark ML tree models Add numNodes and depth to treeModels, add treeWeights to ensemble Models. Add __repr__ to all models. Author: MechCoder
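To make the two new tree attributes concrete, here is a minimal sketch of what "number of nodes" and "depth" mean for a binary decision tree. It is not the PySpark implementation: the `Node` class and both helper functions are hypothetical, and the convention that a single leaf has depth 0 is an assumption borrowed from the MLlib documentation.

```python
class Node:
    """Hypothetical binary tree node; a node with no children is a leaf."""

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def num_nodes(node):
    """Count every node (internal and leaf) in the subtree rooted at node."""
    if node is None:
        return 0
    return 1 + num_nodes(node.left) + num_nodes(node.right)


def depth(node):
    """Length of the longest root-to-leaf path; a lone leaf has depth 0."""
    if node is None or (node.left is None and node.right is None):
        return 0
    return 1 + max(depth(node.left), depth(node.right))
```

For example, a root whose right child is itself split once has five nodes and depth 2.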

spark git commit: [SPARK-8821] [EC2] Switched to binary mode for file reading

2015-07-07 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 738c10748 -> 70beb808e [SPARK-8821] [EC2] Switched to binary mode for file reading Otherwise the script will crash with - Downloading boto... Traceback (most recent call last): File "ec2/spark_ec2.py", line 148, in <module>

spark git commit: [SPARK-8821] [EC2] Switched to binary mode for file reading

2015-07-07 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.4 bf8b47d17 -> 83a621a5a [SPARK-8821] [EC2] Switched to binary mode for file reading Otherwise the script will crash with - Downloading boto... Traceback (most recent call last): File "ec2/spark_ec2.py", line 148, in <module>
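The preview above is truncated, so the exact failing call is not shown. As a general illustration of why binary mode matters when reading a downloaded archive, here is a small sketch that hashes a file opened with `"rb"`. In Python 3, opening the same file in text mode would try to decode the bytes and can raise an error on non-text content. The helper name `md5_of_file` is hypothetical and is not taken from `spark_ec2.py`.

```python
import hashlib


def md5_of_file(path, chunk_size=8192):
    """Return the hex MD5 digest of a file, reading it as raw bytes."""
    digest = hashlib.md5()
    # "rb" yields bytes objects, so no text decoding is attempted.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in fixed-size chunks keeps memory use flat even for large archives.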

spark git commit: [SPARK-8559] [MLLIB] Support Association Rule Generation

2015-07-07 Thread meng
Repository: spark Updated Branches: refs/heads/master 70beb808e -> 3336c7b14 [SPARK-8559] [MLLIB] Support Association Rule Generation Distributed generation of single-consequent association rules from an RDD of frequent itemsets. Tests referenced against `R`'s implementation of A Priori in
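To show what single-consequent rule generation involves, here is a local, non-distributed sketch. It is not the MLlib code from this commit: the function name, the dictionary input format, and the `min_confidence` parameter are all assumptions made for illustration. For each frequent itemset, every item is tried as the consequent, and the confidence is the itemset frequency divided by the frequency of the remaining antecedent.

```python
def single_consequent_rules(freq_itemsets, min_confidence):
    """Generate (antecedent, consequent, confidence) rules.

    freq_itemsets maps frozenset -> frequency count (hypothetical format).
    """
    rules = []
    for itemset, freq in freq_itemsets.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent
        for consequent in itemset:
            antecedent = itemset - {consequent}
            antecedent_freq = freq_itemsets.get(antecedent)
            if not antecedent_freq:
                continue  # antecedent is not itself a frequent itemset
            confidence = freq / antecedent_freq
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, confidence))
    return rules
```

With counts `{a}: 4`, `{b}: 3`, `{a, b}: 2` and a threshold of 0.6, only the rule `{b} => a` (confidence 2/3) survives, because `{a} => b` has confidence 0.5.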

spark git commit: [SPARK-8823] [MLLIB] [PYSPARK] Optimizations for SparseVector dot products

2015-07-07 Thread meng
Repository: spark Updated Branches: refs/heads/master 1dbc4a155 -> 738c10748 [SPARK-8823] [MLLIB] [PYSPARK] Optimizations for SparseVector dot products Follow up for https://github.com/apache/spark/pull/5946. Currently we iterate over indices and values in SparseVector; this can be vectorized.
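As a rough illustration of the vectorization idea (not the code from this commit), the sketch below computes sparse dot products with NumPy array operations instead of a Python loop over indices and values. The function names are hypothetical, and the sparse-sparse version assumes each index array is sorted and free of duplicates.

```python
import numpy as np


def sparse_dense_dot(indices, values, dense):
    """Dot product of a sparse vector (indices, values) with a dense array."""
    indices = np.asarray(indices, dtype=np.int64)
    values = np.asarray(values, dtype=np.float64)
    dense = np.asarray(dense, dtype=np.float64)
    # Fancy indexing gathers only the dense entries at the nonzero positions.
    return float(np.dot(values, dense[indices]))


def sparse_sparse_dot(idx_a, val_a, idx_b, val_b):
    """Dot product of two sparse vectors with sorted, unique indices."""
    val_a = np.asarray(val_a, dtype=np.float64)
    val_b = np.asarray(val_b, dtype=np.float64)
    # Positions of the shared indices within each vector.
    _, pos_a, pos_b = np.intersect1d(
        idx_a, idx_b, assume_unique=True, return_indices=True
    )
    return float(np.dot(val_a[pos_a], val_b[pos_b]))
```

Only positions that are nonzero in both operands contribute, so the work scales with the number of stored entries rather than the vector size.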

spark git commit: [SPARK-6731] [CORE] Addendum: Upgrade Apache commons-math3 to 3.4.1

2015-07-07 Thread meng
Repository: spark Updated Branches: refs/heads/master 1cb2629f1 -> dcbd85b70 [SPARK-6731] [CORE] Addendum: Upgrade Apache commons-math3 to 3.4.1 (This finishes the job by removing the version overridden by Hadoop profiles.) See discussion at

spark git commit: [SPARK-8788] [ML] Add Java unit test for PCA transformer

2015-07-07 Thread meng
Repository: spark Updated Branches: refs/heads/master dcbd85b70 -> d73bc08d9 [SPARK-8788] [ML] Add Java unit test for PCA transformer Add Java unit test for PCA transformer Author: Yanbo Liang <yblia...@gmail.com> Closes #7184 from yanboliang/spark-8788 and squashes the following commits:

spark git commit: [SPARK-8570] [MLLIB] [DOCS] Improve MLlib Local Matrix Documentation.

2015-07-07 Thread meng
Repository: spark Updated Branches: refs/heads/master d73bc08d9 -> 0a63d7ab8 [SPARK-8570] [MLLIB] [DOCS] Improve MLlib Local Matrix Documentation. Updated MLlib Data Types Local Matrix section to include information on sparse matrices, added sparse matrix examples to the Scala and Java

spark git commit: [SPARK-8845] [ML] ML use of Breeze optimization: use adjustedValue instead of value

2015-07-07 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 35d781e71 -> 3bf20c27f [SPARK-8845] [ML] ML use of Breeze optimization: use adjustedValue instead of value In LinearRegression and LogisticRegression, we use Breeze's optimizers (LBFGS and OWLQN). We check the State.value to see the

[1/2] spark git commit: [SPARK-8876][SQL] Remove InternalRow type alias in expressions package.

2015-07-07 Thread rxin
Repository: spark Updated Branches: refs/heads/master da56c4e72 -> 770ff1025 http://git-wip-us.apache.org/repos/asf/spark/blob/770ff102/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformation.scala

spark git commit: [SPARK-8868] SqlSerializer2 can go into infinite loop when row consists only of NullType columns

2015-07-07 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.4 83a621a5a -> d3d5f2ab2 [SPARK-8868] SqlSerializer2 can go into infinite loop when row consists only of NullType columns https://issues.apache.org/jira/browse/SPARK-8868 Author: Yin Huai <yh...@databricks.com> Closes #7262 from

spark git commit: [SPARK-8878][SQL] Improve unit test coverage for bitwise expressions.

2015-07-07 Thread rxin
Repository: spark Updated Branches: refs/heads/master 68a4a1697 -> 5d603dfe4 [SPARK-8878][SQL] Improve unit test coverage for bitwise expressions. Author: Reynold Xin <r...@databricks.com> Closes #7273 from rxin/bitwise-unittest and squashes the following commits: 60c5667 [Reynold Xin]

spark git commit: [SPARK-7190] [SPARK-8804] [SPARK-7815] [SQL] unsafe UTF8String

2015-07-07 Thread rxin
Repository: spark Updated Branches: refs/heads/master 770ff1025 -> 4ca90935c [SPARK-7190] [SPARK-8804] [SPARK-7815] [SQL] unsafe UTF8String Let UTF8String work with binary buffer. Until we have a better idea of how to manage the lifecycle of UTF8String in Row, we still do the copy when calling

spark git commit: [SPARK-8868] SqlSerializer2 can go into infinite loop when row consists only of NullType columns

2015-07-07 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 4ca90935c -> 68a4a1697 [SPARK-8868] SqlSerializer2 can go into infinite loop when row consists only of NullType columns https://issues.apache.org/jira/browse/SPARK-8868 Author: Yin Huai <yh...@databricks.com> Closes #7262 from

[2/2] spark git commit: [SPARK-8876][SQL] Remove InternalRow type alias in expressions package.

2015-07-07 Thread rxin
[SPARK-8876][SQL] Remove InternalRow type alias in expressions package. The type alias was there because initially when I moved Row around, I didn't want to do massive changes to the expression code. But now it should be pretty easy to just remove it. One less concept to worry about. Author:

spark git commit: [SPARK-8879][SQL] Remove EmptyRow class.

2015-07-07 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5d603dfe4 -> 61c3cf793 [SPARK-8879][SQL] Remove EmptyRow class. As a baby step towards no megamorphic InternalRow. Author: Reynold Xin <r...@databricks.com> Closes #7277 from rxin/remove-empty-row and squashes the following commits:

spark git commit: [SPARK-8886][Documentation]python Style update

2015-07-07 Thread rxin
Repository: spark Updated Branches: refs/heads/master 61c3cf793 -> 08192a1b8 [SPARK-8886][Documentation]python Style update Fixed comment given by rxin Author: Tijo Thomas <tijopara...@gmail.com> Closes #7281 from tijoparacka/modification_for_python_style and squashes the following commits: