spark git commit: [SPARK-14554][SQL] disable whole stage codegen if there are too many input columns

2016-04-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 2d81ba542 -> 52a801124 [SPARK-14554][SQL] disable whole stage codegen if there are too many input columns ## What changes were proposed in this pull request? In

spark git commit: [SPARK-14362][SPARK-14406][SQL][FOLLOW-UP] DDL Native Support: Drop View and Drop Table

2016-04-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 83fb96403 -> 2d81ba542 [SPARK-14362][SPARK-14406][SQL][FOLLOW-UP] DDL Native Support: Drop View and Drop Table What changes were proposed in this pull request? In this PR, we are trying to address the comment in the original PR:

spark git commit: [SPARK-14132][SPARK-14133][SQL] Alter table partition DDLs

2016-04-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/master e9e1adc03 -> 83fb96403 [SPARK-14132][SPARK-14133][SQL] Alter table partition DDLs ## What changes were proposed in this pull request? This implements a few alter table partition commands using the `SessionCatalog`. In particular:

spark git commit: [MINOR][ML] Fixed MLlib build warnings

2016-04-11 Thread srowen
Repository: spark Updated Branches: refs/heads/master 26d7af911 -> e9e1adc03 [MINOR][ML] Fixed MLlib build warnings ## What changes were proposed in this pull request? Fixes to eliminate warnings during package and doc builds. ## How was this patch tested? Existing unit tests Author:

spark git commit: [SPARK-14242][CORE][NETWORK] avoid copy in compositeBuffer for frame decoder

2016-04-11 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 05dbc2846 -> 663a492f0 [SPARK-14242][CORE][NETWORK] avoid copy in compositeBuffer for frame decoder ## What changes were proposed in this pull request? In this patch, we set the initial `maxNumComponents` to `Integer.MAX_VALUE`

spark git commit: [SPARK-14520][SQL] Use correct return type in VectorizedParquetInputFormat

2016-04-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6f27027d9 -> 26d7af911 [SPARK-14520][SQL] Use correct return type in VectorizedParquetInputFormat ## What changes were proposed in this pull request? JIRA: https://issues.apache.org/jira/browse/SPARK-14520 `VectorizedParquetInputFormat`

spark git commit: [SPARK-14475] Propagate user-defined context from driver to executors

2016-04-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master 94de63053 -> 6f27027d9 [SPARK-14475] Propagate user-defined context from driver to executors ## What changes were proposed in this pull request? This adds a new API call `TaskContext.getLocalProperty` for getting properties set in the

spark git commit: [SPARK-10521][SQL] Utilize Docker for test DB2 JDBC Dialect support

2016-04-11 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 3f0f40800 -> 94de63053 [SPARK-10521][SQL] Utilize Docker for test DB2 JDBC Dialect support Add Docker-based integration tests for DB2 JDBC dialect support. Author: Luciano Resende Closes #9893 from

spark git commit: [SPARK-14298][ML][MLLIB] LDA should support disable checkpoint

2016-04-11 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.5 1e61ff4ca -> cb7a90ad5 [SPARK-14298][ML][MLLIB] LDA should support disable checkpoint ## What changes were proposed in this pull request? In the doc of

spark git commit: [SPARK-14298][ML][MLLIB] LDA should support disable checkpoint

2016-04-11 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.6 f4110cd3b -> 05dbc2846 [SPARK-14298][ML][MLLIB] LDA should support disable checkpoint ## What changes were proposed in this pull request? In the doc of

spark git commit: [BUILD][HOTFIX] Download Maven from regular mirror network rather than archive.apache.org

2016-04-11 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 c12db0d33 -> f4110cd3b [BUILD][HOTFIX] Download Maven from regular mirror network rather than archive.apache.org [archive.apache.org](https://archive.apache.org/) is undergoing maintenance, breaking our `build/mvn` script: > We are

spark git commit: [SPARK-14298][ML][MLLIB] Add unit test for EM LDA disable checkpointing

2016-04-11 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 89a41c5b7 -> 3f0f40800 [SPARK-14298][ML][MLLIB] Add unit test for EM LDA disable checkpointing ## What changes were proposed in this pull request? This is a follow-up to #12089; it adds a unit test for EM LDA that tests disabling checkpointing

spark git commit: [SPARK-13600][MLLIB] Use approxQuantile from DataFrame stats in QuantileDiscretizer

2016-04-11 Thread meng
Repository: spark Updated Branches: refs/heads/master 2dacc81ec -> 89a41c5b7 [SPARK-13600][MLLIB] Use approxQuantile from DataFrame stats in QuantileDiscretizer ## What changes were proposed in this pull request? QuantileDiscretizer can return an unexpected number of buckets in certain

spark git commit: [SPARK-14494][SQL] Fix the race conditions in MemoryStream and MemorySink

2016-04-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 5de26194a -> 2dacc81ec [SPARK-14494][SQL] Fix the race conditions in MemoryStream and MemorySink ## What changes were proposed in this pull request? Make sure accesses to mutable variables in MemoryStream and MemorySink are protected by

spark git commit: [SPARK-14454] [1.6] Better exception handling while marking tasks as failed

2016-04-11 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 baf29854e -> c12db0d33 [SPARK-14454] [1.6] Better exception handling while marking tasks as failed Backports https://github.com/apache/spark/pull/12234 to 1.6. Original description below: ## What changes were proposed in this pull

spark git commit: [SPARK-14290] [SPARK-13352] [CORE] [BACKPORT-1.6] avoid significant memory copy in Netty's tran…

2016-04-11 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 7a02c446f -> baf29854e [SPARK-14290] [SPARK-13352] [CORE] [BACKPORT-1.6] avoid significant memory copy in Netty's tran… ## What changes were proposed in this pull request? When Netty transfers data that is not a `FileRegion`, data will

spark git commit: [SPARK-14502] [SQL] Add optimization for Binary Comparison Simplification

2016-04-11 Thread davies
Repository: spark Updated Branches: refs/heads/master 652c47030 -> 5de26194a [SPARK-14502] [SQL] Add optimization for Binary Comparison Simplification ## What changes were proposed in this pull request? We can simplify binary comparisons with semantically-equal operands: 1. Replace '<=>'

spark git commit: [SPARK-14528] [SQL] Fix same result of Union

2016-04-11 Thread davies
Repository: spark Updated Branches: refs/heads/master efaf7d182 -> 652c47030 [SPARK-14528] [SQL] Fix same result of Union ## What changes were proposed in this pull request? This PR fixes sameResult() for Union. ## How was this patch tested? Added regression test. Author: Davies Liu

spark git commit: [SPARK-14510][MLLIB] Add args-checking for LDA and StreamingKMeans

2016-04-11 Thread meng
Repository: spark Updated Branches: refs/heads/master 1c751fcf4 -> 643b4e225 [SPARK-14510][MLLIB] Add args-checking for LDA and StreamingKMeans ## What changes were proposed in this pull request? Add argument checking for LDA and StreamingKMeans. ## How was this patch tested? Manual tests. Author:

[2/2] spark git commit: [SPARK-14500] [ML] Accept Dataset[_] instead of DataFrame in MLlib APIs

2016-04-11 Thread meng
[SPARK-14500] [ML] Accept Dataset[_] instead of DataFrame in MLlib APIs ## What changes were proposed in this pull request? This PR updates MLlib APIs to accept `Dataset[_]` as input where `DataFrame` was the input type. This PR doesn't change the output type. In Java, `Dataset[_]` maps to

[1/2] spark git commit: [SPARK-14500] [ML] Accept Dataset[_] instead of DataFrame in MLlib APIs

2016-04-11 Thread meng
Repository: spark Updated Branches: refs/heads/master e82d95bf6 -> 1c751fcf4 http://git-wip-us.apache.org/repos/asf/spark/blob/1c751fcf/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala -- diff --git