spark git commit: [SPARK-21617][SQL] Store correct table metadata when altering schema in Hive metastore.

2017-08-21 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ba843292e -> 84b5b16ea [SPARK-21617][SQL] Store correct table metadata when altering schema in Hive metastore. For Hive tables, the current "replace the schema" code is the correct path, except that an exception in that path should result

spark git commit: [SPARK-21617][SQL] Store correct table metadata when altering schema in Hive metastore.

2017-08-21 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 0f640e96c -> 526087f9e [SPARK-21617][SQL] Store correct table metadata when altering schema in Hive metastore. For Hive tables, the current "replace the schema" code is the correct path, except that an exception in that path should

spark git commit: [SPARK-19762][ML][FOLLOWUP] Add necessary comments to L2Regularization.

2017-08-21 Thread yliang
Repository: spark Updated Branches: refs/heads/master 84b5b16ea -> c108a5d30 [SPARK-19762][ML][FOLLOWUP] Add necessary comments to L2Regularization. ## What changes were proposed in this pull request? MLlib `LinearRegression/LogisticRegression/LinearSVC` always standardize the data

spark git commit: [SPARK-21070][PYSPARK] Attempt to update cloudpickle again

2017-08-21 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master c108a5d30 -> 751f51336 [SPARK-21070][PYSPARK] Attempt to update cloudpickle again ## What changes were proposed in this pull request? Based on https://github.com/apache/spark/pull/18282 by rgbkrk this PR attempts to update to the current

spark git commit: [SPARK-21468][PYSPARK][ML] Python API for FeatureHasher

2017-08-21 Thread mlnick
Repository: spark Updated Branches: refs/heads/master b3a07526f -> 988b84d7e [SPARK-21468][PYSPARK][ML] Python API for FeatureHasher Add Python API for `FeatureHasher` transformer. ## How was this patch tested? New doc test. Author: Nick Pentreath Closes #18970 from

spark git commit: [SPARK-21718][SQL] Heavy log of type: "Skipping partition based on stats ..."

2017-08-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 77d046ec4 -> b3a07526f [SPARK-21718][SQL] Heavy log of type: "Skipping partition based on stats ..." ## What changes were proposed in this pull request? Reduce 'Skipping partitions' message to debug ## How was this patch tested?

spark git commit: [SPARK-21782][CORE] Repartition creates skews when numPartitions is a power of 2

2017-08-21 Thread srowen
Repository: spark Updated Branches: refs/heads/master 28a6cca7d -> 77d046ec4 [SPARK-21782][CORE] Repartition creates skews when numPartitions is a power of 2 ## Problem When an RDD (particularly with a low item-per-partition ratio) is repartitioned to numPartitions = power of 2, the

spark git commit: [SPARK-21790][TESTS][FOLLOW-UP] Add filter pushdown verification back.

2017-08-21 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 988b84d7e -> ba843292e [SPARK-21790][TESTS][FOLLOW-UP] Add filter pushdown verification back. ## What changes were proposed in this pull request? The previous PR (https://github.com/apache/spark/pull/19000) removed filter pushdown