spark git commit: [SPARK-20070][SQL] Fix 2.10 build

2017-03-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master f88f56b83 -> 0a6c50711 [SPARK-20070][SQL] Fix 2.10 build ## What changes were proposed in this pull request? Commit https://github.com/apache/spark/commit/91fa80fe8a2480d64c430bd10f97b3d44c007bcc broke the build for scala 2.10. The

spark git commit: [DOCS] Clarify round mode for format_number & round functions

2017-03-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master e011004be -> f88f56b83 [DOCS] Clarify round mode for format_number & round functions ## What changes were proposed in this pull request? Updated the `format_number` description to indicate that it uses `HALF_EVEN`
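
For readers unfamiliar with the rounding mode named above, here is a minimal sketch of what `HALF_EVEN` means, using Python's standard `decimal` module rather than Spark itself. The helper `round_with` is a hypothetical name introduced for illustration, and the contrast with `HALF_UP` is shown only to make the difference visible; it is not a statement about which Spark function uses which mode beyond what the snippet says.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_with(value: str, mode: str, places: int = 0) -> Decimal:
    """Round a decimal string to `places` fractional digits using `mode`.

    Hypothetical helper for illustration; not part of Spark.
    """
    # Build the target exponent: places=0 -> 1, places=2 -> 0.01
    quantum = Decimal(1).scaleb(-places)
    return Decimal(value).quantize(quantum, rounding=mode)


ties = ("0.5", "1.5", "2.5")

# HALF_EVEN sends an exact tie to the nearest even neighbour.
half_even = [round_with(v, ROUND_HALF_EVEN) for v in ties]

# HALF_UP sends an exact tie away from zero.
half_up = [round_with(v, ROUND_HALF_UP) for v in ties]

print(half_even)
print(half_up)
```

The two modes differ only on exact ties, which is why the inputs are passed as strings: a binary float such as `2.345` is usually not an exact tie once parsed.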

spark git commit: [SPARK-19846][SQL] Add a flag to disable constraint propagation

2017-03-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master b5c5bd98e -> e011004be [SPARK-19846][SQL] Add a flag to disable constraint propagation ## What changes were proposed in this pull request? Constraint propagation can be computationally expensive and can block driver execution for a long time.

spark git commit: Disable generate codegen since it fails my workload.

2017-03-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 91fa80fe8 -> b5c5bd98e Disable generate codegen since it fails my workload. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b5c5bd98 Tree:

spark git commit: [SPARK-20070][SQL] Redact DataSourceScanExec treeString

2017-03-24 Thread lixiao
Repository: spark Updated Branches: refs/heads/master e8810b73c -> 91fa80fe8 [SPARK-20070][SQL] Redact DataSourceScanExec treeString ## What changes were proposed in this pull request? The explain output of `DataSourceScanExec` can contain sensitive information (like Amazon keys). Such

spark git commit: [SPARK-17471][ML] Add compressed method to ML matrices

2017-03-24 Thread dbtsai
Repository: spark Updated Branches: refs/heads/master 707e50183 -> e8810b73c [SPARK-17471][ML] Add compressed method to ML matrices ## What changes were proposed in this pull request? This patch adds a `compressed` method to ML `Matrix` class, which returns the minimal storage

spark git commit: [SPARK-19911][STREAMING] Add builder interface for Kinesis DStreams

2017-03-24 Thread brkyvz
Repository: spark Updated Branches: refs/heads/master 9299d071f -> 707e50183 [SPARK-19911][STREAMING] Add builder interface for Kinesis DStreams ## What changes were proposed in this pull request? - Add new KinesisDStream.scala containing KinesisDStream.Builder class - Add

spark git commit: [SQL][MINOR] Fix for typo in Analyzer

2017-03-24 Thread lixiao
Repository: spark Updated Branches: refs/heads/master d9f4ce694 -> 9299d071f [SQL][MINOR] Fix for typo in Analyzer ## What changes were proposed in this pull request? Fix for typo in Analyzer ## How was this patch tested? local build Author: Jacek Laskowski Closes

spark git commit: [SPARK-15040][ML][PYSPARK] Add Imputer to PySpark

2017-03-24 Thread mlnick
Repository: spark Updated Branches: refs/heads/master 344f38b04 -> d9f4ce694 [SPARK-15040][ML][PYSPARK] Add Imputer to PySpark Add Python wrapper for `Imputer` feature transformer. ## How was this patch tested? New doc tests and tweak to PySpark ML `tests.py` Author: Nick Pentreath

spark git commit: [SPARK-19970][SQL][FOLLOW-UP] Table owner should be USER instead of PRINCIPAL in kerberized clusters #17311

2017-03-24 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 8e558041a -> 344f38b04 [SPARK-19970][SQL][FOLLOW-UP] Table owner should be USER instead of PRINCIPAL in kerberized clusters #17311 ### What changes were proposed in this pull request? This is a follow-up for the PR:

spark git commit: [SPARK-19820][CORE] Add interface to kill tasks w/ a reason

2017-03-24 Thread kayousterhout
Repository: spark Updated Branches: refs/heads/master 19596c28b -> 8e558041a [SPARK-19820][CORE] Add interface to kill tasks w/ a reason This commit adds a killTaskAttempt method to SparkContext, to allow users to kill tasks so that they can be re-scheduled elsewhere. This also refactors the

spark git commit: [SPARK-16929] Improve performance when check speculatable tasks.

2017-03-24 Thread kayousterhout
Repository: spark Updated Branches: refs/heads/master bb823ca4b -> 19596c28b [SPARK-16929] Improve performance when check speculatable tasks. ## What changes were proposed in this pull request? 1. Use a MedianHeap to record durations of successful tasks. When checking speculatable tasks, we
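
The snippet above mentions a MedianHeap for task durations. As a rough sketch of the general technique (two heaps that split the values around the median), and not the Spark implementation, something like the following could be used. The class and method names here are assumptions chosen for illustration, as is the convention of averaging the two middle values when the count is even.

```python
import heapq


class MedianHeap:
    """Running median over inserted values using two heaps (illustrative sketch)."""

    def __init__(self) -> None:
        self._lower = []  # max-heap of the smaller half, stored as negated values
        self._upper = []  # min-heap of the larger half

    def insert(self, x: float) -> None:
        # Route the value to the half it belongs to.
        if self._lower and x > -self._lower[0]:
            heapq.heappush(self._upper, x)
        else:
            heapq.heappush(self._lower, -x)
        # Rebalance so that len(lower) is len(upper) or len(upper) + 1.
        if len(self._lower) > len(self._upper) + 1:
            heapq.heappush(self._upper, -heapq.heappop(self._lower))
        elif len(self._upper) > len(self._lower):
            heapq.heappush(self._lower, -heapq.heappop(self._upper))

    def median(self) -> float:
        if not self._lower:
            raise ValueError("median of an empty MedianHeap")
        if len(self._lower) > len(self._upper):
            return float(-self._lower[0])
        # Even count: average the two middle values (an assumed convention).
        return (-self._lower[0] + self._upper[0]) / 2.0

    def __len__(self) -> int:
        return len(self._lower) + len(self._upper)


durations = MedianHeap()
for d in (5, 1, 3):
    durations.insert(d)
print(durations.median())
```

With this layout each insert costs O(log n) and reading the median is O(1), which is the usual motivation for preferring it over re-sorting the recorded values on every check.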