spark git commit: [SPARK-22257][SQL] Reserve all non-deterministic expressions in ExpressionSet

2017-10-12 Thread lixiao
Repository: spark
Updated Branches: refs/heads/master ec122209f -> 2f00a71a8
[SPARK-22257][SQL] Reserve all non-deterministic expressions in ExpressionSet
## What changes were proposed in this pull request?
For non-deterministic expressions, they should be considered as not contained in the

spark git commit: [SPARK-21165][SQL] FileFormatWriter should handle mismatched attribute ids between logical and physical plan

2017-10-12 Thread wenchen
Repository: spark
Updated Branches: refs/heads/master 3ff766f61 -> ec122209f
[SPARK-21165][SQL] FileFormatWriter should handle mismatched attribute ids between logical and physical plan
## What changes were proposed in this pull request?
Due to optimizer removing some unnecessary aliases,

spark git commit: [SPARK-22252][SQL][2.2] FileFormatWriter should respect the input query schema

2017-10-12 Thread lixiao
Repository: spark
Updated Branches: refs/heads/branch-2.2 cfc04e062 -> c9187db80
[SPARK-22252][SQL][2.2] FileFormatWriter should respect the input query schema
## What changes were proposed in this pull request?
https://github.com/apache/spark/pull/18386 fixes SPARK-21165 but breaks

spark git commit: [SPARK-22263][SQL] Refactor deterministic as lazy value

2017-10-12 Thread lixiao
Repository: spark
Updated Branches: refs/heads/master 9104add4c -> 3ff766f61
[SPARK-22263][SQL] Refactor deterministic as lazy value
## What changes were proposed in this pull request?
The method `deterministic` is frequently called in optimizer. Refactor `deterministic` as lazy value, in

spark git commit: [SPARK-22217][SQL] ParquetFileFormat to support arbitrary OutputCommitters

2017-10-12 Thread gurwls223
Repository: spark
Updated Branches: refs/heads/branch-2.2 cd51e2c32 -> cfc04e062
[SPARK-22217][SQL] ParquetFileFormat to support arbitrary OutputCommitters
## What changes were proposed in this pull request?
`ParquetFileFormat` to relax its requirement of output committer class from

spark git commit: [SPARK-22217][SQL] ParquetFileFormat to support arbitrary OutputCommitters

2017-10-12 Thread gurwls223
Repository: spark
Updated Branches: refs/heads/master 02218c4c7 -> 9104add4c
[SPARK-22217][SQL] ParquetFileFormat to support arbitrary OutputCommitters
## What changes were proposed in this pull request?
`ParquetFileFormat` to relax its requirement of output committer class from

spark git commit: [SPARK-22251][SQL] Metric 'aggregate time' is incorrect when codegen is off

2017-10-12 Thread hvanhovell
Repository: spark
Updated Branches: refs/heads/master 73d80ec49 -> 02218c4c7
[SPARK-22251][SQL] Metric 'aggregate time' is incorrect when codegen is off
## What changes were proposed in this pull request?
Adding the code for setting 'aggregate time' metric to non-codegen path in

spark git commit: [SPARK-21907][CORE][BACKPORT 2.2] oom during spill

2017-10-12 Thread hvanhovell
Repository: spark
Updated Branches: refs/heads/branch-2.2 c5889b59d -> cd51e2c32
[SPARK-21907][CORE][BACKPORT 2.2] oom during spill
back-port #19181 to branch-2.2.
## What changes were proposed in this pull request?
1. a test reproducing

spark git commit: [SPARK-22097][CORE] Request an accurate memory after we unrolled the block

2017-10-12 Thread wenchen
Repository: spark
Updated Branches: refs/heads/master 274f0efef -> b5c1ef7a8
[SPARK-22097][CORE] Request an accurate memory after we unrolled the block
## What changes were proposed in this pull request?
We only need to request `bbos.size - unrollMemoryUsedByThisBlock` after unrolling the

spark git commit: [SPARK-22252][SQL] FileFormatWriter should respect the input query schema

2017-10-12 Thread wenchen
Repository: spark
Updated Branches: refs/heads/master ccdf21f56 -> 274f0efef
[SPARK-22252][SQL] FileFormatWriter should respect the input query schema
## What changes were proposed in this pull request?
In https://github.com/apache/spark/pull/18064, we allowed `RunnableCommand` to have