spark git commit: [SPARK-16699][SQL] Fix performance bug in hash aggregate on long string keys

2016-07-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master daace6014 -> 468a3c3ac [SPARK-16699][SQL] Fix performance bug in hash aggregate on long string keys In the following code in `VectorizedHashMapGenerator.scala`: ``` def hashBytes(b: String): String = { val hash = ctx.freshName("ha

spark git commit: [SPARK-16699][SQL] Fix performance bug in hash aggregate on long string keys

2016-07-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 198b0426e -> d226dce12 [SPARK-16699][SQL] Fix performance bug in hash aggregate on long string keys ## What changes were proposed in this pull request? In the following code in `VectorizedHashMapGenerator.scala`: ``` def hashBytes(

spark git commit: [SPARK-5581][CORE] When writing sorted map output file, avoid open / …

2016-07-24 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 1221ce040 -> daace6014 [SPARK-5581][CORE] When writing sorted map output file, avoid open / close between each partition ## What changes were proposed in this pull request? Replace commitAndClose with separate commit and close to a

spark git commit: [SPARK-16645][SQL] rename CatalogStorageFormat.serdeProperties to properties

2016-07-24 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 23e047f46 -> 1221ce040 [SPARK-16645][SQL] rename CatalogStorageFormat.serdeProperties to properties ## What changes were proposed in this pull request? We also store data source table options in this field, so it's unreasonable to call it `s

spark git commit: [SPARK-16416][CORE] force eager creation of loggers to avoid shutdown hook conflicts

2016-07-24 Thread srowen
Repository: spark Updated Branches: refs/heads/master 37bed97de -> 23e047f46 [SPARK-16416][CORE] force eager creation of loggers to avoid shutdown hook conflicts ## What changes were proposed in this pull request? Force eager creation of loggers to avoid shutdown hook conflicts. ## How was

spark git commit: [PYSPARK] add picklable SparseMatrix in pyspark.ml.common

2016-07-24 Thread yliang
Repository: spark Updated Branches: refs/heads/master cc1d2dcb6 -> 37bed97de [PYSPARK] add picklable SparseMatrix in pyspark.ml.common ## What changes were proposed in this pull request? Add a `SparseMatrix` class which supports pickling. ## How was this patch tested? Existing test. Author: We

spark git commit: [SPARK-16463][SQL] Support `truncate` option in Overwrite mode for JDBC DataFrameWriter

2016-07-24 Thread srowen
Repository: spark Updated Branches: refs/heads/master d6795c7a2 -> cc1d2dcb6 [SPARK-16463][SQL] Support `truncate` option in Overwrite mode for JDBC DataFrameWriter ## What changes were proposed in this pull request? This PR adds a boolean option, `truncate`, for `SaveMode.Overwrite` of JDBC

spark git commit: [SPARK-16515][SQL][FOLLOW-UP] Fix test `script` on OS X/Windows...

2016-07-24 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.0 31c3bcb46 -> 198b0426e [SPARK-16515][SQL][FOLLOW-UP] Fix test `script` on OS X/Windows... The current `sed` in `test_script.sh` is missing a `$`, leading to the failure of `script` test on OS X: ``` == Results == !== Correct Answer - 2

spark git commit: [SPARK-16515][SQL][FOLLOW-UP] Fix test `script` on OS X/Windows...

2016-07-24 Thread srowen
Repository: spark Updated Branches: refs/heads/master e3c7039b4 -> d6795c7a2 [SPARK-16515][SQL][FOLLOW-UP] Fix test `script` on OS X/Windows... ## Problem The current `sed` in `test_script.sh` is missing a `$`, leading to the failure of `script` test on OS X: ``` == Results == !== Correct An

spark git commit: [MINOR] Close old PRs that should be closed but have not been

2016-07-24 Thread srowen
Repository: spark Updated Branches: refs/heads/master 53b2456d1 -> e3c7039b4 [MINOR] Close old PRs that should be closed but have not been Closes #11598 Closes #7278 Closes #13882 Closes #12053 Closes #14125 Closes #8760 Closes #12848 Closes #14224 Author: Sean Owen Closes #14328 from srowe