spark git commit: [SPARK-11805] free the array in UnsafeExternalSorter during spilling

2015-11-24 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 3f40af574 -> 015569341 [SPARK-11805] free the array in UnsafeExternalSorter during spilling After calling spill() on SortedIterator, the array inside InMemorySorter is no longer needed, so it should be freed during spilling. This could help to

spark git commit: [SPARK-11805] free the array in UnsafeExternalSorter during spilling

2015-11-24 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master e6dd23746 -> 58d9b2605 [SPARK-11805] free the array in UnsafeExternalSorter during spilling After calling spill() on SortedIterator, the array inside InMemorySorter is no longer needed, so it should be freed during spilling. This could help to

spark git commit: [SPARK-11783][SQL] Fixes execution Hive client when using remote Hive metastore

2015-11-24 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 34ca392da -> c7f95df5c [SPARK-11783][SQL] Fixes execution Hive client when using remote Hive metastore When using remote Hive metastore, `hive.metastore.uris` is set to the metastore URI. However, it overrides

spark git commit: [SPARK-11783][SQL] Fixes execution Hive client when using remote Hive metastore

2015-11-24 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 015569341 -> 3f15ad783 [SPARK-11783][SQL] Fixes execution Hive client when using remote Hive metastore When using remote Hive metastore, `hive.metastore.uris` is set to the metastore URI. However, it overrides

spark git commit: [SPARK-11914][SQL] Support coalesce and repartition in Dataset APIs

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master c7f95df5c -> 238ae51b6 [SPARK-11914][SQL] Support coalesce and repartition in Dataset APIs This PR is to provide two common `coalesce` and `repartition` in Dataset APIs. After reading the comments of SPARK-, I am unclear about the

spark git commit: [SPARK-11914][SQL] Support coalesce and repartition in Dataset APIs

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 3f15ad783 -> 6d8c4c644 [SPARK-11914][SQL] Support coalesce and repartition in Dataset APIs This PR is to provide two common `coalesce` and `repartition` in Dataset APIs. After reading the comments of SPARK-, I am unclear about the

spark git commit: Added a line of comment to explain why the extra sort exists in pivot.

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 58d9b2605 -> 34ca392da Added a line of comment to explain why the extra sort exists in pivot. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/34ca392d Tree:

spark git commit: Added a line of comment to explain why the extra sort exists in pivot.

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 6d8c4c644 -> 36a99f93f Added a line of comment to explain why the extra sort exists in pivot. (cherry picked from commit 34ca392da7097a1fbe48cd6c3ebff51453ca26ca) Signed-off-by: Reynold Xin Project:

spark git commit: [SPARK-11946][SQL] Audit pivot API for 1.6.

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 0419fd361 -> 3f40af574 [SPARK-11946][SQL] Audit pivot API for 1.6. Currently pivot's signature looks like ```scala scala.annotation.varargs def pivot(pivotColumn: Column, values: Column*): GroupedData scala.annotation.varargs def

spark git commit: [SPARK-11946][SQL] Audit pivot API for 1.6.

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 81012546e -> f31527227 [SPARK-11946][SQL] Audit pivot API for 1.6. Currently pivot's signature looks like ```scala scala.annotation.varargs def pivot(pivotColumn: Column, values: Column*): GroupedData scala.annotation.varargs def

spark git commit: [SPARK-11929][CORE] Make the repl log4j configuration override the root logger.

2015-11-24 Thread irashid
Repository: spark Updated Branches: refs/heads/master f31527227 -> e6dd23746 [SPARK-11929][CORE] Make the repl log4j configuration override the root logger. In the default Spark distribution, there are currently two separate log4j config files, with different default values for the root

spark git commit: [SPARK-11872] Prevent the call to SparkContext#stop() in the listener bus's thread

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 19530da69 -> 81012546e [SPARK-11872] Prevent the call to SparkContext#stop() in the listener bus's thread This is continuation of SPARK-11761 Andrew suggested adding this protection. See tail of https://github.com/apache/spark/pull/9741

spark git commit: [SPARK-11967][SQL] Consistent use of varargs for multiple paths in DataFrameReader

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 36a99f93f -> 4464fa25c [SPARK-11967][SQL] Consistent use of varargs for multiple paths in DataFrameReader This patch makes it consistent to use varargs in all DataFrameReader methods, including Parquet, JSON, text, and the generic

spark git commit: [SPARK-11967][SQL] Consistent use of varargs for multiple paths in DataFrameReader

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 238ae51b6 -> 25bbd3c16 [SPARK-11967][SQL] Consistent use of varargs for multiple paths in DataFrameReader This patch makes it consistent to use varargs in all DataFrameReader methods, including Parquet, JSON, text, and the generic load

spark git commit: [SPARK-11947][SQL] Mark deprecated methods with "This will be removed in Spark 2.0."

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 4464fa25c -> 862d788fc [SPARK-11947][SQL] Mark deprecated methods with "This will be removed in Spark 2.0." Also fixed some documentation as I saw them. Author: Reynold Xin Closes #9930 from rxin/SPARK-11947.

spark git commit: [SPARK-11947][SQL] Mark deprecated methods with "This will be removed in Spark 2.0."

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 25bbd3c16 -> 4d6bbbc03 [SPARK-11947][SQL] Mark deprecated methods with "This will be removed in Spark 2.0." Also fixed some documentation as I saw them. Author: Reynold Xin Closes #9930 from rxin/SPARK-11947.

spark git commit: [STREAMING][FLAKY-TEST] Catch execution context race condition in `FileBasedWriteAheadLog.close()`

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 4d6bbbc03 -> a5d988763 [STREAMING][FLAKY-TEST] Catch execution context race condition in `FileBasedWriteAheadLog.close()` There is a race condition in `FileBasedWriteAheadLog.close()`, where if deletes of old log files are in progress,

spark git commit: [STREAMING][FLAKY-TEST] Catch execution context race condition in `FileBasedWriteAheadLog.close()`

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 862d788fc -> b18112666 [STREAMING][FLAKY-TEST] Catch execution context race condition in `FileBasedWriteAheadLog.close()` There is a race condition in `FileBasedWriteAheadLog.close()`, where if deletes of old log files are in

spark git commit: [SPARK-10621][SQL] Consistent naming for functions in SQL, Python, Scala

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 b18112666 -> 486db8789 [SPARK-10621][SQL] Consistent naming for functions in SQL, Python, Scala Author: Reynold Xin Closes #9948 from rxin/SPARK-10621. (cherry picked from commit

spark git commit: [SPARK-10621][SQL] Consistent naming for functions in SQL, Python, Scala

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master a5d988763 -> 151d7c2ba [SPARK-10621][SQL] Consistent naming for functions in SQL, Python, Scala Author: Reynold Xin Closes #9948 from rxin/SPARK-10621. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: [SPARK-11140][CORE] Transfer files using network lib when using NettyRpcEnv - 1.6.version.

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 486db8789 -> 68bcb9b33 [SPARK-11140][CORE] Transfer files using network lib when using NettyRpcEnv - 1.6.version. This patch is the same code as in SPARK-11140 in master, but with some added code to still use the HTTP file server by

spark git commit: [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 151d7c2ba -> 216988688 [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file This solves the following exception caused when empty state RDD is checkpointed and recovered. The root cause

spark git commit: [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 68bcb9b33 -> 7f030aa42 [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file This solves the following exception caused when empty state RDD is checkpointed and recovered. The root

spark git commit: [SPARK-11818][REPL] Fix ExecutorClassLoader to lookup resources from …

2015-11-24 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 5889880fb -> be9dd1550 [SPARK-11818][REPL] Fix ExecutorClassLoader to lookup resources from … …parent class loader Without this patch, two additional tests of ExecutorClassLoaderSuite fail. - "resource from parent" - "resources from

spark git commit: [SPARK-11942][SQL] fix encoder life cycle for CoGroup

2015-11-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master be9dd1550 -> e5aaae6e1 [SPARK-11942][SQL] fix encoder life cycle for CoGroup We should pass in resolved encoders to logical `CoGroup` and bind them in physical `CoGroup`. Author: Wenchen Fan Closes #9928 from

spark git commit: [SPARK-11942][SQL] fix encoder life cycle for CoGroup

2015-11-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 895128505 -> 3cb1b6d39 [SPARK-11942][SQL] fix encoder life cycle for CoGroup We should pass in resolved encoders to logical `CoGroup` and bind them in physical `CoGroup`. Author: Wenchen Fan Closes #9928 from

spark git commit: [SPARK-11952][ML] Remove duplicate ml examples

2015-11-24 Thread meng
Repository: spark Updated Branches: refs/heads/master e5aaae6e1 -> 56a0aba0a [SPARK-11952][ML] Remove duplicate ml examples Remove duplicate ml examples (only for ml). mengxr Author: Yanbo Liang Closes #9933 from yanboliang/SPARK-11685. Project:

spark git commit: [SPARK-11952][ML] Remove duplicate ml examples

2015-11-24 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 3cb1b6d39 -> 6914b7504 [SPARK-11952][ML] Remove duplicate ml examples Remove duplicate ml examples (only for ml). mengxr Author: Yanbo Liang Closes #9933 from yanboliang/SPARK-11685. (cherry picked from commit

spark git commit: [SPARK-11521][ML][DOC] Document that Logistic, Linear Regression summaries ignore weight col

2015-11-24 Thread meng
Repository: spark Updated Branches: refs/heads/master 56a0aba0a -> 9e24ba667 [SPARK-11521][ML][DOC] Document that Logistic, Linear Regression summaries ignore weight col Doc for 1.6 that the summaries mostly ignore the weight column. To be corrected for 1.7 CC: mengxr thunterdb Author:

spark git commit: [SPARK-11521][ML][DOC] Document that Logistic, Linear Regression summaries ignore weight col

2015-11-24 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 6914b7504 -> 70febe224 [SPARK-11521][ML][DOC] Document that Logistic, Linear Regression summaries ignore weight col Doc for 1.6 that the summaries mostly ignore the weight column. To be corrected for 1.7 CC: mengxr thunterdb Author:

spark git commit: [SPARK-11847][ML] Model export/import for spark.ml: LDA

2015-11-24 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 70febe224 -> af86c38db [SPARK-11847][ML] Model export/import for spark.ml: LDA Add read/write support to LDA, similar to ALS. save/load for ml.LocalLDAModel is done. For DistributedLDAModel, I'm not sure if we can invoke save on the

spark git commit: [SPARK-11847][ML] Model export/import for spark.ml: LDA

2015-11-24 Thread meng
Repository: spark Updated Branches: refs/heads/master 9e24ba667 -> 52bc25c8e [SPARK-11847][ML] Model export/import for spark.ml: LDA Add read/write support to LDA, similar to ALS. save/load for ml.LocalLDAModel is done. For DistributedLDAModel, I'm not sure if we can invoke save on the

spark git commit: [SPARK-11926][SQL] unify GetStructField and GetInternalRowField

2015-11-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 52bc25c8e -> 19530da69 [SPARK-11926][SQL] unify GetStructField and GetInternalRowField Author: Wenchen Fan Closes #9909 from cloud-fan/get-struct. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: [SPARK-11926][SQL] unify GetStructField and GetInternalRowField

2015-11-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 af86c38db -> 927070d6d [SPARK-11926][SQL] unify GetStructField and GetInternalRowField Author: Wenchen Fan Closes #9909 from cloud-fan/get-struct. (cherry picked from commit

spark git commit: [SPARK-11897][SQL] Add @scala.annotations.varargs to sql functions

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4021a28ac -> 12eea834d [SPARK-11897][SQL] Add @scala.annotations.varargs to sql functions Author: Xiu Guo Closes #9918 from xguo27/SPARK-11897. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11897][SQL] Add @scala.annotations.varargs to sql functions

2015-11-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 21e63929c -> e1b0a2376 [SPARK-11897][SQL] Add @scala.annotations.varargs to sql functions Author: Xiu Guo Closes #9918 from xguo27/SPARK-11897. (cherry picked from commit 12eea834d7382fbaa9c92182b682b8724049d7c1)

spark git commit: [SPARK-11906][WEB UI] Speculation Tasks Cause ProgressBar UI Overflow

2015-11-24 Thread srowen
Repository: spark Updated Branches: refs/heads/master 12eea834d -> 800bd799a [SPARK-11906][WEB UI] Speculation Tasks Cause ProgressBar UI Overflow When there are speculative tasks in the stage, the running progress bar can overflow and end up hidden on a new line:

spark git commit: [SPARK-11906][WEB UI] Speculation Tasks Cause ProgressBar UI Overflow

2015-11-24 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 e1b0a2376 -> 17ea95133 [SPARK-11906][WEB UI] Speculation Tasks Cause ProgressBar UI Overflow When there are speculative tasks in the stage, the running progress bar can overflow and end up hidden on a new line:

spark git commit: [SPARK-11043][SQL] BugFix:Set the operator log in the thrift server.

2015-11-24 Thread lian
Repository: spark Updated Branches: refs/heads/master 800bd799a -> d4a5e6f71 [SPARK-11043][SQL] BugFix:Set the operator log in the thrift server. In Hive 1.2.1, `SessionManager` sets the `operationLog` if the configuration `hive.server2.logging.operation.enabled` is true. But

spark git commit: [SPARK-11043][SQL] BugFix:Set the operator log in the thrift server.

2015-11-24 Thread lian
Repository: spark Updated Branches: refs/heads/branch-1.6 17ea95133 -> f1f2cee4c [SPARK-11043][SQL] BugFix:Set the operator log in the thrift server. In Hive 1.2.1, `SessionManager` sets the `operationLog` if the configuration `hive.server2.logging.operation.enabled` is true.

spark git commit: [SPARK-11592][SQL] flush spark-sql command line history to history file

2015-11-24 Thread lian
Repository: spark Updated Branches: refs/heads/master d4a5e6f71 -> 5889880fb [SPARK-11592][SQL] flush spark-sql command line history to history file Currently, `spark-sql` would not flush command history when exiting. Author: Daoyuan Wang Closes #9563 from