spark git commit: [SPARK-14617] Remove deprecated APIs in TaskMetrics

2016-04-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master dac40b68d -> a46f98d3f [SPARK-14617] Remove deprecated APIs in TaskMetrics ## What changes were proposed in this pull request? This patch removes some of the deprecated APIs in TaskMetrics. This is part of my bigger effort to simplify

spark git commit: [SPARK-14558][CORE] In ClosureCleaner, clean the outer pointer if it's a REPL line object

2016-04-14 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master a46f98d3f -> 1d04c86fc [SPARK-14558][CORE] In ClosureCleaner, clean the outer pointer if it's a REPL line object ## What changes were proposed in this pull request? When we clean a closure, if its outermost parent is not a closure, we
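A minimal spark-shell style sketch of the situation this targets (assumes `sc` is a live SparkContext; the names are illustrative, not from the patch): every REPL input line is compiled into a wrapper "line object", so a closure that only uses one value can still drag in the whole wrapper unless the cleaner nulls that outer pointer.

```scala
// Illustrative spark-shell session, not the ClosureCleaner internals themselves.
class Unserializable                 // compiled into the same REPL line object
val heavy = new Unserializable       // wrapper field that the closure never uses
val factor = 2                       // also a field on the line wrapper

// The lambda only uses `factor`, but it reaches it through the wrapper object, so
// without cleaning that outer pointer the task closure would also drag in `heavy`
// and fail to serialize.
val doubled = sc.parallelize(1 to 10).map(_ * factor)
doubled.collect()
```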

spark git commit: [SPARK-14499][SQL][TEST] Drop Partition Does Not Delete Data of External Tables

2016-04-14 Thread andrewor14
…dropping a partition of an external table will not delete data. cc yhuai andrewor14 How was this patch tested? N/A Author: gatorsmile <gatorsm...@gmail.com> This patch had conflicts when merged, resolved by Committer: Andrew Or <and...@databricks.com> Closes #12350 from gatorsmile/testDropPartition. Project: http://git-wi
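The behavior under test, as a hedged spark-shell sketch (assumes a Hive-enabled `spark` session; the table name and paths are made up for illustration): dropping a partition of an external table removes it from the metastore but leaves the files on disk.

```scala
// Assumes a Hive-enabled `spark` session; names and paths are made up.
spark.sql("CREATE EXTERNAL TABLE ext_tab (a INT) PARTITIONED BY (p INT) LOCATION '/tmp/ext_tab'")
spark.sql("ALTER TABLE ext_tab ADD PARTITION (p = 1) LOCATION '/tmp/ext_tab/p=1'")
spark.sql("ALTER TABLE ext_tab DROP PARTITION (p = 1)")
// The partition disappears from the metastore, but files under /tmp/ext_tab/p=1 remain.
```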

spark git commit: [Docs] Update spark-standalone.md to fix link

2016-09-26 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 7c7586aef -> 00be16df6 [Docs] Update spark-standalone.md to fix link Corrected a link to the configuration.html page; it was pointing to a page that does not exist (configurations.html). Documentation change, verified in preview.

spark git commit: [Docs] Update spark-standalone.md to fix link

2016-09-26 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 8a58f2e8e -> f4594900d [Docs] Update spark-standalone.md to fix link Corrected a link to the configuration.html page; it was pointing to a page that does not exist (configurations.html). Documentation change, verified in preview.

spark git commit: [SPARK-17715][SCHEDULER] Make task launch logs DEBUG

2016-09-29 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master cb87b3ced -> 027dea8f2 [SPARK-17715][SCHEDULER] Make task launch logs DEBUG ## What changes were proposed in this pull request? Ramp down the task launch logs from INFO to DEBUG. Task launches can happen orders of magnitude more than
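For readers who still want these messages after the demotion, a hedged sketch of opting back in at runtime (the logger name is an assumption here; point it at whichever class actually emits the launch log):

```scala
// A hedged sketch: raise the scheduler loggers back to DEBUG at runtime so the
// demoted task-launch messages show up again (the package name is an assumption;
// adjust it to the class that actually emits the message).
import org.apache.log4j.{Level, Logger}
Logger.getLogger("org.apache.spark.scheduler").setLevel(Level.DEBUG)
```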

spark git commit: [SPARK-17672] Spark 2.0 history server web UI takes too long for a single application

2016-09-29 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 f7839e47c -> 7c9450b00 [SPARK-17672] Spark 2.0 history server web UI takes too long for a single application Added a new API getApplicationInfo(appId: String) in class ApplicationHistoryProvider and class SparkUI to get app info. In
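A hedged sketch of the shape of that addition (the trait and return type here are simplified stand-ins, not the real history-server interfaces): the point is a lookup keyed by application id instead of scanning the full application list for every page load.

```scala
// Simplified stand-in types, not the actual Spark traits.
case class AppInfo(id: String, name: String)

trait HistoryProviderSketch {
  def getApplicationList(): Iterator[AppInfo]

  // The added entry point: a lookup keyed by appId. The default here just scans the
  // list for illustration; a concrete provider can answer it from its own index,
  // which is the whole point of the change.
  def getApplicationInfo(appId: String): Option[AppInfo] =
    getApplicationList().find(_.id == appId)
}
```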

spark git commit: [SPARK-17672] Spark 2.0 history server web UI takes too long for a single application

2016-09-29 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 7f779e743 -> cb87b3ced [SPARK-17672] Spark 2.0 history server web UI takes too long for a single application Added a new API getApplicationInfo(appId: String) in class ApplicationHistoryProvider and class SparkUI to get app info. In this

spark git commit: [SPARK-17648][CORE] TaskScheduler really needs offers to be an IndexedSeq

2016-09-29 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 958200497 -> 7f779e743 [SPARK-17648][CORE] TaskScheduler really needs offers to be an IndexedSeq ## What changes were proposed in this pull request? The Seq[WorkerOffer] is accessed by index, so it really should be an IndexedSeq,
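Why the element type matters, as a small self-contained sketch (stand-in data instead of `WorkerOffer`): `apply(i)` on a linked `Seq` such as `List` walks the list, while an `IndexedSeq` such as `Vector` gives effectively constant-time access, and the scheduler indexes into the offers repeatedly.

```scala
val asList: Seq[Int]           = List.tabulate(100000)(identity)   // apply(i) costs O(i)
val asIndexed: IndexedSeq[Int] = Vector.tabulate(100000)(identity) // apply(i) is effectively O(1)

// Both return 99999, but the first walks ~100k list nodes to get there,
// and the scheduler does this kind of positional access repeatedly per offer round.
val slow = asList(99999)
val fast = asIndexed(99999)
```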

spark git commit: [SPARK-16827] Stop reporting spill metrics as shuffle metrics

2016-10-07 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 2b01d3c70 -> e56614cba [SPARK-16827] Stop reporting spill metrics as shuffle metrics ## What changes were proposed in this pull request? Fix a bug where spill metrics were being reported as shuffle metrics. Eventually these spill metrics

spark git commit: [SPARK-17438][WEBUI] Show Application.executorLimit in the application page

2016-09-19 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master cdea1d134 -> 80d665592 [SPARK-17438][WEBUI] Show Application.executorLimit in the application page ## What changes were proposed in this pull request? This PR adds `Application.executorLimit` to the application page ## How was this patch

spark git commit: [SPARK-17438][WEBUI] Show Application.executorLimit in the application page

2016-09-19 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 f56035ba6 -> d6191a067 [SPARK-17438][WEBUI] Show Application.executorLimit in the application page ## What changes were proposed in this pull request? This PR adds `Application.executorLimit` to the application page ## How was this

spark git commit: [SPARK-17512][CORE] Avoid formatting to python path for yarn and mesos cluster mode

2016-09-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 9fcf1c51d -> 8c3ee2bc4 [SPARK-17512][CORE] Avoid formatting to python path for yarn and mesos cluster mode ## What changes were proposed in this pull request? Yarn and mesos cluster mode support remote python path (HDFS/S3 scheme) by

spark git commit: [SPARK-17512][CORE] Avoid formatting to python path for yarn and mesos cluster mode

2016-09-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 cd0bd89d7 -> 59e6ab11a [SPARK-17512][CORE] Avoid formatting to python path for yarn and mesos cluster mode ## What changes were proposed in this pull request? Yarn and mesos cluster mode support remote python path (HDFS/S3 scheme) by

spark git commit: [SPARK-17623][CORE] Clarify type of TaskEndReason with a failed task.

2016-09-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 2cd1bfa4f -> 9fcf1c51d [SPARK-17623][CORE] Clarify type of TaskEndReason with a failed task. ## What changes were proposed in this pull request? In TaskResultGetter, enqueueFailedTask currently deserializes the result as a TaskEndReason.
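A hedged sketch of the distinction being tightened (simplified stand-ins for the real hierarchy, not the actual Spark definitions): a failed task can only carry a failure reason, so deserializing straight to the narrower type documents that invariant in the signature.

```scala
// Simplified stand-ins, not the real org.apache.spark definitions.
sealed trait TaskEndReason
case object Success extends TaskEndReason
sealed trait TaskFailedReason extends TaskEndReason { def toErrorString: String }
case class ExceptionFailure(description: String) extends TaskFailedReason {
  def toErrorString: String = description
}

// Typing the failed-task path against the narrower TaskFailedReason means a Success
// value can never reach it; the compiler enforces what was previously a convention.
def handleFailedTask(reason: TaskFailedReason): Unit = println(reason.toErrorString)
```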

spark git commit: [SPARK-18361][PYSPARK] Expose RDD localCheckpoint in PySpark

2016-11-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 07beb5d21 -> 70176871a [SPARK-18361][PYSPARK] Expose RDD localCheckpoint in PySpark ## What changes were proposed in this pull request? Expose RDD's localCheckpoint() and associated functions in PySpark. ## How was this patch tested? I
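The existing Scala-side API being mirrored, as a hedged usage sketch (assumes `sc` in a spark-shell; the PySpark methods added by this change follow the same pattern):

```scala
// Assumes `sc` in a spark-shell; the new PySpark methods mirror this Scala API.
val rdd = sc.parallelize(1 to 1000).map(_ * 2)
rdd.localCheckpoint()                 // mark the RDD for checkpointing to executor-local storage
rdd.count()                           // an action materializes the checkpoint and truncates lineage
println(rdd.isLocallyCheckpointed)    // true: the RDD is marked for local checkpointing
```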

spark git commit: [SPARK-18361][PYSPARK] Expose RDD localCheckpoint in PySpark

2016-11-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.1 b0a73c9be -> 406f33987 [SPARK-18361][PYSPARK] Expose RDD localCheckpoint in PySpark ## What changes were proposed in this pull request? Expose RDD's localCheckpoint() and associated functions in PySpark. ## How was this patch tested?

spark git commit: [SPARK-18517][SQL] DROP TABLE IF EXISTS should not warn for non-existing tables

2016-11-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 70176871a -> ddd02f50b [SPARK-18517][SQL] DROP TABLE IF EXISTS should not warn for non-existing tables ## What changes were proposed in this pull request? Currently, `DROP TABLE IF EXISTS` shows a warning for non-existing tables. However,
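The behavior in question, as a minimal sketch (assumes a `spark` session; the table name is made up): with `IF EXISTS` the statement is expected to be a silent no-op when the table is absent, rather than emitting a warning.

```scala
// Assumes a `spark` session; the table name is deliberately one that does not exist.
spark.sql("DROP TABLE IF EXISTS table_that_was_never_created")
// Expected after this change: a silent no-op, with no warning logged.
// Without IF EXISTS, dropping a missing table still fails with an error.
```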

spark git commit: [SPARK-18517][SQL] DROP TABLE IF EXISTS should not warn for non-existing tables

2016-11-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.1 251a99276 -> b0a73c9be [SPARK-18517][SQL] DROP TABLE IF EXISTS should not warn for non-existing tables ## What changes were proposed in this pull request? Currently, `DROP TABLE IF EXISTS` shows a warning for non-existing tables.

spark git commit: [SPARK-18050][SQL] do not create default database if it already exists

2016-11-23 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 70ad07a9d -> f129ebcd3 [SPARK-18050][SQL] do not create default database if it already exists ## What changes were proposed in this pull request? When we try to create the default database, we ask hive to do nothing if it already exists.
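The general shape of the fix, as a hedged, self-contained sketch (the function and parameter names are stand-ins, not the patch's): probe for the database first, so the underlying Hive client is never asked to create something that already exists.

```scala
// Stand-in signatures; the real code lives in Spark's catalog bootstrap.
def ensureDefaultDatabase(exists: String => Boolean, create: String => Unit): Unit = {
  if (!exists("default")) {
    create("default")   // only reached on a genuinely fresh metastore
  }
}
```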

spark git commit: [SPARK-18050][SQL] do not create default database if it already exists

2016-11-23 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.1 599dac159 -> 835f03f34 [SPARK-18050][SQL] do not create default database if it already exists ## What changes were proposed in this pull request? When we try to create the default database, we ask hive to do nothing if it already

spark git commit: [SPARK-18507][SQL] HiveExternalCatalog.listPartitions should only call getTable once

2016-11-22 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 45ea46b7b -> 702cd403f [SPARK-18507][SQL] HiveExternalCatalog.listPartitions should only call getTable once ## What changes were proposed in this pull request? HiveExternalCatalog.listPartitions should only call `getTable` once, instead
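The shape of that change, as a hedged sketch with stand-in types: hoist the single `getTable` lookup out of the per-partition work instead of repeating it for every partition.

```scala
// Stand-ins, not the real HiveExternalCatalog types.
case class Table(db: String, name: String)

def listPartitionsSketch(partitionNames: Seq[String], getTable: () => Table): Seq[(Table, String)] = {
  val table = getTable()                // the single metastore lookup...
  partitionNames.map(p => (table, p))   // ...reused for every partition instead of repeated
}
```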

spark git commit: [SPARK-18507][SQL] HiveExternalCatalog.listPartitions should only call getTable once

2016-11-22 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.1 0e624e990 -> fa360134d [SPARK-18507][SQL] HiveExternalCatalog.listPartitions should only call getTable once ## What changes were proposed in this pull request? HiveExternalCatalog.listPartitions should only call `getTable` once,

spark git commit: [SPARK-17680][SQL][TEST] Added test cases for InMemoryRelation

2016-11-28 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 0f5f52a3d -> ad67993b7 [SPARK-17680][SQL][TEST] Added test cases for InMemoryRelation ## What changes were proposed in this pull request? This pull request adds test cases for the following cases: - keep all data types with null or
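For context, a minimal spark-shell sketch of the code path these tests exercise (assumes `spark`): caching a Dataset plans it over an InMemoryRelation, and an action materializes its columnar buffers.

```scala
// Assumes `spark` in a spark-shell.
val df = spark.range(0, 10).selectExpr("id", "CAST(null AS STRING) AS s")  // mixes longs and nulls
df.cache()    // the query is now planned over an InMemoryRelation
df.count()    // materializes the cached columnar buffers
```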

spark git commit: [SPARK-17680][SQL][TEST] Added test cases for InMemoryRelation

2016-11-28 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.1 81e3f9711 -> b386943b2 [SPARK-17680][SQL][TEST] Added test cases for InMemoryRelation ## What changes were proposed in this pull request? This pull request adds test cases for the following cases: - keep all data types with null or

spark git commit: [SPARK-17686][CORE] Support printing out scala and java version with spark-submit --version command

2016-10-13 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master db8784fea -> 7bf8a4049 [SPARK-17686][CORE] Support printing out scala and java version with spark-submit --version command ## What changes were proposed in this pull request? In our universal gateway service we need to specify different
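What the extra output boils down to, as a hedged sketch of where such values can come from (not the actual SparkSubmit code):

```scala
// Both values are plain JVM/Scala properties, which is all the extra output needs.
val scalaVersion = scala.util.Properties.versionString   // e.g. "version 2.11.8"
val javaVersion  = System.getProperty("java.version")    // e.g. "1.8.0_112"
println(s"Using Scala $scalaVersion, Java $javaVersion")
```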

spark git commit: [SPARK-17899][SQL] add a debug mode to keep raw table properties in HiveExternalCatalog

2016-10-13 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 6f2fa6c54 -> db8784fea [SPARK-17899][SQL] add a debug mode to keep raw table properties in HiveExternalCatalog ## What changes were proposed in this pull request? Currently `HiveExternalCatalog` will filter out the Spark SQL internal

spark git commit: [SPARK-11272][WEB UI] Add support for downloading event logs from HistoryServer UI

2016-10-13 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 7222a25a1 -> 6f2fa6c54 [SPARK-11272][WEB UI] Add support for downloading event logs from HistoryServer UI ## What changes were proposed in this pull request? This is a reworked PR based on feedback in #9238 after it was closed and not

spark git commit: [SPARK-18640] Add synchronization to TaskScheduler.runningTasksByExecutors

2016-11-30 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 8b33aa089 -> 1b1c849bf [SPARK-18640] Add synchronization to TaskScheduler.runningTasksByExecutors ## What changes were proposed in this pull request? The method `TaskSchedulerImpl.runningTasksByExecutors()` accesses the mutable
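The general pattern of the fix, as a simplified self-contained sketch (the field and method names echo the scheduler's, but the class is a stand-in): take the same lock the writers use and hand callers an immutable snapshot.

```scala
import scala.collection.mutable

class SchedulerStateSketch {
  // Mutated by scheduler threads as tasks start and finish.
  private val executorIdToRunningTaskIds = mutable.HashMap[String, mutable.HashSet[Long]]()

  // Reading under the same lock the writers hold yields a consistent,
  // immutable snapshot for callers such as the status API.
  def runningTasksByExecutors: Map[String, Int] = synchronized {
    executorIdToRunningTaskIds.toMap.mapValues(_.size).toMap
  }
}
```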

spark git commit: [SPARK-18640] Add synchronization to TaskScheduler.runningTasksByExecutors

2016-11-30 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.1 eae85da38 -> 7c0e2962d [SPARK-18640] Add synchronization to TaskScheduler.runningTasksByExecutors ## What changes were proposed in this pull request? The method `TaskSchedulerImpl.runningTasksByExecutors()` accesses the mutable

spark git commit: [SPARK][EXAMPLE] Added missing semicolon in quick-start-guide example

2016-11-30 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 1b1c849bf -> 5ecd3c23a [SPARK][EXAMPLE] Added missing semicolon in quick-start-guide example ## What changes were proposed in this pull request? Added a missing semicolon in the quick-start-guide Java example code, which wasn't compiling
