spark git commit: [DOC] bucketing is applicable to all file-based data sources

2016-12-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7c5b7b3a2 -> 2e861df96 [DOC] bucketing is applicable to all file-based data sources ## What changes were proposed in this pull request? Starting with Spark 2.1.0, the bucketing feature is available for all file-based data sources. This patch fixes

spark git commit: [DOC] bucketing is applicable to all file-based data sources

2016-12-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 def3690f6 -> ec0d6e21e [DOC] bucketing is applicable to all file-based data sources ## What changes were proposed in this pull request? Starting with Spark 2.1.0, the bucketing feature is available for all file-based data sources. This patch fi
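The core idea behind bucketing can be sketched in a few lines: rows are assigned to a fixed number of buckets by hashing the bucketing column, so equal keys always land in the same bucket (and thus the same file). This is a minimal, hypothetical illustration of the concept, not Spark's writer implementation; `bucket_rows` and its dict-of-lists output are assumptions for the sketch.

```python
def bucket_id(key, num_buckets):
    """Deterministically map a key to one of num_buckets buckets."""
    return hash(key) % num_buckets

def bucket_rows(rows, column, num_buckets):
    """Group rows (dicts) into buckets keyed by bucket id."""
    buckets = {}
    for row in rows:
        b = bucket_id(row[column], num_buckets)
        buckets.setdefault(b, []).append(row)
    return buckets
```

Because the assignment is deterministic, two bucketed tables with the same bucket count can be joined bucket-by-bucket without a shuffle, which is what makes the feature worth exposing across all file-based sources.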

spark git commit: [SQL] Minor readability improvement for partition handling code

2016-12-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 07e2a17d1 -> def3690f6 [SQL] Minor readability improvement for partition handling code This patch includes minor changes to improve readability for partition handling code. I'm in the middle of implementing some new feature and found s

spark git commit: [SQL] Minor readability improvement for partition handling code

2016-12-21 Thread wenchen
Repository: spark Updated Branches: refs/heads/master ff7d82a20 -> 7c5b7b3a2 [SQL] Minor readability improvement for partition handling code ## What changes were proposed in this pull request? This patch includes minor changes to improve readability for partition handling code. I'm in the mid

spark git commit: [SPARK-18908][SS] Creating StreamingQueryException should check if logicalPlan is created

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 9a3c5bd70 -> 07e2a17d1 [SPARK-18908][SS] Creating StreamingQueryException should check if logicalPlan is created ## What changes were proposed in this pull request? This PR audits places using `logicalPlan` in StreamExecution and ensu

spark git commit: [SPARK-18908][SS] Creating StreamingQueryException should check if logicalPlan is created

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master e1b43dc45 -> ff7d82a20 [SPARK-18908][SS] Creating StreamingQueryException should check if logicalPlan is created ## What changes were proposed in this pull request? This PR audits places using `logicalPlan` in StreamExecution and ensures

spark git commit: [BUILD] make-distribution should find JAVA_HOME for non-RHEL systems

2016-12-21 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master afe36516e -> e1b43dc45 [BUILD] make-distribution should find JAVA_HOME for non-RHEL systems ## What changes were proposed in this pull request? make-distribution.sh should find JAVA_HOME for Ubuntu, Mac and other non-RHEL systems ## How
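The portable approach described above can be sketched as follows: when JAVA_HOME is unset, resolve it from the `java` binary on PATH instead of relying on RHEL-specific install locations. This is a hedged sketch of the general technique; `find_java_home` is an illustrative name, and make-distribution.sh's actual logic may differ.

```shell
find_java_home() {
  if [ -n "$JAVA_HOME" ]; then
    # Respect an explicitly configured JAVA_HOME.
    echo "$JAVA_HOME"
  elif command -v java >/dev/null 2>&1; then
    # Follow symlinks back to the real binary, then step up two
    # directories (bin/java -> JDK root).
    java_bin="$(command -v java)"
    while [ -h "$java_bin" ]; do java_bin="$(readlink "$java_bin")"; done
    dirname "$(dirname "$java_bin")"
  fi
}
```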

spark git commit: [FLAKY-TEST] InputStreamsSuite.socket input stream

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 021952d58 -> 9a3c5bd70 [FLAKY-TEST] InputStreamsSuite.socket input stream ## What changes were proposed in this pull request? https://spark-tests.appspot.com/test-details?suite_name=org.apache.spark.streaming.InputStreamsSuite&test_nam

spark git commit: [FLAKY-TEST] InputStreamsSuite.socket input stream

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master 7e8994ffd -> afe36516e [FLAKY-TEST] InputStreamsSuite.socket input stream ## What changes were proposed in this pull request? https://spark-tests.appspot.com/test-details?suite_name=org.apache.spark.streaming.InputStreamsSuite&test_name=so

spark git commit: [SPARK-18903][SPARKR] Add API to get SparkUI URL

2016-12-21 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master b41ec9977 -> 7e8994ffd [SPARK-18903][SPARKR] Add API to get SparkUI URL ## What changes were proposed in this pull request? API for SparkUI URL from SparkContext ## How was this patch tested? manual, unit tests Author: Felix Cheung Cl

spark git commit: [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer

2016-12-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.0 53cd99f65 -> 080ac37fb [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer ## What changes were proposed in this pull request? This PR is to fix a `NullPointerException` issue caused by the following `limit + ag

spark git commit: [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer

2016-12-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.1 60e02a173 -> 021952d58 [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer ## What changes were proposed in this pull request? This PR is to fix a `NullPointerException` issue caused by the following `limit + ag

spark git commit: [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer

2016-12-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 83a6ace0d -> b41ec9977 [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer ## What changes were proposed in this pull request? This PR is to fix a `NullPointerException` issue caused by the following `limit + aggreg

spark git commit: [SPARK-18234][SS] Made update mode public

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 17ef57fe8 -> 60e02a173 [SPARK-18234][SS] Made update mode public ## What changes were proposed in this pull request? Made update mode public. As part of that, here are the changes: - Update DataStreamWriter to accept "update" - Changed

spark git commit: [SPARK-18234][SS] Made update mode public

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master afd9bc1d8 -> 83a6ace0d [SPARK-18234][SS] Made update mode public ## What changes were proposed in this pull request? Made update mode public. As part of that, here are the changes: - Update DataStreamWriter to accept "update" - Changed pack

spark git commit: [SPARK-17807][CORE] split test-tags into test-JAR

2016-12-21 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 95efc895e -> afd9bc1d8 [SPARK-17807][CORE] split test-tags into test-JAR Remove spark-tag's compile-scope dependency (and, indirectly, spark-core's compile-scope transitive-dependency) on scalatest by splitting test-oriented tags into spa

spark git commit: [SPARK-18588][SS][KAFKA] Create a new KafkaConsumer when error happens to fix the flaky test

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 0e51bb085 -> 17ef57fe8 [SPARK-18588][SS][KAFKA] Create a new KafkaConsumer when error happens to fix the flaky test ## What changes were proposed in this pull request? When KafkaSource fails on Kafka errors, we should create a new con

spark git commit: [SPARK-18588][SS][KAFKA] Create a new KafkaConsumer when error happens to fix the flaky test

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master 354e93618 -> 95efc895e [SPARK-18588][SS][KAFKA] Create a new KafkaConsumer when error happens to fix the flaky test ## What changes were proposed in this pull request? When KafkaSource fails on Kafka errors, we should create a new consume
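The recovery pattern described above — discard a client that hit an error and retry with a freshly created one, rather than reusing the possibly poisoned instance — can be sketched generically. `make_client`, `fetch`, and `FetchError` are illustrative names for this sketch, not Spark or Kafka APIs.

```python
class FetchError(Exception):
    """Stand-in for a transient client error."""

def fetch_with_recreate(make_client, offset, max_attempts=2):
    """Fetch at `offset`; on error, recreate the client and retry."""
    client = make_client()
    for attempt in range(max_attempts):
        try:
            return client.fetch(offset)
        except FetchError:
            if attempt == max_attempts - 1:
                raise
            client = make_client()  # fresh client for the retry
```

Recreating on failure trades a little setup cost for not inheriting whatever broken internal state caused the first error, which is what makes the flaky test deterministic.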

spark git commit: [SPARK-18775][SQL] Limit the max number of records written per file

2016-12-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 078c71c2d -> 354e93618 [SPARK-18775][SQL] Limit the max number of records written per file ## What changes were proposed in this pull request? Currently, Spark writes a single file out per task, sometimes leading to very large files. It wo
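The rolling-file idea behind this change can be sketched in miniature: once a writer has emitted `max_records` rows, close the current part file and start the next one. This is an in-memory illustration under assumed names, not Spark's writer implementation.

```python
def write_partition(rows, max_records):
    """Split rows into 'files' (lists) of at most max_records each."""
    files, current = [], []
    for row in rows:
        if len(current) == max_records:
            files.append(current)  # roll over to a new part file
            current = []
        current.append(row)
    if current:
        files.append(current)
    return files
```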

spark git commit: [SPARK-18949][SQL][BACKPORT-2.1] Add recoverPartitions API to Catalog

2016-12-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 318483421 -> 0e51bb085 [SPARK-18949][SQL][BACKPORT-2.1] Add recoverPartitions API to Catalog ### What changes were proposed in this pull request? This PR is to backport https://github.com/apache/spark/pull/16356 to Spark 2.1.1 branch.

spark git commit: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for each table's relation in cache

2016-12-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.0 5f8c0b742 -> 53cd99f65 [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for each table's relation in cache ## What changes were proposed in this pull request? Backport of #16135 to branch-2.0 ## How was this patch tested? Because of
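Lock striping, the technique named in the title, sits between one global lock (too coarse) and one lock per table (unbounded): keep a fixed pool of locks and pick one by hashing the table name. A minimal sketch, with illustrative names rather than Spark's actual classes:

```python
import threading

class StripedLock:
    def __init__(self, num_stripes=16):
        self._locks = [threading.Lock() for _ in range(num_stripes)]

    def lock_for(self, table_name):
        """Return the lock guarding this table's cache entry."""
        return self._locks[hash(table_name) % len(self._locks)]
```

Two tables may share a stripe and serialize unnecessarily, but the same table always maps to the same lock, which is the property the cache needs.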

spark git commit: [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 ef206ace2 -> 5f8c0b742 [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window ## What changes were proposed in this pull request? The issue in this test is that the cleanup of RDDs may not be

spark git commit: [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 162bdb910 -> 318483421 [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window ## What changes were proposed in this pull request? The issue in this test is that the cleanup of RDDs may not be

spark git commit: [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master ccfe60a83 -> 078c71c2d [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window ## What changes were proposed in this pull request? The issue in this test is that the cleanup of RDDs may not be abl

spark git commit: [SPARK-18031][TESTS] Fix flaky test ExecutorAllocationManagerSuite.basic functionality

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 3c8861d92 -> 162bdb910 [SPARK-18031][TESTS] Fix flaky test ExecutorAllocationManagerSuite.basic functionality ## What changes were proposed in this pull request? The failure is because in `test("basic functionality")`, it doesn't bloc

spark git commit: [SPARK-18031][TESTS] Fix flaky test ExecutorAllocationManagerSuite.basic functionality

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master 607a1e63d -> ccfe60a83 [SPARK-18031][TESTS] Fix flaky test ExecutorAllocationManagerSuite.basic functionality ## What changes were proposed in this pull request? The failure is because in `test("basic functionality")`, it doesn't block un
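The race behind this kind of flaky test has a standard shape: the test posts events to an asynchronous listener bus and then asserts before the background thread has processed them. The fix is to block until the bus has drained. A self-contained sketch with illustrative names (not Spark's `LiveListenerBus`):

```python
import queue
import threading

class ListenerBus:
    def __init__(self, handler):
        self._q = queue.Queue()
        self._handler = handler
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            event = self._q.get()
            self._handler(event)   # handle before marking done
            self._q.task_done()

    def post(self, event):
        self._q.put(event)

    def wait_until_empty(self):
        """Block until every posted event has been handled."""
        self._q.join()
```

Calling `wait_until_empty()` between posting events and asserting removes the race without resorting to sleeps.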

spark git commit: [SPARK-18894][SS] Fix event time watermark delay threshold specified in months or years

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 bc54a14b4 -> 3c8861d92 [SPARK-18894][SS] Fix event time watermark delay threshold specified in months or years ## What changes were proposed in this pull request? Two changes - Fix how delays specified in months and years are translat

spark git commit: [SPARK-18894][SS] Fix event time watermark delay threshold specified in months or years

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 1a6438897 -> 607a1e63d [SPARK-18894][SS] Fix event time watermark delay threshold specified in months or years ## What changes were proposed in this pull request? Two changes - Fix how delays specified in months and years are translated t
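The underlying problem is that a delay like "1 month" has no fixed millisecond length, so a watermark needs a concrete convention when flattening a calendar interval. The sketch below approximates months with a fixed days-per-month constant; that constant is an assumption for illustration, not necessarily the one Spark uses.

```python
MILLIS_PER_DAY = 24 * 60 * 60 * 1000
DAYS_PER_MONTH = 31  # illustrative convention, not Spark's constant

def delay_millis(months=0, days=0, millis=0):
    """Flatten a (months, days, millis) interval into total milliseconds."""
    return (months * DAYS_PER_MONTH + days) * MILLIS_PER_DAY + millis
```

The bug class this entry fixes is exactly the case where the months component was dropped or mis-scaled during this flattening.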

spark git commit: [SPARK-18951] Upgrade com.thoughtworks.paranamer/paranamer to 2.6

2016-12-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b7650f11c -> 1a6438897 [SPARK-18951] Upgrade com.thoughtworks.paranamer/paranamer to 2.6 ## What changes were proposed in this pull request? I recently hit a bug in com.thoughtworks.paranamer/paranamer, which causes jackson to fail to handle

spark git commit: [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables

2016-12-21 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.0 2aae220b5 -> ef206ace2 [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables ## What changes were proposed in this pull request? It's a huge waste to call `Catalog.listTables` in `SQLContext.tableNames`, which on

spark git commit: [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables

2016-12-21 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.1 063a98e52 -> bc54a14b4 [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables ## What changes were proposed in this pull request? It's a huge waste to call `Catalog.listTables` in `SQLContext.tableNames`, which on

spark git commit: [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables

2016-12-21 Thread wenchen
Repository: spark Updated Branches: refs/heads/master ba4468bb2 -> b7650f11c [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables ## What changes were proposed in this pull request? It's a huge waste to call `Catalog.listTables` in `SQLContext.tableNames`, which only n
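The optimization described in these entries is the general rule that listing *names* only needs the catalog's keys, while building full table objects fetches metadata per table. A toy catalog shape (hypothetical, not Spark's `Catalog` API) makes the cost difference concrete:

```python
class Catalog:
    def __init__(self, tables):
        self._tables = tables  # name -> metadata dict

    def table_names(self):
        """Cheap path: just the keys, no per-table metadata lookups."""
        return sorted(self._tables)

    def list_tables(self):
        """Expensive path: materializes a full record per table."""
        return [dict(name=n, **m) for n, m in sorted(self._tables.items())]
```

When callers only need names, routing them through the expensive path does redundant work per table, which is the waste the commit removes.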

spark git commit: [SPARK-18923][DOC][BUILD] Support skipping R/Python API docs

2016-12-21 Thread srowen
Repository: spark Updated Branches: refs/heads/master 24c0c9412 -> ba4468bb2 [SPARK-18923][DOC][BUILD] Support skipping R/Python API docs ## What changes were proposed in this pull request? We can build Python API docs by `cd ./python/docs && make html` and R API docs by `cd ./R &