spark git commit: [SPARK-15698][SQL][STREAMING][FOLLW-UP] Fix FileStream source and sink log get configuration issue

2016-09-20 Thread zsxwing
698. Mistakenly change the way to get configuration back to original one, so here with the follow up PR to revert them up. ## How was this patch tested? N/A Ping zsxwing , please review again, sorry to bring the inconvenience. Thanks a lot. Author: jerryshao <ss...@hortonworks.com>

spark git commit: [SPARK-15698][SQL][STREAMING] Add the ability to remove the old MetadataLog in FileStreamSource (branch-2.0)

2016-09-20 Thread zsxwing
How was this patch tested? Jenkins Author: jerryshao <ss...@hortonworks.com> Closes #15163 from zsxwing/SPARK-15698-spark-2.0. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8d8e2332 Tree: http://git-wip-us.apache.org/repos/

spark git commit: [SPARK-15698][SQL][STREAMING] Add the ability to remove the old MetadataLog in FileStreamSource

2016-09-20 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master eb004c662 -> a6aade004 [SPARK-15698][SQL][STREAMING] Add the ability to remove the old MetadataLog in FileStreamSource ## What changes were proposed in this pull request? Current `metadataLog` in `FileStreamSource` will add a checkpoint

spark git commit: [SPARK-17379][BUILD] Upgrade netty-all to 4.0.41 final for bug fixes

2016-09-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master b47927814 -> 0ad8eeb4d [SPARK-17379][BUILD] Upgrade netty-all to 4.0.41 final for bug fixes ## What changes were proposed in this pull request? Upgrade netty-all to latest in the 4.0.x line which is 4.0.41, mentions several bug fixes and

spark git commit: [SPARK-17451][CORE] CoarseGrainedExecutorBackend should inform driver before self-kill

2016-09-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 2ad276954 -> b47927814 [SPARK-17451][CORE] CoarseGrainedExecutorBackend should inform driver before self-kill ## What changes were proposed in this pull request? Jira : https://issues.apache.org/jira/browse/SPARK-17451

spark git commit: [SPARK-17486] Remove unused TaskMetricsUIData.updatedBlockStatuses field

2016-09-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 d293062a4 -> 30521522d [SPARK-17486] Remove unused TaskMetricsUIData.updatedBlockStatuses field The `TaskMetricsUIData.updatedBlockStatuses` field is assigned to but never read, increasing the memory consumption of the web UI. We

spark git commit: [SPARK-17486] Remove unused TaskMetricsUIData.updatedBlockStatuses field

2016-09-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 767d48076 -> 72eec70bd [SPARK-17486] Remove unused TaskMetricsUIData.updatedBlockStatuses field The `TaskMetricsUIData.updatedBlockStatuses` field is assigned to but never read, increasing the memory consumption of the web UI. We should

spark git commit: [SPARK-15487][WEB UI] Spark Master UI to reverse proxy Application and Workers UI

2016-09-08 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 722afbb2b -> 92ce8d484 [SPARK-15487][WEB UI] Spark Master UI to reverse proxy Application and Workers UI ## What changes were proposed in this pull request? This pull request adds the functionality to enable accessing worker and

spark git commit: [SPARK-17316][CORE] Fix the 'ask' type parameter in 'removeExecutor'

2016-09-06 Thread zsxwing
not cast java.lang.Boolean to scala.runtime.Nothing$` ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #14983 from zsxwing/SPARK-17316-3. (cherry picked from commit 175b4344112b376cbbbd05265125ed0e1b87d507) Signed-off-by: Shixiong Z

spark git commit: [SPARK-17316][CORE] Fix the 'ask' type parameter in 'removeExecutor'

2016-09-06 Thread zsxwing
not cast java.lang.Boolean to scala.runtime.Nothing$` ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #14983 from zsxwing/SPARK-17316-3. (cherry picked from commit 175b4344112b376cbbbd05265125ed0e1b87d507) Signed-off-by: Shixiong Z

spark git commit: [SPARK-17316][CORE] Make CoarseGrainedSchedulerBackend.removeExecutor non-blocking

2016-09-02 Thread zsxwing
lue). ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #14882 from zsxwing/SPARK-17316. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b84a92c2 Tree: http:

spark git commit: [SPARK-17318][TESTS] Fix ReplSuite replicating blocks of object with class defined in repl again

2016-09-01 Thread zsxwing
est. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #14905 from zsxwing/SPARK-17318-2. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/21c0a4fe Tree: http://git-wip-us.apache.

spark git commit: [SPARK-17318][TESTS] Fix ReplSuite replicating blocks of object with class defined in repl again

2016-09-01 Thread zsxwing
est. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #14905 from zsxwing/SPARK-17318-2. (cherry picked from commit 21c0a4fe9d8e21819ba96e7dc2b1f2999d3299ae) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.a

spark git commit: [SPARK-17318][TESTS] Fix ReplSuite replicating blocks of object with class defined in repl

2016-08-30 Thread zsxwing
<shixi...@databricks.com> Closes #14884 from zsxwing/SPARK-17318. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/231f9732 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/231f9732 Diff: http://git-wip-us.apache.org/

spark git commit: [SPARK-17314][CORE] Use Netty's DefaultThreadFactory to enable its fast ThreadLocal impl

2016-08-30 Thread zsxwing
Zhu <shixi...@databricks.com> Closes #14879 from zsxwing/netty-thread. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/02ac379e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/02ac379e Diff: http://git-wip-us.apache.

spark git commit: [SPARK-17165][SQL] FileStreamSource should not track the list of seen files indefinitely

2016-08-26 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 6f82d2da3 -> deb6a54cf [SPARK-17165][SQL] FileStreamSource should not track the list of seen files indefinitely ## What changes were proposed in this pull request? Before this change, FileStreamSource uses an in-memory hash set to

spark git commit: [SPARK-17165][SQL] FileStreamSource should not track the list of seen files indefinitely

2016-08-26 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 261c55dd8 -> 9812f7d53 [SPARK-17165][SQL] FileStreamSource should not track the list of seen files indefinitely ## What changes were proposed in this pull request? Before this change, FileStreamSource uses an in-memory hash set to track

spark git commit: [SPARK-17231][CORE] Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 ff2e270eb -> 73014a2aa [SPARK-17231][CORE] Avoid building debug or trace log messages unless the respective log level is enabled This is simply a backport of #14798 to `branch-2.0`. This backport omits the change to

spark git commit: [SPARK-17231][CORE] Avoid building debug or trace log messages unless the respective log level is enabled

2016-08-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master d2ae6399e -> f20931071 [SPARK-17231][CORE] Avoid building debug or trace log messages unless the respective log level is enabled (This PR addresses https://issues.apache.org/jira/browse/SPARK-17231) ## What changes were proposed in this

spark git commit: [SPARK-17038][STREAMING] fix metrics retrieval source of 'lastReceivedBatch'

2016-08-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 9406f82db -> 585d1d95c [SPARK-17038][STREAMING] fix metrics retrieval source of 'lastReceivedBatch' https://issues.apache.org/jira/browse/SPARK-17038 ## What changes were proposed in this pull request? StreamingSource's

spark git commit: [SPARK-17038][STREAMING] fix metrics retrieval source of 'lastReceivedBatch'

2016-08-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 60de30faf -> 412b0e896 [SPARK-17038][STREAMING] fix metrics retrieval source of 'lastReceivedBatch' https://issues.apache.org/jira/browse/SPARK-17038 ## What changes were proposed in this pull request? StreamingSource's

spark git commit: [SPARK-17038][STREAMING] fix metrics retrieval source of 'lastReceivedBatch'

2016-08-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master d60af8f6a -> e6bef7d52 [SPARK-17038][STREAMING] fix metrics retrieval source of 'lastReceivedBatch' https://issues.apache.org/jira/browse/SPARK-17038 ## What changes were proposed in this pull request? StreamingSource's

spark git commit: [SPARK-15869][STREAMING] Fix a potential NPE in StreamingJobProgressListener.getBatchUIData

2016-08-01 Thread zsxwing
tch tested? Existing unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #14443 from zsxwing/SPARK-15869. (cherry picked from commit 03d46aafe561b03e25f4e25cf01e631c18dd827c) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/

spark git commit: [SPARK-15869][STREAMING] Fix a potential NPE in StreamingJobProgressListener.getBatchUIData

2016-08-01 Thread zsxwing
ted? Existing unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #14443 from zsxwing/SPARK-15869. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/03d46aaf Tree: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [SPARK-15590][WEBUI] Paginate Job Table in Jobs tab

2016-07-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master c979c8bba -> db36e1e75 [SPARK-15590][WEBUI] Paginate Job Table in Jobs tab ## What changes were proposed in this pull request? This patch adds pagination support for the Job Tables in the Jobs tab. Pagination is provided for all of the

spark git commit: [SPARK-16715][TESTS] Fix a potential ExprId conflict for SubexpressionEliminationSuite."Semantic equals and hash"

2016-07-25 Thread zsxwing
lict happens. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #14350 from zsxwing/SPARK-16715. (cherry picked from commit 12f490b5c85cdee26d47eb70ad1a1edd00504f21) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: h

spark git commit: [SPARK-16715][TESTS] Fix a potential ExprId conflict for SubexpressionEliminationSuite."Semantic equals and hash"

2016-07-25 Thread zsxwing
lict happens. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #14350 from zsxwing/SPARK-16715. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/12f490b5 Tree: http://gi

spark git commit: [SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if there is an exception while creating an Executor

2016-07-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 e833c906f -> 34ac45a34 [SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if there is an exception while creating an Executor ## What changes were proposed in this pull request? With the fix from SPARK-13112, I see that

spark git commit: [SPARK-16350][SQL] Fix support for incremental planning in wirteStream.foreach()

2016-07-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 24933355c -> cbfd94eac [SPARK-16350][SQL] Fix support for incremental planning in wirteStream.foreach() ## What changes were proposed in this pull request? There are cases where `complete` output mode does not output updated

spark git commit: [SPARK-16350][SQL] Fix support for incremental planning in wirteStream.foreach()

2016-07-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master a04cab8f1 -> 0f7175def [SPARK-16350][SQL] Fix support for incremental planning in wirteStream.foreach() ## What changes were proposed in this pull request? There are cases where `complete` output mode does not output updated aggregated

spark git commit: Revert "[SPARK-16372][MLLIB] Retag RDD to tallSkinnyQR of RowMatrix"

2016-07-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 45dda9221 -> bb92788f9 Revert "[SPARK-16372][MLLIB] Retag RDD to tallSkinnyQR of RowMatrix" This reverts commit 45dda92214191310a56333a2085e2343eba170cd. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-15591][WEBUI] Paginate Stage Table in Stages tab

2016-07-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 21eadd1d8 -> 478b71d02 [SPARK-15591][WEBUI] Paginate Stage Table in Stages tab ## What changes were proposed in this pull request? This patch adds pagination support for the Stage Tables in the Stage tab. Pagination is provided for all

spark git commit: [SPARK-16236][SQL][FOLLOWUP] Add Path Option back to Load API in DataFrameReader

2016-06-29 Thread zsxwing
sue, zsxwing ! Below is an example: ```Python spark.read.format('json').load('python/test_support/sql/people.json') ``` How was this patch tested? Existing test cases cover the changes by this PR Author: gatorsmile <gatorsm...@gmail.com> Closes #13965 from gatorsmile/optionPaths. Project: h

spark git commit: [SPARK-16236][SQL][FOLLOWUP] Add Path Option back to Load API in DataFrameReader

2016-06-29 Thread zsxwing
sue, zsxwing ! Below is an example: ```Python spark.read.format('json').load('python/test_support/sql/people.json') ``` How was this patch tested? Existing test cases cover the changes by this PR Author: gatorsmile <gatorsm...@gmail.com> Closes #13965 from gatorsmile/optionPaths. (cher

[1/2] spark git commit: [SPARK-16259][PYSPARK] cleanup options in DataFrame read/write API

2016-06-29 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 22b4072e7 -> 6650c0533 [SPARK-16259][PYSPARK] cleanup options in DataFrame read/write API ## What changes were proposed in this pull request? There are some duplicated code for options in DataFrame reader/writer API, this PR clean

[2/2] spark git commit: [SPARK-16266][SQL][STREAING] Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming

2016-06-29 Thread zsxwing
[SPARK-16266][SQL][STREAING] Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming ## What changes were proposed in this pull request? - Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming to make them consistent with scala packaging - Exposed the

spark git commit: [SPARK-16266][SQL][STREAING] Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 153c2f9ac -> f454a7f9f [SPARK-16266][SQL][STREAING] Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming ## What changes were proposed in this pull request? - Moved DataStreamReader/Writer from pyspark.sql to

spark git commit: [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter`

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 5fb7804e5 -> 52c9d69f7 [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter` ## What changes were proposed in this pull request? Fixes a couple old references to

spark git commit: [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter`

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 3554713a1 -> 5545b7910 [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter` ## What changes were proposed in this pull request? Fixes a couple old references to `DataFrameWriter.startStream`

spark git commit: [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 c86d29b2e -> 5c9555e11 [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID ## What changes were proposed in this pull request? Previously, the TaskLocation implementation would not allow for executor ids

spark git commit: [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master d59ba8e30 -> ae14f3623 [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID ## What changes were proposed in this pull request? Previously, the TaskLocation implementation would not allow for executor ids

spark git commit: [SPARK-16116][SQL] ConsoleSink should not require checkpointLocation

2016-06-23 Thread zsxwing
ion` is not specified. ## How was this patch tested? The added unit test. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13817 from zsxwing/console-checkpoint. (cherry picked from commit d85bb10ce49926b8b661bd2cb97392205742fc14) Signed-off-by: Shixiong Zhu <shixi...@databricks.co

spark git commit: [SPARK-16116][SQL] ConsoleSink should not require checkpointLocation

2016-06-23 Thread zsxwing
ion` is not specified. ## How was this patch tested? The added unit test. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13817 from zsxwing/console-checkpoint. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d85b

spark git commit: [SPARK-16131] initialize internal logger lazily in Scala preferred way

2016-06-22 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 1d3c56e77 -> e2eb8e002 [SPARK-16131] initialize internal logger lazily in Scala preferred way ## What changes were proposed in this pull request? Initialize logger instance lazily in Scala preferred way ## How was this patch tested?

spark git commit: [SPARK-16131] initialize internal logger lazily in Scala preferred way

2016-06-22 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 857ecff1d -> 044971eca [SPARK-16131] initialize internal logger lazily in Scala preferred way ## What changes were proposed in this pull request? Initialize logger instance lazily in Scala preferred way ## How was this patch tested? By

spark git commit: [SPARK-16120][STREAMING] getCurrentLogFiles in ReceiverSuite WAL generating and cleaning case uses external variable instead of the passed parameter

2016-06-22 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 76d0ef34e -> 520828c90 [SPARK-16120][STREAMING] getCurrentLogFiles in ReceiverSuite WAL generating and cleaning case uses external variable instead of the passed parameter ## What changes were proposed in this pull request? In

spark git commit: [SPARK-16120][STREAMING] getCurrentLogFiles in ReceiverSuite WAL generating and cleaning case uses external variable instead of the passed parameter

2016-06-22 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 0a9c02759 -> c2cebdb7d [SPARK-16120][STREAMING] getCurrentLogFiles in ReceiverSuite WAL generating and cleaning case uses external variable instead of the passed parameter ## What changes were proposed in this pull request? In

spark git commit: [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks

2016-06-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 abe36c53d -> d98fb19c1 [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks ## What changes were proposed in this pull request? Set minimum number of dispatcher threads to 3 to avoid deadlocks on machines with

spark git commit: [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc

2016-06-20 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 8159da20e -> 54001cb12 [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc ## What changes were proposed in this pull request? Issues with current reader behavior. -

spark git commit: [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc

2016-06-20 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 6df8e3886 -> b99129cc4 [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc ## What changes were proposed in this pull request? Issues with current reader behavior. - `text()`

spark git commit: [SPARK-16020][SQL] Fix complete mode aggregation with console sink

2016-06-17 Thread zsxwing
ks.com> Closes #13740 from zsxwing/complete-console. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d0ac0e6f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d0ac0e6f Diff: http://git-wip-us.apache.org/repos/asf/spark/diff

spark git commit: [SPARK-16020][SQL] Fix complete mode aggregation with console sink

2016-06-17 Thread zsxwing
ks.com> Closes #13740 from zsxwing/complete-console. (cherry picked from commit d0ac0e6f433bfccf4ced3743a2526f67fdb5c38e) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/

spark git commit: [SPARK-16017][CORE] Send hostname from CoarseGrainedExecutorBackend to driver

2016-06-17 Thread zsxwing
ong Zhu <shixi...@databricks.com> Closes #13741 from zsxwing/SPARK-16017. (cherry picked from commit 62d8fe2089659e8212753a622708517e0f4a77bc) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apach

spark git commit: [SPARK-16017][CORE] Send hostname from CoarseGrainedExecutorBackend to driver

2016-06-17 Thread zsxwing
Zhu <shixi...@databricks.com> Closes #13741 from zsxwing/SPARK-16017. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/62d8fe20 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/62d8fe20 Diff: http://git-wip-us.apache.

spark git commit: [SPARK-15991] SparkContext.hadoopConfiguration should be always the base of hadoop conf created by SessionState

2016-06-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 8f7138859 -> b3678eb7e [SPARK-15991] SparkContext.hadoopConfiguration should be always the base of hadoop conf created by SessionState ## What changes were proposed in this pull request? Before this patch, after a SparkSession has

spark git commit: [SPARK-15991] SparkContext.hadoopConfiguration should be always the base of hadoop conf created by SessionState

2016-06-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 62d2fa5e9 -> d9c6628c4 [SPARK-15991] SparkContext.hadoopConfiguration should be always the base of hadoop conf created by SessionState ## What changes were proposed in this pull request? Before this patch, after a SparkSession has been

spark git commit: [SPARK-15981][SQL][STREAMING] Fixed bug and added tests in DataStreamReader Python API

2016-06-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 0a2291cd1 -> e11c27918 [SPARK-15981][SQL][STREAMING] Fixed bug and added tests in DataStreamReader Python API ## What changes were proposed in this pull request? - Fixed bug in Python API of DataStreamReader. Because a single path

spark git commit: [SPARK-15981][SQL][STREAMING] Fixed bug and added tests in DataStreamReader Python API

2016-06-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master a865f6e05 -> 084dca770 [SPARK-15981][SQL][STREAMING] Fixed bug and added tests in DataStreamReader Python API ## What changes were proposed in this pull request? - Fixed bug in Python API of DataStreamReader. Because a single path was

spark git commit: [SPARK-12492][SQL] Add missing SQLExecution.withNewExecutionId for hiveResultString

2016-06-15 Thread zsxwing
so that queries running in `spark-sql` will be shown in Web UI. Closes #13115 ## How was this patch tested? Existing unit tests. Author: KaiXinXiaoLei <huleil...@huawei.com> Closes #13689 from zsxwing/pr13115. (cherry picked from commit 3e6d567a4688f064f2a2259c8e436b7c628a431c) Signed-off-by:

spark git commit: [SPARK-15826][CORE] PipedRDD to allow configurable char encoding

2016-06-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 9b234b55d -> 279bd4aa5 [SPARK-15826][CORE] PipedRDD to allow configurable char encoding ## What changes were proposed in this pull request? Link to jira which describes the problem: https://issues.apache.org/jira/browse/SPARK-15826 The

spark git commit: [SPARK-15826][CORE] PipedRDD to allow configurable char encoding

2016-06-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 de56ea9bf -> 8ef31fbd7 [SPARK-15826][CORE] PipedRDD to allow configurable char encoding ## What changes were proposed in this pull request? Link to jira which describes the problem: https://issues.apache.org/jira/browse/SPARK-15826

[1/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 4c950a757 -> 885e74a38 http://git-wip-us.apache.org/repos/asf/spark/blob/885e74a3/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamReaderWriterSuite.scala

[3/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
[SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery Renamed for simplicity, so that it's obvious that it's related to streaming. Existing unit tests. Author: Tathagata Das Closes #13673 from tdas/SPARK-15953. Project:

[2/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
http://git-wip-us.apache.org/repos/asf/spark/blob/885e74a3/sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala -- diff --git

[3/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
[SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery Renamed for simplicity, so that it's obvious that it's related to streaming. Existing unit tests. Author: Tathagata Das Closes #13673 from tdas/SPARK-15953. (cherry picked from commit

[2/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
http://git-wip-us.apache.org/repos/asf/spark/blob/9a507199/sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala -- diff --git

[1/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master d30b7e669 -> 9a5071996 http://git-wip-us.apache.org/repos/asf/spark/blob/9a507199/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamReaderWriterSuite.scala

spark git commit: [SPARK-15935][PYSPARK] Fix a wrong format tag in the error message

2016-06-14 Thread zsxwing
ins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13665 from zsxwing/fix. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0ee9fd9e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0ee9fd9e D

spark git commit: [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5ad4e32d4 -> c654ae214 [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery ## What changes were proposed in this pull request? ContinuousQueries have names that are unique across all the active ones. However, when queries

spark git commit: [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 d9db8a9c8 -> 97fe1d8ee [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery ## What changes were proposed in this pull request? ContinuousQueries have names that are unique across all the active ones. However, when

spark git commit: [SPARK-15697][REPL] Unblock some of the useful repl commands.

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 c01dc815d -> 413826d40 [SPARK-15697][REPL] Unblock some of the useful repl commands. ## What changes were proposed in this pull request? Unblock some of the useful repl commands. like, "implicits", "javap", "power", "type", "kind".

spark git commit: [SPARK-15697][REPL] Unblock some of the useful repl commands.

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 938434dc7 -> 4134653e5 [SPARK-15697][REPL] Unblock some of the useful repl commands. ## What changes were proposed in this pull request? Unblock some of the useful repl commands. like, "implicits", "javap", "power", "type", "kind". As

spark git commit: [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter`

2016-06-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 c1390ccbb -> f15d641e2 [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter` ## What changes were proposed in this pull request? It doesn't make sense to specify partitioning parameters, when we write data out from

spark git commit: [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter`

2016-06-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5c16ad0d5 -> fb219029d [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter` ## What changes were proposed in this pull request? It doesn't make sense to specify partitioning parameters, when we write data out from

spark git commit: [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests.

2016-06-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 b2d076c35 -> 3119d8eef [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests. Description from JIRA. In ReplSuite, for a test that can be tested well on just local should not really have to start a local-cluster.

spark git commit: [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests.

2016-06-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master aa0364510 -> 83070cd1d [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests. Description from JIRA. In ReplSuite, for a test that can be tested well on just local should not really have to start a local-cluster. And

spark git commit: [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 8900c8d8f -> 9aff6f3b1 [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files What changes were proposed in this pull request? This PR is to address the following issues: - **ISSUE 1:** For ORC source format, we are

spark git commit: [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 cd7bf4b8e -> 32b025e94 [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files What changes were proposed in this pull request? This PR is to address the following issues: - **ISSUE 1:** For ORC source format, we are

spark git commit: [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 fe639adea -> 18d613a4d [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks ## What changes were proposed in this pull request? Set minimum number of dispatcher threads to 3 to avoid deadlocks on machines with

spark git commit: [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 63b7f127c -> 7c07d176f [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks ## What changes were proposed in this pull request? Set minimum number of dispatcher threads to 3 to avoid deadlocks on machines with only

spark git commit: Revert "[SPARK-11753][SQL][TEST-HADOOP2.2] Make allowNonNumericNumbers option work

2016-05-31 Thread zsxwing
R to run Jenkins tests due to the revert conflicts of `dev/deps/spark-deps-hadoop*`. ## How was this patch tested? Jenkins unit tests, integration tests, manual tests) Author: Shixiong Zhu <shixi...@databricks.com> Closes #13417 from zsxwing/revert-SPARK-11753. Project: http://git-wip

spark git commit: Revert "[SPARK-11753][SQL][TEST-HADOOP2.2] Make allowNonNumericNumbers option work

2016-05-31 Thread zsxwing
ent a PR to run Jenkins tests due to the revert conflicts of `dev/deps/spark-deps-hadoop*`. ## How was this patch tested? Jenkins unit tests, integration tests, manual tests) Author: Shixiong Zhu <shixi...@databricks.com> Closes #13417 from zsxwing/revert-SPARK-11753. (cherry pick

spark git commit: [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change.

2016-05-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 f63ba2210 -> 69327667d [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change. ## What changes were proposed in this pull request? 1. Making 'name' field of RDDInfo mutable. 2. In StorageListener: catching the fact

spark git commit: [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change.

2016-05-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 4f27b8dd5 -> b120fba6a [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change. ## What changes were proposed in this pull request? 1. Making 'name' field of RDDInfo mutable. 2. In StorageListener: catching the fact that

spark git commit: [SPARK-15508][STREAMING][TESTS] Fix flaky test: JavaKafkaStreamSuite.testKafkaStream

2016-05-24 Thread zsxwing
ttp://spark-tests.appspot.com/tests/org.apache.spark.streaming.kafka.JavaKafkaStreamSuite/testKafkaStream ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13281 from zsxwing/flaky-kafka-test. (cherry picked fr

spark git commit: [SPARK-15508][STREAMING][TESTS] Fix flaky test: JavaKafkaStreamSuite.testKafkaStream

2016-05-24 Thread zsxwing
ark-tests.appspot.com/tests/org.apache.spark.streaming.kafka.JavaKafkaStreamSuite/testKafkaStream ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13281 from zsxwing/flaky-kafka-test. Project: http://git-wip-us.apache.org/repos/

spark git commit: [SPARK-15395][CORE] Use getHostString to create RpcAddress (backport for 1.6)

2016-05-20 Thread zsxwing
sts. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13196 from zsxwing/host-string-1.6. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7ad82b66 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7ad82b66 D

spark git commit: Fix the compiler error introduced by #13153 for Scala 2.10

2016-05-19 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 d1b5df83d -> 4257ba372 Fix the compiler error introduced by #13153 for Scala 2.10 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4257ba37 Tree:

spark git commit: Fix the compiler error introduced by #13153 for Scala 2.10

2016-05-19 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5ccecc078 -> 305263954 Fix the compiler error introduced by #13153 for Scala 2.10 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/30526395 Tree:

spark git commit: [SPARK-15395][CORE] Use getHostString to create RpcAddress

2016-05-18 Thread zsxwing
end), and this behavior will make the check incorrect. This PR uses `getHostString` to resolve the issue. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13185 from zsxwing/host-string. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Com

spark git commit: [SPARK-15395][CORE] Use getHostString to create RpcAddress

2016-05-18 Thread zsxwing
end), and this behavior will make the check incorrect. This PR uses `getHostString` to resolve the issue. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13185 from zsxwing/host-string. (cherry picked from commit 5c9117a3ed373461529f9f9306668ed

spark git commit: [SPARK-14942][SQL][STREAMING] Reduce delay between batch construction and execution

2016-05-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master fabc8e5b1 -> 95f4fbae5 [SPARK-14942][SQL][STREAMING] Reduce delay between batch construction and execution ## Problem Currently in `StreamExecution`, [we first run the batch, then construct the

spark git commit: [SPARK-14942][SQL][STREAMING] Reduce delay between batch construction and execution

2016-05-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 f937ce766 -> 0dd1f8720 [SPARK-14942][SQL][STREAMING] Reduce delay between batch construction and execution ## Problem Currently in `StreamExecution`, [we first run the batch, then construct the

spark git commit: [SPARK-15262] Synchronize block manager / scheduler executor state

2016-05-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 7ecd49688 -> 40a949aae [SPARK-15262] Synchronize block manager / scheduler executor state ## What changes were proposed in this pull request? If an executor is still alive even after the scheduler has removed its metadata, we may receive

spark git commit: [SPARK-15262] Synchronize block manager / scheduler executor state

2016-05-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 ced71d353 -> e2a43d007 [SPARK-15262] Synchronize block manager / scheduler executor state ## What changes were proposed in this pull request? If an executor is still alive even after the scheduler has removed its metadata, we may

spark git commit: [SPARK-15262] Synchronize block manager / scheduler executor state

2016-05-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 6e08eb469 -> 2454f6abf [SPARK-15262] Synchronize block manager / scheduler executor state ## What changes were proposed in this pull request? If an executor is still alive even after the scheduler has removed its metadata, we may

spark git commit: [SPARK-14936][BUILD][TESTS] FlumePollingStreamSuite is slow

2016-05-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 1db027d11 -> f021f3460 [SPARK-14936][BUILD][TESTS] FlumePollingStreamSuite is slow https://issues.apache.org/jira/browse/SPARK-14936 ## What changes were proposed in this pull request? FlumePollingStreamSuite contains two tests which

spark git commit: [SPARK-14936][BUILD][TESTS] FlumePollingStreamSuite is slow

2016-05-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master da02d006b -> 86475520f [SPARK-14936][BUILD][TESTS] FlumePollingStreamSuite is slow https://issues.apache.org/jira/browse/SPARK-14936 ## What changes were proposed in this pull request? FlumePollingStreamSuite contains two tests which run

spark git commit: [SPARK-6005][TESTS] Fix flaky test: o.a.s.streaming.kafka.DirectKafkaStreamSuite.offset recovery

2016-05-10 Thread zsxwing
ves the logic of `offsetRangesBeforeStop` (also renamed to `offsetRangesAfterStop`) after `ssc.stop()` to fix the flaky test. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #12903 from zsxwing/SPARK-6005. Project: http:

spark git commit: [SPARK-6005][TESTS] Fix flaky test: o.a.s.streaming.kafka.DirectKafkaStreamSuite.offset recovery

2016-05-10 Thread zsxwing
ust moves the logic of `offsetRangesBeforeStop` (also renamed to `offsetRangesAfterStop`) after `ssc.stop()` to fix the flaky test. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #12903 from zsxwing/SPARK-6005. (cherry picked fr
