spark git commit: [SPARK-12376][TESTS] Spark Streaming Java8APISuite fails in assertOrderInvariantEquals method

2015-12-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 48dcee484 -> 4df1dd403 [SPARK-12376][TESTS] Spark Streaming Java8APISuite fails in assertOrderInvariantEquals method org.apache.spark.streaming.Java8APISuite.java is failing due to trying to sort immutable list in

spark git commit: [SPARK-12410][STREAMING] Fix places that use '.' and '|' directly in split

2015-12-17 Thread zsxwing
bricks.com> Closes #10361 from zsxwing/reg-bug. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/540b5aea Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/540b5aea Diff: http://git-wip-us.apache.org/repos/asf/spark/diff

spark git commit: [SPARK-12410][STREAMING] Fix places that use '.' and '|' directly in split

2015-12-17 Thread zsxwing
..@databricks.com> Closes #10361 from zsxwing/reg-bug. (cherry picked from commit 540b5aeadc84d1a5d61bda4414abd6bf35dc7ff9) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/c

spark git commit: [SPARK-11749][STREAMING] Duplicate creating the RDD in file stream when recovering from checkpoint data

2015-12-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 658f66e62 -> f4346f612 [SPARK-11749][STREAMING] Duplicate creating the RDD in file stream when recovering from checkpoint data Add a transient flag `DStream.restoredFromCheckpointData` to control the restore processing in DStream to

spark git commit: [SPARK-11749][STREAMING] Duplicate creating the RDD in file stream when recovering from checkpoint data

2015-12-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 4df1dd403 -> 9177ea383 [SPARK-11749][STREAMING] Duplicate creating the RDD in file stream when recovering from checkpoint data Add a transient flag `DStream.restoredFromCheckpointData` to control the restore processing in DStream to

spark git commit: [MINOR] Hide the error logs for 'SQLListenerMemoryLeakSuite'

2015-12-17 Thread zsxwing
ks.com> Closes #10363 from zsxwing/hide-log. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0370abdf Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0370abdf Diff: http://git-wip-us.apache.org/repos/asf/spark/diff

spark git commit: [MINOR] Add missing interpolation in NettyRPCEnv

2015-12-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 552b38f87 -> 638b89bc3 [MINOR] Add missing interpolation in NettyRPCEnv ``` Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in ${timeout.duration}. This timeout is controlled by

spark git commit: [MINOR] Add missing interpolation in NettyRPCEnv

2015-12-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 27b98e99d -> 861549acd [MINOR] Add missing interpolation in NettyRPCEnv ``` Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in ${timeout.duration}. This timeout is controlled by

spark git commit: [SPARK-11904][PYSPARK] reduceByKeyAndWindow does not require checkpointing when invFunc is None

2015-12-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 97678edea -> 437583f69 [SPARK-11904][PYSPARK] reduceByKeyAndWindow does not require checkpointing when invFunc is None when invFunc is None, `reduceByKeyAndWindow(func, None, winsize, slidesize)` is equivalent to

spark git commit: [STREAMING][MINOR] Fix typo in function name of StateImpl

2015-12-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master c59df8c51 -> bc1ff9f4a [STREAMING][MINOR] Fix typo in function name of StateImpl cc\ tdas zsxwing , please review. Thanks a lot. Author: jerryshao <ss...@hortonworks.com> Closes #10305 from jerryshao/fix-typo-state-impl. Proj

spark git commit: [STREAMING][MINOR] Fix typo in function name of StateImpl

2015-12-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 352a0c80f -> 23c884605 [STREAMING][MINOR] Fix typo in function name of StateImpl cc\ tdas zsxwing , please review. Thanks a lot. Author: jerryshao <ss...@hortonworks.com> Closes #10305 from jerryshao/fix-typo-state-impl.

spark git commit: [SPARK-12304][STREAMING] Make Spark Streaming web UI display more fri…

2015-12-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master ca0690b5e -> d52bf47e1 [SPARK-12304][STREAMING] Make Spark Streaming web UI display more fri… …endly Receiver graphs Currently, the Spark Streaming web UI uses the same maxY when displays 'Input Rate Times& Histograms' and

spark git commit: [SPARK-12281][CORE] Fix a race condition when reporting ExecutorState in the shutdown hook

2015-12-13 Thread zsxwing
ava:745) ``` Author: Shixiong Zhu <shixi...@databricks.com> Closes #10269 from zsxwing/executor-state. (cherry picked from commit 2aecda284e22ec608992b6221e2f5ffbd51fcd24) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/re

spark git commit: [SPARK-12281][CORE] Fix a race condition when reporting ExecutorState in the shutdown hook

2015-12-13 Thread zsxwing
745) ``` Author: Shixiong Zhu <shixi...@databricks.com> Closes #10269 from zsxwing/executor-state. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2aecda28 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2aec

spark git commit: [SPARK-12267][CORE] Store the remote RpcEnv address to send the correct disconnetion message

2015-12-12 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 98b212d36 -> 8af2f8c61 [SPARK-12267][CORE] Store the remote RpcEnv address to send the correct disconnetion message Author: Shixiong Zhu <shixi...@databricks.com> Closes #10261 from zsxwing/SPARK-12267. Project: http:

spark git commit: [SPARK-12267][CORE] Store the remote RpcEnv address to send the correct disconnetion message

2015-12-12 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 e05364baa -> d7e3bfd7d [SPARK-12267][CORE] Store the remote RpcEnv address to send the correct disconnetion message Author: Shixiong Zhu <shixi...@databricks.com> Closes #10261 from zsxwing/SPARK-12267. (cherry picked fr

spark git commit: [SPARK-12273][STREAMING] Make Spark Streaming web UI list Receivers in order

2015-12-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master aa305dcaf -> 713e6959d [SPARK-12273][STREAMING] Make Spark Streaming web UI list Receivers in order Currently the Streaming web UI does NOT list Receivers in order; however, it seems more convenient for the users if Receivers are listed

spark git commit: [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc

2015-12-10 Thread zsxwing
ies compared to Scala/Java, so here changing the description to make it more precise. zsxwing tdas , please review, thanks a lot. Author: jerryshao <ss...@hortonworks.com> Closes #10246 from jerryshao/direct-kafka-doc-update. (cherry picked from commit 24d3357d66e14388faf8709b368edca70ea96432) S

spark git commit: [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc

2015-12-10 Thread zsxwing
ies compared to Scala/Java, so here changing the description to make it more precise. zsxwing tdas , please review, thanks a lot. Author: jerryshao <ss...@hortonworks.com> Closes #10246 from jerryshao/direct-kafka-doc-update. (cherry picked from commit 24d3357d66e14388faf8709b368edca70ea96432) S

spark git commit: [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc

2015-12-10 Thread zsxwing
red to Scala/Java, so here changing the description to make it more precise. zsxwing tdas , please review, thanks a lot. Author: jerryshao <ss...@hortonworks.com> Closes #10246 from jerryshao/direct-kafka-doc-update. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http:

[1/2] spark git commit: [SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState and change tracking function signature

2015-12-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 699f497cf -> f6d866173 http://git-wip-us.apache.org/repos/asf/spark/blob/f6d86617/streaming/src/test/java/org/apache/spark/streaming/JavaTrackStateByKeySuite.java --

[2/2] spark git commit: [SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState and change tracking function signature

2015-12-09 Thread zsxwing
[SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState and change tracking function signature SPARK-12244: Based on feedback from early users and personal experience attempting to explain it, the name trackStateByKey had two problems. "trackState" is a completely new term

[2/2] spark git commit: [SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState and change tracking function signature

2015-12-09 Thread zsxwing
[SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState and change tracking function signature SPARK-12244: Based on feedback from early users and personal experience attempting to explain it, the name trackStateByKey had two problems. "trackState" is a completely new term

[1/2] spark git commit: [SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState and change tracking function signature

2015-12-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 2166c2a75 -> bd2cd4f53 http://git-wip-us.apache.org/repos/asf/spark/blob/bd2cd4f5/streaming/src/test/java/org/apache/spark/streaming/JavaTrackStateByKeySuite.java -- diff

spark git commit: [SPARK-12074] Avoid memory copy involving ByteBuffer.wrap(ByteArrayOutputStream.toByteArray)

2015-12-08 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 6cb06e871 -> 75c60bf4b [SPARK-12074] Avoid memory copy involving ByteBuffer.wrap(ByteArrayOutputStream.toByteArray) SPARK-12060 fixed JavaSerializerInstance.serialize This PR applies the same technique on two other classes. zsxw

spark git commit: [SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize

2015-12-07 Thread zsxwing
<shixi...@databricks.com> Closes #10167 from zsxwing/merge-SPARK-12060. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3f4efb5c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3f4efb5c Diff: http://git-wip-us.apache.

spark git commit: [SPARK-12101][CORE] Fix thread pools that cannot cache tasks in Worker and AppClient (backport 1.5)

2015-12-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.5 93a0510a5 -> 3868ab644 [SPARK-12101][CORE] Fix thread pools that cannot cache tasks in Worker and AppClient (backport 1.5) backport #10108 to branch 1.5 Author: Shixiong Zhu <shixi...@databricks.com> Closes #10135 from zs

spark git commit: [SPARK-12101][CORE] Fix thread pools that cannot cache tasks in Worker and AppClient

2015-12-03 Thread zsxwing
xed `ThreadUtils.newDaemonCachedThreadPool`. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10108 from zsxwing/fix-threadpool. (cherry picked from commit 649be4fa4532dcd3001df8345f9f7e970a3fbc65) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/

spark git commit: [SPARK-12101][CORE] Fix thread pools that cannot cache tasks in Worker and AppClient

2015-12-03 Thread zsxwing
xed `ThreadUtils.newDaemonCachedThreadPool`. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10108 from zsxwing/fix-threadpool. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/649be4fa Tree: http://git-wip-us.apache.org/repos/asf/s

spark git commit: Revert "[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize"

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 81db8d086 -> 21909b8ac Revert "[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize" This reverts commit 9b99b2b46c452ba396e922db5fc7eec02c45b158. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: Revert "[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize"

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 60b541ee1 -> 328b757d5 Revert "[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize" This reverts commit 1401166576c7018c5f9c31e0a6703d5fb16ea339. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-12002][STREAMING][PYSPARK] Fix python direct stream checkpoint recovery issue

2015-12-01 Thread zsxwing
hu <shixi...@databricks.com> Closes #10074 from zsxwing/review-pr10017. (cherry picked from commit f292018f8e57779debc04998456ec875f628133b) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apach

spark git commit: [SPARK-12002][STREAMING][PYSPARK] Fix python direct stream checkpoint recovery issue

2015-12-01 Thread zsxwing
hu <shixi...@databricks.com> Closes #10074 from zsxwing/review-pr10017. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f292018f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f292018f Diff: http://git-wip-us.apach

spark git commit: [SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize

2015-12-01 Thread zsxwing
ong Zhu <shixi...@databricks.com> Closes #10051 from zsxwing/SPARK-12060. (cherry picked from commit 1401166576c7018c5f9c31e0a6703d5fb16ea339) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.a

spark git commit: [SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize

2015-12-01 Thread zsxwing
Zhu <shixi...@databricks.com> Closes #10051 from zsxwing/SPARK-12060. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/14011665 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/14011665 Diff: http://git-wip-us.a

spark git commit: [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 96691feae -> 8a75a3049 [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently in multiple places: * The JobConf is updated by

spark git commit: [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.5 4f07a590c -> 0d57a4ae1 [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently in multiple places: * The JobConf is updated by

spark git commit: [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 a5743affc -> 1f42295b5 [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently in multiple places: * The JobConf is updated by

spark git commit: [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.4 f5af299ab -> b6ba2dab2 [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently in multiple places: * The JobConf is updated by

spark git commit: [SPARK-12058][HOTFIX] Disable KinesisStreamTests

2015-11-30 Thread zsxwing
es #10047 from zsxwing/disable-python-kinesis-test. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/edb26e7f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/edb26e7f Diff: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [SPARK-12021][STREAMING][TESTS] Fix the potential dead-lock in StreamingListenerSuite

2015-11-27 Thread zsxwing
lls `ssc.stop()`, `StreamingContextStoppingCollector` may call `ssc.stop()` in the listener bus thread, which is a dead-lock. This PR updated `StreamingContextStoppingCollector` to only call `ssc.stop()` in the first batch to avoid the dead-lock. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10011 fr

spark git commit: [SPARK-12021][STREAMING][TESTS] Fix the potential dead-lock in StreamingListenerSuite

2015-11-27 Thread zsxwing
lls `ssc.stop()`, `StreamingContextStoppingCollector` may call `ssc.stop()` in the listener bus thread, which is a dead-lock. This PR updated `StreamingContextStoppingCollector` to only call `ssc.stop()` in the first batch to avoid the dead-lock. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10011 fr

spark git commit: [SPARK-11999][CORE] Fix the issue that ThreadUtils.newDaemonCachedThreadPool doesn't cache any task

2015-11-25 Thread zsxwing
<shixi...@databricks.com> Closes #9978 from zsxwing/cached-threadpool. (cherry picked from commit d3ef693325f91a1ed340c9756c81244a80398eb2) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/a

spark git commit: [SPARK-11999][CORE] Fix the issue that ThreadUtils.newDaemonCachedThreadPool doesn't cache any task

2015-11-25 Thread zsxwing
<shixi...@databricks.com> Closes #9978 from zsxwing/cached-threadpool. (cherry picked from commit d3ef693325f91a1ed340c9756c81244a80398eb2) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/a

spark git commit: [SPARK-11999][CORE] Fix the issue that ThreadUtils.newDaemonCachedThreadPool doesn't cache any task

2015-11-25 Thread zsxwing
<shixi...@databricks.com> Closes #9978 from zsxwing/cached-threadpool. (cherry picked from commit d3ef693325f91a1ed340c9756c81244a80398eb2) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/a

spark git commit: [SPARK-11999][CORE] Fix the issue that ThreadUtils.newDaemonCachedThreadPool doesn't cache any task

2015-11-25 Thread zsxwing
<shixi...@databricks.com> Closes #9978 from zsxwing/cached-threadpool. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d3ef6933 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d3ef6933 Diff: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [SPARK-11872] Prevent the call to SparkContext#stop() in the listener bus's thread

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 19530da69 -> 81012546e [SPARK-11872] Prevent the call to SparkContext#stop() in the listener bus's thread This is continuation of SPARK-11761 Andrew suggested adding this protection. See tail of https://github.com/apache/spark/pull/9741

spark git commit: [STREAMING][FLAKY-TEST] Catch execution context race condition in `FileBasedWriteAheadLog.close()`

2015-11-24 Thread zsxwing
nit/org.apache.spark.streaming.util/BatchedWriteAheadLogWithCloseFileAfterWriteSuite/BatchedWriteAheadLog___clean_old_logs/ The reason the test fails is in `afterEach`, `writeAheadLog.close` is called, and there may still be async deletes in flight. tdas zsxwing Author: Burak Yavuz <brk...@gmail.com> Clo

spark git commit: [STREAMING][FLAKY-TEST] Catch execution context race condition in `FileBasedWriteAheadLog.close()`

2015-11-24 Thread zsxwing
nit/org.apache.spark.streaming.util/BatchedWriteAheadLogWithCloseFileAfterWriteSuite/BatchedWriteAheadLog___clean_old_logs/ The reason the test fails is in `afterEach`, `writeAheadLog.close` is called, and there may still be async deletes in flight. tdas zsxwing Author: Burak Yavuz <brk...@gmail.com>

spark git commit: [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 151d7c2ba -> 216988688 [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file This solves the following exception caused when empty state RDD is checkpointed and recovered. The root cause

spark git commit: [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 68bcb9b33 -> 7f030aa42 [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file This solves the following exception caused when empty state RDD is checkpointed and recovered. The root
