[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17821#discussion_r114419550 --- Diff: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala --- @@ -80,8 +91,16 @@ private[deploy] object DeployMessages

[GitHub] spark pull request #17346: [SPARK-19965][SS] DataFrame batch reader may fail...

2017-05-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17346#discussion_r114395863 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSink.scala --- @@ -53,6 +53,29 @@ object FileStreamSink extends

[GitHub] spark pull request #17346: [SPARK-19965][SS] DataFrame batch reader may fail...

2017-05-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17346#discussion_r114396634 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSinkSuite.scala --- @@ -145,6 +147,41 @@ class FileStreamSinkSuite extends

[GitHub] spark pull request #17346: [SPARK-19965][SS] DataFrame batch reader may fail...

2017-05-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17346#discussion_r114396542 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSinkSuite.scala --- @@ -145,6 +147,41 @@ class FileStreamSinkSuite extends

[GitHub] spark pull request #17346: [SPARK-19965][SS] DataFrame batch reader may fail...

2017-05-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17346#discussion_r114397372 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala --- @@ -36,20 +37,27 @@ import

[GitHub] spark pull request #17346: [SPARK-19965][SS] DataFrame batch reader may fail...

2017-05-02 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17346#discussion_r114395114 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSink.scala --- @@ -53,6 +53,29 @@ object FileStreamSink extends

[GitHub] spark issue #17803: [SPARK-20523][BUILD] Clean up build warnings for 2.2.0 r...

2017-05-02 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17803 I don't know graphx or mllib. Others look good to me. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark issue #17833: [SPARK-20558][CORE] clear InheritableThreadLocal variabl...

2017-05-02 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17833 By the way, this PR looks good to me since it does slightly reduce the memory footprint. However, we still cannot close https://issues.apache.org/jira/browse/SPARK-20548.

[GitHub] spark issue #17833: [SPARK-20558][CORE] clear InheritableThreadLocal variabl...

2017-05-02 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17833 I think it only cleans `localProperties` in the current thread. `localProperties` overrides `childValue` and always clones a new Properties for child threads. In addition, I think it
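
The `childValue` behaviour described in this comment can be illustrated with a plain JDK sketch. It is not Spark's code: the class name `LocalPropsSketch`, the field `LOCAL_PROPS`, and the key `job.group` are hypothetical stand-ins. It only assumes what the comment says, namely that the thread-local overrides `childValue` to copy the parent's `Properties`, so clearing the value in the current thread leaves an already-created child's copy untouched.

```java
import java.util.Properties;

class LocalPropsSketch {
    // Hypothetical stand-in for a per-thread property bag; not Spark's actual field.
    static final InheritableThreadLocal<Properties> LOCAL_PROPS =
        new InheritableThreadLocal<Properties>() {
            @Override
            protected Properties childValue(Properties parent) {
                // Copy instead of sharing, so each child thread owns its instance.
                Properties copy = new Properties();
                if (parent != null) {
                    copy.putAll(parent);
                }
                return copy;
            }

            @Override
            protected Properties initialValue() {
                return new Properties();
            }
        };

    /** Sets a property, creates a child thread, clears the parent, then reads in the child. */
    static String childSeesAfterParentClear() {
        LOCAL_PROPS.get().setProperty("job.group", "g1");
        final String[] seen = new String[1];
        // childValue runs here, when the child Thread object is constructed.
        Thread child = new Thread(() -> seen[0] = LOCAL_PROPS.get().getProperty("job.group"));
        // Clearing affects only the current (parent) thread's value.
        LOCAL_PROPS.remove();
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(childSeesAfterParentClear());
    }
}
```

In this sketch the child still reads `g1` after the parent has called `remove()`, because it holds the copy made when the thread was constructed.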

[GitHub] spark issue #17765: [SPARK-20464][SS] Add a job group and description for st...

2017-05-01 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17765 Merging to master and 2.2.

[GitHub] spark issue #17765: [SPARK-20464][SS] Add a job group and description for st...

2017-05-01 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17765 LGTM!

[GitHub] spark issue #17821: [SPARK-20529][Core]Allow worker and master work with a p...

2017-05-01 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17821 cc @sameeragarwal

[GitHub] spark pull request #17821: [SPARK-20529][Core]Allow worker and master work w...

2017-05-01 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17821 [SPARK-20529][Core]Allow worker and master work with a proxy server ## What changes were proposed in this pull request? In the current code, when a worker connects to the master, the master will

[GitHub] spark pull request #17803: [SPARK-20523][BUILD] Clean up build warnings for ...

2017-04-30 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17803#discussion_r114080946 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/streaming/GroupStateTimeout.java --- @@ -37,7 +37,9 @@ * `map/flatMapGroupsWithState` by

[GitHub] spark issue #17789: [SPARK-19525][CORE]Add RDD checkpoint compression suppor...

2017-04-28 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17789 Thanks, @mridulm @aramesh117 Merging to master and 2.2.

[GitHub] spark issue #17789: [SPARK-19525][CORE]Add RDD checkpoint compression suppor...

2017-04-28 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17789 > did this PR was for compression to be enabled for spark streaming usecase. Streaming checkpoint includes two parts: - DStream graph and metadata - RDD checkpoints Ri

[GitHub] spark issue #17790: [SPARK-20514][CORE] Upgrade Jetty to 9.3.11.v20160721

2017-04-28 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17790 LGTM.

[GitHub] spark pull request #17765: [SPARK-20464][SS] Add a job group and description...

2017-04-27 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17765#discussion_r113825493 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala --- @@ -500,6 +502,69 @@ class StreamSuite extends StreamTest

[GitHub] spark issue #17789: [SPARK-19525][CORE]Add RDD checkpoint compression suppor...

2017-04-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17789 yes. See https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala#L138

[GitHub] spark issue #17789: [SPARK-19525][CORE]Add RDD checkpoint compression suppor...

2017-04-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17789 Streaming checkpoints are on HDFS but don't have an extension :)

[GitHub] spark issue #17789: [SPARK-19525][CORE]Add RDD checkpoint compression suppor...

2017-04-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17789 In addition, I agree that having an extension and separating the codecs are good ideas. But they should be done in separate PRs so that we don't introduce multiple features in one large PR.

[GitHub] spark issue #17789: [SPARK-19525][CORE]Add RDD checkpoint compression suppor...

2017-04-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17789 > A question I had even with the earlier PR was - should we add the extension to either the directory or the file indicating compression type ? Shuffle and cache files don't

[GitHub] spark issue #17790: [SPARK-20514][CORE] Upgrade Jetty to 9.3.13.v20161014

2017-04-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17790 cc @yhuai

[GitHub] spark issue #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17024 @aramesh117 I just opened #17789 to finish the remaining work. All credit will go to you when the new PR is merged.

[GitHub] spark issue #17789: [SPARK-19525][CORE]Add RDD checkpoint compression suppor...

2017-04-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17789 cc @mridulm since you reviewed the initial PR.

[GitHub] spark pull request #17789: [SPARK-19525][CORE]Add RDD checkpoint compression...

2017-04-27 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17789 [SPARK-19525][CORE]Add RDD checkpoint compression support ## What changes were proposed in this pull request? This PR adds RDD checkpoint compression support and adds a new config

[GitHub] spark pull request #17765: [SPARK-20464][SS] Add a job group and description...

2017-04-26 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17765#discussion_r113592921 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -252,6 +252,7 @@ class StreamExecution

[GitHub] spark pull request #17765: [SPARK-20464][SS] Add a job group and description...

2017-04-26 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17765#discussion_r113557507 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -825,6 +833,11 @@ class StreamExecution

[GitHub] spark issue #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-26 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17024 @aramesh117 do you have time to work on this PR soon? We need to merge this PR ASAP in order to get it into 2.2.0.

[GitHub] spark issue #17761: [SPARK-20461][Core][SS]Use UninterruptibleThread for Exe...

2017-04-25 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17761 @mridulm this only affects code calling `runUninterruptibly`, which is not a public API, so it won't break any existing code. The worst case of this PR is some task needs to wait until network ti

[GitHub] spark pull request #17763: [SPARK-13747][Core]Add ThreadUtils.awaitReady and...

2017-04-25 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17763 [SPARK-13747][Core]Add ThreadUtils.awaitReady and disallow Await.ready ## What changes were proposed in this pull request? Add `ThreadUtils.awaitReady` similar to `ThreadUtils.awaitResult

[GitHub] spark pull request #17761: [SPARK-20461][Core][SS]Use UninterruptibleThread ...

2017-04-25 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17761#discussion_r113273103 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -84,7 +86,20 @@ private[spark] class Executor( } // Start

[GitHub] spark pull request #17761: [SPARK-20461][Core][SS]Use UninterruptibleThread ...

2017-04-25 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17761 [SPARK-20461][Core][SS]Use UninterruptibleThread for Executor and fix the potential hang in CachedKafkaConsumer ## What changes were proposed in this pull request? This PR changes

[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-04-24 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17540 The issue with the current `withNewExecutionId` is that it doesn't support nested QueryExecution. I'm wondering if you can really fix this issue without introducing a regression, e.g., track

[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17540#discussion_r113087928 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala --- @@ -161,50 +161,51 @@ object FileFormatWriter

[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17540#discussion_r113084795 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala --- @@ -161,50 +161,51 @@ object FileFormatWriter

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113080567 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -95,8 +95,10 @@ private[kafka010] class

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113075234 --- Diff: external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaRDD.scala --- @@ -199,7 +199,7 @@ private[spark] class KafkaRDD[K, V

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113075321 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRelation.scala --- @@ -53,9 +54,27 @@ private[kafka010] class

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113075218 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/CachedKafkaConsumer.scala --- @@ -371,7 +371,7 @@ private[kafka010] object

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113075117 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceRDD.scala --- @@ -125,16 +125,15 @@ private[kafka010] class

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113074969 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -213,46 +203,6 @@ private[kafka010] class

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113074957 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -414,30 +364,9 @@ private[kafka010] class

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113074951 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -213,46 +203,6 @@ private[kafka010] class

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113074883 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRelation.scala --- @@ -53,9 +54,27 @@ private[kafka010] class

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113074531 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -95,8 +95,10 @@ private[kafka010] class

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17752#discussion_r113074386 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala --- @@ -95,8 +95,10 @@ private[kafka010] class

[GitHub] spark pull request #17752: [SPARK-20452][SS][Kafka]Fix a potential Concurren...

2017-04-24 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17752 [SPARK-20452][SS][Kafka]Fix a potential ConcurrentModificationException for batch Kafka DataFrame ## What changes were proposed in this pull request? Cancel a batch Kafka query but one of

[GitHub] spark issue #17691: [MINOR][SS] Fix a missing space in UnsupportedOperationC...

2017-04-19 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17691 Thanks! Merging to master, 2.2 and 2.1.

[GitHub] spark pull request #17691: [MINOR][SS] Fix a missing space in UnsupportedOpe...

2017-04-19 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17691 [MINOR][SS] Fix a missing space in UnsupportedOperationChecker error message ## What changes were proposed in this pull request? Also went through the same file to ensure other string

[GitHub] spark issue #17687: [SPARK-20397][SparkR][SS]Fix flaky test: test_streaming....

2017-04-19 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17687 Thanks! Merging to master and 2.2.

[GitHub] spark pull request #17687: [SPARK-20397][SparkR][SS]Fix flaky test: test_str...

2017-04-19 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17687 [SPARK-20397][SparkR][SS]Fix flaky test: test_streaming.R.Terminated by error ## What changes were proposed in this pull request? Checking a source parameter is asynchronous. When the

[GitHub] spark issue #17676: [SPARK-20377][SS] Fix JavaStructuredSessionization examp...

2017-04-18 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17676 LGTM pending tests.

[GitHub] spark pull request #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17024#discussion_r111830351 --- Diff: core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala --- @@ -169,14 +177,23 @@ private[spark] object ReliableCheckpointRDD

[GitHub] spark pull request #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17024#discussion_r111830180 --- Diff: core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala --- @@ -133,9 +136,14 @@ private[spark] object ReliableCheckpointRDD extends

[GitHub] spark pull request #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17024#discussion_r111829095 --- Diff: core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala --- @@ -27,8 +27,11 @@ import org.apache.hadoop.fs.Path import

[GitHub] spark pull request #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17024#discussion_r111833659 --- Diff: core/src/test/scala/org/apache/spark/CheckpointSuite.scala --- @@ -238,6 +241,42 @@ trait RDDCheckpointTester { self: SparkFunSuite

[GitHub] spark pull request #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17024#discussion_r111830443 --- Diff: core/src/test/scala/org/apache/spark/CheckpointSuite.scala --- @@ -251,10 +290,14 @@ class CheckpointSuite extends SparkFunSuite with

[GitHub] spark pull request #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17024#discussion_r111832087 --- Diff: core/src/test/scala/org/apache/spark/CheckpointSuite.scala --- @@ -266,13 +309,44 @@ class CheckpointSuite extends SparkFunSuite with

[GitHub] spark pull request #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17024#discussion_r111829941 --- Diff: core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala --- @@ -133,9 +136,14 @@ private[spark] object ReliableCheckpointRDD extends

[GitHub] spark pull request #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17024#discussion_r111830401 --- Diff: core/src/test/scala/org/apache/spark/CheckpointSuite.scala --- @@ -21,12 +21,15 @@ import java.io.File import scala.reflect.ClassTag

[GitHub] spark pull request #17024: [SPARK-19525][CORE] Compressing checkpoints.

2017-04-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17024#discussion_r111836700 --- Diff: core/src/test/scala/org/apache/spark/CheckpointSuite.scala --- @@ -266,13 +309,44 @@ class CheckpointSuite extends SparkFunSuite with

[GitHub] spark issue #17610: [SPARK-20131][Core]Don't use `this` lock in StandaloneSc...

2017-04-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17610 Merging to master and 2.1. Thanks!

[GitHub] spark issue #17610: [SPARK-20131][Core]Use a separate lock for StandaloneSch...

2017-04-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17610 I removed the lock and changed `stopping` to an `AtomicBoolean` to ensure that `stop` is idempotent.
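
A minimal sketch of the idea in this comment, using only the JDK: a `stopping` flag held in an `AtomicBoolean` and flipped with `compareAndSet`, so repeated `stop` calls run the shutdown work once without holding a lock. This is an illustration rather than the actual `StandaloneSchedulerBackend` code; the class `IdempotentStopSketch` and the `cleanups` counter are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class IdempotentStopSketch {
    private final AtomicBoolean stopping = new AtomicBoolean(false);
    // Counts how often the shutdown work ran; stands in for real cleanup.
    private final AtomicInteger cleanups = new AtomicInteger(0);

    void stop() {
        // Only the first caller flips false -> true. Later callers return at once,
        // possibly before the first caller's cleanup has finished.
        if (stopping.compareAndSet(false, true)) {
            cleanups.incrementAndGet();
        }
    }

    int cleanupCount() {
        return cleanups.get();
    }

    public static void main(String[] args) {
        IdempotentStopSketch backend = new IdempotentStopSketch();
        backend.stop();
        backend.stop();
        System.out.println(backend.cleanupCount());
    }
}
```

The trade-off is the one raised earlier in the thread: without a lock, a second `stop` does not wait for the first one to complete.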

[GitHub] spark issue #17624: [SPARK-15354][flaky-test] Fix flaky test

2017-04-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17624 nit: could you use a better title? You can add the test name to it.

[GitHub] spark issue #17610: [SPARK-20131][Core]Use a separate lock for StandaloneSch...

2017-04-11 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17610 > Isn't it not depending on this being locked in super class methods invoked in the invocation subtree ? I don't get it. But I think the stack trace shows why this dead-

[GitHub] spark issue #17610: [SPARK-20131][Core]Use a separate lock for StandaloneSch...

2017-04-11 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17610 @mridulm yeah, I was thinking of just changing `stopping` to an AtomicBoolean flag. However, it changes the semantics a little, e.g., the second `stop` will return at once when the first `stop` is

[GitHub] spark issue #17463: [SPARK-20131][DStream][Test] Flaky Test: org.apache.spar...

2017-04-11 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17463 Could you close this one, please? I submitted #17610 to fix the root issue.

[GitHub] spark pull request #17610: [SPARK-20131][Core]Use a separate lock for Standa...

2017-04-11 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17610 [SPARK-20131][Core]Use a separate lock for StandaloneSchedulerBackend.stop ## What changes were proposed in this pull request? `o.a.s.streaming.StreamingContextSuite.SPARK-18560 Receiver

[GitHub] spark issue #17599: [SPARK-17564][Tests]Fix flaky RequestTimeoutIntegrationS...

2017-04-10 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17599 cc @rxin

[GitHub] spark pull request #17599: [SPARK-17564][Tests]Fix flaky RequestTimeoutInteg...

2017-04-10 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17599 [SPARK-17564][Tests]Fix flaky RequestTimeoutIntegrationSuite.furtherRequestsDelay ## What changes were proposed in this pull request? This PR fixes the following failure

[GitHub] spark issue #17594: [SPARK-20282][SS][Tests]Write the commit log first to fi...

2017-04-10 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17594 Thanks! Merging to master.

[GitHub] spark issue #17597: [SPARK-20285][Tests]Increase the pyspark streaming test ...

2017-04-10 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17597 Thanks! Merging to master, 2.1 and 2.0.

[GitHub] spark pull request #17597: [SPARK-20285][Tests]Increase the pyspark streamin...

2017-04-10 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17597 [SPARK-20285][Tests]Increase the pyspark streaming test timeout to 30 seconds ## What changes were proposed in this pull request? Saw the following failure locally

[GitHub] spark pull request #17594: [SPARK-20282][SS][Tests]Write the commit log firs...

2017-04-10 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17594#discussion_r110735710 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -304,8 +304,8 @@ class StreamExecution

[GitHub] spark pull request #17594: Write the log first to fix a race condition in test...

2017-04-10 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17594 Write the log first to fix a race condition in tests ## What changes were proposed in this pull request? This PR fixes the following failure: ``` sbt.ForkMain$ForkError

[GitHub] spark issue #17179: [SPARK-19067][SS] Processing-time-based timeout in MapGr...

2017-03-17 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17179 LGTM.

[GitHub] spark pull request #17216: [SPARK-19873][SS] Record num shuffle partitions i...

2017-03-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17216#discussion_r106759443 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -256,6 +259,15 @@ class StreamExecution

[GitHub] spark pull request #17216: [SPARK-19873][SS] Record num shuffle partitions i...

2017-03-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17216#discussion_r106757213 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -256,6 +259,15 @@ class StreamExecution

[GitHub] spark pull request #17295: [SPARK-19556][core] Do not encrypt block manager ...

2017-03-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17295#discussion_r106749569 --- Diff: core/src/main/scala/org/apache/spark/security/CryptoStreamUtils.scala --- @@ -63,12 +83,40 @@ private[spark] object CryptoStreamUtils extends

[GitHub] spark pull request #17179: [SPARK-19067][SS] Processing-time-based timeout i...

2017-03-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17179#discussion_r106736186 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala --- @@ -0,0 +1,270 @@ +/* + * Licensed

[GitHub] spark pull request #17216: [SPARK-19873][SS] Record num shuffle partitions i...

2017-03-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17216#discussion_r106709281 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -549,9 +581,15 @@ class StreamExecution

[GitHub] spark pull request #17216: [SPARK-19873][SS] Record num shuffle partitions i...

2017-03-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17216#discussion_r106709791 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/OffsetSeqLogSuite.scala --- @@ -29,12 +30,32 @@ class OffsetSeqLogSuite extends

[GitHub] spark issue #17327: [SPARK-19721][SS][BRANCH-2.1] Good error message for ver...

2017-03-17 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17327 LGTM. Merging to 2.1. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and

[GitHub] spark pull request #17284: [DO_NOT_MERGE]Test PySpark Streaming tests

2017-03-16 Thread zsxwing
Github user zsxwing closed the pull request at: https://github.com/apache/spark/pull/17284

[GitHub] spark issue #17323: [SPARK-19986][Tests]Make pyspark.streaming.tests.Checkpo...

2017-03-16 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17323 cc @tdas

[GitHub] spark pull request #17323: [SPARK-19986][Tests]Make pyspark.streaming.tests....

2017-03-16 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/17323 [SPARK-19986][Tests]Make pyspark.streaming.tests.CheckpointTests more stable ## What changes were proposed in this pull request? Sometimes, CheckpointTests will hang on a busy machine

[GitHub] spark issue #17070: [SPARK-19721][SS] Good error message for version mismatc...

2017-03-16 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17070 @lw-lin there are conflicts with 2.1. Could you submit a new PR for branch-2.1?

[GitHub] spark issue #17070: [SPARK-19721][SS] Good error message for version mismatc...

2017-03-16 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17070 LGTM. Merging to master and 2.1. Thanks!

[GitHub] spark pull request #17070: [SPARK-19721][SS] Good error message for version ...

2017-03-15 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17070#discussion_r106348166 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala --- @@ -195,6 +195,11 @@ class HDFSMetadataLog[T

[GitHub] spark issue #17244: [SPARK-19889][SQL] Make TaskContext callbacks thread saf...

2017-03-15 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/17244 LGTM

[GitHub] spark pull request #17179: [SPARK-19067][SS] Processing-time-based timeout i...

2017-03-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17179#discussion_r106055705 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala --- @@ -298,12 +368,14 @@ class KeyValueGroupedDataset[K, V] private

[GitHub] spark pull request #17179: [SPARK-19067][SS] Processing-time-based timeout i...

2017-03-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17179#discussion_r106045620 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/KeyedStateImpl.scala --- @@ -60,6 +82,45 @@ private[sql] class KeyedStateImpl[S

[GitHub] spark pull request #17179: [SPARK-19067][SS] Processing-time-based timeout i...

2017-03-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17179#discussion_r106051316 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala --- @@ -0,0 +1,270 @@ +/* + * Licensed

[GitHub] spark pull request #17179: [SPARK-19067][SS] Processing-time-based timeout i...

2017-03-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17179#discussion_r106054568 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala --- @@ -50,6 +50,8 @@ trait StateStore { /** Get

[GitHub] spark pull request #17179: [SPARK-19067][SS] Processing-time-based timeout i...

2017-03-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17179#discussion_r106055666 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala --- @@ -361,18 +435,20 @@ class KeyValueGroupedDataset[K, V] private

[GitHub] spark pull request #17179: [SPARK-19067][SS] Processing-time-based timeout i...

2017-03-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17179#discussion_r106045218 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/KeyedState.scala --- @@ -92,27 +121,33 @@ import

[GitHub] spark pull request #17179: [SPARK-19067][SS] Processing-time-based timeout i...

2017-03-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17179#discussion_r106055599 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala --- @@ -284,6 +322,38 @@ class KeyValueGroupedDataset[K, V] private[sql

[GitHub] spark pull request #17179: [SPARK-19067][SS] Processing-time-based timeout i...

2017-03-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17179#discussion_r106036389 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/streaming/KeyedStateTimeout.java --- @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17179: [SPARK-19067][SS] Processing-time-based timeout i...

2017-03-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17179#discussion_r106055580 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala --- @@ -249,6 +250,43 @@ class KeyValueGroupedDataset[K, V] private[sql
