[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...

2018-09-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219574155 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -149,7 +149,7 @@ private[spark] class Executor( // Executor

[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...

2018-09-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219573967 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -120,7 +120,7 @@ private[spark] class Executor

[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...

2018-09-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219576657 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -83,6 +83,17 @@ package object config { private[spark] val

[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...

2018-09-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219580690 --- Diff: core/src/test/scala/org/apache/spark/executor/ExecutorSuite.scala --- @@ -252,18 +253,121 @@ class ExecutorSuite extends SparkFunSuite

[GitHub] spark pull request #22507: [SPARK-25495][SS]FetchedData.reset should reset a...

2018-09-20 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22507 [SPARK-25495][SS]FetchedData.reset should reset all fields ## What changes were proposed in this pull request? `FetchedData.reset` should reset `_nextOffsetInFetchedData

[GitHub] spark issue #22478: [SPARK-25472] Don't have legitimate stops of streams cau...

2018-09-19 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22478 LGTM pending tests. Could you add `[SS]` to your title? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...

2018-09-19 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r218941326 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -799,7 +799,8 @@ private[spark] class Executor( if (taskRunner.task

[GitHub] spark issue #22473: [SPARK-25449][CORE] Heartbeat shouldn't include accumula...

2018-09-19 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22473 add to whitelist

[GitHub] spark issue #22402: [SPARK-25414][SS][TEST] make it clear that the numRows m...

2018-09-17 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22402 retest this please

[GitHub] spark issue #22402: [SPARK-25414][SS][TEST] make it clear that the numRows m...

2018-09-17 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22402 LGTM

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-09-05 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21721 FYI, I submitted #22334 to revert #21819 and #21721.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22334 retest this please

[GitHub] spark pull request #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK 247...

2018-09-04 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22334 [SPARK-25336][SS]Revert SPARK-24863 and SPARK 24748 ## What changes were proposed in this pull request? Revert SPARK-24863 and SPARK 24748 as per discussion in #21721. We will revisit

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-04 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22138 Thanks for your PR. This is really a big change. It will need very careful review as it changes many critical code paths and the current Kafka consumer logic is really complicated. Let's hold

[GitHub] spark issue #22293: [SPARK-25288][Tests]Fix flaky Kafka transaction tests

2018-08-31 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22293 Thanks! Merging to master.

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-08-30 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21721 @arunmahadevan yeah, it's better to figure out the solution for continuous mode as well. As you mentioned, the current SQL metrics are not updated unless the task completes, so we may need to add

[GitHub] spark pull request #22293: [SPARK-25288][Tests]Fix flaky Kafka transaction t...

2018-08-30 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22293#discussion_r214209091 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala --- @@ -652,62 +654,67 @@ abstract class

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-08-30 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21721 It’s better to not release such APIs without thinking about how to support continuous queries, since it may need to change APIs, which should be avoided if possible. I propose to revert this PR

[GitHub] spark pull request #22292: [SPARK-25286][CORE] Removing the dangerous parmap

2018-08-30 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22292#discussion_r214204549 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/util/FileBasedWriteAheadLog.scala --- @@ -315,7 +315,9 @@ private[streaming] object

[GitHub] spark issue #22293: [SPARK-25288][Tests]Fix flaky Kafka transaction tests

2018-08-30 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22293 now this passed 11 times on Jenkins

[GitHub] spark pull request #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER ...

2018-08-30 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22233#discussion_r214203842 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -60,7 +60,8 @@ class HiveCatalogedDDLSuite extends

[GitHub] spark pull request #22293: [SPARK-25288][Tests]Fix flaky Kafka transaction t...

2018-08-30 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22293 [SPARK-25288][Tests]Fix flaky Kafka transaction tests ## What changes were proposed in this pull request? Here are the failures: http://spark-tests.appspot.com/test-details

[GitHub] spark issue #22042: [SPARK-25005][SS]Support non-consecutive offsets for Kaf...

2018-08-28 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22042 Thanks! Merging to master.

[GitHub] spark issue #22210: [SPARK-25218][Core]Fix potential resource leaks in Trans...

2018-08-28 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22210 Thanks! Merging to master.

[GitHub] spark pull request #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER ...

2018-08-27 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22233#discussion_r213137139 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -671,7 +674,7 @@ case class AlterTableRecoverPartitionsCommand

[GitHub] spark issue #22210: [SPARK-25218][Core]Fix potential resource leaks in Trans...

2018-08-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22210 cc @brkyvz

[GitHub] spark pull request #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER ...

2018-08-27 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22233#discussion_r213063623 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -671,7 +674,7 @@ case class AlterTableRecoverPartitionsCommand

[GitHub] spark issue #22245: [SPARK-24882][FOLLOWUP] Fix flaky synchronization in Kaf...

2018-08-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22245 Thanks! Merging to master.

[GitHub] spark issue #22245: [SPARK-24882][FOLLOWUP] Fix flaky synchronization in Kaf...

2018-08-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22245 LGTM pending tests

[GitHub] spark issue #22042: [SPARK-25005][SS]Support non-consecutive offsets for Kaf...

2018-08-27 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22042 retest this please

[GitHub] spark issue #22042: [SPARK-25005][SS]Support non-consecutive offsets for Kaf...

2018-08-25 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22042 > This patch fails Spark unit tests. This is the flaky test I fixed in #22230 retest this ple

[GitHub] spark issue #22230: [SPARK-25214][SS][FOLLOWUP]Fix the issue that Kafka v2 s...

2018-08-25 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22230 Thanks! Merging to master.

[GitHub] spark pull request #22230: [SPARK-25214][SS][FOLLOWUP]Fix the issue that Kaf...

2018-08-24 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22230 [SPARK-25214][SS][FOLLOWUP]Fix the issue that Kafka v2 source may return duplicated records when `failOnDataLoss=false` ## What changes were proposed in this pull request

[GitHub] spark issue #22230: [SPARK-25214][SS][FOLLOWUP]Fix the issue that Kafka v2 s...

2018-08-24 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22230 cc @tdas

[GitHub] spark issue #22207: [SPARK-25214][SS]Fix the issue that Kafka v2 source may ...

2018-08-24 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22207 I just realized the Kafka source v2 is not in 2.3 :)

[GitHub] spark issue #22207: [SPARK-25214][SS]Fix the issue that Kafka v2 source may ...

2018-08-24 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22207 Thanks! Merging to master and 2.3.

[GitHub] spark pull request #22207: [SPARK-25214][SS]Fix the issue that Kafka v2 sour...

2018-08-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22207#discussion_r212709927 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDontFailOnDataLossSuite.scala --- @@ -0,0 +1,281

[GitHub] spark pull request #22207: [SPARK-25214][SS]Fix the issue that Kafka v2 sour...

2018-08-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22207#discussion_r212707515 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDontFailOnDataLossSuite.scala --- @@ -0,0 +1,281

[GitHub] spark pull request #22210: [SPARK-25218][Core]Fix potential resource leaks i...

2018-08-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22210#discussion_r212443117 --- Diff: common/network-common/src/main/java/org/apache/spark/network/buffer/FileSegmentManagedBuffer.java --- @@ -95,26 +95,24 @@ public ByteBuffer

[GitHub] spark pull request #22210: [SPARK-25218][Core]Fix potential resource leaks i...

2018-08-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22210#discussion_r212443039 --- Diff: common/network-common/src/main/java/org/apache/spark/network/buffer/FileSegmentManagedBuffer.java --- @@ -77,16 +77,16 @@ public ByteBuffer

[GitHub] spark pull request #22210: [SPARK-25218][Core]Fix potential resource leaks i...

2018-08-23 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22210 [SPARK-25218][Core]Fix potential resource leaks in TransportServer and SocketAuthHelper ## What changes were proposed in this pull request? Make sure TransportServer and SocketAuthHelper

[GitHub] spark pull request #22207: [SPARK-25214][SS]Fix the issue that Kafka v2 sour...

2018-08-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22207#discussion_r212410113 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDontFailOnDataLossSuite.scala --- @@ -0,0 +1,281

[GitHub] spark pull request #22207: [SPARK-25214][SS]Fix the issue that Kafka v2 sour...

2018-08-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22207#discussion_r212409454 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala --- @@ -1187,134 +1185,3 @@ class

[GitHub] spark pull request #22207: [SPARK-25214][SS]Fix the issue that Kafka v2 sour...

2018-08-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22207#discussion_r212409340 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceRDD.scala --- @@ -77,44 +77,6 @@ private[kafka010] class

[GitHub] spark pull request #22207: [SPARK-25214][SS]Fix the issue that Kafka v2 sour...

2018-08-23 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22207 [SPARK-25214][SS]Fix the issue that Kafka v2 source may return duplicated records when `failOnDataLoss=false` ## What changes were proposed in this pull request? When there are missing

[GitHub] spark issue #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection...

2018-08-22 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22181 LGTM. Merging to master.

[GitHub] spark issue #22176: [SPARK-25181][CORE] Limit Thread Pool size in BlockManag...

2018-08-22 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22176 @markhamstra That's a good point. However, since this is just following our current code if you check the usages of `newDaemonCachedThreadPool`, and the changes here should be safe considering

[GitHub] spark issue #22176: [SPARK-25181][CORE] Limit Thread Pool size in BlockManag...

2018-08-22 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22176 LGTM. Merging to master.

[GitHub] spark pull request #22042: [SPARK-25005][SS]Support non-consecutive offsets ...

2018-08-22 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22042#discussion_r212033844 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchReader.scala --- @@ -337,6 +338,7 @@ private[kafka010] case

[GitHub] spark pull request #22042: [SPARK-25005][SS]Support non-consecutive offsets ...

2018-08-22 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22042#discussion_r212032759 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala --- @@ -597,6 +614,254 @@ abstract class

[GitHub] spark issue #22182: [SPARK-25184][SS] Fixed race condition in StreamExecutio...

2018-08-22 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22182 retest this please

[GitHub] spark issue #22182: [SPARK-25184][SS] Fixed race condition in StreamExecutio...

2018-08-22 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22182 LGTM

[GitHub] spark pull request #22042: [SPARK-25005][SS]Support non-consecutive offsets ...

2018-08-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22042#discussion_r211786471 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -250,33 +294,42 @@ private[kafka010] case

[GitHub] spark pull request #22042: [SPARK-25005][SS]Support non-consecutive offsets ...

2018-08-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22042#discussion_r211786183 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -31,22 +31,21 @@ import

[GitHub] spark pull request #22042: [SPARK-25005][SS]Support non-consecutive offsets ...

2018-08-21 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22042#discussion_r211786163 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -31,22 +31,21 @@ import

[GitHub] spark issue #22106: [SPARK-25116][TESTS]Fix the Kafka cluster leak and clean...

2018-08-17 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22106 Thanks. Merging to master.

[GitHub] spark pull request #22106: [SPARK-25116][TESTS]Fix the Kafka cluster leak an...

2018-08-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22106#discussion_r211035290 --- Diff: external/kafka-0-10/src/test/scala/org/apache/spark/streaming/kafka010/KafkaTestUtils.scala --- @@ -120,61 +120,56 @@ private[kafka010] class

[GitHub] spark pull request #22106: [SPARK-25116][TESTS]Fix the Kafka cluster leak an...

2018-08-17 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22106#discussion_r210977997 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/CachedKafkaProducer.scala --- @@ -33,8 +33,12 @@ private[kafka010] object

[GitHub] spark issue #22106: [SPARK-25116][TESTS]Fix the Kafka cluster leak and clean...

2018-08-17 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22106 cc @srowen

[GitHub] spark issue #22106: [SPARK-25116][TESTS]Fix the Kafka cluster leak and clean...

2018-08-17 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22106 Test failures in 4264 and 4266 are unrelated. The latest changes passed on Jenkins 15 times.

[GitHub] spark pull request #22106: [SPARK-25116][TESTS]Fix the Kafka cluster leak an...

2018-08-15 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22106#discussion_r210387003 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaContinuousSinkSuite.scala --- @@ -40,12 +40,7 @@ class

[GitHub] spark pull request #22106: [SPARK-25116][TESTS]Fix the Kafka cluster leak an...

2018-08-15 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22106#discussion_r210383608 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousReader.scala --- @@ -216,7 +216,7 @@ class

[GitHub] spark pull request #22105: [SPARK-25115] [Core] Eliminate extra memory copy ...

2018-08-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22105#discussion_r210134581 --- Diff: common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java --- @@ -140,8 +140,24 @@ private int copyByteBuf

[GitHub] spark issue #22105: [SPARK-25115] [Core] Eliminate extra memory copy done wh...

2018-08-14 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22105 @normanmaurer LGTM. Thanks for the fix. I totally forgot this issue.

[GitHub] spark pull request #22105: [SPARK-25115] [Core] Eliminate extra memory copy ...

2018-08-14 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22105#discussion_r210133826 --- Diff: common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java --- @@ -140,8 +140,24 @@ private int copyByteBuf

[GitHub] spark pull request #22106: [SPARK-25116][Tests]Fix the kafka cluster leak an...

2018-08-14 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22106 [SPARK-25116][Tests]Fix the kafka cluster leak and clean up cached producers ## What changes were proposed in this pull request? KafkaContinuousSinkSuite leaks a Kafka cluster because both

[GitHub] spark issue #22097: [SPARK-18057][FOLLOW-UP]Use 127.0.0.1 to avoid zookeeper...

2018-08-14 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22097 I'm going to merge this now since it does fix some issues. I will continue to investigate the `exit code 1` issue

[GitHub] spark issue #22097: [SPARK-18057][FOLLOW-UP]Use 127.0.0.1 to avoid zookeeper...

2018-08-13 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22097 I set a custom Exit.Procedure to prevent the JVM from being killed. Hope this will make the test more stable.

[GitHub] spark issue #22097: [SPARK-18057][FOLLOW-UP]Use 127.0.0.1 to avoid zookeeper...

2018-08-13 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22097 Looks like there is a race condition while terminating the Kafka cluster: {code} 18/08/13 15:34:44.148 kafka-log-cleaner-thread-0 ERROR LogCleaner: Failed to access checkpoint file cleaner

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-13 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21698 > "you can at least sort the serialized bytes of T" I think this should work.

[GitHub] spark issue #22097: [SPARK-18057][FOLLOW-UP]Use 127.0.0.1 to avoid zookeeper...

2018-08-13 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22097 cc @srowen

[GitHub] spark pull request #22097: [SPARK-18057][FOLLOW-UP]Use 127.0.0.1 to avoid zo...

2018-08-13 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22097 [SPARK-18057][FOLLOW-UP]Use 127.0.0.1 to avoid zookeeper picking up an ipv6 address ## What changes were proposed in this pull request? I'm still seeing the Kafka tests fail randomly

[GitHub] spark pull request #22072: [SPARK-25081][Core]Nested spill in ShuffleExterna...

2018-08-13 Thread zsxwing
Github user zsxwing closed the pull request at: https://github.com/apache/spark/pull/22072

[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...

2018-08-10 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22072 retest this please

[GitHub] spark issue #21746: [SPARK-24699] [SS]Make watermarks work with Trigger.Once...

2018-08-10 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21746 @c-horn it's in 2.4.0. I just fixed the ticket.

[GitHub] spark issue #21980: [SPARK-25010][SQL] Rand/Randn should produce different v...

2018-08-10 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21980 LGTM2

[GitHub] spark issue #18143: [SPARK-20919][SS] Simplificaiton of CachedKafkaConsumer ...

2018-08-10 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/18143 @ScrapCodes sorry for the delay. I think @tdas has fixed the issue. Please close the PR.

[GitHub] spark pull request #21634: [SPARK-24648][SQL] SqlMetrics should be threadsaf...

2018-08-10 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/21634#discussion_r209371636 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -504,4 +504,38 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #22072: [SPARK-25081][Core]Nested spill in ShuffleExterna...

2018-08-10 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22072 [SPARK-25081][Core]Nested spill in ShuffleExternalSorter should not access released memory page (branch-2.2) ## What changes were proposed in this pull request? Backport https

[GitHub] spark issue #22062: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...

2018-08-10 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22062 I also merged to branch-2.3.

[GitHub] spark issue #22062: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...

2018-08-10 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22062 Thanks. Merging to master. I will try to merge to old branches and report back.

[GitHub] spark pull request #22062: [SPARK-25081][Core]Nested spill in ShuffleExterna...

2018-08-10 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22062#discussion_r209337943 --- Diff: core/src/test/scala/org/apache/spark/shuffle/sort/ShuffleExternalSorterSuite.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #22062: [SPARK-25081][Core]Nested spill in ShuffleExterna...

2018-08-10 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22062#discussion_r209338026 --- Diff: core/src/test/scala/org/apache/spark/shuffle/sort/ShuffleExternalSorterSuite.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #22062: [SPARK-25081][Core]Nested spill in ShuffleExterna...

2018-08-10 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22062#discussion_r209337484 --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java --- @@ -94,12 +94,20 @@ public int numRecords

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-10 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21698 > IIUC streaming query always need to specify a checkpoint location? You can use a batch query to read and write Kafka :) My point is if the input and output data sour

[GitHub] spark issue #22062: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...

2018-08-09 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22062 cc @hvanhovell

[GitHub] spark pull request #22062: [SPARK-25081][Core]Nested spill in ShuffleExterna...

2018-08-09 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22062 [SPARK-25081][Core]Nested spill in ShuffleExternalSorter should not access released memory page ## What changes were proposed in this pull request? This issue is pretty similar to [SPARK

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-09 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21698 > I also like ideas based on checkpointing What if the user doesn't provide a distributed file system path? E.g., you can read from Kafka and write them back to Kafka and such worklo

[GitHub] spark pull request #21919: [SPARK-24933][SS] Report numOutputRows in SinkPro...

2018-08-08 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/21919#discussion_r208750925 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala --- @@ -198,11 +198,14 @@ class SourceProgress protected[sql

[GitHub] spark pull request #21919: [SPARK-24933][SS] Report numOutputRows in SinkPro...

2018-08-08 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/21919#discussion_r208749439 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala --- @@ -58,6 +61,7 @@ case class

[GitHub] spark pull request #21919: [SPARK-24933][SS] Report numOutputRows in SinkPro...

2018-08-08 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/21919#discussion_r208751032 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala --- @@ -213,6 +216,12 @@ class SinkProgress protected[sql]( override

[GitHub] spark issue #22042: [SPARK-25005][SS]Support non-consecutive offsets for Kaf...

2018-08-08 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22042 cc @tdas

[GitHub] spark pull request #22042: [SPARK-25005][SS]Support non-consecutive offsets ...

2018-08-08 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22042 [SPARK-25005][SS]Support non-consecutive offsets for Kafka ## What changes were proposed in this pull request? As the user uses Kafka transactions to write data, the offsets in Kafka

[GitHub] spark pull request #22042: [SPARK-25005][SS]Support non-consecutive offsets ...

2018-08-08 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22042#discussion_r208676022 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceRDD.scala --- @@ -77,44 +77,6 @@ private[kafka010] class

[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-08-06 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21222 Thanks! Merging to master.

[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-08-06 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21222 LGTM pending tests

[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-08-06 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21222 retest this please

[GitHub] spark issue #21995: [SPARK-18057][FOLLOW-UP][SS] Update Kafka client version...

2018-08-03 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21995 LGTM Merging to master

[GitHub] spark pull request #21919: [SPARK-24933][SS] Report numOutputRows in SinkPro...

2018-08-03 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/21919#discussion_r207662087 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/streaming/StreamWriterCommitProgress.java --- @@ -0,0 +1,31

[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...

2018-08-02 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21854 Thanks! Merging to master.
